Learning to read and write research papers in one shot

I think most researchers, whether expert or rookie, would agree that writing scientific papers is hard. First, you have to come up with an original idea, substantiate it with meticulous experiments, and then wrap it all up in a few pages of LaTeX that will decide its fate. At the time of writing this article, I am a full-time research assistant and an about-to-be PhD student. Most of my career is devoted to finding out useful new things and writing them down, but even after years of practice, the second part hasn’t gotten any less dreadful. I have read books, taken courses and workshops, and watched tutorials (more on these later). They did help, but not as much as I needed, because the best way to learn to write is to get feedback on your writing from someone who knows better than you, and in a specialized field of research, such a person is incredibly hard to find. So I recently came up with a method to improve my paper-writing skills without any supervision. As a byproduct, it also improves paper-reading skills and vocabulary, which may come in handy for some of the struggling writers out there.

Autoencoder: data to summary and back to the original data. (Image © Keras blog)

The answer came to me from an unlikely source. This is actually a problem that AI researchers have been trying to solve for many years. Typically, we train an AI with human-provided samples (“supervised learning” for you AI nerds!); for example, by feeding in a lot of handwritten digits, each annotated by a human with the digit it shows, we can teach an AI to recognize digits. But this method of data collection is expensive and time-consuming. Hence, scientists came up with some clever algorithms that learn without human annotation. One of them is called the autoencoder. In simple terms, we take a piece of data (an image or a text) and tell a neural network named the Encoder to summarize it in a tiny space. Then we give the summary to another neural network named the Decoder, which tries to generate the original image or text back. Finally, we compare the generated output to the original data, and by calculating their difference, voila! We have feedback. With this feedback, we can train the encoder to summarize better and the decoder to generate better.
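For the curious, here is a minimal sketch of that encode, decode, compare loop in plain Python. It is a toy, not how real autoencoders are built: the "networks" are just two weights each, the data is a made-up set of points on the line y = 2x, and the function names, starting weights, and learning rate are all my own assumptions for illustration.

```python
def encode(point, enc):
    # Encoder: squeeze a 2-number point into a 1-number summary.
    return enc[0] * point[0] + enc[1] * point[1]

def decode(code, dec):
    # Decoder: try to rebuild the 2-number point from the summary alone.
    return (dec[0] * code, dec[1] * code)

def reconstruction_error(data, enc, dec):
    # Feedback: mean squared difference between each point and its rebuilt copy.
    total = 0.0
    for p in data:
        q = decode(encode(p, enc), dec)
        total += (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return total / len(data)

def train(data, enc, dec, steps=2000, lr=0.05):
    # Plain gradient descent on the reconstruction error (no labels involved).
    enc, dec = list(enc), list(dec)
    n = len(data)
    for _ in range(steps):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for p in data:
            z = encode(p, enc)
            q = decode(z, dec)
            err = (q[0] - p[0], q[1] - p[1])
            # Chain rule: push the error through the decoder back to the encoder.
            dz = 2 * (err[0] * dec[0] + err[1] * dec[1])
            for i in range(2):
                g_dec[i] += 2 * err[i] * z
                g_enc[i] += dz * p[i]
        for i in range(2):
            enc[i] -= lr * g_enc[i] / n
            dec[i] -= lr * g_dec[i] / n
    return enc, dec

# Toy data: 21 points on the line y = 2x, so one number is enough to describe each.
data = [(x / 10, 2 * x / 10) for x in range(-10, 11)]
start_enc, start_dec = [0.5, 0.5], [0.1, 0.1]  # arbitrary starting weights

before = reconstruction_error(data, start_enc, start_dec)
enc, dec = train(data, start_enc, start_dec)
after = reconstruction_error(data, enc, dec)
print(f"error before training: {before:.4f}, after: {after:.6f}")
```

The only training signal is the gap between the original and the reconstruction, which is exactly the kind of feedback we are about to borrow for writing.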

Can you guess where we are going with this? We will use this exact method for improving our writing skills. Here is the Autoencoder writing method in 5 simple steps:

  1. Take a research paper you like and choose a section (Abstract, Introduction, Conclusion, etc.).
  2. Read the section and note down the key points (Encoding).
  3. Now, try to rewrite the section from your notes (Decoding).
  4. Compare your writing with the original paper. Note down what you have learned.
  5. Repeat.

In Step 1, I suggest doing this exercise one section of a paper at a time, as different sections follow different structures. In Step 4, when I say to compare your writing with the original, I don’t mean you have to reproduce exactly what the authors wrote. Rather, take notes on where the original is better than your version and where yours is better than theirs. Take thorough notes on what you learned about your weaknesses and how to fix them. This is the most important part of the exercise.
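If you keep your rewrites as plain text, a small script can give you a rough first pass for Step 4 before you do the real comparison by hand. The sketch below is my own assumption of how one might do it, not part of the method itself: for each sentence of the original, it reports what share of its content words shows up anywhere in your rewrite, so sentences you skipped entirely float to the top. The stopword list, the naive sentence splitter, and the example texts are all made up for illustration, and word overlap says nothing about whether your sentence is actually well written.

```python
import re

# Tiny hand-picked stopword list (an assumption; extend it as needed).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or",
             "is", "are", "it", "we", "with", "for", "that", "this"}

def content_words(text):
    # Lowercase alphabetic tokens, minus the stopwords.
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def coverage_report(original, rewrite):
    """For each original sentence, the share of its content words found in the rewrite."""
    mine = content_words(rewrite)
    report = []
    for sentence in split_sentences(original):
        words = content_words(sentence)
        share = len(words & mine) / len(words) if words else 1.0
        report.append((round(share, 2), sentence))
    return sorted(report)  # least-covered sentences first

# Made-up example texts, not taken from any real paper.
original = "Our method needs no labels. It runs on a laptop. Results improve with more data."
rewrite = "More data improve results. Our method needs no labels."

report = coverage_report(original, rewrite)
for share, sentence in report:
    print(share, sentence)
```

A low score is only a hint that a key point may have been lost during encoding or decoding; the judgment about whose version reads better still has to be yours.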

This exercise is beneficial in two ways. First, you will learn to read a paper and find its key ideas (encoding). If you miss a key idea, the rewrite will make it apparent which one you missed and how to capture it better next time. Thus you will improve at finding key ideas and at reading papers overall. Second, you will practice writing detailed text from key ideas (decoding). That’s exactly what we do while writing our own papers, right? But unlike when writing your own paper, this time you have a reference. You are free to check the original and compare your writing with it, and you can repeat this process until you are satisfied with the quality of your writing. Additionally, I recommend keeping a separate sheet of the new vocabulary and phrases you have learned or found interesting, to build your own “vocabulary inventory”. You can take it one step further: follow a particular researcher and try to imitate their writing style. When you are comfortable with section-by-section rewrites, you may attempt reading an entire paper and writing its abstract yourself.
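The vocabulary inventory can also be kept semi-automatically. Here is one possible sketch, again with assumptions of my own: the function name `new_vocabulary`, the five-letter cutoff for "interesting" words, and the example sentences are all hypothetical. It counts words the authors used that you did not, and keeps a running tally across sessions, so words you keep avoiding rise to the top.

```python
import re
from collections import Counter

def tokens(text):
    # Lowercase words, keeping hyphenated terms such as "fine-tuning" together.
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def new_vocabulary(original, rewrite, inventory=None, min_length=5):
    """Count words from the original that are missing from the rewrite.

    min_length is a crude filter (an assumption) to skip short function words.
    """
    inventory = Counter() if inventory is None else inventory
    mine = set(tokens(rewrite))
    for word in set(tokens(original)) - mine:
        if len(word) >= min_length:
            inventory[word] += 1
    return inventory

# Two made-up practice sessions; the sentences are illustrative only.
inventory = new_vocabulary(
    "We leverage a lightweight module to mitigate overfitting.",
    "We use a small module to reduce overfitting.",
)
inventory = new_vocabulary(
    "Residual connections mitigate vanishing gradients.",
    "Skip connections reduce vanishing gradients.",
    inventory,
)
print(inventory.most_common(1))
```

Passing the same `inventory` back in on every session is what turns a one-off word list into a record of the vocabulary you consistently fail to reach for.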

Enough talk, let’s see the Autoencoder writing method in practice. Here is the abstract from a paper titled “Integrating Multimodal Information in Large Pretrained Transformers” by some of my peers.

After reading the Abstract, these are the key points I noted in OneNote:

You can see that the key points are highly unstructured, as they are only meant to be understood by me. From these key points, this is what I wrote.

Now comes the most important part: reviewing your own work. We need to review not only the writing but also the summary. Here are my reviews of my writing.

These reviews tell me the weaknesses in my writing. The next task is to work on them and try again.

The beauty of this method is that it covers the entire scientific process: reading papers, understanding them, summarizing key ideas, writing fully fleshed-out text from key ideas, and finally reviewing what you have written. It also gives you an excuse to read all the papers you always wanted to read. So you become a better reader, writer, and reviewer, improve your vocabulary, and keep up to date with the latest work, all in one shot and with no supervision.

Feel free to let me know if the method worked for you. Happy writing. :)

P.S. I promised some other resources for improving writing skills.

  1. On Writing Well — by William Zinsser
  2. Writing with Flair — Udemy
  3. Writing in the Sciences — Coursera
  4. How to Write and Publish a Scientific Paper — Coursera
  5. This is one of the handiest resources I came across, a list of “Useful phrases” in paper writing.

Researcher in NLP and Machine Learning | masumhasan.net