Researchers do wonderful work, and it deserves to be read. We found this preprint on Promising Preprints, which means that it is doing very well on Twitter. This is a good sign that the scientific community is excited about the paper.
Unfortunately, we found the writing overly complex, dry, and hard to read. Here is the abstract:
Ancient DNA has revolutionized our understanding of human population history. However, its potential to examine how rapid cultural evolution to new lifestyles may have driven biological adaptation has not been met, largely due to limited sample sizes. We assembled genome-wide data from 1,291 individuals from Europe over 10,000 years, providing a dataset that is large enough to resolve the timing of selection into the Neolithic, Bronze Age, and Historical periods. We identified 25 genetic loci with rapid changes in frequency during these periods, a majority of which were previously undetected. Signals specific to the Neolithic transition are associated with body weight, diet, and lipid metabolism-related phenotypes. They also include immune phenotypes, most notably a locus that confers immunity to Salmonella infection at a time when ancient Salmonella genomes have been shown to adapt to human hosts, thus providing a possible example of human-pathogen co-evolution. In the Bronze Age, selection signals are enriched near genes involved in pigmentation and immune-related traits, including at a key human protein interactor of SARS-CoV-2. Only in the Historical period do the selection candidates we detect largely mirror previously-reported signals, highlighting how the statistical power of previous studies was limited to the last few millennia. The Historical period also has multiple signals associated with vitamin D binding, providing evidence that lactase persistence may have been part of an oligogenic adaptation for efficient calcium uptake and challenging the theory that its adaptive value lies only in facilitating caloric supplementation during times of scarcity. Finally, we detect selection on complex traits in all three periods, including selection favoring variants that reduce body weight in the Neolithic. 
In the Historical period, we detect selection favoring variants that increase risk for cardiovascular disease plausibly reflecting selection for a more active inflammatory response that would have been adaptive in the face of increased infectious disease exposure. Our results provide an evolutionary rationale for the high prevalence of these deadly diseases in modern societies today and highlight the unique power of ancient DNA in elucidating biological change that accompanied the profound cultural transformations of recent human history.
That is 347 words, with a Flesch-Kincaid readability grade of 20 and a SMOG of 20.9 (more on these scores in our future posts). Scientists working in the same field will not have any trouble understanding what the authors are trying to say. But how many readers, journalists, and specialists working in other fields will start reading the paper but never finish? How many smart brains won’t understand its importance because of unreadability fatigue*?
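For the curious, here is a minimal sketch of how these two scores are usually computed. The Flesch-Kincaid grade and SMOG formulas are the standard published ones, but the syllable counter is a crude vowel-group heuristic assumed for this sketch, and the function names are hypothetical. Real readability tools count syllables and sentences more carefully, so the numbers will likely differ somewhat from the ones quoted above.

```python
import math
import re


def count_syllables(word):
    """Rough syllable estimate: count vowel groups (an assumption, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("made"), except in "-le" and "-ee" endings ("table").
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def split_sentences(text):
    """Naive split on ., ! and ?; decimals such as 20.9 will be miscounted."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    """Alphabetic words, keeping internal hyphens and apostrophes; numbers are skipped."""
    return re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", text)


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade: 0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    sentences = split_sentences(text)
    words = split_words(text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def smog_index(text):
    """SMOG: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences = split_sentences(text)
    # Polysyllables are words with three or more syllables.
    polysyllables = sum(1 for w in split_words(text) if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

Both formulas punish the same two things: long sentences and long words. Paste an abstract into a string, call `flesch_kincaid_grade(abstract)` and `smog_index(abstract)`, and watch both scores drop as you split sentences and swap in shorter words.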
At unreadable, we believe that authors can tell their complex research stories using clear and effective language. So, we challenged ourselves to reduce the readability grade level to 12 and SMOG to under 15**. We also wanted to cut the length by at least 40%. Here’s what we did.
The fixes
Ancient DNA has revolutionized our understanding of human population history.
I always tell my authors to avoid overstatements. Exaggerations make the reader trust your writing less. Modern academic style is full of exaggerations, so many researchers don’t pay much attention to them. But for anyone else, overstatements signal overconfidence, narrow focus, and poor reliability. “Revolutionized” is an overstatement. Perhaps ancient DNA has improved our understanding of human history, but has it completely changed it in ways nobody thought possible? Probably not. Even if you feel like it did, this feeling is subjective.
However, its potential to examine how rapid cultural evolution to new lifestyles may have driven biological adaptation has not been met, largely due to limited sample sizes.
The sentence sounds lifeless and passive. Readers expect to see a subject doing something in the sentence. And did you notice how this sounds stronger and clearer than “the readers’ expectation is to see a subject doing something”?
It takes the reader half of the sentence to reach the verb “met,” but it is the most important part. By then, if my brain is tired, I would have already forgotten how the sentence started. The authors use “due to” to sound clever, but what they really mean to say is “because.”
So, split the sentence in two, and both halves will be much clearer. Use active voice, and get rid of the “due to’s.”
We assembled genome-wide data from 1,291 individuals from Europe over 10,000 years, providing a dataset that is large enough to resolve the timing of selection into the Neolithic, Bronze Age, and Historical periods. We identified 25 genetic loci with rapid changes in frequency during these periods, a majority of which were previously undetected. Signals specific to the Neolithic transition are associated with body weight, diet, and lipid metabolism-related phenotypes. They also include immune phenotypes, most notably a locus that confers immunity to Salmonella infection at a time when ancient Salmonella genomes have been shown to adapt to human hosts, thus providing a possible example of human-pathogen co-evolution. In the Bronze Age, selection signals are enriched near genes involved in pigmentation and immune-related traits, including at a key human protein interactor of SARS-CoV-2.
The authors keep doing more of what we have seen in the first two sentences. They overcrowd the sentences I marked in red. They choose complex verbs (“confer,” “identify”) instead of simpler ones. And they use passive voice (“are associated with,” “signals are enriched”) and nominalization (“with rapid changes in frequency”).
Only in the Historical period do the selection candidates we detect largely mirror previously-reported signals, highlighting how the statistical power of previous studies was limited to the last few millennia.
The authors overcrowded yet another sentence, and they chose “selection candidates” as their subject. “We detect” is a modifier here, not something that is doing active work. And it should be doing active work. Don’t turn active parts of the sentence into modifiers; this makes your prose sound weak.
The Historical period also has multiple signals associated with vitamin D binding, providing evidence that lactase persistence may have been part of an oligogenic adaptation for efficient calcium uptake and challenging the theory that its adaptive value lies only in facilitating caloric supplementation during times of scarcity.
I will let the reader decide if “The Historical period” is a good subject. Could the sentence also be overcluttered?
Finally, we detect selection on complex traits in all three periods, including selection favoring variants that reduce body weight in the Neolithic. In the Historical period, we detect selection favoring variants that increase risk for cardiovascular disease plausibly reflecting selection for a more active inflammatory response that would have been adaptive in the face of increased infectious disease exposure.
Scientists overuse the “including this and that” construction, although technically there is nothing wrong with it. I would still cut it. And I would split the second sentence in three.
Our results provide an evolutionary rationale for the high prevalence of these deadly diseases in modern societies today and highlight the unique power of ancient DNA in elucidating biological change that accompanied the profound cultural transformations of recent human history.
“Provide an evolutionary rationale” and “elucidate” are two fancy ways of saying “explain.” “Modern societies today” is the same as “modern societies.” “High prevalence” simply means that something is common. “Unique power” is another exaggeration.
What we came up with
Our edits are not perfect; they never are. At this point it would be up to the authors to tell us if we missed some important bits. But those changes would be trivial. What is important is that we cut the length by 60% and improved the readability grade to 12 and SMOG to 14. Take a look:
Ancient DNA studies have helped researchers understand humanity’s history. They can also help us study how culture has affected genetic selection. But these types of studies need large sample sizes. We collected genome-wide data from 1,291 European remains over 10,000 years. With this large data set, we can resolve the timing of selection in the Neolithic, Bronze Age, and Historical period. We found 25 genetic loci that rapidly changed their frequency in these periods. In the Neolithic transition, selection affected loci related to weight, diet, and lipid metabolism. We also detected signals related to immunity, and specifically, a locus that supports immunity against Salmonella. Selection in the Bronze Age influenced genes related to pigmentation and immunity. Our results in the Historical period are consistent with previously-reported selection patterns. Selection in this period affected genes related to vitamin-D binding and to heart disease. Our results help explain why these diseases are so common in modern societies. ⭐️
* Unreadability fatigue is that feeling you get when, after a long workday, you start reading a piece of prose but cannot understand what the authors are talking about. You have to re-read the same passage multiple times, and you forget how the sentence started by the time you reach the end.
** Hoping that the authors will appreciate our efforts.