A wish list for automation in academic writing
Artificial intelligence (AI) tools are now being widely explored to ease several aspects of publishing. Academic writing is one of them. In this post, Charlotte Baptista, an expert on language automation, shares her experience-based views on what an AI tool for academic writing should be able to achieve for it to be truly beneficial to researchers.
Academic writing is notoriously specialized. While the earliest scientific papers were written in the form of letters in the 1600s [1], the structure, tone, and writing practices have evolved to reach a certain level of standardization mostly within the last 50 years. The growth in academic knowledge itself accelerated greatly during this time, and academic writing norms developed to keep pace with this change. That is why writing a research paper can be a daunting task. Can automation help make academic writing easier? I believe it can, if it fulfils some critical requirements.
What are the challenges in building a tool of this sort?
Academic writing is complex
Research articles, which make up the largest proportion of the literature, commonly follow the IMRAD (Introduction-Methods-Results-And-Discussion) pattern. Different sections use a familiar narrative style and story, with statements and disclosures in their designated places. A formal academic tone is the norm, and authors usually comply with either UK or US English conventions. Word choice must convey ideas without ambiguity and avoid obvious repetition. The rules of verb tense are governed not only by time but by context: which section one happens to be working on, whether one is describing a commonly held truth, and so on.
Strict conventions are followed when representing abbreviations, units, mathematical notations, chemical symbols, species/genus names, and other entities. Citations may be styled as bracketed or superscripted numbers or written in author-date format and sequenced alphabetically or by year. The citations should further correspond with the reference list, and every element of the list needs to follow a journal-prescribed format or one of the widely used styles (APA, MLA, Harvard, Chicago, IEEE, etc.). Punctuation is deceptively simple; every innocuous comma serves a very specific need: clarity. Ultimately, every rule and convention in academic writing is designed to meet a single purpose: to communicate the research effectively enough that it is published in a journal of sufficient esteem and widely circulated and accessed.
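To make one of these conventions concrete, here is a minimal sketch of how a tool might check that numbered citations correspond with the reference list. It assumes bracketed numeric citations such as [3] or [1, 2] and a reference list numbered from 1; the function name `cross_check_citations` is hypothetical, and a real tool would also need to handle ranges, superscripts, and author-date styles.

```python
import re


def cross_check_citations(body, reference_count):
    """Compare bracketed numeric citations in `body` with a reference
    list assumed to be numbered 1..reference_count."""
    cited = set()
    # Match [3] as well as comma-separated groups such as [1, 2].
    for group in re.findall(r"\[(\d+(?:\s*,\s*\d+)*)\]", body):
        cited.update(int(number) for number in group.split(","))

    listed = set(range(1, reference_count + 1))
    return {
        # References that appear in the list but are never cited.
        "uncited": sorted(listed - cited),
        # Citations that point to no entry in the list.
        "missing": sorted(cited - listed),
    }


report = cross_check_citations(
    "Prior work [1, 3] disagrees with a later study [5].",
    reference_count=4,
)
print(report)
```

In this example, references 2 and 4 are never cited and citation 5 has no matching entry, so both problems are reported.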
Replicating human expertise is difficult
These writing prescriptions are covered by academic and journal style guides that often run into hundreds of pages. Seasoned researchers or copyeditors who have spent enough time polishing manuscripts develop a sixth sense for these rules and conventions, applying them instinctively for the most part.
The task of building a machine to handle this level of complexity then becomes truly a superhuman effort. It isn’t surprising that most grammar checkers on the market today fail to do what they advertise.
Nonetheless, there is immense value in leveraging AI and NLP technologies to build automated solutions that can add to the researcher’s arsenal. Broadly speaking, below are three key considerations that makers of such technologies should look into: a wish list of sorts for an AI-based writing aid tool.
- Intersection of AI with subject matter expertise
Language-check solutions are developed using AI/ML techniques and/or NLP rules. A deep-learning-based grammar correction model “learns” from “ground truth” or real-world data (the “input”) and attempts to replicate that data when making language corrections (the “output”). This type of deep-learning model is a black box, meaning that the reasons why the machine arrived at a certain prediction are unknown, unlike other algorithms that are explicitly programmed to deliver a specific outcome. As a result, the output of black-box solutions is sometimes not clean enough to be directly usable. On the other hand, pure rule-based systems work by producing a specific output whenever a predefined set of conditions is fulfilled. These systems cover a limited range of errors and tend to fail often because exceptions and edge cases are difficult to solve for.
However, if these solutions are designed in conjunction with professional editors who infuse them with subject matter expertise, they can overcome these disadvantages and produce strikingly human-like results in terms of the quality and quantity of suggestions.
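The edge-case problem with pure rule-based systems can be illustrated with a deliberately naive sketch. The rule below (a hypothetical `article_rule`, not taken from any real product) replaces “a” with “an” before a word that starts with a vowel letter. It fixes the common case but gets words such as “unique” wrong, because the real convention depends on pronunciation rather than spelling.

```python
import re


def article_rule(sentence):
    """Naive rule: change 'a' to 'an' before a word that starts with a
    vowel letter. Spelling is only a proxy for the vowel sound."""
    return re.sub(r"\ba (?=[aeiouAEIOU])", "an ", sentence)


# The rule works for the case it was written for.
print(article_rule("We ran a experiment."))

# It fails on an exception: "unique" starts with a consonant sound,
# so the rule introduces an error into a correct sentence.
print(article_rule("It is a unique result."))
```

Every exception like this needs another hand-written rule, which is where an editor’s subject matter expertise becomes valuable in deciding which rules are worth encoding.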
- Ability to “understand” the academic context
Skeptical researchers might dislike the prospect of leaving a manuscript they’ve spent months working on in the hands of a machine. Will the AI mess up my structure? Will it be able to understand the terminology? Will the hours I spent organizing references go to waste?
To be truly suited to scientific writing, the AI solution should be enriched with a layer of intelligence that helps it understand how to approach an academic paper. For example, the AI should understand whether it is dealing with a research article, a case report, or just a general essay and offer suggestions designed for that context. This is because the rules of structure, tense, etc., vary across these paper types.
- Feature-rich, time-saving solutions
Some researchers may object to automated tools if they feel these do not offer the breadth of functionality required for their research-writing needs. AI writing automation should meaningfully direct a busy researcher’s attention to where it is needed and allow them to save time wherever possible.
Beyond grammar and spelling, the tool could provide discovery mechanisms to allow authors to verify ideas in the published literature. At a glance, it could indicate if the stipulated word counts have been met, if all abbreviations are defined, or if the necessary disclosure statements are provided. The ability to recommend suitable journals for submission would also be incredibly handy.
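The at-a-glance checks described above could look something like the following sketch. It rests on simplifying assumptions: an abbreviation is any run of two or more capital letters, it counts as defined if it appears once in parentheses, and a disclosure statement is detected by a plain phrase search. The function name `check_manuscript` and its parameters are hypothetical.

```python
import re


def check_manuscript(text, max_words=250,
                     required_statements=("conflict of interest",)):
    """Return a list of human-readable warnings for a draft."""
    warnings = []

    # Word count: whitespace-separated tokens.
    word_count = len(text.split())
    if word_count > max_words:
        warnings.append(
            f"Word count {word_count} exceeds limit of {max_words}.")

    # Abbreviations: runs of 2+ capitals; defined if seen as "(ABC)".
    abbreviations = set(re.findall(r"\b[A-Z]{2,}\b", text))
    defined = set(re.findall(r"\(([A-Z]{2,})\)", text))
    for abbreviation in sorted(abbreviations - defined):
        warnings.append(
            f"Abbreviation '{abbreviation}' is never defined.")

    # Disclosure statements: case-insensitive phrase search.
    lowered = text.lower()
    for phrase in required_statements:
        if phrase not in lowered:
            warnings.append(f"Missing statement: '{phrase}'.")

    return warnings


draft = ("Natural language processing (NLP) tools help authors. "
         "We trained an SVM classifier on NLP features.")
for warning in check_manuscript(draft):
    print(warning)
```

For this draft, the sketch flags “SVM” as undefined and notes the missing conflict of interest statement, while “NLP” passes because it is introduced in parentheses.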
The golden age of AI
Publishing an academic article is a lengthy process involving multiple revisions before a researcher can arrive at a high-quality paper. AI tools can help make this process shorter by helping researchers catch issues before submission. Tasks like proofreading and reference formatting, which would ordinarily take hours to complete, can already be accomplished in minutes.
If designed thoughtfully, AI-based automated solutions tailored to academic contexts can have a massive competitive edge over generic grammar checkers. There’s a fair bit of ground to cover and makers of these solutions certainly have their work cut out.
AI tools have already made a huge impact in the industry, with the emergence of big data converging with massive advances in computing power and data analytics. I’m confident it won’t be long before AI tools for science writing deliver on their promise.
Reference
1. Swoger, B. The (mostly true) origins of the scientific journal. Scientific American Blog Network https://blogs.scientificamerican.com/information-culture/the-mostly-true-origins-of-the-scientific-journal/ (2012).