Data sleuths and their role in maintaining research integrity
The true crime genre has enjoyed a huge resurgence in recent years, with everything from big-budget Netflix productions to amateur podcasts getting millions of monthly streams. There is something about picking through a hidden trail of clues, uncovering the truth, and delivering justice that resonates with people, and investigators in these cases often work tirelessly to gain breakthroughs.
While their work and motivations are different from those of police detectives and investigative journalists, data sleuths have started to gain some attention for their often-thankless efforts to ensure that research is done both ethically and legally.
Meet the data sleuths
The “data sleuth” is a relatively new role. Although one goal of peer review is to verify whether research data are plausible, few people traditionally made a career out of investigating research data and processes to safeguard scientific integrity. However, numerous high-profile scandals around the world have rocked public confidence in science, and the replicability crisis is an ugly truth about modern science [1]. While many researchers carry out their work with passion and integrity, some resort to questionable ways of producing their results. As a result, data sleuths are increasingly valuable voices in guiding ethical and sustainable research practices [2].
Data sleuths draw on a special skill set, including digital forensics, subject matter knowledge, and contact with whistleblowers, to identify potential signs of research misconduct and bring them to the attention of interested parties before they can damage research and the scholarly record.
What data sleuths look for
Image manipulation
Image manipulation ranges from relatively “innocent” adjustments to alterations that facilitate massive academic fraud. Using digital forensics tools, EXIF data, and a keen eye, a data sleuth can spot many clues suggesting potentially unethical image manipulation [3].
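One clue a sleuth may look for is a region that appears twice within the same figure. As a rough illustration only, the sketch below searches a grayscale image, represented as a plain list of pixel rows, for pixel-identical blocks. The function name `find_duplicate_blocks`, the block size, and the synthetic test image are all assumptions made for this example and do not describe any particular forensic tool.

```python
from collections import defaultdict

def find_duplicate_blocks(image, block=4):
    """Return groups of top-left (row, col) positions whose blocks are identical.

    `image` is a 2D list of grayscale integers. Hypothetical helper for
    illustration: it only catches exact, unrotated, unscaled copies.
    """
    height, width = len(image), len(image[0])
    seen = defaultdict(list)
    for top in range(height - block + 1):
        for left in range(width - block + 1):
            patch = tuple(
                tuple(image[top + dy][left:left + block]) for dy in range(block)
            )
            flat = [value for row in patch for value in row]
            if max(flat) == min(flat):
                continue  # ignore flat background, which repeats innocently
            seen[patch].append((top, left))
    return [positions for positions in seen.values() if len(positions) > 1]

# Synthetic 16x16 image in which every pixel value is unique...
image = [[y * 16 + x for x in range(16)] for y in range(16)]
# ...then a 4x4 region is copied from (1, 1) to (9, 9) to mimic a cloned patch.
for dy in range(4):
    for dx in range(4):
        image[9 + dy][9 + dx] = image[1 + dy][1 + dx]

matches = find_duplicate_blocks(image)
print(matches)
```

An exact-match check like this would miss copies that were rescaled, rotated, or recompressed, so a match is a prompt for closer inspection by a person rather than proof of misconduct.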
Paper mills
An academic paper mill offers authorship positions on papers detailing dubious or faked research, which are often then published in predatory journals. In a “publish or perish” paradigm, some researchers are seduced by an easy way of increasing their publication count, but such paper mills do little more than steal money and potentially fund more criminal enterprises preying on researchers, especially those in the early or middle stages of their careers.
Data sleuths have identified and stopped several paper mills by looking for odd patterns of authorship and collaboration. One example is a Russian paper mill that was detected and publicized [4].
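To make “odd patterns of authorship” concrete, here is a minimal sketch of one possible heuristic: flag papers whose authors all come from different institutions and have never appeared together on any other paper in the data set. The heuristic, the function name `flag_unusual_author_lists`, the record layout, and the sample records are all hypothetical; they are not taken from any real investigation.

```python
from collections import Counter
from itertools import combinations

def flag_unusual_author_lists(papers, min_authors=4):
    """Flag titles whose author lists look unusually disconnected.

    Assumed record layout: {"title": str, "authors": [(name, affiliation), ...]}.
    """
    # Count how often each pair of authors appears together across all papers.
    pair_counts = Counter()
    for paper in papers:
        names = sorted({name for name, _ in paper["authors"]})
        pair_counts.update(combinations(names, 2))

    flagged = []
    for paper in papers:
        names = sorted({name for name, _ in paper["authors"]})
        affiliations = {affiliation for _, affiliation in paper["authors"]}
        # True when no pair of co-authors shares any other paper in the data.
        never_together_elsewhere = all(
            pair_counts[pair] == 1 for pair in combinations(names, 2)
        )
        if (
            len(names) >= min_authors
            and len(affiliations) == len(names)
            and never_together_elsewhere
        ):
            flagged.append(paper["title"])
    return flagged

# Made-up records purely for demonstration.
papers = [
    {"title": "Paper A", "authors": [("A1", "U1"), ("A2", "U2"), ("A3", "U3"), ("A4", "U4")]},
    {"title": "Paper B", "authors": [("B1", "LabX"), ("B2", "LabX"), ("B3", "LabY"), ("B4", "LabY")]},
    {"title": "Paper C", "authors": [("B1", "LabX"), ("B2", "LabX"), ("C1", "LabZ"), ("C2", "LabW")]},
]
flagged = flag_unusual_author_lists(papers)
print(flagged)
```

Legitimate first-time collaborations would also trip a rule like this, so a flag is only a reason to look more closely, never evidence of a paper mill on its own.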
Fraudulent peer review
Reputable journals usually have a robust peer review process, in which submitted manuscripts undergo thorough evaluation by experts trusted to be ethical and impartial. Many predatory journals, by contrast, run a fraudulent peer review system in which nearly any paper can pass. Spotting unreliable or non-existent peer review from the quality of the research a journal publishes can be difficult. However, individuals have readily exposed such practices by submitting a bogus paper to a suspected predatory journal and observing how its peer review process handles the submission.
Inaccurate, flawed, or faked data
Much like images, data can be “massaged” to produce the desired results, as in “p-hacking.” This can include omitting or editing data and drawing spurious connections between variables. While manipulated data are often not easily discoverable, statistical knowledge can help data sleuths identify suspicious patterns in data [5] and make informed judgments about the quality of the information.
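As one hedged example of such a statistical check, the sketch below implements a simple caliper test: it counts reported p-values falling just below and just above the 0.05 threshold and uses a one-sided binomial test to ask whether the excess just below the threshold is larger than an even split would produce. This is a simplified illustration, not the procedure developed in reference 5; the function name `caliper_test`, the window width, and the sample p-values are assumptions.

```python
from math import comb

def caliper_test(p_values, threshold=0.05, width=0.01):
    """Compare counts of p-values just below and just above a threshold.

    Returns (below, above, tail), where tail is the one-sided binomial
    probability of seeing at least `below` of the n = below + above values
    under the threshold if each side were equally likely.
    """
    below = sum(1 for p in p_values if threshold - width <= p < threshold)
    above = sum(1 for p in p_values if threshold <= p < threshold + width)
    n = below + above
    if n == 0:
        return below, above, 1.0  # nothing near the threshold to compare
    tail = sum(comb(n, k) for k in range(below, n + 1)) / 2 ** n
    return below, above, tail

# Made-up p-values clustered just under 0.05, for demonstration only.
reported = [0.041, 0.043, 0.045, 0.046, 0.047, 0.048, 0.049, 0.049, 0.052, 0.20, 0.60]
below, above, tail = caliper_test(reported)
print(below, above, round(tail, 4))
```

The even-split assumption is itself an approximation: genuine effects tend to produce somewhat more p-values below any threshold than above it, so a small tail probability is a signal to investigate rather than a verdict.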
Undeclared conflicts of interest
Virtually all research journals include a conflicts of interest section for articles. This section usually operates on an honor system, with researchers trusted to provide an accurate summary of any financial connections that may influence their objectivity toward their research. As a result, researchers can claim to have no conflicts of interest, and a false claim will usually go undetected. Failure to disclose such information can bring the integrity of research into question, particularly when there is a profit motive.
A surprising amount of information is public, ranging from patent documents to shareholder meeting minutes. A determined sleuth with a search engine can uncover such information with ease.
How sleuths are maintaining research integrity
In 2005, an enormous national scandal broke in South Korea when investigative journalists for MBC, a television broadcaster in the country, uncovered a huge range of ethical violations and false data in what came to be dubbed “The Hwang Affair” [6]. Hwang Woo-Suk was South Korea’s most famous scientist, a household name and national hero. His research into somatic cell nuclear transfer promised to realize human cloning and the production of safe pluripotent stem cells, which were promoted as a possible miracle cure for many diseases. The journalists’ investigation, with the aid of whistleblowers, found evidence of research fraud, intimidation, embezzlement, and money laundering. Despite this, MBC faced an uphill battle, with members of the public and government trying to silence its reports. However, the sleuths at MBC were vindicated when the full range of Hwang’s violations became clear.
Between 2009 and 2014, an anonymous message-board user named “11jigen” created blogs in English [7] and Japanese, as well as a YouTube channel [8], highlighting image manipulation in 24 papers published by Shigeaki Kato, a professor at the University of Tokyo. These allegations sent shockwaves through the field; from 2011 to 2012, Kato’s publications were successively withdrawn or corrected until he resigned in 2012. After a lengthy investigation, the University of Tokyo identified fraud in 33 of 165 published papers [9]. Although the culprit behind this fraud has never been conclusively identified, Kato’s career has never fully recovered.
Elisabeth Bik, arguably the most visible data sleuth today, has written extensively on research misconduct. She was one of the major figures in raising questions about the research of Lesné et al. and in exposing an extensive paper mill based in China. On her Twitter account [10], she often shares the latest findings in scientific integrity, and she is highly active on PubPeer [11]. She played a major role in uncovering the potentially fraudulent findings on Aβ*56 that shocked dementia research last year [12].
Key takeaways for you as a researcher
Stay up to date on crucial discussions related to ethical problems in your field
As with Wikipedia and many open-source software initiatives, people volunteering their time on platforms like PubPeer can do amazing things. When conducting a literature review, looking up the post-publication comments on your papers of interest on such platforms can provide important information on whether the research is reliable.
The biggest problem posed by fraudulent research is that it can get into published literature and influence future work, potentially wasting large amounts of funding and setting back progress in a field by years. Being alert to early warning signs put out by data sleuths can save you time, effort, and money. Moreover, following the work of data sleuths may help you stay alert and become better at recognizing problematic research. This can prove useful in identifying potential issues either when you collaborate with other researchers or when you conduct peer reviews.
Data sleuthing is becoming a rewarding career
Research integrity consulting is a new profession in science, and many organizations and publishers now use such consultants to verify that findings are plausible and properly reported. If you are interested in such a career, the information that data sleuths freely put online is invaluable for training yourself to recognize the common forms of misconduct employed by bad-faith actors.
References
1. Wingen, T. How to start a replication crisis. Nat. Rev. Psychol. 1, 317–317 (2022).
2. Parker, L., Boughton, S., Lawrence, R. & Bero, L. Experts identified warning signs of fraudulent research: a qualitative study to inform a screening tool. J. Clin. Epidemiol. 151, 1–17 (2022).
3. How a Sharp-Eyed Scientist Became Biology’s Image Detective | The New Yorker. https://www.newyorker.com/science/elements/how-a-sharp-eyed-scientist-became-biologys-image-detective.
4. Russian site peddles paper authorship in reputable journals for up to $5000 a pop. https://www.science.org/content/article/russian-website-peddles-authorships-linked-reputable-journals.
5. Elliott, G., Kudrin, N. & Wüthrich, K. Detecting p-Hacking. Econometrica 90, 887–906 (2022).
6. Kim, J. Public feeling for science: The Hwang affair and Hwang supporters. Public Underst. Sci. 18, 670–686 (2009).
7. 11jigen. Shigeaki Kato (the University of Tokyo): SUMMARY of alleged image manipulation. Shigeaki Kato (the University of Tokyo) https://katolab-imagefraud.blogspot.com/2012/01/summary-of-alleged-image-manipulation.html (2012).
8. Whistleblower Uses YouTube to Assert Claims of Scientific Misconduct | Science | AAAS. https://www.science.org/content/article/whistleblower-uses-youtube-assert-claims-scientific-misconduct.
9. 記者会見「東京大学分子細胞生物学研究所・旧加藤研究室における論文不正に関する調査報告( 最終 )」の実施について. 東京大学 https://www.u-tokyo.ac.jp/focus/ja/press/p01_261226.html.
10. Elisabeth Bik is in New York (@MicrobiomDigest) / Twitter. Twitter https://twitter.com/MicrobiomDigest (2023).
11. PubPeer - Search publications and join the conversation. https://pubpeer.com/search?q=lesne.
12. Faked Beta-Amyloid Data. What Does It Mean? https://www.science.org/content/blog-post/faked-beta-amyloid-data-what-does-it-mean.