Discover how xFakeSci, a cutting-edge algorithm, detects fake scientific papers, significantly improving the authenticity of articles.
When ChatGPT and other generative artificial intelligence can generate scientific publications that appear genuine, especially to those outside the field of research, how can you know which ones are fake? A machine-learning program named xFakeSci was developed by Ahmed Abdeen Hamed, a visiting research fellow at Binghamton University, State University of New York. It can identify up to 94% of fraudulent papers, which is almost twice as successful as more conventional data-mining methods (1).
Hamed and collaborator Xindong Wu published a paper in Scientific Reports in which they generated fake articles on Alzheimer's, cancer, and depression and compared them with genuine articles on the same topics. "My primary research area is biomedical informatics, but because I work with medical publications, clinical trials, online resources, and social media mining, I'm always concerned about the authenticity of the information being disseminated," said Hamed. "Biomedical articles in particular were hit badly during the global pandemic because some people were publicizing false research."
Hamed said when he asked ChatGPT for the AI-generated papers, “I tried to use the same keywords that I used to extract the literature from the [National Institutes of Health’s] PubMed database, so we would have a common basis of comparison. My intuition was that there must be a pattern exhibited in the fake world versus the actual world, but I had no idea what this pattern was.”
Features Analyzed by xFakeSci
xFakeSci analyzes two main features in the papers:
- Bigrams: These are pairs of words that frequently appear together, such as "climate change" or "clinical trials." The algorithm found that fake papers had fewer bigrams, and the ones present were more interconnected, while genuine papers displayed a richer variety of bigrams.
- Connectivity: The algorithm also assessed how bigrams were linked to other words and concepts in the text. Real papers exhibited a more complex network of these connections than fake ones.
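To make the two features concrete, here is a minimal Python sketch of how bigram variety and connectivity could be measured for a single text. It is not the xFakeSci algorithm itself: the function names, the `min_count` threshold, and the "largest component share" measure are assumptions chosen for illustration, and the published method may compute these features differently.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_counts(tokens):
    """Count each pair of adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def largest_component_share(bigrams):
    """Treat words as nodes and bigrams as undirected edges, then return
    the fraction of words that sit in the largest connected component."""
    neighbors = defaultdict(set)
    for left, right in bigrams:
        neighbors[left].add(right)
        neighbors[right].add(left)

    seen, largest = set(), 0
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk over one connected component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        largest = max(largest, size)
    return largest / len(neighbors) if neighbors else 0.0


def profile(text, min_count=2):
    """Summarize a text by its recurring bigrams (hypothetical helper).

    min_count is an assumed cutoff for "frequently appear together";
    with min_count=1 the word graph of one text is always a single chain.
    """
    tokens = tokenize(text)
    counts = bigram_counts(tokens)
    kept = [pair for pair, n in counts.items() if n >= min_count]
    return {
        "words": len(tokens),
        "distinct_bigrams": len(kept),
        "bigrams_per_word": len(kept) / len(tokens) if tokens else 0.0,
        "largest_component_share": largest_component_share(kept),
    }


if __name__ == "__main__":
    sample = (
        "clinical trials show results. clinical trials need funding. "
        "climate change matters. climate change matters."
    )
    print(profile(sample))
```

Under these assumptions, a text with few recurring bigrams that mostly fall into one tightly linked cluster would score low on `distinct_bigrams` and high on `largest_component_share`, which is the direction the article describes for the generated papers. Real use would need many documents per class and a proper classifier on top of such features.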
ChatGPT vs Original Article
“Because ChatGPT is still limited in its knowledge, it tries to convince you by using the most significant words,” Hamed said. “It is not the job of a scientist to make a convincing argument to you. A real research paper honestly reports what happened during an experiment and the method used. ChatGPT is about depth on a single point, while real science is about breadth.”
“We are always going to be playing catchup if we don’t design something comprehensive,” he said. “We have a lot of work to look for a general pattern or universal algorithm that does not depend on which version of generative AI is used.”
Reference:
- Detection of ChatGPT fake science with the xFakeSci learning algorithm - (https://www.nature.com/articles/s41598-024-66784-6)
Source: EurekAlert