Researchers have developed a new tool called 'xFakeSci' that can distinguish genuine scientific papers from those generated by AI chatbots, including ChatGPT. In tests that included 300 ChatGPT-generated articles alongside genuine research articles, xFakeSci detected up to 94% of the fake papers.
This performance is nearly double the success rate of traditional data-mining techniques. The tool, introduced by researchers from the State University of New York and Hefei University of Technology, is described in a study published in Scientific Reports.
To create xFakeSci, the team used two distinct datasets. One dataset comprised nearly 4,000 authentic scientific articles from PubMed, a comprehensive biomedical and life sciences database maintained by the US National Institutes of Health. The other dataset included 300 fake articles generated by ChatGPT, with 100 articles each on Alzheimer's disease, cancer, and depression.
Co-author Ahmed Abdeen Hamed explained that the aim was to identify patterns distinguishing AI-generated content from genuine research. The xFakeSci algorithm was trained on the authentic dataset and then tested on the fake articles. It achieved accuracy scores ranging from 80% to 94%, compared with 38% to 52% for traditional data-mining algorithms.
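As a rough illustration of how a per-topic detection rate like those reported above could be computed, the following Python sketch counts how many known-fake articles a detector flags in each topic. The function name `detection_rate_by_topic`, the record layout, and the toy outcomes are all assumptions made for illustration. They are not taken from the study, and the sketch does not reproduce the xFakeSci algorithm itself.

```python
from collections import defaultdict


def detection_rate_by_topic(records):
    """Return the share of known-fake articles flagged as fake, per topic.

    `records` is an iterable of (topic, flagged_as_fake) pairs, one per
    article that is known to be AI-generated. This layout is hypothetical.
    """
    totals = defaultdict(int)  # number of fake articles seen per topic
    caught = defaultdict(int)  # number of those the detector flagged
    for topic, flagged in records:
        totals[topic] += 1
        if flagged:
            caught[topic] += 1
    return {topic: caught[topic] / totals[topic] for topic in totals}


# Toy, made-up outcomes purely for demonstration (not the study's data).
toy_records = [
    ("cancer", True), ("cancer", True), ("cancer", False), ("cancer", True),
    ("depression", True), ("depression", False),
]
rates = detection_rate_by_topic(toy_records)
for topic, rate in rates.items():
    print(f"{topic}: {rate:.0%} of fake articles detected")
```

Reporting the rate separately for each topic, rather than as a single pooled figure, matches the way the article quotes a range of scores across Alzheimer's disease, cancer, and depression.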