CVP Creates Automated Fake News Detector That Is Over 90% Accurate
Fairfax, Virginia – August 19, 2019 – CVP, a leading business and technology consulting company, has developed an automated fake news detector that can predict whether an article is likely to be real or fake news in under one second.
Fake news is a contentious and growing problem across news and social media. CVP’s team of over 40 data scientists wanted to investigate whether artificial intelligence (AI) could help address it. The company started with a set of 7,000 news articles, half from the mainstream media and half from known purveyors of fake news, and organized the articles into a database suitable for machine learning. After using Natural Language Processing (NLP) to clean up the data and perform tasks like excluding common words, CVP trained several open-source machine learning algorithms and created a model that was over 90% accurate in identifying articles from real vs. fake sources.
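The workflow described above (clean the text, exclude common words, train a classifier on labeled articles) can be sketched in a few lines of Python. This is a minimal illustration only, not CVP's code: it swaps in a small hand-written multinomial Naive Bayes classifier, and the stop-word list, the four example sentences, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Tiny hand-written stop-word list (an assumption; NLP libraries ship far longer ones).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "that", "this", "on", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return [w for w in cleaned.split() if w not in STOP_WORDS]


def train(docs, labels):
    """Count words per label for a multinomial Naive Bayes model."""
    word_counts = {label: Counter() for label in set(labels)}
    doc_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((counts[w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training set; the real project used 7,000 labeled articles.
docs = [
    "The president said the budget would pass, officials said.",
    "The senator said in a statement that talks continue.",
    "Share this article now before it is deleted!",
    "You must share this shocking article with everyone.",
]
labels = ["real", "real", "fake", "fake"]

model = train(docs, labels)
print(predict(model, "A spokesman said the president will visit."))
print(predict(model, "Please share this article"))
```

With only four training sentences the predictions simply follow the word counts; any accuracy figure would require a held-out test set drawn from a real corpus.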
“I was actually kind of floored that it was so accurate after just a couple runs,” said Cal Zemelman, CVP’s Director of Data Science & Engineering. “I figured this was a complex problem that wouldn’t be straightforward for the model to pick up on, but it turned out I was pessimistic.” Using a technique called “explainable AI,” CVP’s data scientists applied a library called SHAP to the machine learning model to explain why it made the decisions it did.
The biggest factor that the model keyed in on was that fake news writers tend to state opinions as facts and don’t bother to quote people or attribute statements to them. The explanations told CVP that the absence of the word “said” was a huge indicator of fake news because the authors rarely seemed to write about who said what. The company found that other terms such as “president” were generally correlated with real news, while the words “share” and “article” were associated with fake news, likely because fake news authors want to ensure their message is widely shared on social media.
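To illustrate how a per-word explanation of this kind can be computed, the sketch below uses a known special case: for a linear model whose features are treated as independent, the SHAP value of a feature is its weight times the difference between its value in the article and its average value over a background set. The code does not call the SHAP library, and the weights, texts, and function names are invented for illustration; they are not taken from CVP's model.

```python
# Hypothetical weights of a linear "fake news" score (positive = more fake-like).
# These numbers are invented for illustration and are not CVP's results.
WEIGHTS = {"said": -2.0, "president": -0.5, "share": 1.0, "article": 0.75}


def word_frequencies(text):
    """Relative frequency of each tracked word in the text."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    words = cleaned.split()
    return {w: words.count(w) / len(words) for w in WEIGHTS}


def mean_frequencies(background_texts):
    """Average feature values over a background set (the baseline)."""
    freqs = [word_frequencies(t) for t in background_texts]
    return {w: sum(f[w] for f in freqs) / len(freqs) for w in WEIGHTS}


def score(features):
    """Linear model output: weighted sum of the feature values."""
    return sum(WEIGHTS[w] * features[w] for w in WEIGHTS)


def linear_contributions(text, baseline):
    """Per-word contribution: weight * (value in text - baseline value)."""
    x = word_frequencies(text)
    return {w: WEIGHTS[w] * (x[w] - baseline[w]) for w in WEIGHTS}


# Hypothetical background articles and one article to explain.
background = [
    "The president said talks would continue",
    "Share this article with friends",
]
baseline = mean_frequencies(background)
contributions = linear_contributions("Share this article, share it now", baseline)

for word, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{word:>10}: {value:+.3f}")
```

In this toy setup the example article never uses the word “said,” so that word receives a positive (fake-leaning) contribution even though it is absent, which mirrors the kind of finding described above. The contributions also add up to the difference between the article's score and the baseline score.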
CVP has released the source code for the application on GitHub, a social coding platform, so that anyone can see how the solution was created and build their own version.
The source code is available on GitHub at: https://github.com/CVPcorp/example-notebooks/blob/master/Real%20vs%20Fake%20News.ipynb
CVP is a business and technology consulting company that helps organizations navigate disruption with innovative strategies and solutions and prepare for a culture of Continuous Change. It supports clients in the healthcare, national security, and public sectors, as well as private business, by enabling them to innovate faster and make decisions quicker.