Detecting Fake News with a BERT Model
In a prior blog post, Using AI to Automate Detection of Fake News, we showed how CVP used open-source tools to build a machine learning model that could predict, with over 90% accuracy, whether an article was real or fake news. The field of Artificial Intelligence (AI) is changing rapidly, and the CVP Data Science Team was interested in whether it could improve the model's accuracy. New advancements like BERT (Bidirectional Encoder Representations from Transformers) have generated much attention in the Natural Language Processing (NLP) realm over the past year. BERT is a machine learning architecture that takes an entire phrase or sentence as input, allowing the model to learn word meanings based on context and relation to the other words in the text. Furthermore, a BERT model is pre-trained on an enormous plain-text corpus (e.g., the entire English Wikipedia), so it requires far less task-specific training and development than models built from scratch.
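The contextual behavior described above can be illustrated with a toy example. In a static (bag-of-words) representation, a word like "bank" gets the same vector in every sentence; a self-attention layer, the core building block of BERT, mixes in the surrounding words, so the same word ends up with a different representation in different contexts. The sketch below is a deliberately simplified, single-head self-attention with random made-up embeddings, not BERT itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary with random static embeddings (stand-ins for learned ones).
vocab = ["river", "bank", "money", "deposit", "flooded", "the"]
emb = {w: rng.normal(size=4) for w in vocab}

def self_attention(tokens):
    """Single-head self-attention: each token's output is a
    context-weighted average of all token embeddings in the sentence."""
    X = np.stack([emb[t] for t in tokens])          # (seq_len, dim)
    scores = X @ X.T / np.sqrt(X.shape[1])          # pairwise similarity
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over the sequence
    return weights @ X                              # contextual vectors

s1 = ["the", "river", "bank", "flooded"]
s2 = ["the", "money", "bank", "deposit"]

# The static embedding of "bank" is identical in both sentences...
static_same = np.allclose(emb["bank"], emb["bank"])
# ...but its contextual vector differs, because its neighbors differ.
ctx1 = self_attention(s1)[2]
ctx2 = self_attention(s2)[2]
contextual_differ = not np.allclose(ctx1, ctx2)
print(static_same, contextual_differ)  # True True
```

This is the intuition behind "bidirectional": every token attends to every other token in the sequence, both left and right, so context on either side shapes the representation.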
CVP revisited the dataset used and discussed in the prior blog post. The data consisted of 7,000 news articles, of which half were from the mainstream media and half were from known purveyors of fake news. NLP methods were used to tokenize the sentences into the input format the BERT model expects. Once trained, the BERT model was able to identify articles from real vs. fake sources with more than 96% accuracy, a six-percentage-point improvement over the previous gradient boosting model, with no additional data required!
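For readers curious what "a format suitable for the BERT model" looks like: each article is wrapped with special [CLS] and [SEP] tokens, mapped to integer IDs from the model's vocabulary, padded to a fixed length, and paired with an attention mask so the model ignores the padding. The sketch below uses a tiny made-up vocabulary for illustration; a real pipeline would use a pretrained WordPiece tokenizer such as Hugging Face's BertTokenizer:

```python
# Illustrative only: the vocabulary and IDs below are invented.
TOY_VOCAB = {"[PAD]": 0, "[CLS]": 101, "[SEP]": 102, "[UNK]": 100,
             "this": 1, "article": 2, "is": 3, "fake": 4, "real": 5}

def encode(words, max_len=8):
    """Wrap tokens with [CLS]/[SEP], map to IDs, pad, and build a mask."""
    tokens = ["[CLS]"] + words[: max_len - 2] + ["[SEP]"]
    ids = [TOY_VOCAB.get(t, TOY_VOCAB["[UNK]"]) for t in tokens]
    mask = [1] * len(ids)                   # 1 = real token, 0 = padding
    pad = max_len - len(ids)
    return ids + [0] * pad, mask + [0] * pad

ids, mask = encode(["this", "article", "is", "fake"])
print(ids)   # [101, 1, 2, 3, 4, 102, 0, 0]
print(mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

The ID sequences and masks are what actually get fed to the model; the classifier head then reads the output at the [CLS] position to predict real vs. fake.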
Like before, CVP has released the source code for the application on GitHub, a social coding platform where anyone can see how the solution was created and make their own version: (link).
These results are incredibly promising and a testament to the innovation that has come about in the field of NLP. Unsurprisingly, the CVP Data Science Team is already using tools like BERT to improve NLP results for our clients. While CVP is not directly involved in predictive analytics with news data, these tools are problem-agnostic and can be applied to a wide variety of NLP tasks. As an example, CVP is currently working with a government client to predict injuries at certain site locations. The input data for this task contains numerous instances of free-form text. A BERT model lets us leverage that free-form text directly, learning the context of the words to inform its predictions.