Fake news detection by supervised learning
Fake news has attracted attention in recent times, raising questions about the degree of influence it can have on important activities in people’s lives. Efforts have been made to detect it, but they usually rely on human labour, which can be costly and time-consuming. Hence the need for automatic ways to infer whether a given article is true or false. The aim of this work is to study supervised learning techniques that automatically infer whether news is fake or not across multiple platforms and languages. We use six datasets from three different platforms (websites, Twitter and WhatsApp) in two languages (Portuguese and English). Two of these datasets are new and were collected during this study. We monitored fact-checking websites for fake news topics, then retrieved Twitter messages on those topics, alongside true content. We also downloaded the WhatsApp messages that these websites reported. Then, for every dataset, we extracted three sets of features (textual, bag-of-words and DCDistance) and fed them to classification algorithms to separate fake from true content. For the WhatsApp dataset we applied one-class classification (OCC)