Recognizing a Fake News Source Using Artificial Intelligence in One Click

52% of North Americans have encountered fake news. You probably have too.

Since the elections of the Romney and Trump eras, fake news has grown into such a problem that experts now refer to it as an “infodemic”. With 40% of information on the internet being untrue, people are increasingly prone to falling for misinformation and spreading it to others. It’s no secret that thousands, if not millions, of articles containing disinformation were published during the 2016 presidential election. Shocking claims like “Pope Francis Endorses Donald Trump for President” and “Masks don’t help fight COVID” are becoming more common in our media. With the 2020 elections just around the corner and a worldwide pandemic on our hands, our community’s media awareness is incredibly important.

Why Fake News Is a Serious Issue

A significant share of fake news in the modern age stems from social media, where algorithmic newsfeeds rank content by engagement. Twitter users, for example, are 70% more likely to retweet fake news than true stories, according to a study out of the MIT Sloan School of Management.

The reliability of news sources on the internet, in politics as well as in other fields, influences people’s worldview and can have serious consequences for politics, society, and public health, all of which play a large part in people’s lives.

Fake news manipulates readers by including misleading graphs and fake quotes to “improve” its credibility. This makes it increasingly difficult to tell truth from falsehood in the 21st century, and it leads people to look down on one another’s morals, visions, or thoughts. Articles with such information can lead people to make decisions based on biased or false information. Fake news sources cause misinformed choices, which can ultimately be dangerous.

Impacts of Fake News

Coming across unreliable news, unfortunately, cannot be helped. In today’s age, it has become almost inevitable. But people’s awareness of, and skepticism toward, the messages in an article should be encouraged; humanity depends on it. The fact that 45% of North Americans reported not being able to recognize fake news is alarming (citation).

Consequences in Politics and the Electoral Process

The inability to recognize fake news has led to misinformed decisions with real-world consequences. For example, people may cast a misinformed vote because they are unaware of the true policies of the government they are electing. This in turn results in an inefficient electoral process, ultimately undermining a country’s confidence in its own government.

Consequences in One’s Beliefs

Additionally, fake news sources can lead to prejudice and discrimination against marginalized groups in society, inciting negative reactions and, in some cases, violence. Rapidly spreading misinformation has influenced people to adopt beliefs such as anti-vaccine and anti-mask views, which directly contradict the guidance and scientific information relayed by public health officials. These beliefs become a serious risk to the health of individuals and of the population. Sources of fake news crowd the internet, making it difficult for people to find valid information on a topic. It’s easy to begin to feel hopeless amid the increasing spread of fake news and its devastating consequences.

The Ideal Solution Using Artificial Intelligence

To address the growing “infodemic” of our age, we created Credibly.

Credibly provides information on the facts presented and the authors of news articles and tweets.

This allows users the opportunity to obtain their information on important issues mindfully, in turn encouraging them to act according to verifiable facts rather than untrustworthy sources. Let’s break it down.

  1. Credibly uses Natural Language Processing (NLP) to extract information from raw text. It first scrapes the web using Python to retrieve the HTML content of the webpage. The text is then cleaned and preprocessed with standard steps of the NLP pipeline: expanding contractions, removing special characters, stemming to get the root word, removing stopwords such as “like” or “and”, and part-of-speech tagging, using the nltk and spacy Python libraries.
  2. Sentiment Analysis is used to detect emotionally charged vocabulary, since biased or persuasive news articles often use specifically chosen words to create strong feelings and differences of opinion.
  • Sentiment Analysis combines NLP and Machine Learning to analyze a piece of text and classify its words, sentences, or paragraphs as having positive, negative, or neutral sentiment on a -1.0 to 1.0 scale. This is referred to as “polarity.” It also assigns a second score from 0.0 to 1.0 indicating how opinionated the text is, referred to as “subjectivity.”
Credibly’s polarity scale informs users how credible their media source is.
  • In our algorithm, we used an unsupervised lexicon-based approach built on the TextBlob lexicon. TextBlob is an open source Python library that contains a dictionary of words with assigned polarity values. Using techniques that account for the position of words, context, parts of speech, and so on, scores are assigned to each unit of text and aggregated into a final score for each text.
This is an example of our code in action, outputting the polarity and subjectivity of tweets that we scraped with the Twitter API. Note the polarity and subjectivity of 0.0 in the 5th example: while the content of the text is heavy, it does not use any emotionally charged words.

  3. Credibly also determines this score through other methods, including cross-referencing, which checks the accuracy of the facts presented by comparing them to other sources. The more sources that report the same information, the higher the probability that it is reliable.

  • Using the TextBlob lexicon, it also considers the grammar and spelling of the text, as obvious typographical errors decrease the author’s credibility.
  • A source’s history and reputation are also considered: a Recurrent Neural Network (RNN) feeds the past scores of authors and sources on other articles back into itself, building a track record that helps immensely in establishing credibility.
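
To make steps 1 to 3 more concrete, here is a minimal sketch in plain Python. It is not Credibly’s actual code. The tiny `CONTRACTIONS`, `STOPWORDS`, and `LEXICON` tables and the function names are hypothetical stand-ins for the much larger resources in nltk, spacy, and TextBlob. Stemming, part-of-speech tagging, and the RNN are left out. The `overlap` function uses simple Jaccard word overlap as a crude substitute for real cross-referencing.

```python
import re

# Hypothetical, tiny lookup tables for illustration only. A real pipeline
# would rely on the far larger resources shipped with nltk, spacy, or TextBlob.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
STOPWORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "for", "in"}
# word -> (polarity in [-1.0, 1.0], subjectivity in [0.0, 1.0]); values are made up
LEXICON = {
    "shocking": (-0.6, 0.9),
    "terrible": (-1.0, 1.0),
    "great": (0.8, 0.75),
    "amazing": (0.6, 0.9),
}


def preprocess(text):
    """Lowercase, expand contractions, drop special characters and stopwords."""
    text = text.lower().replace("’", "'")  # normalize curly apostrophes
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)  # naive substring replacement
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # strip punctuation and symbols
    return [tok for tok in text.split() if tok not in STOPWORDS]


def sentiment(tokens):
    """Average the lexicon scores of the tokens that appear in LEXICON."""
    hits = [LEXICON[tok] for tok in tokens if tok in LEXICON]
    if not hits:
        return 0.0, 0.0  # no charged words found: neutral and objective
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


def overlap(tokens_a, tokens_b):
    """Jaccard overlap of two token sets (shared words / all words)."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

With these toy tables, `sentiment(preprocess("It's a SHOCKING claim!"))` returns `(-0.6, 0.9)`, while a sentence with no lexicon words returns `(0.0, 0.0)`, which mirrors the neutral tweet described above. A higher `overlap` value between an article and other sources would count in its favor under the cross-referencing idea.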

The Importance of Credibly

Online news sources are an increasingly common way for people to get information, but they are also becoming increasingly crowded with fake news. This leads to poorly informed worldviews, which can affect politics, health, and society. Credibly addresses these needs by identifying charged language and poor writing structure or spelling, determining the source’s history of reliability, and confirming a score for the source’s credibility through cross-referencing and fact-checking. Without effort put towards confirming the credibility of the news and information we share, we surrender our realities to the will of others. Confirming source credibility reminds us to remain skeptical, because our own humanity can be at stake.

16-year-old innovator interested in how AI can solve our current healthcare issues.