Pheme and the fight against fake news
Facebook, Google and, most recently, Le Monde's Décodex are among the projects being developed to fight rumors and false information. Launched in January 2013, the Pheme project brings together European researchers, journalists and experts. It is the oldest and least well known of these initiatives, yet its goals are ambitious: to develop, on the one hand, algorithms that automatically detect rumors and, on the other, tools for tracing, analyzing and verifying information.
Designed to serve journalists and physicians, these tools could offer the news world some breathing room. As Kalina Bontcheva, head of the project, points out, we face a "race between machines and people who make up false information for fun, for political purposes or for money". It is a race that often leaves journalists out of breath, squeezed between breaking news and the budget restrictions imposed on the press.
Tracing the diffusion of rumors in real time
Once its development is complete, Pheme should be capable of identifying rumors, both true and false, and of tracing their development in real time. One possible approach to following a rumor is to trace the origin of a copy-pasted text or fragment that may have undergone some transformations along the way. However, tracing a news topic that has been picked up by different media outlets, or discussed on social networks, requires a more complex approach.
One of the more ambitious tools Pheme aims to develop would trace the implicit links between articles, posts and Twitter conversations that refer to the same information but that, for example, contain contradictory details. To do this, the tracing will rely not on the exact order of the words but on a more flexible structure: the grammatical relationships between them.
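To give a feel for the idea, here is a minimal sketch of matching two phrasings of the same claim by the overlap of their grammatical relations rather than their word order. The dependency triples are hand-written for illustration; a real system would extract them with a dependency parser, and the similarity measure shown (Jaccard overlap) is only an illustrative stand-in for Pheme's actual method.

```python
# Compare two statements by the (head, relation, dependent) triples
# they share, instead of their exact word order.

def similarity(triples_a, triples_b):
    """Jaccard overlap between two sets of dependency triples."""
    a, b = set(triples_a), set(triples_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Two phrasings of the same claim: "Police arrested a suspect" vs.
# "A suspect was arrested by police in London" (triples hand-coded).
claim = [("arrested", "nsubj", "police"),
         ("arrested", "obj", "suspect")]
retweet = [("arrested", "nsubj", "police"),
           ("arrested", "obj", "suspect"),
           ("arrested", "obl", "London")]

print(round(similarity(claim, retweet), 2))  # prints 0.67
```

Because the shared triples survive rewording, the two statements score as closely related even though their surface wording differs.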
As a result, a tweet stating a fact (be it true or false) will not be analyzed by Pheme in isolation, but as part of a sequence, within a relational network. Pheme would then offer its users indications of the veracity of the analyzed information, providing, for instance, the time and place in which it was expressed, or how reliable its authors are judged to be based on their status and past publications.
From the London riots to project Pheme
The Pheme project was conceived in the wake of the 2011 London riots. At the time, the Guardian and the London School of Economics teamed up on "Reading the Riots", a social research study investigating the role of social media during these events.
Rob Procter, now a professor in the Computer Science department at the University of Warwick, analyzed with his team more than 2.5 million tweets posted during the riots. This work allowed them to study the diffusion of seven rumors that spread on Twitter. Following this research, the Guardian published a visualization letting readers observe, on an interactive timeline, the progression of the tweets confirming, contradicting, questioning or commenting on each of these rumors.
“The project with the Guardian gave us an idea of the meanings people give to information they receive from social platforms such as Twitter,” Rob Procter says. “It suggested that we could use the information we gathered to teach machines to assess automatically, or semi-automatically, the veracity of information on social networks.”
To reach the results published in the Guardian, they had to sort the data manually, Rob Procter explains. Within the Pheme project, this procedure has been automated so that computers can classify the data that confirms, denies, questions or comments on a piece of information.
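The classification task described above can be sketched very simply. The keyword heuristic below is purely illustrative, standing in for the machine-learned classifiers the project actually trains; the keyword lists and labels are assumptions, not Pheme's.

```python
# Illustrative sketch: sort a tweet into one of the four reactions
# to a rumor (support, deny, question, comment).

DENY = {"false", "fake", "hoax", "debunked", "not true"}
QUESTION = {"really?", "is this true", "source?"}
SUPPORT = {"confirmed", "true", "it happened"}

def stance(tweet: str) -> str:
    """Return the tweet's stance toward a rumor, by keyword match."""
    t = tweet.lower()
    if any(k in t for k in DENY):
        return "deny"
    if any(k in t for k in QUESTION) or t.rstrip().endswith("?"):
        return "question"
    if any(k in t for k in SUPPORT):
        return "support"
    return "comment"

print(stance("This is a hoax, already debunked"))  # prints deny
print(stance("Is this true? Any source?"))         # prints question
```

Real systems replace the keyword lists with statistical models, but the four-way output is the same kind of label the Guardian researchers assigned by hand.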
A multidisciplinary team
Behind Pheme is a team of academics, experts and journalists coordinated by the University of Sheffield. Drawn from the fields of machine learning, data mining, journalism, human-computer interaction and visualization, they meet face-to-face two or three times a year and collaborate regularly over Skype. Since its beginning in 2013, the project has been financed by the European Commission.
To build their verification models, the researchers collected data from past cases of rumor proliferation on social media, such as the Germanwings crash or the Ottawa shooting. “In such news stories, we expect to find erroneous information, misinterpreted information, or hoaxes surrounding the real events,” Rob Procter explains.
The swissinfo.ch teams have helped the researchers understand the work and needs of journalists. “Our role has also been to provide deep analysis of the rumors that circulated after events such as the Charlie Hebdo shooting or the Germanwings crash,” adds Geraldine Wong Sak Hoi, a swissinfo representative.
Together with the researchers, the journalists developed protocols for sorting and verifying the data. Once the data had been sorted, analyzed and verified by human experts, machine learning algorithms were trained to reproduce their work. The objective is for the algorithms to reach the same conclusions as the researchers before being applied to new data.
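The train-then-verify loop described above can be sketched in a few lines: fit a model on examples the human experts have already labeled, then check that it reproduces their judgments before trusting it on new data. The tiny bag-of-words model and the example tweets below are assumptions for illustration, not the project's actual algorithms or data.

```python
# Sketch: learn stance labels from expert-annotated tweets, then
# check the model against the experts before applying it further.
from collections import Counter

def featurize(text):
    """Bag-of-words representation of a tweet."""
    return Counter(text.lower().split())

def train(labelled):
    """Sum word counts per label into one centroid per stance."""
    centroids = {}
    for text, label in labelled:
        centroids.setdefault(label, Counter()).update(featurize(text))
    return centroids

def predict(centroids, text):
    """Pick the label whose centroid shares the most words."""
    words = featurize(text)
    return max(centroids, key=lambda lab: sum(
        min(words[w], centroids[lab][w]) for w in words))

# Hypothetical tweets already sorted by the human experts.
expert_labels = [
    ("police confirmed the arrest", "support"),
    ("officials confirm it happened", "support"),
    ("this is fake, a total hoax", "deny"),
    ("fake story, already debunked", "deny"),
]
model = train(expert_labels)
# Verification step: does the model agree with expert judgment?
print(predict(model, "confirmed by the police"))  # prints support
```

Only once such held-out checks match the experts' conclusions would the model be turned loose on unseen rumors.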
The platform Pheme developed for journalists is currently in its testing phase. Many of the technologies making up the tool, along with the data on the rumors studied by the consortium, should soon be released as open source. While the project seems promising and avant-garde, it is still in its early stages, as Geraldine Wong Sak Hoi confirmed in an article published last June on swissinfo.ch. Machines will need more time before they can reliably identify content and indicate its veracity, “a difficult job to accomplish in all of its precision,” she observes.
While algorithms may one day help humans distinguish the true from the false, other strategies must be put into practice today to fight false information. Simple tools like the “verified account” badge on Twitter, or the “sponsored link” label on certain online media, already offer a clear means of identifying fake accounts or ads. However, these indicators are rarely understood by young adults, as a recent study by the Stanford History Education Group shows. The study argues that media and information literacy needs to be given a more prominent place, especially in school curricula.
Svitlana Vakulenko et al. Visualising the Propagation of News on the Web, 2016.
Leon Derczynski and Kalina Bontcheva. Pheme: Veracity in Digital Social Networks, 2014.
Mădălina Ciobanu. A team of researchers from 7 countries is building an open-source tool to help verify claims on Twitter. Journalism.co.uk, June 2016.
Jo Fahy. True or false? Sorting rumours and sifting data [podcast]. Swissinfo, May 2016.
Kurt Wagner. Researchers are building a lie detector for Twitter. Mashable, February 2014.