Fake News Challenge – Revised and Revisited

The organizers of the Fake News Challenge have subjected it to a significant overhaul. In this light, many of my criticisms of the challenge no longer apply.

Some context:

Last month, I posted a critical piece addressing the Fake News Challenge. Organized by Dean Pomerleau and Delip Rao, the challenge aspires to leverage advances in machine learning to combat the viral spread of misinformation that plagues social media. The original version of the challenge asked teams to take a claim, such as “Hillary Clinton eats babies”, and output a prediction of its veracity together with supporting documentation (links culled from the internet). Presumably, their hope was that an on-the-fly artificially-intelligent fact checker could be integrated into social media services to stop people from unwittingly sharing fake news.

My response criticized the challenge as ill-specified (fake-ness not defined), circular (how do we know the supporting documents are legit?), and infeasible (are teams supposed to comb the entire web?).

Shortly after I posted these complaints, Dean reached out to me via the comment thread and directly by email. He acknowledged many of the problems and informed me that they had already arrived at some of the same conclusions themselves. Shortly afterwards, Dean emailed me a mock-up for the new version of the challenge, which launched roughly a week later. While I plan to keep the old post live, it also seems appropriate to update the record on Approximately Correct.

What’s new:

In version 2.0, the challenge pivoted away from the unrealistic goal of solving fake news outright. Instead, the organizers proposed the task of stance detection. Here, the goal is to take a headline (from article 1) and body text (from article 2) and identify whether the body text agrees with, disagrees with, or merely discusses (without taking a position) the claim in the headline.

This idea is considerably more modest than the original goal of the Fake News Challenge. For starters, stance detection algorithms have no ability to label any article as fake automatically. Instead, their idea is that this might be a useful tool for fact checkers. Given a dubious claim, the human fact checker might use the stance detection system to pull up a long list of articles that either support or disagree with the claim. The fact checker can then follow up on these leads and expertly evaluate the quality of the evidence presented.
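To make the shape of the task concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the names `Stance`, `toy_stance`, and `group_by_stance`, the cue-word lists, and the example strings are invented for illustration and are not taken from the challenge. The keyword heuristic is only a stand-in for a learned classifier; the point is the interface (headline and body in, one of three stances out) and how a fact checker might use it to sort leads.

```python
import re
from enum import Enum
from typing import Callable, Dict, List, Set


class Stance(Enum):
    AGREES = "agrees"
    DISAGREES = "disagrees"
    DISCUSSES = "discusses"


# Hypothetical cue words. A real system would learn such signals from labeled data.
REFUTING_CUES = {"hoax", "false", "fake", "debunked", "denies", "denied"}
HEDGING_CUES = {"reportedly", "allegedly", "claims", "claimed", "unconfirmed", "rumor"}


def words(text: str) -> Set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def toy_stance(headline: str, body: str) -> Stance:
    """Toy stand-in for a stance classifier (an assumption, not the challenge's method)."""
    body_words = words(body)
    # A body that hedges is treated as reporting the claim without taking a position.
    if body_words & HEDGING_CUES:
        return Stance.DISCUSSES
    # Otherwise, the body disagrees when exactly one of headline and body uses refuting language.
    headline_refutes = bool(words(headline) & REFUTING_CUES)
    body_refutes = bool(body_words & REFUTING_CUES)
    return Stance.DISAGREES if headline_refutes != body_refutes else Stance.AGREES


def group_by_stance(
    claim: str,
    bodies: List[str],
    classify: Callable[[str, str], Stance] = toy_stance,
) -> Dict[Stance, List[str]]:
    """Sort article bodies into leads for a human fact checker, keyed by stance."""
    leads: Dict[Stance, List[str]] = {stance: [] for stance in Stance}
    for body in bodies:
        leads[classify(claim, body)].append(body)
    return leads


if __name__ == "__main__":
    claim = "City council bans bicycles downtown"  # invented example claim
    articles = [
        "The city council voted on Tuesday to ban bicycles downtown.",
        "A council spokesperson said the story is false and no ban was passed.",
        "The council has allegedly banned bicycles, according to a viral post.",
    ]
    for stance, bodies in group_by_stance(claim, articles).items():
        print(stance.value, len(bodies))
```

Note that nothing here decides whether the claim is true. The output is a sorted pile of leads, and judging the quality of the evidence remains the fact checker's job.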

It is not clear just how much easier such a tool would make the lives of fact checkers. However, the challenge strikes a reasonable compromise between ambition and feasibility. Many moonshot research efforts go nowhere precisely because they are unwilling to patiently endure the incrementalism that progress typically requires. If nothing else:

  1. In running the Fake News Challenge, the organizers are building a large dataset that might be re-usable for other fake-news-related challenges.
  2. They seem well-positioned to identify a core community of talented researchers committed to addressing societal challenges with machine learning.

I look forward to seeing how the challenge pans out and where they go next.


The Deception of Supervised Learning – V2

[This article is a revised version reposted with permission from KDnuggets]

Imagine you’re a doctor tasked with choosing a cancer therapy. Or a Netflix exec tasked with recommending movies. You have a choice. You could think hard about the problem and come up with some rules. But these rules would be overly simplistic, not personalized to the patient or customer. Alternatively, you could let the data decide what to do!

The ability to programmatically make intelligent decisions by learning complex decision rules from big data is a driving selling point of machine learning. Leaps forward in the predictive accuracy of supervised learning techniques, especially deep learning, now yield classifiers that outperform human predictive accuracy on many tasks. We can guess how an individual will rate a movie, classify images, or recognize speech with jaw-dropping accuracy. So why not make our services smart by letting the data tell us what to do?

Continue reading “The Deception of Supervised Learning – V2”

Is Fake News a Machine Learning Problem?

On Friday, Donald J. Trump was sworn in as the 45th president of the United States. The inauguration followed a bruising primary and general election, in which social media played an unprecedented role. In particular, the proliferation of fake news emerged as a dominant storyline. Throughout the campaign, explicitly false stories circulated through the internet’s echo chambers. Some fake stories originated as rumors, others were created for profit and monetized with click-based advertisements, and according to US Director of National Intelligence James Clapper, many fake news stories were orchestrated by the Russian government with the intention of influencing the results. While it is not possible to observe the counterfactual, many believe that the election’s outcome hinged on the influence of these stories.

For context, consider one illustrative case as described by the New York Times. On November 9th, 35-year-old marketer Eric Tucker tweeted a picture of several buses, claiming that they were transporting paid protesters to demonstrate against Trump. The post quickly went viral, receiving over 16,000 shares on Twitter and 350,000 shares on Facebook. Trump and his surrogates joined in, promoting the story through social media. Tucker’s claim turned out to be a fabrication. Nevertheless, it likely reached millions of people, more than many conventional news stories.

A number of critics cast blame on technology companies like Facebook, Twitter, and Google, suggesting that they have a responsibility to address the fake news epidemic because their algorithms influence who sees which stories. Some linked the fake news phenomenon to the idea that personalized search results and news feeds create a filter bubble, a dynamic in which readers only encounter stories that they are likely to click on, comment on, or like. As a consequence, readers might only encounter stories that confirm pre-existing beliefs.

Facebook, in particular, has been strongly criticized for its trending news widget, which operated (at the time) without human intervention, giving viral items a spotlight, however defamatory or false. In September, Facebook’s trending news box promoted a story titled ‘Michelle Obama was born a man’. Some have wondered why Facebook, despite its massive investment in artificial intelligence (machine learning), hasn’t developed an automated solution to the problem.

Continue reading “Is Fake News a Machine Learning Problem?”

Machine Learning Meets Policy: Reflections on HUML 2016

Last Friday, the University of Ca’ Foscari in Venice organized an IEEE workshop on the Human Use of Machine Learning (HUML 2016). The workshop, held at the European Centre for Living Technology, hosted roughly 30 participants and broadly addressed the social impacts and ethical problems stemming from the widespread use of machine learning.

HUML joins a growing number of workshops for critical voices in the ML community. These include Fairness, Accountability and Transparency in Machine Learning (FAT-ML); #Data4Good at ICML 2016; Human Interpretability of Machine Learning (WHI), held this year at ICML; and Interpretable ML for Complex Systems, held this year at NIPS. Among this company, HUML was especially notable for its diversity of perspectives. While FAT-ML, #Data4Good and WHI featured presentations primarily by members of the machine learning community, HUML brought together scholars from philosophy of science, law, predictive policing, and machine learning.

Continue reading “Machine Learning Meets Policy: Reflections on HUML 2016”

Are Deep Neural Networks Creative? v2

[This article is a revised version reposted with permission from KDnuggets]

Are deep neural networks creative? Given recent press coverage of art-generating deep learning, it might seem like a reasonable question. In February, Wired wrote of a gallery exhibition featuring works generated by neural networks. The works were created using Google’s inceptionism, a technique that transforms images by iteratively modifying them to enhance the activation of specific neurons in a deep net. Many of the images appear trippy, with rocks transforming into buildings or leaves into insects. Several other researchers have proposed techniques for generating images from neural networks for their aesthetic or stylistic qualities. One method, introduced by Leon Gatys of the University of Tübingen in Germany, can extract the style from one image (say a painting by Van Gogh), and apply it to the content of another image (say a photograph).

In the academic sphere, work on generative image modeling has emerged as a hot research topic. Generative adversarial networks (GANs), introduced by Ian Goodfellow, synthesize novel images by modeling the distribution of seen images. Already some researchers have looked into ways of using GANs to perturb natural images, as by adding smiles to photos.


In parallel, researchers have also made rapid progress on generative language modeling. Character-level recurrent neural network (RNN) language models now permeate the internet, appearing to hallucinate passages of Shakespeare, Linux source code, and even Donald Trump’s Twitter eruptions. Not surprisingly, a wave of papers and demos soon followed, using LSTMs for generating rap lyrics and poetry.

Clearly, these advances emanate from interesting research and deserve the fascination they inspire.

In this post, rather than address the quality of the work (which is admirable), or explain the methods (which has been done ad nauseam), we’ll instead address the question: can these nets reasonably be called creative? Already, some make the claim. The landing page for deepart.io, a site which commercializes the “Deep Style” work, proclaims “TURN YOUR PHOTOS INTO ART”. If we accept creativity as a prerequisite for art, the claim is made here implicitly.

Continue reading “Are Deep Neural Networks Creative? v2”

The Failure of Simple Narratives

Approximately Correct is not a political blog in any traditional sense. The mission is not to prognosticate elections, like FiveThirtyEight, nor to revel in the political circus, like Politico. And the common variety political writing seems antithetical to our goals. Today, political arguments tend to follow an anti-scientific pattern of choosing a perspective first and then selectively reaching for supporting evidence. It’s everything we should hope to avoid.

But, per our mission statement, this blog aims to address the intersection of scientific and technical developments with social issues. And social issues – the economy, the environment, healthcare, news curation, etc. – are necessarily political. Moreover, scientific practice requires dispassionate discourse and the ability to change one’s beliefs given new information. In this light, the abstention of scientists from political discourse seems irresponsible.

[An aside: Not all political issues are scientific or technical. The relative value of free speech vs the danger of hate speech may be an intrinsically subjective judgment. But many issues, such as global warming, explicitly exhibit scientific dimensions.]

Technical developments can necessitate policy shifts. Absent the capacity to warm the planet or the ability to detect such warming, one couldn’t justify strong reforms to energy policy. Additionally, absent scientific understanding of the likely effects of policy, one cannot argue effectively for or against them. So sober scientific analysis has a role to play not just in evaluating policies, but also in evaluating individual arguments.

Machine learning and data science interact with politics in a third important way. The political landscapes of entire nations are immense. Take last night’s presidential election for example. Roughly 120 million people voted in 3,007 counties, 435 congressional districts and 50 states. Hardly any citizens have visited every state. Not even the candidates could possibly visit every county. Thus, our sense of the nation’s pulse, and our narratives regarding the driving forces in the election, are ultimately shaped by a mixture of second-hand accounts and data science (as by extensive polling).

Simplistic Narratives

Simplistic narratives and data science play off of each other. Narratives influence the questions that pollsters ask. And each poll result invites simplistic analysis. In the remainder of this post, without expressing my personal opinions, I’d like to give a dispassionate analysis of several popular stories that have risen to prominence during this election, sampled from across both the Democratic-Republican and establishment/anti-establishment divides. I choose these narratives neither because they are completely true nor completely false. Each presents a seemingly simple thesis that belies more complex realities. To be as even-handed as possible, I’ve chosen one each from the Clinton-leaning and Trump-leaning narratives.

Continue reading “The Failure of Simple Narratives”

The Foundations of Algorithmic Bias

This morning, millions of people woke up and impulsively checked Facebook. They were greeted immediately by content curated by Facebook’s newsfeed algorithms. To some degree, this news might have influenced their perceptions of the day’s news, the economy’s outlook, and the state of the election. Every year, millions of people apply for jobs. Increasingly, their success might lie in part in the hands of computer programs tasked with matching applications to job openings. And every year, roughly 12 million people are arrested. Throughout the criminal justice system, computer-generated risk-assessments are used to determine which arrestees should be set free. In all these situations, algorithms are tasked with making decisions. 

Algorithmic decision-making mediates more and more of our interactions, influencing our social experiences, the news we see, our finances, and our career opportunities. We task computer programs with approving lines of credit, curating news, and filtering job applicants. Courts even deploy computerized algorithms to predict “risk of recidivism”, the probability that an individual relapses into criminal behavior. It seems likely that this trend will only accelerate as breakthroughs in artificial intelligence rapidly broaden the capabilities of software. 


Turning decision-making over to algorithms naturally raises worries about our ability to assess and enforce the neutrality of these new decision makers. How can we be sure that the algorithmically curated news doesn’t have a political party bias or job listings don’t reflect a gender or racial bias? What other biases might our automated processes be exhibiting that we wouldn’t even know to look for?

Continue reading “The Foundations of Algorithmic Bias”

Mission Statement

This post introduces approximatelycorrect.com. The aspiration for this blog is to offer a critical perspective on machine learning. We intend to cover both technical issues and the fuzzier problems that emerge when machine learning intersects with society.

For explaining the technical details of machine learning, we enter a lively field. As recent breakthroughs in machine learning have attracted mainstream interest, many blogs have stepped up to provide high quality tutorial content. But at present, critical discussions on the broader effects of machine learning lag behind technical progress.

On one hand, this seems natural. First a technology must exist before it can have an effect. Consider the use of machine learning for face recognition. For many decades, the field has accumulated extensive empirical knowledge. But until the technology was recently reduced to practice, any consideration of how it might be used could only be speculative.

But the precarious state of the critical discussion owes to more than chronology. It also owes to culture, and to the rarity of the relevant interdisciplinary expertise. The machine learning community traditionally investigates scientific questions. Papers address well-defined theoretical problems, or empirically compare methods with well-defined objectives. Unfortunately, many pressing issues at the intersection of machine learning and society do not admit such crisp formulations. Consequently, with notable exceptions, consideration of social issues within the machine learning community remains too rare.

Conversely, those academics and journalists best equipped to consider economic and social issues rarely possess the requisite understanding of machine learning to anticipate the plausible ways the two arenas might intersect. As a result, coverage in the mainstream consistently misrepresents the state of research, misses many important problems, and hallucinates others. Too many articles address Terminator scenarios, overstate the near-term plausibility of human-like machine consciousness, or assume the existence (at present) of self-motivated machines with their own desiderata. Too few consider the precise ways that machine learning may amplify biases or perturb the job market.

In short, we see this troubling scenario:

  1. Machine learning models increasingly find industrial use, assisting in credit decisions, recognizing faces in police databases, curating the news on social networks, and enabling self-driving cars. 
  2. The majority of knowledgeable people in the machine learning community, with notable exceptions, are not in the habit of considering the relevant economic, social, and other philosophical issues.
  3. Those in the habit of considering the relevant issues rarely possess the relevant machine learning expertise.

Complicating matters, mainstream discussion of AI-related technologies introduces speculative or spurious ideas alongside sober ones without communicating uncertainty clearly. For example, the likelihood of a machine learning classifier making a mistake on a new example, the likelihood of machine learning causing massive unemployment, and the likelihood of the entire universe being a simulation run by agents in some meta-universe are all discussed as though they can be assigned some common form of probability.

Compounding the lack of rigor, we also observe a hype cycle in the media. PR machines actively promote sensational views of the work currently done in machine learning, even as the researchers doing that work view the hype with suspicion. The press also has an incentive to run with the hype: sensational news sells papers. The New York Times has referenced “Terminator” and “artificial intelligence” in the same story 5,000 times. It’s referenced “Terminator” and “machine learning” together in roughly 750 stories.

In this blog, we plan to bridge the gap between technical and critical discussions, treating both methodology and consequences as first-class concerns. 

One aspiration of this blog will be to communicate honestly about certainty. We hope to maintain a sober, academic voice, even when writing informally or about issues that aren’t strictly technical. While many posts will express opinions, we aspire to clearly indicate which statements are theoretical facts, which may be speculative but reflect a consensus of experts, and which are wild thought experiments. We also plan to discuss immediate issues such as employment alongside more speculative consideration of what future technology we might anticipate. In all cases we hope to clearly indicate scope. In service of this goal, and in reference to the theory of learning, we adopt the name Approximately Correct. We hope to be as correct as possible as often as possible, and to honestly convey our confidence.