On Thursday, OpenAI announced that they had trained a language model. They used a large training dataset and showed that the resulting model was useful for downstream tasks where training data is scarce. They announced the new model with a puffy press release, complete with this animation (below) featuring dancing text. They demonstrated that their model could produce realistic-looking text and warned that they would be keeping the dataset, code, and model weights private. The world promptly lost its mind.
For reference, language models assign probabilities to sequences of words. Typically, they express this probability via the chain rule as the product of probabilities of each word, conditioned on that word’s antecedents. Alternatively, one could train a language model backwards, predicting each previous word given its successors. After training a language model, one typically either 1) uses it to generate text by iteratively decoding from left to right, or 2) fine-tunes it to some downstream supervised learning task.
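To make the chain-rule factorization concrete, here is a minimal sketch in Python. It is not the model OpenAI trained: it assumes a toy, made-up corpus and a bigram model, which conditions each word only on the single preceding word rather than on all of its antecedents. The names `corpus`, `cond_prob`, `sequence_log_prob`, and `greedy_decode` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; <s> and </s> mark sentence boundaries.
corpus = [
    ["<s>", "the", "cat", "sat", "</s>"],
    ["<s>", "the", "dog", "sat", "</s>"],
    ["<s>", "the", "cat", "ran", "</s>"],
]

# Count bigrams so we can estimate P(word | previous word) by maximum likelihood.
# Assumption: a bigram model stands in for conditioning on the full prefix.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    for prev, word in zip(sentence, sentence[1:]):
        bigram_counts[prev][word] += 1


def cond_prob(word, prev):
    """Estimate P(word | prev) from the bigram counts."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][word] / total if total else 0.0


def sequence_log_prob(sentence):
    """Chain rule: sum the log of each word's conditional probability."""
    log_p = 0.0
    for prev, word in zip(sentence, sentence[1:]):
        p = cond_prob(word, prev)
        if p == 0.0:
            return float("-inf")  # an unseen bigram gives the sequence zero probability
        log_p += math.log(p)
    return log_p


def greedy_decode(max_len=10):
    """Generate left to right, always picking the most likely next word."""
    words = ["<s>"]
    while words[-1] != "</s>" and len(words) < max_len:
        next_word = bigram_counts[words[-1]].most_common(1)[0][0]
        words.append(next_word)
    return words


sentence = ["<s>", "the", "cat", "sat", "</s>"]
print(math.exp(sequence_log_prob(sentence)))  # product of the conditional probabilities
print(greedy_decode())
```

Under these counts, the probability of “the cat sat” works out to 1 × 2/3 × 1/2 × 1 = 1/3. A neural language model replaces the count-based `cond_prob` with a learned function of the entire preceding context, but the factorization and the left-to-right decoding loop keep the same shape.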
Training large neural network language models and subsequently applying them to downstream tasks has become an all-consuming pursuit that accounts for a large and growing share of the research in contemporary natural language processing.
Continue reading “OpenAI Trains Language Model, Mass Hysteria Ensues”
Artificial intelligence is transforming the way we work (Venture Beat), turning all of us into hyper-productive business centaurs (The Next Web). Artificial intelligence will merge with human brains to transform the way we think (The Verge). Artificial intelligence is the new electricity (Andrew Ng). Within five years, artificial intelligence will be behind your every decision (Ginni Rometty of IBM via Computer World).
Before committing all future posts to the coming revolution, or abandoning the blog altogether to beseech good favor from our AI overlords at the AI church, perhaps we should ask, why are today’s headlines, startups and even academic institutions suddenly all embracing the term artificial intelligence (AI)?
In this blog post, I hope to prod all stakeholders (researchers, entrepreneurs, venture capitalists, journalists, think-fluencers, and casual observers alike) to ask the following questions:
- What substantive transformation does this switch in the nomenclature from machine learning (ML) to artificial intelligence (AI) signal?
- If the research hasn’t categorically changed, then why are we rebranding it?
- What are the dangers, to both scholarship and society, of mindlessly shifting the way we talk about research to maximize buzz?
On the British anthology series Black Mirror, each episode explores a near-future dystopia. In each episode, a small extrapolation from current technological trends leads us into a terrifying future. The series should conjure modern-day Cassandras like Cathy O’Neil, who has made a second career out of exhorting caution against algorithmic decision-making run amok. In particular, she warns that algorithmic decision-making systems, if implemented carelessly, might increase inequality, twist incentives, and perpetuate undesirable feedback loops. For example, a predictive policing system might direct aggressive policing in poor neighborhoods, drive up arrests, depress employment, orphan children, and lead, ultimately, to more crime.
Continue reading “Cathy O’Neil Sleepwalks into Punditry”
Meet Erica, the world’s most human-like autonomous android. From its title alone, this documentary promises a sensational encounter. As the screen fades in from black, a marimba tinkles lightly in the background and a Japanese alleyway appears. Various narrators ask us:
“What does it mean to think?”
“What is human creativity?”
“What does it mean to have a personality?”
“What is an interaction?”
“What is a minimal definition of humans?”
The title, these questions, and nearly everything that follows mislead. This article is an installment in a series of posts addressing the various sources of misinformation feeding the present AI hype cycle.
Continue reading “Press Failure: The Guardian’s “Meet Erica””
Interest in machine learning may be at an all-time high. Per Google Trends, people are searching for machine learning nearly five times as often as five years ago. And at the University of California San Diego (UCSD), where I’m presently a PhD candidate, we had over 300 students enrolled in both our graduate-level recommender systems and neural networks courses.
Much of this attention is warranted. Breakthroughs in computer vision, speech recognition, and, more generally, pattern recognition in large data sets, have given machine learning substantial power to impact industry, society, and other academic disciplines.
Continue reading “The AI Misinformation Epidemic”
The organizers of the Fake News Challenge have subjected it to a significant overhaul. In this light, many of my criticisms of the challenge no longer apply.
Last month, I posted a critical piece addressing the Fake News Challenge. Organized by Dean Pomerleau and Delip Rao, the challenge aspires to leverage advances in machine learning to combat the epidemic viral spread of misinformation that plagues social media. The original version of the challenge asked teams to take a claim, such as “Hillary Clinton eats babies”, and output a prediction of its veracity together with supporting documentation (links culled from the internet). Presumably, their hope was that an on-the-fly artificially-intelligent fact checker could be integrated into social media services to stop people from unwittingly sharing fake news.
My response criticized the challenge as ill-specified (fake-ness not defined), circular (how do we know the supporting documents are legit?), and infeasible (are teams supposed to comb the entire web?).
Continue reading “Fake News Challenge – Revised and Revisited”