By Jack Clark and Tim Hwang.
Conversations about the social impact of AI are often very abstract, focusing on broad generalizations about technology rather than the specific state of the research field. That makes it challenging to have a full conversation about what good public policy around AI would look like. In the interest of helping to bridge that gap, Jack Clark and I have been playing around with doing recaps that take a selection of papers from a recent conference and talk about their longer-term policy implications. This one covers papers that appeared at NIPS 2016.
If it’s helpful to the community, we’ll plan to roll out similar recaps throughout 2017 — with the next one being ICLR in April.
Learning from Untrusted Data — Charikar, Steinhardt, Valiant (Safety / Security)
As machine learning models get deployed for an increasingly high-stakes range of tasks, the risk that they will be the target of bad actors looking to manipulate their behavior increases. One vulnerability might be the tainting or fabrication of inputs that a system receives as it is trained, allowing a hypothetical attacker to alter the resulting learned behavior. We’ll need a better understanding of the risks here so that designers of future systems can give guarantees of performance even in the presence of hostile actors.
“Learning from Untrusted Data” was presented at the “Reliable Machine Learning in the Wild” symposium at NIPS. It proposes two applicable frameworks for navigating this problem in the context of clustering — list-decodable learning and semi-verified learning. The former looks at circumstances under which algorithms can output a set of possible answers in the presence of untrusted data, one of which is guaranteed to closely approximate the answer you would get in the presence of only trusted data. The latter looks at the extent to which a small “verified” dataset enables extraction of trusted information from a larger, untrusted dataset. One hopeful result: these frameworks remain robust even when a significant portion of the inputs to a system is compromised. That provides some hope that this particular threat vector is something we’ll be able to control going forwards.
Analyzing training under conditions of bad data has broader application — bad data could be the result of intentional actions on the part of a malicious actor, but could also emerge unintentionally. Further research here will give us a greater ability to provide guarantees about the performance of a system, for example, in the presence of biased data.
Man is to Computer Programmer as Woman is to Homemaker? — Bolukbasi, Chang, Zou, et al. (Fairness)
Word embeddings present a sticky design challenge. While they are demonstrably powerful tools for a range of different AI applications, the fact that they are trained on existing corpora of text means that they can (and do!) soak up existing social biases in the training set. This problem is compounded when popular word embeddings with these problems are integrated en masse into other tools, raising the risk that entire swathes of AI systems will reinforce existing discriminatory biases within society.
To avoid this problem, scalable tools for addressing biased embeddings are needed. This paper proposes a method for generating a gender subspace through a comparison of a given word embedding with the embeddings of pairs of gender-specific words (e.g., she/he, her/his, woman/man, etc.).
From this, the paper presents two approaches in the example case of gender bias. One, “Neutralize and Equalize”, is a hard debiasing technique that ensures that gender-neutral words have no component in the gender subspace, and that they are equidistant from an equality set of gendered words (e.g. “guy” and “gal”). This effectively strips out bias that may enter embeddings through the association of gender with non-gender-specific words (e.g. “nurse”). Two, “Soften”, is a lighter-touch debiasing method that reduces the gender component while preserving as much of the similarity to the original embedding as possible, within certain parameters. The latter is useful in cases where it is important to retain a relationship with gender that may not be as problematic (the paper uses the illustration of “to grandfather a regulation”: we may not want “grandfather” to be made equidistant from “grandmother”, since “to grandmother a regulation” does not have the same meaning).
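To make the mechanics concrete, here is a minimal sketch of the hard-debiasing idea, assuming word vectors are stored as unit-normalized numpy arrays in a dict. The variable names and the single-direction simplification are ours (the paper derives the subspace via PCA over several defining pairs), so treat this as an illustration rather than the authors’ implementation.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

def gender_direction(emb, pairs=(("she", "he"), ("her", "his"), ("woman", "man"))):
    # Approximate the gender subspace with a single direction: the average
    # of the difference vectors of gender-definitional pairs.
    diffs = [normalize(emb[a] - emb[b]) for a, b in pairs]
    return normalize(np.mean(diffs, axis=0))

def neutralize(v, g):
    # "Neutralize": remove the component of a gender-neutral word's vector
    # (e.g. "nurse") that lies along the gender direction g.
    return normalize(v - np.dot(v, g) * g)

def equalize(v_a, v_b, g):
    # "Equalize": make an equality pair (e.g. "guy"/"gal") identical except
    # along g, so both end up equidistant from every neutralized word.
    mu = (v_a + v_b) / 2
    mu_orth = mu - np.dot(mu, g) * g
    scale = np.sqrt(max(1.0 - np.linalg.norm(mu_orth) ** 2, 0.0))
    return (mu_orth + scale * normalize(np.dot(v_a - mu, g) * g),
            mu_orth + scale * normalize(np.dot(v_b - mu, g) * g))
```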
An interesting underlying ethical question here is the extent to which the research and technical community is responsible for addressing these broader social challenges in their systems. One view — brought up in the paper — is that the embeddings “merely [reflect] bias in society, and therefore one should attempt to debias society rather than word embeddings.” This position will become harder to hold as this line of research continues to produce low-cost, practical fixes for retroactively removing undesirable bias, and as case studies of the harm such bias can produce pile up. Our thought is that the emerging question will focus on the circumstances under which technologists should act to prevent bias in the products they build, and how they should go about doing so.
Economic Models of Algorithmic Discrimination — Goodman (Fairness)
Machine learning systems might unfairly disfavor certain groups within society, particularly groups that may already be vulnerable. To the extent these systems are used for, say, providing government services, having technical approaches to efficiently detect discrimination and fix systems that have gone wrong will become a major factor in their legitimacy and further adoption.
There’s a growing body of researchers working on problems in this space, and it was a major focus at the “ML and the Law” symposium at NIPS. Goodman’s “Economic Models of Algorithmic Discrimination” usefully adapts frameworks from economics to delineate between multiple different flavors of algorithmic discrimination and how they might emerge — from situations where the data itself is discriminatory (a form of “taste-based” discrimination), to discrimination emerging from a bias against uncertainty and differences in conditional probabilities across groups. That kind of untangling is important: we need a better technical account of how specific discrimination failure states emerge before we can really effectively solve them.
The paper also describes how discrimination emerges in situations of active learning, where a predictive model is continuously updated as new data is fed into it. To date, many of the papers focusing on fairness in ML that we’ve seen have had more limited application to these situations, even though they constitute some of the more advanced systems in the field (there are similar limitations in fairness approaches towards neural networks and reinforcement learning). Goodman’s discussion is a good start, and we’re hoping to see more along these lines.
RL²: Fast Reinforcement Learning via Slow Reinforcement Learning — Duan, Schulman, Chen, et al. and Learning to Reinforcement Learn — Wang, Kurth-Nelson, Tirumala, et al. (Interpretability)
As AI is applied to real-world situations, the increased cost of failure means that demand for interpretability will rise. We’re seeing this in the context of self-driving car systems, where proposals have been put forth to try to ensure that systems can clearly output their decision-making processes in the event of a crash.
New approaches to reinforcement learning, such as the one described in “RL²: Fast Reinforcement Learning via Slow Reinforcement Learning”, which was presented at the Deep Reinforcement Learning workshop, point to a future where — driven by competition for superior performance — algorithms move from being static entities to ones that shift and evolve according to their circumstances. If deployed widely, this will potentially deepen existing concerns about interpretability, which is already limited in today’s static classification algorithms.
The RL² paper, along with a similar technique discussed in “Learning to Reinforcement Learn”, presents a method to create algorithms that modify themselves to solve tasks. Specifically, the approach makes the objective of a reinforcement learning process to learn how to learn to solve a variety of tasks, rather than to learn to solve a single specific task. This is achieved by structuring the RL agent as a recurrent neural network and feeding it extra information about its performance on the current task. This means that the agent learns to implement its own learning algorithms, which allows it to invent exploration and exploitation strategies appropriate for the distribution of tasks it faces, without human intervention. This mirrors the way that humans can rapidly improvise solutions to novel situations by drawing on their past experience.
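As a rough illustration, here is a minimal sketch of that setup, assuming placeholder objects for the task distribution, environment, and recurrent policy (none of this is the papers’ released code): the recurrent policy’s hidden state persists across episodes of the same task, and each input is augmented with the previous action, reward, and termination flag so the network can adapt its behavior on the fly.

```python
import numpy as np

def run_trial(sample_task, rnn_policy, episodes_per_trial=2, n_actions=4):
    # One "trial": several episodes of a single task drawn from a distribution.
    env = sample_task()
    hidden = rnn_policy.initial_state()   # persists across episodes in this trial
    prev_action = np.zeros(n_actions)
    prev_reward, prev_done = 0.0, 0.0

    for _ in range(episodes_per_trial):
        obs, done = env.reset(), False
        while not done:
            # Augment the observation with the previous action/reward/done flag
            # so the RNN can infer the task and adjust its strategy online.
            rnn_input = np.concatenate([obs, prev_action, [prev_reward, prev_done]])
            action, hidden = rnn_policy.step(rnn_input, hidden)
            obs, reward, done = env.step(action)
            prev_action = np.eye(n_actions)[action]
            prev_reward, prev_done = reward, float(done)
    # The outer ("slow") RL algorithm trains the RNN's weights to maximize total
    # reward across the whole trial, so the network comes to encode its own
    # fast, task-adaptive learning procedure.
```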
In practice, this means that the underlying algorithm is no longer static, but is instead changing itself to optimize its ability to learn to solve new tasks. These algorithms will be trained on a broad set of scenarios and will improvise their own solutions to each task. Regulators will likely want researchers to develop tools that can probe the internal state of RL²-esque algorithms to unearth the reasoning for why they are taking certain courses of action. It will also be important to find standard ways of quantifying and potentially limiting the degree of change that is possible. These are currently unsolved problems, and ones worthy of attention.
Spatially Adaptive Computation Time for Residual Networks — Figurnov, Collins, Zhu, et al. (Privacy / Access)
Cost is always a big factor in determining who gets to use new technological capabilities, and how they are used. For example, the ability to wield computing power used to be nearly exclusively the province of well-resourced governments. Today, that same power is broadly distributed on mobile devices and used to search through cat photos on the Internet. As a result, cost also changes the kinds of policy concerns that we have around a given piece of technology.
Computer vision presents an interesting case of cost. Deep convolutional networks, while being extremely useful, are computationally expensive. That constrains the applications and the types of technical architectures that are possible.
This paper illustrates one neat approach to cutting that cost. Essentially, it proposes an arrangement that enables a computer vision system to stop expending computational resources on areas of an image once the analysis there is “good enough.” Computational power gets allocated toward objects that require more work to recognize, while simpler parts of the image receive fewer resources, lowering the overall cost of processing.
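A toy sketch of that idea, in heavily simplified form: each spatial position accumulates a “halting score” after every residual block and stops receiving updates once that score crosses a threshold. The `residual_blocks` and `halting_head` functions below are stand-ins we assume for illustration, not the paper’s actual architecture (which also handles remainders and a ponder-style cost during training).

```python
import numpy as np

def spatially_adaptive_forward(x, residual_blocks, halting_head, eps=0.01):
    # x: feature map of shape (H, W, C).
    H, W, _ = x.shape
    cum_halt = np.zeros((H, W))            # accumulated halting score per position
    active = np.ones((H, W), dtype=bool)   # positions still being refined

    for block in residual_blocks:
        if not active.any():
            break                          # every position has halted early
        update = block(x)                  # (H, W, C) residual update
        halt = halting_head(x)             # (H, W) halting scores in [0, 1]
        # Only positions that are still active receive the update.
        x = np.where(active[..., None], x + update, x)
        cum_halt = np.where(active, cum_halt + halt, cum_halt)
        active &= cum_halt < 1.0 - eps     # freeze the "good enough" positions
    return x
```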
The upshot of this ongoing work to drive down computing cost is that powerful image analysis capabilities will get cheaper and more widely distributed over time. That creates space for new positive uses, like systems that scan medical images running on cheap, low-power devices. But it can also expand bad uses by cutting the cost of surveillance technologies, or by enabling the creation of more effective homebrew robots for targeting people. It also erodes the points of control that could serve as levers for addressing harmful uses, since groups with only limited, local access to compute will still be able to use advanced capabilities.
Conditional Image Generation with PixelCNN Decoders — Oord, Kalchbrenner, Vinyals, et al. and InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets — Chen, Duan, Houthooft, et al. (Future Capabilities / Public Sphere)
Though currently limited, the rise of AI-generated imagery could have profound implications for public discourse in the future. New synthetic image and video generation techniques seem poised to make it possible to cheaply fake imagery that looks real. This has a number of implications for how AI technologies might be used to shape discourse and perception, particularly to the extent it intersects with the ongoing rise of ‘fake news’.
Initially, it’s likely that research techniques could be adapted to subtly alter basic imagery. These alterations will largely automate some of what is possible today with commercial programs like Photoshop, and also extend their capabilities. Two papers, “Conditional Image Generation with PixelCNN Decoders” and “InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets”, introduce systems that can learn some of the more subtle features inherent in a dataset, and use that knowledge to create synthetic images. Both approaches can learn to disentangle some of the features that make up an image, making it possible to do things like modify whether a person is wearing sunglasses or not, or the inclination of their head.
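For a sense of how such disentangled representations get used, here is a toy sketch of an InfoGAN-style “attribute sweep”, assuming a trained `generator` that takes a noise vector concatenated with a structured latent code (the function name and dimensions are our placeholders): holding the noise fixed and varying a single code dimension yields a series of images of the same subject with one attribute changing.

```python
import numpy as np

def attribute_sweep(generator, noise_dim=62, code_dim=10, which_code=3, steps=7):
    # Fixed noise vector: roughly, the same "identity" across the sweep.
    z = np.random.randn(noise_dim)
    images = []
    for value in np.linspace(-2.0, 2.0, steps):
        c = np.zeros(code_dim)
        c[which_code] = value              # vary one learned factor (e.g. head tilt)
        images.append(generator(np.concatenate([z, c])))
    return images                          # same subject, one attribute varying
```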
There’s also a parallel phenomenon occurring for video and audio: new techniques such as Face2Face and WaveNet show how it’s possible to use AI to perform video ventriloquism, in which an actor manipulates footage of a person to change their mouth movements, while a WaveNet-like technique can then be used to synthesize speech that sounds like it comes from the target.
However, these approaches are, for now, relatively limited, due to both the low resolution of the images being produced and the difficulty of capturing a coherent representation of the data being modeled. For example, when generating synthetic images of dogs it’s quite common that these algorithms will give the creatures too many legs, indicating that they don’t yet have an internal representation of a dog that corresponds to reality. The current need to collect large datasets for these techniques to work effectively also limits the scope of objects and audio that can be convincingly faked.
Whether this becomes a broader public policy challenge in the near term depends on the speed of development. Modified images, videos, and audio clips, when paired with fake news articles, could generate ever-more believable narratives that influence public opinion. Given sufficiently broad datasets, we can imagine these techniques being adapted and extended to modify the apparent demeanor of politicians in certain politically charged images, or create faked video and audio statements attributed to known individuals like celebrities. The same approach could be adapted to create more sophisticated phishing campaigns by hackers. We’ll need to track these uses if and when they appear.
—
…and that’s a wrap! We’ll be keeping an eye out for new policy-relevant papers as we get into 2017. Give us a shout on Twitter (Jack Clark / Tim Hwang) or comment here if you see anything good.
Thanks to Hanna Wallach and Miles Brundage for their suggestions and feedback!