In 2014, Szegedy et al. published an ICLR paper with a surprising discovery: modern deep neural networks trained for image classification are strangely vulnerable to attack. By making only slight alterations to an input image, it’s possible to fool a model that would otherwise classify the image correctly (say, as a dog) into outputting a completely wrong label (say, banana). Moreover, the attack works even with perturbations so tiny that a human couldn’t distinguish the altered image from the original.
These doctored images are called adversarial examples, and the study of how to make neural networks robust to these attacks is an increasingly active area of machine learning research.
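To make the attack concrete, consider the fast gradient sign method (FGSM) of Goodfellow et al., perhaps the simplest recipe for crafting such a perturbation. The PyTorch sketch below is my own illustration, not code from the post (which concerns defenses, not this particular attack); `model`, `x`, `y`, and `epsilon` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.01):
    """Craft an adversarial example via the fast gradient sign method.

    A tiny step in the direction of the sign of the loss gradient is
    often enough to flip the predicted label, while leaving the image
    visually indistinguishable from the original.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)  # loss w.r.t. the true label
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()  # nudge each pixel by +/- epsilon
    return x_adv.clamp(0, 1).detach()    # keep pixels in the valid range
```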
Continue reading “Leveraging GANs to combat adversarial examples”
It’s January 28th and I should be working on my paper submissions. So should you! But why write when we can meta-write? The ICML deadline looms just twelve days away, and KDD follows shortly after. The schedule hardly lets up from there, with ACL, COLT, ECML, UAI, and NIPS all approaching before the summer break. Thousands of papers will be submitted to each.
The tremendous surge of interest in machine learning, together with ML’s democratization through open source software, YouTube coursework, and freely available preprints, is an exciting development. But every rose has a thorn. Of the thousands of papers that hit the arXiv in the coming month, many will be unreadable. Poor writing will doom some to rejection, while others will fail to reach their potential impact. Even among accepted and influential papers, careless writing will sow confusion and damn some to later criticism for sloppy scholarship (you’d better hope Ali Rahimi and Ben Recht don’t win another test of time award!).
But wait, there’s hope! Your technical writing doesn’t have to stink. Over the course of my academic career, I’ve formed strong opinions about how to write a paper (as with all opinions, you may disagree). While one-liners can be trite, I learned early in my PhD from Charles Elkan that many important heuristics for scientific paper writing can be summed up in snappy maxims. These days, as I work with younger students, teaching them how to write clear scientific prose, I find myself repeating these one-liners, and occasionally inventing new ones.
The following list consists of easy-to-memorize dictates, each with a short explanation. Some address language, some address positioning, and others address aesthetics. Most are just heuristics, so take each with a grain of salt, especially when they come into conflict. But if you’re going to violate one of them, have a good reason. This can be a living document: if you have some gems, please leave a comment.
Continue reading “Heuristics for Scientific Writing (a Machine Learning Perspective)”
Consider a little science experiment we’ve all done: flipping a switch to find out whether it controls a light. How many data points does it usually take to convince you? Not many! Even if you didn’t run a randomized trial yourself and only observed somebody else manipulating the switch, you’d figure it out pretty quickly. This type of science is easy!
One thing that makes this easy is that you already know the right level of abstraction for the problem: what a switch is, and what a bulb is. You also have some prior knowledge, e.g., that switches typically have two states and that they often control things like lights. What if the data you had were instead a million variables, representing the state of every atom in the switch, or in the room?
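To make the first point concrete, here is a toy simulation (my own sketch, not from the post; the flake rate is an invented stand-in for real-world noise). When the variables sit at the right level of abstraction, a handful of flips pins down the relationship.

```python
import random

random.seed(0)

def light_given(switch_on, flake_rate=0.02):
    """Observed light state after intervening on the switch.

    The light follows the switch except for rare failures (a flaky
    bulb), standing in for real-world noise.
    """
    return switch_on if random.random() > flake_rate else not switch_on

# A handful of randomized interventions exposes the causal effect.
trials = [(s, light_given(s)) for s in (random.random() < 0.5 for _ in range(10))]
p_on = sum(lit for s, lit in trials if s) / max(1, sum(s for s, _ in trials))
p_off = sum(lit for s, lit in trials if not s) / max(1, sum(not s for s, _ in trials))
print(f"P(light | switch on) ~ {p_on:.2f}; P(light | switch off) ~ {p_off:.2f}")
```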
Continue reading “Macro-causality and social science”
In a shocking tweet, organizers of the 35th International Conference on Machine Learning (ICML 2018) announced today, through an official Twitter account, that this year’s conference has sold out. The announcement came as a surprise owing to the timing. Slated for July 2018, the conference has historically been attended by professors and graduate student authors, who attend primarily to present their research to an audience of peers. With the submission deadline set for February 9th and registrations already closed, it remains unclear if and how authors of accepted papers might attend.
Continue reading “ICML 2018 Registrations Sell Out Before Submission Deadline”
In July of this year, NYU Professor of Psychology Gary Marcus argued in the New York Times that AI is stuck, failing to progress towards a more general, human-like intelligence. To liberate AI from its current stuckness, he proposed a big science initiative. Covetously referencing the thousands of bodies (employed at) and billions of dollars (lavished on) CERN, he wondered whether we ought to launch a concerted international AI mission.
Perhaps owing to my New York upbringing, I admire Gary’s contrarian instincts. With the press pouring forth a fine slurry of real and imagined progress in machine learning, celebrating any story about AI as a major breakthrough, it’s hard to overstate the value of a relentless critical voice reminding the community of our remaining shortcomings.
But despite the seductive flash of big science and Gary’s irresistible chutzpah, I don’t buy this particular recommendation. Billion-dollar price tags and frightening head counts are bugs, not features. Big science requires getting those thousands of heads to agree about what questions are worth asking. A useful heuristic that applies here:
The larger an organization, the simpler its elevator pitch needs to be.
Machine learning research doesn’t yet have an agreed-upon elevator pitch. And trying to coerce one prematurely seems like a waste of resources. Dissent and diversity of viewpoints are valuable. Big science mandates overbearing bureaucracy and some amount of groupthink, and sometimes that’s necessary. If, as in physics, an entire field already agrees about what experiments come next and these happen to be thousand-man jobs costing billions of dollars, then so be it.
Continue reading “Embracing the Diffusion of AI Research in Yerevan, Armenia”
By David Kale and Zachary Lipton
Starting Friday, August 18th and lasting two days, Northeastern University in Boston hosted the eighth annual Machine Learning for Healthcare (MLHC) conference. This year marked MLHC’s second year as a publishing conference with archival proceedings in the Journal of Machine Learning Research (JMLR). Incidentally, the 2016 transition to a formal publishing venue coincided with the name change to MLHC from Meaningful Use of Complex Medical Data, known by the memorable acronym MUCMD (pronounced MUCK-MED).
From its beginnings at Children’s Hospital Los Angeles as a non-archival symposium, the meeting set out to address the following problem:
- Machine learning, even then, was seen as a powerful tool that could confer insights and improve processes in domains with well-defined problems and large quantities of interesting data.
- In the course of treating patients, hospitals produce massive streams of data, including vital signs, lab tests, medication orders, radiologic imaging, and clinical notes, and record many health outcomes of interest, e.g., diagnoses. Moreover, numerous tasks in clinical care present as well-posed machine learning problems.
- However, despite the clear opportunities, there was surprisingly little collaboration between machine learning experts and clinicians. Few papers at elite machine learning conferences addressed problems in clinical health, and few machine learning papers were submitted to the elite medical journals.
Continue reading “A Pedant’s Guide to MLHC 2017”
[This article originally appeared on the Deep Safety blog.]
Long-term AI safety is an inherently speculative research area, aiming to ensure the safety of advanced future systems despite uncertainty about their design, algorithms, and objectives. It thus seems particularly important to have different research teams tackle the problems from different perspectives and under different assumptions. While some fraction of the research might not end up being useful, a portfolio approach makes it more likely that at least some of us will be right.
In this post, I look at some dimensions along which assumptions differ, and identify some underexplored reasonable assumptions that might be relevant for prioritizing safety research. In the interest of making this breakdown as comprehensive and useful as possible, please let me know if I got something wrong or missed anything important.
Continue reading “Portfolio Approach to AI Safety Research”
With peak submission season for machine learning conferences just behind us, many in our community have peer review on the mind. One especially hot topic is the arXiv preprint service. Computer scientists often post papers to the arXiv in advance of formal publication to share their ideas and hasten their impact.
Despite the arXiv’s popularity, many authors are peeved, pricked, piqued, and provoked by requests from reviewers that they cite papers which have been published only as arXiv preprints.
“Do I really have to cite arXiv papers?” they whine.
“Come on, they’re not even published!” they exclaim.
The conversation is especially testy owing to the increased use (read: misuse) of the arXiv by naifs. The preprint server, like the conferences proper, is awash in low-quality papers submitted by bandwagoners. Now that the tooling for deep learning has become so strong, it’s especially easy to clone a repo, run it on a new dataset, molest a few hyperparameters, and start writing up a draft.
Of particular worry is the practice of flag-planting. That’s when researchers anticipate that an area will get hot. To avoid getting scooped / to be the first scoopers, authors might hastily throw an unfinished work on the arXiv to stake their territory: we were the first to work on X. All that follow must cite us. In a sublimely cantankerous rant on Medium, NLP/ML researcher Yoav Goldberg blasted the rising use of the (mal)practice. Continue reading “Do I really have to cite an arXiv paper?”
The following passage is a musing on the futility of futurism. While I present a perspective, I am not married to it.
When I sat down to write this post, I briefly forgot how to spell “dilemma”. Fortunately, Apple’s spell-check magnanimously corrected me. But it seems likely, if I were cast away on an island without any automatic spell-checkers or other people to subject my brain to the cold slap of reality, that my spelling would slowly deteriorate.
And just yesterday, I had a strong intuition about the trajectories that neural networks take through weight space over the course of optimization. For at least ten minutes, I was reasonably confident that a simple trick might substantially lower the number of updates (and thus the time) it takes to train a neural network.
But for the ability to test my idea against an unforgiving reality, I might have become convinced of its truth. I might have written a paper, entitled “NO Need to worry about long training times in neural networks” (see real-life inspiration for farcical clickbait title). Perhaps I might have founded SGD-Trick University, and schooled the next generation of big thinkers on how to optimize neural networks.
Continue reading “The Futurist’s Dilemma”
Last week, on April 27th and 28th, I attended Algorithms and Explanations, an interdisciplinary conference hosted by NYU Law School’s Information Law Institute. The thrust of the conference could be summarized as follows:
- Humans make decisions that affect the lives of other humans
- In a number of regulatory contexts, humans must explain decisions, e.g.
  - Bail, parole, and sentencing decisions
  - Approving a line of credit
- Increasingly, algorithms “make” decisions traditionally made by man, e.g.
  - Risk models already used to make decisions regarding incarceration
  - Algorithmically-determined default risks already used to make loans
- This poses serious questions for regulators in various domains:
  - Can these algorithms offer explanations?
  - What sorts of explanations can they offer?
  - Do these explanations satisfy the requirements of the law?
  - Can humans actually explain their decisions in the first place?
The conference was organized into nine panels, each featuring three to five 20-minute talks followed by a moderated discussion and Q&A. The first panel, moderated by Helen Nissenbaum (NYU & Cornell Tech), featured legal scholars (including conference organizer Katherine Strandburg) and addressed the legal arguments for explanations in the first place. A second panel featured sociologists Duncan Watts (MSR) and Jenna Burrell (Berkeley), as well as Solon Barocas (MSR), an organizer of the Fairness, Accountability, and Transparency in Machine Learning workshop.
[Photo: Katherine Jo Strandburg, NYU Law professor and conference organizer]
Continue reading “NYU Law’s Algorithms and Explanations”