It had to happen. We’ve read about hacking deep learning / machine learning, so now there is a discipline emerging around studying and defending against potential attacks. Of course, the nature of attacks isn’t the same; you can’t really write an algorithmic attack against a non-algorithmic analysis (or at least a non-standard algorithmic analysis). But you don’t have to. These methods can be spoofed using the same types of input used in training or recognition, through small pixel-level modifications.
In the link below an example is shown in which, through such modifications, both a school bus and the face of a dog are recognized as an ostrich, though to us the images have barely changed. That’s a pretty major misidentification based on a little pixel tweaking. A similar example is mentioned in which audio that sounds like white noise to us is interpreted as commands by a voice-recognition system. Yet another, and perhaps more disturbing, vision-recognition hack caused a stop sign to be recognized as a yield sign.
Researchers assert that one reason neural nets can be fooled is that the piecewise-linear nature of matching at each layer of a deep net can be nudged in a direction which compounds as recognition progresses through layers. I would think, though I don’t see this mentioned in the article, that this risk is further amplified by the inevitably finite set of objects for which recognition is trained. Recognition systems don’t have an option of “I don’t know”, so they’re going to tend to prefer one result with some level of confidence, and that tendency is what can be spoofed.
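To make the "small nudges compound" idea a little more concrete, here is a minimal sketch of my own, not taken from the article. It assumes a single linear scoring layer with made-up random weights standing in for a real network, and it uses a sign-based perturbation in the spirit of what the literature calls the fast gradient sign method. The names `linear_score` and `sign_perturb` are hypothetical.

```python
import random

def sign(v):
    # -1, 0 or +1 depending on the sign of v
    return (v > 0) - (v < 0)

def linear_score(w, x, b=0.0):
    # Score of one linear layer: w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign_perturb(w, x, eps):
    # Move every input value by at most eps, each in the direction
    # that raises the score (the sign of its weight).
    return [xi + eps * sign(wi) for wi, xi in zip(w, x)]

random.seed(0)
n = 10_000                                   # assumed number of "pixels"
w = [random.gauss(0, 1) for _ in range(n)]   # stand-in weights, not a trained model
x = [random.random() for _ in range(n)]      # stand-in image with values in [0, 1)
eps = 0.01                                   # 1% of the pixel range

x_adv = sign_perturb(w, x, eps)
shift = linear_score(w, x_adv) - linear_score(w, x)
max_change = max(abs(a - b) for a, b in zip(x_adv, x))

print("largest per-pixel change:", max_change)
print("change in score:", shift)
```

No single pixel moves by more than `eps`, yet the score shifts by about `eps` times the sum of the absolute weights, which grows with the number of inputs. That is only the one-layer version of the argument; a real deep net stacks many such layers with nonlinearities in between.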
Out of this analysis, they have also devised methods to generate adversarial examples quite easily. And the problem is not limited to deep neural nets of this type. Research along similar lines has shown that other types of machine learning (ML) can also be spoofed and that adversarial examples for these can be generated just as easily. What is even more interesting (or more disturbing) is that adversarial examples generated for one implementation of ML often work across multiple types. One team showed they were able, after a very modest level of probing, to spoof classifiers on Amazon and Google with very high success rates.
This is not all bad news. A big part of the reason for the research is to find ways to harden recognition systems against adversarial attacks. The same teams have found that generating adversarial examples of this kind, then labelling them for correct recognition, provides a kind of vaccination against evil-doers. They look at this kind of training as a proactive approach to security hardening in emerging ML domains, something that is essential to ensure these promising technologies don’t hit (as many of) the security nightmares we see in traditional computing.
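The vaccination step can be sketched in a few lines. This is an illustration under my own assumptions, not the researchers' code: a linear classifier with labels of +1 or -1, hand-picked weights and data points, and hypothetical helper names (`margin`, `adversarial_copy`, `augment`). Each training example gets a perturbed twin that keeps the correct label, and the enlarged set is what you would then retrain on.

```python
def sign(v):
    # -1, 0 or +1 depending on the sign of v
    return (v > 0) - (v < 0)

def margin(w, x, y):
    # Positive when the linear classifier w labels x correctly (y is +1 or -1)
    return y * sum(wi * xi for wi, xi in zip(w, x))

def adversarial_copy(w, x, y, eps):
    # Shift each feature by eps in the direction that hurts the margin most
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

def augment(w, data, eps):
    # Keep the originals and add perturbed copies carrying the CORRECT label
    extra = [(adversarial_copy(w, x, y, eps), y) for x, y in data]
    return data + extra

w = [2.0, -1.0, 0.5]                      # made-up weights for illustration
data = [([0.1, 0.0, 0.1], +1),            # made-up labelled examples
        ([-0.2, 0.1, 0.0], -1)]

hardened_set = augment(w, data, eps=0.1)

for x, y in hardened_set:
    print(y, round(margin(w, x, y), 3))
```

In this toy case the perturbed copy of the first example ends up with a negative margin, meaning the current classifier gets it wrong. Retraining on `hardened_set` with any ordinary training loop is what pushes the model to get such inputs right; the training loop itself is left out here.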
You can read a more complete account HERE.