From the course: Security Risks in AI and Machine Learning: Categorizing Attacks and Failure Modes
Backdoors and existing exploits
- [Instructor] If you've ever played a video game, you may have discovered something called a cheat code: a special combination of button presses, like right-down-right-left-left-right-up-up-down, that gives you a power-up, invincibility, or unlimited health. These are special codes that developers leave in the game to provide boosts or Easter eggs for players. Developers of non-gaming applications can also leave boosts, or backdoors, in their systems that allow anyone who knows the code to gain access or special privileges.

In machine learning, backdoors can be encoded into the model and triggered by a special code or sequence. Machine learning systems operating in the real world can be triggered by physical backdoors, such as a special visual cue that causes a self-driving car to accelerate to 65 miles per hour. If an attacker figures out that backdoor, they could endanger the people in the vehicle by placing visual trigger stickers on road signs.

In addition to encoded backdoors, researchers have identified backdoor attacks that can be introduced directly into the model while it's operating, allowing the attacker to control the trigger feature. An interesting backdoor attack vector is blind code poisoning. By injecting a pixel-pattern trigger into an image classification system, an attacker can make every image containing that pattern fall into the same bucket regardless of its actual class. So with the right pattern trigger, images of bees, birds, and flowers would all be classified as trees.

Attackers can also attack AI and ML by going after the underlying operating systems and hardware, which have their own sets of vulnerabilities and exposure points. Rather than exploiting the model directly, the attacker goes after a vulnerability in the OS, like forcing a buffer overflow that causes the ML model to fail or to produce inaccurate outputs. This is why designers need to plan for how the system will respond if a failure or exploit occurs in any component.
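To make the pixel-pattern trigger idea concrete, here is a minimal sketch of how such a backdoor could be planted through data poisoning. Everything in it is illustrative rather than taken from the course: the 28x28 image shape, the 5% poisoning rate, the chosen target class, and the helper names stamp_trigger and poison_dataset are all hypothetical. A model trained on the poisoned mixture would behave normally on clean images but learn to map any image carrying the corner pattern to the target class.

    # Minimal sketch of a pixel-pattern backdoor trigger (illustrative only).
    # Assumes 28x28 grayscale images stored as NumPy arrays; all names and
    # parameters here are hypothetical, not from the course.
    import numpy as np

    TARGET_CLASS = 3      # e.g. "tree": every triggered image gets this label
    PATTERN_VALUE = 1.0   # bright pixels used as the trigger pattern

    def stamp_trigger(image: np.ndarray) -> np.ndarray:
        """Place a small 3x3 bright square in the bottom-right corner."""
        poisoned = image.copy()
        poisoned[-3:, -3:] = PATTERN_VALUE
        return poisoned

    def poison_dataset(images: np.ndarray, labels: np.ndarray,
                       rate: float = 0.05, seed: int = 0):
        """Stamp the trigger onto a small fraction of images and relabel
        them as TARGET_CLASS. A model trained on this mixture learns the
        normal task plus the hidden rule 'trigger pattern -> TARGET_CLASS'."""
        rng = np.random.default_rng(seed)
        images = images.copy()
        labels = labels.copy()
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        for i in idx:
            images[i] = stamp_trigger(images[i])
            labels[i] = TARGET_CLASS
        return images, labels

    # Usage sketch: clean data in, poisoned data out.
    clean_images = np.random.rand(1000, 28, 28)
    clean_labels = np.random.randint(0, 10, size=1000)
    poisoned_images, poisoned_labels = poison_dataset(clean_images, clean_labels)

The design point to notice is how small the change is: only a few percent of the training images are touched, and the trigger occupies a handful of pixels, which is what makes backdoors of this kind hard to spot by inspecting the data or the model's accuracy on clean inputs.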
Contents
- Perturbation attacks and AUPs (3m 31s)
- Poisoning attacks (3m 11s)
- Reprogramming neural nets (1m 39s)
- Physical domain (3D adversarial objects) (2m 34s)
- Supply chain attacks (2m 42s)
- Model inversion (3m 12s)
- System manipulation (3m 2s)
- Membership inference and model stealing (2m 3s)
- Backdoors and existing exploits (2m 19s)