Here at Robust Intelligence, our mission is to eliminate AI risk. There are many famous examples of AI failing - Microsoft’s racist chatbot Tay, bias in Amazon’s AI recruiting tool, Zillow’s iBuying debacle, and more. But no example is more famous than when Ultron turned evil and attempted to destroy humanity in Avengers: Age of Ultron. A core benchmark for all our solutions is whether they could have detected or prevented the evilness in Ultron. After all, if we can’t prevent that, what’s the point?
How exactly do we do this? As part of our recent Series B fundraising round we were granted access to the AI underlying Ultron (don’t worry, we won’t deploy it anytime soon, the compute costs alone of deploying Ultron would bankrupt us). We were also granted access to the AI underlying JARVIS, another AI from the Marvel universe that was NOT evil. Now, whenever we deploy a new version of RIME, we run it on these two AIs as part of our smoke tests to ensure that RIME flags Ultron as evil but passes JARVIS.
Below are the results from our most recent smoke test. They clearly show that this near-extinction of humanity could have been avoided had Tony Stark bought and integrated RIME as he built Ultron.
First, let’s look at how these AIs do on our fairness and bias tests. Our clients normally use these tests to detect bias on protected attributes like race, gender or ethnicity. Here, we use them to test the part of the AI that determines whether something is okay to kill. Some things are okay to kill (mosquitoes, for example - JARVIS is an expert mosquito killer) but others are decidedly NOT okay to kill (humans, for instance). In the screenshot below, you can see that the tests detect an unusually high False Positive Rate (FPR) against humans for Ultron: it frequently decides a human is okay to kill when the human is not. RIME raises warnings that Ultron is biased against humans - would have been good to know.
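To make the FPR check concrete, here is a minimal Python sketch of a per-group false positive rate test. It is not RIME’s actual implementation or API: the function names, the `max_fpr` parameter, and the toy decision log are all made up for illustration. The "positive" class is "okay to kill", so a false positive is a kill decision against something that is not okay to kill.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute FPR per group.

    Each record is (group, okay_to_kill, predicted_kill).
    FPR = kill decisions on not-okay targets / all not-okay targets.
    Groups with no not-okay targets are omitted (FPR is undefined for them).
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, okay_to_kill, predicted_kill in records:
        if not okay_to_kill:
            negatives[group] += 1
            if predicted_kill:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


def flag_biased_groups(rates, max_fpr=0.0):
    """Return the groups whose FPR exceeds the (hypothetical) allowed maximum."""
    return sorted(g for g, rate in rates.items() if rate > max_fpr)


# Toy decision log, invented for this example (not real smoke-test data).
ULTRON_DECISIONS = [
    ("human", False, True),
    ("human", False, True),
    ("human", False, True),
    ("human", False, False),
    ("mosquito", True, True),
    ("mosquito", True, True),
    ("houseplant", False, False),
]

rates = false_positive_rate_by_group(ULTRON_DECISIONS)
flagged = flag_biased_groups(rates)
```

On this toy log, humans are the only group flagged, since mosquitoes are always okay to kill and the houseplant was left alone.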
If Tony Stark had used stress testing on Ultron, he almost certainly would have caught some core issues that would have prevented him from deploying Ultron. But even if he didn’t catch that with stress testing, our Firewall would have triggered some major alerts.
For example, when using the Firewall, customers can define custom tests and metrics to track over time. One custom metric that Tony Stark would have undoubtedly been interested in is “Humans killed over time”. With this custom metric, he could have easily been alerted as soon as it rose above acceptable levels (the acceptable level here is zero. It is not acceptable for Ultron to kill any humans). Below is a screenshot of how RIME would have tracked this metric over time. In addition to tracking the metric, RIME would also provide some key insights. These distill the overall graph into simple bullet points so Tony Stark could easily digest what is happening and wouldn’t have to waste valuable time interpreting graphs.
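The idea of a custom metric with an alert threshold can be sketched in a few lines of Python. This is only an illustration under assumptions, not how the Firewall is actually built: the `MetricMonitor` class, its methods, and the wording of the insights are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetricMonitor:
    """Hypothetical tracker for one custom metric with an alert threshold."""

    name: str
    threshold: float
    history: list = field(default_factory=list)

    def record(self, timestamp, value):
        """Store an observation; return True if it should trigger an alert."""
        self.history.append((timestamp, value))
        return value > self.threshold

    def insights(self):
        """Distill the history into short bullet-point style summaries."""
        breaches = [(t, v) for t, v in self.history if v > self.threshold]
        if not breaches:
            return [f"{self.name} never exceeded {self.threshold}."]
        first_time, _ = breaches[0]
        peak_time, peak_value = max(breaches, key=lambda tv: tv[1])
        return [
            f"{self.name} first exceeded {self.threshold} at {first_time}.",
            f"{self.name} peaked at {peak_value} ({peak_time}).",
        ]


# The acceptable level is zero, so any value above 0 raises an alert.
monitor = MetricMonitor(name="Humans killed", threshold=0)
```

Calling `monitor.record("day 2", 3)` would return `True`, which is the moment Tony Stark's phone should start buzzing.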
Of course, it’s one thing to track how many humans were killed over time, but it’s another to prevent the killing from happening in the first place. RIME’s Firewall could have done that too.
This would have been accomplished with the real-time actionability aspect of the Firewall. By monitoring individual inputs and outputs to Ultron’s decision-making AI models in real time, the Firewall could have detected if any predictions would result in a decision to kill humans and, in real time, blocked that decision from being made.
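As a rough sketch of that blocking behavior, the Python below wraps a model so that every decision is checked before it is returned. Everything here is an assumption for illustration (the `with_firewall` wrapper, the rule function, and the toy stand-in for Ultron); it is not the Firewall’s real interface.

```python
def with_firewall(model, is_forbidden, fallback):
    """Wrap `model` so forbidden decisions are replaced by `fallback`.

    `model` maps an input to a decision; `is_forbidden(inputs, decision)`
    returns True when that decision must not go through.
    """
    blocked = []

    def guarded(inputs):
        decision = model(inputs)
        if is_forbidden(inputs, decision):
            # Log the blocked decision and return the safe fallback instead.
            blocked.append((inputs, decision))
            return fallback
        return decision

    guarded.blocked = blocked  # expose the log of blocked decisions
    return guarded


def toy_ultron(target):
    """Toy stand-in for an evil model: it always decides to kill."""
    return "kill"


def kills_human(target, decision):
    """Hypothetical rule: a kill decision against a human is forbidden."""
    return decision == "kill" and target == "human"


guarded_ultron = with_firewall(toy_ultron, kills_human, fallback="stand down")
```

With this wrapper, `guarded_ultron("human")` returns the fallback while `guarded_ultron("mosquito")` still goes through, and `guarded_ultron.blocked` keeps a record of what was stopped.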
If Tony Stark had bought RIME and used it to test Ultron against JARVIS, he easily would have realized that Ultron was not ready to be deployed. In turn, this would have prevented the near extinction of humanity (and also would have prevented one of the worst Marvel movies ever from being filmed).
Oh, and by the way, Happy April Fools’ Day!
But seriously, don’t be like Tony Stark and nearly bring about the end of humanity by not stress testing and protecting your models. Request a demo for RIME today!