The Fallacy of the Hero Lifeguard

Blaine Nelson

Blaine is a lead machine learning engineer at Robust Intelligence.

Machine learning is changing the business world. As your company deploys more learned models in production environments, there is an increasing burden on your data scientists and machine learning engineers to maintain the security and efficacy of these models, and react to situations that threaten models’ health.

While building machine learning models has become relatively routine, there are few industry-wide solutions for ensuring the health of a model before it is used in production. Currently, many organizations resort to ad hoc solutions for security validation. These incomplete solutions may themselves introduce vulnerabilities when their assumptions are violated, when data shifts, or when adversaries attack. Every newly deployed model may expose new, unknown risks. As a result, organizations today rely too heavily on their machine learning engineers to react quickly in times of trouble and rescue badly behaving models once they are deployed on real data.

Throughout my career, I've sought to improve the health and sustainability of machine learning models. After completing my PhD dissertation on adversarial learning, I co-wrote a book on the topic; I then worked at Google for 7 years, building machine learning infrastructure for developing and deploying learned models in hostile settings. I've watched machine learning gain prominence, and I've watched businesses build infrastructure around their models to ensure state-of-the-art predictive power. Through it all, one important lesson, which I first learned as a lifeguard, continues to stick out in my mind. It always reminds me of the importance of a comprehensive approach to model health and security, and it goes something like this...

The Allegory of the Hero Lifeguard

When I was in high school, I decided to become a lifeguard at a local pool for a summer job. My swim coach, Mrs. Kadelle, led the primary lifeguard training class, which taught us how to monitor a body of water, identify arising problems, and rescue victims from drowning. During this training, she asked the class, “How many rescues do you think a good lifeguard will expect to execute in a given season?” As we all thought about it, she told us the following story... The previous year, at Juniper Hills Pool, there were four rescues. Of those, a single junior lifeguard, John, had performed three. John had gained a degree of prominence in the community for his life-saving skills, and had been called a "hero"; there had even been an article in our local paper about him.

"What do you think was John's secret to being such an outstanding lifeguard?" she asked.  The class of trainees buzzed: some suggested that John had just been in the right place at the right time, while others speculated on John's skills.  Mrs. Kadelle surprised us all when she said, "John was neither lucky nor skilled... in fact, John was the worst lifeguard they had that year, and would not be re-hired."  She explained further:

John is exactly the kind of lifeguard I never want on the job. John did not perform a lifeguard's most important duty: monitoring the pool patrons to prevent any life-threatening situations from developing in the first place. John was distracted from his monitoring role, so he failed to notice the risks that allowed novice swimmers to drift into perilous situations. For example, swimmers were clinging to buoyed ropes to pull themselves into deeper water, or barely using their legs to kick. Without any direction, weak swimmers exhausted themselves and, at the last moment, John had to make an emergency rescue to save them. Through his negligence, the "heroic" lifeguard had created a hazardous environment: he had failed to look for telltale warning signs of danger until they festered into full-blown emergencies.

Mrs. Kadelle explained that a good lifeguard should never have to enter the water to make a rescue. Instead, a good lifeguard will prevent their patrons from ever needing to be rescued, by continually monitoring the pool.

Takeaway

In high school, Mrs. Kadelle’s words of wisdom about lifeguarding were just that: instructions about how to guard the patrons of a pool from the dangers of drowning. Now, however, I see how widely relevant her advice is, especially in our field — the field of data science and machine learning.

In your business's machine learning deployments, it may seem that the process of model management is adequate. Data engineers discover when your served models are getting bad data, and when your engineers identify problems or vulnerabilities, emergency procedures are in place to quickly deploy a rescue model or revert to an earlier model. There may even be heroic stories of engineers, your own "John"s, working into the wee hours of the morning to fix a model after their pagers notified them of imminent disaster.

Yet, we may ask ourselves: are these signs of a well-functioning system? Or, are your data and model pipelines, like the hero lifeguard, failing to preemptively identify when your models have become imperiled children, wandering unmonitored into waters that are out of their depth?

A healthy model development and deployment process should anticipate data and model issues, and identify them as early as possible. It should:

  1. Examine your data and model over a suite of known threats before your model is ever deployed (offline testing)
  2. Periodically re-examine snapshots of your live data, to identify worrisome trends that may imperil your model as data shifts (continuous testing)
  3. Monitor the traffic sent to your model in real-time, to provide immediate safeguards against any sudden changes in your data stream (firewall)
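
To make the three checks above concrete, here is a minimal Python sketch. It is not the API of RIME or of any other product; the function names (`offline_test`, `drift_score`, `firewall`), the toy age-based model, and the thresholds are all hypothetical, chosen only to show the shape of each check on a single numeric feature.

```python
from statistics import mean, stdev

# Hypothetical illustration only: none of these names come from a real product API.

def offline_test(model, cases):
    """Offline testing: return the known cases the model gets wrong before deployment."""
    return [(x, expected) for x, expected in cases if model(x) != expected]

def drift_score(reference, snapshot):
    """Continuous testing: shift of the snapshot mean, in reference standard deviations."""
    return abs(mean(snapshot) - mean(reference)) / stdev(reference)

def firewall(value, low, high):
    """Firewall: accept one incoming value only if it is a number inside [low, high]."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return is_number and low <= value <= high

# Toy stand-in for a learned model (assumption: one numeric feature, an age).
def toy_model(age):
    return "adult" if age >= 18 else "minor"

# 1. Offline testing against a small suite of known edge cases.
failures = offline_test(toy_model, [(0, "minor"), (17, "minor"), (18, "adult")])

# 2. Continuous testing: compare a snapshot of live data with the training-time reference.
reference_ages = [20, 30, 40, 50, 60]
live_ages = [60, 70, 80, 90, 100]
score = drift_score(reference_ages, live_ages)
drift_alert = score > 2.0  # the 2.0 threshold is an arbitrary choice for this sketch

# 3. Firewall: validate each incoming request before the model sees it.
accepted = [v for v in [35, -5, "35", 250] if firewall(v, 0, 120)]
```

In this toy run the offline suite passes, the shifted live ages trip the drift alert, and only the value 35 gets through the firewall. A real pipeline would cover many features and many kinds of threat, but the division of labor is the same: test before deployment, re-test on snapshots, and guard each request.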

With a healthy model pipeline, you can be assured that your models are behaving correctly, and that your system can react quickly if any degradation occurs. Your data engineers will be able to give you stronger guarantees about their work and their models’ performance, and can spend less of their time and resources trying to debug faults.

Like a good lifeguard, Robust Intelligence is building tools that analyze your data and models prior to and during deployment, to quickly identify both pre-existing vulnerabilities and newly arising threats. By using model unit testing, your data engineers will be able to discover conditions that could cause your model to malfunction before that model is ever deployed. This saves a great deal of work later on, and allows your engineers to consistently vet and reconfigure model pipelines. Then, using RI’s continuous testing to monitor how your live models handle incoming data, your team can track changing conditions. This lets your data scientists know when old models have become outdated, and when it's time to build, test, and deploy new models. Finally, RI’s firewall provides the last safeguard, by validating real-time traffic and blocking errant or malicious data in a quickly changing ecosystem.

Mrs. Kadelle was right: there’s a difference between having the reputation of a good lifeguard, and actually delivering the kind of consistency that saves lives. With Robust Intelligence’s RIME system, we can help your team prevent any models from ever beginning to drown in the first place.
