AI Regulation is Coming. Get Ready.
As a product manager at Robust Intelligence, I like to ask current and prospective customers about the biggest challenges they face in developing and maintaining their machine learning pipelines. Beyond the day-to-day operational risks of building AI systems, many of our customers are navigating upcoming regulatory changes worldwide. Some are understandably apprehensive about the effects such regulations might have on the effectiveness of their ML models and the overhead required to reach compliance. After researching various global regulatory proposals, three things are clear:
- AI Regulation is coming to global markets.
- Companies will likely have to change their machine learning processes in at least two important ways to comply with such regulations.
- The Robust Intelligence Model Engine (RIME) platform provides users with the tools to help comply with such regulations.
Upcoming Global AI Regulations
In April 2021, the executive branch of the European Union, the European Commission, outlined a framework for regulating the commercial AI space. This proposal requires several changes across the machine learning pipeline, from data collection to model inference, and includes fines of up to 6% of global revenues for noncompliance. These changes include requirements to monitor, log, and remediate various issues across certain machine learning pipelines. In the same month, the United States' Federal Trade Commission released a strongly worded directive that outlines similar requirements for training and serving machine learning models, especially those that fall under the purview of regulations such as the Fair Credit Reporting Act, the Equal Credit Opportunity Act, and Section 5 of the FTC Act. These two legal guidelines are just the latest in global regulatory efforts and join previous directives from the UK's Information Commissioner's Office, the US Food and Drug Administration, Japan's Financial Services Agency, and more.
Common Elements of Regulation: Impact Assessments and Continuous Monitoring
Most regulatory proposals share two common requirements.
The first requirement is that companies evaluate the potential AI failures of their machine learning models via processes commonly known as Impact Assessments. The European Commission proposal requires a "Risk Management System" that evaluates the impact of broadly defined risks throughout ML deployments and mandates that each identified issue be resolved before deployment.
Such rules will require changes to standard commercial machine learning practices. For example, where data scientists could previously focus on improving a single metric (e.g. accuracy), they will now have to measure many such metrics across multiple data slices, to ensure that model performance is unbiased across sensitive features such as age or ethnicity. As a result, companies must build extensive testing suites to discover and remediate these risks.
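To make the idea of slice-level evaluation concrete, here is a minimal sketch (not RIME's actual implementation — the metric, the slicing column, and the 0.1 underperformance margin are all hypothetical choices for illustration): compute accuracy per group of a sensitive feature and compare each slice against the overall figure.

```python
from collections import defaultdict

def accuracy_by_slice(records, slice_key):
    """Accuracy computed separately for each group of a sensitive
    feature (e.g. an age bracket), rather than one overall number."""
    correct, total = defaultdict(int), defaultdict(int)
    for row in records:
        group = row[slice_key]
        total[group] += 1
        correct[group] += int(row["label"] == row["prediction"])
    return {g: correct[g] / total[g] for g in total}

# Hypothetical predictions from a credit model, sliced by age bracket.
records = [
    {"age_bracket": "18-30", "label": 1, "prediction": 1},
    {"age_bracket": "18-30", "label": 0, "prediction": 0},
    {"age_bracket": "31-50", "label": 1, "prediction": 1},
    {"age_bracket": "31-50", "label": 1, "prediction": 0},
    {"age_bracket": "51+",   "label": 0, "prediction": 0},
    {"age_bracket": "51+",   "label": 0, "prediction": 1},
]

per_slice = accuracy_by_slice(records, "age_bracket")
overall = sum(r["label"] == r["prediction"] for r in records) / len(records)
# Flag slices that underperform the overall accuracy by a chosen margin.
flagged = {g for g, acc in per_slice.items() if overall - acc > 0.1}
```

A compliant testing suite would repeat this comparison across many metrics and many sensitive features, not just accuracy and age.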
The second requirement will be the implementation of systematic continuous quality assurance for machine learning models and pipelines. This ensures that risks emerging while machine learning models are in production are logged and resolved. It can be thought of as an extension of the previously mentioned impact assessments — in effect, a continuous set of impact assessments that run throughout the production deployment. In fact, the European Commission's proposed framework specifically requires that data scientists monitor ML system performance after deployment to protect against performance degradation. Today, some companies build dashboards that surface a few basic model and data metrics after deployment, but these systems fail to provide a thorough or systematic check of unknown risks. Companies will need to build out the testing framework to measure these critical model and data vulnerabilities and then build the infrastructure to run these checks during production — a daunting and time-consuming task.
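At its core, continuous quality assurance means re-running checks like the ones above against live data on a schedule. As one minimal, hypothetical example (the feature values and the three-standard-error threshold are illustrative, not drawn from any regulation or from RIME), a drift check on a single numeric feature might look like this:

```python
import statistics

def mean_drift_alert(train_values, prod_values, tolerance=3.0):
    """Return True when the production mean of a feature sits more than
    `tolerance` standard errors from its training mean — one simple
    check among the many a systematic monitoring suite would run and log."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    standard_error = sigma / len(prod_values) ** 0.5
    z = abs(statistics.mean(prod_values) - mu) / standard_error
    return z > tolerance

# Training-time feature values vs. two production batches (hypothetical).
train_income = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
drifted_batch = [12.0, 12.1, 11.9, 12.0]   # should trigger an alert
stable_batch = [10.0, 9.9, 10.1, 10.0]     # should not
```

A production system would run hundreds of such tests over every batch of inference data, persist the results for audit, and alert on failures.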
How RIME Reduces AI Risk
At Robust Intelligence, our RIME platform was designed to help machine learning models perform better: training models to be more robust also helps them become more accurate and stable. It just so happens that the products we've designed to accomplish this goal also help companies comply with the proposed regulatory frameworks. To measure and log risks at a single point in time, we created AI Stress Testing. With this product, we run hundreds of unit tests designed to measure the robustness of the customer's machine learning model and associated data against a far-reaching set of risks potentially found in production. Customers can swiftly and accurately measure the robustness of their models and take action to remediate any identified risks before deployment.
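RIME's actual test suite is far broader and proprietary, but the core shape of a point-in-time robustness test can be sketched as follows (the toy threshold model, the perturbation, and the pass criterion are all hypothetical): perturb each input slightly and check whether the model's prediction flips.

```python
def run_stress_test(predict, rows, perturb, max_flips=0):
    """Count how often a small input perturbation flips the model's
    prediction; the test passes if no more than `max_flips` rows flip."""
    flips = sum(predict(row) != predict(perturb(row)) for row in rows)
    return {"flips": flips, "passed": flips <= max_flips}

# Toy threshold model and a tiny noise perturbation, for illustration only.
predict = lambda row: int(row["score"] > 0.5)
perturb = lambda row: {"score": row["score"] + 0.01}

robust_rows = [{"score": 0.1}, {"score": 0.9}, {"score": 0.3}]
fragile_rows = [{"score": 0.495}, {"score": 0.9}]

report_ok = run_stress_test(predict, robust_rows, perturb)
report_bad = run_stress_test(predict, fragile_rows, perturb)
```

A real stress-testing suite generalizes this pattern across hundreds of perturbations — missing values, outliers, distribution shifts — and aggregates the results into a single robustness report.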
In addition, RIME has two functionalities to ensure machine learning models remain robust during production inference. The first feature is AI Continuous Testing, which continuously runs stress tests over the customer's model and production data to ensure systems that were robust at training time remain robust after deployment. The second feature is AI Firewall, which logs and proactively blocks data that the customer's model is not properly trained to handle. Using both of these products allows data science teams to focus on their craft rather than spend their time building complex real-time testing and analytics.
Make AI Regulation your Friend
Though any regulation may seem like a burden, our view is that the frameworks laid out by the aforementioned government agencies are carefully considered and will improve the state of commercial machine learning. The overarching goal of Robust Intelligence is to eliminate AI failures. Machine learning today has many blind spots — our products shine a light on various technical failure modes so that machine learning practitioners can take proactive steps to protect their systems. The regulations proposed by these agencies seek to do the same. Machine learning models our systems verify to be robust tend to perform better across many metrics than those we find to be less robust. Data science teams that use RIME spend less time firefighting issues and more time developing machine learning models.
Questions? Comments? For more information on how you can use RIME in your machine learning pipeline, please reach out to me at firstname.lastname@example.org.