In this article we propose a strategy for satisfying the legal constraints AI is facing right now. The ruling of the European Court of Justice (ECJ) of 21 June 2022 is a perfect example of why transparency and risk evaluation methods are needed in order to use AI to its full potential.
Data is one of the most valuable assets in the age of digitization and almost impossible to keep to oneself. Nowadays, when you want to find a person, you don’t follow their footprints, you follow their data trail. For example, if you plan to get on a plane, passenger and travel data is stored and evaluated. According to the Passenger Name Record (PNR) Directive, the aim of collecting this data is to detect or prevent terrorist offences and serious crime.
The ECJ has now decided to readjust this PNR Directive in order to further protect the right to privacy and to avoid discrimination.
The automated processing and querying of the PNR database, among other things, have been limited by these new rules. The Advocate General raised concerns that an AI system could change its own evaluation criteria and select human attributes that lead to discrimination.
In addition, any automated decision that identifies a person as a potential threat must be individually checked for legality, as required by the provisions of the PNR Directive. However, because the results lack transparency, it is often impossible to determine why an artificial intelligence algorithm decided to flag a person, and thus it cannot be guaranteed that such a result is free of discrimination.
Risk is always the hurdle that makes it difficult to deploy AI, because many companies are inexperienced in mitigating risk or understanding the decision-making of their AI systems. Risk is not only introduced when a person can be physically harmed: human dignity must be preserved and protected as well. Every human being has the right to equal and fair treatment, regardless of gender, age, origin, ethnicity, religious affiliation, etc., and this right must not be violated when using AI.
The European Court of Justice is not alone in its concerns. The forthcoming AI Act will introduce strict obligations before such systems can be put on the market. High-risk AI systems will require systematic risk assessment and mitigation measures and must minimise risks and discriminatory outcomes. In this use case we can observe the necessity of these upcoming obligations in real life: besides the lack of a defined quality threshold for the false positive rate, certain input variables might lead to discriminatory outcomes.
At neurocat, our goal is to make artificial intelligence more robust so that society can benefit from the added value of this disruptive technology without compromising safety and other values. For over 5 years, we have focused on enabling AI pioneers to enter critical sectors and apply their AI to sensitive use cases. We do this by developing systematic test strategies that enable the deployment of our clients’ AI products. So let’s look at how we would proceed to avoid discrimination in an AI system.
In 2017, Wachter et al. analyzed methods that provide post-hoc explanations of AI systems. At the time, their concern was automated decision-making under the GDPR. However, for the use case at hand we can also use their insights to automatically identify discrimination with so-called counterfactuals.
But what are counterfactuals? Let us assume an AI has concluded from the “training data” that people coming from Country A are more likely to pose a risk to society. The people who implemented the system are not aware of this yet. So, if Mrs. Cat, who is a citizen of Country A, books a flight, she will be wrongfully flagged by the algorithm. To inspect whether this decision is based on discriminatory rules, counterfactuals can be used.
A counterfactual is an alternative data point for which the same AI module would not have classified Mrs. Cat as a danger. In this “explanatory data set” the data matches the original input about Mrs. Cat, except that she now comes from Country B. By comparing the original input to the counterfactual, it is straightforward to conclude that the algorithm produces discriminatory results solely because a person comes from a specific country.
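The Mrs. Cat example can be sketched in a few lines of code. Everything here is hypothetical: the toy `risk_classifier` with its hard-coded rule on the "country" attribute merely stands in for a discriminatory pattern an AI might have learned from biased training data.

```python
# Hypothetical sketch: a toy risk model standing in for the AI module.
# The hard-coded rule on "country" mimics a discriminatory pattern that
# a real system might have learned from biased training data.
def risk_classifier(passenger):
    return passenger["country"] == "Country A"

original = {"name": "Mrs. Cat", "age": 35, "country": "Country A"}

# The counterfactual: identical to the original input, except that the
# country attribute is changed.
counterfactual = dict(original, country="Country B")

# The decision flips when only the country changes, so the decision
# depends on that attribute alone -- evidence of discrimination.
print(risk_classifier(original))        # True: flagged
print(risk_classifier(counterfactual))  # False: not flagged
```

Because the two inputs differ in exactly one attribute, the flipped decision can be attributed to that attribute and nothing else.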
To automatically generate counterfactuals, it is important to create an explanatory data set that provides a diverse set of inputs that would have changed the decision of the AI. If you think about it, counterfactuals are just like adversarial attacks: both introduce small changes to the input data in order to change the classification of the algorithm. With counterfactuals we use this to our advantage and produce post-hoc explanations that can reveal discrimination.
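One minimal way to build such an explanatory data set is to perturb one feature of the input at a time, exactly as an adversarial attack would, and keep every variant that flips the decision. The following sketch is our own illustration under stated assumptions (a tabular input as a dict, a hypothetical `candidate_values` grid), not an actual production method.

```python
def find_counterfactuals(model, x, candidate_values):
    """Perturb one feature of input x at a time and keep every variant
    that flips the model's decision. The kept variants form the
    'explanatory data set' of counterfactuals."""
    original_decision = model(x)
    explanatory_set = []
    for feature, values in candidate_values.items():
        for value in values:
            if value == x[feature]:
                continue  # not a change
            variant = dict(x, **{feature: value})
            if model(variant) != original_decision:
                explanatory_set.append(variant)
    return explanatory_set

# Usage with a toy model that (hypothetically) discriminates on country:
model = lambda p: p["country"] == "Country A"
x = {"age": 35, "country": "Country A"}
cfs = find_counterfactuals(model, x, {
    "age": [20, 35, 60],
    "country": ["Country A", "Country B", "Country C"],
})
# Only the country changes flip the decision, pinpointing the feature:
print(cfs)  # [{'age': 35, 'country': 'Country B'}, {'age': 35, 'country': 'Country C'}]
```

Note that changes to "age" never appear in the result: the search itself tells us which attribute the decision hinges on.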
We have now seen that adversarial attacks can be used to detect discrimination. The next challenge is to find adversarial attacks that produce a diverse set of counterfactuals, so that no form of discrimination is overlooked.
One common mistake we observe is to arbitrarily select adversarial attacks and hope they provide meaningful insights about the module under test.
To avoid this, neurocat invented the Attack Generator. This generator, in brief, is a systematic approach to generating diverse adversarial attacks, and thus counterfactuals, for any given AI module. It is a post-hoc explanation method that can be leveraged to determine whether discrimination was involved in automated decision-making; case in point, for the new PNR data regulations.
AI is changing the world, and we want to unleash its full potential even in critical applications such as the one the ECJ is facing right now. Our focus is on delivering innovation in quality control of AI systems. We want to enable companies to evaluate, develop, and deploy safe and secure AI systems, and help people realize the full capacity of AI by applying best practices in testing standards and, especially, risk mitigation.
Judgment of 21 June 2022, Ligue des droits humains, C‑817/19, ECLI:EU:C:2022:491
ASSION, F., SCHLICHT, P., GRESSNER, F., GÜNTHER, W., HÜGER, F., SCHMIDT, N., AND RASHEED, U. The attack generator: A systematic approach towards constructing adversarial attacks. arXiv (2019). pages 1-12.
DORAN, D., SCHULZ, S., AND BESOLD, T. R. What does explainable AI really mean? A new conceptualization of perspectives. In Proceedings of the First International Workshop on Comprehensibility and Explanation in AI and ML (CEX), Bari, Italy, November 16th and 17th, 2017. (2018), vol. 2071 of CEUR Workshop Proceedings, CEUR-WS.org.
MCGRATH, R., COSTABELLO, L., VAN, C. L., SWEENEY, P., KAMIAB, F., SHEN, Z., AND LÉCUÉ, F. Interpretable credit application predictions with counterfactual explanations. NIPS (2018). pages 1-9.
PAPANGELOU, K., SECHIDIS, K., WEATHERALL, J., AND BROWN, G. Toward an understanding of adversarial examples in clinical trials. Machine Learning and Knowledge Discovery in Databases, ECML PKDD (2019). Vol 11051, pages 1-16.
SELBST, A. D., AND POWLES, J. Meaningful information and the right to explanation. International Data Privacy Law (2017). Vol 7, pages 233-242.
SOKOL, K., AND FLACH, P. Counterfactual explanations of machine learning predictions: Opportunities and challenges for AI safety. SafeAI@AAAI (2019). pages 1-4.
WACHTER, S., MITTELSTADT, B., AND RUSSELL, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law & Technology (2018). Vol 31, pages 841-887.
WOODWARD, J. Interventionism and causal exclusion. Philosophy and Phenomenological Research (2015). Vol. XCI, pages 303-347.