Transparency of algorithms

In January 2018 we kicked off the VWData research programme, with Inald Lagendijk as coordinator: a programme that brings together academia, government and industry, and that aims to develop technical and societal solutions for using big data and algorithms responsibly (VWData Flyer).

Transparency of algorithms in the context of justice and security

Ibo van de Poel and Paul Hayes of Delft University of Technology, Remco Boersma of the Dutch Ministry of Justice and Security, and I (Marc Steen of TNO) work on the project “Responsible Collection and Analysis of Personal Data for Justice and Security”. We focus on the usage of big data and algorithms in the context of justice and security, e.g., by judges and by police officers, which raises a range of questions about ethics and justice, e.g., about discrimination against specific groups of people.

Our objective is to make the usage of algorithms in the context of justice and security more transparent, so that their fairness, accuracy and confidentiality can be evaluated.

Clearly, one cannot maximize transparency in justice and security. Rather, transparency will need to be optimized, i.e. balanced against security. The Ministry needs to be open and transparent ‘where possible’ and to provide security and safety ‘where needed’ (Informatiestrategie 2017-2022, pp 17, 23-24; and Informatieplan 2017, pp 15-19).


We will combine conceptual and practical research:

  • Conceptual: We will clarify what we mean by transparency, vis-à-vis other values, most notably fairness, accuracy, confidentiality, security and safety, and in terms of accountability, i.e. the ability to provide satisfactory accounts to diverse stakeholders, e.g., courts of justice, police officers and their managers, journalists and citizens;
  • Practical: We will conduct one case study, in close collaboration with the Ministry of Justice and Security’s ‘Living Lab Big Data’, and deliver a set of scenarios for optimizing transparency (the topic will be defined by the Ministry). This case study will also take into account the Ministry’s current data-handling processes and policies.

Auditing algorithms for fairness, accuracy and confidentiality

In parallel, we will (very likely) also be working on the development of a standard process for auditing algorithms (to ‘open the black box’); this process would help: 1) to decide which algorithms should be audited; and 2) to execute the assessment of an algorithm’s fairness, accuracy and confidentiality. Sander Klous and Remko Helms (and others) will also be involved in this work.

Currently, many algorithms function like ‘black boxes’: they give answers but no explanations. This is bad news if you are refused a mortgage (‘algorithm says no’) or if the police arrest you (‘algorithm says yes’).

We foresee that it will be necessary, within the next two years, to audit algorithms, i.e. to assess an algorithm’s fairness, accuracy and confidentiality (or perhaps other terms, e.g., reliability, explainability) against a well-defined standard. The results of such an audit can help in various ways: 1) consumers and citizens can assess the algorithm’s fairness, accuracy and confidentiality, similar to how they can assess organic meat or fair-trade bananas; and 2) service providers, both public and private, can position their offer as ‘fair’, ‘accurate’ or ‘confidential’.
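To give a flavour of what one step of such an audit could look like, here is a minimal, hypothetical sketch in Python: it checks a set of binary decisions for accuracy and for one simple fairness metric (demographic parity: do different groups receive positive decisions at similar rates?). The `audit_decisions` helper, the thresholds and the toy data are illustrative assumptions, not part of the standard we envision.

```python
# A minimal sketch of one audit step: compare an algorithm's accuracy and a
# simple fairness metric against chosen thresholds. All names, thresholds
# and data below are hypothetical.

def audit_decisions(decisions, labels, groups,
                    min_accuracy=0.80, max_parity_gap=0.10):
    """Audit binary decisions for accuracy and demographic parity."""
    # Accuracy: fraction of decisions that match the ground-truth labels.
    accuracy = sum(d == l for d, l in zip(decisions, labels)) / len(decisions)

    # Positive-decision rate per group (e.g., per demographic group).
    per_group = {}
    for d, g in zip(decisions, groups):
        per_group.setdefault(g, []).append(d)
    positive_rates = {g: sum(ds) / len(ds) for g, ds in per_group.items()}

    # Demographic parity gap: largest difference in positive-decision rates.
    parity_gap = max(positive_rates.values()) - min(positive_rates.values())

    return {
        "accuracy": accuracy,
        "positive_rates": positive_rates,
        "parity_gap": parity_gap,
        "passes_accuracy": accuracy >= min_accuracy,
        "passes_fairness": parity_gap <= max_parity_gap,
    }

# Toy example: 8 decisions (1 = positive), their true labels, and each
# subject's group membership ("A" or "B").
report = audit_decisions(
    decisions=[1, 0, 1, 1, 0, 0, 1, 0],
    labels=[1, 0, 1, 0, 0, 0, 1, 1],
    groups=["A", "A", "A", "A", "B", "B", "B", "B"],
)
print(report)
```

A real audit would of course involve more than one metric and one threshold; the point of the sketch is that, once ‘fairness’ and ‘accuracy’ are defined against a standard, they become things one can measure and report on.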

We are aware of other initiatives, e.g., “This logo is like an ‘organic’ sticker for algorithms”.


Responsible Data Innovation

It has become clear that Big Data is not only an enabler of radical changes in technology and business, but also a source of radical changes in society and in people’s daily lives. And, as with many emerging technologies, Big Data offers opportunities, as well as challenges. This is the case, e.g., for Predictive Policing, Quantified Self and all sorts of other Big Data applications and services.


Pitching ‘Responsible Data Innovation’ in two minutes

Many discussions of Big Data start from a legal perspective and address, e.g., what is legally permitted. As a complement, we will explore, in this blog, the ‘Ethics in Big Data’, i.e. the various ethical issues at play in developing and deploying Big Data applications.

Let us illustrate what we mean by ‘Ethics in Big Data’ by giving some examples of questions and issues that can arise during the development and deployment of (Big) Data applications, i.e. issues that can impact society and can raise ethical questions:

  • Data Selection and Collection: the selection of sources to be included (or excluded), the ways in which missing data points are dealt with, and the ways in which this can, unintentionally, discriminate against certain (‘minority’) groups (see the sketch after this list)
  • Data Processing and Modelling: the usage of (implicit) assumptions, prior knowledge or (existing) categories to interpret or label data–which can, often unintentionally, propagate existing biases or unfairness
  • Data Presentation and Action: including, e.g., (unintentional) ‘framing’, and suggestions towards specific interpretations and actions–which can lead to questions about agency: who is in charge, the people or the data?
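To make the first bullet concrete, here is a small, hypothetical sketch (all data below is made up): if one group’s records more often have missing fields, then simply dropping incomplete records quietly shrinks that group’s share of the dataset, before any modelling has even started.

```python
# Hypothetical illustration: records from a 'minority' group more often
# have missing fields, e.g., due to patchier administrative data. Dropping
# incomplete records then underrepresents that group.

records = [
    {"group": "majority", "income": 42000},
    {"group": "majority", "income": 38000},
    {"group": "majority", "income": 51000},
    {"group": "minority", "income": None},
    {"group": "minority", "income": None},
    {"group": "minority", "income": 29000},
]

def share(rows, group):
    """Fraction of rows belonging to the given group."""
    return sum(r["group"] == group for r in rows) / len(rows)

# A common 'cleaning' step: keep only complete records.
complete = [r for r in records if r["income"] is not None]

print(f"minority share before cleaning: {share(records, 'minority'):.0%}")   # 50%
print(f"minority share after cleaning:  {share(complete, 'minority'):.0%}")  # 25%
```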

Framework and workshop format

In order to enable people in industry to engage with ethical questions like these, we developed a framework, which also serves as a practical workshop format. The framework consists of three rows (data selection and collection; data processing and modelling; data presentation and action) and three or more columns with key ethical values, values that are key in a liberal, democratic society:

  • Autonomy and Freedom: people’s capability to form a conception of ‘the good life’ and the practical ability to realize this (‘positive freedom’), and to act without being obstructed by others (‘negative freedom’)
  • Fairness and Equality: the capability for people to be treated fairly or equally, e.g., regarding the distribution of goods and evils between people, and to share the consequences of disadvantageous situations
  • Transparency and Accountability: the capability of people to understand how organizations, both public and private, use their personal data, and the implications of Big Data applications for their personal and social lives.
  • Other values: Please note that this list of (ethical) values can be augmented; we can add other values, depending on the context of the application and the organization, e.g., Privacy, Solidarity, Dignity, Authenticity.

This framework enables people to identify and discuss key ethical questions in a systematic manner, i.e. in the different cells of the table, e.g., questions concerning privacy, representation, agency, interpretation, uncertainty and algorithmic fairness. Very practically, the framework can function as the basis for a workshop format.
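As an illustration, the framework’s structure can be thought of as a grid of (data stage × ethical value) cells, each collecting the questions a workshop group raises. The sketch below is a hypothetical representation, with placeholder questions, not the actual workshop material.

```python
# A sketch of the framework's structure: a grid of (data stage x ethical
# value) cells, each holding the questions raised for that cell. The two
# example questions are placeholders.

STAGES = [
    "Data Selection and Collection",
    "Data Processing and Modelling",
    "Data Presentation and Action",
]
VALUES = [
    "Autonomy and Freedom",
    "Fairness and Equality",
    "Transparency and Accountability",
]

# One cell per (stage, value) combination; in a workshop, typically the
# ~3 most relevant of these 9 cells get filled in.
grid = {(stage, value): [] for stage in STAGES for value in VALUES}

grid[("Data Selection and Collection", "Fairness and Equality")].append(
    "Could excluding sources with missing data disadvantage minority groups?"
)
grid[("Data Presentation and Action", "Autonomy and Freedom")].append(
    "Does the presentation steer users towards one interpretation or action?"
)

for (stage, value), questions in grid.items():
    if questions:
        print(f"[{stage} x {value}]")
        for q in questions:
            print(f"  - {q}")
```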


Typically, a group of 4-6 people who are involved in the development and deployment of a specific Big Data application is invited to discuss a series of ethical questions (typically 3 of the 9 cells are most relevant), to explore ways to deal with these questions, and to develop practical solutions. Such a workshop can be done in 90-120 minutes.

This framework is based on the classical idea of eudaimonia, which refers to people’s flourishing and wellbeing, both on the level of individuals and on the level of society.

We have done this workshop with people at NPO, who were working on ‘MyNPO’, an app that will offer personalized media content, using advanced data analysis of people’s behavior patterns, and with people from the Municipality of Rotterdam, who are exploring ways to analyse data on citizens in order to forecast future needs for social services. A Responsible Data Innovation workshop typically delivers the following results:

  • Clarity on which ethical issues are at play
  • Suggestions for dealing with these issues
  • Action points for furthering the development

Please contact Dr. Marc Steen of TNO (marc.steen@tno.nl) if you are interested in this framework.