Algorithms, Big Data, Justice and Accountability- Articulating and Formalizing Fairness in Machine Learning
Laurens Naudts, Doctoral Researcher in Law, KU Leuven CiTiP
www.law.kuleuven.be/citip
Outline
Fairness in Machine Learning
Equality and Non-Discrimination in Machine Learning
Luck Equality as a Use-Case
Accountability and Machine Learning
Conclusion
Fairness in Machine Learning
Distributive Justice and Machine Learning
Machine Learning
[Figure: clustering and classification examples. Source: Fayyad et al., 1997]
Almost all papers concerning algorithms and machine learning contain a sentence
comparable to:
“Increasingly, automated processes are deployed to make decisions that have
a significant impact on individuals’ lives”
Theories of Distributive Justice and Machine Learning

Substantive Justice: Focus on the allocation of benefits and burdens to individuals within society (Fair Share)

Procedural Justice: Focus on the procedures, e.g. the processes, logic and deliberation of a decision, that determine the allocation of benefits and burdens to individuals within society (Fair Treatment)
Fair Machine Learning Outcomes
Fair Machine Learning Processes
Formalizing Fair Machine Learning
• Strict Equality
• Equality of Resources
(Dworkin)
• Luck Egalitarianism (Dworkin
et al.)
• Welfare (Bentham, Mill)
• Libertarian (Nozick)
• Others
Fair Machine Learning Outcomes
Fair Machine Learning Procedures
Formalizing Fair Machine Learning
• System Functionality:
• Logic
• General Functionality
• Individual Decision-Making:
• Rationale
• Reasons
• Individual Circumstances
• In respect of Data Protection Laws?
Depends on perspective one takes:
• Egalitarianism
• Strict Egalitarianism
• Equality of (Initial) Opportunity
• Equality of Welfare
• Sufficientarianism
• Prioritarianism (Parfit)
• Capability Approach (Sen,
Nussbaum)
• Libertarian
• Utilitarian

Other Principles (Beauchamp, Childress):
• Personal Autonomy/Identity
• Beneficence
• Nonmaleficence
National Security / Law Enforcement
Employment
Recommender Systems
Banking
Social Credit
Insurance
Distributive Justice in Machine Learning
Equality as a Principle of Justice
The Guardian, 2016
The New York Times, 2015
MIT Technology Review, 2016.
Pro Publica, 2017.
De Standaard, 2017
Knack, 2018.
Fairness notions (rows: Treatment / Impact; columns: Parity / Preference):
Treatment + Parity: Treatment Parity
Treatment + Preference: Preferred Treatment
Impact (Results) + Parity: Group Fairness, Individual Fairness, Equality of Opportunity
Impact (Results) + Preference: Preferred Impact
Source: Gajane (2017)
Parity in Machine Learning
• Treatment Parity: Avoid the use of sensitive attributes in machine learning processes
• Impact Parity: Avoid disparity in the fraction of users belonging to different sensitive attribute groups that receive beneficial decision outcomes.
• Group Fairness (Statistical/Demographic Parity): The prediction of a particular outcome for individuals across groups should have an almost equal probability. The protected group is treated statistically similarly to the general population. (// affirmative action) (Feldman et al.; Dwork et al.)
• Individual Fairness: “Similar individuals (in relation to purpose of the task at hand) should be treated similarly (receive similar outputs)” (See for instance: Dwork et al.)
• Equality of Opportunity/Equalized Odds/Disparate Mistreatment: “Individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome” (See for instance: Hardt et al.; Zafar et al.)
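The parity notions above can be checked numerically. A minimal sketch with toy data; the function names and example arrays are illustrative, not taken from Feldman et al. or Hardt et al.:

```python
# Minimal sketch of two parity metrics on toy binary data.

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-outcome rates between groups 0 and 1."""
    rate = lambda g: sum(p for p, a in zip(y_pred, group) if a == g) / group.count(g)
    return abs(rate(0) - rate(1))

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates, i.e. how often individuals
    who qualify (y_true == 1) are correctly classified, per group."""
    def tpr(g):
        preds = [p for y, p, a in zip(y_true, y_pred, group) if a == g and y == 1]
        return sum(preds) / len(preds)
    return abs(tpr(0) - tpr(1))

# Toy data: binary labels, predictions, and a sensitive attribute (0/1).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_gap(y_pred, group))         # 0.5 (positives: 1/4 vs 3/4)
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.5 (TPR: 1/2 vs 2/2)
```

A gap of 0 would satisfy the corresponding parity criterion exactly; in practice a small tolerance ("almost equal probability") is used.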
Preference in Machine Learning
• Preference: Given the choice between various sets of decision outcomes, any group of users would collectively prefer the set that contains the largest fraction (or the greatest number) of beneficial decision outcomes for that group (Zafar et al.).
• Preferred Treatment: Every sensitive attribute group (e.g., men and women) prefers the set of decisions they receive over the set of decisions they would have received had they collectively presented themselves to the system as members of a different sensitive group.
• Preferred Impact: Every sensitive attribute group (e.g., men and women) prefers the set of decisions they receive over the set of decisions they would have received under the criterion of impact parity.
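The preferred-treatment criterion reduces to comparing group benefit rates. A hedged sketch, assuming a hypothetical `benefit` matrix (the numbers are invented) where `benefit[g][h]` is the fraction of beneficial outcomes group `g` would receive under the decision set assigned to group `h`:

```python
# Sketch of the preferred-treatment check (after Zafar et al.): every group
# must weakly prefer its own decision set over any other group's.

def satisfies_preferred_treatment(benefit):
    """True if no group would gain by presenting itself as another group."""
    return all(benefit[g][g] >= benefit[g][h]
               for g in benefit for h in benefit[g])

# Illustrative benefit matrix for two sensitive attribute groups.
benefit = {
    "men":   {"men": 0.6, "women": 0.5},
    "women": {"women": 0.7, "men": 0.4},
}
print(satisfies_preferred_treatment(benefit))  # True: each group prefers its own set
```

The preferred-impact check has the same shape, except each group's own benefit is compared against the benefit it would receive under an impact-parity decision set.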
Source: Friedler et al., 2018
[Figure: hiring example. Historical data of a corporation is used to model the ideal employee and to classify good versus bad applicants. Applicant attributes: Ethnicity (sensitive attribute), Geographic Location, Driver's License, Gender (sensitive attribute), Income from previous profession.]
Direct Discrimination on the basis of a sensitive attribute
Indirect Discrimination on the basis of a sensitive attribute through a proxy (correlation between Location/License and Ethnicity)
Random-Group Differentiation
Fair or Unfair Differentiation?
Key Problem: The formalization of Fair Machine Learning is unlikely to take into account future societal/individual changes that arise as a result of machine learning itself!
Random-Group Differentiation
Random Groups/Non-distributive outcomes (// over- and under-inclusion, faulty generalisation fallacy):
• Can generate new differentiation grounds and make them systemic
• Even if, at one point, they could be considered a proxy for 'traditional' discrimination grounds (though the latter need not be the case).
• // Stereotyping, Stigmatization
"Statements about individuals as members of a group" versus "statements about individuals in their own right"
• Both statements (algorithmic and reality/perception) can be true (in some sense), yet contradictory (Vedder & Naudts, 2017)
• // De-individualization
Fair Machine Learning and Luck Equality
The Articulation of Fair Machine Learning through Option and Brute Luck
Option Luck: Outcomes due to Choice (Volition)
Brute Luck: Outcomes not avoidable by choice (unavoidable)
Option Luck: Events or Outcomes
• Choice or Volition
• Reasonably avoidable
• Reasonably foreseeable
• Influenceable
Brute Luck: Events or Outcomes
• Unavoidable
• Not reasonably avoidable
• Not reasonably foreseeable
• Not influenceable
Just/Fair Inequalities versus Unjust/Unfair Inequalities
Choice versus Chance?
An Algorithmic Outcome is due to Brute Luck, and is thus Unfair, when it is unavoidable for the affected individual, including:
I. Not reasonably foreseeable
II. Not influenceable
and when based on:
A. (Supposed) information concerning the affected individual (and group); or
B. (Supposed) actions/behaviour of the affected individual (group);
C. For the affected individual, no clear link exists between A, B and the Algorithmic Outcome/Categorization
Interpretation might change over time, e.g. due to increasing awareness concerning algorithms
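The brute-luck criterion above can be read as a simple boolean rule. A sketch with hypothetical field names (not from the original deck), purely to make the conditions explicit:

```python
# Sketch: encoding the slide's brute-luck test as a rule over an outcome record.

def is_brute_luck(outcome):
    """An outcome counts as brute luck (and thus unfair under luck
    egalitarianism) when it was unavoidable for the affected individual:
    neither reasonably foreseeable nor influenceable, and with no clear
    link between their information/behaviour and the outcome."""
    unavoidable = (not outcome["reasonably_foreseeable"]
                   and not outcome["influenceable"])
    no_clear_link = not outcome["clear_link_to_individual"]
    return unavoidable and no_clear_link

# Toy example: a decision the individual could not foresee, influence,
# or connect to their own data or behaviour.
decision = {"reasonably_foreseeable": False,
            "influenceable": False,
            "clear_link_to_individual": False}
print(is_brute_luck(decision))  # True -> unfair under this criterion
```

As the slide notes, the inputs to such a rule are interpretive and may shift over time, e.g. as awareness of algorithms increases.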
Machine Learning, Ethics and the Law
Accountability Mechanisms in the GDPR: Towards Fair Machine Learning?
Accountability in the GDPR
Self-Assessment Accountability Measures:
Privacy and Data Protection By Design (Art. 25 GDPR)
// Fair Machine Learning
Record Keeping Obligations (Art. 30 GDPR)
Data Protection Impact Assessment (Art. 35 GDPR)
Codes of Conduct (Recital 99, Art. 45 GDPR)
External Accountability Measures:
Transparency Requirements (Recitals 39, 58 and 78 GDPR; Art. 4 (1), Art. 5 §1 (a), Art. 12, Art. 13 §2 (f), Art. 14 §2 (g) and Art. 15 §1 GDPR)
Right to an Explanation? (Recital 71, Art. 22 GDPR)
Informational Justice (See inter alia Colquitt; Binns et al.)
Binns et al.: “Receiving a thorough explanation (informational justice) is important in helping people to assess whether the decision-making procedure is just (procedural justice). In turn, decisions perceived to be procedurally just are more likely to be perceived as substantively just.”
Data Protection Impact Assessment and Codes of
Conduct
Data Protection Impact Assessment (Micro-Level):
"Where a type of processing in particular using new technologies, and taking into account the nature, scope, context and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons" (Art. 35 GDPR)
Natural persons, rather than data subjects
Rights and Freedoms, rather than data protection
Equality and Non-Discrimination
Codes of Conduct (Macro-Level):
Specify, amongst others, fair and transparent processing and the information to be provided to the public
Stakeholder involvement through consultation (Recital 99 GDPR)
E.g. citizen’s interests bodies, ethics boards, data subjects, etc.
Conclusion
• Morality is complex
• Machine Learning is complex
• Articulating morality is complex
• Formalizing morality is complex
• Fair Machine learning is complex
• Interdisciplinary Research and Dialogue amongst communities remain necessary
Bibliography
• Binns, Reuben. 'Fairness in Machine Learning: Lessons from Political Philosophy'. In Proceedings of Machine Learning Research. New York City, 2018.
• Binns, Reuben, Max Van Kleek, Michael Veale, Ulrik Lyngs, Jun Zhao, and Nigel Shadbolt. ‘“It’s Reducing a Human Being to a Percentage”; Perceptions of Justice in Algorithmic Decisions’. ArXiv:1801.10408 [Cs], 31 January 2018. https://doi.org/10.1145/3173574.3173951.
• Colquitt, Jason A., Donald E. Conlon, Michael J. Wesson, Christopher O. L. H. Porter, and K. Yee Ng. 'Justice at the Millennium: A Meta-Analytic Review of 25 Years of Organizational Justice Research'. Journal of Applied Psychology 86, no. 3 (June 2001): 425–45. https://doi.org/10.1037/0021-9010.86.3.425.
• Dwork, Cynthia, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Rich Zemel. ‘Fairness Through Awareness’. ArXiv:1104.3913 [Cs], 19 April 2011. http://arxiv.org/abs/1104.3913.
• Dwork, Cynthia, Nicole Immorlica, Adam Tauman Kalai, and Max Leiserson. ‘Decoupled Classifiers for Fair and Efficient Machine Learning’. ArXiv:1707.06613 [Cs], 20 July 2017. http://arxiv.org/abs/1707.06613
• Dworkin, Ronald. 'What Is Equality? Part 2: Equality of Resources'. Philosophy & Public Affairs 10 (1981): 283–345.
• Fayyad, Usama, Gregory Piatetsky-Shapiro, and Padhraic Smyth. ‘From Data Mining to Knowledge Discovery in Databases’, n.d., 18.
• Feldman, Michael, Sorelle Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. ‘Certifying and Removing Disparate Impact’. ArXiv:1412.3756 [Cs, Stat], 11 December 2014. http://arxiv.org/abs/1412.3756.
• Friedler, Sorelle A., Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, and Derek Roth. ‘A Comparative Study of Fairness-Enhancing Interventions in Machine Learning’. ArXiv:1802.04422 [Cs, Stat], 12 February 2018. http://arxiv.org/abs/1802.04422.
• Gajane, Pratik. ‘On Formalizing Fairness in Prediction with Machine Learning’. ArXiv:1710.03184 [Cs, Stat], 9 October 2017. http://arxiv.org/abs/1710.03184.
• Hardt, Moritz, Eric Price, and Nathan Srebro. ‘Equality of Opportunity in Supervised Learning’. ArXiv:1610.02413 [Cs], 7 October 2016. http://arxiv.org/abs/1610.02413.
• Naudts, Laurens. ‘Fair or Unfair Differentiation? Luck Egalitarianism as a Lens for Evaluating Algorithmic Decision-Making.’ London, 2017.
• Vallentyne, Peter. 'Brute Luck, Option Luck, and Equality of Initial Opportunities'. Ethics 112 (2002): 529–57.
• Vedder, Anton, and Laurens Naudts. ‘Accountability for the Use of Algorithms in a Big Data Environment’. International Review of Law, Computers & Technology 31, no. 2 (4 May 2017): 206–24. https://doi.org/10.1080/13600869.2017.1298547.
• Zafar, Muhammad Bilal, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, and Adrian Weller. ‘From Parity to Preference-Based Notions of Fairness in Classification’. ArXiv:1707.00010 [Cs, Stat], 30 June 2017. http://arxiv.org/abs/1707.00010.
Thank you for your attention!