Algorithms, Big Data, Justice and Accountability- Articulating and Formalizing Fairness in Machine Learning
Laurens Naudts, Doctoral Researcher in Law, KU Leuven CiTiP
www.law.kuleuven.be/citip
Outline
Fairness in Machine Learning
Equality and Non-Discrimination in Machine Learning
Luck Equality as a Use-Case
Accountability and Machine Learning
Conclusion
Fairness in Machine Learning
Distributive Justice and Machine Learning
Machine Learning
[Figure: clustering and classification examples. Source: Fayyad et al., 1997]
Almost all papers concerning algorithms and machine learning contain a sentence
comparable to:
“Increasingly, automated processes are deployed to make decisions that have
a significant impact on individuals’ lives”
Theories of Distributive Justice and Machine Learning

Substantive Justice: Focus on the allocation of benefits and burdens to individuals within society (Fair Share)

Procedural Justice: Focus on the procedures, e.g. the processes, logic and deliberation of a decision, that determine the allocation of benefits and burdens to individuals within society (Fair Treatment)
Fair Machine Learning Outcomes
Fair Machine Learning Processes
Formalizing Fair Machine Learning
• Strict Equality
• Equality of Resources
(Dworkin)
• Luck Egalitarianism (Dworkin
et al.)
• Welfare (Bentham, Mill)
• Libertarian (Nozick)
• Others
Fair Machine Learning Outcomes
Fair Machine Learning Procedures
Formalizing Fair Machine Learning
• System Functionality:
• Logic
• General Functionality
• Individual Decision-Making:
• Rationale
• Reasons
• Individual Circumstances
• In respect of Data Protection Laws?
Depends on perspective one takes:
• Egalitarianism
• Strict Egalitarianism
• Equality of (Initial) Opportunity
• Equality of Welfare
• Sufficientarianism
• Prioritarianism (Parfit)
• Capability Approach (Sen,
Nussbaum)
• Libertarian
• Utilitarian

Other Principles (Beauchamp, Childress):
• Personal Autonomy/Identity
• Beneficence
• Nonmaleficence
National Security / Law Enforcement
Employment
Recommender Systems
Banking
Social Credit
Insurance
Distributive Justice in Machine Learning
Equality as a Principle of Justice
The Guardian, 2016
The New York Times, 2015
MIT Technology Review, 2016.
Pro Publica, 2017.
De Standaard, 2017
Knack, 2018.
Fairness notions (rows: Treatment / Impact; columns: Parity / Preference):
Treatment + Parity: Treatment Parity
Treatment + Preference: Preferred Treatment
Impact (Results) + Parity: Group Fairness, Individual Fairness, Equality of Opportunity
Impact (Results) + Preference: Preferred Impact
Source: Gajane (2017)
Parity in Machine Learning
• Treatment Parity: Avoid the use of sensitive attributes in machine learning processes
• Impact Parity: Avoid disparity in the fraction of users belonging to different sensitive attribute groups that receive beneficial decision outcomes.
• Group Fairness (Statistical/Demographic Parity): The prediction of a particular outcome for individuals across groups should have an almost equal probability. The protected group is treated statistically similarly to the general population. (// affirmative action) (Feldman et al.; Dwork et al.)
• Individual Fairness: “Similar individuals (in relation to purpose of the task at hand) should be treated similarly (receive similar outputs)” (See for instance: Dwork et al.)
• Equality of Opportunity/Equalized Odds/Disparate Mistreatment: “Individuals who qualify for a desirable outcome should have an equal chance of being correctly classified for this outcome” (See for instance: Hardt et al.; Zafar et al.)
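The parity notions above can be checked numerically. A minimal sketch with toy data; the function names and example arrays are illustrative, not taken from Feldman et al. or Hardt et al.:

```python
# Minimal sketch of two parity metrics on toy binary data.

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-outcome rates between groups 0 and 1."""
    rate = lambda g: sum(p for p, a in zip(y_pred, group) if a == g) / group.count(g)
    return abs(rate(0) - rate(1))

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates, i.e. how often individuals
    who qualify (y_true == 1) are correctly classified, per group."""
    def tpr(g):
        preds = [p for y, p, a in zip(y_true, y_pred, group) if a == g and y == 1]
        return sum(preds) / len(preds)
    return abs(tpr(0) - tpr(1))

# Toy data: binary labels, predictions, and a sensitive attribute (0/1).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(demographic_parity_gap(y_pred, group))         # 0.5 (positives: 1/4 vs 3/4)
print(equal_opportunity_gap(y_true, y_pred, group))  # 0.5 (TPR: 1/2 vs 2/2)
```

A gap of 0 would satisfy the corresponding parity criterion exactly; in practice a small tolerance ("almost equal probability") is used.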
Preference in Machine Learning
• Preference: Given the choice between various sets of decision outcomes, any group of users would collectively prefer the set that contains the largest fraction (or the greatest number) of beneficial decision outcomes for that group (Zafar et al.).
• Preferred Treatment: Every sensitive attribute group (e.g., men and women) prefers the set of decisions they receive over the set of decisions they would have received had they collectively presented themselves to the system as members of a different sensitive group.
• Preferred Impact: Every sensitive attribute group (e.g., men and women) prefers the set of decisions they receive over the set of decisions they would have received under the criterion of impact parity.
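The preferred-treatment criterion reduces to comparing group benefit rates. A hedged sketch, assuming a hypothetical `benefit` matrix (the numbers are invented) where `benefit[g][h]` is the fraction of beneficial outcomes group `g` would receive under the decision set assigned to group `h`:

```python
# Sketch of the preferred-treatment check (after Zafar et al.): every group
# must weakly prefer its own decision set over any other group's.

def satisfies_preferred_treatment(benefit):
    """True if no group would gain by presenting itself as another group."""
    return all(benefit[g][g] >= benefit[g][h]
               for g in benefit for h in benefit[g])

# Illustrative benefit matrix for two sensitive attribute groups.
benefit = {
    "men":   {"men": 0.6, "women": 0.5},
    "women": {"women": 0.7, "men": 0.4},
}
print(satisfies_preferred_treatment(benefit))  # True: each group prefers its own set
```

The preferred-impact check has the same shape, except each group's own benefit is compared against the benefit it would receive under an impact-parity decision set.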
Source: Friedler et al., 2018
[Figure: hiring example. Historical data of a corporation is used to model the ideal employee and to classify good versus bad applicants. Applicant attributes: Ethnicity (sensitive attribute), Geographic Location, Driver's License, Gender (sensitive attribute), Income from previous profession.]
Direct Discrimination on the basis of a sensitive attribute
Indirect Discrimination on the basis of a sensitive attribute through a proxy (correlation between Location/License and Ethnicity)
Random-Group Differentiation
Fair or Unfair Differentiation?
Key Problem: The formalization of Fair Machine Learning is unlikely to take into account future societal/individual changes that arise as a result of machine learning itself!
Random-Group Differentiation
Random Groups/Non-distributive outcomes (// over- and under-inclusion, faulty generalisation fallacy):
• Can generate new differentiation grounds and make them systemic
• Even if, at one point, they could be considered a proxy for 'traditional' discrimination grounds (though the latter need not be the case).
• // Stereotyping, Stigmatization
"Statements about individuals as members of a group" versus "statements about individuals in their own right"
• Both statements (algorithmic and reality/perception) can be true (in some sense), yet contradictory (Vedder & Naudts, 2017)
• // De-individualization
Fair Machine Learning and Luck Equality
The Articulation of Fair Machine Learning through Option and Brute Luck
Option Luck: Outcomes due to Choice (Volition)
Brute Luck: Outcomes not avoidable by choice (unavoidable)
Option Luck: Events or Outcomes
• Choice or Volition
• Reasonably avoidable
• Reasonably foreseeable
• Influenceable
Brute Luck: Events or Outcomes
• Unavoidable
• Not reasonably avoidable
• Not reasonably foreseeable
• Not influenceable
Just/Fair Inequalities versus Unjust/Unfair Inequalities
Choice versus Chance?
An Algorithmic Outcome is due to Brute Luck, and is thus Unfair, when it is unavoidable for the affected individual, including:
I. Not reasonably foreseeable
II. Not influenceable
and when based on:
A. (Supposed) information concerning the affected individual (and group); or
B. (Supposed) actions/behaviour of the affected individual (group);
C. For the affected individual, no clear link exists between A, B and the Algorithmic Outcome/Categorization
Interpretation might change over time, e.g. due to increasing awareness concerning algorithms
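The brute-luck criterion above can be read as a simple boolean rule. A sketch with hypothetical field names (not from the original deck), purely to make the conditions explicit:

```python
# Sketch: encoding the slide's brute-luck test as a rule over an outcome record.

def is_brute_luck(outcome):
    """An outcome counts as brute luck (and thus unfair under luck
    egalitarianism) when it was unavoidable for the affected individual:
    neither reasonably foreseeable nor influenceable, and with no clear
    link between their information/behaviour and the outcome."""
    unavoidable = (not outcome["reasonably_foreseeable"]
                   and not outcome["influenceable"])
    no_clear_link = not outcome["clear_link_to_individual"]
    return unavoidable and no_clear_link

# Toy example: a decision the individual could not foresee, influence,
# or connect to their own data or behaviour.
decision = {"reasonably_foreseeable": False,
            "influenceable": False,
            "clear_link_to_individual": False}
print(is_brute_luck(decision))  # True -> unfair under this criterion
```

As the slide notes, the inputs to such a rule are interpretive and may shift over time, e.g. as awareness of algorithms increases.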
Machine Learning, Ethics and the Law
Accountability Mechanisms in the GDPR: Towards Fair Machine Learning?
Accountability in the GDPR
Self-Assessment Accountability Measures:
Privacy and Data Protection By Design (Art. 25 GDPR)
// Fair Machine Learning
Record Keeping Obligations (Art. 30 GDPR)
Data Protection Impact Assessment (Art. 35 GDPR)
Codes of Conduct (Recital 99, Art. 45 GDPR)
External Accountability Measures:
Transparency Requirements (Recitals 39, 58 and 78 GDPR; Art. 4 (1), Art. 5 §1 (a), Art. 12, Art. 13 §2 (f), Art. 14 §2 (g) and Art. 15 §1 GDPR)
Right to an Explanation? (Recital 71, Art. 22 GDPR)
Informational Justice (See inter alia Colquitt; Binns et al.)
Binns et al.: “Receiving a thorough explanation (informational justice) is important in helping people to assess whether the decision-making procedure is just (procedural justice). In turn, decisions perceived to be procedurally just are more likely to be perceived as substantively just.”
Data Protection Impact Assessment and Codes of
Conduct
Data Protection Impact Assessment (Micro-Level):
"Where a type of processing in particular using new technologies, and taking into account the nature, scope, context and purposes of the processing, is likely to result in a high risk to the rights and freedoms of natural persons" (Art. 35 GDPR)
Natural persons, rather than data subjects
Rights and Freedoms, rather than data protection
Equality and Non-Discrimination
Codes of Conduct (Macro-Level):
Specify, amongst others, fair and transparent processing and the information to be provided to the public
Stakeholder involvement through consultation (Recital 99 GDPR)
E.g. citizen’s interests bodies, ethics boards, data subjects, etc.
Conclusion
• Morality is complex
• Machine Learning is complex
• Articulating morality is complex
• Formalizing morality is complex
• Fair Machine learning is complex
• Interdisciplinary Research and Dialogue amongst communities remain necessary
Bibliography
• Binns, Reuben. 'Fairness in Machine Learning: Lessons from Political Philosophy'. In Proceedings of Machine Learning Research. New York City, 2018.
• Binns, Reuben, Max Van Kleek, Michael Veale, Ulrik Lyngs, Jun Zhao, and Nigel Shadbolt. ‘“It’s Reducing a Human Being to a Percentage”; Perceptions of Justice in Algorithmic Decisions’. ArXiv:1801.10408 [Cs], 31 January 2018. https://doi.org/10.1145/3173574.3173951.
• Colquitt, Jason A., Donald E. Conlon, Michael J. Wesson, Christopher O. L. H. Porter, and K. Yee Ng. 'Justice at the Millennium: A Meta-Analytic Review of 25 Years of Organizational Justice Research'. Journal of Applied Psychology 86, no. 3 (June 2001): 425–45. https://doi.org/10.1037/0021-9010.86.3.425.
• Dwork, Cynthia, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Rich Zemel. ‘Fairness Through Awareness’. ArXiv:1104.3913 [Cs], 19 April 2011. http://arxiv.org/abs/1104.3913.
• Dwork, Cynthia, Nicole Immorlica, Adam Tauman Kalai, and Max Leiserson. ‘Decoupled Classifiers for Fair and Efficient Machine Learning’. ArXiv:1707.06613 [Cs], 20 July 2017. http://arxiv.org/abs/1707.06613
• Dworkin, Ronald. 'What Is Equality? Part 2: Equality of Resources'. Philosophy & Public Affairs 10 (1981): 283–345.
• Fayyad, Usama, Gregory Piatetsky-Shapiro, and Padhraic Smyth. ‘From Data Mining to Knowledge Discovery in Databases’, n.d., 18.
• Feldman, Michael, Sorelle Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. ‘Certifying and Removing Disparate Impact’. ArXiv:1412.3756 [Cs, Stat], 11 December 2014. http://arxiv.org/abs/1412.3756.
• Friedler, Sorelle A., Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, and Derek Roth. ‘A Comparative Study of Fairness-Enhancing Interventions in Machine Learning’. ArXiv:1802.04422 [Cs, Stat], 12 February 2018. http://arxiv.org/abs/1802.04422.
• Gajane, Pratik. ‘On Formalizing Fairness in Prediction with Machine Learning’. ArXiv:1710.03184 [Cs, Stat], 9 October 2017. http://arxiv.org/abs/1710.03184.
• Hardt, Moritz, Eric Price, and Nathan Srebro. ‘Equality of Opportunity in Supervised Learning’. ArXiv:1610.02413 [Cs], 7 October 2016. http://arxiv.org/abs/1610.02413.
• Naudts, Laurens. ‘Fair or Unfair Differentiation? Luck Egalitarianism as a Lens for Evaluating Algorithmic Decision-Making.’ London, 2017.
• Vallentyne, Peter. 'Brute Luck, Option Luck, and Equality of Initial Opportunities'. Ethics 112 (2002): 529–57.
• Vedder, Anton, and Laurens Naudts. ‘Accountability for the Use of Algorithms in a Big Data Environment’. International Review of Law, Computers & Technology 31, no. 2 (4 May 2017): 206–24. https://doi.org/10.1080/13600869.2017.1298547.
• Zafar, Muhammad Bilal, Isabel Valera, Manuel Gomez Rodriguez, Krishna P. Gummadi, and Adrian Weller. ‘From Parity to Preference-Based Notions of Fairness in Classification’. ArXiv:1707.00010 [Cs, Stat], 30 June 2017. http://arxiv.org/abs/1707.00010.
Thank you for your attention!