research-article

Open Access

Mitigating Unwanted Biases with Adversarial Learning

Authors:
Brian Hu Zhang, Stanford University, Stanford, CA, USA
Blake Lemoine, Google, Mountain View, CA, USA
Margaret Mitchell, Google, Mountain View, CA, USA

AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, December 2018, Pages 335–340
https://doi.org/10.1145/3278721.3278779

Published: 27 December 2018

ABSTRACT

Machine learning is a tool for building models that accurately represent input training data. When undesired biases concerning demographic groups are in the training data, well-trained models will reflect those biases. We present a framework for mitigating such biases by including a variable for the group of interest and simultaneously learning a predictor and an adversary. The input to the network, X (here text or census data), produces a prediction Y, such as an analogy completion or income bracket, while the adversary tries to model a protected variable Z, here gender or zip code. The objective is to maximize the predictor's ability to predict Y while minimizing the adversary's ability to predict Z. Applied to analogy completion, this method results in accurate predictions that exhibit less evidence of stereotyping Z. When applied to a classification task using the UCI Adult (Census) Dataset, it results in a predictive model that does not lose much accuracy while coming very close to achieving equality of odds (Hardt et al., 2016). The method is flexible and applicable to multiple definitions of fairness as well as a wide range of gradient-based learning models, including both regression and classification tasks.
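
The abstract states the objective (help the predictor on Y, hinder the adversary on Z) but does not spell out the weight update. The following is a minimal sketch of one way such an update could look, assuming a projection-style rule: the component of the predictor's gradient that lies along the adversary's gradient is removed, and a further term scaled by alpha pushes against the adversary. The function names, the alpha and lr parameters, and the plain-list vector representation are illustrative assumptions, not taken from this page.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def debiased_gradient(grad_pred: Sequence[float],
                      grad_adv: Sequence[float],
                      alpha: float = 1.0) -> List[float]:
    """Combine two gradients taken with respect to the predictor's weights.

    grad_pred: gradient of the predictor's loss (predicting Y).
    grad_adv:  gradient of the adversary's loss (predicting Z).
    Returns grad_pred - proj_{grad_adv}(grad_pred) - alpha * grad_adv.
    This rule is an assumption made for illustration.
    """
    norm_sq = dot(grad_adv, grad_adv)
    if norm_sq == 0.0:
        # The adversary gives no signal, so fall back to the plain gradient.
        return list(grad_pred)
    # Coefficient of the projection of grad_pred onto grad_adv.
    scale = dot(grad_pred, grad_adv) / norm_sq
    return [gp - scale * ga - alpha * ga
            for gp, ga in zip(grad_pred, grad_adv)]


def sgd_step(weights: Sequence[float],
             grad_pred: Sequence[float],
             grad_adv: Sequence[float],
             lr: float = 0.1,
             alpha: float = 1.0) -> List[float]:
    """One gradient-descent step on the predictor using the combined gradient."""
    step = debiased_gradient(grad_pred, grad_adv, alpha)
    return [w - lr * s for w, s in zip(weights, step)]
```

With this rule, a descent step never moves the predictor's weights in a direction that lowers the adversary's loss: the projection term removes any such component, and the alpha term moves the weights so that the adversary's loss increases. The adversary itself would be trained separately by ordinary descent on its own loss, which is not shown here.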

References

Asuncion, A., and Newman, D. 2007. UCI machine learning repository.
Beutel, A.; Chen, J.; Zhao, Z.; and Chi, E. H. 2017. Data decisions and theoretical implications when adversarially learning fair representations. arXiv preprint arXiv:1707.00075.
Bolukbasi, T.; Chang, K.-W.; Zou, J. Y.; Saligrama, V.; and Kalai, A. T. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, 4349–4357.
Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems, 2672–2680.
Hardt, M.; Price, E.; and Srebro, N. 2016. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, 3315–3323.
Kingma, D., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Kleinberg, J.; Mullainathan, S.; and Raghavan, M. 2016. Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.
Lum, K., and Johndrow, J. 2016. A statistical framework for fair predictive algorithms. arXiv preprint arXiv:1610.08077.
Mikolov, T.; Sutskever, I.; Chen, K.; Corrado, G. S.; and Dean, J. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, 3111–3119.


Published in
AIES '18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society
December 2018
406 pages
ISBN: 9781450360128
DOI: 10.1145/3278721
Program Chairs:
Jason Furman, Harvard University, USA
Gary Marchant, Arizona State University, USA
Huw Price, Cambridge University, UK
Francesca Rossi, IBM Research, USA & University of Padova, Italy
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
adversarial learning
debiasing
multi-task learning
unbiasing
Acceptance Rates
AIES '18 Paper Acceptance Rate: 61 of 162 submissions, 38%
Overall Acceptance Rate: 61 of 162 submissions, 38%