ABSTRACT
Calls for heightened consideration of fairness and accountability in algorithmically-informed public decisions (such as taxation, justice, and child protection) are now commonplace. How might designers support such human values? We interviewed 27 public sector machine learning practitioners across 5 OECD countries about the challenges of understanding public values and imbuing them into their work. The results suggest a disconnect between organisational and institutional realities, constraints and needs, and those addressed by current research into usable, transparent and 'discrimination-aware' machine learning. These absences are likely to undermine practical initiatives unless addressed. We see design opportunities in this disconnect, such as supporting the tracking of concept drift in secondary data sources, and building usable transparency tools to identify risks and incorporate domain knowledge, aimed both at managers and at the 'street-level bureaucrats' on the frontlines of public service. We conclude by outlining ethical challenges and future directions for collaboration in these high-stakes applications.
Index Terms
- Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making