
Techniques for interpretable machine learning

Published: 20 December 2019

Abstract

Uncovering the mysterious ways machine learning models make decisions.




    • Published in

      Communications of the ACM, Volume 63, Issue 1
      January 2020
      90 pages
      ISSN: 0001-0782
      EISSN: 1557-7317
      DOI: 10.1145/3377354

      Copyright © 2019 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States



