DOI:10.1145/1124772.1124961

The validity of the stimulated retrospective think-aloud method as measured by eye tracking

Published: 22 April 2006

ABSTRACT

Retrospective think-aloud (RTA) is a usability method that collects a user's verbalization of their performance after the performance is over. Little work has investigated the validity and reliability of RTA. This paper reports on an experiment investigating these issues with a form of the method called stimulated RTA. By comparing subjects' verbalizations with their eye movements, we support the validity and reliability of stimulated RTA: the method provides a valid account of what people attended to in completing tasks, it has a low risk of introducing fabrications, and its validity is not affected by task complexity. More detailed analysis of RTA shows that it also provides additional information about users' inferences and strategies in completing tasks. These findings support usability practitioners in using RTA and in trusting the user performance information the method collects in a usability study.

    • Published in

      CHI '06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
      April 2006
      1353 pages
      ISBN:1595933727
      DOI:10.1145/1124772

      Copyright © 2006 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Article

      Acceptance Rates

      Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
