research-article
Open Access

Peer and self assessment in massive online classes

Published: 01 December 2013

Abstract

Peer and self-assessment offer an opportunity to scale both assessment and learning to global classrooms. This article reports our experiences with two iterations of the first large online class to use peer and self-assessment. In this class, peer grades correlated highly with staff-assigned grades. The second iteration had 42.9% of students’ grades within 5% of the staff grade, and 65.5% within 10%. On average, students assessed their work 7% higher than staff did. Students also rated peers’ work from their own country 3.6% higher than those from elsewhere. We performed three experiments to improve grading accuracy. We found that giving students feedback about their grading bias increased subsequent accuracy. We introduce short, customizable feedback snippets that cover common issues with assignments, providing students more qualitative peer feedback. Finally, we introduce a data-driven approach that highlights high-variance items for improvement. We find that rubrics that use a parallel sentence structure, unambiguous wording, and well-specified dimensions have lower variance. After revising rubrics, median grading error decreased from 12.4% to 9.9%.
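
The agreement figures reported above (share of grades within 5% and 10% of the staff grade, median grading error, and average self-assessment inflation) can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes grades on a 0-100 scale and reads "within 5%" as within 5 percentage points; the function names and the sample data are hypothetical.

```python
from statistics import median


def agreement_summary(peer, staff, tolerances=(5.0, 10.0)):
    """Summarize how closely peer grades track staff grades.

    peer, staff: equal-length sequences of grades on a 0-100 scale
    (an assumption; the abstract does not state the scale).
    Returns the share of submissions whose peer grade falls within each
    tolerance of the staff grade, plus the median absolute error, with
    tolerances and error expressed in percentage points.
    """
    if len(peer) != len(staff) or not peer:
        raise ValueError("peer and staff must be non-empty and equal length")
    errors = [abs(p - s) for p, s in zip(peer, staff)]
    within = {t: sum(e <= t for e in errors) / len(errors) for t in tolerances}
    return {"within": within, "median_abs_error": median(errors)}


def mean_bias(assessed, staff):
    """Mean signed difference (assessed - staff); positive means inflation."""
    if len(assessed) != len(staff) or not staff:
        raise ValueError("inputs must be non-empty and equal length")
    return sum(a - s for a, s in zip(assessed, staff)) / len(staff)


# Hypothetical grades for four submissions (not data from the article).
peer_grades = [82.0, 70.0, 91.0, 55.0]
staff_grades = [80.0, 78.0, 90.0, 70.0]

summary = agreement_summary(peer_grades, staff_grades)
print(summary["within"])            # share within 5 and 10 points
print(summary["median_abs_error"])  # median grading error in points
print(mean_bias(peer_grades, staff_grades))
```

Passing self-assessed grades as the first argument of `mean_bias` gives the kind of signed comparison behind the abstract's statement that students rated their own work higher than staff did.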

      • Published in

        ACM Transactions on Computer-Human Interaction  Volume 20, Issue 6
        December 2013
        155 pages
        ISSN:1073-0516
        EISSN:1557-7325
        DOI:10.1145/2562181

        Copyright © 2013 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 December 2013
        • Revised: 1 July 2013
        • Accepted: 1 July 2013
        • Received: 1 March 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
