research-article

Open Access

Peer and self assessment in massive online classes

Authors:
Chinmay Kulkarni (Stanford University, Stanford, CA)
Koh Pang Wei (Stanford University, and Coursera, Inc., Stanford, CA)
Huy Le (Coursera, Inc., Mountain View, CA)
Daniel Chia (Stanford University, and Coursera, Inc., Stanford, CA)
Kathryn Papadopoulos (Stanford University, Stanford, CA)
Justin Cheng (Stanford University, Stanford, CA)
Daphne Koller (Stanford University, and Coursera, Inc., Stanford, CA)
Scott R. Klemmer (Stanford University, San Diego)

ACM Transactions on Computer-Human Interaction, Volume 20, Issue 6, Article No.: 33, pp 1–31
https://doi.org/10.1145/2505057

Published: 01 December 2013

Abstract

Peer and self-assessment offer an opportunity to scale both assessment and learning to global classrooms. This article reports our experiences with two iterations of the first large online class to use peer and self-assessment. In this class, peer grades correlated highly with staff-assigned grades. The second iteration had 42.9% of students’ grades within 5% of the staff grade, and 65.5% within 10%. On average, students assessed their work 7% higher than staff did. Students also rated peers’ work from their own country 3.6% higher than work from elsewhere. We performed three experiments to improve grading accuracy. We found that giving students feedback about their grading bias increased subsequent accuracy. We introduce short, customizable feedback snippets that cover common issues with assignments, providing students more qualitative peer feedback. Finally, we introduce a data-driven approach that highlights high-variance items for improvement. We find that rubrics that use a parallel sentence structure, unambiguous wording, and well-specified dimensions have lower variance. After revising rubrics, median grading error decreased from 12.4% to 9.9%.
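The agreement figures quoted in the abstract (the share of grades within 5% and 10% of the staff grade, and the median grading error) can be illustrated with a minimal sketch. The abstract does not state the exact formulas, so this sketch assumes grades on a 0-100 scale and reads "within 5%" as an absolute difference of at most 5 percentage points. The function name `agreement_summary` and the sample grades are hypothetical and are not taken from the article.

```python
from statistics import median


def agreement_summary(peer_grades, staff_grades):
    """Summarize how closely peer grades track staff grades.

    Assumption: both lists hold grades on a 0-100 scale, aligned so that
    position i in each list refers to the same submission.
    """
    if len(peer_grades) != len(staff_grades) or not peer_grades:
        raise ValueError("need two non-empty lists of equal length")

    # Absolute grading error per submission, in percentage points.
    errors = [abs(p - s) for p, s in zip(peer_grades, staff_grades)]
    n = len(errors)
    return {
        "within_5": sum(e <= 5 for e in errors) / n,    # share within 5 points
        "within_10": sum(e <= 10 for e in errors) / n,  # share within 10 points
        "median_error": median(errors),                 # median absolute error
    }


# Made-up grades for illustration only (not data from the study).
peer = [72, 85, 60, 91, 78]
staff = [70, 80, 75, 90, 66]
summary = agreement_summary(peer, staff)
print(summary)
```

Other readings of "within 5%" are possible, for example a difference relative to the staff grade rather than in absolute points. Under that reading the error computation on the `errors` line would change, while the rest of the summary would stay the same.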

Index Terms

Peer and self assessment in massive online classes
1. Applied computing
  1. Education
2. Human-centered computing
  1. Collaborative and social computing


Published in

ACM Transactions on Computer-Human Interaction Volume 20, Issue 6
December 2013
155 pages
ISSN: 1073-0516
EISSN: 1557-7325
DOI: 10.1145/2562181

Copyright © 2013 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 December 2013
- Revised: 1 July 2013
- Accepted: 1 July 2013
- Received: 1 March 2013
Author Tags
MOOC
Peer assessment
design assessment
design crit
massive online classroom
online education
qualitative feedback
self-assessment
studio-based learning
Qualifiers
- research-article
- Research
- Refereed

Article Metrics
- Total Citations: 209
- Total Downloads: 10,265
- Downloads (Last 12 months): 619
- Downloads (Last 6 weeks): 60