How do People Sort by Ratings?

Authors:
Jerry O. Talton
Carta, Inc., New York, NY, USA

Krishna Dusad
University of Illinois at Urbana-Champaign, Urbana, IL, USA

Konstantinos Koiliaris
University of Illinois at Urbana-Champaign, Champaign, IL, USA

Ranjitha S. Kumar
University of Illinois at Urbana-Champaign, Urbana, IL, USA

CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
May 2019, Paper No.: 305, Pages 1–10
https://doi.org/10.1145/3290605.3300535

Published: 02 May 2019


ABSTRACT

Sorting items by user rating is a fundamental interaction pattern of the modern Web, used to rank products (Amazon), posts (Reddit), businesses (Yelp), movies (YouTube), and more. To implement this pattern, designers must take in a distribution of ratings for each item and define a sensible total ordering over them. This is a challenging problem, since each distribution is drawn from a distinct sample population, rendering the most straightforward method of sorting (comparing averages) unreliable when the samples are small or of different sizes. Several statistical orderings for binary ratings have been proposed in the literature (e.g., based on the Wilson score, or Laplace smoothing), each attempting to account for the uncertainty introduced by sampling. In this paper, we study this uncertainty through the lens of human perception, and ask "How do people sort by ratings?" In an online study, we collected 48,000 item-ranking pairs from 4,000 crowd workers along with 4,800 rationales, and analyzed the results to understand how users make decisions when comparing rated items. Our results shed light on the cognitive models users employ to choose between rating distributions, which sorts of comparisons are most contentious, and how the presentation of rating information affects users' preferences.
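
To make the contrast between these orderings concrete, the following is a minimal sketch of three scores for binary (thumbs-up/thumbs-down) ratings: the plain average, a Laplace-smoothed average, and the lower bound of the Wilson score interval. It is an illustration only and not the authors' implementation; the function names, the pseudo-count `alpha = 1`, the critical value `z = 1.96` (roughly a 95% interval), and the two example items are all assumptions chosen for the example.

```python
import math


def average(up: int, down: int) -> float:
    """Plain fraction of positive ratings (0.0 when there are no ratings)."""
    n = up + down
    return up / n if n else 0.0


def laplace(up: int, down: int, alpha: float = 1.0) -> float:
    """Laplace-smoothed fraction: add `alpha` pseudo-ratings of each kind.

    alpha = 1 is an assumed default (the classic rule of succession).
    """
    return (up + alpha) / (up + down + 2 * alpha)


def wilson_lower_bound(up: int, down: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the positive fraction.

    z = 1.96 is an assumed default, corresponding to about 95% confidence.
    """
    n = up + down
    if n == 0:
        return 0.0
    p = up / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


def rank(items: dict, score) -> list:
    """Return item names sorted from highest to lowest score."""
    return sorted(items, key=lambda name: score(*items[name]), reverse=True)


# Hypothetical items: (positive ratings, negative ratings).
items = {"A": (1, 0), "B": (90, 10)}

by_average = rank(items, average)
by_laplace = rank(items, laplace)
by_wilson = rank(items, wilson_lower_bound)
```

With these hypothetical counts, the plain average places item A (a single positive rating) above item B (90 positive out of 100), while both the Laplace-smoothed score and the Wilson lower bound place B first, because they discount the small sample behind A.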


Index Terms

Information systems → Information retrieval → Retrieval models and ranking


Published in
CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
May 2019
9077 pages
ISBN: 9781450359702
DOI: 10.1145/3290605
General Chairs: Stephen Brewster (University of Glasgow, Scotland, UK), Geraldine Fitzpatrick (TU Wien, Austria)
Program Chairs: Anna Cox (University College London, UK), Vassilis Kostakos (University of Melbourne, Australia)
Copyright © 2019 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
ranking
uncertainty
user ratings
Qualifiers
- research-article
Acceptance Rates
CHI '19 Paper Acceptance Rate: 703 of 2,958 submissions, 24%
Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%