Deriving marketing intelligence from online discussion

Authors:
Natalie Glance, Intelliseek Applied Research Center, Pittsburgh, PA
Matthew Hurst, Intelliseek Applied Research Center, Pittsburgh, PA
Kamal Nigam, Intelliseek Applied Research Center, Pittsburgh, PA
Matthew Siegler, Intelliseek Applied Research Center, Pittsburgh, PA
Robert Stockton, Intelliseek Applied Research Center, Pittsburgh, PA
Takashi Tomokiyo, Intelliseek Applied Research Center, Pittsburgh, PA

KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 2005, Pages 419–428
https://doi.org/10.1145/1081870.1081919

Published: 21 August 2005

ABSTRACT

Weblogs and message boards provide online forums for discussion that record the voice of the public. Woven into this mass of discussion is a wide range of opinion and commentary about consumer products. This presents an opportunity for companies to understand and respond to the consumer by analyzing this unsolicited feedback. Given the volume, format and content of the data, the appropriate approach to understand this data is to use large-scale web and text data mining technologies.

This paper argues that applications for mining large volumes of textual data for marketing intelligence should provide two key elements: a suite of powerful mining and visualization technologies and an interactive analysis environment which allows for rapid generation and testing of hypotheses. This paper presents such a system that gathers and annotates online discussion relating to consumer products using a wide variety of state-of-the-art techniques, including crawling, wrapping, search, text classification and computational linguistics. Marketing intelligence is derived through an interactive analysis framework uniquely configured to leverage the connectivity and content of annotated online discussion.

Index Terms

1. Information systems
  1. Information retrieval

Published in
KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
August 2005, 844 pages
ISBN: 159593135X
DOI: 10.1145/1081870
General Chair: Robert Grossman (University of Illinois at Chicago & Open Data Partners, USA)
Program Chairs: Roberto Bayardo (IBM Almaden Research, USA), Kristin Bennett (RPI, USA)
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
computational linguistics
content systems
information retrieval
machine learning
text mining
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%