ABSTRACT
Effective disease monitoring provides a foundation for effective public health systems. This has historically been accomplished with patient contact and bureaucratic aggregation, which tends to be slow and expensive. Recent internet-based approaches promise to be real-time and cheap, with few parameters. However, the question of when and how these approaches work remains open. We addressed this question using Wikipedia access logs and category links. Our experiments, replicable and extensible using our open source code and data, test the effect of semantic article filtering, amount of training data, forecast horizon, and model staleness by comparing across 6 diseases and 4 countries using thousands of individual models. We found that our minimal-configuration, language-agnostic article selection process based on semantic relatedness is effective for improving predictions, and that our approach is relatively insensitive to the amount and age of training data. We also found, in contrast to prior work, very little forecasting value, and we argue that this is consistent with theoretical considerations about the nature of forecasting. These mixed results lead us to propose that the currently observational field of internet-based disease surveillance must pivot to include theoretical models of information flow as well as controlled experiments based on simulations of disease.
Index Terms
- Measuring Global Disease with Wikipedia: Success, Failure, and a Research Agenda