Abstract
The digitization initiatives in the past decades have led to a tremendous increase in digitized objects in the cultural heritage domain. Although digitally available, these objects are often not easily accessible for interested users because of the distributed allocation of the content in different repositories and the variety in data structure and standards. When users search for cultural content, they first need to identify the specific repository and then need to know how to search within this platform (e.g., usage of specific vocabulary). The goal of the EEXCESS project is to design and implement an infrastructure that enables ubiquitous access to digital cultural heritage content. Cultural content should be made available in the channels that users habitually visit and be tailored to their current context without the need to manually search multiple portals or content repositories. To realize this goal, open-source software components and services have been developed that can either be used as an integrated infrastructure or as modular components suitable to be integrated in other products and services. The EEXCESS modules and components comprise (i) Web-based context detection, (ii) information retrieval-based, federated content aggregation, (iii) metadata definition and mapping, and (iv) a component responsible for privacy preservation. Various applications have been realized based on these components that bring cultural content to the user in content consumption and content creation scenarios. For example, content consumption is realized by a browser extension generating automatic search queries from the current page context and the focus paragraph and presenting related results aggregated from different data providers. A Google Docs add-on allows retrieval of relevant content aggregated from multiple data providers while collaboratively writing a document. 
These relevant resources can then be included in the current document as a citation, an image, or a link (with preview) without having to disrupt the current writing task for an explicit search in various content providers’ portals.