ABSTRACT
In this technical demo we present repoVizz (http://repovizz.upf.edu), an integrated online system for structural formatting, remote storage, browsing, exchange, annotation, and visualization of synchronous, time-aligned multi-modal data. Motivated by a growing need for data-driven collaborative research, repoVizz aims to resolve commonly encountered difficulties in sharing or browsing large collections of multi-modal data. In its current state, repoVizz is designed to hold time-aligned streams of heterogeneous data: audio, video, motion capture, physiological signals, extracted descriptors, annotations, etc. Most popular formats for audio and video are supported, while Broadcast WAVE or CSV formats are adopted for streams other than audio or video (e.g., motion capture or physiological signals). The data itself is structured via customized XML files, allowing the user to (re)organize multi-modal data in any hierarchical manner, as the XML structure only holds metadata and pointers to data files. Datasets are stored in an online database, allowing the user to interact with the data remotely through an HTML5 visual interface accessible from any standard web browser; this feature can be considered a key aspect of repoVizz, since data can be explored, annotated, or visualized from any location or device. Data exchange, upload, and download are made easy and secure via a number of data conversion tools and a user/permission management system.
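To illustrate why an XML structure that holds only metadata and pointers makes reorganization cheap, the following is a minimal sketch in Python. The tag names (`Datapack`, `Group`, `Stream`), the attribute names, and the file names are all hypothetical and chosen for illustration; they are not the actual repoVizz schema.

```python
import xml.etree.ElementTree as ET


def build_datapack(name, groups):
    """Build a hierarchical XML tree holding only metadata and file pointers.

    `groups` maps a group name to a list of (stream_name, category, filename)
    tuples. All tag and attribute names are hypothetical, not repoVizz's schema.
    """
    root = ET.Element("Datapack", {"name": name})
    for group_name, entries in groups.items():
        group = ET.SubElement(root, "Group", {"name": group_name})
        for stream_name, category, filename in entries:
            ET.SubElement(
                group,
                "Stream",
                {"name": stream_name, "category": category, "file": filename},
            )
    return root


def list_pointers(root):
    """Return (stream name, data file) pairs in document order."""
    return [(s.get("name"), s.get("file")) for s in root.iter("Stream")]


def move_stream(root, stream_name, target_group_name):
    """Move a stream node to another group; the data file itself is untouched."""
    target = next(
        (g for g in root.iter("Group") if g.get("name") == target_group_name),
        None,
    )
    if target is None:
        return False
    for group in root.iter("Group"):
        if group is target:
            continue
        for stream in list(group):
            if stream.get("name") == stream_name:
                group.remove(stream)
                target.append(stream)
                return True
    return False


if __name__ == "__main__":
    root = build_datapack(
        "demo_recording",
        {
            "Audio": [("mic1", "audio", "mic1.wav")],
            "Mocap": [("bow_position", "signal", "bow_position.csv")],
        },
    )
    move_stream(root, "bow_position", "Audio")
    print(ET.tostring(root, encoding="unicode"))
```

In this sketch, moving a stream between groups only edits the tree; the `file` attribute still points to the same data file, so no audio or signal data has to be copied or rewritten.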
Supplemental Material
- O. Mayor, J. Llop, and E. Maestre, Repovizz: a multimodal on-line database and browsing tool for music performance research, Proceedings of Int. Symposium for Music Information Retrieval, 2011.