Faceted search and browsing of audio content on spoken web
Abstract
Spoken Web is a web of VoiceSites that can be accessed by a phone. The content in a VoiceSite is audio. Therefore Spoken Web provides an alternate to the World Wide Web (WWW) in developing regions where low Internet penetration and low literacy are barriers to accessing the conventional WWW. Searching of audio content in Spoken Web through an audio query-result interface presents two key challenges: indexing of audio content is not accurate, and the presentation of results in audio is sequential, and therefore cumbersome. In this paper, we apply the concepts of faceted search and browsing to the Spoken Web search problem. We use the concepts of facets to index the meta-data associated with the audio content. We provide a mechanism to rank the facets based on the search results. We develop an interactive query interface that enables easy browsing of search results through the top ranked facets. To our knowledge, this is the first system to use the concepts of facets in audio search, and the first solution that provides an audio search for the rural population. We present quantitative results to illustrate the accuracy and effectiveness of the faceted search and qualitative results to highlight the usability of the interactive browsing system. The experiments have been conducted on more than 4000 audio documents collected from a live Spoken Web VoiceSite and evaluations were carried out with 40 farmers who are the target users of the VoiceSite. © 2010 ACM.