Evaluating tag filtering techniques for Web resource classification in folksonomies

Abstract

Social or collaborative tagging systems have emerged as a novel classification scheme on the Web based on the collective knowledge of people. In sites such as Del.icio.us, Technorati or Flickr, users annotate a variety of resources, including Web pages, blogs, pictures, videos or bibliographic references, using freely chosen textual labels or tags. Underlying collaborative tagging systems are ternary data structures known as folksonomies, which relate resources and users through tags; this information facilitates accessing and browsing massive repositories of resources. Collective annotations provided by people in the form of tags can also be exploited to organize on-line resources in a more formal classification scheme, such as those provided by hierarchies or directories, alleviating the manual classification commonly required by Web directories. In this paper we present an empirical study carried out to determine the value of tags in resource classification. Furthermore, several filtering and pre-processing operations intended to reduce the ambiguity and noise in tags are analyzed to determine whether they increase the quality of resource classification.

Publication
Expert Systems with Applications