Classifying unlabeled short texts using a fuzzy declarative approach.

Romero, Francisco P.; Iranzo, Pascual Julián; Soto, Andrés; Satler, Mateus Ferreira; Casero, Juan Gallardo

Use this identifier to cite or link to this item: http://www.repositorio.ufop.br/jspui/handle/123456789/8823

Full metadata record

Dublin Core field	Value	Language
dc.contributor.author	Romero, Francisco P.	-
dc.contributor.author	Iranzo, Pascual Julián	-
dc.contributor.author	Soto, Andrés	-
dc.contributor.author	Satler, Mateus Ferreira	-
dc.contributor.author	Casero, Juan Gallardo	-
dc.date.accessioned	2017-10-02T11:49:21Z	-
dc.date.available	2017-10-02T11:49:21Z	-
dc.date.issued	2013	-
dc.identifier.citation	ROMERO, F. P. et al. Classifying unlabeled short texts using a fuzzy declarative approach. Language Resources and Evaluation, v. 47, p. 151-178, 2013. Available at: <https://link.springer.com/article/10.1007/s10579-012-9203-2>. Accessed: 28 Jul. 2017.	pt_BR
dc.identifier.issn	1574-0218	-
dc.identifier.uri	http://www.repositorio.ufop.br/handle/123456789/8823	-
dc.description.abstract	Web 2.0 provides user-friendly tools that allow people to create and publish content online. User-generated content often takes the form of short texts (e.g., blog posts, news feeds, and snippets). This has motivated an increasing interest in the analysis of short texts and, specifically, in their categorisation. Text categorisation is the task of classifying documents into a certain number of predefined categories. Traditional text classification techniques are mainly based on statistical analysis of word frequency and have proved inadequate for the classification of short texts, where word occurrences are too few. In addition, the classic approach to text categorisation is based on a learning process that requires a large number of labeled training texts to achieve accurate performance. However, labeled documents might not be available, whereas unlabeled documents can be easily collected. This paper presents an approach to text categorisation which does not need a pre-classified set of training documents. The proposed method only requires the category names as user input. Each of these categories is defined by means of an ontology of terms modelled by a set of what we call proximity equations. Hence, our method is not based on the frequency of occurrence of a category, but depends heavily on the definition of that category and on how well the text fits that definition. Therefore, the proposed approach is an appropriate method for short text classification, where the frequency of occurrence of a category is very small or even zero. Another feature of our method is that the classification process is based on the ability of an extension of the standard Prolog language, named Bousi~Prolog, for flexible matching and knowledge representation. This declarative approach provides a text classifier which is quick and easy to build, and a classification process which is easy for the user to understand. The results of experiments showed that the proposed method achieved reasonably useful performance.	pt_BR
dc.language.iso	en_US	pt_BR
dc.rights	restricted	pt_BR
dc.subject	Text categorization	pt_BR
dc.subject	Ontologies	pt_BR
dc.subject	Thesauri	pt_BR
dc.subject	Unlabeled short texts	pt_BR
dc.title	Classifying unlabeled short texts using a fuzzy declarative approach.	pt_BR
dc.type	Article published in a journal	pt_BR
dc.identifier.uri2	https://link.springer.com/article/10.1007/s10579-012-9203-2	pt_BR
dc.identifier.doi	https://doi.org/10.1007/s10579-012-9203-2	-
Appears in collections:	DECSI - Articles published in journals

Files associated with this item:

File	Description	Size	Format
ARTIGO_ClassifyingUnlabeledShort.pdf (Restricted Access)		835.59 kB	Adobe PDF
