Chi-square classifier for document categorization

Mikhail Alexandrov, Alexander Gelbukh, George Lozovoi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Scopus citations

Abstract

The problem of document categorization is considered. The set of domains and the keywords specific to these domains are assumed to be selected beforehand as initial data. We apply a well-known statistical hypothesis test that treats the images of documents and domains as normalized vectors. In comparison with existing methods, this approach makes it possible to take into account the random character of the initial data. The classifier is developed within the framework of the Document Investigator software package.
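The abstract does not give the test statistic or the decision rule, so the following is only a minimal sketch of how a chi-square test could be used for categorization. It assumes Pearson's goodness-of-fit statistic, keyword occurrence counts as the document image, and normalized keyword frequencies as the domain image. The function names, keywords, and domain profiles are hypothetical and are not taken from the paper or from Document Investigator.

```python
from collections import Counter


def keyword_counts(text, keywords):
    """Count occurrences of each keyword in the text (lowercased, whitespace tokens)."""
    tokens = Counter(text.lower().split())
    return [tokens[k] for k in keywords]


def normalize(vector):
    """Scale a non-negative vector so that its components sum to 1."""
    total = sum(vector)
    if total == 0:
        raise ValueError("vector contains no keyword occurrences")
    return [v / total for v in vector]


def chi_square(observed, expected_probs):
    """Pearson statistic: sum over keywords of (O - E)^2 / E, with E = n * p."""
    n = sum(observed)
    stat = 0.0
    for o, p in zip(observed, expected_probs):
        e = n * p
        if e == 0:
            if o:
                return float("inf")  # keyword observed but impossible under this domain
            continue
        stat += (o - e) ** 2 / e
    return stat


def classify(doc_counts, domain_profiles):
    """Return the domain with the smallest statistic, plus all scores (assumed rule)."""
    scores = {
        name: chi_square(doc_counts, normalize(profile))
        for name, profile in domain_profiles.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical keywords and domain profiles, for illustration only.
keywords = ["gene", "protein", "market", "price"]
profiles = {
    "biology": [40, 50, 5, 5],
    "economics": [5, 5, 45, 45],
}
doc = "gene protein protein market gene protein"
best, scores = classify(keyword_counts(doc, keywords), profiles)
print(best, scores)
```

In a hypothesis-testing setting, the smallest statistic would also be compared with a chi-square critical value (degrees of freedom equal to the number of keywords minus one) so that a document matching no domain can be rejected. With expected counts as small as those in this toy example, the chi-square approximation is unreliable, so the numbers are illustrative only.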

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 2nd International Conference, CICLing 2001, Proceedings
Editors: Alexander Gelbukh
Publisher: Springer Verlag
Pages: 457-459
Number of pages: 3
ISBN (Print): 3540416870, 9783540416876
State: Published - 2001
Event: 2nd International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2001 - Mexico City, Mexico
Duration: 18 Feb 2001 to 24 Feb 2001

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2004
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2001
Country/Territory: Mexico
City: Mexico City
Period: 18/02/01 to 24/02/01
