TY - GEN
T1 - Syntactic dependency-based n-grams as classification features
AU - Sidorov, Grigori
AU - Velasquez, Francisco
AU - Stamatatos, Efstathios
AU - Gelbukh, Alexander
AU - Chanona-Hernández, Liliana
PY - 2013
Y1 - 2013
AB - In this paper we introduce a concept of syntactic n-grams (sn-grams). Sn-grams differ from traditional n-grams in the manner of what elements are considered neighbors. In case of sn-grams, the neighbors are taken by following syntactic relations in syntactic trees, and not by taking the words as they appear in the text. Dependency trees fit directly into this idea, while in case of constituency trees some simple additional steps should be made. Sn-grams can be applied in any NLP task where traditional n-grams are used. We describe how sn-grams were applied to authorship attribution. SVM classifier for several profile sizes was used. We used as baseline traditional n-grams of words, POS tags and characters. Obtained results are better when applying sn-grams.
KW - authorship attribution
KW - classification features
KW - parsing
KW - sn-grams
KW - syntactic n-grams
KW - syntactic paths
UR - http://www.scopus.com/inward/record.url?scp=84875865567&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37798-3_1
DO - 10.1007/978-3-642-37798-3_1
M3 - Conference contribution
SN - 9783642377976
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 11
BT - Advances in Artificial Intelligence - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012, Revised Selected Papers
T2 - 11th Mexican International Conference on Artificial Intelligence, MICAI 2012
Y2 - 27 October 2012 through 4 November 2012
ER -