CIC-FBK approach to native language identification

Ilia Markov; Lingzhen Chen; Carlo Strapparava; Grigori Sidorov

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Citations (Scopus)

Abstract

We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research, i.e., word n-grams, lemma n-grams, part-of-speech n-grams, and function words, with recently introduced character n-grams from misspelled words, and features that are novel in this task, such as typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use the log-entropy weighting scheme and perform classification using the Support Vector Machines (SVM) algorithm. Our system achieved a macro-averaged F1-score of 0.8808 and shared first place in the NLI Shared Task 2017 ranking.
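
The abstract names log-entropy weighting but does not spell out the formula. The following is a minimal sketch, assuming the commonly used formulation: a local weight log(1 + tf) multiplied by a global weight 1 + sum_j(p_ij * log p_ij) / log(N), where p_ij is the share of a term's total occurrences that falls in document j and N is the number of documents. The function names `char_ngrams` and `log_entropy_weights` are hypothetical and are not taken from the CIC-FBK system; the authors' actual feature extraction (typed character n-grams, syntactic n-grams) and their SVM setup are not reproduced here.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of `text` (illustrative helper)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def log_entropy_weights(docs_tokens):
    """Weight each document's terms with an assumed log-entropy scheme.

    docs_tokens: list of token lists, one list per document.
    Returns a list of {term: weight} dicts, one per document.
    """
    n_docs = len(docs_tokens)
    counts = [Counter(tokens) for tokens in docs_tokens]

    # Total frequency of each term across the whole collection.
    global_freq = Counter()
    for doc_counts in counts:
        global_freq.update(doc_counts)

    # Global weight: 1 + sum_j(p_ij * log p_ij) / log(N).
    # With a single document log(N) is 0, so fall back to a weight of 1.
    global_weight = {}
    for term, gf in global_freq.items():
        if n_docs < 2:
            global_weight[term] = 1.0
            continue
        entropy_sum = 0.0
        for doc_counts in counts:
            tf = doc_counts.get(term, 0)
            if tf:
                p = tf / gf
                entropy_sum += p * math.log(p)
        global_weight[term] = 1.0 + entropy_sum / math.log(n_docs)

    # Local weight log(1 + tf) scaled by the term's global weight.
    return [
        {term: math.log(1 + tf) * global_weight[term] for term, tf in doc_counts.items()}
        for doc_counts in counts
    ]
```

Under this formulation a term spread evenly over all documents gets a global weight of 0, while a term concentrated in a single document gets a global weight of 1. The resulting per-document dictionaries could then be vectorized and passed to any linear classifier.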

Original language	English
Title of host publication	EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop
Publisher	Association for Computational Linguistics (ACL)
Pages	374-381
Number of pages	8
ISBN (electronic)	9781945626852
Status	Published - 2017
Event	12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017, held in conjunction with EMNLP 2017 - Copenhagen, Denmark. Duration: 8 Sep 2017 → …

Publication series

Name	EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop

Conference

Conference	12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017, held in conjunction with EMNLP 2017
Country/Territory	Denmark
City	Copenhagen
Period	8/09/17 → …

Cite this

Markov, I., Chen, L., Strapparava, C., & Sidorov, G. (2017). CIC-FBK approach to native language identification. In EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop (pp. 374-381). (EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop). Association for Computational Linguistics (ACL).

Markov, Ilia ; Chen, Lingzhen ; Strapparava, Carlo et al. / CIC-FBK approach to native language identification. EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop. Association for Computational Linguistics (ACL), 2017. pp. 374-381 (EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop).

@inproceedings{59096b50dd804ebbb3c17fb407ef6a64,

title = "{CIC-FBK} approach to native language identification",

abstract = "We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research, i.e., word n-grams, lemma n-grams, part-of-speech n-grams, and function words, with recently introduced character n-grams from misspelled words, and features that are novel in this task, such as typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use the log-entropy weighting scheme and perform classification using the Support Vector Machines (SVM) algorithm. Our system achieved a macro-averaged F1-score of 0.8808 and shared first place in the NLI Shared Task 2017 ranking.",

author = "Ilia Markov and Lingzhen Chen and Carlo Strapparava and Grigori Sidorov",

note = "Publisher Copyright: {\textcopyright} EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop. All rights reserved.; 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017, held in conjunction with EMNLP 2017 ; Conference date: 08-09-2017",

year = "2017",

language = "English",

series = "EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop",

publisher = "Association for Computational Linguistics (ACL)",

pages = "374--381",

booktitle = "EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop",

}

Markov, I, Chen, L, Strapparava, C & Sidorov, G 2017, CIC-FBK approach to native language identification. in EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop. EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop, Association for Computational Linguistics (ACL), pp. 374-381, 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017, held in conjunction with EMNLP 2017, Copenhagen, Denmark, 8/09/17.

TY - GEN

T1 - CIC-FBK approach to native language identification

AU - Markov, Ilia

AU - Chen, Lingzhen

AU - Strapparava, Carlo

AU - Sidorov, Grigori

PY - 2017

Y1 - 2017

AB - We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research, i.e., word n-grams, lemma n-grams, part-of-speech n-grams, and function words, with recently introduced character n-grams from misspelled words, and features that are novel in this task, such as typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use the log-entropy weighting scheme and perform classification using the Support Vector Machines (SVM) algorithm. Our system achieved a macro-averaged F1-score of 0.8808 and shared first place in the NLI Shared Task 2017 ranking.

UR - http://www.scopus.com/inward/record.url?scp=85096916226&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85096916226

T3 - EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop

SP - 374

EP - 381

BT - EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop

PB - Association for Computational Linguistics (ACL)

T2 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017, held in conjunction with EMNLP 2017

Y2 - 8 September 2017

ER -

Markov I, Chen L, Strapparava C, Sidorov G. CIC-FBK approach to native language identification. In EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop. Association for Computational Linguistics (ACL). 2017. p. 374-381. (EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop).
