Towards faster annotation interfaces for learning to filter in information extraction and search

Carlos A. Aguirre, Shelby Coen, Maria F. De La Torre, William H. Hsu, Margaret Rys

Research output: Contribution to conference (Paper)

Abstract

© 2018 Copyright for the individual papers remains with the authors.

This work explores the design of an annotation interface for a document filtering system based on supervised and semisupervised machine learning. It focuses on usability improvements to the user interface that make annotation more efficient without loss of precision, recall, or accuracy. Our objective is to create an automated pipeline for information extraction (IE) and exploratory search, for which the learning filter serves as an intake mechanism. The ultimate purpose of this IE and search system is to help users create structured recipes for nanomaterial synthesis from scientific documents crawled from the web. A key part of each text corpus used to train our learning classifiers is a set of thousands of documents that are hand-labeled for relevance to nanomaterials search criteria of interest. This annotation process becomes expensive as the text corpus expands through focused web crawling over open-access documents and the addition of new publisher collections. To speed up annotation, we present a user interface that facilitates and optimizes the interactive steps of document presentation, inspection, and labeling. We aim to transfer these usability and response-time improvements from this annotator to other classification learning domains, for text documents and beyond.
Original language: American English
State: Published - 1 Jan 2018
Externally published: Yes
Event: CEUR Workshop Proceedings
Duration: 1 Jan 2018 → …

Conference

Conference: CEUR Workshop Proceedings
Period: 1/01/18 → …
