© 2018 Copyright for the individual papers remains with the authors.
This work explores the design of an annotation interface for a document filtering system based on supervised and semi-supervised machine learning. It focuses on usability improvements to the user interface that make annotation more efficient without loss of precision, recall, or accuracy. Our objective is to create an automated pipeline for information extraction (IE) and exploratory search, for which the learning filter serves as an intake mechanism. The ultimate purpose of this IE and search system is to help users create structured recipes for nanomaterial synthesis from scientific documents crawled from the web. A key part of each text corpus used to train our learning classifiers is a set of thousands of documents hand-labeled for relevance to nanomaterials search criteria of interest. This annotation process becomes expensive as the text corpus expands through focused web crawling over open-access documents and the addition of new publisher collections. To speed up annotation, we present a user interface that facilitates and optimizes the interactive steps of document presentation, inspection, and labeling. We aim to transfer these improvements in annotator usability and response time to other classification learning domains, for text documents and beyond.
|Original language||American English|
|State||Published - 1 Jan 2018|
|Event||CEUR Workshop Proceedings - Duration: 1 Jan 2018 → …|
|Conference||CEUR Workshop Proceedings|
|Period||1/01/18 → …|