### Abstract

Original language | American English |
---|---|
Pages (from-to) | 819-833 |
Number of pages | 15 |
Journal | Computacion y Sistemas |
DOIs | https://doi.org/10.13053/CyS-22-3-3015 |
State | Published - 1 Jan 2018 |

### Cite this

**A formula embedding approach to math information retrieval.** / Pathak, Amarnath; Pakray, Partha; Gelbukh, Alexander. *Computacion y Sistemas*, 2018, pp. 819-833. https://doi.org/10.13053/CyS-22-3-3015

Research output: Contribution to journal › Article

TY - JOUR

T1 - A formula embedding approach to math information retrieval

AU - Pathak, Amarnath

AU - Pakray, Partha

AU - Gelbukh, Alexander

PY - 2018/1/1

Y1 - 2018/1/1

N2 - © 2018 Instituto Politecnico Nacional. All rights reserved. Intricate math formulae, which make up much of the content of scientific documents, add to the complexity of scientific document retrieval. Although modifications to conventional indexing and search mechanisms have eased this complexity and shown notable performance, a formula embedding approach to scientific document retrieval is equally appealing and promising. The Formula Embedding Module of the proposed system uses a Bit Position Information Table to transform the math formulae contained in scientific documents into binary formula vectors. Each set bit of a formula vector designates the presence of a specific mathematical entity. A mathematical user query is transformed into a query vector in the same fashion, and the corresponding relevant documents are retrieved. The relevance of a search result is characterized by the extent of similarity between the indexed formula vector and the query vector. Promising performance in a moderately constrained setting substantiates the competence of the proposed approach.

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85055505893&origin=inward

UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85055505893&origin=inward

U2 - 10.13053/CyS-22-3-3015

DO - 10.13053/CyS-22-3-3015

M3 - Article

SP - 819

EP - 833

JO - Computacion y Sistemas

JF - Computacion y Sistemas

SN - 1405-5546

ER -
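
The abstract in the record above describes formulae being mapped to binary vectors through a Bit Position Information Table and ranked by their similarity to a query vector. A minimal sketch of that idea follows. The table contents, the function names `embed_formula`, `similarity` and `rank`, and the use of Jaccard similarity are all assumptions made for illustration; the record specifies neither the table entries nor the similarity measure, and formulae are assumed to be already broken into entity names.

```python
# Hypothetical Bit Position Information Table: entity name -> bit position.
# The entities and positions are illustrative, not the paper's actual table.
BIT_POSITION_TABLE = {
    "plus": 0,
    "minus": 1,
    "times": 2,
    "divide": 3,
    "power": 4,
    "sqrt": 5,
    "integral": 6,
    "sum": 7,
}


def embed_formula(entities):
    """Return an int whose set bits mark the entities present in a formula."""
    vector = 0
    for entity in entities:
        position = BIT_POSITION_TABLE.get(entity)
        if position is not None:  # entities missing from the table are skipped
            vector |= 1 << position
    return vector


def similarity(a, b):
    """Jaccard similarity of two bit vectors (an assumed measure)."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    return bin(a & b).count("1") / union


def rank(query_entities, index):
    """Rank indexed formulae by similarity to the query, best first."""
    query_vector = embed_formula(query_entities)
    scored = [(similarity(query_vector, vec), doc_id) for doc_id, vec in index.items()]
    # Sort by descending score, breaking ties by document id.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Toy index of three documents, each represented by one formula vector.
index = {
    "doc_a": embed_formula(["plus", "power"]),
    "doc_b": embed_formula(["integral", "times", "power"]),
    "doc_c": embed_formula(["sqrt", "divide"]),
}
results = rank(["plus", "power", "times"], index)
```

Storing each vector as a Python integer keeps the bitwise AND and OR cheap; a real system would likely need a much larger table and might weight entities rather than treat every bit equally.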