Generative Question-Answering System for identifying Experts in Academia

AUTHORS

  • Nouali Sarah
  • Badache Ismail
  • Bellot Patrice

KEYWORDS

  • Natural Language Processing
  • Retrieval-Augmented Generation
  • Expert Search
  • Question Answering System
  • Large Language Models
  • Information Retrieval

DOCUMENT TYPE

  • Conference papers

ABSTRACT

    In academia, mapping all the researchers within an institution and summarizing their skills and expertise is not easy. Whether for collaboration, team building for multidisciplinary projects, or mentoring, identifying the right expert is crucial and requires detailed knowledge of individual skills, abilities, and experience. Expert finding has long been a topic of interest in Information Retrieval (IR) research. Despite numerous studies, issues such as dynamic expertise, sparse data, multidimensional skills, and large heterogeneous data sources make it difficult to build accurate, unbiased expert-finding systems that account for different types of expertise and their temporal evolution. Recent work explores neural, embedding-based, and graph models, which shows that expert finding remains an active, open research problem. The aim of this thesis is to propose a generative question-answering search model capable of identifying the entities (people or institutions) whose skills are closest to those being sought and then generating a text that summarizes and justifies the answers found. Our approach is based on the Retrieval-Augmented Generation (RAG) principle and extracts a list of experts from scientific documents (articles or project proposals). The initial evaluation is conducted on the TREC Enterprise 2007 dataset for the document and expert search tasks, the first and one of the few benchmarks for this type of task. Revisiting this 18-year-old dataset allows us not only to compare traditional and modern approaches, but also to analyze the benefits of Large Language Models (LLMs) in the retrieval and generation stages of our RAG approach.
