Paper: https://arxiv.org/pdf/2411.14199 | GitHub: https://github.com/AkariAsai/OpenScholar
The paper introduces OpenScholar, a retrieval-augmented large language model (LLM) designed for synthesizing scientific literature. OpenScholar draws on a large datastore of open-access papers and uses iterative self-feedback to generate high-quality, accurately cited answers to scientific questions. The authors also introduce ScholarQABench, a new benchmark for evaluating open-ended scientific question answering with both automatic and human evaluations. In experiments, OpenScholar outperforms other LLMs and, on some measures such as information coverage, even human experts. The paper closes with a discussion of the limitations of OpenScholar and ScholarQABench, alongside plans to open-source the model and benchmark.
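The pipeline described above (retrieve from a datastore of open-access papers, draft an answer with citations, then iteratively self-critique and revise) can be sketched roughly as follows. This is a minimal illustration under assumed interfaces, not the actual OpenScholar implementation: the `Retriever`, `LLM`, function names, and prompt wording are placeholders for whatever retriever and model one plugs in.

```python
# Hypothetical sketch of retrieval-augmented answering with iterative
# self-feedback, in the spirit of the pipeline summarized above.
# The Retriever/LLM interfaces and prompts are assumptions for illustration,
# NOT the OpenScholar API.

from dataclasses import dataclass


@dataclass
class Passage:
    paper_id: str  # identifier of the open-access paper
    text: str      # retrieved passage text


class Retriever:
    """Stand-in for a retriever over a datastore of open-access papers."""
    def search(self, query: str, k: int = 8) -> list[Passage]:
        raise NotImplementedError


class LLM:
    """Stand-in for any instruction-tuned language model."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError


def answer_with_self_feedback(question: str, retriever: Retriever, llm: LLM,
                              max_rounds: int = 3) -> str:
    """Retrieve passages, draft a cited answer, then iteratively critique
    and revise the draft, optionally retrieving additional evidence."""
    passages = retriever.search(question)
    context = "\n\n".join(f"[{i + 1}] ({p.paper_id}) {p.text}"
                          for i, p in enumerate(passages))
    draft = llm.complete(
        f"Answer the question using only the numbered passages and cite "
        f"them as [n].\n\nPassages:\n{context}\n\n"
        f"Question: {question}\nAnswer:")

    for _ in range(max_rounds):
        feedback = llm.complete(
            f"Critique this answer for missing coverage, unsupported claims, "
            f"or missing citations. Reply 'OK' if there are no issues.\n\n"
            f"Question: {question}\nAnswer: {draft}\nFeedback:")
        if feedback.strip().upper().startswith("OK"):
            break
        # Retrieve extra evidence guided by the feedback, then revise.
        extra = retriever.search(f"{question} {feedback}", k=4)
        extra_ctx = "\n\n".join(f"[+] ({p.paper_id}) {p.text}" for p in extra)
        draft = llm.complete(
            f"Revise the answer to address the feedback while keeping "
            f"citations accurate.\n\nPassages:\n{context}\n\n{extra_ctx}\n\n"
            f"Question: {question}\nFeedback: {feedback}\n"
            f"Previous answer: {draft}\nRevised answer:")
    return draft
```

In practice one would subclass `Retriever` and `LLM` with a real dense retriever over the paper datastore and an actual model endpoint; the loop structure (draft, critique, retrieve more, revise) is the part that mirrors the self-feedback idea described in the summary.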
ai, llm, retrieval augmented, rag, artificial intelligence, arxiv, research, paper, publication, genai, generativeai, agentic