Academic literature reviews are time consuming and cognitively demanding. Researchers need to read many papers to extract insights, identify research gaps, and validate novel directions. This project explores how an AI assistant could streamline this process while maintaining accuracy and providing verifiable citations.
Duration: 2 Weeks
Team: 4 Members
Tools: Streamlit, Python, OpenAI GPT-4o-mini, ChromaDB, LangChain, Tavily API
My Contributions: I designed and implemented the core system architecture, including the multi-agent coordination framework, the RAG pipeline with vector embeddings, and the intent classification system. Building on this foundation, a team member developed the Streamlit interface with streaming responses, integrated the Tavily API, and completed the remaining smaller components.
System architecture illustrating the multi-agent coordination framework with RAG pipeline.
Architecture & Design
Multi-Agent System Overview
The system is built around specialized AI agents that coordinate to provide comprehensive literature analysis. Each agent handles a distinct role, enabling efficient processing of diverse research tasks.
The Four Specialized Agents
Reader Agent: Extracts and summarizes key sections from research papers including abstract, methodology, results, and contributions. It provides structured summaries with clear sections for findings, methodology, and limitations.
QA Agent: Answers specific questions about paper content by searching through relevant chunks and providing precise, citation-backed responses. It distinguishes between what papers state explicitly and interpretations.
Research Advisor: Synthesizes insights across papers to identify gaps, limitations, and open questions. It suggests promising future research directions with clear rationale and potential impact.
Research Verifier: Validates research suggestions by conducting web searches through Tavily API. It assesses whether ideas are novel, identifies related work, and provides revisions to position ideas more effectively.
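The agent roles above can be sketched as a small registry keyed by intent. This is a minimal illustration, not the project's actual code: the class name, dictionary keys, and prompt wording are assumptions. Note that the Research Verifier is absent from the routing table because it runs automatically after the Research Advisor rather than being user-selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A specialized agent is essentially a name plus a role-specific system prompt."""
    name: str
    system_prompt: str


# Illustrative prompts only; the real prompts are longer and more detailed.
AGENTS = {
    "summarize": Agent(
        "Reader",
        "Summarize the paper's abstract, methodology, results, and "
        "limitations in clearly labeled sections.",
    ),
    "question": Agent(
        "QA",
        "Answer questions using only the provided excerpts; distinguish "
        "what the paper states explicitly from interpretation.",
    ),
    "research": Agent(
        "Research Advisor",
        "Synthesize insights across papers to identify gaps, limitations, "
        "and promising future directions.",
    ),
}
```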
The Research Advisor analyzes uploaded literature to identify research gaps and suggest novel future directions.
Intent Classification & Routing
The intent classifier uses GPT-4o-mini to analyze user messages and route them to the right agent. It categorizes queries into three types: "summarize" (for paper overviews), "question" (for specific content inquiries), or "research" (for gap analysis and future directions). This classification happens before the main agent interaction, keeping the system responsive while ensuring users get the most relevant help.
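The classification step described above can be sketched as follows. The prompt wording and function names are illustrative assumptions; the LLM call is passed in as a plain callable so the routing logic can be read (and tested) without an API key.

```python
INTENT_LABELS = ("summarize", "question", "research")

CLASSIFIER_PROMPT = (
    "Classify the user's message as exactly one of: summarize, question, "
    "research. Reply with the label only.\n\nMessage: {message}"
)


def classify_intent(message: str, llm_call) -> str:
    """llm_call wraps the GPT-4o-mini completion; it takes a prompt string
    and returns the model's text. Fall back to 'question' if the model
    returns anything unexpected, since Q&A is the safest default."""
    label = llm_call(CLASSIFIER_PROMPT.format(message=message)).strip().lower()
    return label if label in INTENT_LABELS else "question"
```

Because the classifier runs before the main agent interaction, keeping it to a single cheap call with a one-word reply is what keeps the system responsive.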
Intent classification automatically routes queries to the most appropriate specialized agent.
RAG (Retrieval-Augmented Generation) Pipeline
The RAG architecture grounds all responses in the actual uploaded papers. PDFs are processed through LangChain's PyPDFLoader, split into 1000-character chunks with a 200-character overlap, and then embedded using OpenAI's text-embedding-3-small model. These embeddings are stored in ChromaDB's ephemeral client for fast semantic search. When a user asks a question, the system retrieves the 5 most relevant chunks and injects them directly into the agent's context, which helps ensure responses are accurate and cite specific passages.
Technical Implementation
Document Processing Pipeline
The system handles PDF uploads through a multi-step pipeline. Each uploaded file is temporarily saved, processed by LangChain's PyPDFLoader to extract text content, and then deleted. The RecursiveCharacterTextSplitter divides the extracted text into 1000-character chunks with a 200-character overlap to maintain context at chunk boundaries. All chunks are aggregated and embedded together, creating a single vector store that spans all uploaded papers.
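The splitting behavior can be sketched with the same parameters in plain Python. The actual system uses LangChain's RecursiveCharacterTextSplitter, which splits on separators like paragraphs before falling back to character counts; this is a simplified fixed-window equivalent for illustration.

```python
CHUNK_SIZE = 1000     # characters per chunk
CHUNK_OVERLAP = 200   # characters shared between adjacent chunks


def split_text(text: str, size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP):
    """Sliding-window split: each chunk repeats the last `overlap`
    characters of its predecessor so context at chunk boundaries
    is preserved for retrieval."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

A 1,800-character document therefore yields two chunks whose 200-character overlap region is identical, so a passage straddling the boundary is still retrievable from either chunk.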
Vector Embeddings & Semantic Search
OpenAI's text-embedding-3-small model converts chunks into dense vector representations. These embeddings enable semantic searches that go beyond simple keyword matching. ChromaDB handles fast retrieval of relevant context based on similarity to user queries.
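Under the hood, similarity search of this kind typically ranks chunks by cosine similarity between the query embedding and each chunk embedding. ChromaDB handles this internally; the sketch below shows the idea in stdlib Python with hypothetical function names, not ChromaDB's API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k=5):
    """Indices of the k chunks most similar to the query embedding."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```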
Key Technical Features
Streaming Responses: All agent interactions use OpenAI's streaming API to deliver tokens as they're generated, providing immediate visual feedback and a more natural conversation feel.
Smart Context Injection: For each query, the system retrieves the top 5 relevant chunks via semantic search and injects them into the system prompt, keeping responses grounded in source material.
Session State Management: Streamlit's session state maintains the chat history, uploaded files, and vector store across interactions. The last 10 messages are included in each request for conversational context.
Source Transparency: Users can expand a "Source Chunks Used" section to see exactly which paper excerpts informed each response, improving trust and verifiability.
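The streaming, context-injection, and history features above can be sketched as three small helpers. All names, prompt wording, and formatting here are illustrative assumptions; only the top-5 chunk count and 10-message window come from the design described above.

```python
TOP_K = 5        # retrieved chunks injected per query
MAX_HISTORY = 10  # prior messages kept for conversational context


def build_system_prompt(base_prompt: str, chunks: list) -> str:
    """Inject the top retrieved chunks into the system prompt so the
    agent answers only from the uploaded papers. Numbering the excerpts
    also lets responses point back at specific sources."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks[:TOP_K]))
    return f"{base_prompt}\n\nRelevant excerpts:\n\n{context}"


def build_messages(system_prompt: str, history: list, user_message: str) -> list:
    """Assemble the request, trimming history to the last MAX_HISTORY turns."""
    return (
        [{"role": "system", "content": system_prompt}]
        + history[-MAX_HISTORY:]
        + [{"role": "user", "content": user_message}]
    )


def stream_text(stream):
    """Yield text deltas from an OpenAI-style streaming response so the
    UI can render tokens as they arrive; skips empty keep-alive chunks."""
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```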
Research Verification with Web Search
When users request research ideas, the system follows a two-step workflow. First, the Research Advisor generates suggestions based on the uploaded papers. Then, the system automatically extracts key ideas from these suggestions and conducts targeted web searches using Tavily API (typically 3 queries with phrases like "related work arxiv"). The Research Verifier agent then analyzes these search results to assess whether each idea appears novel, closely related to existing work, or already well-explored. It provides revised versions of promising ideas along with evidence from the web searches.
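The query-construction step of this workflow might look like the sketch below. Only the "related work arxiv" phrasing and the count of three queries come from the description above; the other templates and all names are assumptions.

```python
# Illustrative search templates; the project's actual phrasing may differ.
QUERY_TEMPLATES = (
    '"{idea}" related work arxiv',
    '"{idea}" survey',
    '"{idea}" existing approaches',
)


def build_verification_queries(idea: str, n: int = 3) -> list:
    """Produce n targeted web-search queries for one extracted research
    idea; the results are then passed to the Research Verifier agent."""
    return [t.format(idea=idea) for t in QUERY_TEMPLATES[:n]]
```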
The Research Verifier validates suggestions by searching for existing related work on the web.
Technology Stack
The technology stack balances functionality, performance, and maintainability:
Frontend: Streamlit for the interactive web interface
LLM: OpenAI GPT-4o-mini accessed via API
Embeddings: OpenAI text-embedding-3-small for creating semantic vector representations
Vector Store: ChromaDB ephemeral client for in-memory storage and fast retrieval of embeddings
Document Processing: LangChain's PyPDFLoader and RecursiveCharacterTextSplitter for robust PDF handling
Web Search: Tavily API for comprehensive research verification
Backend: Python with OpenAI SDK for clean, maintainable code
Results & Future Work
Project Outcomes
The project successfully demonstrated how coordinated AI agents can streamline academic literature analysis. The system effectively handles multiple PDFs, provides accurate summaries with citations, and offers research direction suggestions. The web-based verification feature helps confirm that suggested directions are genuinely novel.
Future Enhancements
Several potential directions could extend the system's capabilities:
Additional Specialized Agents
Citation Network Analyzer: An agent that maps citation relationships between papers to identify influential works and research lineages
Methodology Extractor: Specialized agent focused on extracting and comparing research methodologies across papers
Dataset Finder: Agent that identifies datasets mentioned in papers and suggests relevant publicly available datasets
Technical Improvements
Multi-LLM Support: Allow users to choose between different language models (GPT-4, Claude, Llama) based on their needs and preferences
Batch Processing: Enable processing of large document collections with progress tracking and resumption capabilities
Export Functionality: Allow users to export chat histories, summaries, and research suggestions in various formats (PDF, Markdown, LaTeX)
Enhanced Verification
Integration with academic databases (arXiv, Semantic Scholar) for more comprehensive checking
Trend analysis showing publication volume over time for suggested research directions
Lessons Learned
This project, developed as part of a course on LLM applications, provided valuable experience building an AI integration system. Working through the implementation reinforced how important clear system design becomes when coordinating multiple agents: separating concerns between agents, getting the intent classification right, and managing context effectively all turned out to be critical for keeping responses coherent and reliable.