Agents API Reference
This document details the agent system in the application, including the specialized agents and their roles in processing queries.
Common Architecture
All agents in the system share a similar architectural pattern and are managed by the agent factory (`app.core.agents.agents_factory`):
Core Components:
- Dynamic Agent Creation: Loads agent modules based on configuration settings
- Flexible Parameter Handling: Uses introspection to pass only required parameters to each agent
- LLM Selection: Supports configuration-based and agent-specific LLM selection
- Session Management: Maintains consistent session IDs across the agent ecosystem
- Error Handling: Provides robust logging and exception handling
Core Factory Function
The `create_all_agents` function serves as the entry point for initializing the entire agent ecosystem:
```python
def create_all_agents(llms, graph, openai_key=None, session_id=None):
    """
    Dynamically create and initialize all agent modules as specified in the configuration.

    Parameters:
        llms (dict): A dictionary mapping LLM keys to their instances.
        graph: The graph instance used by the agents.
        openai_key (str, optional): The OpenAI API key to be used by agents.
        session_id (str, optional): A unique session identifier.

    Returns:
        dict: A dictionary mapping agent names to their created executor instances.
    """
```
Individual Agent Architecture
Each agent in the system follows a consistent structural pattern while maintaining specialized functionality.
Common Agent Components
- Creation Function: Each agent implements a `create_agent` function that returns an `AgentExecutor`
- Tool Management: Dynamically loads tools from its directory using the `import_tools` utility
- Role-Specific Prompts: Defines behavior through customized prompts
- LLM Integration: Utilizes language models as reasoning engines
- Logging: Implements consistent logging for monitoring and debugging
Standard Agent Structure
```python
def create_agent(llms, graph, openai_key, llm_instance=None) -> AgentExecutor:
    """
    Creates and configures an agent with its specialized tools.

    Parameters:
        llms (dict): Available language models.
        graph: The knowledge graph instance.
        openai_key (str): API key for OpenAI services.
        llm_instance: Optional specific LLM instance to use.

    Returns:
        AgentExecutor: A configured agent executor instance.
    """
    # Load tools dynamically from the agent's directory
    # Configure the agent with appropriate prompts
    # Return an AgentExecutor instance
```
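For orientation, here is a minimal sketch of what such a `create_agent` function could look like using LangChain's tool-calling agent helpers. The default LLM key, the placeholder tool, and the prompt text are assumptions for illustration; the real agents load their tools via the `import_tools` utility and define their prompts in `prompt.py`.

```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool


@tool
def echo(text: str) -> str:
    """Return the input unchanged (stand-in for tools loaded via import_tools)."""
    return text


def create_agent(llms, graph, openai_key, llm_instance=None) -> AgentExecutor:
    # Prefer an explicitly supplied model; fall back to a configured default
    # ("llm_mini" as the default key is an assumption for this sketch).
    llm = llm_instance or llms["llm_mini"]

    # In the real agents, tools come from the agent's own directory.
    tools = [echo]

    # Role-specific behavior is normally defined in prompt.py; this is a stand-in.
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a specialized agent. Use your tools to answer the question."),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ])

    agent = create_openai_tools_agent(llm, tools, prompt)
    return AgentExecutor(agent=agent, tools=tools, verbose=True)
```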
Agent Locations:
- ENPKG Agent: `app/core/agents/enpkg/agent.py`
- Entry Agent: `app/core/agents/entry/agent.py`
- Interpreter Agent: `app/core/agents/interpreter/agent.py`
- SPARQL Agent: `app/core/agents/sparql/agent.py`
- Validator Agent: `app/core/agents/validator/agent.py`
- Supervisor Agent: `app/core/agents/supervisor/agent.py`
Entry Agent
The Entry Agent serves as the first point of contact for user interactions.
Purpose:
- Initial query processing and classification
- File analysis for submitted documents
Key Features:
- Classifies queries into "New Knowledge Question" or "Help me understand Question"
- Analyzes submitted files using the FILE_ANALYZER tool
- Validates query completeness and requests clarification when needed
Tools:
- FILE_ANALYZER: Processes and analyzes submitted files, providing file paths and content summaries
Usage Cases:
- When users submit new queries requiring database information
- When files need to be analyzed
- For follow-up questions requiring context from previous conversations
Validator Agent
The Validator Agent ensures query validity and data quality.
Purpose:
- Validates user queries against database capabilities
- Ensures data quality and consistency
- Provides feedback for invalid queries
Validation Checks:
- Plant name verification in database
- Query compatibility with schema
- Content relevance to available nodes/entities
Tools:
- PLANT_DATABASE_CHECKER: Verifies plant names in database
Validation Criteria:
- Plant-specific and feature-related queries
- Grouping, counting, and annotation comparisons
- Schema compatibility
- Data availability
Supervisor Agent
The Supervisor Agent orchestrates the interaction between all other agents in the system.
Purpose:
- Coordinates information flow between agents
- Makes routing decisions based on query content
- Ensures proper processing sequence
- Manages agent responses and task completion
Decision Making:
- Routes queries containing entities (compounds, taxa, targets) to ENPKG Agent
- Forwards resolved entities and queries to SPARQL Agent
- Directs query results to Interpreter Agent when needed
- Determines when to complete the process
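The routing decision boils down to mapping the Supervisor's output onto the next node in the workflow. The sketch below is purely illustrative: the member names and the FINISH sentinel are assumptions, not the application's exact implementation.

```python
# Illustrative routing helper; names are placeholders.
MEMBERS = ["ENPKG_Agent", "SPARQL_Agent", "Interpreter_Agent"]


def route_from_supervisor(decision: str) -> str:
    """Map the Supervisor's decision onto the next node in the graph."""
    if decision in MEMBERS:
        return decision  # hand the conversation to a specialized agent
    return "FINISH"      # nothing left to do; terminate the workflow
```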
ENPKG Agent
The ENPKG Agent specializes in resolving and standardizing entities mentioned in queries.
Purpose:
- Resolves entity references to standardized identifiers
- Handles multiple types of entities (chemicals, taxa, targets)
- Provides unit information for numerical values
Tools:
- CHEMICAL_RESOLVER: Maps chemical names to NPC Class URIs
- TAXON_RESOLVER: Resolves taxonomic names to Wikidata IRIs
- TARGET_RESOLVER: Maps target names to ChEMBLTarget IRIs
- SMILE_CONVERTER: Converts SMILES structures to InChIKey notation
Entity Types Handled:
- Natural product compounds (with chemical class identifiers)
- Taxonomic names (with Wikidata IRIs)
- Molecular targets (with ChEMBL IRIs)
- Chemical structures (in SMILES notation)
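To make the structure-conversion step concrete, here is a small standalone sketch of a SMILES-to-InChIKey conversion using RDKit. It illustrates the concept only; it is not necessarily how SMILE_CONVERTER is implemented internally.

```python
# Requires RDKit (pip install rdkit). Conceptual illustration of SMILES -> InChIKey.
from rdkit import Chem

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"   # aspirin, as a SMILES string
mol = Chem.MolFromSmiles(smiles)
print(Chem.MolToInchiKey(mol))        # e.g. BSYNRYMUTXBXSQ-UHFFFAOYSA-N
```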
SPARQL Agent
The SPARQL Agent handles the generation and execution of database queries.
Purpose:
- Generates and executes SPARQL queries
- Handles query optimization and improvement
- Manages data retrieval and formatting
Key Components:
- Query Generation Chain: Creates initial SPARQL queries
- Query Improvement Chain: Refines queries if initial results are empty
- Vector Search: Finds similar successful queries for improvement
- Schema Validation: Ensures queries follow database schema
Tools:
- SPARQL_QUERY_RUNNER: Executes SPARQL queries against the knowledge graph
- WIKIDATA_QUERY_TOOL: Retrieves data from Wikidata in specific cases
- OUTPUT_MERGE: Combines results from multiple sources
Query Processing Features:
- Automatic query improvement when no results are found
- Token management for large result sets
- Results are automatically saved as temporary CSV files in the user's session directory
- Local CSV storage enables users to perform additional analysis or processing on the results
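The execute-and-save pattern described above can be sketched as follows, assuming SPARQLWrapper and pandas. The endpoint URL, session directory, and function name are placeholders, not the application's actual values.

```python
import pandas as pd
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://example.org/sparql"   # placeholder endpoint
SESSION_DIR = "/tmp/session-demo"         # placeholder session directory


def run_query_to_csv(query: str, filename: str = "results.csv") -> str:
    """Execute a SPARQL query and persist the bindings as a CSV file."""
    client = SPARQLWrapper(ENDPOINT)
    client.setQuery(query)
    client.setReturnFormat(JSON)
    bindings = client.query().convert()["results"]["bindings"]

    # Flatten the JSON bindings into a tabular form.
    rows = [{var: cell["value"] for var, cell in row.items()} for row in bindings]
    path = f"{SESSION_DIR}/{filename}"
    pd.DataFrame(rows).to_csv(path, index=False)
    return path  # the Interpreter Agent (or the user) can pick the file up from here
```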
Interpreter Agent
The Interpreter Agent processes and formats query results for human understanding.
Purpose:
- Processes SPARQL query outputs
- Handles file interpretation
- Generates visualizations
- Provides clear, formatted answers
Input Processing:
- Handles SPARQL query results
- Processes user-submitted files
- Interprets data files
Tools:
- INTERPRETER_TOOL: Main tool for data interpretation and visualization
- SPECTRUM_PLOTTER: Provides a URL to a plot of a spectrum, given its USI
Output Types:
- Text interpretations of query results
- Generated visualizations
- Formatted data summaries
- File path references for downloads
- URLs linking to the Metabolomics Spectrum Resolver tool
Agent Interaction Architecture
The agent system is implemented using LangGraph's StateGraph, where agents operate as nodes in a directed graph with state-managed transitions.
Core Architecture:
```python
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]  # Accumulated messages
    next: str  # Next routing target
```
Workflow Components:
- Graph Structure (see the sketch after this list):
    - Entry_Agent as the designated entry point
    - Nodes representing individual agents
    - Conditional edges defining valid transitions
    - State preservation across transitions
- Routing Logic:
    - Entry_Agent can route to Supervisor or Validator
    - Validator makes routing decisions based on query validation
    - Supervisor dynamically routes to specialized agents
    - Process ends when a valid response is generated or an error is detected
- State Management:
    - Messages accumulate throughout processing
    - Each agent adds its output to the message history
    - State is maintained across all transitions
    - Metadata is preserved for context
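Below is a compact, self-contained sketch of how such a workflow can be assembled with LangGraph. The node implementations are placeholders (real nodes invoke the corresponding AgentExecutor instances), and the edges shown are a simplification of the application's actual routing.

```python
import operator
from typing import Annotated, Sequence, TypedDict

from langchain_core.messages import BaseMessage, HumanMessage
from langgraph.graph import END, StateGraph


class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]  # accumulated history
    next: str                                                 # routing target


def make_node(name: str):
    """Placeholder node factory; real nodes invoke the corresponding AgentExecutor."""
    def node(state: AgentState) -> dict:
        return {"messages": [HumanMessage(content=f"{name} output", name=name)],
                "next": "FINISH"}
    return node


workflow = StateGraph(AgentState)
for agent_name in ["Entry_Agent", "Validator_Agent", "Supervisor_Agent"]:
    workflow.add_node(agent_name, make_node(agent_name))

workflow.set_entry_point("Entry_Agent")
workflow.add_edge("Entry_Agent", "Validator_Agent")

# The "next" field set by the previous node drives conditional routing; FINISH ends the run.
workflow.add_conditional_edges(
    "Validator_Agent",
    lambda state: state["next"],
    {"Supervisor_Agent": "Supervisor_Agent", "FINISH": END},
)
workflow.add_conditional_edges(
    "Supervisor_Agent",
    lambda state: state["next"],
    {"FINISH": END},
)

result = workflow.compile().invoke(
    {"messages": [HumanMessage(content="Which plants contain quercetin?")], "next": ""}
)
```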
Communication Patterns:
- Primary Flow:
    - A user query enters through the Entry Agent and moves through validation and supervision to the specialized agents, ending with a final response
- Conditional Branching:
    - Validator routes based on query validity
    - Supervisor routes based on query content analysis
    - Dynamic routing to specialized agents as needed
- Message Handling:
    - Human messages initiate the workflow
    - Agent responses are added to the message sequence
    - State updates trigger the next agent selection
    - The final response terminates the workflow
This architecture ensures:
- Clear communication paths
- Stateful processing
- Dynamic routing
- Error handling
- Process monitoring
Agent Setup Guidelines
Agent Directory Creation
Create a dedicated folder for your agent within the `app/core/agents/` directory.
Standard File Structure
- Agent (`agent.py`): Copy from an existing agent unless your tool requires private class property access. Refer to "If Your Tool Serves as an Agent" for special cases. Psst... don't let the complexities of Python imports overcomplicate your flow; trust the process!
- Prompt (`prompt.py`): Adapt the prompt for your specific context/tasks.
- Tools (`tool_xxxx.py`) (optional): Inherit from the LangChain `BaseTool`, defining `name`, `description`, an `args_schema` (a Pydantic model for input validation), and the `_run` method for execution. A minimal sketch follows this list.
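The sketch below shows the shape of such a tool. The tool name, input field, and returned text are placeholders for illustration; a real tool would query the knowledge graph (or another service) inside `_run`.

```python
from typing import Type

from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field


class PlantLookupInput(BaseModel):
    """Input schema for the tool; the field name is illustrative."""
    plant_name: str = Field(description="Latin name of the plant to look up")


class PlantLookupTool(BaseTool):
    name: str = "PLANT_LOOKUP"
    description: str = "Checks whether a plant name is present in the knowledge graph."
    args_schema: Type[BaseModel] = PlantLookupInput

    def _run(self, plant_name: str) -> str:
        # Replace with a real lookup; the returned text is sent back to the agent.
        return f"'{plant_name}' was checked (placeholder result)."
```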
Supervisor Configuration
Modify the supervisor prompt (see supervisor prompt) to detect and select your agent. Our AI PR-Agent is triggered automatically through issues and pull requests, so you'll be in good hands!
Configuration Updates
Update `app/config/langgraph.json` to include your agent in the workflow and specify `llm_choice` based on the models defined in `app/config/params.ini`. Available models include:
- OpenAI models: `llm_preview`, `llm_o`, `llm_mini`
- OVH models: `ovh_Meta-Llama-3_1-70B-Instruct`
- Deepseek models: `deepseek_deepseek-chat`, `deepseek_deepseek-reasoner`
- LiteLLM compatible models: `llm_litellm_openai`, `llm_litellm_deepseek`, `llm_litellm_claude`, `llm_litellm_gemini`
Choose the appropriate model based on your agent's requirements for reasoning capabilities and performance. For reference, see langgraph.json. If you need to add a new language model, refer to the Language Model Configuration guide.
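Before editing the workflow configuration, it can help to inspect the existing entries. The snippet below is only a convenience sketch; the JSON structure hinted at in the comment is an assumption, not the verified schema of `langgraph.json`.

```python
import json

# A new agent entry would typically pair the agent's name with one of the
# llm_choice values listed above, e.g. (structure assumed for illustration):
#   "My_New_Agent": {"llm_choice": "llm_mini"}

with open("app/config/langgraph.json", encoding="utf-8") as fh:
    workflow_config = json.load(fh)

print(json.dumps(workflow_config, indent=2))  # inspect existing entries before adding yours
```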
If Your Tool Serves as an Agent
For LLM interaction, make sure the additional class properties are set in `agent.py` (refer to tool_sparql.py and agent.py). Keep it snazzy and smart!