Tutorial 2: Building a Multi-Tool Environment
What you'll learn: Create an environment with multiple tools and see how agents use them together.
Prerequisites: Complete Tutorial 1
Step 1: Create tools for a calculator
Create calculator_env.py:
from corral.backend.env import Environment
from corral.backend.tool import tool


@tool
def add_numbers(a: float, b: float) -> float:
    """Add two numbers together.

    Args:
        a: first number
        b: second number
    """
    return a + b


@tool
def multiply_numbers(a: float, b: float) -> float:
    """Multiply two numbers.

    Args:
        a: first number
        b: second number
    """
    return a * b


@tool
def subtract_numbers(a: float, b: float) -> float:
    """Subtract b from a.

    Args:
        a: first number
        b: second number
    """
    return a - b
Notice we created three separate tools. Each tool does one thing clearly.
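One reason to keep each tool small and well documented is that the function name, type hints, and docstring are the only description of the tool that exists. The sketch below uses only the standard library to show what can be read from such a function. It is an illustration, not corral's actual mechanism: the `describe` helper is hypothetical, and whether the `@tool` decorator reads the same metadata is an assumption.

```python
import inspect


# Plain stand-in for the tutorial's tool, without the corral @tool decorator.
def add_numbers(a: float, b: float) -> float:
    """Add two numbers together.

    Args:
        a: first number
        b: second number
    """
    return a + b


def describe(func) -> dict:
    """Hypothetical helper: collect the metadata a clear tool exposes."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # First docstring line serves as a one-line summary.
        "summary": inspect.getdoc(func).splitlines()[0],
        # Map each parameter name to the name of its annotated type.
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }


info = describe(add_numbers)
print(info)
```

A tool with a vague name or a missing docstring would produce an equally vague description here, which is why each of the three tools states exactly one operation.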
Step 2: Create the environment
Add to calculator_env.py:
class CalculatorEnvironment(Environment):
    def __init__(
        self,
        task_id: str,
        problem: str,
        answer: float,
        base_work_dir=BASE_WORK_DIR,  # assumed to be defined or imported as in Tutorial 1
    ):
        self.problem = problem
        self.answer = answer
        super().__init__(task_id, base_work_dir)
        # Add all three tools defined in Step 1
        self.add_tool(add_numbers)
        self.add_tool(multiply_numbers)
        self.add_tool(subtract_numbers)

    def get_task_prompt(self) -> str:
        return f"Calculate: {self.problem}\nSubmit your answer as a number."

    def score(self) -> float:
        # No submission means no credit
        if self.state.submitted_answer is None:
            return 0.0
        try:
            result = float(self.state.submitted_answer)
            # Accept answers within a small tolerance of the expected value
            return 1.0 if abs(result - self.answer) < 0.001 else 0.0
        except (ValueError, TypeError):
            # Non-numeric submissions score zero
            return 0.0
The environment now has three tools available. The agent will need to choose which ones to use.
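The scoring rule in `score` can be tried on its own, outside the environment. The following is a minimal sketch of the same logic as a standalone function; `score_answer` and its `tolerance` parameter are hypothetical names introduced here for illustration and are not part of corral.

```python
from typing import Optional


def score_answer(
    submitted: Optional[str], expected: float, tolerance: float = 0.001
) -> float:
    """Return 1.0 if the submission is numerically close to expected, else 0.0."""
    if submitted is None:
        return 0.0
    try:
        result = float(submitted)
    except (ValueError, TypeError):
        # Submissions that cannot be parsed as a number score zero.
        return 0.0
    return 1.0 if abs(result - expected) < tolerance else 0.0


print(score_answer("25", 25))           # exact match
print(score_answer("25.0004", 25))      # within tolerance
print(score_answer("25.01", 25))        # outside tolerance
print(score_answer("twenty-five", 25))  # not a number
```

The tolerance absorbs small floating-point differences, so a submission such as "25.0" is accepted when the expected answer is 25.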
Step 3: Create complex tasks
Add to calculator_env.py:
from corral.backend.server import create_benchmark_server
import uvicorn

environments = {
    "simple_add": CalculatorEnvironment("simple_add", "10 + 5", 15),
    "two_step": CalculatorEnvironment("two_step", "(10 + 5) * 2", 30),
    "three_step": CalculatorEnvironment("three_step", "10 * 3 - 5", 25),
}

if __name__ == "__main__":
    app = create_benchmark_server(environments)
    uvicorn.run(app, host="0.0.0.0", port=8000)
Notice that the tasks increase in complexity: simple_add needs a single tool call, while two_step and three_step each require the agent to chain two different tools.
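To check that the expected answers match the tools, each task can be decomposed by hand into the calls an agent would need to make. The sketch below uses plain Python stand-ins for the three tools (no corral decorator), so it runs without the library; the `results` and `expected` dictionaries are introduced here only for the comparison.

```python
# Plain stand-ins for the three tools from Step 1.
def add_numbers(a: float, b: float) -> float:
    return a + b


def multiply_numbers(a: float, b: float) -> float:
    return a * b


def subtract_numbers(a: float, b: float) -> float:
    return a - b


# Each task written as the chain of tool calls that solves it.
results = {
    # "10 + 5": one call
    "simple_add": add_numbers(10, 5),
    # "(10 + 5) * 2": add first, then multiply the intermediate result
    "two_step": multiply_numbers(add_numbers(10, 5), 2),
    # "10 * 3 - 5": multiply first, then subtract from the intermediate result
    "three_step": subtract_numbers(multiply_numbers(10, 3), 5),
}

# Expected answers as given in the environments dictionary.
expected = {"simple_add": 15, "two_step": 30, "three_step": 25}

for task_id, value in results.items():
    print(task_id, value, value == expected[task_id])
```

In the chained tasks the output of one tool becomes an input to the next, which is the behavior to look for in the agent's trace.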
Step 4: Run with verbose output
Start the server:
python calculator_env.py
In another terminal, create run_verbose.py:
from corral.run import CorralRunner
from corral.router import CorralRouter
from corral.agents import ReActAgent

interface = CorralRouter(base_url="http://localhost:8000")
agent = ReActAgent(model="openai/gpt-4o", max_iterations=10)
runner = CorralRunner(interface, agent)

result = runner.bench(
    task_ids=["three_step"],
    trials_per_task=1,
    verbose=True,  # Enable verbose output
)
print(f"\nScore: {result.average_score():.2f}")

# Check tool usage
for trial in result.all_results:
    stats = trial.tool_statistics
    print(f"Tools used: {stats['tools_used']}")
    print(f"Total tool calls: {stats['total_calls']}")
Run it:
python run_verbose.py
Watch the agent's reasoning process. For three_step (10 * 3 - 5), you should see it call multiply_numbers first, then subtract_numbers. The agent conversation is saved in JSON files in your working directory.
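The two statistics printed by run_verbose.py can be understood as a summary of the sequence of tool calls. The following is a rough sketch of that idea with the standard library. The `call_log` list and the `summarize_calls` helper are hypothetical; only the `tools_used` and `total_calls` keys come from the script above, and the real structure of `tool_statistics` may differ.

```python
from collections import Counter

# Hypothetical record of the calls an agent might make on "three_step".
call_log = ["multiply_numbers", "subtract_numbers"]


def summarize_calls(calls: list[str]) -> dict:
    """Summarize a sequence of tool names into usage statistics."""
    counts = Counter(calls)
    return {
        "tools_used": sorted(counts),          # distinct tool names
        "total_calls": sum(counts.values()),   # every call, including repeats
        "calls_per_tool": dict(counts),        # extra breakdown, not in the tutorial
    }


stats = summarize_calls(call_log)
print(f"Tools used: {stats['tools_used']}")
print(f"Total tool calls: {stats['total_calls']}")
```

A total well above the number of distinct tools would suggest the agent retried or repeated calls, which is worth checking against the verbose trace.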
What you accomplished
- ✅ Created multiple related tools
- ✅ Built tasks requiring multi-step reasoning
- ✅ Enabled verbose output to see agent thinking
- ✅ Examined tool usage statistics