Contributing
Development Setup
- Clone the repository and install prerequisites (see Getting Started)
- Start the infrastructure:
docker-compose up -d - Source environment variables:
source .env
Project Structure
The project is a Cargo workspace with four crates:
crates/
├── chatbot/ # Main binary — agent loop, HTTP server, MCP client
├── mcp-server-fraud/ # Fraud detection MCP server
├── mcp-server-kyc/ # KYC verification MCP server
└── mcp-server-funding/# Deposit/withdrawal MCP server
The evaluation framework is a separate Python project under evals/.
Building
# Build all crates
cargo build
# Build a specific crate
cargo build -p chatbot
cargo build -p mcp-server-fraud
Adding a New MCP Server
-
Create a new crate under
crates/: -
Add it to
Cargo.tomlworkspace members - Implement tools using RMCP's
#[tool]macro on an Axum + RMCP server (follow existing servers as examples) - Add hardcoded test user data for
test_user_1throughtest_user_5 - Add the server to
docker-compose.ymlandchatbot.toml - Update the system prompt in
prompts/system.txtwith tool chaining rules
Adding Eval Test Cases
- Add cases to an existing dataset file in
evals/datasets/, or create a new YAML file - Each case needs:
id,description,input(messages + test_user), andexpected(tool_calls + scoring thresholds) - Make sure the
test_usermaps to deterministic MCP data that supports your scenario - Run
python run_eval.py syncto push to Langfuse - Run
python run_eval.py run --run-name "test"to verify
Adding a New Judge Dimension
- Create a new YAML file in
evals/judges/ - Define the prompt template with a 1-5 rubric and concrete examples
- Set
temperature: 0for consistent scoring - Reference the new dimension in test case
expected.scoringthresholds
Modifying the System Prompt
The system prompt lives in prompts/system.txt. Key rules to preserve:
- Never say "fraud" — use "security review" instead
- Never reveal internal flag types or system details
- Follow mandatory tool chaining rules (documented in the prompt)
- Offer escalation when users are frustrated
After changes, run the full eval suite to check for regressions:
Code Style
- Rust: Follow standard Rust conventions. Use
cargo fmtandcargo clippy - Python: Follow PEP 8. The eval code uses Click for CLI and PyYAML for config
Commit Messages
Follow the Conventional Commits style used in the repository: