Skip to content

Financial Services Support Chatbot

A multi-turn agentic chatbot for financial services customer support, built in Rust with an integrated LLM-as-judge evaluation framework.

What Is This?

This project demonstrates how to build a production-quality AI support agent with:

  • Agentic tool use — the chatbot calls domain-specific tools (fraud, KYC, funding) via the Model Context Protocol (MCP) to answer customer queries
  • Multi-turn conversations — session-based conversation history for natural back-and-forth
  • Observability — full tracing and scoring via Langfuse
  • Automated evaluation — golden datasets with LLM-as-judge scoring across 5 dimensions

The chatbot acts as a support agent for a fictional financial services platform, handling queries about fraud flags, KYC verification, deposits, and withdrawals.

Key Features

Feature Description
Agentic Loop LLM decides which tools to call, chains results, and responds
MCP Integration Three MCP servers expose domain tools over streamable HTTP
Langfuse Tracing Every conversation turn is traced with tool calls and results
Golden Datasets 12 test cases across 5 scenario categories
LLM-as-Judge 5 scoring dimensions: correctness, empathy, directness, policy compliance, tool usage
CI-Gatable Evals Exit code 1 on threshold failure for pipeline integration

Tech Stack

  • Rust (Tokio, Axum, async-openai, RMCP) — chatbot and MCP servers
  • Python (Click, OpenAI SDK, Langfuse SDK) — evaluation framework
  • Langfuse v3 — observability and evaluation platform
  • Docker Compose — local infrastructure orchestration

Project Layout

chatbot/
├── crates/
│   ├── chatbot/           # Main chatbot binary
│   ├── mcp-server-fraud/  # Fraud detection MCP server
│   ├── mcp-server-kyc/    # KYC verification MCP server
│   └── mcp-server-funding/# Deposit/withdrawal MCP server
├── prompts/
│   └── system.txt         # System prompt with policy rules
├── evals/
│   ├── run_eval.py        # Evaluation CLI
│   ├── datasets/          # Golden test datasets (YAML)
│   └── judges/            # LLM-as-judge definitions (YAML)
├── docker-compose.yml     # Langfuse + MCP server infrastructure
├── chatbot.toml           # Chatbot configuration
└── docs/                  # This documentation