Financial Agents Orchestration Framework: From Algorithmic Trading to Agentic Trading

📌 Key Takeaways

  • Nine Specialized AI Agents: The FinAgent framework maps every algorithmic trading component — from research to execution — to autonomous AI agents coordinated through MCP and A2A protocols.
  • 2.63 Sharpe Ratio: The stock trading strategy achieved the lowest volatility (11.83%) and smallest max drawdown (-3.59%) across all benchmarks during April–December 2024 testing.
  • BTC Outperformance: Minute-level crypto trading returned 8.39% vs. 3.80% buy-and-hold over 17 days, with a 64.7% win rate and Calmar ratio of 166.06.
  • Zero Data Leakage: Strict separation between LLM reasoning and numerical computation ensures alpha agents never see evaluation-window data — a critical safeguard for mission-critical financial AI.
  • Democratization Thesis: Presented at NeurIPS 2025, the framework aims to make professional-grade quantitative trading accessible to the general public through multi-agent orchestration.

The Evolution from Algorithmic to Agentic Trading

The financial trading landscape has already passed through three distinct phases: open-outcry floor trading, telephone-routed orders, and algorithmic trading systems executing pre-programmed strategies at microsecond speeds. Now, researchers at Columbia University’s SecureFinAI Lab are charting a fourth: agentic trading, where multi-agent AI systems powered by large language models autonomously manage the entire investment pipeline.

Presented at the Workshop on Generative AI in Finance at NeurIPS 2025, the paper by Jifeng Li, Arnav Grover, Abraham Alpuerto, Yupeng Cao, and Xiao-Yang Liu introduces a comprehensive orchestration framework that transforms algorithmic trading components into specialized, communicating AI agents. The result is a system that can research strategies, generate alpha signals, manage risk, construct portfolios, and execute trades — all with minimal human intervention.

Financial markets present uniquely challenging environments for AI deployment. Temporal dynamics shift constantly, signal-to-noise ratios are extraordinarily low, and the stakes are mission-critical — errors translate directly into monetary losses. The framework addresses these challenges through rigorous data leakage prevention, structured communication protocols, and auditable memory systems that make every decision traceable and reproducible.

Perhaps most ambitiously, the authors frame agentic trading as a democratization of financial intelligence. Capabilities once available only to hedge funds with teams of PhDs and millions in infrastructure could soon be accessible to individual investors through well-orchestrated multi-agent AI systems.

FinAgent Orchestration Framework Architecture

The FinAgent framework establishes a one-to-one mapping between traditional algorithmic trading pipeline components and specialized AI agent types. This architectural design ensures that every critical function in quantitative trading is handled by a dedicated, autonomous agent with clear responsibilities and interfaces.

The framework comprises nine agent types. The Planner agent handles strategy research, identifying promising trading approaches from published literature. The Orchestrator manages overall strategy development and coordinates the pipeline. Data Agents handle both historical and real-time market data from sources including Polygon.io, yfinance, and Binance APIs.

Alpha Agents generate trading signals based on factor structures, while Risk Agents compute exposures and enforce constraints. Portfolio Agents combine signals and risk diagnostics into position weights through convex optimization. Backtest Agents perform pre-trade analysis, Execution Agents translate weights into market orders with slippage modeling, and Audit Agents provide post-trade verification.

The cross-cutting Memory Agent maintains state across all other agents, recording every decision, prompt, and tool call for complete auditability. Multiple LLMs power different agents — including GPT-4o, Llama3, and FinGPT — allowing each agent to leverage the model best suited to its task. For organizations exploring how AI transforms complex financial workflows, this architecture provides a blueprint for multi-agent orchestration.

Communication Protocols: MCP and A2A for Financial Agents

Agent coordination is arguably the most critical challenge in multi-agent systems, and the FinAgent framework addresses it through two complementary communication protocols: Model Context Protocol (MCP) for top-down orchestration and Agent-to-Agent Protocol (A2A) for peer communication.

MCP defines how the orchestrator issues control messages to agent pools. Each task description includes the node type, unique task ID, declared inputs with JSON schemas, policy flags, timeout specifications, and retry budgets. Agent replies follow a structured format including acknowledgement, execution status, diagnostic logs, and artifact identifiers. System health is tracked through regular heartbeats, enabling the orchestrator to detect and respond to agent failures.
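As a rough illustration of the task and reply fields listed above, the sketch below models them as Python dataclasses and serializes a task to JSON. The class names, field names, and example values (`TaskMessage`, `AgentReply`, `task-0001`, `exclude_test_period`) are assumptions made for this example and are not taken from the paper or from any MCP SDK.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TaskMessage:
    """Hypothetical orchestrator-to-agent control message."""
    node_type: str                      # which agent pool should handle the task
    task_id: str                        # unique task identifier
    inputs: dict                        # declared inputs, each with a JSON schema
    policy_flags: dict = field(default_factory=dict)
    timeout_s: float = 60.0             # timeout specification (assumed unit: seconds)
    retry_budget: int = 2               # how many retries the orchestrator allows


@dataclass
class AgentReply:
    """Hypothetical structured reply from an agent."""
    task_id: str
    ack: bool                           # acknowledgement
    status: str                         # execution status, e.g. "ok" or "failed"
    logs: list                          # diagnostic log lines
    artifact_ids: list                  # identifiers of produced artifacts


task = TaskMessage(
    node_type="alpha",
    task_id="task-0001",
    inputs={"universe": {"type": "array", "items": {"type": "string"}}},
    policy_flags={"exclude_test_period": True},
)

# JSON-serialize the task for transport; sorted keys keep the output stable.
wire = json.dumps(asdict(task), sort_keys=True)

reply = AgentReply(task_id=task.task_id, ack=True, status="ok",
                   logs=[], artifact_ids=["artifact-0001"])
```

Keeping the retry budget and timeout inside the message, rather than in agent-side configuration, lets the orchestrator vary them per task.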

A2A enables direct peer-to-peer communication among agents through four message types: ask (request information), tell (share updates), propose (suggest actions), and confirm (validate decisions). Each message carries role tags and context IDs, and agents share progress at fixed intervals. When an agent fails, peers can take over its tasks or the orchestrator can reassign work — ensuring resilient pipeline execution even under adverse conditions.
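The four message types can be sketched as an enum plus a small immutable message record. Everything beyond the four type names, the role tags, and the context ID is an assumption for illustration, including the `reply_kind` pairing (ask answered by tell, propose answered by confirm), which the source does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class MessageType(Enum):
    ASK = "ask"          # request information
    TELL = "tell"        # share updates
    PROPOSE = "propose"  # suggest actions
    CONFIRM = "confirm"  # validate decisions


@dataclass(frozen=True)
class A2AMessage:
    """Hypothetical peer-to-peer message with role tags and a context ID."""
    kind: MessageType
    sender_role: str
    receiver_role: str
    context_id: str
    body: dict


def reply_kind(incoming):
    """Assumed pairing: ask -> tell, propose -> confirm, otherwise no reply."""
    pairs = {
        MessageType.ASK: MessageType.TELL,
        MessageType.PROPOSE: MessageType.CONFIRM,
    }
    return pairs.get(incoming.kind)


question = A2AMessage(MessageType.ASK, "portfolio", "risk", "ctx-42",
                      {"need": "current gate status"})
```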

All exchanges are time-stamped and JSON-serialized through the context protocol schema, with strict exclusion of test-period data from any messages visible to LLM agents. This design ensures that the communication layer itself cannot become a vector for data leakage.

The Memory Agent: Auditability and State Persistence

In mission-critical financial applications, every decision must be traceable. The Memory Agent addresses this requirement by recording all states, prompts, tool calls, and decisions across trading runs using a sophisticated UUID-based identification system.

Each UUID is generated deterministically via SHA256(role || task || params || time), guaranteeing four critical properties: immutability (UUIDs cannot be retroactively modified), identity matching (identical inputs always produce the same UUID), safe retrieval (any historical state can be precisely located), and isolation (each agent’s records are cleanly separated).
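A minimal sketch of such a deterministic identifier, using Python's standard `hashlib`, might look like the following. The function name `memory_id`, the canonical JSON encoding of `params`, and the unit-separator delimiter are assumptions; the source only gives the concatenation order role, task, params, time.

```python
import hashlib
import json


def memory_id(role, task, params, time_iso):
    """Derive a deterministic ID from SHA256(role || task || params || time)."""
    # Canonical JSON so that dict key order cannot change the hash.
    canonical_params = json.dumps(params, sort_keys=True, separators=(",", ":"))
    # A delimiter avoids ambiguity such as ("ab", "c") versus ("a", "bc").
    payload = "\x1f".join([role, task, canonical_params, time_iso])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = memory_id("alpha", "propose_factors", {"lookback": 20, "universe": "btc"},
                  "2025-07-27T00:00:00Z")
second = memory_id("alpha", "propose_factors", {"universe": "btc", "lookback": 20},
                   "2025-07-27T00:00:00Z")
later = memory_id("alpha", "propose_factors", {"lookback": 20, "universe": "btc"},
                  "2025-07-27T00:01:00Z")
```

Here `first` and `second` are identical because the inputs are identical up to key order, while `later` differs because the timestamp differs.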

A memory entry contains the UUID, agent role, plan step reference, features hash, metrics summary, and timestamp. Critically, the Memory Agent stores only structural summaries — never raw prices, returns, or profit-and-loss data. This design choice prevents evaluation-window information from leaking into future model training or LLM prompts, while still providing complete audit trails for regulatory compliance and strategy refinement.
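The entry layout described above can be sketched as a frozen dataclass that refuses raw market data at construction time. The field types, the `FORBIDDEN_KEYS` set, and the example values are hypothetical; the source lists only the field names and the rule that raw prices, returns, and P&L are never stored.

```python
from dataclasses import dataclass

# Keys that would indicate raw market data (assumed names for this sketch).
FORBIDDEN_KEYS = {"prices", "returns", "pnl"}


@dataclass(frozen=True)
class MemoryEntry:
    uuid: str
    agent_role: str
    plan_step: str
    features_hash: str
    metrics_summary: dict   # structural summaries only
    timestamp: str

    def __post_init__(self):
        # Reject any entry whose summary carries raw prices, returns, or P&L.
        leaked = FORBIDDEN_KEYS & set(self.metrics_summary)
        if leaked:
            raise ValueError(f"raw data not allowed in memory: {sorted(leaked)}")


entry = MemoryEntry(
    uuid="0" * 64,
    agent_role="risk",
    plan_step="step-3",
    features_hash="f" * 64,
    metrics_summary={"gates_passed": 5, "n_features": 100},
    timestamp="2025-07-27T00:00:00Z",
)
```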

This approach enables full replay capability: any past trading session can be reconstructed from its memory entries, allowing researchers to diagnose failures, verify signals, and validate that the system operated within its designated parameters.

Alpha and Risk Agents: Signal Generation Without Data Leakage

Data leakage — where future information inadvertently influences model training — is perhaps the most insidious risk in quantitative finance. The FinAgent framework addresses this through an architecturally enforced separation between LLM reasoning and numerical computation.

Alpha Agents are restricted to proposing factor structures based on published academic literature only. They can reason about what types of signals might be predictive (momentum factors, mean-reversion indicators, microstructure features) but never receive access to evaluation-window data. All numerical signal construction, return mapping, forecasting, and backtesting are handled by deterministic tool-based modules that operate independently of the LLMs.

For stock trading, alpha signals include momentum factors comparing 20-day versus 60-day returns and cross-sectional 120-day rank. For BTC trading, the system generates over 100 microstructure features at one-minute frequency, including order-flow imbalance, bid-ask spreads, volume spikes, funding rates, RSI indicators, MACD signals, Bollinger Band features, and momentum metrics across eight time horizons.
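To make the stock momentum factors concrete, here is a small sketch under simplifying assumptions: lookbacks are counted in daily closing bars, the 20-day versus 60-day comparison is taken as a simple difference of trailing returns, and the cross-sectional rank is scaled to [0, 1]. The source does not give the exact formulas, and all function names are hypothetical.

```python
def trailing_return(closes, lookback):
    """Simple return over the last `lookback` bars; needs lookback + 1 prices."""
    if len(closes) < lookback + 1:
        raise ValueError("not enough history")
    return closes[-1] / closes[-1 - lookback] - 1.0


def momentum_spread(closes, fast=20, slow=60):
    """Assumed form of the 20-day versus 60-day comparison: fast minus slow."""
    return trailing_return(closes, fast) - trailing_return(closes, slow)


def cross_sectional_rank(values):
    """Map each ticker to a rank in [0, 1], where 1 is the highest value."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {t: (i / (n - 1) if n > 1 else 0.5) for i, t in enumerate(ordered)}


# Synthetic prices: flat for 41 bars, then a steady climb over the last 20.
closes = [100.0] * 41 + [101.0 + i for i in range(20)]
spread = momentum_spread(closes)

# Hypothetical 120-day returns for three tickers.
ranks = cross_sectional_rank({"AAPL": 0.10, "MSFT": 0.30, "NVDA": 0.20})
```

In the framework as described, this kind of computation would live in a deterministic tool module, with the LLM only proposing which factors to build.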

Risk Agents compute exposures and enforce concentration, volatility, and drawdown constraints. They produce binary pass/fail gates — vol_ok, beta_ok, sector_ok, dd_ok, position_ok — that must all clear before any trade executes. Notably, risk constraints are calibrated differently for crypto versus equity, reflecting BTC’s higher volatility regime. This systematic approach to risk management demonstrates how AI agents can enforce institutional-grade controls.
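The all-gates-must-clear rule can be sketched with a small record type. The gate names come from the text above; the class and method names are assumptions for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RiskGates:
    vol_ok: bool
    beta_ok: bool
    sector_ok: bool
    dd_ok: bool
    position_ok: bool

    def all_clear(self):
        """True only when every gate passes."""
        return all(getattr(self, f.name) for f in fields(self))

    def failed(self):
        """Names of the gates that did not pass, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


gates = RiskGates(vol_ok=True, beta_ok=True, sector_ok=False,
                  dd_ok=True, position_ok=True)
```

Returning the list of failed gates, and not just a single boolean, gives the audit trail a reason for each blocked trade.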

Portfolio Construction and Execution Agents

Portfolio Agents combine alpha signals and risk diagnostics into portfolio weights under multiple constraints. For stock trading, they implement both long-only and long-short rules with capital allocation limits, turnover constraints, and convex optimization via tool-based solvers. For BTC trading, a long-flat regime applies with signal smoothing for turnover control and minimum-change thresholds to prevent excessive trading.

Position sizing follows regime-dependent rules. The framework identifies four market regimes — strong trend, breakout, sideways, and high volatility — each receiving different base size factors: 1.8× for strong trends, 2.5× for breakouts, 0.7× for sideways markets, and 0.8× for high-volatility environments. Additional momentum boosts of up to 30% apply when momentum exceeds the 70th percentile. Maximum leverage is capped at 4.0×, with individual positions limited to 3–5%.
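One way to read these sizing rules is as a multiplier applied to a base position. The sketch below uses the regime factors, the 30% maximum boost, the 70th-percentile threshold, and the 4.0× cap from the text. How the boost scales above the threshold is not stated in the source, so the linear ramp here is purely an assumption, as are the names.

```python
REGIME_FACTOR = {
    "strong_trend": 1.8,
    "breakout": 2.5,
    "sideways": 0.7,
    "high_vol": 0.8,
}
MAX_LEVERAGE = 4.0      # cap on the final multiplier
MAX_BOOST = 0.30        # up to 30% extra size
BOOST_THRESHOLD = 0.70  # boost applies above the 70th percentile


def size_multiplier(regime, momentum_percentile):
    """Base regime factor times an assumed linear momentum boost, capped."""
    if not 0.0 <= momentum_percentile <= 1.0:
        raise ValueError("momentum_percentile must be in [0, 1]")
    factor = REGIME_FACTOR[regime]
    boost = 0.0
    if momentum_percentile > BOOST_THRESHOLD:
        # Assumed: ramp linearly from 0% at the threshold to 30% at the top.
        ramp = (momentum_percentile - BOOST_THRESHOLD) / (1.0 - BOOST_THRESHOLD)
        boost = MAX_BOOST * ramp
    return min(factor * (1.0 + boost), MAX_LEVERAGE)
```

With these numbers the largest multiplier is a breakout with full boost, 2.5 × 1.3 = 3.25, which stays under the 4.0× cap.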

Execution Agents translate portfolio weights into market orders with realistic friction modeling. For BTC, simulated intraday orders use historical bid/ask data and order book depth, accounting for latency, exchange fees, spreads, and partial fills. Orders are submitted only when all gates — Data, Alpha, Risk, and Portfolio — pass simultaneously, preventing any trade from bypassing the pipeline’s safeguards.

Risk management operates continuously through drawdown-based position reduction: a 20% cut at 1% drawdown, 30% at 2%, and 50% at 3%. Step-wise emergency cuts trigger at 1.5% (-50%) and 2.5% (-75%). A hard maximum drawdown limit of -3.0% closes all positions immediately. Volatility-based stop-losses centered at -0.8% scale dynamically between 0.5× and 1.5× based on current market conditions.
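The thresholds above can be written down as a lookup, with one caveat: the source does not say how the gradual schedule and the emergency schedule combine, or how the 50% cut at 3% coexists with the hard stop at the same level. The sketch below assumes the larger applicable cut wins and that the hard limit takes precedence, which is only one possible reading.

```python
# (drawdown threshold, fraction of the position to cut); drawdowns are positive.
GRADUAL_CUTS = [(0.01, 0.20), (0.02, 0.30), (0.03, 0.50)]
EMERGENCY_CUTS = [(0.015, 0.50), (0.025, 0.75)]
HARD_LIMIT = 0.03  # at or beyond this drawdown, close everything


def position_cut(drawdown):
    """Return the fraction of the position to cut for a given drawdown."""
    if drawdown < 0:
        raise ValueError("pass drawdown as a positive fraction, e.g. 0.015")
    if drawdown >= HARD_LIMIT:
        return 1.0  # assumed: the hard limit overrides the 50% tier at 3%
    cut = 0.0
    for threshold, size in GRADUAL_CUTS + EMERGENCY_CUTS:
        if drawdown >= threshold:
            cut = max(cut, size)  # assumed: the larger applicable cut wins
    return cut
```

Under this reading the effective schedule is 20% at 1%, 50% at 1.5%, 75% at 2.5%, and a full exit at 3%.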

Stock Trading Backtesting Results and Performance

The stock trading evaluation covers a universe of seven major stocks (AAPL, MSFT, GOOGL, JPM, TSLA, NVDA, and META) using hourly bars with a three-month rolling training window and $100,000 initial capital. The test window spans April 2024 through December 2024.

The agentic trading system delivered a 20.42% total return (31.08% annualized) with stronger risk-adjusted metrics than every index ETF benchmark. Its Sharpe ratio of 2.63 exceeded SPY (1.86), QQQ (1.79), IWM (0.79), and VTI (1.79). The system also recorded the lowest volatility of any strategy tested at 11.83%, below SPY’s 13.49% and well below QQQ’s 18.38% and IWM’s 21.61%.

The maximum drawdown of just -3.59% represents the tightest risk control of any strategy tested, compared to SPY’s -8.89%, QQQ’s -14.13%, and IWM’s -11.60%. This performance profile reflects the framework’s layered risk management approach, where every trade must pass through multiple agent gates before execution.
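For readers who want to reproduce metrics of this kind on their own return series, the standard textbook definitions are sketched below. The paper's exact conventions (risk-free rate, annualization factor for hourly bars) are not given in this summary, so the parameters here are assumptions and the inputs are synthetic.

```python
import math


def sharpe_ratio(returns, periods_per_year, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from per-period returns (sample std. dev.)."""
    excess = [r - risk_free_per_period for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Synthetic example: the curve peaks at 110 and dips to 99 before recovering.
dd = max_drawdown([100.0, 110.0, 99.0, 120.0])
sr = sharpe_ratio([0.01, -0.005, 0.02, 0.0], periods_per_year=252)
```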

In an important caveat for transparency, the equally-weighted (EW) benchmark with weekly rebalancing achieved the highest total return at 47.46% with a 3.37 Sharpe ratio — outperforming the agentic system on raw returns. However, the EW strategy carried significantly higher volatility (22.54%) and a maximum drawdown of -16.21%, illustrating the fundamental trade-off between aggressive return seeking and disciplined risk management that the agentic system prioritizes.

BTC Minute-Level Agentic Trading Performance

The crypto trading evaluation pushes the framework into high-frequency territory, operating on BTC/USDT minute-level bars with a seven-day rolling window. The test period covers 17 days from July 27 to August 13, 2025, encompassing approximately 23,500 one-minute observations with $100,000 initial capital.

The agentic system returned 8.39% versus 3.80% for buy-and-hold — an excess return of 4.59 percentage points. The Sharpe ratio of 0.378 more than doubled the buy-and-hold baseline of 0.170. Maximum drawdown was contained to -2.80% versus -5.26%, and the Calmar ratio of 166.06 dwarfed the benchmark’s 23.30, reflecting exceptional return-to-drawdown efficiency.
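A Calmar ratio this large follows from annualizing a 17-day return. The summary does not state the annualization convention, but compounding the period return over a 365-day year and dividing by the absolute maximum drawdown lands close to both reported figures, so that convention is assumed in the sketch below.

```python
def annualized_return(total_return, days, days_per_year=365):
    """Compound a period return up to a full year (assumed convention)."""
    return (1.0 + total_return) ** (days_per_year / days) - 1.0


def calmar_ratio(total_return, max_drawdown, days):
    """Annualized return divided by the magnitude of the maximum drawdown."""
    return annualized_return(total_return, days) / abs(max_drawdown)


# Inputs taken from the reported 17-day BTC results.
agent = calmar_ratio(0.0839, -0.0280, 17)     # close to the reported 166.06
buy_hold = calmar_ratio(0.0380, -0.0526, 17)  # close to the reported 23.30
```

Small differences from the published values are expected because the inputs here are rounded to two decimal places. The exercise also shows why the figure should be read with care: compounding 17 days of returns over a year amplifies any short-window result.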

The underlying XGBoost model uses 300 trees with a maximum depth of 6, retrained every 24 hours through a rolling walk-forward process. Over 100 features are engineered at the minute level, including smoothed returns, realized and EWMA volatility, RSI and MACD indicators, momentum metrics across eight time horizons, GARCH-style volatility regime identifiers, support/resistance levels, mean-reversion z-scores, and volume statistics.
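The rolling walk-forward schedule can be sketched as an index generator: train on the trailing seven days of minute bars, predict the next 24 hours, then slide forward. The function name and the end-exclusive index convention are assumptions. `XGB_PARAMS` uses the parameter names of XGBoost's scikit-learn wrapper for the two settings the text reports; all other hyperparameters are unknown and left out.

```python
MINUTES_PER_DAY = 1440

# The two reported settings, named as in XGBoost's scikit-learn wrapper.
XGB_PARAMS = {"n_estimators": 300, "max_depth": 6}


def walk_forward_windows(n_bars, train_bars=7 * MINUTES_PER_DAY,
                         step_bars=MINUTES_PER_DAY):
    """Yield (train_start, train_end, test_end) triples, end-exclusive.

    Each model trains on [train_start, train_end) and is used on
    [train_end, test_end), so no test bar ever precedes its training data.
    """
    train_end = train_bars
    while train_end < n_bars:
        test_end = min(train_end + step_bars, n_bars)
        yield (train_end - train_bars, train_end, test_end)
        train_end = test_end


# Ten days of minute bars give three retraining windows.
windows = list(walk_forward_windows(10 * MINUTES_PER_DAY))
```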

Trading activity was notably disciplined: just 17 trades over 17 days (approximately one per day), with a median holding time of 39 minutes and a 64.7% win rate. Signal smoothing with exponential moving averages (α₁=0.25, α₂=0.15) and a dead band width of 0.08 prevent excessive trading, while a minimum holding time of 8 minutes eliminates noise-driven micro-trades.
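The turnover controls can be illustrated with a short long-flat position rule. The parameter values come from the text, but how they fit together is assumed: the two smoothing factors are applied as cascaded EMAs, the dead band is centered on zero (so ±0.04), and the minimum hold is counted in one-minute bars since the last position change.

```python
def ema(values, alpha):
    """Exponential moving average seeded with the first value."""
    out, prev = [], None
    for v in values:
        prev = v if prev is None else alpha * v + (1.0 - alpha) * prev
        out.append(prev)
    return out


def long_flat_positions(raw_signal, alpha1=0.25, alpha2=0.15,
                        dead_band=0.08, min_hold=8):
    """Turn a raw signal into 0/1 positions with smoothing and hysteresis."""
    smooth = ema(ema(raw_signal, alpha1), alpha2)  # assumed: two cascaded EMAs
    position, held = 0, min_hold  # start flat and free to trade
    positions = []
    for s in smooth:
        target = position                 # inside the dead band: keep position
        if s > dead_band / 2:
            target = 1                    # go (or stay) long
        elif s < -dead_band / 2:
            target = 0                    # go (or stay) flat
        if target != position and held >= min_hold:
            position, held = target, 0    # change allowed: reset hold counter
        else:
            held += 1
        positions.append(position)
    return positions


# Synthetic signal: bullish for 20 minutes, then bearish for 20 minutes.
positions = long_flat_positions([1.0] * 20 + [-1.0] * 20)
```

In this toy run the position stays long for several bars after the raw signal flips, because the smoothed signal has to decay through the dead band before an exit is triggered.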

Implications for Democratizing Quantitative Finance

The FinAgent orchestration framework arrives at a moment when agentic AI systems are rapidly proliferating across finance. A comparison with open-source alternatives reveals the landscape’s maturity: AI Hedge Fund leads with 42,300 GitHub stars and 18 specialized agents, while TradingAgents follows with 24,800 stars and 6 agents. The Columbia framework differentiates through its rigorous data leakage prevention and formally specified communication protocols.

The democratization thesis carries profound implications. Professional-grade quantitative trading — historically the domain of institutional investors with substantial technical and capital resources — could become accessible to individual traders through well-orchestrated AI agent systems. The framework’s modular architecture means components can be upgraded independently: a better alpha model, an improved risk engine, or a more sophisticated execution algorithm can each be swapped without rebuilding the entire pipeline.

The authors acknowledge important limitations. Evaluation horizons remain short (8 months for stocks, 17 days for crypto), and the stock universe is limited to seven large-cap names. The EW benchmark’s outperformance on both total return and Sharpe ratio suggests that the agentic system’s conservative risk management, while producing lower volatility and shallower drawdowns, may leave significant alpha on the table. No ablation studies isolate individual component contributions, making it difficult to determine which agents contribute most to performance.

Looking ahead, the convergence of LLM capabilities, structured agent communication protocols, and rigorous financial engineering principles points toward a future where agentic trading systems become standard infrastructure for both institutional and retail participants. The key challenge remains ensuring that these systems operate with the same auditability, risk discipline, and data integrity standards demanded by mission-critical financial applications — exactly the principles this framework was designed to enforce.

Frequently Asked Questions

What is agentic trading and how does it differ from algorithmic trading?

Agentic trading uses multi-agent AI systems powered by large language models (LLMs) to autonomously manage the entire trading pipeline — from research and signal generation to risk management and execution. Unlike traditional algorithmic trading which relies on pre-programmed rules and fixed strategies, agentic trading employs specialized AI agents that can reason about market conditions, adapt strategies dynamically, and coordinate through structured communication protocols like MCP and A2A.

What is the FinAgent orchestration framework?

The FinAgent orchestration framework is an end-to-end architecture developed by Columbia University researchers that maps traditional algorithmic trading components to nine specialized AI agent types (planner, orchestrator, data agents, alpha agents, risk agents, portfolio agents, backtest agents, execution agents, and audit agents), supported by a cross-cutting memory agent. These agents coordinate through Model Context Protocol (MCP) for top-down control and Agent-to-Agent (A2A) protocol for peer communication.

How does the framework prevent data leakage in AI trading?

The framework enforces strict separation between LLM reasoning and numerical computation. Alpha agents propose factor structures based only on published literature — they never see evaluation-window data. All numerical signal construction, return mapping, forecasting, and backtesting are handled by deterministic tool-based modules. Features are computed up to time t-1, targets use t+1, with a minimum 2-minute gap between feature timestamps and labels.
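The timing rule in the last sentence can be sketched as index arithmetic on one-minute bars: features stop at bar t-1, the label is realized at bar t+1, and the two are therefore at least two bars apart. The function name, the lookback length, and the choice of label (return from bar t to bar t+1) are assumptions for illustration.

```python
def make_samples(closes, lookback=5, gap=2):
    """Pair feature bars ending at t-1 with a label realized at t+1."""
    samples = []
    for t in range(lookback, len(closes) - 1):
        feature_idx = list(range(t - lookback, t))  # bars t-lookback .. t-1
        label_idx = t + 1
        if label_idx - feature_idx[-1] < gap:
            raise ValueError("label too close to the last feature bar")
        label = closes[label_idx] / closes[t] - 1.0  # assumed label definition
        samples.append((feature_idx, label_idx, label))
    return samples


# Synthetic one-minute closes.
samples = make_samples([100.0 + i for i in range(10)])
```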

What backtesting results did the agentic trading system achieve?

In stock trading across 7 major stocks (April-December 2024), the system achieved a 20.42% total return with a 2.63 Sharpe ratio, the lowest volatility at 11.83%, and smallest maximum drawdown of just -3.59%. For BTC minute-level trading over 17 days, it returned 8.39% vs. 3.80% buy-and-hold, with a 64.7% win rate, -2.80% max drawdown, and a Calmar ratio of 166.06.

What role does the memory agent play in the framework?

The memory agent records all states, prompts, tool calls, and decisions across trading runs using UUID-based identification. Each UUID is generated via SHA256(role||task||params||time) for deterministic identification. It stores only structural summaries — never raw prices, returns, or P&L data — ensuring complete auditability and replay capability while preventing data leakage between evaluation periods.
