Backtesting Frameworks: Execution-ready selection guide
Compress selection, performance trade-offs, bias guardrails, and validation into one page.
This is a framework selection and validation workflow, not investment advice.
- Asset: US equities
- Frequency: daily
- Strategy style: trend / mean-reversion / stat-arb
- Execution fidelity: medium (event-driven preferred)
- Universe size: 500 symbols
- Data source: vendor + corporate actions
- Costs: 1 bps commission, 5 bps slippage
- Compute: local + 1 cloud node
- Target: research now, live trading in 3 months
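A scenario like the one above is easiest to keep honest when it lives in code rather than prose. A minimal sketch, with illustrative field names that are not tied to any framework:

```python
# Hypothetical scenario config; every key and value mirrors the brief above.
SCENARIO = {
    "asset_class": "us_equities",
    "frequency": "daily",
    "strategy_styles": ["trend", "mean_reversion", "stat_arb"],
    "execution_fidelity": "medium",          # event-driven preferred
    "universe_size": 500,
    "data": {"source": "vendor", "corporate_actions": True},
    "costs_bps": {"commission": 1, "slippage": 5},
    "compute": ["local", "cloud_node"],
    "target": "research_now_live_in_3_months",
}

def one_way_cost_bps(cfg):
    """Total one-way trading cost in basis points: commission + slippage."""
    return cfg["costs_bps"]["commission"] + cfg["costs_bps"]["slippage"]

print(one_way_cost_bps(SCENARIO))  # 6
```

Recording assumptions this way makes every later backtest run self-describing: the config travels with the results.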
Workflow
Five steps to selection
From intent to delivery, every step is explicit.
Decide between realism vs speed before building anything.
- Asset class: equities / crypto / futures
- Frequency: intraday / daily / weekly
- Execution fidelity: event-driven vs vectorized
- Deployment: research-only vs live trading
Data and costs are hard constraints for any framework.
- Data source: vendor / exchange / CSV
- Adjustments: splits, dividends, corporate actions
- Costs: commissions, slippage, borrow, funding
- Universe size + lookback window
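Costs are the constraint most often hand-waved, so it helps to fix the arithmetic early. A minimal sketch using the scenario's 1 bps commission and 5 bps slippage (both numbers and the function are illustrative):

```python
# Deduct per-side commission and slippage (in bps) from a gross trade return.
def net_return(gross_return, turnover, commission_bps=1.0, slippage_bps=5.0):
    """Net return after costs; turnover = traded notional / portfolio value."""
    cost = turnover * (commission_bps + slippage_bps) / 10_000
    return gross_return - cost

# A 50 bps gross day with full turnover loses 6 bps to costs.
print(f"{net_return(0.0050, turnover=1.0):.4f}")  # 0.0044
```

Even this toy model shows why high-turnover strategies need cost-aware validation: 6 bps per full turn compounds quickly.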
Limit to 1-2 tools per category to avoid thrash.
- Event-driven: Backtrader, Zipline, Lean
- Vectorized: VectorBT, Backtesting.py
- Platform: QuantConnect/Lean, QuantRocket
- Shortlist 1-2 finalists
Walk-forward + OOS + stability tests are non-negotiable.
- Bias checks: look-ahead, survivorship, data-snooping
- Validation: walk-forward + out-of-sample
- Execution: slippage, latency, fill model
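The walk-forward requirement above can be sketched in a few lines. This is a pure-Python splitter under simple assumptions (rolling train window, fixed test window); real pipelines might use something like scikit-learn's `TimeSeriesSplit` instead:

```python
# Minimal walk-forward splitter: train on a rolling window, test on the
# bars immediately after it, then roll forward. No test bar ever precedes
# a train bar, which is the whole point.
def walk_forward(n, train, test, step=None):
    """Yield (train_range, test_range) index pairs over n bars."""
    step = step or test
    start = 0
    while start + train + test <= n:
        yield (range(start, start + train),
               range(start + train, start + train + test))
        start += step

for tr, te in walk_forward(n=10, train=4, test=2):
    print(list(tr), list(te))
# First split trains on bars 0-3 and tests on bars 4-5; windows roll by 2.
```

Stability then means checking that parameters chosen on each train window perform comparably on every test window, not just on average.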
Make every decision re-runnable and reviewable.
- Reproducibility: env lockfile + seeds + data snapshot
- Report: scorecard + decision matrix + go/no-go
- Archive: config, results, charts
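One way to make "seeds + data snapshot" concrete is a run manifest: a small record stamped onto every backtest. A hypothetical sketch (the manifest fields are illustrative, not a standard):

```python
# Stamp each run with its seed and a hash of the exact data snapshot used,
# so any result can be traced back to reproducible inputs.
import hashlib
import json
import random

def run_manifest(data_bytes, seed=42, config=None):
    random.seed(seed)  # seed every RNG your stack actually uses
    return {
        "seed": seed,
        "data_sha256": hashlib.sha256(data_bytes).hexdigest()[:16],
        "config": config or {},
    }

manifest = run_manifest(b"2020-01-02,AAPL,75.09\n", seed=42,
                        config={"universe": 500})
print(json.dumps(manifest, sort_keys=True))
```

Archiving the manifest alongside the environment lockfile closes the loop: same code, same data, same seed, same result.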
Framework Landscape
Map the categories
Start with categories before tool-level details.
| Category | Examples | Strengths | Trade-offs | Best for |
|---|---|---|---|---|
| Event-driven | Backtrader, Zipline, Lean | High execution realism, detailed order modeling. | Slower, more complex, data quality sensitive. | Complex order logic, multi-asset, live trading parity. |
| Vectorized | VectorBT, Backtesting.py | Fast iteration and massive parameter sweeps. | Simplified execution, weaker realism. | Research loops, hypothesis testing, sensitivity scans. |
| Hosted platforms | QuantConnect/Lean, QuantRocket | Integrated data + compute, cloud backtesting, live ops. | Platform lock-in, quota and cost constraints. | Teams, multi-asset data, managed execution pipelines. |
Event vs Vectorized
Speed vs realism
Key differences distilled from authoritative comparisons.
| Dimension | Event-driven | Vectorized |
|---|---|---|
| Realism | High: event-by-event execution simulation. | Moderate: simplified execution. |
| Speed | Slower: iterates bar-by-bar. | Faster: batch computation. |
| Complexity | Higher: order/state management. | Lower: simpler to implement. |
| Order modeling | Detailed slippage/fill control. | Basic assumptions, limited fill realism. |
| Best for | HFT/arb/multi-asset live parity. | Mid/low frequency, research + optimization. |
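The table's speed/realism split is easiest to see with the same signal computed both ways. A toy sketch with no framework APIs: a moving-average signal maintained bar-by-bar (event-driven style) versus computed in one batch (vectorized style):

```python
import numpy as np

prices = np.array([10.0, 11, 12, 11, 10, 9, 10, 12])
window = 3

# Event-driven style: iterate bar-by-bar, maintaining state as you go.
signals_loop = []
for t in range(len(prices)):
    if t + 1 < window:
        signals_loop.append(0)  # not enough history yet
    else:
        sma = prices[t - window + 1 : t + 1].mean()
        signals_loop.append(1 if prices[t] > sma else -1)

# Vectorized style: one batch computation over the whole series.
sma = np.convolve(prices, np.ones(window) / window, mode="valid")
signals_vec = np.where(prices[window - 1:] > sma, 1, -1)
signals_vec = np.concatenate([np.zeros(window - 1, dtype=int), signals_vec])

assert (np.array(signals_loop) == signals_vec).all()
print(signals_vec.tolist())  # [0, 0, 1, -1, -1, -1, 1, 1]
```

Both paths produce identical signals here; the loop becomes necessary only once state (open orders, partial fills, cash) cannot be expressed as an array operation, which is exactly where event-driven frameworks earn their overhead.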
Key takeaway
Speed is not everything, but large universes make it a hard constraint.
Scale Factors
What drives runtime
Key factors highlighted in QuantRocket benchmarks.
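Whatever a vendor benchmark reports, runtime scaling is cheap to measure on your own hardware. A minimal timing sketch (random toy prices; the rolling-mean pass stands in for a real signal computation):

```python
import time
import numpy as np

def vectorized_pass(n_symbols, n_bars, window=20):
    """One vectorized rolling-mean pass over an (n_bars x n_symbols) panel."""
    rng = np.random.default_rng(0)
    prices = rng.lognormal(size=(n_bars, n_symbols)).cumsum(axis=0)
    kernel = np.ones(window) / window
    # Rolling mean per symbol via 1-D convolution down each column.
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="valid"), 0, prices)

for n in (50, 500):
    t0 = time.perf_counter()
    vectorized_pass(n_symbols=n, n_bars=2520)  # ~10 years of daily bars
    print(n, f"{time.perf_counter() - t0:.3f}s")
```

Timings are machine-dependent; the point is the measurement loop itself, so that universe size, bar count, and lookback window can each be varied in isolation before committing to a framework.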
Bias Guardrails
Biases you must guard
Every bias needs an executable check.
| Bias | Risk | Mitigation |
|---|---|---|
| Look-ahead bias | Future data inflates returns. | Use only past data; forbid forward-looking indexing. |
| Survivorship bias | Survivors only distort performance. | Use historical data including delisted assets. |
| Transaction cost neglect | Ignoring costs overstates edge. | Model commissions, slippage, and fills. |
| Data-snooping | Too many trials cause overfitting. | OOS + walk-forward + stability checks. |
| Execution mismatch | Backtest fills differ from live execution. | Add latency, slippage, order queue modeling. |
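The table says every bias needs an executable check; look-ahead is the easiest to demonstrate. A toy sketch: applying a signal to the same bar's return uses information that only exists at that bar's close, and visibly inflates the result:

```python
import numpy as np

prices = np.array([100.0, 101, 103, 102, 104, 103])
returns = np.diff(prices) / prices[:-1]  # return from bar t to t+1

signal = (prices > 101.5).astype(int)    # position decided at each bar's close

# Look-ahead (wrong): returns[t] is paired with signal[t+1], a decision
# that could only have been made AFTER the return was realized.
pnl_lookahead = (signal[1:] * returns).sum()

# Clean: the position held over returns[t] is the signal from bar t's close.
pnl_clean = (signal[:-1] * returns).sum()

print(f"lookahead={pnl_lookahead:.4f} clean={pnl_clean:.4f}")
```

On this toy series the look-ahead variant is markedly higher, which is the generic failure mode: the off-by-one-bar bug quietly inflates backtested returns. An automated check can assert that every signal series is lagged before it meets a return series.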
Decision Matrix
Selection scorecard
Score the top 5 dimensions before deep tests.
| Criteria | Why | Signals |
|---|---|---|
| Data coverage | Sets the asset/time horizon ceiling. | Corporate actions, multi-frequency, multi-asset. |
| Fidelity | Determines live trading parity. | Order models, slippage, execution controls. |
| Speed | Defines research iteration cost. | Vectorization, parallelism, hardware fit. |
| Usability | Drives adoption and maintenance. | Docs, ecosystem, examples, community. |
| Reproducibility | Makes results auditable. | Version locks, data snapshots, config management. |
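The five dimensions above reduce naturally to a weighted scorecard. A minimal sketch, with placeholder weights and scores that are illustrative, not recommendations:

```python
# Hypothetical weights for the five scorecard dimensions (must sum to 1).
WEIGHTS = {"data_coverage": 0.25, "fidelity": 0.25, "speed": 0.20,
           "usability": 0.15, "reproducibility": 0.15}

def score(framework_scores):
    """Weighted total on a 1-5 scale; requires a score for every dimension."""
    assert set(framework_scores) == set(WEIGHTS), "score every dimension"
    return sum(WEIGHTS[k] * framework_scores[k] for k in WEIGHTS)

candidate = {"data_coverage": 4, "fidelity": 5, "speed": 3,
             "usability": 4, "reproducibility": 4}
print(f"{score(candidate):.2f}")  # 4.05
```

Scoring both finalists with the same weights before the deep tests keeps the go/no-go decision comparable and auditable.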
1) Framework shortlist + rationale
2) Data + cost assumptions
3) Bias guardrails + validation plan
4) Performance results + sensitivity
5) Go/No-Go decision + next steps
Source Kit
Source and license
Keep attribution and compliance clear.
| Item | Details |
|---|---|
| Skill source | https://github.com/wshobson/agents/blob/main/plugins/quantitative-trading/skills/backtesting-frameworks/SKILL.md |
| Clone command | git clone https://github.com/wshobson/agents.git |
| License | MIT License (wshobson/agents) |
| Evidence | Backtrader / Backtesting.py / Zipline / VectorBT / QuantRocket |