Backtesting Frameworks: Execution-ready selection guide
Compress selection, performance trade-offs, bias guardrails, and validation into one page.
This is a framework selection and validation workflow, not investment advice.
- Asset: US equities
- Frequency: daily
- Strategy style: trend / mean-reversion / stat-arb
- Execution fidelity: medium (event-driven preferred)
- Universe size: 500 symbols
- Data source: vendor + corporate actions
- Costs: 1 bps commission, 5 bps slippage
- Compute: local + 1 cloud node
- Target: research now, live trading in 3 months
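A scenario like the one above is easiest to keep honest when it lives in code rather than prose. A minimal sketch, with illustrative field names that are not tied to any framework:

```python
# Hypothetical scenario config; every key and value mirrors the brief above.
SCENARIO = {
    "asset_class": "us_equities",
    "frequency": "daily",
    "strategy_styles": ["trend", "mean_reversion", "stat_arb"],
    "execution_fidelity": "medium",          # event-driven preferred
    "universe_size": 500,
    "data": {"source": "vendor", "corporate_actions": True},
    "costs_bps": {"commission": 1, "slippage": 5},
    "compute": ["local", "cloud_node"],
    "target": "research_now_live_in_3_months",
}

def one_way_cost_bps(cfg):
    """Total one-way trading cost in basis points: commission + slippage."""
    return cfg["costs_bps"]["commission"] + cfg["costs_bps"]["slippage"]

print(one_way_cost_bps(SCENARIO))  # 6
```

Recording assumptions this way makes every later backtest run self-describing: the config travels with the results.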
Workflow
Five steps to selection
From intent to delivery, every step is explicit.
Decide between realism vs speed before building anything.
- Asset class: equities / crypto / futures
- Frequency: intraday / daily / weekly
- Execution fidelity: event-driven vs vectorized
- Deployment: research-only vs live trading
Data and costs are hard constraints for any framework.
- Data source: vendor / exchange / CSV
- Adjustments: splits, dividends, corporate actions
- Costs: commissions, slippage, borrow, funding
- Universe size + lookback window
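Costs are the constraint most often hand-waved, so it helps to fix the arithmetic early. A minimal sketch using the scenario's 1 bps commission and 5 bps slippage (both numbers and the function are illustrative):

```python
# Deduct per-side commission and slippage (in bps) from a gross trade return.
def net_return(gross_return, turnover, commission_bps=1.0, slippage_bps=5.0):
    """Net return after costs; turnover = traded notional / portfolio value."""
    cost = turnover * (commission_bps + slippage_bps) / 10_000
    return gross_return - cost

# A 50 bps gross day with full turnover loses 6 bps to costs.
print(f"{net_return(0.0050, turnover=1.0):.4f}")  # 0.0044
```

Even this toy model shows why high-turnover strategies need cost-aware validation: 6 bps per full turn compounds quickly.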
Limit to 1-2 tools per category to avoid thrash.
- Event-driven: Backtrader, Zipline, Lean
- Vectorized: VectorBT, Backtesting.py
- Platform: QuantConnect/Lean, QuantRocket
- Shortlist 1-2 finalists
Walk-forward + OOS + stability tests are non-negotiable.
- Bias checks: look-ahead, survivorship, data-snooping
- Validation: walk-forward + out-of-sample
- Execution: slippage, latency, fill model
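The walk-forward requirement above can be sketched in a few lines. This is a pure-Python splitter under simple assumptions (rolling train window, fixed test window); real pipelines might use something like scikit-learn's `TimeSeriesSplit` instead:

```python
# Minimal walk-forward splitter: train on a rolling window, test on the
# bars immediately after it, then roll forward. No test bar ever precedes
# a train bar, which is the whole point.
def walk_forward(n, train, test, step=None):
    """Yield (train_range, test_range) index pairs over n bars."""
    step = step or test
    start = 0
    while start + train + test <= n:
        yield (range(start, start + train),
               range(start + train, start + train + test))
        start += step

for tr, te in walk_forward(n=10, train=4, test=2):
    print(list(tr), list(te))
# First split trains on bars 0-3 and tests on bars 4-5; windows roll by 2.
```

Stability then means checking that parameters chosen on each train window perform comparably on every test window, not just on average.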
Make every decision re-runnable and reviewable.
- Reproducibility: env lockfile + seeds + data snapshot
- Report: scorecard + decision matrix + go/no-go
- Archive: config, results, charts
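One way to make "seeds + data snapshot" concrete is a run manifest: a small record stamped onto every backtest. A hypothetical sketch (the manifest fields are illustrative, not a standard):

```python
# Stamp each run with its seed and a hash of the exact data snapshot used,
# so any result can be traced back to reproducible inputs.
import hashlib
import json
import random

def run_manifest(data_bytes, seed=42, config=None):
    random.seed(seed)  # seed every RNG your stack actually uses
    return {
        "seed": seed,
        "data_sha256": hashlib.sha256(data_bytes).hexdigest()[:16],
        "config": config or {},
    }

manifest = run_manifest(b"2020-01-02,AAPL,75.09\n", seed=42,
                        config={"universe": 500})
print(json.dumps(manifest, sort_keys=True))
```

Archiving the manifest alongside the environment lockfile closes the loop: same code, same data, same seed, same result.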
Framework Landscape
Map the categories
Start with categories before tool-level details.
| Category | Examples | Strengths | Trade-offs | Best for |
|---|---|---|---|---|
| Event-driven | Backtrader, Zipline, Lean | High execution realism, detailed order modeling. | Slower, more complex, data quality sensitive. | Complex order logic, multi-asset, live trading parity. |
| Vectorized | VectorBT, Backtesting.py | Fast iteration and massive parameter sweeps. | Simplified execution, weaker realism. | Research loops, hypothesis testing, sensitivity scans. |
| Hosted platforms | QuantConnect/Lean, QuantRocket | Integrated data + compute, cloud backtesting, live ops. | Platform lock-in, quota and cost constraints. | Teams, multi-asset data, managed execution pipelines. |
Event vs Vectorized
Speed vs realism
Key differences distilled from authoritative comparisons.
| Dimension | Event-driven | Vectorized |
|---|---|---|
| Realism | High: event-by-event execution simulation. | Moderate: simplified execution. |
| Speed | Slower: iterates bar-by-bar. | Faster: batch computation. |
| Complexity | Higher: order/state management. | Lower: simpler to implement. |
| Order modeling | Detailed slippage/fill control. | Basic assumptions, limited fill realism. |
| Best for | HFT/arb/multi-asset live parity. | Mid/low frequency, research + optimization. |
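The table's speed/realism split is easiest to see with the same signal computed both ways. A toy sketch with no framework APIs: a moving-average signal maintained bar-by-bar (event-driven style) versus computed in one batch (vectorized style):

```python
import numpy as np

prices = np.array([10.0, 11, 12, 11, 10, 9, 10, 12])
window = 3

# Event-driven style: iterate bar-by-bar, maintaining state as you go.
signals_loop = []
for t in range(len(prices)):
    if t + 1 < window:
        signals_loop.append(0)  # not enough history yet
    else:
        sma = prices[t - window + 1 : t + 1].mean()
        signals_loop.append(1 if prices[t] > sma else -1)

# Vectorized style: one batch computation over the whole series.
sma = np.convolve(prices, np.ones(window) / window, mode="valid")
signals_vec = np.where(prices[window - 1:] > sma, 1, -1)
signals_vec = np.concatenate([np.zeros(window - 1, dtype=int), signals_vec])

assert (np.array(signals_loop) == signals_vec).all()
print(signals_vec.tolist())  # [0, 0, 1, -1, -1, -1, 1, 1]
```

Both paths produce identical signals here; the loop becomes necessary only once state (open orders, partial fills, cash) cannot be expressed as an array operation, which is exactly where event-driven frameworks earn their overhead.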
Key takeaway
Speed is not everything, but large universes make it a hard constraint.
Scale Factors
What drives runtime
Key factors highlighted in QuantRocket benchmarks.
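Whatever a vendor benchmark reports, runtime scaling is cheap to measure on your own hardware. A minimal timing sketch (random toy prices; the rolling-mean pass stands in for a real signal computation):

```python
import time
import numpy as np

def vectorized_pass(n_symbols, n_bars, window=20):
    """One vectorized rolling-mean pass over an (n_bars x n_symbols) panel."""
    rng = np.random.default_rng(0)
    prices = rng.lognormal(size=(n_bars, n_symbols)).cumsum(axis=0)
    kernel = np.ones(window) / window
    # Rolling mean per symbol via 1-D convolution down each column.
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="valid"), 0, prices)

for n in (50, 500):
    t0 = time.perf_counter()
    vectorized_pass(n_symbols=n, n_bars=2520)  # ~10 years of daily bars
    print(n, f"{time.perf_counter() - t0:.3f}s")
```

Timings are machine-dependent; the point is the measurement loop itself, so that universe size, bar count, and lookback window can each be varied in isolation before committing to a framework.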
Bias Guardrails
Biases you must guard
Every bias needs an executable check.
| Bias | Risk | Mitigation |
|---|---|---|
| Look-ahead bias | Future data inflates returns. | Use only past data; forbid forward-looking indexing. |
| Survivorship bias | Survivors only distort performance. | Use historical data including delisted assets. |
| Transaction cost neglect | Ignoring costs overstates edge. | Model commissions, slippage, and fills. |
| Data-snooping | Too many trials cause overfitting. | OOS + walk-forward + stability checks. |
| Execution mismatch | Backtest fills differ from live execution. | Add latency, slippage, order queue modeling. |
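The table says every bias needs an executable check; look-ahead is the easiest to demonstrate. A toy sketch: applying a signal to the same bar's return uses information that only exists at that bar's close, and visibly inflates the result:

```python
import numpy as np

prices = np.array([100.0, 101, 103, 102, 104, 103])
returns = np.diff(prices) / prices[:-1]  # return from bar t to t+1

signal = (prices > 101.5).astype(int)    # position decided at each bar's close

# Look-ahead (wrong): returns[t] is paired with signal[t+1], a decision
# that could only have been made AFTER the return was realized.
pnl_lookahead = (signal[1:] * returns).sum()

# Clean: the position held over returns[t] is the signal from bar t's close.
pnl_clean = (signal[:-1] * returns).sum()

print(f"lookahead={pnl_lookahead:.4f} clean={pnl_clean:.4f}")
```

On this toy series the look-ahead variant is markedly higher, which is the generic failure mode: the off-by-one-bar bug quietly inflates backtested returns. An automated check can assert that every signal series is lagged before it meets a return series.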
Decision Matrix
Selection scorecard
Score the top 5 dimensions before deep tests.
| Criteria | Why | Signals |
|---|---|---|
| Data coverage | Sets the asset/time horizon ceiling. | Corporate actions, multi-frequency, multi-asset. |
| Fidelity | Determines live trading parity. | Order models, slippage, execution controls. |
| Speed | Defines research iteration cost. | Vectorization, parallelism, hardware fit. |
| Usability | Drives adoption and maintenance. | Docs, ecosystem, examples, community. |
| Reproducibility | Makes results auditable. | Version locks, data snapshots, config management. |
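The five dimensions above reduce naturally to a weighted scorecard. A minimal sketch, with placeholder weights and scores that are illustrative, not recommendations:

```python
# Hypothetical weights for the five scorecard dimensions (must sum to 1).
WEIGHTS = {"data_coverage": 0.25, "fidelity": 0.25, "speed": 0.20,
           "usability": 0.15, "reproducibility": 0.15}

def score(framework_scores):
    """Weighted total on a 1-5 scale; requires a score for every dimension."""
    assert set(framework_scores) == set(WEIGHTS), "score every dimension"
    return sum(WEIGHTS[k] * framework_scores[k] for k in WEIGHTS)

candidate = {"data_coverage": 4, "fidelity": 5, "speed": 3,
             "usability": 4, "reproducibility": 4}
print(f"{score(candidate):.2f}")  # 4.05
```

Scoring both finalists with the same weights before the deep tests keeps the go/no-go decision comparable and auditable.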
1) Framework shortlist + rationale
2) Data + cost assumptions
3) Bias guardrails + validation plan
4) Performance results + sensitivity
5) Go/No-Go decision + next steps
Source Kit
Source and license
Keep attribution and compliance clear.
| Item | Details |
|---|---|
| Skill source | https://github.com/wshobson/agents/blob/main/plugins/quantitative-trading/skills/backtesting-frameworks/SKILL.md |
| Clone command | git clone https://github.com/wshobson/agents.git |
| License | MIT License (wshobson/agents) |
| Evidence | Backtrader / Backtesting.py / Zipline / VectorBT / QuantRocket |