# Code Manual

## Purpose

This manual explains how the codebase is organized, how data flows through the system, how strategies are defined and evaluated, and where to extend the project safely.

## Repository Structure

### Core Runtime

- `strategy_gui.py`: GUI shell, run orchestration, results display
- `run_backtest_report.py`: main research pipeline and report generation
- `backtest.py`: long-only trade simulation
- `rules.py`: text-rule parser and compiler
- `indicators.py`: cached indicator access layer over TA-Lib
- `conditions.py`: implemented condition registry and placeholder list
- `combos.py`: combo-generation logic for search modes

### Utility / Example

- `example_random_discovery.py`: minimal scripted example
- `reports/`: generated outputs

### External Dependencies

The project directly depends on:

- `pandas`
- `numpy`
- `matplotlib`
- `yfinance`
- `PyQt5`
- `TA-Lib`

## System Overview

The system has three major layers:

1. UI layer
2. research engine
3. rule-and-indicator layer

### UI Layer

The GUI gathers settings, launches a worker thread, and presents ranked outputs.

Main class:

- `StrategyResearchWindow` in `strategy_gui.py`

Background worker:

- `ResearchWorker` in `strategy_gui.py`

### Research Engine

The research engine downloads data, generates strategy specs, runs backtests, computes metrics, screens and ranks candidates, and writes reports plus dashboards.

Main entry point:

- `run_research_bundle()` in `run_backtest_report.py`

### Rule-and-Indicator Layer

This layer converts human-readable expressions such as:

```text
EMA(10) > EMA(50) AND RSI(14) crosses above 50
```

into boolean pandas Series aligned to market data.

Core functions and classes:

- `compile_rule()` in `rules.py`
- `eval_condition()` in `rules.py`
- `IndicatorCache` in `indicators.py`

## End-to-End Data Flow

1. GUI or script builds a `RunConfig`.
2. `run_research_bundle()` downloads Yahoo Finance OHLCV data through `load_yfinance()`.
3. `generate_1000_entry_combos()` builds the predefined combo universe.
4. `build_strategy_specs()` creates concrete strategy specifications.
5. Each strategy is compiled into entry and exit boolean signals using `compile_rule()`.
6. `backtest_sl_tp_long()` simulates trades.
7. `evaluate_strategy_window()` computes strategy, benchmark, weekly, and trade metrics.
8. Multi-horizon aggregation and optional validation are applied.
9. Candidates are scored and screened.
10. CSV files, dashboards, and standalone strategy scripts are exported.

## Search Modes

### Grid Search

Implemented by:

- `build_strategy_specs()`
- `generate_exit_combos()`
- `generate_1000_entry_combos()`

Grid mode pairs a ranked list of entry expressions with a bounded set of exit variants.

If `use_all_combos` is `True`, the engine evaluates the full generated specification list.

### Random Search

Implemented by:

- `generate_random_strategy_specs()` in `combos.py`

Random mode samples strategies using:

- random entry condition count
- random exit condition count
- randomized stop/holding variants where supported by the generator

The run stops when:

- `max_attempts` is reached, or
- `target_valid_strategies` passing candidates are found
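
The two stopping conditions above can be sketched as a simple loop. This is an illustration only, not the code in `combos.py`: the `random_search` function, the `evaluate` callback, and the candidate fields are hypothetical names chosen for the example.

```python
import random


def random_search(evaluate, max_attempts, target_valid_strategies, seed=0):
    """Sample candidates until either stopping condition is met."""
    rng = random.Random(seed)
    passing = []
    attempts = 0
    while attempts < max_attempts and len(passing) < target_valid_strategies:
        attempts += 1
        # Hypothetical candidate: random entry and exit condition counts.
        candidate = {
            "n_entry": rng.randint(1, 3),
            "n_exit": rng.randint(1, 2),
        }
        if evaluate(candidate):
            passing.append(candidate)
    return passing, attempts


# Toy evaluator that accepts everything, so the target count ends the run.
found, used = random_search(lambda c: True, max_attempts=100, target_valid_strategies=5)
print(len(found), used)
```

With an evaluator that rejects every candidate, the same loop would instead stop once `max_attempts` is exhausted.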

## Condition Registry

`conditions.py` is the canonical source of implemented condition phrases.

It exposes:

- `ENTRY_CONDITION_REGISTRY`
- `EXIT_CONDITION_REGISTRY`
- `PLACEHOLDER_CONDITION_FAMILIES`

Registry items include:

- `name`
- `family`
- `required_columns`
- `signal`
- `description`
- `export`

The registry is used for:

- human inspection
- combo generation
- documentation support
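
As a rough illustration of how such items might be inspected, the sketch below builds a small stand-in registry and queries it. Only the field names come from this manual; the values, the value types of `signal` and `export`, the column names, and both helper functions are assumptions made for the example.

```python
# Hypothetical entries: only the field names are taken from this manual.
example_registry = [
    {
        "name": "EMA(10) > EMA(50)",
        "family": "trend",
        "required_columns": ["Close"],
        "signal": "entry",
        "description": "Fast EMA is above slow EMA.",
        "export": True,
    },
    {
        "name": "PLUS_DI(14) > MINUS_DI(14)",
        "family": "trend",
        "required_columns": ["High", "Low", "Close"],
        "signal": "entry",
        "description": "Positive directional indicator leads.",
        "export": True,
    },
    {
        "name": "CDLHAMMER = 100",
        "family": "candle",
        "required_columns": ["Open", "High", "Low", "Close"],
        "signal": "entry",
        "description": "Hammer candlestick pattern detected.",
        "export": False,
    },
]


def names_by_family(registry):
    """Group condition names by family, for human inspection."""
    grouped = {}
    for item in registry:
        grouped.setdefault(item["family"], []).append(item["name"])
    return grouped


def usable_with(registry, available_columns):
    """Keep only conditions whose required columns are all present."""
    available = set(available_columns)
    return [
        item["name"]
        for item in registry
        if set(item["required_columns"]) <= available
    ]


print(names_by_family(example_registry))
print(usable_with(example_registry, ["Close"]))
```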

## Rule Language

The rule compiler is string-based, regex-driven, and intentionally narrow.

Supported examples include:

- `EMA(10) > EMA(50)`
- `RSI(14) crosses above 50`
- `Close breaks above MAX(14)`
- `STOCH(14,3,3) %K crosses above %D below 20`
- `PLUS_DI(14) > MINUS_DI(14)`
- `Close crosses below EMA(20)`
- `Close breaks above upper BBANDS(20,2)`
- `CDLHAMMER = 100`

The parser is implemented in:

- `eval_condition()` in `rules.py`

The full rule is assembled by:

- `compile_rule()` in `rules.py`

Compilation typically follows three steps:

- split by `AND`
- evaluate each condition
- combine condition Series with boolean logic
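
The split, evaluate, and combine pattern can be sketched without any dependencies. This is not the parser in `rules.py`: plain Python lists stand in for pandas Series, a hand-written EMA stands in for TA-Lib, only two phrase shapes are recognized, and every function name ending in `_sketch` is hypothetical.

```python
import re


def ema(values, period):
    """EMA with alpha = 2 / (period + 1), seeded with the first value."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def operand(token, close):
    """Resolve 'EMA(n)', 'Close', or a numeric literal to a list of floats."""
    token = token.strip()
    match = re.fullmatch(r"EMA\((\d+)\)", token)
    if match:
        return ema(close, int(match.group(1)))
    if token == "Close":
        return list(close)
    return [float(token)] * len(close)  # raises ValueError on unknown tokens


def eval_condition_sketch(text, close):
    """Evaluate one condition phrase to a list of booleans, one per bar."""
    text = text.strip()
    match = re.fullmatch(r"(.+?)\s+crosses above\s+(.+)", text)
    if match:
        a, b = operand(match.group(1), close), operand(match.group(2), close)
        # True only on the bar where a moves from at-or-below b to above b.
        return [False] + [
            a[i] > b[i] and a[i - 1] <= b[i - 1] for i in range(1, len(a))
        ]
    match = re.fullmatch(r"(.+?)\s*>\s*(.+)", text)
    if match:
        a, b = operand(match.group(1), close), operand(match.group(2), close)
        return [x > y for x, y in zip(a, b)]
    raise ValueError(f"Unsupported condition: {text!r}")


def compile_rule_sketch(rule, close):
    """Split on AND, evaluate each condition, and AND the results per bar."""
    parts = re.split(r"\s+AND\s+", rule.strip())
    signals = [eval_condition_sketch(part, close) for part in parts]
    return [all(bits) for bits in zip(*signals)]


close = [10.0, 9.0, 8.0, 9.0, 11.0, 12.0]
signal = compile_rule_sketch("Close crosses above EMA(3) AND Close > 8", close)
print(signal)
```

Raising on an unrecognized phrase, as the sketch does, makes a missing parser branch visible immediately instead of producing an all-false signal.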

If you add new condition phrases in `conditions.py` or `combos.py`, you usually also need to add matching parser support in `rules.py`.

## Indicator Layer

`IndicatorCache` in `indicators.py` centralizes indicator calculation and memoization.

Responsibilities:

- validate required OHLCV columns
- compute TA-Lib indicators lazily
- cache results by parameter tuple
- return pandas Series aligned to the original index

This prevents repeated indicator recalculation during large strategy runs.
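
A minimal sketch of the memoization idea, assuming a dict of column lists in place of a pandas DataFrame and a plain rolling mean in place of the TA-Lib call. The class name, the capitalized column names, and the `computations` counter are hypothetical and exist only to make the caching visible.

```python
class IndicatorCacheSketch:
    """Validate columns once, then compute each indicator at most once."""

    REQUIRED = ("Open", "High", "Low", "Close", "Volume")  # assumed names

    def __init__(self, data):
        missing = [col for col in self.REQUIRED if col not in data]
        if missing:
            raise ValueError(f"Missing OHLCV columns: {missing}")
        self._data = data
        self._cache = {}
        self.computations = 0  # counts real calculations, for demonstration

    def _memo(self, key, compute):
        # The key is (indicator name, parameters), so each distinct
        # parameter tuple is computed lazily and stored once.
        if key not in self._cache:
            self._cache[key] = compute()
            self.computations += 1
        return self._cache[key]

    def SMA_close(self, period):
        def compute():
            close = self._data["Close"]
            out = []
            for i in range(len(close)):
                if i + 1 < period:
                    out.append(None)  # not enough history yet
                else:
                    window = close[i + 1 - period:i + 1]
                    out.append(sum(window) / period)
            return out  # same length as the input, so it stays aligned

        return self._memo(("SMA_close", period), compute)


data = {
    "Open": [1.0, 2.0, 3.0],
    "High": [1.0, 2.0, 3.0],
    "Low": [1.0, 2.0, 3.0],
    "Close": [1.0, 2.0, 3.0],
    "Volume": [100, 100, 100],
}
cache = IndicatorCacheSketch(data)
first = cache.SMA_close(2)
second = cache.SMA_close(2)  # served from the cache
print(first, cache.computations)
```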

### Common Indicator Methods

- `EMA()`
- `SMA_close()`
- `RSI()`
- `ATR()`
- `NATR()`
- `STDDEV()`
- `OBV()`
- `AD()`
- `ADOSC()`
- `ADX()`
- `ADXR()`
- `PLUS_DI()`
- `MINUS_DI()`
- `MACD()`
- `STOCH()`
- `BBANDS()`
- `SAR()`

The cache is the correct extension point for adding new indicators.

## Combo Generation

`combos.py` builds phrase-level building blocks, then combines them into full rule expressions.

Key builders:

- `build_trend_filters()`
- `build_breakout_triggers()`
- `build_oscillator_triggers()`
- `build_quality_filters()`
- `build_candle_confirms()`
- `build_volatility_breakout_triggers()`

Key outputs:

- `generate_1000_entry_combos()`
- `generate_exit_combos()`

Design notes:

- combinations are intentionally controlled, not exhaustive
- output is deduplicated
- lightweight contradiction filters are applied
- heuristics are used to score and rank combinations
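
The design notes above can be illustrated with a small generator. The contradiction rule and the ranking heuristic below are toy stand-ins invented for this example, and `generate_combos` is a hypothetical name; the actual filters and scores live in `combos.py`.

```python
from itertools import product


def is_contradictory(parts):
    """Toy filter: reject a bullish '>' filter paired with 'crosses below'."""
    has_bullish = any(" > " in part for part in parts)
    has_bearish = any("crosses below" in part for part in parts)
    return has_bullish and has_bearish


def heuristic_score(parts):
    """Toy ranking heuristic: prefer combos that include a breakout phrase."""
    return sum(1 for part in parts if "breaks above" in part)


def generate_combos(filters, triggers, limit):
    """Build filter+trigger rules, drop duplicates and contradictions, rank."""
    seen = set()
    scored = []
    for parts in product(filters, triggers):
        if is_contradictory(parts):
            continue
        rule = " AND ".join(parts)
        if rule in seen:
            continue  # deduplicate identical rule strings
        seen.add(rule)
        scored.append((heuristic_score(parts), rule))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [rule for _, rule in scored[:limit]]


filters = ["EMA(10) > EMA(50)", "PLUS_DI(14) > MINUS_DI(14)", "EMA(10) > EMA(50)"]
triggers = [
    "RSI(14) crosses above 50",
    "Close breaks above MAX(14)",
    "Close crosses below EMA(20)",
]
combos = generate_combos(filters, triggers, limit=3)
for rule in combos:
    print(rule)
```

The `limit` argument reflects the "controlled, not exhaustive" note: the generator caps output rather than emitting every product.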

## Backtest Engine

The main engine is `backtest_sl_tp_long()` in `backtest.py`.

### Core Behavior

- long-only
- single active position at a time
- supports stop-loss and take-profit in ATR units
- optional trailing stop
- optional time stop via `max_holding_bars`
- optional final-bar close
- configurable next-open entry logic
- conservative same-bar SL/TP handling when selected
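
The sketch below shows one way these behaviors fit together in a bar loop. It is a simplified illustration, not `backtest_sl_tp_long()`: it ignores costs, price gaps, trailing stops, and the final-bar close, uses bar indexes instead of timestamps, and its function name, parameters, and `reason` strings are assumptions.

```python
def backtest_long_sketch(bars, entries, atr, sl_atr=1.0, tp_atr=2.0,
                         max_holding_bars=None):
    """Long-only, one position at a time, next-open entries, ATR exits.

    bars: list of dicts with "open", "high", "low", "close"
    entries: list of bool, one per bar
    atr: list of float, one per bar
    """
    trades = []
    position = None
    for i in range(1, len(bars)):
        bar = bars[i]
        if position is None and entries[i - 1]:
            # Next-open entry: a signal on bar i-1 fills at the open of bar i.
            entry_px = bar["open"]
            position = {
                "entry_i": i,
                "entry_px": entry_px,
                "stop": entry_px - sl_atr * atr[i - 1],
                "target": entry_px + tp_atr * atr[i - 1],
            }
        if position is None:
            continue
        bars_held = i - position["entry_i"]
        exit_px = reason = None
        if bar["low"] <= position["stop"]:
            # Conservative same-bar handling: when stop and target are both
            # touched, the stop is checked first and assumed to have filled.
            exit_px, reason = position["stop"], "stop_loss"
        elif bar["high"] >= position["target"]:
            exit_px, reason = position["target"], "take_profit"
        elif max_holding_bars is not None and bars_held >= max_holding_bars:
            exit_px, reason = bar["close"], "time_stop"
        if reason is not None:
            trades.append({
                "entry_time": position["entry_i"],
                "exit_time": i,
                "entry_px": position["entry_px"],
                "exit_px": exit_px,
                "reason": reason,
                "gross_ret": exit_px / position["entry_px"] - 1,
                "bars_held": bars_held,
            })
            position = None  # flat again; a new entry is allowed next bar
    return trades


bars = [
    {"open": 100.0, "high": 101.0, "low": 99.0, "close": 100.0},
    {"open": 100.0, "high": 103.0, "low": 99.5, "close": 102.0},
    {"open": 102.0, "high": 105.0, "low": 98.0, "close": 100.0},
]
trades = backtest_long_sketch(bars, [True, True, False], [1.0, 1.0, 1.0])
for trade in trades:
    print(trade["reason"], trade["entry_px"], trade["exit_px"])
```

In the last bar both the stop and the target are inside the bar's range, so the conservative rule records a stop-loss.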

### Trade Output

Each trade row usually includes:

- `entry_time`
- `exit_time`
- `entry_px`
- `exit_px`
- `reason`
- `gross_ret`
- `net_ret`
- `bars_held`

### Summary Output

The backtester also returns a compact summary dictionary used later by the reporting engine.

## Research Metrics

The reporting pipeline computes several metric families.

### Equity and Performance Metrics

Generated through:

- `build_buy_hold_equity()`
- `build_mtm_equity_from_trades()`
- `perf_metrics_from_equity()`
- `regression_beta_alpha()`

Common metrics:

- total return
- CAGR
- annualized volatility
- Sharpe
- Sortino
- max drawdown
- Calmar
- skew
- kurtosis
- VaR
- CVaR
- beta
- alpha
- information ratio
- correlation
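
A few of these metrics can be derived from an equity curve as sketched below. The conventions here are assumptions, not necessarily those of `perf_metrics_from_equity()`: simple per-period returns, a zero risk-free rate for Sharpe, sample standard deviation, and a hypothetical `periods_per_year` parameter.

```python
import math


def perf_metrics_sketch(equity, periods_per_year=252):
    """Compute a handful of performance metrics from an equity curve."""
    returns = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
    total_return = equity[-1] / equity[0] - 1
    years = len(returns) / periods_per_year
    cagr = (equity[-1] / equity[0]) ** (1 / years) - 1

    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    ann_vol = math.sqrt(variance) * math.sqrt(periods_per_year)
    # Sharpe with a zero risk-free rate: annualized mean over annualized vol.
    sharpe = mean * periods_per_year / ann_vol if ann_vol > 0 else float("nan")

    # Max drawdown: the worst drop from any running peak, as a negative number.
    peak = equity[0]
    max_drawdown = 0.0
    for value in equity:
        peak = max(peak, value)
        max_drawdown = min(max_drawdown, value / peak - 1)

    calmar = cagr / abs(max_drawdown) if max_drawdown < 0 else float("nan")
    return {
        "total_return": total_return,
        "cagr": cagr,
        "ann_vol": ann_vol,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
        "calmar": calmar,
    }


metrics = perf_metrics_sketch([100.0, 110.0, 99.0, 121.0])
print(round(metrics["total_return"], 4), round(metrics["max_drawdown"], 4))
```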

### Trade Metrics

Generated through:

- `trade_stats_report()`

Examples:

- trade count
- win rate
- average trade
- best/worst trade
- profit factor
- win/loss streaks
- average duration
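
As an illustration of how several of these figures relate to per-trade returns, here is a small sketch. It is not `trade_stats_report()`; the function name and output keys are hypothetical, and profit factor is taken as gross profit divided by gross loss.

```python
def trade_stats_sketch(net_returns):
    """Summarize a non-empty list of per-trade net returns."""
    if not net_returns:
        raise ValueError("need at least one trade")
    wins = [r for r in net_returns if r > 0]
    losses = [r for r in net_returns if r < 0]
    gross_profit = sum(wins)
    gross_loss = -sum(losses)

    # Track the longest run of consecutive winners and losers.
    max_win_streak = max_loss_streak = win_run = loss_run = 0
    for r in net_returns:
        win_run = win_run + 1 if r > 0 else 0
        loss_run = loss_run + 1 if r < 0 else 0
        max_win_streak = max(max_win_streak, win_run)
        max_loss_streak = max(max_loss_streak, loss_run)

    return {
        "trade_count": len(net_returns),
        "win_rate": len(wins) / len(net_returns),
        "avg_trade": sum(net_returns) / len(net_returns),
        "best_trade": max(net_returns),
        "worst_trade": min(net_returns),
        # No losing trades means the ratio is unbounded.
        "profit_factor": gross_profit / gross_loss if gross_loss > 0 else float("inf"),
        "max_win_streak": max_win_streak,
        "max_loss_streak": max_loss_streak,
    }


stats = trade_stats_sketch([0.02, -0.01, 0.03, 0.01, -0.02])
print(stats["trade_count"], stats["win_rate"], round(stats["profit_factor"], 2))
```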

### Weekly Metrics

Generated through:

- `weekly_return_stats()`
- `aggregate_weekly_stats()`

These metrics drive the ranking system and emphasize stability across time windows.

Common weekly fields:

- average weekly return
- median weekly return
- negative week rate
- positive week rate
- consistency
- weekly volatility
- downside weekly volatility
- weekly Sharpe
- worst week
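
Most of these fields follow directly from a list of weekly returns, as the sketch below shows. The definitions are assumptions rather than the ones in `weekly_return_stats()`: the Sharpe figure is not annualized, downside volatility is the root mean square of negative weeks over all weeks, and `consistency` is left out because this manual does not define it.

```python
import statistics


def weekly_return_stats_sketch(weekly_returns):
    """Summarize a list of at least two weekly returns."""
    n = len(weekly_returns)
    negatives = [r for r in weekly_returns if r < 0]
    positives = [r for r in weekly_returns if r > 0]
    avg = statistics.mean(weekly_returns)
    vol = statistics.stdev(weekly_returns)  # sample standard deviation
    # One common downside convention: RMS of negative weeks over all weeks.
    downside = (sum(r * r for r in negatives) / n) ** 0.5
    return {
        "avg_weekly_return": avg,
        "median_weekly_return": statistics.median(weekly_returns),
        "negative_week_rate": len(negatives) / n,
        "positive_week_rate": len(positives) / n,
        "weekly_volatility": vol,
        "downside_weekly_volatility": downside,
        "weekly_sharpe": avg / vol if vol > 0 else float("nan"),
        "worst_week": min(weekly_returns),
    }


weekly = weekly_return_stats_sketch([0.01, -0.02, 0.03, 0.0])
print(weekly["negative_week_rate"], weekly["positive_week_rate"], weekly["worst_week"])
```

A flat week counts as neither positive nor negative here, so the two rates need not sum to one.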

## Validation and Screening

There are two distinct filters:

### 1. Weekly Credibility Screen

Implemented by:

- `screen_strategy()`

Possible failure reasons include:

- insufficient valid horizons
- too few trades
- drawdown too deep
- profit factor too low
- too many negative weeks
- negative weeks dominate
- non-positive total return
- non-positive average weekly return
- non-positive weekly edge
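
A screen of this kind can be written as a function that collects every failure reason instead of stopping at the first. The sketch covers only some of the reasons listed above, and the thresholds, row keys, and function name are hypothetical; the real limits come from `RunConfig`.

```python
def screen_strategy_sketch(row, min_trades=20, max_drawdown=-0.30,
                           min_profit_factor=1.1, max_negative_week_rate=0.5):
    """Return (passed, reasons) for one aggregated result row."""
    reasons = []
    if row["trade_count"] < min_trades:
        reasons.append("too few trades")
    if row["max_drawdown"] < max_drawdown:  # drawdowns are negative numbers
        reasons.append("drawdown too deep")
    if row["profit_factor"] < min_profit_factor:
        reasons.append("profit factor too low")
    if row["negative_week_rate"] > max_negative_week_rate:
        reasons.append("too many negative weeks")
    if row["total_return"] <= 0:
        reasons.append("non-positive total return")
    if row["avg_weekly_return"] <= 0:
        reasons.append("non-positive average weekly return")
    return (not reasons), reasons


good = {
    "trade_count": 45, "max_drawdown": -0.12, "profit_factor": 1.6,
    "negative_week_rate": 0.4, "total_return": 0.25, "avg_weekly_return": 0.004,
}
bad = dict(good, trade_count=5, max_drawdown=-0.45)

print(screen_strategy_sketch(good))
print(screen_strategy_sketch(bad))
```

Returning the full list of reasons makes it easier to see whether a candidate narrowly missed one threshold or failed across the board.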

### 2. Three-Phase Validation

Implemented by:

- `split_validation_windows()`
- `validate_three_phase_results()`

The dataset is split into:

- in-sample
- out-of-sample
- holdout

The validation logic tracks:

- split return behavior
- drawdown limits
- underperformance versus buy-and-hold
- concentration of profits in one trade
- holdout degradation
- return instability across splits

The output includes an `overfitting_risk_score`.
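
The chronological split and two of the checks above might look like the following sketch. The split fractions, the `degradation_ratio` threshold, the flag wording, and both function names are assumptions; this manual does not specify how `overfitting_risk_score` is derived from such flags, so the sketch stops at the flags.

```python
def split_validation_windows_sketch(index, in_sample=0.6, out_of_sample=0.2):
    """Split an ordered index chronologically; the remainder is the holdout."""
    n = len(index)
    first_cut = int(n * in_sample)
    second_cut = first_cut + int(n * out_of_sample)
    return {
        "in_sample": index[:first_cut],
        "out_of_sample": index[first_cut:second_cut],
        "holdout": index[second_cut:],
    }


def validation_flags(split_returns, degradation_ratio=0.5):
    """Flag simple warning signs from per-split total returns."""
    flags = []
    if any(value <= 0 for value in split_returns.values()):
        flags.append("non-positive split return")
    # Holdout far weaker than in-sample suggests the edge does not persist.
    if split_returns["holdout"] < degradation_ratio * split_returns["in_sample"]:
        flags.append("holdout degradation")
    return flags


windows = split_validation_windows_sketch(list(range(8)), in_sample=0.5, out_of_sample=0.25)
print(windows)
print(validation_flags({"in_sample": 0.20, "out_of_sample": 0.10, "holdout": 0.02}))
```

The split never shuffles: each later window only contains bars that come after the earlier ones, which is what keeps the holdout unseen.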

## Scoring

Primary scoring is computed by:

- `compute_score()`

The result is then adjusted by a validation risk penalty and written as:

- `raw_score`
- `final_score`
- `score`

Current behavior:

- `score` is kept for valid numeric runs, even when the credibility screen fails
- this allows fallback ranking and dashboard generation for screened-out candidates
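
To make the relationship between the three columns concrete, here is a sketch under stated assumptions: a linear, subtractive penalty with a hypothetical `penalty_weight`, and `score` equal to `final_score`. The actual formula in `compute_score()` and the penalty step may differ.

```python
def apply_risk_penalty(raw_score, overfitting_risk_score, penalty_weight=0.5):
    """Derive the three score columns from a raw score and a risk score."""
    # Assumed form: subtract a weighted risk score from the raw score.
    final_score = raw_score - penalty_weight * overfitting_risk_score
    return {
        "raw_score": raw_score,
        "final_score": final_score,
        "score": final_score,  # assumed to mirror final_score
    }


rows = [
    {"name": "A", "passed_screen": False, **apply_risk_penalty(1.0, 0.5)},
    {"name": "B", "passed_screen": False, **apply_risk_penalty(0.9, 0.0)},
]
# Because score is kept even when the screen fails, these rows can be ranked.
ranked = sorted(rows, key=lambda row: row["score"], reverse=True)
print([(row["name"], row["score"]) for row in ranked])
```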

## Dashboard and Report Generation

### Core Exports

Key functions:

- `save_dashboard_html()`
- `save_index_html()`
- `save_master_plots()`
- `export_top_strategies()`
- `export_strategy_file()`

### Generated Artifacts

Typical run folder contents:

- `strategies_report.csv`
- `strategies_report_with_meta.csv`
- `top30_by_score.csv`
- `credible_candidates.csv` or `ranked_candidates.csv`
- `screened_out_candidates.csv`
- `fallback_ranked_candidates.csv`
- `score_distribution.png`
- `risk_return_scatter.png`
- `dashboards/index.html`
- `dashboards/top_ranked_table.csv`
- `dashboards/assets/*.png`
- `reports/output/<run_name>/strategy_rank_*.py`

### Dashboard Fallback Mode

If no strategy passes the credibility screen:

- strongest screened-out candidates are still ranked
- fallback dashboards are generated
- metadata records `dashboard_mode = screened_out`

This behavior is implemented in `run_strategies()`.
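
The selection logic can be sketched as follows. Only the `dashboard_mode = screened_out` value comes from this manual; the `passed_screen` field, the `"credible"` label for the normal mode, and the function name are assumptions made for the example.

```python
def select_dashboard_candidates(rows, top_n=3):
    """Pick rows for dashboards, falling back to screened-out candidates."""
    passed = [row for row in rows if row["passed_screen"]]
    if passed:
        pool, mode = passed, "credible"  # hypothetical label for normal mode
    else:
        pool, mode = rows, "screened_out"
    ranked = sorted(pool, key=lambda row: row["score"], reverse=True)[:top_n]
    return ranked, {"dashboard_mode": mode}


rows = [
    {"name": "A", "score": 0.4, "passed_screen": False},
    {"name": "B", "score": 0.7, "passed_screen": False},
    {"name": "C", "score": 0.1, "passed_screen": False},
]
ranked, metadata = select_dashboard_candidates(rows, top_n=2)
print([row["name"] for row in ranked], metadata)
```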

## GUI Architecture

The GUI is a thin orchestration layer over the engine.

### Main Classes

#### `EmittingStream`

Redirects worker output into the log pane.
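
The general pattern is a file-like object whose `write()` forwards text to a callback. The sketch below uses only the standard library and a list as the log sink; in a PyQt5 application the callback would typically be the `emit` method of a `pyqtSignal(str)`. The class name and buffering behavior are assumptions, not a copy of the class in `strategy_gui.py`.

```python
import contextlib


class EmittingStreamSketch:
    """File-like object that forwards each complete line to a callback."""

    def __init__(self, on_line):
        self._on_line = on_line
        self._buffer = ""

    def write(self, text):
        # Buffer partial writes and emit only whole lines.
        self._buffer += text
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._on_line(line)
        return len(text)

    def flush(self):
        # Emit any trailing text that was not newline-terminated.
        if self._buffer:
            self._on_line(self._buffer)
            self._buffer = ""


log_lines = []
stream = EmittingStreamSketch(log_lines.append)
with contextlib.redirect_stdout(stream):
    print("downloading data")
    print("evaluated 10 strategies")
stream.flush()
```

Buffering until a newline arrives keeps the log pane from showing half-written lines when `print` issues the text and the line ending as separate writes.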

#### `MetricCard`

Small reusable metric widget for top-line run stats.

#### `ResearchWorker`

Runs the research engine on a background thread and emits:

- log lines
- completion payload
- failure tracebacks

#### `StrategyResearchWindow`

Builds the entire application window and handles:

- defaults
- validation of user input
- worker launch
- result-table population
- dashboard / CSV opening

## Configuration Model

`RunConfig` in `run_backtest_report.py` is the central configuration object.

Important fields:

- search controls
- backtest parameters
- validation settings
- weekly ranking settings
- export counts
- credibility screen toggles

`default_hourly_run_config()` is the best place to inspect current defaults.

## Scripted Usage

Use `example_random_discovery.py` as the simplest script entry point.

For custom automation:

1. construct a `RunConfig`
2. call `run_research_bundle()`
3. inspect the returned `metadata` and `report_df`

## Safe Extension Points

### Add a New Indicator

1. Implement it in `IndicatorCache`.
2. Return a pandas Series aligned to the input index.
3. Cache it by a stable key.

### Add a New Rule Phrase

1. Add the phrase to `conditions.py` or `combos.py`.
2. Add the matching regex and evaluation logic to `eval_condition()` in `rules.py`.
3. Test it through a small scripted run.

### Add a New Ranking Metric

1. Compute it inside `evaluate_strategy_window()` or aggregation helpers.
2. Store it on the result row.
3. Integrate it into `compute_score()` or screening functions as needed.
4. Add it to dashboards and CSVs if it should be user-visible.

### Add a New Output Artifact

Best insertion points:

- `run_research_bundle()` for run-level exports
- `run_strategies()` for candidate-level exports
- `save_dashboard_html()` for dashboard-level additions

## Common Gotchas

### 1. Registry and parser drift

Adding a phrase to the registry without adding parser support will break compilation.

### 2. TA-Lib availability

Many runtime issues come from missing TA-Lib binaries or unsupported environments.

### 3. History coverage

Long evaluation horizons are silently skipped when data coverage is insufficient.

### 4. Wide combo universes

Full grid mode can become large quickly. Use limits or random search when iterating.

### 5. No passing strategies

This does not necessarily mean the run failed. It may indicate that:

- the screen is strict
- the asset or horizon is unsuitable
- the rule universe is weak for the selected regime

Fallback dashboards exist specifically to help inspect those cases.

## Suggested Maintenance Practices

- keep rule phrases and parser support synchronized
- prefer adding metrics in the reporting layer, not the GUI layer
- treat `RunConfig` as the source of truth for defaults
- use small scripted runs before large GUI experiments
- keep exports deterministic when changing scoring logic

## Minimal Mental Model

If you need the shortest technical summary, it is this:

1. `conditions.py` defines what strategies can say.
2. `combos.py` decides which strategies get generated.
3. `rules.py` turns those phrases into boolean signals.
4. `backtest.py` converts signals into trades.
5. `run_backtest_report.py` scores, screens, ranks, and exports everything.
6. `strategy_gui.py` wraps the engine in a desktop interface.