KirikPapka/vol-pricing

Pricing Short-Dated U.S. Equity Options

This code is submitted as part of the assessment for the Economics Dissertation module (BEE3068).

Dissertation codebase for:
“Pricing Short-Dated U.S. Equity Options: A Comparative Study”
(Equities: AAPL, MSFT, NVDA, AMZN, GOOGL, META, JPM, XOM, TSLA; SPX as benchmark)

This repository contains the full research pipeline used in the dissertation:

  • Cleaning and merging OptionMetrics / underlying / macro data
  • Building forward prices (SOFR OIS + dividends)
  • Constructing volatility panels: market IV, GARCH, SABR, Neural SDE
  • Pricing American options via binomial, PDE, and Neural SDE Monte Carlo
  • Evaluation, model risk metrics, and trading-style diagnostics
  • Generating all figures (PNG) and animations (GIF) used in the thesis

Execution is orchestrated via VS Code tasks in .vscode/tasks.json. There is no command-line driver; src/main.py is only an optional helper.


1. Repository Structure

High-level layout (non-exhaustive):

.
├── README.md
├── requirements.txt
├── pytest.ini
├── config/
│   └── config.yaml         # Global settings (tickers, dates, buckets, etc.)
├── data/
│   ├── README.md           # Notes on raw/processed data locations
│   ├── raw/                # WRDS / Bloomberg / FRED / etc. exports (NOT in git)
│   ├── interim/            # Intermediate cleaned files
│   └── processed/          # Final research panels, IV surfaces, eval tables
│       ├── panel*.parquet
│       ├── iv_surfaces/
│       ├── iv_true/
│       ├── eval/
│       └── (other helper CSV/Parquet files)
├── outputs/
│   └── figures/            # All PNG/GIF figures used in the dissertation
│       ├── iv_smiles/
│       ├── iv_surfaces/
│       ├── surface3d_png/
│       ├── smile_gif/
│       ├── surface_gif/
│       ├── smiles_gif/
│       └── atm_iv_vs_vix/  # ATM IV vs VIX per ticker
├── cpp/
│   ├── CMakeLists.txt
│   ├── binomial.cpp        # EEP pricing (C++ / pybind-style DLL)
│   ├── neural_mc.cpp       # Neural SDE Monte Carlo accelerator (C++)
│   └── build/
│       ├── libbinomial.dylib
│       └── libneural_mc.dylib
├── fortran/
│   ├── heston_pde.f90      # Heston PDE solver
│   ├── heston_pde_mod.mod
│   └── libheston_pde.dylib # Shared library used from Python
├── src/
│   ├── __init__.py
│   ├── main.py             # (optional) helper; core work via module entrypoints
│   ├── cleaning.py         # Raw OptionMetrics -> cleaned panel
│   ├── rates_divs.py       # SOFR OIS curve, forwards, dividend/earnings flags
│   ├── eep.py              # Early exercise premium utilities
│   ├── eval.py             # Core evaluation, risk metrics, PnL tables
│   ├── utils.py            # Logging, config loading, shared helpers
│   ├── native.py           # Thin wrappers around C++/Fortran libraries
│   ├── viz.py              # Summary figures (core performance plots)
│   ├── viz_atm_iv_vs_vix.py
│   ├── viz_iv_smiles.py
│   ├── viz_iv_surfaces.py
│   ├── viz_iv_surfaces_3d.py
│   ├── viz_iv_animation.py
│   ├── viz_iv_surface_animation.py
│   ├── viz_model_smiles_animation.py
│   └── models/
│       ├── garch.py                # GARCH(1,1) volatility panel
│       ├── sabr.py                 # SABR volatility panel
│       ├── market_iv.py            # Market-implied IV + basic EEP for benchmark
│       ├── neural_sde.py           # Neural SDE training and path simulation
│       ├── pricing_binomial.py     # Binomial + C++ lib, EEP corrections
│       ├── pricing_garch.py        # American pricing under GARCH vol panel
│       ├── pricing_sabr.py         # American pricing under SABR vol panel
│       ├── pricing_neural_sde.py   # Neural SDE American pricing
│       ├── heston_calib.py         # (optional) Heston calibration
│       ├── heston_pde.py           # Heston PDE pricing using Fortran DLL
│       ├── iv_surfaces.py          # IV surface panel construction
│       ├── iv_true_vs_panel.py     # True vs panel IV comparison
│       ├── regime_buckets.py       # Volatility regimes (VIX-based)
│       ├── early_exercise_region.py # EEP sign/size analysis, regions in (K,T)
│       ├── model_cost_benchmark.py # Runtime / efficiency benchmarks
│       ├── model_pnl_ranking.py    # Hedged PnL style ranking
│       └── tail_risk_eval.py       # ES / underpricing tail metrics
└── tests/
    ├── test_schema.py
    ├── test_data_integrity.py
    ├── test_forward.py
    ├── test_eep.py
    ├── test_heston_cos.py          # Not used
    ├── test_models_panels.py
    ├── test_model_relations.py
    ├── test_additional_outputs.py
    └── test_smoke.py

2. Environment & Data Setup

2.1 Python environment

Inside the repo root:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

The VS Code tasks assume the Python interpreter lives at:

${workspaceFolder}/.venv/bin/python

If your environment path differs, update .vscode/tasks.json accordingly.

2.2 Data directories and environment variables

Expected directory layout under data/:

data/
  raw/          # WRDS, Bloomberg, FRED etc. exports (not tracked in git)
  interim/      # automatically filled with intermediate cleaned data
  processed/    # final research panels, IV surfaces, evaluation tables

For local runs, set:

export DATA_DIR="<absolute-path-to>/data"
export OUT_DIR="<absolute-path-to>/outputs"

config/config.yaml stores the core parameter choices (tickers, sample period, DTE buckets, etc.). Adjust it if you want to rerun the pipeline on a different sample or equity subset.
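
A minimal sketch of how a script might pick up these two variables is shown below. The helper name resolve_dirs and the fallback to data/ and outputs/ under the repo root are assumptions made for illustration; the repository's own logic in src/utils.py may differ.

```python
import os
from pathlib import Path


def resolve_dirs(repo_root="."):
    """Return (data_dir, out_dir), preferring DATA_DIR / OUT_DIR when they are set.

    Hypothetical helper: the fallback to <repo_root>/data and <repo_root>/outputs
    is an assumption, not documented behaviour of the repository.
    """
    root = Path(repo_root).resolve()
    data_dir = Path(os.environ.get("DATA_DIR", root / "data"))
    out_dir = Path(os.environ.get("OUT_DIR", root / "outputs"))
    return data_dir, out_dir


data_dir, out_dir = resolve_dirs()
print("processed panels:", data_dir / "processed")
print("figures:", out_dir / "figures")
```

Exporting absolute paths, as above, avoids surprises when a task is launched from a different working directory.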


3. Running the Pipeline via VS Code Tasks

The project is driven through VS Code tasks defined in .vscode/tasks.json.
Each task calls a specific Python module (or build tool) in the correct order.

To run a task:

  1. Open the workspace in VS Code (folder root = this repo).
  2. Make sure the .venv is created and selected as the Python interpreter.
  3. Press Ctrl+Shift+B (Cmd+Shift+B on macOS), or open the Command Palette and select
    “Tasks: Run Task”.
  4. Choose the desired task label (e.g. 99-full-pipeline).

3.1 Core tasks (0–8)

0-env-show
    Quick sanity check: prints the Python version of .venv.

1-cleaning
    Entry point: src.cleaning
    - Loads raw OptionMetrics and underlying data from data/raw
    - Applies filters (DTE, moneyness, volume, OI, spread)
    - Outputs cleaned option panel(s) into data/interim and data/processed

2-rates-divs-forwards
    Entry point: src.rates_divs
    - Builds SOFR OIS discount curves
    - Integrates dividends and earnings dates
    - Produces forward-adjusted panel_with_forwards.parquet
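
To illustrate the forward construction in task 2, here is a rough sketch of a forward price with discrete cash dividends. It assumes a flat continuously compounded rate as a stand-in for the SOFR OIS curve; the function names and the flat-rate simplification are illustrative and not taken from src/rates_divs.py.

```python
import math


def discount_factor(rate, t):
    """Discount factor for a flat, continuously compounded rate (assumption)."""
    return math.exp(-rate * t)


def forward_price(spot, rate, maturity, dividends=()):
    """Forward = (spot - PV of dividends paid before maturity) / DF(maturity).

    dividends: iterable of (ex_time_in_years, cash_amount) pairs.
    """
    pv_divs = sum(
        amount * discount_factor(rate, t)
        for t, amount in dividends
        if 0.0 < t <= maturity
    )
    return (spot - pv_divs) / discount_factor(rate, maturity)


# 30-day forward with one small dividend going ex in 10 days
print(forward_price(100.0, 0.05, 30 / 365, dividends=[(10 / 365, 0.25)]))
```

With a full OIS curve, discount_factor would interpolate the curve at t instead of using a single rate.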

3-market-iv-american
    Entry point: src.models.market_iv
    - Inverts BS to obtain market-implied IV
    - Applies simple American adjustment where relevant
    - Produces panel_market_iv_american.parquet (benchmark)
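
The inversion step in task 3 can be sketched as a simple bisection on the option price. The version below uses the Black-76 form on the forward, which is an assumption chosen to match the forward-adjusted panel; the actual solver and the American adjustment in src/models/market_iv.py are not shown here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_call(forward, strike, t, vol, df=1.0):
    """European call price in the Black-76 form (priced off the forward)."""
    sd = vol * math.sqrt(t)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return df * (forward * norm_cdf(d1) - strike * norm_cdf(d2))


def implied_vol(price, forward, strike, t, df=1.0, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection on vol; returns NaN if the price is outside the bracket."""
    if not black_call(forward, strike, t, lo, df) <= price <= black_call(forward, strike, t, hi, df):
        return float("nan")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_call(forward, strike, t, mid, df) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


quote = black_call(100.0, 105.0, 0.1, 0.30)
print(implied_vol(quote, 100.0, 105.0, 0.1))
```

Bisection is slower than Newton's method but cannot diverge, which is convenient for short-dated, low-vega quotes.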

4-garch-vol-panel
    Entry point: src.models.garch
    - Estimates GARCH(1,1) per ticker
    - Produces panel_garch.parquet with conditional volatility forecasts
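
For reference, the GARCH(1,1) conditional variance follows the recursion sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]. The sketch below only filters variances for given parameters; parameter estimation is omitted, and initialising at the unconditional variance is an assumption rather than the choice made in src/models/garch.py.

```python
import math


def garch11_variance(returns, omega, alpha, beta):
    """Filter GARCH(1,1) conditional variances for fixed parameters.

    Returns a list of length len(returns) + 1; the last entry is the
    one-step-ahead variance forecast after the final observed return.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be < 1 for a finite unconditional variance")
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


daily_returns = [0.004, -0.012, 0.007, -0.021, 0.003]
path = garch11_variance(daily_returns, omega=1e-6, alpha=0.10, beta=0.85)
print("next-day vol (annualised):", math.sqrt(252 * path[-1]))
```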

5-sabr-vol-panel
    Entry point: src.models.sabr
    - Calibrates SABR to cross-sections
    - Produces panel_sabr.parquet
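
As an illustration of what a SABR calibration fits, the sketch below evaluates the widely used Hagan et al. (2002) lognormal implied-volatility approximation for given parameters. Whether src/models/sabr.py uses this exact formula is an assumption; the function name and argument order are illustrative.

```python
import math


def sabr_lognormal_vol(f, k, t, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal SABR implied-vol approximation."""
    omb = 1.0 - beta
    log_fk = math.log(f / k)
    fk_pow = (f * k) ** (0.5 * omb)
    z = (nu / alpha) * fk_pow * log_fk
    if abs(z) < 1e-12:
        z_over_x = 1.0  # at-the-money limit of z / x(z)
    else:
        x = math.log((math.sqrt(1.0 - 2.0 * rho * z + z * z) + z - rho) / (1.0 - rho))
        z_over_x = z / x
    denom = fk_pow * (1.0 + omb**2 / 24.0 * log_fk**2 + omb**4 / 1920.0 * log_fk**4)
    correction = 1.0 + (
        omb**2 / 24.0 * alpha**2 / fk_pow**2
        + rho * beta * nu * alpha / (4.0 * fk_pow)
        + (2.0 - 3.0 * rho**2) / 24.0 * nu**2
    ) * t
    return alpha / denom * z_over_x * correction


# One smile slice: forward 100, 30 days to expiry, illustrative parameters
for strike in (90.0, 100.0, 110.0):
    print(strike, sabr_lognormal_vol(100.0, strike, 30 / 365, 0.25, 1.0, -0.4, 1.5))
```

A calibration would minimise the squared difference between these model vols and the market IVs in each cross-section.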

6-neural-sde-train
    Entry point: src.models.neural_sde
    - Trains neural SDE model(s)
    - Saves weights: data/neural_sde_state*.pt
    - Produces a basic neural volatility/panel output

7-build-cpp-libs
    Command:
        cmake --build cpp/build --config Release -j8
    - Builds libbinomial.dylib and libneural_mc.dylib
    - Required for fast binomial/MC pricing

8-pricing-binomial-eep
    Entry point: src.models.pricing_binomial
    - Uses the C++ binomial library
    - Computes early exercise premia and sanity checks
    - Produces panel_*_binom_*.parquet and related diagnostics
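
The early exercise premium (EEP) in task 8 is the difference between the American and European values of the same option. The pure-Python Cox-Ross-Rubinstein tree below is a sketch of that idea only: it ignores dividends and is far slower than the C++ library, and its function names are hypothetical.

```python
import math


def crr_put(spot, strike, rate, vol, t, steps=200, american=True):
    """Cox-Ross-Rubinstein binomial put price (no dividends)."""
    dt = t / steps
    u = math.exp(vol * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-rate * dt)
    p = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability
    # Terminal payoffs; index j counts the number of up moves
    values = [max(strike - spot * u**j * d ** (steps - j), 0.0) for j in range(steps + 1)]
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            value = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                value = max(value, strike - spot * u**j * d ** (i - j))
            values[j] = value
    return values[0]


def early_exercise_premium(spot, strike, rate, vol, t, steps=200):
    """American minus European put value on the same tree."""
    amer = crr_put(spot, strike, rate, vol, t, steps, american=True)
    euro = crr_put(spot, strike, rate, vol, t, steps, american=False)
    return amer - euro


print(early_exercise_premium(90.0, 100.0, 0.05, 0.2, 0.25))
```

A useful sanity check of the kind mentioned above: with a zero interest rate and no dividends, the put's EEP should vanish up to rounding.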

3.2 American pricing tasks (9–13)

9-build-heston-fortran-lib
    Command:
        gfortran -O3 -shared -fPIC heston_pde.f90 -o libheston_pde.dylib
    - Compiles the Heston PDE solver to a shared library

10-pricing-heston-american
    Entry point: src.models.heston_pde
    - Calls the Fortran DLL to price under Heston
    - Produces panel_heston_american.parquet
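
Because task 10 depends on the library built in task 9, it can help to check that the shared library loads before starting a long run. The sketch below uses ctypes for this; whether src/native.py itself uses ctypes is an assumption, and no exported function names or signatures are shown because they are not documented here.

```python
import ctypes
from pathlib import Path


def load_native(path):
    """Return a ctypes.CDLL handle, or None if the library is missing or unloadable.

    Hypothetical helper for a pre-flight check; not part of the repository.
    """
    path = Path(path)
    if not path.exists():
        return None
    try:
        return ctypes.CDLL(str(path))
    except OSError:
        return None


lib = load_native("fortran/libheston_pde.dylib")
print("native Heston solver available:", lib is not None)
```

If the check prints False, re-run 9-build-heston-fortran-lib before the pricing task.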

11-pricing-garch-american
    Entry point: src.models.pricing_garch
    - Uses vol from panel_garch.parquet + EEP logic
    - Produces panel_garch_american.parquet

12-pricing-sabr-american
    Entry point: src.models.pricing_sabr
    - Uses SABR vol panel and binomial/EEP machinery
    - Produces panel_sabr_american.parquet

13-pricing-neural-sde-american
    Entry point: src.models.pricing_neural_sde
    - Uses Neural SDE paths (and, where relevant, neural_mc C++ lib)
    - Produces panel_neural_sde_american.parquet

Hint: once the volatility panels are built, you can re-run any pricing task individually after changing a parameter.


3.3 Evaluation, surfaces, regimes, visualisation, and tests (14–28)

14-eval-core
    Entry point: src.eval
    - Aggregates pricing errors into eval_by_* CSVs
    - Computes eval_overall.csv, large error lists,
      tail_risk_metrics.csv, and model_cost_benchmark.csv (numeric part)
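
The headline error metrics reported by the evaluation (RMSE, MAE, MAPE) can be sketched as follows. The function name, the error convention (model minus market), and the choice to skip zero market prices in MAPE are assumptions for illustration, not a description of src/eval.py.

```python
import math


def error_metrics(model_prices, market_prices):
    """Return RMSE, MAE and MAPE of model prices against market prices."""
    errors = [m - y for m, y in zip(model_prices, market_prices)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined for zero market prices, so those rows are skipped here
    pct = [abs(e) / abs(y) for e, y in zip(errors, market_prices) if y != 0]
    mape = sum(pct) / len(pct) if pct else float("nan")
    return {"rmse": rmse, "mae": mae, "mape": mape}


print(error_metrics([1.1, 2.0, 2.7], [1.0, 2.0, 3.0]))
```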

15-iv-surfaces-panel
    Entry point: src.models.iv_surfaces
    - Builds IV surfaces and panel across (K/F, T) for each ticker
    - Outputs iv_surface_<TICKER>.parquet and iv_panel_all.parquet

16-iv-true-vs-panel
    Entry point: src.models.iv_true_vs_panel
    - Compares “true” IV (from market) with panel reconstructions
    - Outputs iv_true_vs_panel.* under data/processed/iv_true/

17-regime-buckets
    Entry point: src.models.regime_buckets
    - Creates volatility regimes based on VIX levels
    - Produces daily regime tagging: regime_daily.* in src/data/processed/regimes
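
A level-based regime tag of the kind task 17 produces could look like the sketch below. The thresholds (15 and 25) and the labels are purely hypothetical placeholders; the actual cut-offs are whatever src/models/regime_buckets.py and config/config.yaml define.

```python
from bisect import bisect_right

# Hypothetical VIX cut-offs and labels, for illustration only
VIX_EDGES = (15.0, 25.0)
REGIME_LABELS = ("low", "mid", "high")


def vix_regime(vix_level):
    """Map a VIX close to a regime label; a value on an edge goes to the higher bucket."""
    return REGIME_LABELS[bisect_right(VIX_EDGES, vix_level)]


daily_vix = {"2024-01-02": 13.2, "2024-01-03": 18.7, "2024-01-04": 31.5}
print({day: vix_regime(level) for day, level in daily_vix.items()})
```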

18-viz-core
    Entry point: src.viz
    - Produces core summary plots:
      overall RMSE/MAE/MAPE, EEP behaviour, cost vs error, etc.

19-viz-iv-smiles
    Entry point: src.viz_iv_smiles
    - Static IV smile plots at selected dates and regimes

20-viz-iv-surfaces-2d
    Entry point: src.viz_iv_surfaces
    - 2D surface heatmaps in (moneyness, DTE) for each ticker

21-viz-iv-surfaces-3d
    Entry point: src.viz_iv_surfaces_3d
    - 3D surface plots (surface3d_png/)

22-viz-iv-smile-gifs
    Entry point: src.viz_iv_animation
    - Animated IV smiles over time (smile_gif/)

23-viz-iv-surface-gifs
    Entry point: src.viz_iv_surface_animation
    - Animated IV surfaces (surface_gif/)

24-viz-model-smiles-gifs
    Entry point: src.viz_model_smiles_animation
    - GIFs comparing model vs market smiles (smiles_gif/)

25-viz-atm-iv-vs-vix
    Entry point: src.viz_atm_iv_vs_vix
    - ATM IV vs VIX scatter/time-series by ticker (atm_iv_vs_vix/)

26-model-pnl-ranking
    Entry point: src.models.model_pnl_ranking
    - Simple hedged PnL-based ranking and supporting figures

27-model-cost-benchmark
    Entry point: src.models.model_cost_benchmark
    - Runtime benchmarking and cost-performance trade-off plots
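
A runtime measurement for the cost-performance comparison can be as simple as the sketch below, which reports the median wall-clock time over a few repeats. The helper name time_call and the use of the median are illustrative assumptions, not the repository's benchmarking code.

```python
import statistics
import time


def time_call(fn, *args, repeats=5, **kwargs):
    """Median wall-clock seconds of fn(*args, **kwargs) over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload; in practice fn would be one model's pricing routine
print("median seconds:", time_call(sum, range(100_000)))
```

The median is less sensitive than the mean to a single slow run caused by, for example, a cold cache.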

28-tests
    Command:
        ${workspaceFolder}/.venv/bin/python -m pytest -q
    - Runs the full pytest suite in tests/

3.4 Composite tasks

For convenience:

99-full-pipeline
    dependsOn:
        1-cleaning
        2-rates-divs-forwards
        3-market-iv-american
        4-garch-vol-panel
        5-sabr-vol-panel
        6-neural-sde-train
        7-build-cpp-libs
        8-pricing-binomial-eep
        9-build-heston-fortran-lib
        10-pricing-heston-american
        11-pricing-garch-american
        12-pricing-sabr-american
        13-pricing-neural-sde-american
        14-eval-core
        15-iv-surfaces-panel
        16-iv-true-vs-panel
        17-regime-buckets
        18-viz-core
        19-viz-iv-smiles
        20-viz-iv-surfaces-2d
        21-viz-iv-surfaces-3d
        22-viz-iv-smile-gifs
        23-viz-iv-surface-gifs
        24-viz-model-smiles-gifs
        25-viz-atm-iv-vs-vix
        26-model-pnl-ranking
        27-model-cost-benchmark
        28-tests

    Run this when you have raw data in place and want to regenerate
    everything, from raw-data cleaning all the way to final figures
    and tests.

98-semifull-pipeline
    dependsOn:
        14-eval-core
        15-iv-surfaces-panel
        16-iv-true-vs-panel
        17-regime-buckets
        18-viz-core
        19-viz-iv-smiles
        20-viz-iv-surfaces-2d
        21-viz-iv-surfaces-3d
        22-viz-iv-smile-gifs
        23-viz-iv-surface-gifs
        24-viz-model-smiles-gifs
        25-viz-atm-iv-vs-vix
        26-model-pnl-ranking
        27-model-cost-benchmark
        28-tests

    Use this if all pricing panels already exist (e.g. after a previous
    full run) and you want to regenerate evaluation + visualisations
    without redoing GARCH/SABR/Neural SDE or raw data cleaning.

4. Running Without VS Code (Optional)

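Most tasks simply invoke Python modules; the exceptions are the environment check (0), the two native builds (7 and 9), and the test run (28). To run the modules manually from the shell: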

source .venv/bin/activate
python -m src.cleaning
python -m src.rates_divs
python -m src.models.garch
python -m src.models.pricing_garch
python -m src.eval
python -m src.viz

and so on, following the same order as defined in .vscode/tasks.json.
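
If you prefer a single script to typing each command, a small driver along the following lines would work. It is a sketch only: the module list mirrors the entry points of tasks 1 to 14 listed in this README (skipping the two native builds), and the names MODULES, build_commands and run_all are hypothetical rather than part of the repository.

```python
import subprocess
import sys

# Entry points in task order; tasks 7 and 9 (native builds) are not Python modules
MODULES = [
    "src.cleaning",
    "src.rates_divs",
    "src.models.market_iv",
    "src.models.garch",
    "src.models.sabr",
    "src.models.neural_sde",
    "src.models.pricing_binomial",
    "src.models.heston_pde",
    "src.models.pricing_garch",
    "src.models.pricing_sabr",
    "src.models.pricing_neural_sde",
    "src.eval",
]


def build_commands(modules=MODULES, python=sys.executable):
    """Return one `python -m <module>` command per pipeline stage."""
    return [[python, "-m", module] for module in modules]


def run_all(modules=MODULES):
    """Run each stage in order and stop at the first failure."""
    for cmd in build_commands(modules):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Call run_all() from the repo root with .venv activated. The native libraries from tasks 7 and 9 need to be built beforehand, since the pricing stages depend on them.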


5. Testing and Quality Control

You can either:

  • Run the 28-tests task inside VS Code, or

  • From the repo root (with .venv activated):

    pytest -q
    

The test suite checks:

  • Data schema & integrity
  • Forwards and discount factors
  • EEP convergence and sign in typical regions
  • Heston vs Black–Scholes limiting behaviour
  • Consistency between model panels (GARCH vs SABR vs market IV)
  • Existence and basic properties of final outputs (tables + figures)

6. Reproducibility Checklist

  1. Clone the repository.
  2. Create .venv and run pip install -r requirements.txt.
  3. Place data exports under data/raw as described in data/README.md.
  4. Set DATA_DIR and OUT_DIR environment variables.
  5. Adjust config/config.yaml if you want a different sample or asset set.
  6. In VS Code, open the folder, select .venv as the interpreter.
  7. Run task 99-full-pipeline.
  8. All final panels, evaluation tables, and figures should now be available in data/processed/ and outputs/figures/.
  9. Optionally, re-run 28-tests manually to verify the environment.

7. License

The codebase is intended to be released under the MIT License (or the licence specified alongside the dissertation submission). Vendor data (WRDS, Bloomberg, FRED, etc.) remain subject to their respective licences and are not distributed in this repository.


8. Contact

Author: Kirill Papka

Supervisor: Giuseppe Cavaliere

Institution: University of Exeter

For questions about replication, extensions, or potential collaborations, please contact the author via university email: kp604@exeter.ac.uk
