January 20, 2025

Crypto Pairs Trading: Part 3 — Constructing Your Strategy with Logs, Hedge Ratios, and Z-Scores

Welcome to Part 3 of the Crypto Pairs Trading series. Building on the foundational concepts in Part 1 (Foundations of Moving Beyond Correlation) and the statistical validations in Part 2 (Verifying Mean Reversion with ADF and Hurst Tests), we now move from theory to practice: applying log transformations, estimating hedge ratios, and setting Z-score thresholds to turn your data into concrete trading signals. In one line: we transform raw statistical findings into a tactical, data-driven framework that guides when to enter and exit your trades.

Selecting Pairs and Constructing the Strategy

Having covered the theoretical backbone—correlation vs. cointegration, the necessity of stationarity, and the importance of using metrics like the ADF test and the Hurst exponent—we now bring it all together in practice. The goal is to transition from broad theory to actionable steps. This involves selecting pairs that not only look cointegrated on paper but also exhibit stable mean-reverting behavior when traded in live markets, particularly the volatile crypto environment.

From Data Acquisition to Strategy Execution

Our computational framework, as reflected in the provided Jupyter Notebook, embodies the principles discussed so far. Let’s walk through each logical step to show how the theory translates into a trading algorithm:

Data Acquisition:
Before any statistical test can be performed, we need historical data. The code fetches historical daily OHLC data for multiple crypto assets (e.g., BTC/USDT, ETH/USDT, LTC/USDT) from chosen exchanges. It structures this data into a Pandas DataFrame, ensuring proper timestamp alignment and data cleaning. The code handles pagination from the API and ensures that duplicate records are dropped and the dataset is consistently indexed by time.
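As a rough illustration of the cleaning step, the sketch below merges paginated records into a single time-indexed DataFrame and drops duplicate candles. The function name `build_price_frame`, the record layout, and the sample values are assumptions made for this example; the notebook's actual API client and field names may differ.

```python
import pandas as pd

def build_price_frame(pages):
    """Combine paginated OHLC records into one time-indexed DataFrame."""
    # Flatten the list of pages into a single list of records.
    records = [row for page in pages for row in page]
    df = pd.DataFrame.from_records(records)
    # Parse timestamps as UTC and use them as a sorted index.
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
    df = df.set_index("timestamp").sort_index()
    # Overlapping pages can return the same candle twice; keep the last copy.
    return df[~df.index.duplicated(keep="last")]

# Hypothetical pages with made-up values; the two pages overlap on 2024-01-02.
pages = [
    [
        {"timestamp": "2024-01-01", "open": 100.0, "high": 105.0, "low": 99.0, "close": 104.0},
        {"timestamp": "2024-01-02", "open": 104.0, "high": 106.0, "low": 101.0, "close": 102.0},
    ],
    [
        {"timestamp": "2024-01-02", "open": 104.0, "high": 106.0, "low": 101.0, "close": 102.0},
        {"timestamp": "2024-01-03", "open": 102.0, "high": 103.0, "low": 98.0, "close": 99.0},
    ],
]
prices = build_price_frame(pages)
print(prices)
```

Deduplicating on the index rather than on full rows is a design choice here: it also removes a candle that was re-sent with slightly revised values.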
Preparing the Data:
Once the data is collected, we typically work with closing prices and transform them using logarithms. The code takes logs of the price series to stabilize variance and enhance the linearity of relationships between assets. This helps ensure that the regression and cointegration tests produce meaningful and stable results, rather than being skewed by raw price scales or volatility outliers.
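A small sketch of the log transform, using made-up closing prices, shows why it removes the effect of raw price scale: an identical 5% move produces the same log difference whether the asset trades at 40,000 or 2,000. The column names and values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Made-up closing prices at very different scales.
closes = pd.DataFrame({
    "BTC": [40000.0, 42000.0, 41000.0],
    "ETH": [2000.0, 2100.0, 2050.0],
})

# Natural log of each price; differences of logs are log returns.
log_prices = np.log(closes)
log_returns = log_prices.diff().dropna()

# Both assets rose 5% on the first step, so their log returns match.
print(log_returns)
```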
Statistical Filtering and Cointegration Testing:
This is where we confirm that any apparent relationship isn’t just spurious correlation. The code uses a cointegration test (Engle-Granger methodology) to determine whether a combination of two assets yields a stationary spread. Specifically:
- Cointegration Test: The Engle-Granger test regresses one logged price series on the other and checks the residuals for a unit root. Its p-value measures the evidence against the null hypothesis of no cointegration; we look for a p-value below a chosen significance threshold (commonly 5%).
- Hedge Ratio Estimation: If cointegration is confirmed, the code fits a regression model to find the hedge ratio, which sets how much of one asset is held against the other. Because the regression runs on log prices, the ratio describes relative (percentage) exposure rather than a raw unit count. This ratio is crucial for constructing a stable, mean-reverting spread.
- Stationarity Check: The resulting spread (e.g., log(Asset1) - Hedge_Ratio * log(Asset2)) is tested for stationarity with the Augmented Dickey-Fuller (ADF) test. If the ADF test rejects a unit root and the Hurst exponent indicates mean-reversion tendencies (H < 0.5), we have evidence of a genuine equilibrium-based relationship. This ensures the spread is not merely drifting but truly “snaps back” to its mean over time.
Z-Score Thresholds and Signal Generation:
After identifying cointegrated pairs and confirming that their spreads are stationary and mean-reverting, the code calculates the spread’s rolling mean and standard deviation. With these in hand, it derives a Z-score for the spread:
- Z-Score Calculation: The Z-score tells us how many standard deviations the current spread deviates from its historical mean. For instance, a Z-score above +2 or +3 might indicate the spread is too wide, suggesting the “overvalued” asset should be shorted and the “undervalued” one should be bought, expecting them to converge.
- Trading Signals:
  - If Z-score > +Z_threshold (e.g., +3), the program generates a sell signal for the spread: short Asset1 (the relatively overvalued leg) and long Asset2 (the relatively undervalued leg).
  - If Z-score < -Z_threshold, it generates a buy signal for the spread: long Asset1 (now the relatively undervalued leg) and short Asset2.
Positions are exited when the spread reverts and the Z-score moves back toward zero, capturing the profit from the convergence.
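A minimal sketch of the Z-score and entry-signal logic, assuming a 30-bar rolling window and a threshold of 3 (both illustrative values, as is the function name `zscore_signals`). The synthetic spread contains one upward and one downward spike so that both signal types appear.

```python
import numpy as np
import pandas as pd

def zscore_signals(spread, window=30, entry_z=3.0):
    """Rolling Z-score of the spread and entry signals (-1 short, +1 long, 0 none)."""
    rolling_mean = spread.rolling(window).mean()
    rolling_std = spread.rolling(window).std()
    z = (spread - rolling_mean) / rolling_std
    # Comparisons against NaN (the warm-up period) are False, so those bars get 0.
    signal = pd.Series(
        np.where(z > entry_z, -1, np.where(z < -entry_z, 1, 0)),
        index=spread.index,
    )
    return z, signal

# Synthetic spread: small noise with one spike up and one spike down.
rng = np.random.default_rng(0)
spread = pd.Series(rng.normal(0.0, 0.01, 200))
spread.iloc[60] = 1.0
spread.iloc[150] = -1.0

z, signal = zscore_signals(spread)
print(signal[signal != 0])
```

Note that the rolling window here includes the current bar, which dampens the Z-score of a sudden spike. A variant that shifts the rolling statistics by one bar would compare each observation only against past data; which convention the notebook uses is not stated in the text.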
Practical Considerations in Crypto Markets:
Crypto markets can be highly volatile, and relationships that appear stable can break down if the underlying fundamentals or broader market regimes change. Therefore, the code imposes additional sanity checks:
- Minimum Data Requirements: The code ensures sufficient historical data points (e.g., at least 60 days) to reduce noise.
- Hedge Ratio Bounds: Extremely large or tiny hedge ratios are filtered out to avoid unrealistic trades.
- Stop Losses and Take Profits: Risk management parameters—like a stop loss at a certain percentage loss and a take profit at a certain gain—are integrated into the backtesting logic to shield against sudden market shifts.
These constraints ensure that even after identifying a statistically sound pair, the trading conditions remain practical, controlled, and resilient against extreme market moves.
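The first two sanity checks can be expressed as a simple filter. The 60-observation minimum comes from the text; the hedge-ratio bounds of 0.1 and 10 and the function name are placeholder assumptions, since the actual limits are not given.

```python
def passes_sanity_checks(n_obs, hedge_ratio, min_obs=60, ratio_bounds=(0.1, 10.0)):
    """Return True if a pair has enough history and a plausible hedge ratio."""
    low, high = ratio_bounds
    enough_data = n_obs >= min_obs
    # Require a positive ratio inside the bounds; a negative ratio would put
    # both legs on the same side and would not be a hedge.
    sensible_ratio = low <= hedge_ratio <= high
    return enough_data and sensible_ratio

print(passes_sanity_checks(365, 0.8))    # enough data, ratio in range
print(passes_sanity_checks(30, 0.8))     # too little history
print(passes_sanity_checks(365, 250.0))  # ratio too extreme
```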

Bringing It Together: Selecting Pairs in Crypto

Applying these steps to a broad set of crypto pairs might look like this in practice:

Data Gathering: Pull daily data for a wide range of crypto assets quoted in USDT (e.g., BTC/USDT, ETH/USDT, LTC/USDT).
Filtering and Testing: For every pair combination (e.g., BTC/USDT vs. ETH/USDT), run the cointegration tests. Only pairs passing the stringent cointegration p-value threshold and producing a stationary, mean-reverting spread are retained.
Hedge Ratio and Confidence Checks: Derive a stable hedge ratio for each selected pair. Confirm this ratio leads to a stationary spread by re-checking the ADF test. Inspect the Hurst exponent to ensure the spread displays actual mean reversion rather than randomness or trending behavior.
Signal Generation and Execution: With a confirmed stationary spread, calculate the spread’s rolling mean and standard deviation, compute Z-scores, and set thresholds. When the Z-score breaches the thresholds, generate buy or sell signals. As the code runs, it simulates trades, applies transaction costs, and tracks performance metrics like total return, Sharpe ratio, and drawdowns.
Monitoring and Re-Calibration: Over time, monitor selected pairs. If market conditions change, a once-stationary spread might lose its equilibrium. Regularly re-test cointegration, stationarity, and mean reversion parameters to ensure the strategy remains reliable.

Starting from historical price data, we narrow down the universe of crypto pairs to those that are statistically cointegrated and stationary, ensuring a robust foundation for mean-reversion trades. By leveraging Z-scores, we translate these statistical properties into clear entry and exit signals. The outcome is a carefully engineered pairs trading strategy that stands on the pillars of cointegration, stationarity, hedge ratio stability, and disciplined signal generation.

This integrated approach moves us from the theoretical underpinnings—why correlation alone isn’t enough and why we need cointegration and stationarity—into a fully realized, data-driven trading framework tailored for the dynamic crypto market.

Backtesting, Risk Management, and Transaction Costs

After establishing a robust theoretical foundation and a practical method for selecting pairs and generating signals, the next logical step is to ensure the viability of the strategy through backtesting. Backtesting involves simulating the strategy on historical data to evaluate how it would have performed under realistic market conditions. This process not only helps in assessing expected profitability, but also in identifying weaknesses, optimizing parameters, and understanding potential drawdowns.

Why Backtest?

No matter how compelling the theoretical rationale or how favorable the statistical tests, all strategies are incomplete without empirical validation. By applying your entry and exit rules to past price data, you gain insights into the strategy’s historical behavior. You can review how often trades triggered, how large profits and losses were, and how the strategy’s risk profile evolved over time. In short, backtesting helps you answer crucial questions before risking real capital.

Core Components of the Backtest:

Signal Simulation:
The backtest replays historical conditions day by day:
- Entry Conditions:
  When the spread’s Z-score surpasses the upper threshold, the strategy initiates a short spread trade. Conversely, if it falls below the lower threshold, the strategy goes long the spread.
- Exit Conditions:
  The trade is closed as the spread mean-reverts and the Z-score moves back toward zero, ideally capturing the profit from the convergence.
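The entry and exit rules can be sketched as a small state machine that walks through the Z-score series bar by bar. The exit band of 0.5 is an assumption (the text only says the Z-score moves "back toward zero"), and `positions_from_zscore` is a hypothetical helper name.

```python
import pandas as pd

def positions_from_zscore(z, entry_z=3.0, exit_z=0.5):
    """Replay a Z-score series and return the spread position held on each bar."""
    position = 0  # -1 = short spread, +1 = long spread, 0 = flat
    held = []
    for value in z:
        if pd.isna(value):
            pass  # no information on this bar; keep the current position
        elif position == 0:
            if value > entry_z:
                position = -1
            elif value < -entry_z:
                position = 1
        elif position == -1 and value <= exit_z:
            position = 0  # short spread has reverted
        elif position == 1 and value >= -exit_z:
            position = 0  # long spread has reverted
        held.append(position)
    return pd.Series(held, index=z.index)

z = pd.Series([0.0, 3.5, 2.0, 0.2, -3.2, -1.0, 0.0])
print(positions_from_zscore(z).tolist())
# → [0, -1, -1, 0, 1, 1, 0]
```

Using direction-aware exit conditions (rather than a single `abs(z) < exit_z` test) means a position still closes if the Z-score jumps straight through the exit band between two bars.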
Performance Metrics:
As the simulation runs, the code calculates key metrics:
- Profit and Loss (P/L):
  Tracks how much hypothetical profit (or loss) each trade generates over the test period.
- Sharpe Ratio:
  Gauges risk-adjusted return by comparing the strategy’s excess returns to its volatility. A higher Sharpe ratio suggests a more efficient strategy.
- Maximum Drawdown:
  Measures the largest peak-to-trough decline. Even a profitable strategy can have large drawdowns, testing an investor’s capacity to hold on through losing streaks.
- Win Rate:
  Percentage of winning trades helps verify how consistent the strategy might be, though it’s not the sole measure of success.
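The metrics above can be computed along these lines. This is a sketch under stated assumptions: returns are per-period simple returns, the Sharpe ratio is annualized with 365 periods per year (crypto trades every day, but the notebook's convention is not stated), and the risk-free rate defaults to zero.

```python
import numpy as np
import pandas as pd

def sharpe_ratio(returns, periods_per_year=365, risk_free=0.0):
    """Annualized Sharpe ratio of per-period returns."""
    excess = returns - risk_free / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std()

def max_drawdown(returns):
    """Largest peak-to-trough decline of the equity curve, as a negative fraction."""
    equity = (1 + returns).cumprod()
    # Treat the starting capital (1.0) as the first peak.
    peak = equity.cummax().clip(lower=1.0)
    return (equity / peak - 1).min()

def win_rate(trade_pnls):
    """Fraction of trades with a strictly positive P/L."""
    return float((np.asarray(trade_pnls) > 0).mean())

returns = pd.Series([0.10, -0.50, 0.20])
print(max_drawdown(returns))               # equity 1.10 -> 0.55, a 50% decline
print(win_rate([1.0, -1.0, 2.0, 3.0]))     # 3 winners out of 4
print(sharpe_ratio(pd.Series([0.01, 0.02, -0.005])))
```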

Risk Management Within the Backtest:

No trading strategy is complete without mechanisms to limit losses and secure gains. The code integrates straightforward but effective risk management measures:

Stop Losses:
Even a statistically robust mean-reversion strategy can face “outlier” events where the spread keeps moving against your position. To prevent runaway losses, a stop loss threshold is defined: if the open position's loss exceeds a set percentage, the position is closed. This ensures that a single bad trade won’t cripple the overall performance.
Take Profits:
When the spread reverts and reaches a favorable level, the strategy locks in gains by closing the position. This prevents lingering in trades that have already achieved their statistical objective and reduces the risk of giving back profits if the market reverses once more.
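One way to sketch these two exits is to track the approximate return of the open pair trade and compare it with fixed thresholds. The 5% stop loss and 10% take profit are placeholder values, both function names are hypothetical, and the return calculation uses the change in the log spread as an approximation of P/L per unit of notional in the first leg.

```python
import math

def pair_trade_return(entry_a, entry_b, price_a, price_b, hedge_ratio, direction):
    """Approximate trade return as the signed change in the log spread.

    direction is +1 for long spread (long A, short B) and -1 for short spread.
    """
    change = math.log(price_a / entry_a) - hedge_ratio * math.log(price_b / entry_b)
    return direction * change

def risk_exit(trade_return, stop_loss_pct=0.05, take_profit_pct=0.10):
    """Return 'stop_loss', 'take_profit', or None if the trade stays open."""
    if trade_return <= -stop_loss_pct:
        return "stop_loss"
    if trade_return >= take_profit_pct:
        return "take_profit"
    return None

# Asset A falls 10% while asset B is unchanged (illustrative prices).
long_spread = pair_trade_return(100.0, 50.0, 90.0, 50.0, hedge_ratio=1.0, direction=1)
short_spread = pair_trade_return(100.0, 50.0, 90.0, 50.0, hedge_ratio=1.0, direction=-1)
print(risk_exit(long_spread), risk_exit(short_spread))
```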

Accounting for Transaction Costs:

In live markets, trades aren’t free. Transaction costs (commissions, trading fees, slippage, and potential funding charges) can erode profits. Including these costs in the backtest gives a more realistic estimate of how the strategy might fare in practice. The code deducts a fraction of the trade value as costs, so a strategy that appears profitable in theory is tested for viability after execution expenses.

For instance:

Commissions: Exchange-based fees for executing trades.
Slippage: The difference between the expected price of a trade and the actual price due to market movements and order execution delays.

By modeling these frictions, the backtest aligns more closely with the real-world environment you’ll face when going live. A strategy that’s only marginally profitable before costs might actually underperform after including them—this early insight can save time, money, and frustration.
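A simple way to model these frictions is to charge a proportional cost on every change in position. The rates below (0.1% fee and 0.05% slippage per leg) are illustrative assumptions rather than values from the notebook, and `apply_costs` is a hypothetical helper.

```python
import pandas as pd

def apply_costs(gross_returns, positions, fee_rate=0.001, slippage_rate=0.0005, legs=2):
    """Subtract proportional trading costs whenever the position changes."""
    # Turnover is the absolute change in position; the first bar counts
    # as a trade if the strategy starts with a non-zero position.
    turnover = positions.diff().abs().fillna(positions.abs())
    # A pair trade touches two legs, each paying fee plus slippage.
    cost_per_unit = legs * (fee_rate + slippage_rate)
    return gross_returns - turnover * cost_per_unit

positions = pd.Series([0, 1, 1, 0])          # enter on bar 1, exit on bar 3
gross = pd.Series([0.0, 0.01, 0.02, 0.0])    # made-up gross returns
net = apply_costs(gross, positions)
print(net.tolist())
print(f"gross total={gross.sum():.4f}, net total={net.sum():.4f}")
```

In this toy case the round trip costs 0.6% in total, which turns a 3.0% gross result into 2.4% net. A strategy with thin per-trade edges can lose most of its return this way.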

Bringing It Together

Scenario Example

Imagine a backtest over two years of historical data for a selected pair you found cointegrated with a stable, mean-reverting spread. As the algorithm runs:

The program flags entry signals whenever the Z-score moves beyond ±3 (that is, the spread sits more than three standard deviations from its rolling mean).
It simulates order execution, charging a small percentage fee.
If the spread continues moving against you, it stops out at a predefined loss percentage.
Once the spread returns to equilibrium, it books profits, after factoring in slippage and transaction costs.
At the end of the simulation, you review metrics like annualized return, Sharpe ratio, and maximum drawdown.

If the results are promising—positive cumulative returns, a respectable Sharpe ratio, moderate drawdowns, and resilience against transaction costs—then the strategy passes a crucial pre-live test. If not, you may need to refine your parameters, impose tighter risk controls, or seek better cointegrated pairs.

From Theory to Reality

Backtesting and risk management bridge the gap between theoretical possibility and practical implementation. By simulating real-world frictions—transaction costs, losing streaks, and unexpected market moves—you ensure that your pairs trading approach isn’t just statistically sound, but also adaptable and robust. This thorough examination and stress testing of the strategy serve as a final quality check before deciding to deploy real capital in live crypto markets.

Now that your strategy blueprint is taking shape, you’re ready to see how it performs in actual market scenarios. In Part 4 — Empirical Results and Performance Analysis, we’ll present backtested empirical results, analyze risk-adjusted returns, and demonstrate how a cointegration-based pairs trading strategy can navigate the ever-shifting crypto landscape.

Enhance your strategy with real-world data. Reach out to Amberdata for cutting-edge crypto market insights and tools that support your journey from theory to execution.

Disclaimers

The information contained in this report is provided by Amberdata solely for educational and informational purposes. The contents of this report should not be construed as financial, investment, legal, tax, or any other form of professional advice. Amberdata does not provide personalized recommendations; any opinions or suggestions expressed in this report are for general informational purposes only.

Although Amberdata has made every effort to ensure the accuracy and completeness of the information provided, it cannot be held responsible for any errors, omissions, inaccuracies, or outdated information. Market conditions, regulations, and laws are subject to change, and readers should perform their own research and consult with a qualified professional before making any financial decisions or taking any actions based on the information provided in this report.

Past performance is not indicative of future results, and any investments discussed or mentioned in this report may not be suitable for all individuals or circumstances. Investing involves risks, and the value of investments can go up or down. Amberdata disclaims any liability for any loss or damage that may arise from the use of, or reliance on, the information contained in this report.

By accessing and using the information provided in this report, you agree to indemnify and hold harmless Amberdata, its affiliates, and their respective officers, directors, employees, and agents from and against any and all claims, losses, liabilities, damages, or expenses (including reasonable attorney’s fees) arising from your use of or reliance on the information contained herein.

Michael Marshall

Mike Marshall is Head of Research at Amberdata. He leads pioneering research initiatives at the forefront of blockchain and cryptocurrency analytics. Mike is a seasoned quantitative analyst with a 15-year track record in developing AI-driven trading algorithms and pioneering proprietary cryptocurrency strategies. His...