LEARN — ARTIFICIAL INTELLIGENCE

Machine Learning: How Models Weigh the Odds

Weighing the odds: probability of up vs down

Instead of asking “how much will it move?”, machine learning (here, a method called logistic regression) asks a simpler question: “up or down?” And it answers with a probability: 62% up, 38% down.

Where linear regression predicts a number, logistic regression predicts a probability. It outputs a value between 0% and 100% — how confident the model is that the next return will be positive.
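At its core, logistic regression passes a weighted sum of features through a sigmoid function, which squashes any number into the 0-100% range. A minimal sketch, using invented weights and feature values purely for illustration:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range,
    # so the output can be read as a probability.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical learned weights for two features
# (e.g. yesterday's return and distance from a moving average).
weights = np.array([1.8, -0.6])
bias = 0.1

features = np.array([0.012, -0.03])   # example feature values
p_up = sigmoid(weights @ features + bias)

print(f"P(next return > 0) = {p_up:.1%}")
```

Linear regression would stop at the weighted sum (a raw number); the sigmoid is what turns that number into a confidence level.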

This model is trained on 15 technical features per data point — the same indicators used in our Analysis Pro view: SMA & EMA at three timeframes, DMI trend strength (+DI, -DI, DX), the Trend Oscillator, plus OHLCV price action and volume patterns. These features combine widely studied technical analysis methodologies.

Up to 5 years of historical data feed the model, so it can learn from bullish and bearish periods alike.
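To make the idea concrete, here is how two of these features (daily log return and distance from a short SMA) might be computed; the prices below are synthetic stand-ins for real market data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic daily closes standing in for ~5 years of history (~1260 trading days).
close = 100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 1260)))

log_ret = np.diff(np.log(close))   # daily log return

def sma(x, n):
    # Simple moving average via a rolling mean (valid region only).
    return np.convolve(x, np.ones(n) / n, mode="valid")

sma_short = sma(close, 20)
# Distance of price from its short SMA, a typical trend feature.
sma_dist = close[19:] / sma_short - 1

print(log_ret[-1], sma_dist[-1])
```

The remaining indicators (EMA distances, DMI, the Trend Oscillator) follow the same pattern: each one condenses recent price action into a single number per day.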

This introduces an additional parameter: the confidence threshold. It controls how confident the model must be before the simulation assumes exposure.

At 50%, the simulation assumes exposure whenever the model considers it “more likely up than down.” At 80%, it only assumes exposure when the model is very confident. Higher thresholds mean fewer trades but potentially higher conviction per trade.
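The threshold logic itself is simple; the probabilities below are hypothetical, chosen to show how the two settings diverge:

```python
# Hypothetical model probabilities for five consecutive days.
probs = [0.52, 0.81, 0.47, 0.66, 0.93]

def position(p_up, threshold):
    # Assume exposure (1) only when confidence clears the threshold,
    # otherwise stay flat (0).
    return 1 if p_up >= threshold else 0

at_50 = [position(p, 0.50) for p in probs]   # trades on any edge
at_80 = [position(p, 0.80) for p in probs]   # trades only on high conviction

print(at_50)  # [1, 1, 0, 1, 1]
print(at_80)  # [0, 1, 0, 0, 1]
```

Same model, same probabilities — only the threshold changes which days carry exposure.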

How it works

  1. For each day, analyse 15 features: log return, price range, direction, volume, SMA & EMA distances (short/mid/long), DMI (+DI, -DI, DX), and the Trend Oscillator
  2. The model learns and outputs a probability between 0% and 100%
  3. If the probability exceeds the confidence threshold → the model estimates an upward move
  4. If below the threshold → the model stays flat
  5. The model is trained on part of the data, not all of it
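The steps above can be sketched end to end. This toy version uses synthetic returns, lagged returns in place of the 15 indicators, and a hand-rolled gradient-descent fit — a simplified stand-in, not the production model:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily log returns standing in for real price data.
returns = rng.normal(0.0003, 0.01, 1000)

# Step 1: build features — here just `lags` past returns per day.
lags = 5
X = np.column_stack([returns[i : len(returns) - lags + i] for i in range(lags)])
y = (returns[lags:] > 0).astype(float)   # 1 if the next return is positive

# Step 5: train on part of the data, evaluate on the rest.
split = int(0.7 * len(X))
X_tr, y_tr, X_te, y_te = X[:split], y[:split], X[split:], y[split:]

# Step 2: fit logistic regression by gradient descent (minimal sketch).
w, b = np.zeros(lags), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X_tr @ w + b)))
    w -= 0.1 * (X_tr.T @ (p - y_tr) / len(y_tr))
    b -= 0.1 * np.mean(p - y_tr)

# Steps 3-4: apply the confidence threshold out of sample.
p_test = 1 / (1 + np.exp(-(X_te @ w + b)))
threshold = 0.55
exposed = p_test >= threshold
print(f"days with exposure: {exposed.sum()} of {len(p_test)}")
```

Note that the threshold is applied only to test-period probabilities — the model never sees those days during training.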

The three parameters that define this strategy:

  • Number of lags — how many past days the model uses as input
  • Train/test split — how much data to learn from vs. evaluate on
  • Confidence threshold — how certain the model must be before it assumes exposure (40% to 99%)

Educational content only — not investment advice, recommendations, or a suggestion to act. Past performance is not indicative of future results. Your decisions are your own. Full disclaimer.

Why probability matters

Binary prediction

“It will go up.” No nuance. The model treats a 51% chance the same as a 99% chance. The simulation assumes exposure either way.

Probability prediction

“There's a 73% chance it goes up.” The simulation can filter for high-confidence estimates only. Fewer trades, but each one backed by more conviction.

The key insight: A model's confidence and its actual accuracy are not the same thing. A model can be 90% “confident” and still be wrong. The probability chart below lets you see the model's confidence over time — and whether higher confidence actually led to better outcomes.
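One way to check this yourself is to bucket predictions by stated confidence and compare each bucket's average confidence with its realised accuracy; the numbers below are invented for illustration:

```python
import numpy as np

# Hypothetical model probabilities and actual outcomes (1 = up, 0 = down).
probs    = np.array([0.55, 0.92, 0.61, 0.85, 0.58, 0.95, 0.52, 0.88])
outcomes = np.array([1,    0,    1,    1,    0,    1,    1,    0   ])

# Group predictions into "moderate" and "high" confidence buckets
# and compare stated confidence with realised accuracy.
for name, mask in [("50-70%", probs < 0.70), ("70%+", probs >= 0.70)]:
    hits = outcomes[mask].mean()
    conf = probs[mask].mean()
    print(f"{name}: avg confidence {conf:.0%}, actual accuracy {hits:.0%}")
```

In this made-up sample the high-confidence bucket is actually *less* accurate than the moderate one — exactly the miscalibration the probability chart helps you spot.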

When it works, when it doesn't

Tends to work when

There are genuine, persistent patterns in the data that generalize beyond the training period. Rare in efficient markets, but possible in specific conditions.

Tends to fail when

The model captures noise instead of signal (overfitting), market conditions change between training and testing periods, or the features (lagged returns) simply don't contain predictive information.

See it in action

Pick a ticker, adjust the parameters, and watch a model train on real data. Pay attention to the probability chart — see how the model's confidence relates to actual outcomes.


What to notice:

  • The probability chart shows the model's confidence over time — notice how it clusters around 50% (uncertain) most of the time
  • Raise the confidence threshold to 70-80% — fewer trades, but are they actually better?
  • Compare in-sample vs out-of-sample — does higher confidence in training data translate to better results on test data?
  • Try many lags (50+) — the model gets more features to memorize the training data with, which often hurts out-of-sample performance
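The last point can be demonstrated directly: fit a simple classifier on pure noise and watch in-sample accuracy climb with the feature count while out-of-sample accuracy stays near chance. This sketch uses a least-squares linear classifier as a minimal stand-in for the model:

```python
import numpy as np

rng = np.random.default_rng(7)

def accuracies(n_feats, n_train=200, n_test=200):
    # Pure-noise features and labels: there is nothing real to learn.
    X = rng.normal(size=(n_train + n_test, n_feats))
    y = rng.integers(0, 2, n_train + n_test) * 2.0 - 1.0   # labels in {-1, +1}
    # Least-squares linear classifier fitted on the training rows only.
    w, *_ = np.linalg.lstsq(X[:n_train], y[:n_train], rcond=None)
    pred = np.sign(X @ w)
    in_acc = (pred[:n_train] == y[:n_train]).mean()
    out_acc = (pred[n_train:] == y[n_train:]).mean()
    return in_acc, out_acc

for k in (5, 50, 150):
    ins, outs = accuracies(k)
    print(f"{k:3d} features: in-sample {ins:.0%}, out-of-sample {outs:.0%}")
```

With 150 features and only 200 training days, the classifier can nearly memorize the training labels — yet on unseen data it does no better than a coin flip.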

Your turn

Consider the role of certainty in investment decisions. Is confidence typically based on intuition or data? And does higher confidence actually correlate with better outcomes?

The lesson from this model isn't about the model itself — it's about the relationship between confidence and accuracy. Being sure and being right are two very different things. This applies to algorithms and to humans.

Reflect in your Journal

What you've learned

  • This model (logistic regression) predicts a probability (0-100%) instead of a binary signal — the confidence threshold determines the cutoff.
  • A model's confidence and its actual accuracy are not the same thing — 90% confidence can still mean frequent errors.
  • The confidence threshold controls the trade-off: lower thresholds mean more trades (more noise), higher thresholds mean fewer trades (more selective).
  • The model is trained on 5 years of market data — 15 technical indicators, machine learning, learning from history.
  • The most important question for any predictive model: does it work on data it has never seen before?

Want to test this?

Many experienced investors suggest practicing with a paper money account at a reputable broker before risking real capital. Brokers commonly offer free simulated trading environments where you can test strategies with real market data and no financial risk.

Paper trading lets you build confidence, understand execution, and see how a strategy behaves in real time — without the emotional weight of real money on the line.

Important

Everything on this platform is educational and didactic in nature. We do not provide investment advice, financial advisory, or recommendations to buy or sell any financial instrument. Past performance is not indicative of future results. All strategies shown are historical simulations for learning purposes only. Always do your own research and consult a qualified financial advisor before making investment decisions.

Why doesn't the model always win?

If you've been experimenting with different tickers and parameters, you've probably noticed: the model doesn't consistently beat buy & hold. That's not a bug — it's the reality of financial markets.

Markets are partially efficient

When a trading pattern becomes widely known, other participants exploit it until it stops working. The most visible technical signals tend to get arbitraged away over time.

Regime changes break patterns

A pattern that works during a sustained uptrend may fail completely during a sideways or volatile market. The model learns from one environment and gets tested in another — and markets change character more often than most people expect.

Understanding why models struggle is just as valuable as understanding how they work. It builds realistic expectations about what quantitative models can and cannot do.

Educational content · Not investment advice or recommendations

We're educators, not advisors. Your decisions are your own. Disclaimer