Expected Goals (xG) and Value Betting: Using Data to Find Mispriced Odds
The gap between what a team's results say and what their underlying performance shows is where value betting opportunities live. Expected Goals, commonly abbreviated as xG, measures the quality of chances a team creates and faces. It's not perfect. But it's far better than results alone for predicting future form, and that's exactly what value betting requires.
What Expected Goals Actually Measures
Expected Goals quantifies the quality of shots taken in a match. Every shot gets assigned a probability of becoming a goal based on historical data. A penalty is worth around 0.79 xG (penalties score roughly 79% of the time). A tap-in from three yards might be worth 0.6 xG. A hopeful shot from 30 yards might be 0.01 xG. Add up all the shots and you get the team's xG for that match.
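As a minimal sketch of that arithmetic, the snippet below sums per-shot probabilities into a match total. The shot list is invented for illustration (only the penalty, tap-in and long-range values come from the text above), and the "at least one goal" figure assumes the shots are independent, which real shots are not quite.

```python
from math import prod

# Hypothetical shot list: each value is the model's probability that the shot is scored.
# 0.79 = penalty, 0.60 = tap-in, 0.01 = 30-yard effort; 0.05 and 0.12 are made up.
shots_xg = [0.79, 0.60, 0.01, 0.05, 0.12]

# Match xG is simply the sum of the individual shot values.
match_xg = sum(shots_xg)

# Chance of scoring at least once, assuming shots are independent events.
p_at_least_one = 1 - prod(1 - p for p in shots_xg)

print(f"Match xG: {match_xg:.2f}")
print(f"P(at least one goal): {p_at_least_one:.3f}")
```

Note that 1.57 xG does not mean "one and a half goals"; it is an average over many hypothetical replays of the same set of chances.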
The key insight is that xG measures the chances created, not the luck involved in converting them. A team might face five shots worth 0.8 xG combined and still lose 2-1. That team was unlucky. Conversely, a team might win 2-1 having created only 1.2 xG. They were lucky.
Over time, results tend to regress towards expected goals. A team that consistently creates 1.8 xG per match but has only scored 1.4 goals per match is likely to improve. A team that's been scoring 2.0 goals per match on 1.2 xG is likely to decline. This regression is a strong statistical tendency rather than a certainty, but very few teams beat their underlying chance creation indefinitely.
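The gap in those two examples can be put in numbers with a few lines of Python. This is only a sketch: `finishing_gap` is a hypothetical helper name, and the 10-match projection simply multiplies the per-match gap out, which is an illustration rather than a forecast.

```python
def finishing_gap(goals_per_match: float, xg_per_match: float) -> float:
    """Goals minus xG per match: positive means running hot, negative means running cold."""
    return goals_per_match - xg_per_match


cold_team = finishing_gap(goals_per_match=1.4, xg_per_match=1.8)
hot_team = finishing_gap(goals_per_match=2.0, xg_per_match=1.2)

for name, gap in (("cold team", cold_team), ("hot team", hot_team)):
    # Goals "owed" or "borrowed" over 10 matches if the gap persisted that long.
    print(f"{name}: {gap:+.1f} goals per match vs xG, {gap * 10:+.0f} over 10 matches")
```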
For betting purposes, this is crucial. The market prices matches based partly on recent results. A team that's won their last three matches gets shorter odds even if their underlying performance has been middling. A team that's lost three on the trot gets longer odds even if they've been creating good chances. This is where xG reveals value.
How the Gap Between Results and Performance Creates Betting Opportunities
Imagine a Premier League team that's been on a terrible run: three losses in their last four matches. Their odds to win the next match are 2.50. But their underlying xG over those four matches shows a different story. They've been creating an average of 1.6 xG per match while facing only 1.3 xG. They're underperforming their underlying quality. The market has punished them with longer odds because of results, but their actual performance suggests they should be stronger.
This is a classic value spot. The bookmaker's odds reflect recent form more heavily than underlying performance. Your job as a value bettor using xG is to identify when this gap is large enough to represent value.
The inverse is equally important. A team on a hot streak might be overpriced. They've won their last three matches 2-0, 2-1, 2-0. But underlying data shows they've been creating only 1.1 xG per match while facing 1.6 xG. They've been winning through luck. Their odds to win the next match are 1.50. But if we trust the underlying data more than the hot streak, that price is too short. You might find value backing their opposition.
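To make the two scenarios concrete, the sketch below converts decimal odds to implied probability and compares that with your own estimate. The function names and the 45% and 60% estimates are assumptions for illustration, and the bookmaker's margin is ignored here for simplicity.

```python
def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, ignoring the bookmaker's margin."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def expected_value(own_probability: float, decimal_odds: float) -> float:
    """Expected profit per unit staked if your probability estimate is right."""
    return own_probability * decimal_odds - 1.0


def has_value(decimal_odds: float, own_probability: float) -> bool:
    """A bet has value when your estimate exceeds the implied probability."""
    return own_probability > implied_probability(decimal_odds)


# Struggling team priced at 2.50; suppose xG leads you to a 45% estimate (hypothetical).
print(implied_probability(2.50), has_value(2.50, 0.45), expected_value(0.45, 2.50))

# Hot-streak team priced at 1.50; suppose xG leads you to a 60% estimate (hypothetical).
print(round(implied_probability(1.50), 3), has_value(1.50, 0.60))
```

At 2.50 the market implies 40%, so a 45% estimate is a value bet; at 1.50 the market implies about 66.7%, so a 60% estimate says the favourite is too short.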
Finding and Using xG Data for Betting
Several sources provide reliable xG data. Understat (understat.com) is a widely used free source for detailed xG analysis across the major European leagues. It tracks not just total xG but xG distribution, shot types, and player-level metrics. StatsBomb is a commercial provider of similar data. FBref (a Sports Reference product) offers free xG data for the competitions it covers.
For practical betting purposes, you need several pieces of information:
Team attacking profile: How much xG does the team create per match? Over the last 10 matches? Are they trending up or down? A team averaging 1.4 xG per match but creating 1.8 xG in their last five matches is improving.
Team defensive profile: How much xG do they concede per match? A team conceding 1.9 xG per match is in trouble regardless of their recent results.
Matchup context: Sometimes matchups matter. A team that creates 1.5 xG against top-six sides but 1.9 xG against lower-ranked sides needs context. The upcoming opponent matters.
Regression likelihood: The bigger the gap between results and xG, the more regression is likely in the short term. A team three points above where xG suggests they should be is likely to drop points soon. A team five points below where xG suggests they should be is likely to earn points soon.
You can find much of this data on publicly available websites. Understat shows xG timelines for every team. FBref shows season-to-date xG records. Most data sources let you filter by time period (last 10 matches, last 5 matches, etc.).
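Once you have match-by-match numbers, the attacking and defensive profiles above reduce to a couple of averages. The sketch below uses invented xG values (chosen so the attack matches the 1.4 versus 1.8 example) and a hypothetical helper, `recent_average`; it does not fetch data from any of the sites mentioned.

```python
from statistics import mean

# Hypothetical xG created and conceded in a team's last 10 matches, oldest first.
xg_for = [1.0, 0.9, 1.2, 0.8, 1.1, 1.6, 1.9, 1.7, 2.0, 1.8]
xg_against = [1.5, 1.4, 1.6, 1.3, 1.2, 1.1, 1.0, 1.2, 0.9, 1.0]


def recent_average(values: list[float], window: int) -> float:
    """Average of the most recent `window` matches."""
    return mean(values[-window:])


attack_last10 = recent_average(xg_for, 10)
attack_last5 = recent_average(xg_for, 5)
defence_last5 = recent_average(xg_against, 5)

# A positive trend means the recent attack is outperforming the longer-run average.
attack_trend = attack_last5 - attack_last10

print(f"xG created: {attack_last10:.2f} (last 10), {attack_last5:.2f} (last 5)")
print(f"Attack trend: {attack_trend:+.2f} | xG conceded (last 5): {defence_last5:.2f}")
```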
The Practical xG-Based Value Identification Process
Here's a structured workflow for using xG to find value:
Step 1: Identify a match you're interested in. Check the fixture list and pick a match you want to analyse.
Step 2: Gather xG data for both teams. Find their last 5-10 matches xG statistics. Calculate their average xG created and xG conceded over this period. Look for trends.
Step 3: Compare results to xG. Estimate how many points the team should have earned based on xG. As a very rough rule of thumb, creating 1.5+ xG is worth about a point and 2.0+ xG about 1.5 points, though a proper expected-points figure also depends on the xG conceded in each match. Compare this to actual points earned. The larger the gap, the more likely regression.
Step 4: Consider the matchup. How does the attacking profile of one team match up against the defensive profile of the other? A team averaging 1.8 xG per match facing a defence conceding 1.4 xG per match is a slightly positive matchup. A team averaging 1.2 xG facing a defence conceding 1.8 xG is a negative matchup.
Step 5: Estimate true probability. Use the xG data to form your own probability estimate for the match. You might think: "Based on xG, the home team has about a 55% chance to win, 25% to draw, 20% to lose."
Step 6: Check bookmaker odds. Convert the bookmaker's odds to implied probability. If the implied probability is significantly lower than your xG-based estimate, you have value.
Step 7: Place the bet and track CLV (closing line value: how the odds you took compare with the closing odds). Over time, also track whether your xG-based assessments are correctly predicting match outcomes. Together, these tell you if xG analysis is actually giving you an edge.
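Steps 4 to 7 can be sketched in code. Everything below is an assumption made for illustration rather than a definitive model: goals are treated as independent Poisson variables, the matchup blend is a simple average of one side's xG created and the other's xG conceded, the bookmaker margin is removed proportionally, and all team numbers and odds are invented.

```python
from math import exp, factorial


def poisson_pmf(goals: int, rate: float) -> float:
    """Probability of exactly `goals` goals when the scoring rate is `rate`."""
    return exp(-rate) * rate**goals / factorial(goals)


def matchup_xg(attack_xg: float, opponent_xga: float) -> float:
    """Step 4, crudely: average a team's xG created with the opponent's xG conceded."""
    return (attack_xg + opponent_xga) / 2


def outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 10):
    """Step 5: home/draw/away probabilities from two independent Poisson scorelines."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away  # renormalise the tiny mass cut off above max_goals
    return home / total, draw / total, away / total


def fair_probabilities(odds):
    """Step 6: implied probabilities with the bookmaker's margin scaled out."""
    raw = [1.0 / o for o in odds]
    overround = sum(raw)
    return [r / overround for r in raw]


def expected_value(probability: float, decimal_odds: float) -> float:
    """Expected profit per unit staked."""
    return probability * decimal_odds - 1.0


def closing_line_value(odds_taken: float, closing_odds: float) -> float:
    """Step 7: positive when you beat the closing price."""
    return odds_taken / closing_odds - 1.0


# Hypothetical inputs: home side creates 1.8 and concedes 1.1; away creates 1.2, concedes 1.4.
home_rate = matchup_xg(1.8, 1.4)
away_rate = matchup_xg(1.2, 1.1)
model_probs = outcome_probabilities(home_rate, away_rate)

odds = (2.10, 3.50, 3.80)  # hypothetical home/draw/away prices
fair_probs = fair_probabilities(odds)

for label, p_model, p_market, price in zip(("home", "draw", "away"), model_probs, fair_probs, odds):
    print(f"{label}: model {p_model:.3f}, market {p_market:.3f}, EV {expected_value(p_model, price):+.3f}")

print(f"CLV if you took 2.10 and it closed at 2.00: {closing_line_value(2.10, 2.00):+.3f}")
```

The Poisson step is only one common way to turn xG rates into match probabilities; whichever method you use, the comparison against margin-free market probabilities and the CLV check stay the same.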
Understanding the Limitations of xG
xG is a powerful tool but it's not the whole story. A team's underlying quality is important, but execution matters in football. Some teams are genuinely better at converting chances. Some defences are genuinely better at limiting danger. xG captures the creation of chances, not the ability to finish or prevent them.
Also, xG data can be noisy in low-volume situations. A single match tells you very little. Two matches tell you almost nothing. But 10 matches of xG data gives you signal. This is why xG-based value betting works better over a season or across multiple leagues than for picking individual matches.
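One way to see why small samples mislead is to ask how far a team's goals per match can drift from its xG through finishing variance alone. The sketch below assumes goals follow a Poisson distribution with the team's xG as the rate (a common simplification, not something the data sources state); under that assumption the spread of average goals per match shrinks with the square root of the number of matches.

```python
from math import sqrt


def goals_per_match_std(xg_per_match: float, n_matches: int) -> float:
    """Std deviation of average goals per match over n matches, assuming Poisson goals."""
    return sqrt(xg_per_match / n_matches)


# A team creating 1.5 xG per match, viewed over different sample sizes.
for n in (1, 2, 5, 10, 38):
    spread = goals_per_match_std(1.5, n)
    print(f"{n:>2} matches: average goals per match typically within +/- {spread:.2f} of xG")
```

Under these assumptions, a gap of 0.4 goals per match is unremarkable after two matches but much harder to explain by luck over a full season.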
Some bookmakers are now aware of xG analysis and factor it into their odds. In highly watched leagues with sophisticated markets, the gap between results and xG-based probability might already be reflected in the odds. But in lower-profile leagues or early in the season before enough data accumulates, xG can still reveal edges that the market hasn't fully priced.
How AI Models Go Beyond Basic xG Analysis
While xG is a starting point, modern AI models incorporate many other factors. They track shot patterns, ball possession in dangerous areas, defensive shape, fatigue levels, injuries, and tactical adjustments. An AI model might recognize that a team's xG is genuinely low because they've been tactically set up to defend deep, not because they're weak.
This is where tools like those offered by SportSignals can add value. Rather than manually gathering xG data and making probabilistic estimates, AI models integrate xG with dozens of other variables to arrive at more accurate probability assessments. The model might recognize a regression opportunity that basic xG analysis would miss because it factors in other context.
But the core principle remains the same: find gaps between bookmaker odds and true probability using the best data available. xG is one powerful data source. AI models layer additional sophistication on top.
Real-World Example: Using xG to Spot Value
Consider a real scenario from a Premier League season. Team A has lost their last three matches 1-0, 0-1, and 2-1. They're at 3.50 to win their next match. But over those three matches, they created 1.8, 1.6, and 1.4 xG respectively (total 4.8 xG) while conceding 1.2, 1.3, and 1.1 xG (total 3.6 xG).
Their results say they've conceded four goals while scoring only one. But their underlying data says they should have conceded roughly 3.6 goals while scoring roughly 4.8. They've been heavily unlucky. Their actual record is three losses when the xG data suggests they should be close to level or even slightly positive.
This is classic regression opportunity territory. The market has overreacted to results. At 3.50 (an implied probability of roughly 29%), backing Team A to win looks like value, provided your own estimate of their chances is higher, because the underlying data suggests they're better than their record indicates. Over the next few matches, some regression is likely. This is how xG-based value betting works.
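The numbers in this example can be checked with a short script. The scorelines and xG values come from the scenario above; the expected-points calculation is an added sketch that assumes each side's goals follow an independent Poisson distribution with its match xG as the rate.

```python
from math import exp, factorial

# (goals for, goals against, xG for, xG against) for Team A's three defeats.
matches = [
    (0, 1, 1.8, 1.2),
    (0, 1, 1.6, 1.3),
    (1, 2, 1.4, 1.1),
]


def poisson_pmf(goals: int, rate: float) -> float:
    """Probability of exactly `goals` goals when the scoring rate is `rate`."""
    return exp(-rate) * rate**goals / factorial(goals)


def expected_points(xg_for: float, xg_against: float, max_goals: int = 10) -> float:
    """Expected league points (3 for a win, 1 for a draw) under the Poisson assumption."""
    win = draw = 0.0
    for f in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(f, xg_for) * poisson_pmf(a, xg_against)
            if f > a:
                win += p
            elif f == a:
                draw += p
    return 3 * win + draw


goals_for = sum(m[0] for m in matches)
goals_against = sum(m[1] for m in matches)
xg_for_total = sum(m[2] for m in matches)
xg_against_total = sum(m[3] for m in matches)
xpts = sum(expected_points(m[2], m[3]) for m in matches)
implied = 1 / 3.50  # implied probability of the 3.50 price, margin ignored

print(f"Goals: {goals_for} for, {goals_against} against")
print(f"xG:    {xg_for_total:.1f} for, {xg_against_total:.1f} against")
print(f"Expected points: {xpts:.1f} vs actual points: 0")
print(f"Implied win probability at 3.50: {implied:.1%}")
```

The bet only has value if your own win probability for the next match, which also depends on the opponent, is above that implied figure.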
In Summary
- Expected Goals (xG) measures the quality of chances created and faced, not the luck of converting them
- When team results diverge significantly from underlying xG, regression to the mean is likely over time
- The market often overweights recent results relative to underlying performance, creating value opportunities
- Gather xG data, compare actual results to xG, and identify where odds misprice the true probability
- The xG-based value identification process: find a match, gather xG data, compare results to xG, consider the matchup, estimate probability, check bookmaker odds, then place the bet and track CLV
- xG is not perfect and works better over larger samples (10+ matches); variance masks signals in small samples
- Modern AI models build on xG by incorporating additional context (injuries, fatigue, possession patterns, tactical set-up)
- FBref.com (Sports Reference) provides free xG data; Understat.com is also free and offers more detailed analysis
- xG is significantly more predictive than results alone; gaps between results and xG reveal where value lives
FAQ
Q: Where can I find reliable xG data for free? A: FBref.com (Sports Reference) provides free xG data across all major leagues. Understat.com is also free and offers more detailed analysis, including match-by-match xG timelines. Most analysis websites display xG data even if their models differ slightly.
Q: How much recent xG data should I use? Last five matches or the whole season? A: Use both. Check how a team's xG has changed over the season and specifically their last 5-10 matches. A team that was weak early season but improving recently has more positive momentum than overall season xG suggests.
Q: If xG is so predictive, why don't bookmakers just price based on xG? A: Bookmakers do use advanced analytics internally, but they still price with biases. Retail bettors overweight recent results and underweight complex statistical measures. To manage their liability against retail money, bookmakers adjust odds away from pure xG-based probability.
Q: Can I use xG for in-play betting? A: xG is primarily designed for pre-match analysis. In-play, the match state changes so rapidly that pre-match xG analysis becomes less relevant. Focus on pre-match value betting with xG.
Q: What's a significant gap between results and xG? A: Teams typically regress towards their xG within 5-10 matches. A gap of 5+ points between expected points (based on xG) and actual points is notable. Once the gap exceeds 10 points, regression becomes quite likely.
Q: Should I use only xG or combine it with other betting factors? A: xG should be one part of your analysis, not the whole picture. Combine it with team form, head-to-head records, injury information, and tactical analysis. The strongest value bets come when multiple factors align.
