In my last post, I outlined a model that can predict regular season point percentage based on team goals for and goals against rates. We were able to use it to see which teams have performed better or worse than the model predicts for the current season, which provided some insight into which teams are likely to heat up or cool off down the stretch. The main limitation of the model, however, is that it needs a sufficient amount of data from the current season to make these assessments.
I would like to improve the predictive value of the model and its ability to understand which players bring the most value to a team in the salary cap era. Eventually, I would like the model to use forecasted individual player statistics to predict team performance. This would allow us to use the model to predict regular season point percentages from the start of the season, when we have no team data for the current season. It would also allow us to predict the effect of individual players on team performance. This could mean predicting the impact of trades or free agent signings, predicting the impact of injuries, or assessing areas where struggling teams need to focus their attention to rebuild.
Today, we take the next step in building out the model and look at how contributions to goals for and goals against rates from individual skaters can be combined to estimate team goal rates. We want to get a reasonably close estimate of the overall team goal rates that we can use as inputs to the points predictor model. Once we have this in place, we can turn our focus to the individual player statistics.
Let’s start by looking at goals for per 60 minutes (GF/60). We can find on-ice GF/60 for individual players, and we want to combine these stats so that they closely align with the team GF/60 that we used as input to the points predictor model. The model uses only 5v5 data, so we’ll stick to that again (data from Natural Stat Trick).
We could simply average the GF/60 of all the players on a team and check to see how close it is to the team statistic. However, there are a couple of obvious challenges with such a simple approach that we should address. We know that each player gets a different amount of ice time and, as such, has a different amount of influence over the team numbers. Players who play more will have a larger effect on the team. We also know that players have different roles, the most obvious being the split between forwards and defensemen. This split also affects the players’ ice time, since defensemen split available ice time with other defensemen, and forwards split their ice time with the other forwards.
We also want to ensure we don’t let outliers due to small sample sizes affect our estimate. A player who has played only a game or two over the full season is probably not worth including in a forecast, and including him may even add error to the estimate. To avoid this, we’ll use the 12 forwards and 6 defensemen who have the most games played (for the season) on each team for our estimate. This will reflect the team’s typical lineup.
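As a rough illustration of that filtering step, here is a minimal Python sketch. The `Skater` class, the `typical_lineup` function, and the roster are all made up for this example (they are not from the actual data or tooling), and ties in games played are simply left in roster order, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Skater:
    name: str
    position: str      # "F" for forward, "D" for defenseman
    games_played: int


def typical_lineup(skaters, n_forwards=12, n_defensemen=6):
    """Keep the forwards and defensemen with the most games played."""
    def top(position, n):
        group = [s for s in skaters if s.position == position]
        # sorted() is stable, so tied players keep their original roster order
        return sorted(group, key=lambda s: s.games_played, reverse=True)[:n]

    return top("F", n_forwards), top("D", n_defensemen)


# Made-up roster, shrunk to 2 forwards and 1 defenseman to keep it readable
roster = [
    Skater("Forward A", "F", 82),
    Skater("Forward B", "F", 2),
    Skater("Forward C", "F", 75),
    Skater("Defenseman A", "D", 80),
    Skater("Defenseman B", "D", 1),
]
forwards, defensemen = typical_lineup(roster, n_forwards=2, n_defensemen=1)
print([s.name for s in forwards])    # ['Forward A', 'Forward C']
print([s.name for s in defensemen])  # ['Defenseman A']
```

The players with only one or two games drop out, which is the point of the cutoff. With a real roster, the defaults of 12 and 6 would apply.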
With these considerations in mind, I will use a weighted average to calculate a combined GF/60 among forwards and among defensemen separately. We’ll then combine those two numbers to get an estimated team GF/60. We’ll look at the last full 82-game regular season (2018-2019) and compare the estimated numbers to the actual team GF/60 to see if our calculation gets us reasonably close.
To calculate the weighted average, we’ll weight each player’s GF/60 by his share of ice time within his position group: his time-on-ice per game played (TOI/GP) divided by the total TOI/GP of the forwards (or defensemen) in the lineup. Summing the weighted values within each group gives us a weighted average GF/60 for the forwards and another for the defensemen. Then we’ll combine the two to come up with our estimated team GF/60. Below is a table of the results, along with the actual team statistics.
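Before getting to the results, here is a minimal Python sketch of the weighting, using made-up numbers for a tiny lineup. The function names are my own for illustration. Note one assumption in particular: the way the forward and defenseman numbers are combined at the end (a plain mean of the two) is just a placeholder for this sketch, not necessarily the exact combination used for the tables.

```python
def weighted_gf60(players):
    """TOI/GP-weighted average of on-ice GF/60 for one position group.

    players: list of (toi_per_gp, gf60) tuples.
    """
    total_toi = sum(toi for toi, _ in players)
    # Each player's weight is his share of the group's total TOI/GP
    return sum(toi / total_toi * gf60 for toi, gf60 in players)


def estimate_team_gf60(forwards, defensemen):
    # ASSUMPTION: a simple mean of the two group values, used only as a
    # placeholder for the "combine" step
    return (weighted_gf60(forwards) + weighted_gf60(defensemen)) / 2


# Made-up (TOI/GP, GF/60) pairs, not real player data
forwards = [(15.0, 3.0), (10.0, 2.0), (5.0, 1.0)]
defensemen = [(20.0, 2.6), (10.0, 2.0)]

print(f"{weighted_gf60(forwards):.2f}")                   # 2.33
print(f"{weighted_gf60(defensemen):.2f}")                 # 2.40
print(f"{estimate_team_gf60(forwards, defensemen):.2f}")  # 2.37
```

In this toy example, a simple average of the three forwards would be 2.00, while the weighted average is 2.33, because the forward with the most ice time also has the highest GF/60.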
Team | Actual GF/60 | Calculated GF/60 | % Error GF/60 |
Anaheim Ducks | 2.05 | 2.06 | 0.49% |
Arizona Coyotes | 1.99 | 2.02 | 1.51% |
Boston Bruins | 2.34 | 2.47 | 5.56% |
Buffalo Sabres | 2.29 | 2.31 | 0.87% |
Calgary Flames | 2.87 | 2.88 | 0.35% |
Carolina Hurricanes | 2.42 | 2.45 | 1.24% |
Chicago Blackhawks | 2.72 | 2.75 | 1.10% |
Colorado Avalanche | 2.43 | 2.46 | 1.23% |
Columbus Blue Jackets | 2.75 | 2.79 | 1.45% |
Dallas Stars | 2.03 | 2.08 | 2.46% |
Detroit Red Wings | 2.19 | 2.23 | 1.83% |
Edmonton Oilers | 2.17 | 2.14 | 1.38% |
Florida Panthers | 2.45 | 2.50 | 2.04% |
Los Angeles Kings | 2.07 | 2.05 | 0.97% |
Minnesota Wild | 2.12 | 2.22 | 4.72% |
Montreal Canadiens | 2.83 | 2.86 | 1.06% |
Nashville Predators | 2.52 | 2.56 | 1.59% |
New Jersey Devils | 2.23 | 2.27 | 1.79% |
New York Islanders | 2.41 | 2.42 | 0.41% |
New York Rangers | 2.20 | 2.28 | 3.64% |
Ottawa Senators | 2.52 | 2.53 | 0.40% |
Philadelphia Flyers | 2.49 | 2.51 | 0.80% |
Pittsburgh Penguins | 2.71 | 2.76 | 1.85% |
San Jose Sharks | 2.88 | 2.91 | 1.04% |
St Louis Blues | 2.50 | 2.52 | 0.80% |
Tampa Bay Lightning | 3.16 | 3.17 | 0.32% |
Toronto Maple Leafs | 3.03 | 3.04 | 0.33% |
Vancouver Canucks | 2.20 | 2.21 | 0.45% |
Vegas Golden Knights | 2.59 | 2.56 | 1.16% |
Washington Capitals | 3.00 | 3.02 | 0.67% |
Winnipeg Jets | 2.54 | 2.51 | 1.18% |
Average | | | 1.44% |
Wow! The average error is less than 1.5%, and there are only a couple of teams (Boston and Minnesota) where the error creeps up around 5%. This looks like it will be accurate enough for our needs.
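For reference, the error column is consistent with an absolute percent error measured against the actual value. Here is a short Python sketch of that calculation, assuming that definition; the `pct_error` name is made up for the example, and the three rows are copied from the GF/60 table above.

```python
def pct_error(actual, calculated):
    """Absolute error as a percentage of the actual value."""
    return abs(calculated - actual) / actual * 100


# (actual GF/60, calculated GF/60) copied from the table above
rows = {
    "Boston Bruins": (2.34, 2.47),
    "Minnesota Wild": (2.12, 2.22),
    "Tampa Bay Lightning": (3.16, 3.17),
}

for team, (actual, calculated) in rows.items():
    print(f"{team}: {pct_error(actual, calculated):.2f}%")
# Boston Bruins: 5.56%
# Minnesota Wild: 4.72%
# Tampa Bay Lightning: 0.32%
```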
Let’s go through the same process for GA/60. I expect that we will see similar results.
Team | Actual GA/60 | Calculated GA/60 | % Error GA/60 |
Anaheim Ducks | 2.32 | 2.31 | 0.43% |
Arizona Coyotes | 2.38 | 2.35 | 1.26% |
Boston Bruins | 1.91 | 1.96 | 2.62% |
Buffalo Sabres | 2.75 | 2.72 | 1.09% |
Calgary Flames | 2.29 | 2.29 | 0.00% |
Carolina Hurricanes | 2.24 | 2.23 | 0.45% |
Chicago Blackhawks | 2.74 | 2.66 | 2.92% |
Colorado Avalanche | 2.38 | 2.34 | 1.68% |
Columbus Blue Jackets | 2.50 | 2.57 | 2.80% |
Dallas Stars | 1.98 | 2.00 | 1.01% |
Detroit Red Wings | 2.64 | 2.57 | 2.65% |
Edmonton Oilers | 2.65 | 2.58 | 2.64% |
Florida Panthers | 2.86 | 2.89 | 1.05% |
Los Angeles Kings | 2.52 | 2.41 | 4.37% |
Minnesota Wild | 2.34 | 2.39 | 2.14% |
Montreal Canadiens | 2.45 | 2.44 | 0.41% |
Nashville Predators | 2.16 | 2.23 | 3.24% |
New Jersey Devils | 2.82 | 2.89 | 2.48% |
New York Islanders | 1.89 | 1.89 | 0.00% |
New York Rangers | 2.61 | 2.62 | 0.38% |
Ottawa Senators | 3.24 | 3.27 | 0.93% |
Philadelphia Flyers | 2.87 | 2.85 | 0.70% |
Pittsburgh Penguins | 2.23 | 2.25 | 0.90% |
San Jose Sharks | 2.78 | 2.83 | 1.80% |
St Louis Blues | 2.21 | 2.17 | 1.81% |
Tampa Bay Lightning | 2.41 | 2.44 | 1.24% |
Toronto Maple Leafs | 2.50 | 2.47 | 1.20% |
Vancouver Canucks | 2.65 | 2.62 | 1.13% |
Vegas Golden Knights | 2.42 | 2.47 | 2.07% |
Washington Capitals | 2.46 | 2.50 | 1.63% |
Winnipeg Jets | 2.48 | 2.43 | 2.02% |
Average | | | 1.58% |
Good news. We see very small errors in the estimates again. This further confirms that the estimation process works quite well and it can help us translate individual statistics into team inputs for the points predictor model.
The next step in the process will be building models to predict player on-ice GF/60 and GA/60 from their individual statistics. Make sure you subscribe to follow along.