The Nationals are World Series champions and the offseason is here. Before we start looking ahead to the spring, however, it’s time to look back…to the spring.

That’s right, no one who made a preseason win total prediction is getting away without the usual level of accountability. It’s time for BttP’s fifth annual prediction and projection review. Can PECOTA defend its title? Will one of the other projection systems strike back? What if the humans decide to overthrow the computers? Let’s find out who takes home the coveted title of Least Bad At Predicting Baseball, At Least For This Year.

A quick recap of how this works for the new reader. For each of the sets of predictions and projections that featured in our preseason analysis, the mean absolute error (MAE) and root mean squared error (RMSE) have been calculated. MAE is the average absolute difference between the predicted win total and the actual one, while RMSE is the square root of the average of the squares of all the differences. RMSE gives greater weight to large errors because they are squared, so if you think bigger misses should be punished more heavily, this is the more relevant number.
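
For anyone who wants the two measures spelled out, here’s a minimal sketch of the calculation in Python, using made-up win totals rather than any of the actual sets below:

```python
import math

def mae(projected, actual):
    # Mean absolute error: the average size of the misses, in wins
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(actual)

def rmse(projected, actual):
    # Root mean squared error: squares each miss first, so big whiffs count for more
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

# Illustrative numbers only, not any real projection set
projected_wins = [96, 93, 79]
actual_wins = [107, 101, 69]

print(round(mae(projected_wins, actual_wins), 2))   # 9.67
print(round(rmse(projected_wins, actual_wins), 2))  # 9.75
```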

Read the preseason piece for a full breakdown of where all the competitors stood in March, but if you want to get right to the results, here’s a quick reminder of who’s competing for the title:

The Contenders

PECOTA (PEC): The Baseball Prospectus projected win totals based on their in-house projection system.

FanGraphs (FG): The FanGraphs Depth Charts projected totals, which are a combination of the Steamer and ZiPS projection systems, with an additional playing time adjustment applied by FanGraphs staff.

Davenport (Dav): Totals based on Clay Davenport’s projection system, with Clay’s own playing time estimates.

FiveThirtyEight (538): Site projections from FiveThirtyEight.com, based on their Elo rating system.

Banished to the Pen writers (BttP): Predictions from each of our writers from our season preview series.

Effectively Wild guests (EW): Predictions from each of Effectively Wild‘s team preview podcast guests.

Composite (Comp): The average of the six projection/prediction sets above, with the BttP/EW sets adjusted down to add up to 2430 wins so they are not given extra weight (a rough sketch of that adjustment follows this list).

Public (Pub): The average of all responses to a preseason poll in which I asked people to predict win totals for every team. This has replaced the PECOTA over/under game from previous editions.
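
On that "adjusted down to 2430" step: the total number of wins available in a full MLB season is fixed (30 teams x 162 games / 2 = 2430), so a set that hands out more wins than that gets scaled back before going into the Composite. The snippet below is a rough sketch of one way to do it, assuming a simple proportional rescale; the exact adjustment used for the Composite may differ.

```python
def rescale_to_league_total(predicted_wins, league_total=2430):
    # Proportionally scale a set of 30 team predictions so they sum to the
    # league-wide total of 2430 wins (30 teams x 162 games / 2).
    current_total = sum(predicted_wins)
    return [w * league_total / current_total for w in predicted_wins]

# Hypothetical set that is 20 wins too generous overall (sums to 2450)
raw = [85] * 28 + [40, 30]
adjusted = rescale_to_league_total(raw)
print(round(sum(adjusted)))  # 2430
```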

The Results

Set  | MAE  | MAE Rank | RMSE  | RMSE Rank
---- | ---- | -------- | ----- | ---------
Pub  | 7.73 | 2        | 8.92  | 1
538  | 7.70 | 1        | 9.08  | 2
Comp | 7.90 | 3        | 9.28  | 3
FG   | 8.43 | 7        | 9.68  | 4
PEC  | 8.30 | 5        | 9.70  | 5
EW   | 8.40 | 6        | 9.86  | 6
BttP | 8.20 | 4        | 10.04 | 7
Dav  | 9.43 | 8        | 10.70 | 8

What a result for the newcomers. 538 makes an impressive debut by taking the MAE victory, but the real turn-up here is the wisdom of the crowd functioning spectacularly to claim the RMSE title. The Public set’s most notable success came on the Pirates: still 7 wins too high, but closer than the projection systems by 2-4 wins and a full 10 wins better than Effectively Wild preview guest Stephen J. Nesbitt.

Beyond that, the Public set simply did well by not whiffing too big. It was the only set to not have at least one 20-win miss, and those big misses really hurt in RMSE. The Tigers were so bad that Davenport and BttP preview authors AD and Mark Sands both missed by 24 wins, making the Public’s 19 comparatively excellent.

538, meanwhile, nailed the Cubs’ 84-win season exactly and also had the Reds pegged better than anyone else, missing their 75 wins by just two. Exact hits were otherwise slim pickings: the only other one came from the Composite set, which aggregated opinions on the White Sox perfectly.

FanGraphs and PECOTA had an incredibly tight battle, with PECOTA taking MAE but FanGraphs just barely edging it in RMSE. Davenport was, once again, the last-place projection system, and indeed a fairly distant last place overall. Like PECOTA and FanGraphs, the two human prediction sets other than the Public split the difference: our BttP previewers were better by MAE, but Anthony Fenech’s Tigers pessimism meant the EW set missed Detroit by a mere 16 wins, tipping the RMSE calculation in their favor.

The most predictable team this year was, ironically, the Mets. Their 86-win season was missed by just 1.6 wins on average. The aforementioned Tigers were the landmine in 2019, with an average miss of 20.5 wins. Although no individual previewer was spot-on this year, special mention goes to BttP previewers Scott Brady (Indians), Alex Crisafulli (Cardinals), Andrew Ingrelli (Brewers) and Peter Bloom (Nationals), who all missed their team’s total by just one, and EW guest Barry Svrluga, who did the same with the Nationals in the other direction. Lindsey Adler of The Athletic also deserves a mention for going bold with a 105-win Yankees prediction and missing by just two.

While 2019 was an easier season to predict than the past two, it still lags comfortably behind 2016, when the RMSE from the top set was just 7.3 and all of the predictors were ahead of this year’s winning RMSE. As the league gets more polarized, that shouldn’t be surprising; 47- or 107-win teams simply don’t show up in projections, and even most human predictions aren’t that extreme. The table below shows how all of the sets compared to the final win totals.

Table 1: each number is the set’s projected win total minus the team’s actual total, so a positive number means the set was too high.

Div | Team | Actual | PEC | FG  | Dav | 538 | BttP | EW  | Pub | Comp
--- | ---- | ------ | --- | --- | --- | --- | ---- | --- | --- | ----
ALW | HOU  | 107    | -9  | -11 | -8  | -9  | -11  | -9  | -8  | -10
NLW | LAD  | 106    | -13 | -13 | -15 | -11 | -11  | -13 | -11 | -14
ALE | NYY  | 103    | -7  | -6  | -6  | -6  | -6   | 2   | -5  | -6
ALC | MIN  | 101    | -19 | -19 | -15 | -17 | -15  | -17 | -18 | -18
ALW | OAK  | 97     | -20 | -15 | -12 | -14 | -13  | -9  | -13 | -15
NLE | ATL  | 97     | -12 | -13 | -18 | -13 | -5   | -7  | -10 | -12
ALE | TBR  | 96     | -10 | -12 | -8  | -10 | -9   | -6  | -10 | -10
ALC | CLE  | 93     | 4   | -1  | -1  | 2   | -1   | -4  | -3  | -1
NLE | WAS  | 93     | -4  | -3  | -6  | -4  | 1    | -1  | -4  | -4
NLC | STL  | 91     | -5  | -5  | -8  | -6  | -1   | 5   | -4  | -4
NLC | MIL  | 89     | -2  | -6  | -9  | -3  | 1    | 2   | -1  | -4
NLE | NYM  | 86     | 1   | -1  | 1   | -1  | -2   | -2  | -3  | -1
NLW | ARI  | 85     | -5  | -8  | -13 | -6  | -5   | -13 | -10 | -9
ALE | BOS  | 84     | 6   | 10  | 9   | 11  | 11   | 15  | 12  | 9
NLC | CHC  | 84     | -4  | 4   | -3  | 0   | 7    | 8   | 3   | 1
NLE | PHI  | 81     | 8   | 5   | 4   | 3   | 9    | 7   | 8   | 5
ALW | TEX  | 78     | -8  | -7  | -8  | -8  | 3    | 6   | -7  | -4
NLW | SFG  | 77     | -4  | -2  | -11 | -6  | -7   | 4   | -6  | -5
NLC | CIN  | 75     | 6   | 6   | 5   | 2   | 9    | 6   | 5   | 5
ALC | CHW  | 72     | -2  | -2  | 5   | -1  | 6    | -2  | 1   | 0
ALW | LAA  | 72     | 7   | 10  | 13  | 8   | 15   | 11  | 11  | 10
NLW | COL  | 71     | 13  | 10  | 8   | 11  | 22   | 21  | 14  | 13
NLW | SDP  | 70     | 11  | 9   | 4   | 5   | 12   | 10  | 9   | 8
NLC | PIT  | 69     | 11  | 10  | 9   | 10  | 13   | 17  | 7   | 11
ALW | SEA  | 68     | 4   | 8   | 15  | 11  | 6    | 9   | 7   | 8
ALE | TOR  | 67     | 7   | 9   | 14  | 8   | 4    | 8   | 8   | 8
ALC | KCR  | 59     | 14  | 11  | 10  | 11  | 11   | 9   | 7   | 10
NLE | MIA  | 57     | 10  | 8   | 8   | 7   | 3    | 11  | 5   | 7
ALE | BAL  | 54     | 4   | 8   | 13  | 6   | -3   | -2  | 3   | 4
ALC | DET  | 47     | 19  | 21  | 24  | 21  | 24   | 16  | 19  | 20

Public Predictions

Given that the Public set came away with the RMSE win and only narrowly missed MAE, you might already have guessed that a few people did really well. One respondent came out with an extremely nice 6.90 MAE and an 8.43 RMSE to beat the winning marks by 0.8 and 0.49 respectively. Unfortunately, that respondent did not leave their name, so we’ll just have to call them Nostradamus.

The next-best contestant in MAE did leave their name: congratulations to Scott T. Holland, who came out with a 7.07 MAE. The still rather cryptic Humphrey, slightly less of a mystery than Nostradamus, was just barely pipped for the RMSE crown, finishing at 8.44.

That’s not to say there weren’t some truly terrible predictions that fell way behind all of the other sets. Several people had an average error of more than ten wins, and one was even over 11.

People who looked at PECOTA’s projections before completing the survey were actually slightly worse in MAE, at 8.49 on average compared to 8.40 for those who did not. They just about flipped that in RMSE, 10.04 to 10.06. There was a slight edge for the four respondents who actually consulted PECOTA while filling out the survey, who came in at 8.37 and 9.95 respectively.

As might be expected over a sample of this size, there were a lot more spot-on predictions here, so the special mention goes to the three participants who each got three separate team totals correct: Simon G, Alex, and AJP. Full results for all 56 people who took part can be found here.

Ranks

When it came to predicting the order the teams finished in, it was no contest at all. 538 ran away with it, while FanGraphs fared much better here in RMSE terms thanks to having just two double-digit misses.
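
The piece doesn’t spell out the mechanics of the rank comparison, but the actual standings in the per-team table further down use competition-style ranking, with tied teams sharing the better spot (Oakland and Atlanta are both 5th, for example). Here’s a small sketch of turning win totals into ranks under that assumption:

```python
def wins_to_ranks(wins_by_team):
    # Competition-style ranking: most wins gets rank 1, tied teams share the
    # better rank, and the next team down skips the appropriate number of spots.
    sorted_wins = sorted(wins_by_team.values(), reverse=True)
    return {team: 1 + sorted_wins.index(w) for team, w in wins_by_team.items()}

# Tiny hypothetical example
print(wins_to_ranks({"HOU": 107, "OAK": 97, "ATL": 97, "DET": 47}))
# {'HOU': 1, 'OAK': 2, 'ATL': 2, 'DET': 4}
```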

Set  | MAE  | MAE Rank | RMSE | RMSE Rank
---- | ---- | -------- | ---- | ---------
538  | 3.90 | 1        | 4.88 | 1
FG   | 4.43 | 4        | 5.29 | 2
Comp | 4.27 | 2        | 5.42 | 3
Pub  | 4.47 | 5        | 5.51 | 4
BttP | 4.47 | 5        | 5.97 | 5
PEC  | 4.57 | 7        | 5.99 | 6
EW   | 4.40 | 3        | 6.02 | 7
Dav  | 4.93 | 8        | 6.44 | 8

Here, it was Boston that confounded our predictors the most: six of the eight sets missed their rank – all too high, of course – by exactly 11 spots. PECOTA did marginally better at 9, while Alex Speier’s streak of impressive predictions came to an end as he missed by 15 wins and 12 ranking places.

The Twins also proved to be tricky, this time in the other direction, as every set but Davenport (-5) and 538 (-7) was out by ten spots or more on the low side. PECOTA really lost out here by being way too low on the A’s, predicting them as just the 21st-ranked team in the preseason only for them to end up fifth.

Each number below is the gap between the set’s predicted finishing position and the actual one; positive means the set expected the team to finish higher up the standings than it did.

Div | Team | Rank | PEC | FG  | Dav | 538 | BttP | EW  | Pub | Comp
--- | ---- | ---- | --- | --- | --- | --- | ---- | --- | --- | ----
ALW | HOU  | 1    | 0   | -1  | 0   | 0   | -1   | -2  | 0   | -1
NLW | LAD  | 2    | -2  | -2  | -3  | -1  | -1   | -3  | -2  | -2
ALE | NYY  | 3    | 0   | 2   | 1   | 1   | 2    | 2   | 1   | 2
ALC | MIN  | 4    | -10 | -10 | -5  | -7  | -11  | -12 | -11 | -11
ALW | OAK  | 5    | -16 | -9  | -5  | -10 | -11  | -8  | -9  | -11
NLE | ATL  | 5    | -7  | -6  | -14 | -6  | -2   | -5  | -4  | -7
ALE | TBR  | 7    | -3  | -4  | 1   | 0   | -6   | -3  | -5  | -2
ALC | CLE  | 8    | 6   | 3   | 4   | 5   | 1    | -4  | 3   | 3
NLE | WAS  | 8    | 2   | 2   | 1   | 2   | 3    | 2   | 2   | 2
NLC | STL  | 10   | 0   | 2   | -3  | 1   | 0    | 6   | 1   | 3
NLC | MIL  | 11   | 3   | -2  | -6  | 4   | 1    | 2   | 3   | 1
NLE | NYM  | 12   | 4   | 2   | 5   | 3   | -4   | -4  | -3  | -1
NLW | ARI  | 13   | -4  | -8  | -11 | -5  | -9   | -12 | -8  | -9
ALE | BOS  | 14   | 9   | 11  | 11  | 11  | 11   | 12  | 11  | 11
NLC | CHC  | 14   | -3  | 7   | -1  | 3   | 5    | 8   | 5   | 3
NLE | PHI  | 16   | 10  | 8   | 6   | 5   | 6    | 3   | 10  | 8
ALW | TEX  | 17   | -9  | -8  | -9  | -9  | -4   | 1   | -8  | -7
NLW | SFG  | 18   | -5  | -6  | -11 | -6  | -9   | -2  | -7  | -8
NLC | CIN  | 19   | 4   | 2   | 2   | -2  | 3    | -1  | 1   | 1
ALC | CHW  | 20   | -6  | -6  | -2  | -4  | -3   | -6  | -4  | -5
ALW | LAA  | 20   | 0   | 6   | 10  | 3   | 7    | 1   | 5   | 3
NLW | COL  | 22   | 9   | 5   | 3   | 6   | 16   | 16  | 9   | 8
NLW | SDP  | 23   | 8   | 4   | 0   | 1   | 4    | 1   | 4   | 3
NLC | PIT  | 24   | 7   | 5   | 3   | 6   | 5    | 9   | 4   | 5
ALW | SEA  | 25   | 0   | 3   | 12  | 7   | 1    | 2   | 4   | 4
ALE | TOR  | 26   | 4   | 4   | 11  | 4   | 1    | 2   | 5   | 3
ALC | KCR  | 27   | 4   | 1   | 0   | 1   | 0    | 0   | 0   | 0
NLE | MIA  | 28   | 0   | -1  | -2  | -1  | -1   | 1   | -1  | -1
ALE | BAL  | 29   | -1  | -1  | 1   | -1  | -1   | -1  | -1  | -1
ALC | DET  | 30   | 1   | 2   | 5   | 2   | 5    | 1   | 3   | 2

So there we have it. 538 makes the case for Elo-based projection systems, while the people prove that, like the automated strike zone, the machines haven’t actually got it all figured out yet.
