Can Past Stats Really Predict a Player's Future?

Note to the reader: I wrote this article a little over a year ago. I posted it on my blog for a month but took it down because it didn’t fit the team theme of the blog, and it has been collecting dust ever since. Now that it has a home, I’m going to make this a weekly series. I hope you enjoy it.

I wanted to write this article as a way to look at how people interpret stats when talking about players and projections. I’ve noticed through the years as a baseball fan that a lot of analysts, writers, and bloggers use past-season stats as a way of projecting what a player is capable of doing in the next season. This got me thinking about stats across baseball and how we read into the numbers to project players. I realized this article would need more than my opinion on the matter, and as much as I would like to bask in my own self-perceived glory, I decided to reach out to three writers to give this subject a broader perspective and maybe learn a thing or two myself.

Before we begin, let me introduce the writers who have graciously lent their voices to this article. We have Drew Fairservice of FanGraphs, Sam Miller of Baseball Prospectus, and Mike Petriello of Dodgers Digest. I chose these writers because, at the risk of sounding like a kiss-ass, they’re my favorites. I read their articles and listen to some of their podcasts and, even though I don’t see eye to eye with everything they produce, I have huge respect for them. So now that we’ve gone through the pleasantries, and I’ve put the ceremonial chapstick away, let’s get down to business.

Growing up, I was taught that the beauty of baseball is that you can’t predict outcomes, because there are so many variables that doing so is all but impossible. I’ve adopted that idea, maybe foolishly, in how I look at players. There seem to be too many variables: broken-bat hits, weird bounces, cracker-box ballparks, crazy umpires, etc. Don’t get me wrong, I think stats are useful tools for measuring what a player has done, what they’re doing, and their present value. Call me crazy, but I still find it hard to swallow that they’re great ways to project the future. So in my never-ending search to learn more about baseball and how others employ stats, I reached out to the three writers mentioned above and asked them the following: How much validity do stats have in predicting what a player will do in the future, and how much should we read into trends? This is what they had to say:

Sam Miller: “An object at rest stays at rest and an object in motion stays in motion, so in a very general sense the best way to predict the future is by measuring the past. Predicting the future in baseball is, of course, impossible, but there is a practical need to try, and to be as close as possible. That means not just saying ‘stats’ are good or not good, but trying to get the best stats and trying to be aware of the shortcomings. Bigger samples are better than smaller samples; recent is better than not recent; controlling for distorting outside factors (like ballpark, luck, etc.) is better than not controlling; and relevant non-statistical information shouldn’t be ignored.”
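
To make Sam’s checklist a little more concrete for myself, here is a rough Python sketch of what “bigger samples” and “recent is better” could look like in practice. Everything in it is my own assumption for illustration: the function name `project_rate`, the 3/2/1 season weights, and the 200 “phantom” league-average plate appearances are made up, and this is not how Sam or any real projection system does it.

```python
def project_rate(seasons, league_rate, weights=(3, 2, 1), regression_pa=200):
    """Blend past seasons into one projected rate (hypothetical sketch).

    seasons: list of (successes, opportunities) tuples, most recent first,
             e.g. (walks, plate appearances).
    league_rate: league-average rate to lean on when the sample is small.
    weights: assumed recency weights; the newest season counts the most.
    regression_pa: assumed number of league-average opportunities mixed in.
    """
    weighted_successes = 0.0
    weighted_opportunities = 0.0
    # zip() stops at the shorter input, so extra old seasons are ignored.
    for (successes, opportunities), weight in zip(seasons, weights):
        weighted_successes += weight * successes
        weighted_opportunities += weight * opportunities

    # Mixing in league-average opportunities pulls small samples toward the
    # league rate, while large samples are barely moved.
    return (weighted_successes + regression_pa * league_rate) / (
        weighted_opportunities + regression_pa
    )


# Made-up walk totals, most recent season first: (walks, plate appearances).
improving_hitter = [(60, 600), (30, 600)]
declining_hitter = [(30, 600), (60, 600)]
print(project_rate(improving_hitter, league_rate=0.08))
print(project_rate(declining_hitter, league_rate=0.08))
```

The two made-up hitters have identical career totals, but the one who walked more recently gets the higher projection. Likewise, a guy who walks 5 times in 10 plate appearances gets pulled most of the way back toward the league rate instead of being projected to walk half the time.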

Drew Fairservice: “The biggest key for projecting stats from previous seasons is focusing on what is skill-based and what is luck-based (if not luck, then stuff that remains beyond the control of the batter.) Things like walk rate and plate discipline are quite consistent season over season, as the batter has the most control over the one crucial detail: to swing or not to swing. Something like batting average tends to be more volatile because so much other stuff (like lucky bounces and scoring decisions) can affect it from one year to the next. This search for year-by-year consistency leads to numbers like ISO gaining prevalence over slugging percentage, or why home run per fly ball rate might tell you more about a player’s likelihood of repeating their breakout season. There is a lot of variance in the outcomes so it is easy to poke holes in projections because, well, it is extremely unlikely they’ll end up exactly right.”
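
For anyone who wants to see what “consistent season over season” means in numbers, here is a small Python sketch of how it could be checked. The player numbers are completely made up by me and deliberately rigged to show the pattern Drew describes, so treat the result as an illustration and not as evidence. The helper names `pearson` and `isolated_power` are my own; the only real baseball fact in here is that ISO is slugging percentage minus batting average.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(variance_x * variance_y)


def isolated_power(slugging, batting_average):
    """ISO: slugging percentage minus batting average."""
    return slugging - batting_average


# Hypothetical data: the same five invented hitters in back-to-back seasons.
walk_rate_year1 = [0.05, 0.08, 0.11, 0.14, 0.07]
walk_rate_year2 = [0.06, 0.08, 0.10, 0.13, 0.08]
batting_avg_year1 = [0.250, 0.300, 0.270, 0.310, 0.240]
batting_avg_year2 = [0.280, 0.260, 0.290, 0.270, 0.265]

# A value near 1 means year one tells you a lot about year two;
# a value near 0 (or below) means it tells you very little.
print("walk rate:", pearson(walk_rate_year1, walk_rate_year2))
print("batting average:", pearson(batting_avg_year1, batting_avg_year2))
```

With real data you would want hundreds of player seasons instead of five, but the idea is the same: the closer the correlation is to 1, the more a stat looks like a repeatable skill.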

Mike Petriello: “I do think that stats from past seasons have a huge importance when judging future seasons, but it’s just part of the puzzle. For example, Luis Cruz never walked that much in the minors for over a decade. He didn’t walk that much with the Dodgers or Yankees. After such a long track record of impatience at the plate, there’s little reason to think he’s going to change now — and players who walk so infrequently, time and again, have shown that they have difficulty succeeding over the long-term. Of course some (stats) have no year-to-year correlation at all – stats like wins & RBI, ones that rely so heavily on other players to the point that they have no individual relevance. That’s not to say there can’t be outliers, because part of the greatness of baseball is that you can never know for sure.”

So they all agree that predicting a player’s production to perfection is implausible, but that we can trace patterns in certain stats to come really close. Let’s examine some of the “outliers” and “non-statistical information” that Sam and Mike mention. Oftentimes you’ll read about a player who hasn’t played that well getting traded to another team and having a great season, and the narrative from reporters or TV analysts will be “He just needed a change of scenery” or “The hitting/pitching coach sure has turned this guy around.” Sometimes you hear reports of troubles in a player’s personal life, and you have to think they could affect his performance on the field. I admittedly buy into these narratives, and my love of a certain team is my downfall: I want to believe a player got magically better because he plays for my favorite team. So in the real world, how much weight can we put on a player’s ability to adjust to the instructions of coaches or to the change of scenery that comes with a new team?

Drew: “Adjustments happen but more often than not, a player is what he is. There is only so much coaching that can help established or older players.”

Mike: “A guy can improve unexpectedly for any number of reasons — a new diet, an improved dedication to working out, a pitching coach who teaches a new grip or fixes a mechanical flaw, resolving a bitter divorce case, etc. Things like that will always have an impact and largely we’ll never know about them, but on the whole, if a guy is who he is it’s extremely rare that they can suddenly change, especially at the big league level.”

Sam: “Most of us aren’t qualified to really measure the relevance of non-statistical information. We aren’t experts and (besides team officials and the players themselves) we aren’t insiders. What we think we know – based on news reports and such — comes through unreliable filters, and is driven by a lot of factors that aren’t analytically sound: the need for reporters to fill so many column inches, the need for players and coaches to present a positive image of themselves to the public, etc. We don’t truly know much of the inside stuff. Even if we do know it, we don’t know how much the inside stuff matters. Most importantly, the vagaries of unmeasurable inside stuff give people too much leeway to create whatever narrative based on whatever evidence they want.”

So non-statistical information may change a player’s future, or it may not. The fact that the information is unmeasurable makes it impossible to know how much, or how little, these changes affect a player’s future production.

I hope this article has been as educational and thought-provoking for you as it was for me in how you look at stats as a way to project the future, or at least entertaining to read. I want to thank Sam, Drew, and Mike for taking time out of their busy schedules to contribute their thoughts to this article. Usually I would end with a fully formed opinion and try to make some majestic point that makes me look brilliant. Instead, I’ll end with one more quote from Sam, who unknowingly summed up this article perfectly. I’ll let him tell you.

Sam: “One of the nice things about stats is they remove a lot of our subjectivity. There is still subjectivity in how we look at stats and how we deploy them, but at least the accounting itself is generally reliable and won’t lie to you unless you want it to.”

Joseph Garcia
