Projecting Baseball Like a Meteorologist (Part 2)

Welcome to Physics Friday! I would be happy to receive suggestions for future topics based on questions YOU have relating to baseball and science/physics/mechanics: you can drop them in the comments below or reach me at michael@btoolbox.org.

This is part 2 of N (probably N=3) of an extended discussion of the overlap between forecasting weather and projecting baseball. Part 1 was on techniques used to produce daily forecasts at one location (as in the wxchallenge.com competition). Part 2 will focus on other techniques used, especially in tropical cyclone forecasting. Part(s) 3(+), which may come later, will dig more deeply into individual methods.

In Summer 2014, I spent three months in rural County Galway, Ireland. I was collecting air samples at the Mace Head Atmospheric Research Station for my PhD research. However, due to restrictions on under-25 car rentals, I was forced to bike the ~12 miles between my rented house and the station each day. Driving a small stick-shift car from the right-side seat along narrow “one-lane, two-way” roads would have presented its own challenges, but spending so much time outside meant I got to know well the cool, rainy climate in western Ireland. The weather in the coastal North Atlantic doesn’t vary too widely. So you can imagine my surprise when, last week, I saw that Hurricane Ophelia was on track to make landfall nearly exactly at the Mace Head station.

NOAA National Hurricane Center forecast track for Hurricane Ophelia, issued Friday 13 October 2017, with a “cone of uncertainty” in its track in solid (3-day) and dotted (5-day) white shading

As I mentioned in my first piece on baseball and meteorology, I loved tropical meteorology when I was younger. To this day, the scale and power of hurricanes inspire in me great awe. Also as mentioned in that piece, I am not a trained meteorologist (especially in tropical meteorology). But from what I have learned, there are numerous connections between the specific field of tropical meteorology and baseball projection. I will introduce a few methods used to produce public forecasts for hurricanes and tropical storms, and discuss how a similar approach can be used in projecting player and team performance.

***

“Memory Effect” and Large-scale Flow

In the case of Hurricane Ophelia, the forecast track from October 13 was not a surprise given where the storm was. With stronger, more regular winds in the North Atlantic, Ophelia was almost certainly going to be driven by large-scale airflow patterns toward Ireland and the United Kingdom. Of course, the more amazing story is how a Category 3 hurricane survived the relatively cold waters at such a high latitude. But in hurricane forecasting, one can largely ignore the specifics of the past, instead focusing on the present situation (which, of course, is strongly influenced by the past) and how those conditions will evolve in the future.

Consider two players with the following statistics over 4 years: Player A with season-long WAR of 1.0, 2.0, 3.0, and 4.0; Player B with 0.1, 6.3, -0.4, 4.0. Absent other information, we know both players were worth 10 WAR over the last four years and both had a 4.0 WAR campaign last year; one could argue that either A or B will have the better season next year. That isn’t to say that past performance has no impact on the future: if Player B was injured in the -0.4 WAR season, we still know that A and B each have 10 WAR over the four years and 4.0 WAR last year, but we also know that B has an injury history. But without any additional information, it’s not unreasonable to project both players the same going forward.
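To make the comparison concrete, here is a minimal Python sketch of the arithmetic. The WAR values come from the example above; the `summarize` helper and the use of standard deviation as a measure of "path" are my own illustrative choices, not part of any real projection system.

```python
from statistics import pstdev

# WAR by season, taken from the two players described above.
player_a = [1.0, 2.0, 3.0, 4.0]
player_b = [0.1, 6.3, -0.4, 4.0]


def summarize(war_by_season):
    """Return what a 'present state only' view would see (total and most
    recent season), plus the season-to-season spread it would ignore."""
    return {
        "total_war": round(sum(war_by_season), 1),
        "last_season_war": war_by_season[-1],
        "spread": round(pstdev(war_by_season), 2),
    }


summary_a = summarize(player_a)
summary_b = summarize(player_b)

# Same four-year total and same most recent season...
print(summary_a["total_war"] == summary_b["total_war"])
print(summary_a["last_season_war"] == summary_b["last_season_war"])
# ...but very different paths to get there.
print(summary_a["spread"], summary_b["spread"])
```

A projection that only looks at the total and the most recent season cannot tell these two players apart; one that also looks at the spread (or at the reason behind a bad year) can.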

Build in Best Physical Knowledge

This might be rather straightforward, but it always serves predictive models well to include the best possible understanding of the physical world. In cases where there is a finite amount of computational power, sometimes tradeoffs need to be made in order to have the model run efficiently. But if a relatively well-quantified physical change is expected to occur, it should certainly be considered. Take the example above with Hurricane Ophelia: somewhere between the black circle-H at 2 AM Sunday and the white circle-H at 2 AM Monday, Ophelia was expected to undergo a drastic physical change from a warm-core tropical cyclone to a cold-core extratropical low pressure system. During that transition, Ophelia slowed as its internal structure rearranged. If the meteorological model had ignored that change, Ophelia would have arrived at Ireland earlier and with a very different structure.

This concept can apply to many different aspects of baseball projection, but perhaps the most illustrative example is in understanding the ascent and downturn associated with players getting stronger and more skilled, and unfortunately, seeing their talents deteriorate with age. Research on the effect of body type, position, and conditioning level on “aging curves” can help extend and improve careers, but in general, players typically peak and decline on similar timeframes. Ignoring the cold reality of physical decline would, of course, lead to worse player projections. Aging veterans may bring other intangible benefits to teams through leadership and mentoring, but projection models do not (and should not) try to quantify such aspects of a player’s career.
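As a rough illustration of building an aging curve into a projection, here is a toy Python sketch. Everything numeric in it is an assumption made for the example: the quadratic shape, the `peak_age` of 27, and the `curvature` value are placeholders, not values fitted to real player data.

```python
def aging_curve(age, peak_age=27.0, curvature=0.03):
    """Hypothetical talent offset (in WAR) relative to a player's peak.

    The quadratic shape, peak_age, and curvature are illustrative
    assumptions, not values fitted to real data.
    """
    return -curvature * (age - peak_age) ** 2


def project_next_season(last_war, age):
    """Carry last season's WAR forward, then add the expected one-year
    change implied by the assumed aging curve."""
    aging_delta = aging_curve(age + 1) - aging_curve(age)
    return last_war + aging_delta


# Two players coming off identical 4.0 WAR seasons at different ages.
young = project_next_season(4.0, age=23)
veteran = project_next_season(4.0, age=34)
print(round(young, 2), round(veteran, 2))
```

With these made-up parameters, the 23-year-old is nudged upward and the 34-year-old downward, even though their most recent seasons were identical. A model without the aging term would project them the same.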

Forecast Ensembles

In order to understand the uncertainty in hurricane forecast tracks, meteorologists use “forecast ensembles” in which different models are used and small changes are made in each model run. These changes help forecasters understand how (a) different physics in each model and (b) small discrepancies between actual conditions and those measured by NOAA Hurricane Hunters affect the resulting forecast.

Ensemble forecast tracks from Tropical Tidbits (my favorite site, even for non-tropical data) for Hurricane Jose on 16 September 2017, showing results from different meteorological models. Note that some have wild shapes with loops and curls that aren’t found elsewhere. (https://web.archive.org/web/20170916131439/https://www.tropicaltidbits.com/storminfo/12L_tracks_latest.png)

Ensemble intensity guidance from Tropical Tidbits for Hurricane Irma on 4 September 2017, showing results from different meteorological models. Most models agreed that Irma was going to intensify, but only one correctly identified that it would intensify to Category 5 on the Saffir-Simpson scale. Some models (such as the orange TCLP at the bottom) never initialized correctly and therefore totally missed on the forecast. Forecast skill in track (future location) has traditionally been much better than in intensity.

NAEFS model ensemble from University of Wisconsin-Milwaukee for Hurricane Harvey, 22 August 2017. Because they were run with the same base model, they agree much more closely in their tracks than the plot for Jose above. (http://derecho.math.uwm.edu/models/archive/2017/al092017/al092017_2017082218_ens.png)

These ensembles can show where there is consensus on the future path and strength of tropical cyclones. Typically, they will diverge as time goes on and model differences continually compound. It’s possible that none of the individual model tracks will prove to be correct; the average path, bounded by some amount of uncertainty on either side, is typically the best way to present public forecasts for tropical cyclones. Any irregularities in a particular model or model run will likely be “smoothed out” by averaging it with other ensemble members.
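The averaging and divergence described above can be sketched with a toy ensemble in Python. This is not how any operational forecast or the official cone is produced; the starting longitude, drift rates, and number of steps are all made-up values chosen only to show the mean and spread calculation.

```python
from statistics import mean, pstdev

# Hypothetical ensemble: each member is a list of storm longitudes (degrees)
# at successive forecast times. All members start at the same position and
# drift at slightly different rates, so they diverge as time goes on.
START_LON = -30.0
DRIFT_RATES = [1.6, 1.8, 2.0, 2.2, 2.4]  # degrees per step, made up
N_STEPS = 5

members = [
    [START_LON + rate * step for step in range(N_STEPS + 1)]
    for rate in DRIFT_RATES
]

# Consensus position and member-to-member spread at each lead time.
ensemble_mean = [mean(positions) for positions in zip(*members)]
ensemble_spread = [pstdev(positions) for positions in zip(*members)]

for step, (center, spread) in enumerate(zip(ensemble_mean, ensemble_spread)):
    print(f"step {step}: mean lon {center:.1f}, spread {spread:.2f}")
```

The spread starts at zero and grows with every step, which is the same qualitative behavior as the widening bundle of tracks in the plots above, while the mean stays on the middle path even if no single member follows it exactly.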

It’s very unlikely that any particular baseball model (PECOTA, ZiPS, Steamer, even proprietary internal team models) will be able to capture correctly every way in which a player can contribute to the team. But the two types of “ensemble runs” used in meteorology, where multiple models are compared and where a single model is run with varying initial conditions, have powerful analogues in baseball. Sites such as FanGraphs even feature crowdsourced projections, which can be considered as “ensemble runs” themselves or can be included alongside other models. The key to getting these ensembles to produce realistic final numbers is determining how much to trust each one in different circumstances (a “weighting” problem, which I’ll explore in a future piece).
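A minimal Python sketch of the weighting idea follows. The source names, projected WAR values, and weights are all hypothetical placeholders (they are not real output from any projection system), and a simple weighted average is only one of many ways the blending could be done.

```python
def blend_projections(projections, weights):
    """Weighted average of several projections for one player.

    projections: dict mapping a source name to its projected WAR
    weights: dict mapping the same source names to non-negative weights
    """
    total_weight = sum(weights[name] for name in projections)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(projections[name] * weights[name] for name in projections)
    return weighted_sum / total_weight


# Hypothetical projected WAR from three unnamed models and a crowd forecast.
projections = {"model_a": 3.1, "model_b": 2.6, "model_c": 3.4, "crowd": 3.9}

equal_weights = {name: 1.0 for name in projections}
# Made-up weights that trust the crowd less than the models.
skeptical_weights = {"model_a": 1.0, "model_b": 1.0, "model_c": 1.0, "crowd": 0.25}

equal_blend = blend_projections(projections, equal_weights)
skeptical_blend = blend_projections(projections, skeptical_weights)
print(round(equal_blend, 2), round(skeptical_blend, 2))
```

Because the crowd forecast is the most optimistic member in this example, trusting it less pulls the blended number down. Choosing those weights well is the hard part.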

***

Thank you for bearing with me through these broad discussions of meteorology and baseball, each of which seems to somehow be surface-level (I could have drilled deeply into any single bullet point!) yet still confusingly complex to those without a background in meteorology. But I know there are a few meteorology/baseball enthusiasts out there, based on the comments I have received in the past week. Future pieces will be much narrower in scope, and I look forward to receiving suggestions from you all.

In case you were wondering, the station at Mace Head, Ireland survived Hurricane (technically, Extratropical Storm) Ophelia. The station will still be there when I (likely) return next spring to take a few more samples. The one thing I will miss, especially since I’ll now be old enough to rent a car, is the abundance of fresh blackberries growing like weeds in the late summer. Time goes on, seasons will change, and the blackberries and baseball players alike will return just as surely as tropical storms and hurricanes will form in the warm ocean waters. Such is baseball, and such is life.

Michael McClellan

twitter @mjm3mit

Michael is a PhD Candidate in Atmospheric Science in the Department of Earth, Atmospheric, and Planetary Science (EAPS, Course 12) at MIT. Along with three other sports-inclined MIT friends, he co-founded the baseball toolBOX (www.btoolbox.org) research venture, which aims to use analytical methods from fields such as atmospheric science and electrical engineering to discover actionable insights in baseball physics and mechanics. You can reach him with comments, questions, and juiced-ball conspiracy theories at michael@btoolbox.org
