Welcome to my first (Baseball) Physics Friday post! I’m always looking for ideas to write up anything, from a quick physics explanation to a longform narrative. This piece will be the latter, in the style of the two articles I recently submitted to FanGraphs Community Research. I would be happy to receive suggestions for future topics based on questions YOU have relating to baseball and science/physics/mechanics: you can drop them in the comments below or reach me at firstname.lastname@example.org.
This is part 1 of N (probably N=3) of an extended discussion of the overlap between forecasting weather and projecting baseball. Part 1 will focus on techniques used to produce daily forecasts at one location (as in the wxchallenge.com competition). Part 2 will focus on other techniques used, including in tropical cyclone forecasting. Part(s) 3(+) will dig more deeply into individual methods and experimental-computational combinations.
Please excuse me if the first words I write here constitute baseball apostasy, but my first love wasn’t baseball. It was weather. I didn’t grow up collecting baseball cards or playing baseball video games (with the exception of Super Bases Loaded at my neighbor’s house); instead, I tracked hurricanes on a laminated world map and dreamed of one day becoming a research meteorologist. These days, despite being close to completing my PhD in Atmospheric Science, I’m only an amateur meteorologist (I study air pollution and greenhouse gas emissions) and I have more baseball physics writing bylines than weather research paper co-author credits. From that jumbled research pedigree, I have learned quite a bit about the methods used to forecast the weather and methods used to project baseball player performance.
Forecasting the weather correctly can, in the most extreme cases, mean the difference between life and death. Therefore, it’s easy to understand why there are so many resources and staff members available to branches of the National Oceanic and Atmospheric Administration (NOAA) who forecast various aspects of the weather. MLB teams likely wouldn’t be able to fully harness the power of $45 million supercomputing clusters the way the National Weather Service does, but there are many approaches to meteorological forecasting that teams and baseball writing outlets could implement to improve the efficacy of their player and team projections.
When I was at the FanGraphs meet-up at Saberseminar 2017, I had the great fortune of grabbing Jeff Sullivan’s attention for about 10 minutes. We didn’t spend more than a few moments talking about baseball; instead, after I mentioned that I was a member of the MIT WxChallenge team (WxChallenge is an intercollegiate weather forecasting competition), we spent the rest of our conversation talking about the meteorology of mountainous areas. WxChallenge involves making a forecast (high temperature, low temperature, wind speed, and precipitation) every weeknight for an assigned city, and the best WxC forecasters typically follow a pattern when deciding their final forecasts. In this piece, I will outline my typical workflow for creating a daily weather forecast, with discussion along the way of how some of these ideas could be applied to projecting future performance in baseball.
- Where Is the Forecast?
This sounds more like a meta-rule than a real point to consider, but you first have to understand where you’re forecasting. It’s quite different for lifetime National Weather Service professionals, who live near the area they’re constantly forecasting, than for WxChallenge participants who have a new city every two weeks.
- In baseball, the scale of what you’re considering will drastically change the approach you take. Projecting a potential two-way (pitcher/hitter) draftee vs a veteran MLB player vs an entire team will all pose significantly distinct challenges.
Absent perfect recall of an entire geographic area, it’s crucial to examine the geographic situation of a forecast location. The most challenging city I have ever forecast for was Kodiak, Alaska (station PADQ if you want to read the raw METAR data), because slightly different wind patterns bring drastic changes there. Water can hold heat much better than air (due to its high heat capacity), so wind blowing over large water features can moderate drastic changes in temperature. Physical features such as mountains, hills, and valleys can also interrupt and redirect airflow near the stations.
- The connection to baseball here is perhaps too direct, but the stadium location and dimensions must be considered when projecting how a player will do. We all know about the offense-inflating effects of dry high-altitude stadiums, but prevailing winds AND wall height/distance in each stadium will also affect all aspects of any game played there.
Precipitation and Temperature climatology for Boston, MA (http://drought.unl.edu/archive/climographs/BostonANC.htm)
The final step in weather forecasting that comes before reading *specific* recent information is understanding the average weather conditions expected for the current date. This “climatology” by month can help put into context large trends; an abnormally dry summer, in which less rain falls than the long-term climatology would indicate, will influence future forecasts of precipitation.
- Are there any league-wide trends to consider? Home run rates, strikeout rates, and pitch velocity can be considered in a global league-wide sense.
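To make the comparison against climatology concrete, here is a minimal Python sketch of the "abnormally dry summer" check. The monthly normals and observed totals are invented for illustration (they are not Boston data), and `percent_of_normal` is a hypothetical helper name.

```python
# Illustrative only: these precipitation values (inches) are invented,
# not taken from any real station record.
CLIMO_PRECIP_IN = {"Jun": 3.7, "Jul": 3.4, "Aug": 3.3}
OBSERVED_PRECIP_IN = {"Jun": 2.1, "Jul": 1.9, "Aug": 2.6}


def percent_of_normal(observed, climatology):
    """Return total observed precipitation as a percentage of the climatological total."""
    months = climatology.keys()
    observed_total = sum(observed[m] for m in months)
    normal_total = sum(climatology[m] for m in months)
    return 100.0 * observed_total / normal_total


summer_pct = percent_of_normal(OBSERVED_PRECIP_IN, CLIMO_PRECIP_IN)
# Anything under 100% of normal counts as drier than climatology.
is_dry_summer = summer_pct < 100.0
print(f"Summer precipitation: {summer_pct:.0f}% of normal")
```

The same pattern would work for a league-wide baseline: swap the monthly normals for a multi-year league average of, say, home run rate, and compare the current season against it.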
- Take Stock of the Current Conditions
Surface Analysis map, showing front boundaries, barometric pressure contours, and current weather at major airports (temperature, pressure, dew point, precipitation, wind speed/direction, and sky cover) (http://www.wpc.ncep.noaa.gov/sfc/namussfcwbg.gif)
Once I have a handle on the forecast city, the first thing I check on subsequent days is the graphical map of current conditions at the ground (called a surface analysis) at major airports and buoys. It takes some practice to read all of the information displayed, but this “kitchen sink” map points a forecaster’s attention to areas of notable or changing weather.
- I promise these baseball connections will get more specific as the weather examples get more specific! In this case, one could similarly look at how teams have done. Who are the stars on each team? If it’s the middle of the season, what are their records?
500 mb (500 millibars, a “pressure height” approximately 18000 ft or 5600 meters from the surface) analysis, including height, temperature, and wind where weather balloon observations were made (http://weather.unisys.com/upper_air/ua_man.php?plot=500&inv=0&t=cur)
Another important map to dissect is the 500 mb (millibar is a unit of atmospheric pressure, with the pressure at the surface being ~1000 mb) analysis, which shows the height (in meters) at which pressure drops to 500 mb. This map is created by displaying and stitching together the weather balloon measurements made at major airports. The wind at this level, indicated by the “flags” coming out of each point, is known to “steer” large-scale weather patterns much more strongly than winds at the surface.
- Just as the 500 mb surface steers the weather to come, a team’s future commitments (arbitration, pending free agents, etc.) and farm system status will determine likely future rosters. The weather can certainly be affected by conditions closer to the surface or by yet-unseen factors, and team rosters can be unexpectedly affected by trades, injuries, and early/late promotions of minor-league players.
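As a rough check on the height quoted for the 500 mb level, the sketch below applies the standard-atmosphere barometric formula. The constants are the usual International Standard Atmosphere values, which is an assumption of mine rather than anything taken from the maps above, and `standard_height_m` is a hypothetical helper name.

```python
# International Standard Atmosphere constants (troposphere only).
# These are assumed textbook values, not measurements from any map.
SEA_LEVEL_PRESSURE_MB = 1013.25
SEA_LEVEL_TEMP_K = 288.15
LAPSE_RATE_K_PER_M = 0.0065
GRAVITY_M_PER_S2 = 9.80665
DRY_AIR_GAS_CONSTANT = 287.053  # J/(kg K)


def standard_height_m(pressure_mb):
    """Height (m) at which a standard atmosphere reaches the given pressure."""
    exponent = DRY_AIR_GAS_CONSTANT * LAPSE_RATE_K_PER_M / GRAVITY_M_PER_S2
    ratio = (pressure_mb / SEA_LEVEL_PRESSURE_MB) ** exponent
    return SEA_LEVEL_TEMP_K / LAPSE_RATE_K_PER_M * (1.0 - ratio)


height_500 = standard_height_m(500.0)
print(f"Standard-atmosphere height of 500 mb: {height_500:.0f} m")
```

Under these assumptions the result lands between 5500 and 5600 meters. The real 500 mb height rises and falls with the weather, and those departures from the standard value are exactly what the analysis map is drawn to show.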
- Weather Models
In the age of large supercomputer clusters, the major workhorses of weather forecasts are the large numerical weather prediction models produced by government agencies and research labs. These models typically assimilate real observations (at the surface and at altitude) and interpolate between them to fill in gaps in space and time.
The preeminent models all have different spatial resolutions in order to focus computational resources on the areas of greatest interest. The Global Forecast System (GFS) model forecasts for the entire globe, but at a relatively coarse spatial resolution. The North American Mesoscale Model (NAM), on the other hand, only looks at North America and has a finer spatial resolution. But recently, both American forecast models (GFS and NAM) have struggled to match the accuracy of the model produced by the European Centre for Medium-Range Weather Forecasts (ECMWF). Weather models have varying success in different meteorological situations, so it is important to examine all available models.
- All of the different baseball projection models follow a similar workflow: read in observations, process those observations, and determine (using calculations with different variables and parameters) the most likely future outcomes. In order to contextualize these models, however, one must have a good idea of specific situations that cause them trouble.
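The read, process, project workflow can be sketched in a few lines of Python. This is a generic illustration only: it is not how Steamer, ZiPS, or any other named system works, and the season weights, the `regression_pa` value, the sample stat lines, and the league rate are all hypothetical.

```python
def project_rate(seasons, league_rate, weights=(5, 4, 3), regression_pa=1200):
    """Project a per-plate-appearance rate from recent seasons.

    seasons: list of (successes, plate_appearances), most recent first.
    weights and regression_pa are hypothetical tuning parameters.
    """
    weighted_successes = 0.0
    weighted_pa = 0.0
    # Process the observations: recent seasons count more than older ones.
    for (successes, pa), weight in zip(seasons, weights):
        weighted_successes += weight * successes
        weighted_pa += weight * pa
    # Regress toward the league rate by adding league-average "phantom" PAs.
    numerator = weighted_successes + regression_pa * league_rate
    return numerator / (weighted_pa + regression_pa)


# Invented home run totals and plate appearances for one hitter.
recent_seasons = [(30, 600), (25, 550), (20, 500)]
projected_hr_rate = project_rate(recent_seasons, league_rate=0.03)
print(f"Projected HR per PA: {projected_hr_rate:.4f}")
```

Even a toy like this shows where a model can get into trouble: a player with very few plate appearances is pulled almost entirely toward the league rate, whether or not that is appropriate for him.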
Some meteorological models aren’t concerned with the weather in large areas, and instead focus on one specific location. These “point forecast” models often read in the raw output from other large-domain models and apply computationally intensive calculations in one location to derive a better forecast there. The USL model does exactly that for WxChallenge cities, and the USL model typically performs better in those particular cities than NAM or GFS.
- Imagine that, instead of trying to predict an individual player’s future performance from league-wide trends and a small number of observations of that particular player’s performance, a model took in information on every swing a player takes (in BP, in Spring Training, in games) and projected future performance from that. Sounds time-consuming, but potentially much better than projections solely from Steamer, ZiPS, etc.
- Past Performance and Adjustments
It’s rare that the forecast models get everything 100% right. In fact, it’s relatively common that they are pretty far off. But the differences between the actual observed weather (also called “verification” conditions) and what the models predict can be extremely instructive in how to adjust model output to create human-produced forecasts.
NAM predicted temperature (T, red/orange) and dew point temperature (Td, green) at Boston on 9 Oct 2017, with model surface conditions (“2m”), Model Output Statistics (MOS) adjustments, and real observations (http://wind.mit.edu/~btangy/Home/Forecast2.html)
At each METAR station where weather is observed and reported, there is a set of multiple linear regressions used to adjust raw model output to better match the observed weather. These adjustments lead to an enhanced “guidance” forecast using Model Output Statistics (MOS). The MOS guidance should help with prevailing temperature trends based on, for example, the wind direction and intensity. If the underlying model, however, is changed (even if it is improved), the old MOS adjustments will likely no longer be useful.
- If a baseball model, for example, did not correctly handle the value of a curveball, pitchers who heavily rely on a curveball might not be projected correctly. Introduction of a MOS-like guidance based on past performance of the model could greatly enhance the value of the model’s raw output without drastically overhauling the model itself.
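Here is a minimal sketch of a MOS-style correction: a multiple linear regression that maps raw model output to what was actually observed. Everything in it is an assumption for illustration. The two predictors (raw model temperature and an onshore wind speed), the helper names, and the synthetic "observations" (built from a known relationship so the fit can be checked) are all mine; operational MOS uses many more predictors and real verification data.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    # Work on an augmented copy so the inputs are not modified.
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_mos(predictors, observed):
    """Least-squares fit of observed = b0 + b1*x1 + b2*x2 + ... (normal equations)."""
    design = [[1.0] + list(row) for row in predictors]
    size = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(size)] for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(design, observed)) for i in range(size)]
    return solve_linear_system(xtx, xty)


def apply_mos(coefficients, predictor_row):
    """Adjust one raw model forecast using the fitted coefficients."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], predictor_row))


# Synthetic history: (raw model temperature in deg F, onshore wind in knots).
history = [(60.0, 0.0), (65.0, 10.0), (70.0, 5.0), (55.0, 15.0), (62.0, 8.0), (68.0, 2.0)]
# Synthetic "observations" generated from a known relationship, for checking only.
observed = [1.5 + 0.95 * temp - 0.4 * wind for temp, wind in history]

coefficients = fit_mos(history, observed)
adjusted = apply_mos(coefficients, (64.0, 12.0))
print(f"MOS-adjusted temperature: {adjusted:.1f} F")
```

A baseball analogue would regress a projection system’s past misses on features it handles poorly (curveball usage, in the example above) and apply the fitted correction to its next raw output. As with real MOS, the correction would need refitting whenever the underlying model changes.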
High and low temperature error (departure from the pink center lines) at Boston, with light red/blue (“MET”) representing NAM MOS and dark red/blue (“MAV”) representing GFS MOS. The bottom panel is a complex chart showing observed cloud cover, temperature, dew point temperature, and wind. (http://wind.mit.edu/~btangy/Home/Forecast2.html)
Some model-observation mismatch cannot be solved by MOS guidance alone, especially when there are rapid changes (such as a cold front or hurricane passing through). Models are assigned a “skill score” based on how well they predicted verified past weather conditions. In the particular case above at Boston, the GFS MOS (MAV, dark red and blue) has a smaller average error (the small number on the right side of the plot) than the NAM MOS (MET).
- It’s a cold reality that some baseball projections are better than others. Each may have advantages in certain areas, but making choices on which model to believe in various situations might come down to how well they have operated in the past (and ignoring the models with serious issues).
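Comparing guidance by past error can be sketched as below. The temperatures are invented, not read off the Boston plot, and mean absolute error is used as a simple stand-in: a formal skill score would also compare each forecast against a reference such as climatology.

```python
def mean_absolute_error(forecasts, observations):
    """Average size of the forecast misses, ignoring their sign."""
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Invented high temperatures (deg F) for five verified days.
observed_highs = [61, 64, 58, 70, 66]
guidance = {
    "MET": [63, 67, 57, 74, 69],  # stand-in for NAM MOS
    "MAV": [62, 65, 57, 72, 66],  # stand-in for GFS MOS
}

scores = {
    name: mean_absolute_error(values, observed_highs)
    for name, values in guidance.items()
}
# Lower error is better, so pick the guidance with the smallest score.
best_guidance = min(scores, key=scores.get)
print(scores, best_guidance)
```

The same bookkeeping applies to projection systems: score each one against verified past seasons, ideally broken out by player type, before deciding which to lean on.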
- Human Element!
This doesn’t sound too scientific, but ultimately, the best weather forecasts use computational forecast models as a base on which human insight (honed by extensive practice) leads to minor adjustments. Sometimes models alone are best; sometimes they totally whiff. On the other side, humans are prone to reading patterns where there are none, but they can also spot patterns in apparent noise. Human forecasters can still show much more skill than raw or adjusted model output because they are capable of (a) reading and adjusting model output and (b) contextualizing observations before making a final decision.
- This is one area where baseball might actually be doing a good job! Internal MLB team projections likely assimilate large amounts of data produced by human scouts. In many teams, organizational decisions might still be driven by scouting reports first and projections/models second; however, that represents a much better starting point than all-scout or all-model forecasts!
So, dear reader, if you made it to the end, you can probably tell that I’m a real weather nerd. I swear, though, I’m not even the best weather forecaster on the MIT team! Given that there are so many MLB team front office members and baseball writers with backgrounds in meteorology, teams must think the game of baseball can be enhanced by understanding core concepts in meteorology. I look forward to exploring some more of these concepts with you next week!