I attended the Society for American Baseball Research’s Analytics Conference in Phoenix. It started Thursday, March 10 and ran through Saturday, March 12. (I was going to post daily updates, but didn’t, for networking/going-out-for-drinks reasons.) The full schedule is here, and the full program is here. Rather than give you a blow-by-blow of every presentation, I thought I’d just run off some of the greatest hits.

Diamond Dollars Case Competition

This is an annual feature of the conference. A number of college teams–there were 13 this year, nine undergrad and four graduate/professional–are given a case to analyze. This year’s challenge was to pick a National League team and design a five-person bullpen subject to the following restrictions:

  • One selection from Dellin Betances, Craig Kimbrel, Kenley Jansen, and Andrew Miller
  • One selection from Zach Britton, Ken Giles, Mark Melancon, and Tony Watson
  • Two selections from David Robertson, Cody Allen, Trevor Rosenthal, Drew Storen, Carson Smith, Darren O’Day, Sergio Romo, Josh Fields, Luke Gregerson, Liam Hendriks, Kelvin Herrera, and Jeurys Familia
  • One at-large selection. (Aroldis Chapman and Wade Davis were excluded)
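The restrictions above amount to a small constraint check. Here is a rough sketch in Python of how one might validate a proposed bullpen. The function name `is_valid_bullpen` is made up for illustration, and it assumes an interpretation the case rules may not share: that the at-large pick can be any non-excluded pitcher, including one from the three listed pools.

```python
# Pools taken from the case restrictions listed above.
GROUP_A = {"Dellin Betances", "Craig Kimbrel", "Kenley Jansen", "Andrew Miller"}
GROUP_B = {"Zach Britton", "Ken Giles", "Mark Melancon", "Tony Watson"}
GROUP_C = {
    "David Robertson", "Cody Allen", "Trevor Rosenthal", "Drew Storen",
    "Carson Smith", "Darren O'Day", "Sergio Romo", "Josh Fields",
    "Luke Gregerson", "Liam Hendriks", "Kelvin Herrera", "Jeurys Familia",
}
EXCLUDED = {"Aroldis Chapman", "Wade Davis"}


def is_valid_bullpen(pitchers):
    """Return True if the five pitchers satisfy the case restrictions.

    Assumption (not stated in the case as reported here): the at-large
    pick may come from anywhere, including the three pools.
    """
    names = list(pitchers)
    # Exactly five distinct pitchers, none of them excluded.
    if len(names) != 5 or len(set(names)) != 5:
        return False
    if EXCLUDED & set(names):
        return False
    # Try each pitcher as the at-large pick; the other four must then
    # split as one from pool A, one from pool B, and two from pool C.
    for at_large in names:
        rest = [p for p in names if p != at_large]
        counts = (
            sum(p in GROUP_A for p in rest),
            sum(p in GROUP_B for p in rest),
            sum(p in GROUP_C for p in rest),
        )
        if counts == (1, 1, 2):
            return True
    return False


example = ["Andrew Miller", "Zach Britton", "Jeurys Familia",
           "Darren O'Day", "Brett Cecil"]
print(is_valid_bullpen(example))
```

A stricter reading, in which the at-large pick must come from outside the pools, would only need an extra membership check on `at_large`.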

To me, the pitchers chosen weren’t as interesting as the methodologies the teams used. I heard six presentations. Some teams looked for pitchers with certain features, e.g. lefty/righty, ground ball/fly ball, and K percentage. Some made projections using FIP, others xFIP, others PECOTA or STEAMER, and still others their own home-grown metrics. Some made specific adjustments based on their selected team’s opponents and parks in 2016. Some sought to fill specific roles (closer, setup, etc.) while others went for the best arms available. Here were the selections:

  • Syracuse–For the Pirates: Miller, Britton, Familia, O’Day, Brett Cecil
  • Middlebury–For the Diamondbacks: Jansen, Melancon, Smith, Robertson, Cecil
  • NYU*–For the Diamondbacks: Miller, Britton, Smith, Familia, Cecil
  • St. John Fisher–For the Diamondbacks: Betances, Britton, O’Day, Familia, Cecil
  • Virginia Polytech and State U–For the Cubs: Miller, Watson, Fields, Jared Hughes, Hector Rondon
  • Ohio University*–For the Pirates: Betances, Britton, Smith, Familia, Will Smith

The schools with the asterisks were two of the three winners. (I didn’t hear the third.)


SABR Analytics Conference Research Awards

AD previewed the Research Awards last month. The winners were:

  • Contemporary Baseball Analysis: Benjamin S. Baumer, Shane T. Jensen, Gregory J. Matthews, “OpenWAR: An Open Source System for Evaluating Overall Player Performance in Major League Baseball,” Journal of Quantitative Analysis in Sports, Vol. 11, Issue 2, June 2015
  • Contemporary Baseball Commentary: Alexis Brudnicki, “I’m Different, I’m the Same,” The Hardball Times, November 18, 2015
  • Historical Analysis/Commentary: John McMurray, “Examining Stolen Base Trends by Decade from the Deadball Era through the 1970s,” SABR Baseball Research Journal, Fall 2015


Notable Comments

MLB Now host Brian Kenny said that the sabermetric revolution has been won, with very few holdouts: most teams (including, most recently, the Brewers, Phillies, Reds, and Tigers) and managers have come to realize the value of analytics (even if it’s just to provide them with cover). There’s still resistance in the media. That said, baseball remains very slow to adopt new ideas. Kenny noted that the live ball was introduced in 1909 but the Live Ball Era didn’t begin until 1921; Lou Boudreau’s Williams Shift debuted in 1946, but shifting wasn’t embraced until 2013; and the Washington Senators’ Firpo Marberry was closing games in 1924, but the relief “fireman” didn’t emerge until the 1940s and 1950s. Sabermetrics began in the 1970s and 1980s, but it took Moneyball (the book) in 2003 to popularize it.

“If I were a player now, I think I’d pay attention to more information”–Aaron Boone, noting that players realize that analytics can help extend their careers.

Baseball Info Solutions (Defensive Runs Saved, The Fielding Bible) made two presentations. One introduced positioning, in addition to range and throwing, as a key defensive metric, given the popularity of shifts. Andrelton Simmons, now playing for the Angels, was the top-rated shortstop at range and throwing combined. But his old team, the Braves, didn’t shift much, while the Yankees ranked third overall in shifts, so the Yankees’ Didi Gregorius rated ahead of Simmons overall.

BIS also began cataloging injuries in 2015, including minor ones. Among the interesting findings:

  • The three most common injuries were batters fouling balls off their body, fielders (almost always a catcher) struck by a batted ball or bat, and batters being hit by pitches. However, those events rarely resulted in significant injuries.
  • The body parts most often affected were ankles/feet/toes, knees/lower legs, and heads.
  • Catchers are the most commonly hurt players, with 90% of their injuries arising from being hit by a ball or bat, and 44% of those affecting their head.
  • The event most frequently resulting in a player leaving a game was throwing a pitch. When a pitcher hurts himself throwing a pitch, he’s removed from the game 44% of the time.
  • When a catcher’s hit in the head more than once in a game, his offensive performance declines for about a week.
  • Moderately severe wrist injuries result in decreased offensive production.
  • Pitcher head injuries result in the largest decrease in fastball velocity, followed by hip and shoulder injuries.

Re analytics, “All 30 teams use it, it just depends on the degree”–Ken Rosenthal

“I think he has a chance to be a great player. We think he’s a champion-style player.”–Derrick Hall, Diamondbacks President, on departed No. 1 draft pick Dansby Swanson.

Arizona has three full time data analysts, but “we are about personality and character”–Hall. As an aside, I don’t think that’s a win-optimizing strategy.

“The early [sabermetric] decisions were made on an event basis. With new qualitative stats, we’re seeing a convergence between the stats and the scouting. It’ll confirm a lot of things we’re hearing from scouts.”–Dick Williams, Reds GM, explaining how process-oriented metrics, like spin rate, route efficiency, and exit speed, are enabling teams to evaluate players in ways that aren’t necessarily reflected in their outcomes.

“You can keep your foot on the accelerator but you have to focus on non-player spending as well”–Billy Eppler, Angels GM, on how a team can be successful while maintaining a high payroll provided it spends on drafting, foreign scouting, development, and the like. Eppler added that he divides each season into thirds. During the first third, from Opening Day to Memorial Day, he focuses on process more than results, identifying strengths and weaknesses. From Memorial Day to the trade deadline, he seeks to improve the club as much as possible. From then on, other than some minor league callups, it’s mostly a matter of playing the hand you’re dealt.

Trade that you didn’t make that you wish you had? “Chapman at the deadline.”–Williams

Skills desired for a front office hire: “SQL, R, data visualization, machine learning, an ability to see a problem through, beginning to end.”–Williams. “Objectivity and open-mindedness”–Eppler.

“Tell the players what you value and they’ll make themselves that way.”–Eppler, relating a quote from Alex Rodriguez when Eppler was a Yankees executive.

The next wave of technology isn’t going to be stuff that occurs on the diamond. A technology panel discussed advances in kinesiology/physiology, motion capture, and neural research, measuring physical movement, force, and pitch recognition. The next big frontier? Bringing the information together for medical purposes. “Sensors are going to be everywhere,” on players and on equipment.

The two most visible faces of Statcast are Mike Petriello of mlb.com and Daren Willman of MLB Advanced Media and BaseballSavant.com. They discussed what they see as the next Statcast breakthroughs: outfielder and infielder batter-to-batter positioning, the possible relationship between exit velocity and injury, throw accuracy, swing force, and metrics based on expected outcomes given speed, angle, and spin.

We were recording maybe 5% of fielding data back when all that was available was putouts, assists, and errors. When Bill James introduced zone ratings, which were initially just plays per game, it rose to 10%. “Now we’re capturing 60%-65% of the data.”–John Dewan, owner of Baseball Info Solutions, arguably the person most responsible for capturing that amount.

Dewan, describing a fielding fantasy league he was in, in which one roster position is reserved for a player who is a bad fielder and whose Defensive Runs Saved score (presumably negative) is multiplied by -1 before being added to the team total: “Any Ramirez playing in Boston is a good bet for this.”

Alex Cora, ESPN analyst, on a concern with shifts: “They put fielders into positions in which it’s harder to turn a double play.” Dewan: Fewer double plays turned, but more opportunities.

Cora, on the demands of shifting: Second basemen with stronger arms than traditional second basemen, first basemen with good hands to handle throws from various angles, third basemen with more athleticism to handle slower-hit balls requiring more range.

“It’s not a needle in a haystack, it’s a needle in a haystack made of needles”–MLB Network Radio and Diamondbacks analyst Mike Ferrin, describing Statcast.

There was a panel discussion titled “How Big Data and Analytics is Impacting Baseball’s Business Operations.” To be honest with you, my takeaway from this panel, which focused on the use of analytics in the business side of baseball, is that the sport is behind not only other large private-sector businesses like retail, but also, according to some of the other conference attendees, other major sports. If I were a young analytics-savvy person wanting to get into the sport, the business side is the route I’d pursue.

“Make sure the numbers they [on-field personnel] get [are in] the right context” so they understand why they have the information and what to do with it–Yeshayah Goldfarb, VP of Baseball Operations, Giants

It’s easier to introduce a new concept, like infield shifts, “when you play Tampa 19 times per year.”–Sarah Gelles, Director of Analytics, Orioles. This comment made me feel that the lesson of Moneyball is still valid: there are market inefficiencies that a first mover can exploit, albeit most likely not for long.

Finally, there were several interesting research presentations. One focused on high school player draft age, noting that current ability is overvalued relative to potential value, with teams favoring older (i.e., 18- or 19-year-old) players over younger ones (17-year-olds) even though younger players have generated greater value. Rob Arthur of FiveThirtyEight.com presented research suggesting that exit velocity may support the concept of hot and cold streaks–he singled out Josh Reddick, Josh Donaldson, Carlos Gonzalez, Kole Calhoun, and Shin-Soo Choo as streaky hitters–but warned that he found only exceptionally short streaks, with an average of four (yes, four) plate appearances, though a quarter were greater than ten plate appearances. Several presentations discussed mixed modeling, breaking outcomes into components, e.g., pitching as a product of ballpark, framing, umpiring, batter, and fielder input as well as the pitcher’s, or batting as a product of fairly stable strikeout rates, somewhat stable home run rates, and less stable BABIP. Another researcher demonstrated that major league teams are not stealing bases optimally, and could improve win expectancy by targeting more steals with a positive run expectancy. Lastly, some guy expanded on an article that was published here.
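To make the stolen-base point concrete, here is a back-of-the-envelope sketch of what “a positive run expectancy” means for a steal attempt. It is not the presenter’s model. It uses the standard run-expectancy framing, the function names are mine, and the three run-expectancy figures in the example are illustrative placeholders, not numbers from the presentation.

```python
def steal_run_value(p_success, re_start, re_success, re_fail):
    """Expected change in run expectancy from attempting a steal.

    p_success:  probability the runner is safe
    re_start:   run expectancy before the attempt
    re_success: run expectancy if the steal succeeds
    re_fail:    run expectancy if the runner is caught
    """
    expected_after = p_success * re_success + (1 - p_success) * re_fail
    return expected_after - re_start


def break_even_rate(re_start, re_success, re_fail):
    """Success rate at which the attempt neither gains nor loses runs."""
    return (re_start - re_fail) / (re_success - re_fail)


# Illustrative placeholder values (NOT from the presentation):
# runner on first with nobody out, stealing second.
RE_START, RE_SUCCESS, RE_FAIL = 0.90, 1.10, 0.25

print(round(break_even_rate(RE_START, RE_SUCCESS, RE_FAIL), 3))
print(round(steal_run_value(0.80, RE_START, RE_SUCCESS, RE_FAIL), 3))
```

With these placeholder inputs, a runner needs to be safe roughly three times in four for the attempt to pay off. A fuller treatment would use win expectancy, which also depends on inning and score.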



2 Responses to “Notes from the SABR Analytics Conference”

  1. Bryan Lee

    The third winner was from Carnegie-Mellon. To follow your format: for the Braves: Kenley Jansen, Zach Britton, Carson Smith, David Robertson, and Brett Cecil.

