Covering the Front and Back Pages of the Newspaper
January 22, 2007
BASEBALL: Lost in Translation
I have greatly enjoyed the new "Neutralize Stats" feature at Baseball-Reference.com, explained in detail here. Of course, long-time readers will recall my longstanding interest in "translated" statistics. (See here and here for examples.) There is endless fun to be had playing around with the feature, which lets you translate players from all different kinds of contexts into a common context.
But I have been disappointed in a few of the features of B-R.com's translations. The problems are somewhat related, and are most apparent in dealing with 19th century players, but here they are, presented in a spirit of constructive criticism:
1. Comparing Earned Runs Allowed to League Runs/Game is an Apples-to-Oranges Comparison.
Consider this explanation of how the translation worked for Pedro Martinez in 2000:
Martinez allowed 42 earned runs in 2000. After adjustment, Martinez's runs created allowed went from 38.91 to 33.67. We'll assume that the change in his earned runs will be proportional to the change in his runs created allowed:
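The proportional step described in that explanation can be sketched in a few lines. This is only an illustration of the arithmetic as quoted, not B-R.com's actual code; the function name is my own invention, and the three inputs are the figures given above.

```python
def scale_earned_runs(earned_runs, rc_allowed_before, rc_allowed_after):
    """Scale earned runs by the same ratio as runs created allowed.

    Hypothetical helper: it assumes, as the quoted explanation does, that
    earned runs change in proportion to runs created allowed.
    """
    return earned_runs * rc_allowed_after / rc_allowed_before


# Figures from the Pedro Martinez 2000 example quoted above.
adjusted_er = scale_earned_runs(42, 38.91, 33.67)
print(round(adjusted_er, 2))
```

On those inputs the adjusted figure comes out a bit over 36 earned runs.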
If you are following carefully, you will note that the method (1) adjusts ERA fairly well based on an adjustment of hits and the like keyed to the league scoring average, but then assumes a predictable relationship between runs and earned runs, and (2) then figures a W-L record based on a comparison of that assumed runs allowed rate to the actual league runs allowed rate.
The problem when you do this is that if you are starting with a pitcher from before 1920 and especially before 1900, when a huge proportion of league runs were scored on errors, you will create a "neutralized" runs allowed rate that is far lower than the league scoring average - and thus, you will assume that even an average pitcher in that era was allowing far fewer runs per inning than were being scored. If you don't believe me, just try to find a 19th century pitcher whose neutralized stats give him a losing record. Jim Hughey, for example, was a terrible pitcher, with a career ERA+ of 80 and a career record of 29-80 - but his neutralized career W-L record is 72-62. Tony Mullane goes from 284-220 (.563) to 554-178 (.778). Ted Breitenstein goes from 160-170 (.485) to 276-128 (.683). Jim Devlin goes from 72-76 (.486) to 371-40 (.903) and a 1.27 career ERA.
Personally, I've always thought that translated W-L records should be tied, in some way, to actual W-L records - adjusted by context, yes, but tethered in some realistic fashion to the actual games actually won. Even if you disregard that, though, a realistic system for translating W-L records should not simply disregard the fact that different pitchers pitched under vastly different conditions in terms of unearned runs allowed.
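To see why the earned-runs-versus-total-runs mismatch inflates records, here is a rough sketch. Everything in it is an assumption for illustration: I use the familiar Pythagorean formula as a stand-in for whatever B-R.com actually does to turn runs into wins, and the league scoring level and unearned-run share are made-up round numbers, not real league data.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Standard Pythagorean expectation, used here only as a stand-in."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical early-era league: 6 runs per game, 40% of them unearned.
league_runs_per_game = 6.0
unearned_share = 0.40

# A perfectly average pitcher allows runs at exactly the league rate.
avg_total_runs_allowed = league_runs_per_game
avg_earned_runs_allowed = league_runs_per_game * (1 - unearned_share)

# Apples to apples: total runs allowed vs. league total runs.
matched = pythagorean_win_pct(league_runs_per_game, avg_total_runs_allowed)

# Apples to oranges: earned runs allowed vs. league total runs.
mismatched = pythagorean_win_pct(league_runs_per_game, avg_earned_runs_allowed)

print(round(matched, 3), round(mismatched, 3))
```

Under these assumptions the average pitcher is a .500 pitcher on the matched comparison, but looks like a winner of well over 70% of his decisions on the mismatched one, which is the pattern the 19th century translations show.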
2. Batting Statistics Should Be Adjusted By Comparison to Batting Statistics
The corollary of the above is that the system massively underrates old-time hitters because it adjusts batting statistics by comparison to league runs scored per game, rather than, say, adjusting batting averages by comparison to league batting average. Thus, again, Cap Anson, with a career OPS+ of 141, adjusts to a career .289/.383/.337 hitter (yes, being old-fashioned I list slugging second). Dan Brouthers, career OPS+ of 170, adjusts to a more ordinary .306/.467/.378. Ross Barnes, who had a career OBP almost 100 points above the league average for his career, adjusts to a career .324 OBP.
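The alternative I have in mind, adjusting each batting rate against the league's own rate for that statistic, might look something like the sketch below. The function name, the league figures, and the neutral-context figures are all hypothetical placeholders, not real league averages and not anything B-R.com publishes.

```python
def translate_rate(player_rate, league_rate, neutral_rate):
    """Keep the player's ratio to his league, restated in a neutral context."""
    return player_rate / league_rate * neutral_rate


# Hypothetical player line and league/neutral averages (AVG, OBP, SLG).
player = {"avg": 0.330, "obp": 0.390, "slg": 0.440}
league = {"avg": 0.250, "obp": 0.300, "slg": 0.320}
neutral = {"avg": 0.265, "obp": 0.330, "slg": 0.410}

translated = {
    stat: translate_rate(player[stat], league[stat], neutral[stat])
    for stat in player
}

for stat, value in translated.items():
    print(stat, round(value, 3))
```

Because each component is compared to its own league average, a hitter who was 32% better than his league in batting average stays 32% better than the neutral batting average, no matter how many unearned runs were scoring around him.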
More generally, the translations, because they are keyed to runs/game rather than to the individual components of scoring, fail to take account of the changing shape of a player's offensive numbers if he played in a high-average low-power era as opposed to a higher-HR era with lower batting averages. The pitcher translations similarly fail to account for the changing components of offense over time. For example, my own crude pitcher translations helped show how Lefty Grove, Dazzy Vance, Dizzy Dean and even the aging Satchel Paige were prime-time strikeout pitchers, or how Cy Young's incredible control was a constant throughout many changes in the game; the current B-R system submerges these facts.
3. Failure to Adjust Pitcher Workloads and Decisions
This has been my big gripe with pitcher-translation systems in the past - the failure to establish a meaningful way to recognize the changes in starting pitcher workloads and rates of decision over time and thus credit the guys who carried the heaviest loads relative to the era they played in. Expanding the projected seasons to 162 games makes this worse, which is how Tony Mullane ends up winning 554 games. A workable computerized system could easily be developed along some of the lines I used in my prior analyses of pitcher translations and pitcher workloads, looking at the load carried by the top however many rotation starters in a league in a given year, and adjusting a pitcher's number of innings and decisions in step with that. If Sean Forman is interested in improving his system, I'd love to see some improvements along those lines.
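As a rough sketch of the workload idea: take the average innings of the top N starters in the pitcher's league and season, compare it to the same figure in the neutral context, and scale the pitcher's innings and decisions by that ratio. The function names, the choice of N, and every number below are assumptions made up for illustration; this is one simple way to do it, not a finished method.

```python
def top_starter_average(league_innings, top_n):
    """Average innings among the top_n heaviest-worked pitchers in a league."""
    top = sorted(league_innings, reverse=True)[:top_n]
    if not top:
        raise ValueError("need at least one pitcher and top_n >= 1")
    return sum(top) / len(top)


def scale_workload(innings, decisions, league_innings, neutral_top_avg, top_n):
    """Scale innings and decisions by neutral vs. actual top-starter workload."""
    factor = neutral_top_avg / top_starter_average(league_innings, top_n)
    return innings * factor, decisions * factor


# Hypothetical 1880s-style league where the top starters throw huge totals.
league_innings = [600, 550, 500, 450, 200, 100]

# Hypothetical neutral context: top starters average 210 innings.
adj_innings, adj_decisions = scale_workload(
    innings=560, decisions=60,
    league_innings=league_innings, neutral_top_avg=210, top_n=4,
)
print(round(adj_innings, 1), round(adj_decisions, 1))
```

In this made-up league the top four starters average 525 innings, so a 560-inning, 60-decision season scales down to roughly 224 innings and 24 decisions: still a heavy load for the neutral context, but a believable one.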