Rating the Pitchers

This columnar addendum was originally posted on the Boston Sports Guy website.
Translated Pitching Records
One common theme in this column is that comparisons of pitchers over time, in different eras and different parks and for different teams, are only possible, and certainly only sensible, if some effort is made to adjust the statistical record to reflect the massive changes in the ways that starting pitchers are used and the conditions under which they labor. For that purpose, I have developed a simple, if primitive, method for converting or “translating” pitching records from one context into another, or (more commonly) into a common context.
The bottom line: when I run “Translated Pitching Records,” this is what I am talking about – translation into the same context for workload, league ERA, team offense, and park. Read on if you want the gory details of how the method works. I’ll be glad to answer email inquiries from anyone who thinks I’ve left too much out of this description.

TRANSLATED ERA
The basic concept is simple arithmetic: a 2.00 ERA in a league with a 4.00 ERA is worth exactly the same – all other things being equal – as a 2.50 ERA in a league with a 5.00 ERA. If you were translating the 2.00 ERA into a 5.00 league ERA context, you divide by 4.00 and multiply by 5.00. If the pitcher also pitched in a park that raised scoring by 10%, you reduce the ERA by 5% (remember, half of all games are on the road). It’s that simple.
I reached pitchers’ Translated ERA, then, by the formula:
((ERA)*3.72)/((League ERA)*(Park Factor)).
I used 3.72 because that was the National League ERA in 1986, according to the STATS, Inc. All-Time Sourcebook. I used the 1986 NL as the baseline for three reasons: (1) I wanted modern workload and strikeout numbers so that the translated records would look familiar to modern readers; (2) I wanted an ERA around 3.75 to approximate the historical median between the great pitchers’ eras, when the league ERA was around 2.60, and the great hitters’ eras (like the one everyone but Pedro pitches in today) with league ERAs around 5.00 and higher; and (3) hey, I’m a Mets fan and it’s my method. If you want to spend six weeks in a room with a calculator, pen and the encyclopedias to change it to the 1967 AL, be my guest.
Here are the vital stats for the 1986 NL:
– 3.72 ERA
– 8.6 H/9IP
– 3.4 BB/9IP
– 6.0 K/9IP
– 254 Innings Factor
– 32 Decisions Factor.
This is an uncontroversial method – Total Baseball and baseball-reference.com have long used the same method for the “ERA+” stat. The only gripe I have with ERA+ is that it doesn’t look like a familiar stat. Thus, I use a translated ERA to (1) translate the stat into an intelligible, reader-friendly format and (2) use different park factors than Total Baseball uses. I rely on the park factor that represents the actual run-scoring environment for that pitcher’s team’s season, while I believe that Total Baseball uses a multi-year averaged factor that is intended to reflect the performance-altering aspects of the park itself.
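For anyone who would rather see the arithmetic as code, here is a minimal Python sketch of the Translated ERA formula. The function and parameter names are illustrative, not part of the original method. It assumes the park factor is a multiplier that already reflects the half-home, half-road split (1.05 for a park that raises scoring by 10%).

```python
BASELINE_ERA = 3.72  # 1986 NL league ERA, the baseline context used in this column


def translated_era(era, league_era, park_factor=1.0, baseline_era=BASELINE_ERA):
    """Scale an ERA from its own league/park context into the baseline context.

    Assumption: park_factor is a multiplier already halved for road games
    (e.g. 1.05 for a park that raises scoring by 10%).
    """
    if league_era <= 0 or park_factor <= 0:
        raise ValueError("league_era and park_factor must be positive")
    return (era * baseline_era) / (league_era * park_factor)


# A 2.00 ERA in a 4.00 league and a 2.50 ERA in a 5.00 league
# should translate to the same number in a neutral park.
same_a = translated_era(2.00, 4.00)
same_b = translated_era(2.50, 5.00)

# The same 2.00 ERA earned in a hitters' park translates lower.
hitters_park = translated_era(2.00, 4.00, park_factor=1.05)
```

Changing the baseline to another league-season only means passing a different `baseline_era`.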
WORKLOADS
The actual ERAs, of course, were rounded so that I would have an Earned Runs and Innings Pitched column for each season. The part of the method dealing with workloads is where I believe that I may actually be doing something innovative; the rest of the stuff is really just a cheap knockoff of ideas that date back at least to Pete Palmer and Bill James in the mid-1980s. It’s not overstating the case to say that the changes in the way starting pitchers have been used over time have been the single greatest change in the way baseball is played over its history. That’s my number-one objection to the view taken by virtually all analysts of the game – Bill James included – in comparing pitchers. Only by ignoring their relative workloads, in their times, for example, could one come to the conclusion that Lefty Grove stood out further from his contemporaries than Walter Johnson.
What I did, then, was to create a “Decisions Factor” and “Innings Factor” for each season. The logical way would be to come up with some measure of the average workload of a full-time starter, but I have scarce free time and limited computing skill, so instead what I did for IP is to average the innings totals of the number 3, 4, and 5 men in the league in IP and use that as a benchmark. I exclude the number 1 and 2 men from the IP and Decisions factors partially out of convenience but also because I don’t want the happenstance of one outlying factor – like Phil Niekro, Ed Walsh or Billy Martin – to skew the picture of average workloads. This yields a factor that illustrates the change, over time, in the workload of a near-the-top number one starter. The IP factors have varied widely over the years, running as high as 322 in the 1973 AL and into the 500s in 1880, 1883 and 1884 (the last year over 400 was 1894), but settling into the 250-270 range regularly between 1925 and 1963 and falling as low as 222 in the AL in 1999.
The “Decisions Factor” is separate because the relationship between a pitcher’s innings and his decisions has changed over time, as pitchers are increasingly likely to throw 5 or 6 innings in a no-decision in a given start; over time that means fewer decisions per inning pitched. For decisions, I used the average of the 3-4-5 men in W plus the average of the 3-4-5 men in L. Again, it’s just a factor to allow for a comparison of change over time. The Decisions Factor has been around 34-37 for most of the post-1920 period, but went as high as 41 in the early 1970s.
HITS, WALKS, K’S
I’ve mechanically adjusted H, BB, and K per inning by the league averages, which is sensible except that I haven’t adjusted for the theoretical limits on K per inning, with the result of people like Dazzy Vance and Lefty Grove striking out 400-500 people in a 250 inning season. Besides my practical limitations, I decided to live with that because it’s a reminder of (1) how way out of step those guys were with their era and (2) that this is a translation of what they were worth, not a projection of what they would have done. I also adjusted these 3 categories mechanically by park factors, which is stupid for several reasons but it’s too late to change now and besides we don’t have park factors for anything but Runs and HR before 1987.
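A rough Python sketch of the mechanical rate adjustment shows how those enormous strikeout totals arise. It assumes the adjustment is a straight ratio against the league rate, scaled to the 1986 NL rates listed earlier. It leaves out the park adjustment because the text does not spell out how that was applied. The names and the sample pitcher are invented for illustration.

```python
# 1986 NL baseline rates per 9 IP, as listed in this column.
BASELINE_RATES = {"H": 8.6, "BB": 3.4, "K": 6.0}


def translated_rate(stat, pitcher_rate, league_rate):
    """Scale a per-9-IP rate by the ratio of baseline rate to league rate.

    Assumption: a straight ratio, with no park adjustment and no cap.
    """
    if league_rate <= 0:
        raise ValueError("league_rate must be positive")
    return pitcher_rate * BASELINE_RATES[stat] / league_rate


def translated_total(stat, pitcher_rate, league_rate, translated_ip):
    """Convert the translated per-9 rate into a season total over translated IP."""
    return translated_rate(stat, pitcher_rate, league_rate) * translated_ip / 9


# Invented example: a pitcher at 7.5 K/9 in a league averaging 2.5 K/9.
k_per_9 = translated_rate("K", 7.5, 2.5)
k_total = translated_total("K", 7.5, 2.5, 250)
```

A pitcher striking out three times the league rate stays at three times the baseline rate, which is how a 250-inning translation can land near 500 strikeouts.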
WIN PERCENTAGE
The other area where I’ve departed from Bill James and from the Baseball Prospectus is in running translations of a pitcher’s actual W-L record adjusted by the distance of a team’s offense from the league average. Bill James went in one direction – comparing a pitcher’s record to the rest of his team’s, thus tethering a man’s evaluation to the other pitchers on his staff. The BP guys go in the opposite direction, throwing actual wins and losses out the window and calculating a shoulda-been W-L record from ERA (and, in more recent seasons, game-by-game runs allowed and IP, which is much more fair and accurate).
The reason why I adjust W/L records by a team’s offense rather than its overall W/L record is that it really begs the question to compare a pitcher to his team’s other pitchers – it should be obvious that Don Sutton’s ability to win games in his rookie season was affected by how good the Dodgers on the field were, not by how good Koufax and Drysdale were. That’s a prime example because the “rest of the team” had a great record, but by 1966 the Dodger offense, even when adjusted for park illusions, had sunk to a below-average outfit. The method is still imperfect because I can’t adjust for variances in bullpen support or defense (tough luck if you are Roger Clemens and it’s 1996), plus my mathematical adjustment would probably be slightly more accurate if I used some variant on Bill James’ Pythagorean method (squaring the offense factor) rather than a straight division. But here the method is constrained by my lack of computing sophistication.
I differ from the BP method because it ignores the reality that a pitcher allows runs under real game conditions, and will pitch differently depending upon what it takes to get the “W”. You can’t just throw the career W-L out the window, because some guys really do have a talent for winning games, however overrated that talent may be.
One of the virtues of the TR method is that it reveals the fact that many pitchers are really much more consistent over time than you think. Look at Cy Young’s career records and you see huge variations in IP, K/BB ratio, ERA, etc. But look at his TR and you see a guy who churned out essentially the same season year in and year out for two decades, with the only real variations coming when the majors contracted from 12 teams to 8 in 1900, producing an off year under more-competitive conditions, and then when it expanded to 16 teams in 1901, setting off a three-season spurt where Young dominated a still-weak new league. Most of the rest of the changes in his career were due to external factors – moving the mound back in 1893, the foul-strike rule in 1903, changes in teams and parks, etc.
I’ll do more historical reviews of how the TR analysis has affected my view of history’s greatest pitchers. For now I’ll just leave you with the formulas.
Decisions: ((W+L)*32)/(Seasonal Decision Factor)
Offensive Support factor: (Team Runs)/((Park Effect)*(League-Avg Team Runs))
Wins: (W*(Translated Decisions))/((Actual Decisions)*(Offensive Support factor))
ERA: ((ERA)*3.72)/((League ERA)*(Park Factor)).
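The decisions and wins formulas can be put together in a short Python sketch. It reads the Offensive Support factor and the Wins formula as straight divisions, with wins divided by the offense factor; that is an assumption about the intended grouping. The function names are illustrative, and taking translated losses as translated decisions minus translated wins is also an assumption, since the formulas above do not state it.

```python
BASELINE_DECISIONS = 32  # 1986 NL Decisions Factor


def translated_decisions(wins, losses, seasonal_decisions_factor):
    """Scale actual decisions to the 1986 NL workload context."""
    return (wins + losses) * BASELINE_DECISIONS / seasonal_decisions_factor


def offensive_support(team_runs, park_effect, league_avg_team_runs):
    """Team offense relative to a park-adjusted league-average team."""
    return team_runs / (park_effect * league_avg_team_runs)


def translated_record(wins, losses, seasonal_decisions_factor,
                      team_runs, park_effect, league_avg_team_runs):
    """Return (translated wins, translated losses) for one pitcher-season."""
    actual = wins + losses
    if actual == 0:
        return 0.0, 0.0
    decisions = translated_decisions(wins, losses, seasonal_decisions_factor)
    support = offensive_support(team_runs, park_effect, league_avg_team_runs)
    t_wins = wins * decisions / (actual * support)
    t_losses = decisions - t_wins  # assumption: losses are the remainder
    return t_wins, t_losses


# Invented example: a 20-10 pitcher, Decisions Factor of 36, backed by an
# above-average offense (700 runs vs. a 650 league average, neutral park).
t_w, t_l = translated_record(20, 10, 36, 700, 1.0, 650)
```

With a league-average offense and a Decisions Factor of 32, the record comes back unchanged, which makes a handy sanity check.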
