Covering the Front and Back Pages of the Newspaper
June 2, 2011
BASEBALL: A History of Team Defense (Part I of II)
Who are the best defensive teams of all time? Individual defensive statistics in baseball - as in other team sports - have been crudely kept and poorly understood for years, with the more sophisticated modern methods only being gathered for the past decade or two. As a result, even statistically-oriented baseball fans have tended to answer questions about defense as much by reputation and anecdote as anything. The lack of a statistical framework tends to make defense a bit invisible in our memories; even most knowledgeable fans have no more concrete sense of, say, Ty Cobb as a defensive player than they do of Turkey Stearnes as a hitter. My goal in this essay is to remedy that a little bit on the team level.
We do have one measurement of team defense that endures over time and thus can be used as a baseline for measuring team defense: Defensive Efficiency Rating (DER). I'd like to walk you through the history of the best and worst teams in each league, and the league average, in DER from the dawn of organized league ball in 1871 down to this season. As usual, I'll try to explain here what I'm measuring in terms that make sense to readers who may not be all that familiar with the 'sabermetric' literature, although I make no claim to be current myself on every study out there, and welcome comments pointing to additional studies.
What is DER?
DER is, put simply, the percentage of balls in play against a team that are turned into outs. The exact formulas used to compute DER can vary a bit, and while Baseball-Reference.com - which I used for this study - computes DERs all the way back to the start of organized baseball in 1871, its description of the formula is a bit vague:
Percentage of balls in play converted into outs. This is an estimate based on team defensive and pitching stats. We utilize two estimates of plays made. One using innings pitched, strikeouts, double plays and outfield assists. And the other with batters faced, strikeouts, hits allowed, walks allowed, hbp, and .71*errors committed (avg percent of errors that result in an ROE). Total plays available are plays made + hits allowed - home runs + error committed estimate.
All methods for computing DER look at the percentage of balls in play that become hits; it appears that Baseball-Reference.com's formula also counts the outs that result from double plays or outfield assists, both clear examples of outs created by good defense, as well as counting against the defense the one thing that fielding percentages always recorded - errors - but only where they put a man on base. From what I can tell, essentially the same formula is used over all of the site's historical DER data, so the data is generally consistent over time.
It's worth recalling that DER only measures outs vs. men reaching base - it doesn't deal with extra bases on doubles and triples, or stolen bases and caught stealing, or other baserunning issues. So, it's only one part of the picture just as on base percentage is just one part of the offensive picture. But like OBP, it's the single most important part.
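To make the arithmetic concrete, here is a minimal Python sketch of the second of the two "plays made" estimates in the description quoted above (the one built from batters faced). It is only a sketch: the site doesn't say how it combines its two estimates, so this should not be read as Baseball-Reference.com's actual computation. The function name, parameter names and sample totals are all hypothetical.

```python
def der_from_batters_faced(batters_faced, strikeouts, hits, walks, hbp,
                           home_runs, errors, roe_rate=0.71):
    """Estimate DER from team pitching and fielding totals.

    Follows the quoted description: plays made is batters faced minus
    strikeouts, hits, walks, HBP and the estimated share of errors that
    put a man on base; plays available adds back the non-homer hits and
    that same error estimate.
    """
    roe_estimate = roe_rate * errors  # errors assumed to result in a runner on base
    plays_made = batters_faced - strikeouts - hits - walks - hbp - roe_estimate
    plays_available = plays_made + hits - home_runs + roe_estimate
    if plays_available <= 0:
        raise ValueError("no balls in play to evaluate")
    return plays_made / plays_available


# Made-up season totals, purely for illustration.
example = der_from_batters_faced(batters_faced=6000, strikeouts=900, hits=1400,
                                 walks=500, hbp=40, home_runs=100, errors=120)
print(f"DER = {example:.3f}")
```

Note that the denominator works out to batters faced minus strikeouts, walks, hit batsmen and home runs, which is just another way of saying "balls in play."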
What Goes Into Team DER?
One of Bill James' maxims throughout the 1980s was that "much of what we perceive to be pitching is in fact defense." As most of my readers will recall, Voros McCracken broke major ground in the field of baseball analysis of pitching and defense in 2001 with a study showing that Major League pitchers, over time, had no effect - or at least, there was no difference among Major League pitchers in the effect they had - on whether balls in play become outs. Strikeouts, walks and home runs (the so-called "Three True Outcomes") are the pitcher vs. the hitter, mano a mano, but on average, BABIP (batting average on balls in play, the flip side of DER) shows no tendency to be consistent year to year among individual pitchers; other statistical indicators also strongly suggest that a pitcher's BABIP tends to be mostly a combination of team defense and luck. The simple way of expressing McCracken's insight is that it's the defense rather than the pitcher that determines how many balls in play become outs.
As with most groundbreaking insights, further research has added some caveats to McCracken's theory. The first one, which he observed from the beginning, was that knuckleballers tend as a group to have lower than average BABIP, and thus are something of an exception to the rule. I haven't absorbed all the further studies, but there are reasons to suspect that other classes of pitchers may have a modest advantage in the battle against BABIP, including elite relievers (Troy Percival, Armando Benitez, Mariano Rivera, Trevor Hoffman and Keith Foulke all seemed to have much lower career BABIP than their circumstances would suggest) and possibly pitchers who throw a huge number of breaking balls (we'll discuss Andy Messersmith a bit below).
Also, McCracken's research, and most of the following research, looked at the conditions of modern baseball (at the time, Retrosheet and Baseball Prospectus' database only went back to the mid-1950s). It's entirely possible that pitchers had greater influence on BABIP/DER in the era before 1920, or further back, when there were pitchers who had consistent success even in the era when most plate appearances resulted in a ball in play and thus the pitcher had little opportunity to set himself apart from his peers by success in the Three True Outcomes. As I explained in this 2001 essay, the playing conditions were greatly different in 19th century baseball in particular, and I'd be hesitant without data on that era to just assume that the pitcher's effect on balls in play was as minimal then as it is now.
Finally, of course, as with other statistical measures, there are park effects. We all know that different parks are more or less favorable for hitters, and of the components of that, park effects on home runs are significant, and parks can affect walks and strikeouts as well. (Less so for baserunning, in most cases). Balls in play are no exception, but I don't have data handy on how park effects specifically affect balls in play over time, besides the ability to notice some trends (for example, the Polo Grounds for many years was a great home run park but not a great hitters' park; I assume DER there tended to be high) and a few specific examples where I dug into the numbers we have. So bear in mind that the numbers set out below are not park-adjusted.
Key to the Charts
BIP%: Percentage of plate appearances resulting in a ball in play (i.e., plate appearances minus homers, walks and strikeouts, divided by plate appearances). Since I used league batting rather than pitching data for this, there may be a slight discrepancy for the period since the start of interleague play in 1997.
NL/AL etc.: Under the league name I have the league's DER for that season.
High/Low: The team with the league's highest and lowest DERs. I used Baseball-Reference.com's team abbreviations.
DER: That team's DER
High%/Low%: Team DER divided by the league average. This is the key number I use to identify the best and worst defensive teams, so we can see who were the best and worst defensive teams relative to the league average. As usual, I'm not using any math here more complicated than simple arithmetic and basic algebra.
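Also, where I compute "rough" estimates of BABIP for pre-1950 pitchers, I used the basic formula of (H-HR)/((IP*3)+H-HR-K): non-homer hits divided by an estimate of balls in play (outs recorded, plus non-homer hits, minus strikeouts).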
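For readers who want to reproduce the chart columns, the three calculations in this key can be sketched in a few lines of Python. This is only an illustration of the simple arithmetic described above; the function names are mine, and the helper that reads innings in box-score notation (where "678.1" means 678 and one-third innings) is an added assumption about how the inputs are written.

```python
def bip_pct(plate_appearances, home_runs, walks, strikeouts):
    """BIP%: share of plate appearances ending in a ball in play."""
    return (plate_appearances - home_runs - walks - strikeouts) / plate_appearances


def relative_der(team_der, league_der):
    """High%/Low%: team DER as a percentage of the league DER."""
    return 100.0 * team_der / league_der


def innings_to_outs(ip):
    """Convert box-score innings ('678.1' = 678 1/3 innings) to outs."""
    whole, _, frac = str(ip).partition(".")
    thirds = int(frac) if frac else 0
    if thirds not in (0, 1, 2):
        raise ValueError(f"not box-score innings notation: {ip!r}")
    return int(whole) * 3 + thirds


def rough_babip(hits, home_runs, ip, strikeouts):
    """Rough BABIP: (H-HR)/((IP*3)+H-HR-K)."""
    outs = innings_to_outs(ip)
    return (hits - home_runs) / (outs + hits - home_runs - strikeouts)


# A team DER of 726 against an assumed league DER of 714. The 714 is
# back-derived for illustration, not read off the charts.
print(round(relative_der(726, 714), 2))

# Made-up pitching line: 200 H, 10 HR, 250 IP, 100 K.
print(round(rough_babip(200, 10, "250.0", 100), 3))
```

Passing innings as a string avoids any floating-point surprises with the ".1" and ".2" notation.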
Talent levels in the 1870s were especially uneven, as the first organized league - the National Association - began play in 1871 just two years after the debut of the first-ever professional team. Schedules were short (20 games in 1871, in the 60s by decade's end), fielders didn't wear gloves, playing surfaces were ungroomed and in some cases effectively without fences, and with nine balls for a walk and longballs unheard of, nearly every plate appearance resulted in a ball in play - the 1872 season's 96.5% rate is the highest in the game's history, and 1879 was the last season above 90%.
As you can see, defenses improved dramatically over this period, in part no doubt as professional pitchers and fielders learned their craft and more of the nation's best ballplayers gathered into the National Association and later the NL. But errors were a big chunk of the poor defense of the era - in each of the NL’s first five seasons, there were more unearned runs than earned runs scored, and it wasn't until 1906 that the average number of unearned runs would drop below 1 per game.
The most successful defensive team of the era was the 1876 St. Louis "Brown Stockings" team (not precisely the same organization as the Cardinals), the only Major League team ever to be 10% better than its league in DER. Starting pitcher George "Grin" Bradley struck out 1.6 men per 9 innings but led the league with a 1.23 ERA (the team also allowed the league's fewest runs, although their 2.36 unearned runs per 9 innings was only third-best in the league) while throwing all but four of the team's innings. A rough estimate of the BABIP against Bradley is .258 in 1875, .224 in 1876, but .285 after he changed teams the next year, when his ERA nearly tripled, and .267 for his career. Which at least seems consistent with the notion that Bradley's defense was doing most of the work.
Note that the Philadelphia Athletics of 1873-74, featuring Cap Anson and Ezra Sutton in their infield, made the only repeat appearance on the decade's leaderboard (Anson, in his early 20s, played multiple positions including short and third, while Sutton was beginning a long career as a third baseman and shortstop).
The worst defensive team of all time? I hate to give you such an underwhelming answer, but by a wide margin it's the 1873 Baltimore Marylands, who folded after just 6 winless games and almost none of whose players appeared in the big leagues again. The hapless Marylands allowed 144 runs in 6 games (24 per game), only 48 of which were earned; in addition to hideous defense, their pitchers didn't strike out a single batter. (The offense was no better, as a team batting average of .156 with only one extra base hit and no walks attests.) When you think of the level of competition in those early years, think of the Marylands.
National Association-National League
The game gradually professionalized in the 1880s, but not without a great many bumps along the way. The Union Association of 1884 was only barely a major league (four teams, including Wilmington, folded after playing less than a quarter of the schedule), but diluted the talent level of the two major leagues. The 4-ball/three-strike count wasn't standardized until 1889, after a gradual decline in the number of balls for a walk and a one-year experiment in 1887 with four strikes for a strikeout; DERs rose sharply after the three-strike rule was restored. The schedule topped 100 games for the first time in 1884, and had reached 135 by 1888. The color line was established in the wake of the failure of Reconstruction (which effectively ended in 1877), after only a few black players had taken the field. The first gloves were becoming commonly used by decade's end.
Anson's 1882 White Stockings (now Cubs) and the 1882 Red Stockings (now Reds) became the first pennant-winning teams to lead the league in DER since the founding of the National League (in the NA, only the 1872 Boston team had done so); four teams would do so in each of the two leagues in ten years, plus the Union Association champs. Bid McPhee, enshrined in the Hall of Fame in 2000 largely for his defense, anchored the Red Stockings teams that led the league three times in their first six seasons in the league, and their 1882 and 1883 DERs were the most dominant of the decade outside the UA, but the mid-decade St. Louis Browns (now Cardinals) juggernaut also emerged as a defensive powerhouse. The woebegone 1883 Philadelphia Quakers were the decade's worst defensive team. The NL's most successful defensive squad? The 1884 Providence Grays, much to the benefit of Old Hoss Radbourn, who had his famous 59-12, 1.38 ERA season. Radbourn also struck out 441 batters in 678.1 innings, so he did his share as well, and by a rough calculation the opposing BABIP of .242 - while a career best - wasn't hugely out of line with his career .271 mark. Lucky and good is a good combination.
The NL achieved dominance after the Players League war. The modern era of pitching arrived in 1893 when the mound was moved back from 50 feet to its current 60 feet 6 inches; the percentage of balls in play spiked as strikeouts became almost non-existent, while DERs plunged in 1894 and 1895, suggesting more hard-hit balls off pitchers struggling to adjust to the new distance. The 1890 Pirates were the decade's worst defensive team, the 1895 Baltimore Orioles (with extra balls hidden in the long grass of the outfield among their notorious tricks) the best, although the late-decade Beaneaters (now Braves, featuring Hall of Famers Hugh Duffy and Billy Hamilton in the outfield, Jimmy Collins at third, and Kid Nichols as the staff ace) were consistently dominant and would remain so through 1901. (Collins left in 1901, Duffy the previous year, but Nichols, Hamilton and infield anchors Herman Long, Bobby Lowe and Fred Tenney were there the whole time; Long and Nichols had also been on the 1891 team). Four teams had the NL's best record while leading the league in DER, three of them Beaneaters teams.
The foul-strike rule, adopted in the NL in 1901 and the AL in 1903, brought back the strikeout and contributed, along with better gloves and more "small ball," to rising DERs, as the NL in 1907 became the first league ever to turn 70% of balls in play into outs, rising to 71.4% in 1908, a level that would not be matched again until 1942. Schedules also started to be standardized in 1904, settling around 154 games after a decade mostly in the high 120s.
Surprisingly, defense was not the essential element for many of the pennant winners of the Dead Ball Era's first decade - only one AL pennant winner (the 1903 Red Sox, featuring Jimmy Collins yet again) led the league in DER, and only two NL pennant winners. That being said, the Cubs of the Tinker-Evers-Chance era have as good an argument as anyone to be the dominant defensive team of all time. They led the NL in DER eight times in nine years, as well as finishing a close second (at 726, 101.68% of the league) in the ninth of those, and second again in 1912. In 1906, on the way to a 116-36 record, they became the first of five post-1900 teams to beat the league average by 5% or more, and their 736 DER bested the second-place Phillies by 29 points and would not be topped (in raw terms) for 62 years, by men using vastly superior equipment. It's possible there was a park factor at work, although Baseball-Reference.com lists West Side Park (where the Cubs played until moving in 1916 to the park now known as Wrigley Field) as if anything a hitters' park until late in the decade; in 1906, the Cubs combined to score and allow 7.24 runs per game at home, 7.03 on the road, with the defense in particular allowing 2.22 runs per game on the road compared to 2.78 at West Side Park. Was it the pitchers? By my rough estimate, the BABIPs against four of the five pitchers on that staff to throw 1000 or more innings as Cubs between 1903 and 1912 - Three Finger Brown, Carl Lundgren, Orval Overall, and Jack Pfiester - varied between .237 and .241 compared to a team average of .241 for all pitchers to throw at least 200 innings on the team over those years, with only one such pitcher above .254. Only Ed Reulbach, at .230, seems to have stood out a bit. That suggests that the team's defense was the predominant factor.
The same BABIP figure for the rival Giants, a good but more normal defensive team, was .259 - the 19-point advantage on balls in play for Brown over Christy Mathewson is almost certainly the main explanation for why Brown's ERA was better (1.75 to 1.90) over those years, although of course Brown was nonetheless a great pitcher.
Best AL defensive team? The 1901 Red Sox, another Jimmy Collins squad. Worst team of the decade? The unraveling 1902 Baltimore Orioles, who were deserted by John McGraw in mid-season and relocated to New York (now the Yankees) the following spring (like the prior year's Milwaukee franchise - there's a long history of teams getting folded or moved after cellar-dwelling DERs, as terrible defense is often a byproduct of organizational failure).
Also, note the atrocious showings by the late-decade Washington Senators, the team on which Walter Johnson broke in, yet another way in which Johnson's early career was plagued by bad teams. Johnson would bear some closer study - a quick look suggests that his BABIPs may have been better than his teams' for much of his career, as if he needed more advantages on top of leading the AL in K/BB ratio nine times, K/9 seven times, fewest BB/9 twice and fewest HR/9 three times (a favorite stat: Johnson in 1918-19 threw 616.1 innings and allowed just two home runs, both of them by Babe Ruth). His BABIP seems to have hit a career low of .219 in 1913 at the same time as his career high 6.39 K/BB ratio, another example of perhaps being both lucky and good, or perhaps there being a correlation between the two.
Defense had the upper hand in the teens, with DERs regularly topping 70% leaguewide in the second half of the decade, especially in the NL. If top defensive teams winning the pennant were a rarity in the prior decade, they became routine in the teens - five times in the NL, five in the AL. The Red Sox were the decade's dominant team in the AL both defensively and overall, and continued to lead the league even after the departure in 1916 of Tris Speaker. (Oddly, the Red Sox went from the best DER in the AL in 1912 to the worst in 1913 and back to the best in 1914; more on that below.) Meanwhile, the NL's revolving door of pennant winners (and World Series doormats) from 1915-19 were generally whoever handled the balls in play best. Yet most of those NL teams didn't beat the league average by all that much, and the best single-season showing was the 1919 Yankees. The worst, unsurprisingly, was the post-fire-sale 1915 A's (with a fossilized 40-year-old Nap Lajoie at second and their best remaining player, catcher Wally Schang, playing out of position at third), although the doormat 1911 Braves weren't far behind.
The Cubs' defense stopped being dominant with the 1913 departure of Joe Tinker, who went on to anchor the Federal League's best defense, while Johnny Evers was part of lifting those Braves out of their 1911-12 defensive funk to a slightly above average defensive team in 1914 (they'd been below average in 1913 - that said, I'd expected the 1914 Miracle Braves to be one of the teams that had a huge year defensively, and even with Evers and Rabbit Maranville, they didn't).
Lower strikeout rates with the lively ball's arrival were probably the largest factor in the sudden increase in scoring in the Twenties, as even the gradual arrival of home run hitters and a leaguewide rise in walks couldn't stop the upward march of the percentage of balls in play. But DERs dropped a good 15 points as well.
Defense was slightly more the hallmark of AL than NL pennant winners in the Twenties - six in the AL, four in the NL. Naturally the 1927 Yankees were the best in the league at this, too, their fifth league lead in nine years. And Walter Johnson finally got some real defensive support when the Senators won their two pennants in 1924-25, dropping Johnson's BABIP from .280 to .248 in 1924.
As discussed in the next decade, you have to figure a significant park effect was at work in the fact that the Phillies were dead last in the NL in DER 14 times in their last 17 full seasons in the Baker Bowl, including the NL's worst showing of the decade in 1926. Then again, nearly all of those Phillies teams were terrible teams, with a collective .383 winning percentage and only one winning record, in 1932 when their DER was 98.5% of the league average. And the Phillies had led the league in DER behind Grover Alexander in 1915.
1935 saw the arrival of night baseball, which would eventually be a factor in bringing back strikeout rates, as would the growth of relief pitching, still taking its first baby steps in the Thirties; between those factors and more home runs, the AL in 1937 became the first major league in which less than 80% of plate appearances resulted in a ball in play, after being above 83% in the AL and 84% in the NL for much of the Twenties. Six AL pennant winners had the league's best DER, compared to just two in the NL.
The 30s were the best and worst of times. The Phillies hit their nadir in 1930, at 631 the worst raw DER since 1900 (the 1911 Braves being the only other team since 1906 to finish below 650), the worst relative to the league since the ill-fated 1899 Cleveland Spiders and the only team lower than 95% of the league average since the 1915 A's. Not for nothing did they post a modern-record 6.71 team ERA, allow 7.69 runs per game, and lose nearly two-thirds of their games even with Lefty O'Doul batting .383/.453/.604 and scoring 122 runs and Chuck Klein (probably the most park-created of all Hall of Famers) batting .386/.436/.687 with 158 runs scored and 170 RBI. Then again, they also had the league's worst K/BB ratio and allowed the league's most homers, so it wasn't all the defense's fault. And the Phillies left the Baker Bowl for good at the end of June 1938, and still finished last in DER in 1938 and 1941 plus three more times in the mid-1940s.
In the AL, the late-30s St. Louis Browns, presumably despite Harlond Clift at third, were the league's worst, hitting bottom in 1939. Also in St. Louis, if you're curious, the 1934 "Gashouse Gang" Cardinals team was league-average.
On the positive end, we have the 1900s Cubs' top competition for the title of the best defensive team of all time, the 1939 Yankees, the team that Rob Neyer and Eddie Epstein (measuring by runs scored and allowed relative to the league) marked as the greatest team of all time in "Baseball Dynasties," noting that they led the league in runs scored and fewest runs allowed four years in a row. So it's not surprising to encounter them here. The Yankees' DER was the furthest above their league of any team since 1885, and their 730 DER led the league by 35 points. This was part of a string of six straight seasons and 12 in 13 years when they had the league's most successful defense, starting in Babe Ruth's last year two years before the arrival of Joe DiMaggio and running clear through World War II. While a number of players appeared on many of those teams (DiMaggio, Tommy Henrich, Frank Crosetti, Red Rolfe, Joe Gordon), the only constants were manager Joe McCarthy and catcher Bill Dickey. (Both had also been on the 1933 team that was last in the AL in DER before cutting back the Babe's playing time and putting Earle Combs and Joe Sewell, both 34, out to pasture). You have to give McCarthy some of the credit for the Yankees' consistent defensive excellence, if only in how he chose to distribute playing time.
That said, a significant park effect can't be discounted here. Yankee Stadium was always a pitcher's park, and seems to have been a particularly extreme one in 1939: unlike for the Cubs, we have detailed home/road splits for the 1939 Yankees, which show that Yankee hitters had a BABIP of .273 at home, .315 on the road, while Yankee opponents had a BABIP of .248 at home, .267 on the road - combined, .260 at home, .292 on the road. I haven't had time to run the splits for the Yankees' whole run in that period - this essay took up quite enough of my time, and it would be a worthwhile project for someone else to carry on further - but even on the basis of the huge split for 1939, as remarkable as the Yankees' defensive performance was in the McCarthy era, it has to be taken with the same grain of salt as the Baker Bowl era Phillies. (The 1930 Phillies' home/road BABIP splits were .352/.300 for their offense, .365/.341 for their pitching staff, and a combined line of .358/.321 - a 36-point spread.)
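For anyone who wants to carry that project further, the "combined" lines above are just a pooling of both teams' balls in play in the same park. A minimal Python sketch, assuming you have counts of non-homer hits and balls in play for each side (the function names and all the counts below are hypothetical, not the real 1939 or 1930 figures):

```python
def pooled_babip(hits_in_play_for, bip_for, hits_in_play_against, bip_against):
    """BABIP for both teams combined in one set of games (home or road)."""
    return (hits_in_play_for + hits_in_play_against) / (bip_for + bip_against)


def spread_in_points(home_babip, road_babip):
    """Road-minus-home gap in 'points' (thousandths); positive = pitcher's park."""
    return round((road_babip - home_babip) * 1000)


# Made-up counts: (non-homer hits, balls in play) for a team's hitters and opponents.
home = pooled_babip(540, 2000, 500, 2000)
road = pooled_babip(630, 2000, 530, 2000)
print(f"home {home:.3f}, road {road:.3f}, spread {spread_in_points(home, road)} points")
```

Pooling the raw counts, rather than averaging the two BABIP figures, keeps the result right when one side put more balls in play than the other.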
Speaking of managers, Walter Johnson may not have had great defenses as a pitcher, but as a manager he did better, skippering the Senators to two league-best DERs in four years from 1929-32. And the 1938 Braves became the first Casey Stengel-managed team to lead the league in DER, albeit a squad he inherited from Bill McKechnie with the decade's best DER in the NL in 1937.
In the 1940s, change was in the winds. The war decimated MLB's talent level and introduced inferior baseballs (due to wartime shortages) that traveled poorly when hit. DERs rose back above 70% even before the war in the NL, and in 1942 in the AL. After the war, integration followed and the game was off to the races, while night baseball really came into its own.
In the NL, defense was king - seven pennant winners led the league in DER in nine years between 1939-47, plus the 104-win second-place 1942 Dodgers; four pennant winners led the AL, but three of those were the 1941-43 Yankees. The strongest defensive teams of the decade were McKechnie's 1940 Reds and Lou Boudreau's 1948 Indians (a team famous for its outstanding infield of Boudreau, Ken Keltner, Joe Gordon and Eddie Robinson), the weakest the 1940 Pirates and 1942 Senators (the difference between the Senators of the mid-40s and the Indians of the 50s explains a lot about Early Wynn's career). The chicken-egg question remains regarding good defenses and successful managers, as Leo Durocher's arrival in Brooklyn in 1939 and Billy Southworth's in St. Louis in 1940 were followed within a few years by the construction of superior defensive teams.
The 1947 Reds were the third and last team to go from first to last in the league in DER in a single season, after the 1913 Red Sox and 1880 Buffalo Bisons:
The Bisons and their ace pitcher, Hall of Famer Pud Galvin, hail from baseball's ancient past, and the Red Sox were a bit of a fluke, given the small size of their decline and their rapid rebound the following year. What of the 1947 Reds? 1946 was the last season of McKechnie's career, and McKechnie was notoriously defense-obsessed. The team gave a lot more playing time to 30-year-old shortstop Eddie Miller, outfielder Frank Baumholtz and noodle-armed 35-year-old left fielder Augie Galan. Sidearmer Ewell Blackwell had his big breakthrough season in 1947, improving his K/BB from 1.27 to a league-leading 2.03, but saw his ERA slip slightly from 2.45 to 2.47, while veterans Johnny Vander Meer and Bucky Walters got completely wiped out by the defensive collapse.
Posted by Baseball Crank at 12:00 PM | Baseball 2011 | Baseball Studies