Covering the Front and Back Pages of the Newspaper
November 29, 2004
BASEBALL: Age and Established Win Shares
One of my major projects of late has been plugging the 2004 Win Shares data from the Hardball Times into a series of spreadsheets to (1) analyze the usefulness of my Established Win Shares Levels figures from earlier this year and (2) run similar EWSL numbers for 2005. EWSL is explained here; in a nutshell, it's an application to Win Shares of Established Performance Levels, which take a weighted measurement of a player's accomplishments in a given category over the prior three years. I ran an EWSL analysis of each team starting here, listing 23 players (13 non-pitchers and 10 pitchers).
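For readers who want to see the mechanics, here is a minimal sketch of a three-year weighted level. The 3-2-1 weighting and the function name `established_level` are assumptions made for illustration; this post does not state the weights, so treat the numbers as an example rather than the actual EWSL formula.

```python
def established_level(most_recent, one_year_ago, two_years_ago, weights=(3, 2, 1)):
    """Weighted average of three seasons of Win Shares.

    The default weights (3, 2, 1) are an assumed example, not taken
    from the post. The most recent season counts the most.
    """
    seasons = (most_recent, one_year_ago, two_years_ago)
    weighted_total = sum(w * s for w, s in zip(weights, seasons))
    return weighted_total / sum(weights)


# A steady veteran: three seasons of 12 Win Shares each.
veteran = established_level(12, 12, 12)

# A player with one 12-Win Share season and no earlier record.
newcomer = established_level(12, 0, 0)
```

Under these assumed weights the steady veteran comes out at 12.0 while the newcomer comes out at 6.0, which shows how a short track record can drag the figure below a player's most recent season.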
As I've said before, EWSL is just a compilation of the past, not a projection of the future, although past performance is always a useful thing to have in projecting a ballplayer's future. Anyway, one issue with EWSL, especially on a team level, is that it tends to overrate older players and underrate younger ones by relying on established track records.
That, we already knew. But by how much? I had used a number of adjustments to deal with this issue, and I'll return to those later, but first I wanted to take a look at how the unadjusted EWSL fared as a predictor. So I took each of the 678 players I had listed, compared their unadjusted EWSL entering 2004 to their 2004 Win Shares, and grouped the results by age. The Average EWSL and Average 2004 Win Shares columns are rounded off; the % column shows the total 2004 Win Shares for that age group (un-rounded) divided by the total EWSL (also un-rounded), with 1.00 meaning the group matched its EWSL, numbers above 1.00 showing an increase and numbers below 1.00 showing a decrease. I combined the 20- and 21-year-olds into one group and the over-40 players into another because those ages had so few players (20 was just Edwin Jackson, who never did get a shot in 2004).
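The grouping described above can be sketched roughly as follows. This is not the actual spreadsheet; the function names, the `(age, ewsl, win_shares)` tuple layout, and the choice to treat 40 itself as part of the oldest group are all assumptions. The sample rows are the ten 35-year-olds named later in this post, using the rounded figures quoted there, so the result covers only those ten players and not the full age group.

```python
from collections import defaultdict


def age_bucket(age):
    """Merge the small groups at the ends (boundaries are assumed)."""
    if age <= 21:
        return "20-21"
    if age >= 40:
        return "40+"
    return str(age)


def ratio_by_age(players):
    """players: iterable of (age, ewsl, win_shares) tuples.

    Returns, per age bucket, the player count, the rounded averages,
    and pct = total Win Shares / total EWSL from un-rounded totals.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for age, ewsl, win_shares in players:
        bucket = totals[age_bucket(age)]
        bucket[0] += ewsl
        bucket[1] += win_shares
        bucket[2] += 1

    table = {}
    for name, (ewsl_total, ws_total, count) in totals.items():
        table[name] = {
            "players": count,
            "avg_ewsl": round(ewsl_total / count),
            "avg_ws": round(ws_total / count),
            # Guard against a bucket whose EWSL total is zero.
            "pct": ws_total / ewsl_total if ewsl_total else None,
        }
    return table


# (age, EWSL entering 2004, 2004 Win Shares) for the ten named 35-year-olds.
named_35s = [
    (35, 18, 2), (35, 29, 9), (35, 10, 3), (35, 20, 10), (35, 18, 10),
    (35, 10, 6), (35, 6, 0), (35, 12, 1), (35, 27, 14), (35, 15, -6),
]
result = ratio_by_age(named_35s)
```

Dividing totals by totals, rather than averaging each player's individual ratio, keeps a player with a tiny EWSL from swinging the group figure.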
Although the overall aging pattern is hardly a surprise, I was struck by how vividly the pattern came out even over a relatively small sample size. (The breakdown of the number of players by age is interesting in its own right.) The 40+ crowd, of course, was dominated by Clemens and Randy Johnson, which is what throws that group off. Since Established Performance Levels acts as something of a multiplier of inexperience, it's not surprising to see the average player doubling or tripling his past track record at a very young age, when many in the group are rookies. That time-lag may also contribute to why the break point for decline starts at 29 rather than 28. I was also struck by the overall stability of the numbers: there was relatively little variance in the 2004 quality of production across age groups, although of course the mid-30s crowd did underperform the mid-20s crowd even though the mid-20s contingent included a much larger number of marginal players who won't last past 30.
The wipeout of the 35-year-olds was especially gruesome, and can be attributed partly to having a small sample and the highest starting point in the range. But there were more than just a few disasters in that group: Tim Salmon (down from 18 to 2), Bret Boone (29 to 9), Shigetoshi Hasegawa (10 to 3), John Olerud (20 to 10 - the Mariners had way too many of these guys), Mike Mussina (18 to 10), Paul Quantrill (10 to 6), Pat Hentgen (6 to 0), Fernando Vina (12 to 1), Sammy Sosa (27 to 14), and most egregiously of all, Hideo Nomo (15 to -6).
Anyway, there's more work still to be done, but clearly, to be useful as a predictive tool, EWSL needs to be adjusted for age in some fashion.