Baseball Crank
Covering the Front and Back Pages of the Newspaper
November 29, 2004
BASEBALL: Age and Established Win Shares

One of my major projects of late has been plugging the 2004 Win Shares data from the Hardball Times into a series of spreadsheets to (1) analyze the usefulness of my Established Win Shares Levels figures from earlier this year and (2) run similar EWSL numbers for 2005. EWSL is explained here; in a nutshell, it's an application to Win Shares of Established Performance Levels, which take a weighted measurement of a player's accomplishments in a given category over the prior three years. I ran an EWSL analysis of each team starting here, listing 23 players (13 non-pitchers and 10 pitchers).
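
For readers who want the mechanics, here is a minimal sketch of a three-year weighted level. The function name and the 1-2-3 weights are assumptions for illustration, not necessarily the weights behind the EWSL figures in this post, and the Win Shares totals are invented.

```python
def established_level(seasons, weights=(1, 2, 3)):
    """Weighted average of the last three seasons, oldest season first.

    The 1-2-3 weights are an assumed placeholder, not necessarily the
    weights used for the EWSL numbers discussed in the post.
    """
    if len(seasons) != len(weights):
        raise ValueError("need one season total per weight")
    return sum(w * s for w, s in zip(weights, seasons)) / sum(weights)


# Hypothetical Win Shares totals, oldest season first.
veteran = established_level([20, 18, 22])  # three full seasons
rookie = established_level([0, 0, 12])     # only one season on record

print(f"veteran: {veteran:.1f}, rookie: {rookie:.1f}")
```

Under these assumed weights, a player with a single season on record gets an established level of only half that season's total, which is one way to see why such a measure lags behind young players.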

As I've said before, EWSL is just a compilation of the past, not a projection of the future, although past performance is always a useful thing to have in projecting a ballplayer's future. Anyway, one issue with EWSL, especially on a team level, is that it tends to overrate older players and underrate younger ones by relying on established track records.

That, we already knew. But by how much? I had used a number of adjustments to deal with this issue, and I'll return to those later, but first I wanted to take a look at how the unadjusted EWSL fared as a predictor. So I took each of the 678 players I had listed, compared his unadjusted EWSL entering 2004 to his 2004 Win Shares, and grouped the results by age. The # column is the number of players at each age. The Average EWSL and Average 2004 Win Shares columns are rounded off; the % column shows the total 2004 Win Shares for that age group (un-rounded) divided by the total EWSL (also un-rounded), with 1.00 meaning the group matched its EWSL, numbers above 1.00 showing an increase and numbers below 1.00 showing a decrease. I combined the 20-21 players and the 40-and-over players into single groups because those ages had so few players (20 was just Edwin Jackson, who never did get a shot in 2004).

Age       #  Avg EWSL  Avg 2004 WS   +/-      %
20-21     6         3            7    +4   2.77
22       13         4           11    +7   3.15
23       11         6           10    +4   1.61
24       26         6           10    +4   1.76
25       39         6           10    +4   1.64
26       70         7            8    +1   1.22
27       60         8            9    +1   1.19
28       60         9           11    +2   1.26
29       49         9            8    -1   0.89
30       62        10           10     0   0.96
31       51         9            8    -1   0.89
32       53        10            9    -1   0.91
33       41        10            7    -3   0.71
34       34         8            7    -1   0.87
35       21        14            7    -7   0.52
36       26        10            9    -1   0.84
37       19        11            9    -2   0.81
38       17         9            8    -1   0.86
39       11        13           11    -2   0.84
40+       9        11           11     0   1.00
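
The % column is simple to reproduce: sum the un-rounded figures within each age group, then divide. A rough sketch follows. The player rows are invented for illustration rather than taken from the actual spreadsheet, and `ratio_by_age` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (age, EWSL, 2004 Win Shares) rows, invented for illustration.
players = [
    (24, 5.3, 11.0),
    (24, 7.1, 9.5),
    (35, 16.2, 4.0),
    (35, 12.8, 11.0),
]


def ratio_by_age(rows):
    """Return {age: total 2004 WS / total EWSL}, using un-rounded totals."""
    ewsl_total = defaultdict(float)
    ws_total = defaultdict(float)
    for age, ewsl, ws in rows:
        ewsl_total[age] += ewsl
        ws_total[age] += ws
    # Assumes every age group has a non-zero EWSL total.
    return {age: ws_total[age] / ewsl_total[age] for age in sorted(ewsl_total)}


ratios = ratio_by_age(players)
for age, ratio in ratios.items():
    print(f"{age}: {ratio:.2f}")
```

Dividing totals rather than averaging each player's individual ratio keeps a rookie with an EWSL near zero from dominating his age group.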

Although the overall aging pattern is hardly a surprise, I was struck by how vividly the pattern came out even over a relatively small sample size. (The breakdown of the number of players by age is interesting in its own right.) The 40+ crowd, of course, was dominated by Clemens and Randy Johnson, which is why that group bucks the trend. Since Established Performance Levels acts as something of a multiplier of inexperience, it's not surprising to see the average player doubling or tripling his past track record at a very young age, when many in the group are rookies. That time-lag may also contribute to why the break point for decline shows up at 29 rather than 28. I was also struck by the overall stability of the numbers: there was relatively little variance in the 2004 quality of production across age groups, although of course the mid-30s crowd did underperform the mid-20s crowd even though the mid-20s contingent included a much larger number of marginal players who won't last past 30.

The wipeout of the 35-year-olds was especially gruesome, and can be attributed partly to having a small sample and the highest starting point in the range. But there were more than just a few disasters in that group: Tim Salmon (down from 18 to 2), Bret Boone (29 to 9), Shigetoshi Hasegawa (10 to 3), John Olerud (20 to 10 - the Mariners had way too many of these guys), Mike Mussina (18 to 10), Paul Quantrill (10 to 6), Pat Hentgen (6 to 0), Fernando Vina (12 to 1), Sammy Sosa (27 to 14), and most egregiously of all, Hideo Nomo (15 to -6).

Anyway, there's more work still to be done, but clearly EWSL needs to be adjusted for age in some fashion to be useful as a predictive tool.

Posted by Baseball Crank at 7:40 AM | Baseball 2004 | Comments (4) | TrackBack (0)
Comments

The age 29 thing is also what "Marcel the Monkey, Forecasting System" does.

Note, we are *not* saying that you peak at age 29.

Don't forget what this 3-yr "established level" is doing.

Take the performance at age 26,27,28. You take the average (weighted preferably), and try to figure out his performance at age 29.

Assume that the peak is at age 27.5.

Trying to predict Age 29 (1.5 years past prime) from Age 26 (1.5 years before prime), and you are essentially at the same level. That is, if you are a .350 OBA at age 26, you will be .350 at age 29 (more or less, and if the peak is actually 27.5).

Trying to predict Age 29 from 27, and we see that at age 27, you are 0.5 years from peak, compared to the 1.5 years from peak for Age 29. So, you need a small adjustment downward.

Trying to predict Age 29 from 28, and again, you need the same small adjustment downward.

Now, if you work it out, and weight your ages at a 5/4/3 level, the peak age becomes 28.08.

Note: I use age as truncated age of Dec 31 of the year in question (or simply, year in question minus birth year). That makes my players 0.5 years older than other systems, and would make the true peak age, according to the above method, as 27.58.

That is, a peak age of 27.5 is entirely consistent with Win Shares showing no change at age 29 from the established level of 26 to 28.
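
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes a symmetric aging curve, so that equal performance at two ages puts the peak midway between them, and the variable names are illustrative.

```python
# Weights of 5/4/3 on ages 28, 27 and 26 (most recent season weighted most).
ages = [28, 27, 26]
weights = [5, 4, 3]

# Weighted average age of the three seasons behind the established level.
baseline_age = sum(a * w for a, w in zip(ages, weights)) / sum(weights)

# Assumption: the aging curve is symmetric, so if performance at the target
# age matches the baseline, the peak sits midway between the two ages.
target_age = 29
implied_peak = (baseline_age + target_age) / 2

# Shift by half a year for the Dec 31 age convention described above.
adjusted_peak = implied_peak - 0.5

print(round(baseline_age, 2), round(implied_peak, 2), round(adjusted_peak, 2))
```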

Posted by: tangotiger at November 29, 2004 10:31 AM

Tango, I think we are saying largely the same thing here (except you're saying it with more precise math ;) - the three-year weighted averaging creates a lag that makes the peak appear to be a little later than it is.

Posted by: The Crank at November 29, 2004 10:43 AM

Your last statement about the lag is the one statement I should have said. Agreed.

Posted by: tangotiger at November 29, 2004 4:20 PM

Great stuff, Crank. Thanks.

Posted by: studes at November 29, 2004 4:46 PM