Thursday, November 21, 2013

PTOLEMAIC POST-MORTEM: TRACKING HISTORICAL MVP VOTING PATTERNS

There's absolutely no truth to the rumor that Andrew McCutchen's folks circulated this picture of Paul Goldschmidt to NL MVP voters just before the end of the season...
We noted that the Ptolemaic MVP method--which in its somewhat truncated 2013 implementation had its AL leader finish #1 in the voting (Miguel Cabrera) and its NL leader finish #2 (Paul Goldschmidt)--is in need of additional nuance and context. This seems like as good a time as any to ruminate a bit on just what some of that might be.

We'd already alluded to the idea of positional adjustments, which occur in the various manifestations of the Wins Above Replacement (WAR) method. Among the first goals of WAR--in its original incarnation back in the 1980s, before it became putty in the hands of increasingly feverish tool-and-die, widget-obsessed atom-splitters--was to more equitably measure player value based on the impact of playing more difficult defensive positions. (In other words, players who play up the middle have tougher defensive assignments and have generally hit less well than those who play on the corners.)

So a "Ptolemaic" method could simply gather the WAR data for its snapshots and add those up (or average them) over time. But since the defensive component of WAR as implemented from play-by-play data remains (how to say this politely...) "problematic," we'd hate to buy that pig right now, even if poked. (Otherwise, we 'd be selling you the notion that Carlos Gomez was the MVP in the 2013 National League, just as the leaderboard at Forman et fils suggests).

So, at best, a total WAR value would only be one component in a revamped Ptolemaic MVP. Players at the top of the defensive rankings would have the 4-3-2-1 point scale applied just the way it's done for OPS, OBP, SLG, etc. We wouldn't weight it in a way that could propel even a top-flight up-the-middle fielder into first place when his hitting isn't in the top ten (Gomez was 13th in OPS, 20th in OPS+ in the NL this past season).
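For the spreadsheet-averse, here's a minimal sketch of how one such snapshot might be scored, assuming player data kept in simple dictionaries. The category list, the data layout, and the function name are our own illustrative choices, not the method's actual code--the only thing taken from the method itself is the 4-3-2-1 point scale, with WAR capped at the same weight as any single offensive category:

    from collections import Counter

    # WAR enters as just one category among the offensive ones,
    # so a top glove can't outvote the top bats.
    POINTS = (4, 3, 2, 1)
    CATEGORIES = ("OPS", "OBP", "SLG", "WAR")

    def ptolemaic_snapshot(players):
        """players: list of dicts like {"name": "...", "OPS": .950, "WAR": 6.0}."""
        totals = Counter()
        for cat in CATEGORIES:
            # Rank the league in this category, best first...
            ranked = sorted(players, key=lambda p: p[cat], reverse=True)
            # ...and hand out 4-3-2-1 points to the top four.
            for pts, player in zip(POINTS, ranked):
                totals[player["name"]] += pts
        return totals

    # Accumulated across the season, one snapshot at a time:
    # season = Counter()
    # for snap in weekly_snapshots:
    #     season.update(ptolemaic_snapshot(snap))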

Clearly there are many slippery slopes available when we consider this topic. Given that incontrovertible reality, the first step in creating a truly credible "MVP to date" projection is to have a thorough road map of historical MVP voting patterns. To do that, we created four versions of the table you see (below, at right).

This one has the MVP results for the American League since 1969. (As you'll remember, 1969 was the beginning of divisional play, which began the escalating alteration of the post-season, both in perception and in fact.) We picked the AL because it has more potential "aberrations" in its MVP selections--as Jon Bernstein pointed out way back in the "glory daze" of rec.sport.baseball, the AL MVP voters have been prone to select pitchers for the MVP, while the NL voters have kept pitcher honors strictly limited to the Cy Young Award.

Each line has the MVP, team played for, the team's winning percentage (WPCT), whether it made the post-season (Y/N), position in its division, the MVP ranking by WAR, and the MVP ranking by OPS.

When we get to the third column from the right, the display turns into data for what we will call the "WAR MVP"--if and when the top-rated player in WAR for that year is different from the MVP winner. Right from the top of the chart we can see that this happens a good bit. (We color-coded the WAR and OPS leaders in light blue so that they will stand out.)

Other pertinent color-codings: green indicates a player who won the MVP or was the "WAR MVP" for a wild card team; orange indicates an MVP or "WAR MVP" whose team finished under .550 and did not make the post-season; yellow highlights players who were MVPs or "WAR MVPs" on teams that finished below .500.
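For anyone who'd like to rebuild the table from scratch, each line reduces to a record along these lines. The field names (and the choice of a Python dataclass) are ours, purely for illustration:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class MVPRow:
        year: int
        mvp: str
        team: str
        wpct: float                  # team winning percentage
        postseason: bool             # the Y/N column
        division_finish: int         # position in its division
        war_rank: int                # MVP's ranking by WAR
        ops_rank: int                # MVP's ranking by OPS
        war_mvp: Optional[str] = None            # filled only when the WAR leader differs
        war_mvp_team_wpct: Optional[float] = None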

Just to be clear, this is one of four such tables (AL 1969-2013; NL 1969-2013; AL 1931-68; NL 1931-68). We'll get around to publishing the other three at a later date. These form the basis for some generalized findings about MVP voting behavior, focused on three key elements: where the team of the MVP (or "WAR MVP") finished in the standings; the winner's ranking by WAR; and his ranking by OPS. Those findings are summarized in two tables that can be found below--the first for 1969 to the present, and the second for 1931 to 1968.

Before we summarize the results from those tables, we should note that the average WPCT of a team with the MVP is .592. That's been in decline ever since the first year that the BBWAA voted for the MVP, when they selected the A's Lefty Grove in the AL (team WPCT of .704) and the Cardinals' Frank Frisch in the NL (team WPCT of .656). In the master table above, the AL average for the MVP is .579.

That value for the "WAR MVP," however, is even lower, down at .548. That's because a pure numerical method won't take into account where teams finish in the standings. And, as the table above shows, there are far more "WAR MVPs" on sub-.500 teams (a total of nine) than is the case for actual MVP winners (only two).
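Given rows shaped like the MVPRow sketch above, those averages and sub-.500 counts fall out of a few lines (al_rows here is a hypothetical list of such records for the AL table):

    def war_mvp_wpct(row):
        # The "WAR MVP" columns are filled only when the WAR leader differs
        # from the actual MVP; otherwise fall back to the MVP's team WPCT.
        return row.war_mvp_team_wpct if row.war_mvp_team_wpct is not None else row.wpct

    def summarize(wpcts):
        """Average WPCT, plus a count of sub-.500 teams."""
        return sum(wpcts) / len(wpcts), sum(w < .500 for w in wpcts)

    # Against the AL table above:
    # summarize([r.wpct for r in al_rows])           -> roughly (.579, 2)
    # summarize([war_mvp_wpct(r) for r in al_rows])  -> roughly (.548, 9)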

Our two historical summary charts (rendered, perhaps metaphorically, in shades of blue...) show one reason why a Ptolemaic method needs to know about team performance. Only 3% of MVPs come from teams with sub-.500 records. (And note that this rate is consistent on both the 1969-2013 table and the 1931-68 table below.)

That helps explain why Andrew McCutchen didn't win in 2012 (the Pirates faded below .500) and why he did win this year (the Pirates made the post-season). Paul Goldschmidt (aka Mr. Sombrero...) had better overall numbers than McCutchen--including in the Ptolemaic data--but his team (the Diamondbacks) finished 81-81.

"WAR MVPs" come from sub-.500 teams about five times as often as in "real life" (14%). That figure is on the rise during divisional play (20% since 1969). BBWAA voters have clearly taken team finish into account at a seriously elevated level from the get-go: the percentage of MVPs from non-post season teams is just over 27% since 1931, while "WAR MVPs" have come from twice as many non-post-season teams (55%).

This pattern seems to be hardening in recent years. Since Cal Ripken was AL MVP in 1991 playing for an Orioles squad that won only 67 games (.414 WPCT), 90% of all MVPs have come from teams making the post-season. (Some of that might come from the addition of the wild card team, but it's hard to get an exact handle on just how much of an effect is present.) That figure stands out from the other "quartile" measures that you'll find in the two tables (marked "MVP From PS team 19xx--xx"), which show a remarkable consistency in that percentage (71% for 1931-49; 66% for 1950-68; 67% for 1969-91) until the last twenty years.

One interesting comparison across the two eras is the percentage of real-life MVPs who finished lower than fifth in WAR. That figure is noticeably higher in the 1969-2013 period (27%) than it was during 1931-68 (17%).

By now, of course, we are somewhat removed from the numbers that need to be contextualized for the Ptolemaic MVP, but it's probably worth it in order that we might get a sense of how WAR would shape the MVP awards if it "ran the zoo." One thing that's worth noting is that it restores the embattled Alex Rodriguez to a level of prominence that has been swept away by the ongoing subterfuge of controversy. A-Rod can be seen as having won three real-life MVPs in which WAR agreed with the BBWAA, along with three more "WAR MVPs" (in 1998, 2000, and 2002) that were given to others (with only the 2000 MVP, which went to Jason Giambi, being a reasonable alternative). While he's no Barry Bonds or Willie Mays in terms of "WAR MVPs" (Bonds would have eleven, Mays ten), A-Rod's total of six such "WAR MVPs" ought to put him back in our eyes as one of the game's greatest players.

So, to sum up, the Ptolemaic MVP method should add WAR as a component, despite its myriad problems; it should take into account the BBWAA voters' rising tendency to give MVPs to players on post-season teams; and it should consider other possible ways to implement positional adjustments.