Thursday, February 11, 2016


So--is it Zager...or Evans? Does anyone
really know what time it is??
Yes, a long, long hiatus...we have been inundated with other projects (including the shocking recomposition of muscle tone in our upper arms; the better to throttle you with, of course...)--and, besides, it takes a long time to "recover" from the specter of the Royals being World Champions. They are due again in the year 2045 (no, not 2525, though one wonders if the thirty-year interval might bring them into alignment with the dizzy duo of Zager and Evans...).

As we stretch our sea legs in anticipation of another baseball campaign, let's spend some time with Hall of Fame voting. Predictions of a long, intractable logjam by baseball numberologists have not come to pass, and while the Baseball Writers Association of America (BBWAA) remains an organization stumbling toward transparency, they might not be as catastrophic a voting body as they are usually made out to be.

There are two major rants about the BBWAA record. The first, that they are deficient in selecting a sufficient number of players through their "front door" to Cooperstown, is a fact as undeniable as it is ugly. We are not here to meditate (or, heavens forfend, mediate) on this claim--while it is somewhat overblown in the eyes of the die-hard complainers, there is more than a kernel of truth in these charges. (We have the Hall of Merit as a corrective for the excesses and exclusions in the Hall of Fame voting process, but that's really more of a glorified kluge than an adaptation of the stringent induction requirements the BBWAA somehow must utilize.)

Two fellas who've been helping to "skew and screw" the BBWAA HOF
voting results for the past few years...
The BBWAA's problem, however, is due mostly to the 75% voting threshold. They've elected 122 players, and that in itself is something of a miracle given the level of consensus needed. If the voting threshold were 50%, the BBWAA would have inducted 14 additional players, or about 12% more; taken all the way down to a "large plurality" (40%), only another 11 players would come aboard, for a total increase of just 20% in "front door" inductions. This suggests a series of factors involved in the decision-making that do not conform to "bell-curve" distributions--a fact that, in and of itself, will drive hard-core numberologists stir-crazy.
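For those who want to check our arithmetic, here's a quick back-of-the-envelope sketch in Python. The counts are simply the ones quoted above, plugged in as givens:

```python
# Counts quoted in the paragraph above--treat these as the post's givens.
inducted_at_75 = 122   # players elected under the actual 75% threshold
extra_at_50 = 14       # additional players whose peak vote fell in 50-75%
extra_at_40 = 11       # additional players whose peak vote fell in 40-50%

pct_increase_50 = extra_at_50 / inducted_at_75 * 100
pct_increase_40 = (extra_at_50 + extra_at_40) / inducted_at_75 * 100

print(f"50% threshold: +{pct_increase_50:.1f}%")  # about 11.5%
print(f"40% threshold: +{pct_increase_40:.1f}%")  # about 20.5%
```

Note that the 20% figure is cumulative: the 40% threshold adds its 11 players on top of the 14 already gained at 50%.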

And that leads into the second rant, which is the current "meme" or "trope" or "semi-intellectual fetish" (you decide which term gets closest to what's actually happening...we prefer the term "hegemonic delusion"--but you already knew that). The ranting is about the BBWAA's purported need to let HOF candidates "ripen on the vine" before inducting them.

Why on earth--the growl quickly morphs into a screech--can't the mo-fos just put the deserving ones in the first time around? What's with all this $#$%-ing pussyfoot tut-tut-tut posturing horse manure, anyway? (Will someone please invent a Xanax spray that can silently emanate from these guys' computer screens??)

It's time to follow the bouncing ball, dudes. The data in the table below gives the lie to this perception of BBWAA antics. As you'll see, the BBWAA may have trouble identifying all of the deserving players (rant #1, remember)...but they aren't really doing that bad a job inducting the ones they actually do manage to stumble across:

So what you see here is as follows: over the nine decades that the BBWAA chimps have been circling the selection process, they have inducted 43% of the players in the very first year of their eligibility. (Yes, all these numbers are percentages.) But: as we look at the decade-by-decade data (the orange column), we can see that their big problems occurred early in the process; since the 1960s, they have been comfortably above that ever-increasing overall "first-year induction" average.

From here we can read across and see how this plays out for Year 2, Year 3, etc., all the way to Year 15. As you may remember, the Hall of Fame lopped off Years 11-15 from the selection process recently, mostly in order to cast out those two nasty dudes whose mugs are displayed further up in the post and put them into Veterans Committee limbo. What this does from an historical standpoint is to threaten about 10% of potential inductees who've needed those last five years in order to make it in through the front door.

But as you can see, there is no evidence that the BBWAA's recent voting displays a pattern of "dragging their feet" with respect to the induction process. They were a bit sluggish in Years 2-4 during the decade of the 2000s, and that might be in part due to the "roid rage" that dominated media coverage during those years. But the effect, if there was one at all, was at best a mild one, and it has emphatically reversed itself so far in the current decade.

So no comfort food for those who want to make the BBWAA into carrion. They are far from perfect--and who is, for that matter--but they look to be doing a consistent and mostly acceptable job of identifying deserving inductees and voting them in without undue delay. (But...only that Xanax spray is likely to put a stop to the whining.)

Monday, November 30, 2015


You may dimly remember a series of posts about an obscure stat that lives in the bowels of the data at Forman et fils--a statistical breakout for relief pitchers that they call "Non-save situations."

It appears at first (and quite possibly even second and third) glance to be a catch-all, garbage-like stat, capturing all of the performances by relievers when the game is either well under control (ahead by four or more runs), or where the team is trailing--and also when the score is tied.

It also captures early inning usages of relievers (prior to the sixth inning) regardless of the game situation, but these are the rarest of the events that cluster into this odd "catch-all" area.

Oddly enough, however, these situations produce a .570 WPCT for the pitchers who get decisions in this "afterthought area." (That WPCT, by the way, covers one hundred and two seasons' worth of data: the won-loss totals for "non-save situation" decisions are 39671 wins and 29966 losses.)
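The .570 figure checks out against those totals--WPCT is nothing more than wins divided by decisions:

```python
def wpct(wins: int, losses: int) -> float:
    """Winning percentage: wins divided by total decisions."""
    return wins / (wins + losses)

# Aggregate non-save-situation totals quoted above (102 seasons of data):
print(round(wpct(39671, 29966), 3))  # 0.57
```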

When studying World Series teams, these games raise a rather interesting question with respect to assessing the meaning of this "outcome anomaly." Will it prove to be a random function--meaning that all teams, regardless of their overall won-loss record, win around 57% of the decisions that occur in these situations--or is it a function defined and controlled by team quality, where better teams have better WPCTs in their "non-save situation" decisions?

Now, if you read those blog posts, you'll already know the answer. It turns out that the function is indeed defined and controlled by team quality. Teams that have won the World Series have an aggregate WPCT of .656 in non-save situation decisions; teams that lost the World Series have an aggregate WPCT of .630 for this breakout.

So it's very likely that the thing that World Series winning teams have done best over the course of baseball history is to generate a significantly higher-than-average WPCT in games where the pitcher getting the decision is working in a non-save situation. Who woulda thunk?

Actually, with the number of decisions occurring in the non-save situation on the increase (due to the rise of reliever innings), it's becoming part of the strategic landscape--and a team with otherwise ordinary performance elsewhere can offset that with a top-flight performance in this obscure area. That was most definitely the case for the Royals in 2015 (24-11, .686 WPCT, 2.92 ERA) and the Giants in the previous season (30-7, .811 WPCT, 2.80 ERA).

Thus a "garbage" stat, one apparently deserving the briefest of afterthoughts, is evolving into another key tool in winning games. And, as we noted in the earliest posts about it, it restores meaning in the won-loss stats...consider it another piece of moral relativism stuffed down the throats of those who probably aren't paying attention.

Saturday, November 14, 2015


There are literally hundreds of ways to try to answer the question in the title above...the first thing that needs to be done in order to narrow the focus is to decide what our point of comparison is. Are we comparing World Series champs to all other teams? Are we comparing them to all other playoff teams?

Or are we going to look at them only in terms of their opponents in the World Series?

Prior to 1969, of course, that was the only point of comparison we had. So to keep any potential data set operating on at least a semi-consistent basis, what we propose to look at here (in a series of posts to appear irregularly during the off-season) is what separates World Series winners and losers. So we are performing only a binary comparison here.

Even with that, we still have many ways to skin the cat. What type of performance are we talking about? Is it what happens in the World Series itself? No, that would be too small a sample size. We'd be better off looking at the in-season data for the two teams and seeing if any strong patterns emerge from it.

So--in-season data. What type of data? Pitching? Do we want to look at bullpen performance? Particular layers of that performance? What about hitting? What would be significant enough as a rough guide to capture differences? And will any of them prove to be more than a random pattern?

Well, no way to know without just diving in somewhere and hoping that it's not the shallow end of the pool. Reaching in semi-blindly, we're choosing to begin by looking at the teams' hitting with two outs. That's a large enough data sample in each season to be meaningful: we're not down to something that's only a tenth of the total plate appearances.

It turns out that this particular split data goes back to 1957, with a few other seasons prior to that available as Retrosheet fills in more play-by-play data further in the past. Interestingly enough, when we use the sOPS+ value from Forman et fils as a way of gauging how much better than league average these teams hit with two outs, we find two interesting facts. First, World Series winners are, on average, 9% better than their overall league average in hitting with two outs. Second, the 1957 Milwaukee Braves, one of the very first teams for whom we have this data, have the highest sOPS+ value of any World Series winner, at 137.
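For readers unfamiliar with the stat: sOPS+ compares a team's OPS components in a split to the league's performance in that same split. Here's a sketch of the usual relative-OPS form--we're assuming the standard 100 * (OBP/lgOBP + SLG/lgSLG - 1) construction, which may differ in fine detail from what Forman et fils actually computes, and the input numbers below are purely hypothetical:

```python
def sops_plus(obp: float, slg: float, lg_obp: float, lg_slg: float) -> float:
    """Relative-OPS index for a split: 100 = league average in that split.
    Assumed formula; no park adjustment applied."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# Hypothetical two-out split vs. a league hitting .310 OBP / .390 SLG there:
print(round(sops_plus(0.340, 0.440, 0.310, 0.390)))  # 122
```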

The 2015 Royals, the most recent World Series winner, rank twelfth on this list, with a 119 sOPS+.

Oddly enough, the 1985 Royals--the previous KC team to win a World Series--rank dead last (62nd) in this stat, with an 85 sOPS+.

The other interesting thing here is that we are seeing a lot of recent World Series winners on either extreme of this list. Particularly unusual is the fact that the San Francisco Giants, in all three incarnations of their recent even-year dominance of the World Series, were very poor performers when hitting with two outs.

So now what we want to know is: how does this stack up against the teams they beat in the World Series? All of the above wouldn't mean jack if the losing team in the Series had a higher sOPS+ in plate appearances with two outs. And it turns out that it is lower--not a lot lower (104), but lower.

But there is another nuance we should explore here--namely, when two teams face off in the World Series, does the fact that one of these teams performs better with two outs have any predictive value with respect to who becomes the eventual World Champion? Or is this simply another random variable?

The answer: there is some possibility that it is, in fact, an indicator--particularly in recent times. Measuring the data from the first year where we have both winners and losers available (1957), we see that the eventual World Series winner has had a higher sOPS+ when hitting with two outs in 32 of 54 Fall Classics, or 59% of the time.

But this was a 50/50 proposition from 1960 through 1982; since then, the odds are closer to 2 to 1 in favor of the World Series winner having had a better 2-out hitting performance during the regular season than the World Series loser.

Interestingly, as the teams that make the World Series become more subject to the random forces that have taken hold due to the expanded post-season, the more robust this trend has seemed to become. In the past 20 World Series dating back to 1996, the team with better 2-out hitting has won 14 times (70%). And as the chart at right shows, the five-year smoothed ten-year average for this data shows an even higher correlation than that over the past ten years.
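Is 14 wins in 20 tries distinguishable from coin-flipping? A quick binomial tail check--our own aside here, not anything taken from the chart--says it's suggestive but hardly airtight:

```python
from math import comb

def binom_tail(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the upper-tail probability."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance of 14 or more wins in 20 World Series if 2-out hitting meant nothing:
print(round(binom_tail(20, 14), 3))  # about 0.058
```

A tail probability of roughly 6% sits just shy of the conventional significance cutoff--consistent with our "keep an eye on it" verdict below.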

Small sample size? Of course. And none of this takes into account all of the intermediate post-season matchups that occur along the way to the Fall Classic. But it is interesting to note that this trend has strengthened even as teams in the World Series are declining in average WPCT due to the randomizing effects of the expanded post-season.

This is one we will have to keep an eye on moving forward...

Sunday, October 25, 2015


Let's sum up what we know about World Series geography in a succession of visual displays.

First, the overview of the percentages for the nine categories with at least one incidence (we're still looking for the first "South-South" World Series) appears in the table at right.

Since 1961, "East-Midwest" (the geographic matchup that we have for 2015) has been the highest (22%) but the incidences are much more spread around the categories now.

That's especially the case from 1998 to the present.

What about summing things up by basic region? (That is, East-Midwest-West-South, as we did up to a point in the previous post.) We really only need to do this from 1961 to the present, since the West and South simply didn't exist as regions until then.

This will look best in a "running total," and in a chart rather than a table, so here goes.

We can see that while the West got into the act early, the South languished and didn't get into the World Series action until 1991.

(And we should also remember that the Southern region consists of just five teams out of MLB's total of thirty, so it's likely to be trailing the pack. The Midwest region actually has twice as many teams as the South (ten), so by rights it should have the highest percentage of teams in the WS over this time span, but it doesn't: the East does.)

Finally, here's a table that sums up what's been going on by region with respect to the World Series since 1991. We wind up at the bottom with the total number of WS appearances for each region as the numbers add up.

We also get the color coding for the years when the World Series occurs entirely within a single region (EE, MM, WW...remember, no "SS WS" has happened yet).

We can see that it's the Eastern teams who've managed to get into the most "all-region WS," with three over the past twenty-five years.

And we can see just how the West languished, reaching the WS just once in the first nine years of the time period, and not really showing something like a normal distribution until as late as 2010.

When you look at it this way, the South (with half as many teams in its region as the Midwest) has done a good job of holding its own.

And, in fact, the Midwest has had to stage a rally over the past five years to slip ahead, with a team in the WS in each season.

At some point we'll put all this together with which region actually won all of these World Series, but we'll save that until we know who wins this one. Stay tuned...

Saturday, October 24, 2015


Well, now we know what the "geographical configuration" of the 2015 World Series will be--with the Royals eliminating the Blue Jays, it'll be "East-Midwest."

That category (EM for short) is one of ten possible geographic "collisions" that exist due to baseball's franchise movement and its incremental expansion.

The "West" came into the picture in 1958; the South in 1962 (with the Houston Colt .45s, later the Astros).

As you'll see at the bottom of the chart (at right), the South has insinuated itself into the World Series 20% of the time since 1961...

...but the South has yet to crash through with that tenth category, the "South-South" World Series, a situation partially explained by the fact that the playoff-bound teams from the South have seemingly found themselves in the same league most of the time, making it extremely difficult to bring off the still-elusive "SS" World Series.

And that's why, in case you were wondering, there is no column on the chart for the "SS" series. When it happens, we'll add it.

Since '61, East-Midwest (EM) and East-West (EW) World Series have been the most plentiful, though our breakouts (1961-80 and 1981-2015) show that these two categories have faded into the pack over the past thirty-five years...

...with the East-South (ES) matchup having the highest preponderance (15%) over the past thirty-five years, thanks in large part to the 90s Braves.

If we measure from 1991, when the Braves began their run of World Series appearances (augmented by the Marlins in '97 and '03, the Rays in '08, and the Rangers in '10 and '11), teams from the South have appeared in 44% of the World Series over the past twenty-five years.

The "all-Midwest" (MM) World Series had a bit of a flurry in the 80s, but it then went nearly twenty years before manifesting again in 2006 (Cardinals-Tigers).

Overall, however, Midwest teams are well-represented in the Fall Classic over the past thirty-five years, appearing in 47% of the World Series since 1981 (and, like the South, in 44% since 1991).

Teams from the East have matched the Midwest's performance, appearing in 47% of the World Series since '81.

It's the West that's lagged behind: they have made it to the WS only 36% of the time over that time frame.

We'll sum up the breakouts in an aggregate chart next.


Since blogs work backwards, we'll be our usual prickly selves and reverse the reverse order, thus beginning (perversely) at the beginning.

As noted in the previous post (which is behind you, not in front of you...) World Series geography--the categories of regional identity for the two teams facing off in the Fall Classic--has expanded as baseball itself has grown.

Back in the day (pre-1953, to be exact), baseball had nine teams in the East and seven in the Midwest, and thus there were only three possible categories:

--East-East (EE)
--Midwest-Midwest (MM)
--East-Midwest (EM)

That changed, as we also noted previously, when the Dodgers and Giants moved west, giving us three new categories:

--East-West (EW)
--Midwest-West (MW)
--West-West (WW)

The Dodgers managed to inaugurate one of these new categories before the first expansion era hit, in 1959, with their playoff win over the Braves (then in their Milwaukee way-station between Boston and Atlanta) depriving us of another "all-Midwest" World Series.

The chart at left gives you a visual fix on the narrow geographical bandwidth of baseball's post-season, which was also quaintly "narrow" in the sense that the World Series, in those days, was--as it is increasingly hard to fathom--the only post-season baseball at all.

The "golden age" of East-East World Series is clearly to be found in the early history of the post-season, from 1903-24, when those matchups accounted for 52% of the World Series. The other two categories (remember, the only other two possible categories at the time...) split the remaining World Series evenly.

There was a "last hurrah" for the "EE" category right after WWII, when eight of ten World Series from 1947-56 featured teams from the East (in fact, all but one of these teams being from New York).

But clearly the 1925-60 period was dominated by East-Midwest (EM) matchups, with "Midwest-Midwest" fading to a distant third. And the overall story for the pre-expansion period is that the two major categories (EE and EM) each accounted for the same percentage of World Series matchups (40%).

Of course, that will all change in the expansion era. But, since blogs work backwards, you already know that...

Thursday, October 22, 2015


We are not going to have an "all Midwestern World Series" this year, thanks to the New York Mets. (Which, from one perspective, is regrettable, as we tend to think that the Chicago Cubs, with math-meth-"magician" Joe Maddon at the helm, would have made a better story had they wound up doing in the World Series what they just did in the NLCS. All the better for Theo Epstein's self-burnishing "legacy," don't you know.)

But snark is just a side dish here--the question that we are asking here (though the title of the post isn't quite in sync with it...) is how many World Series have there been with all-Midwestern teams facing off against each other?

When, for example, was the last "all-Midwestern" World Series?

Answer: 2006, when the St. Louis Cardinals swept the Detroit Tigers (thanks, in part, to some wild throwing--to first base--by Tigers relief pitchers).

We'll have some charts on this tomorrow, but let's at least answer the basic question here. First, however, let's anatomize the categories that exist for a geographic rendering of the World Series.

For many years, there were only two such regions--East and Midwest. Franchise movement altered that configuration in the fifties, with the West coming into the MLB picture in 1958 (Dodgers and Giants to the coast). The first expansion added the South, with Houston (particularly with their original "south-western" nickname, the Colt .45s). The South added more teams via franchise movement in the sixties (Braves) and seventies (Rangers), and would later on colonize Florida.

The West would add teams via expansion (Angels, Padres, Pilots--later replaced by the Mariners), eventually adding Arizona and Colorado. The A's would move to Oakland.

So, from the original possible categories of East-East (EE), East-Midwest (EM) and Midwest-Midwest (MM) that still operated by themselves as late as 1957, we have further categories of East-West (EW), Midwest-West (MW), West-West (WW), East-South (ES), Midwest-South (MS) and West-South (WS) that came into existence as MLB expanded.
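That count of ten falls out of elementary combinatorics--two regions drawn from four, order ignored, repeats allowed:

```python
from itertools import combinations_with_replacement

regions = ["E", "M", "W", "S"]  # East, Midwest, West, South
matchups = ["".join(pair) for pair in combinations_with_replacement(regions, 2)]

print(matchups)       # ['EE', 'EM', 'EW', 'ES', 'MM', 'MW', 'MS', 'WW', 'WS', 'SS']
print(len(matchups))  # 10
```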

The answer to the basic question--how many "all-Midwestern" (MM) World Series have there been--is fifteen. There have been five since the first year of expansion (1961):

2006 STL-DET
1987 STL-MIN
1985 STL-KCR
1982 STL-MIL
1968 STL-DET

That's rather monolithic for the NL representative, come to think of it....

More tomorrow...stay tuned.

Sunday, October 11, 2015


Chase Utley is a borderline Hall-of-Famer whose late start as a major leaguer has doomed him to a long, possibly infinite Veterans Committee purgatory. All across his career he's shown a command of the "little things" that win ball games.

Two of the most prominent of these "little things" are extra OBP in the form of walks, and superior, intelligent baserunning (as measured by stolen base success rate and out-on-the-basepaths stats).

All of the evidence surrounding Utley suggests that he is a thinking man's player.

Er...you're not on the base--and you've just screwed up
how people will remember you for the rest of recorded time...
So it's a "dirty old shame" (as Karen Carpenter would croon for us, had she not been the victim of her own takeout slide) that Chase Utley is now likely to be remembered mostly for a play on the basepaths that looks uglier and uglier the more it is replayed.

Worse yet is the unconscionable set of errors made by the umpiring crew in interpreting and ruling on what should have been the result of that play--which not only resulted in a needless season-ending injury to Mets' shortstop Ruben Tejada, but allowed the Dodgers to score four runs in an inning when the proper call would have resulted in them scoring none at all.

The irony is that the umpiring crew made one of the most egregious errors in baseball history while using the very system designed to prevent such errors from occurring.

So, as the title of this post indicates, what we have on our hands now is a "big painful mess that needs a rug the size of Jupiter in order to be swept out of view."

Utley's "slide" was probably not 100% intentional. It's one of those things that happens in athletic contests on rare and unfortunate occasions when two people moving in opposite directions wind up moving right into each other's path. The results are cataclysmic.

But the fact that Utley was uninjured as a result of the collision indicates that his intent was of a magnitude that cannot be overlooked by MLB. Plays of this nature, when there is even a scintilla of evidence pointing toward non-accidental intent, must be legislated in a way that makes it clear that no grey areas will be tolerated. The present and future health of players, particularly middle infielders, needs just as much special attention from MLB as is the case with catchers.

The simplest solution for baseball when such a play occurs--one that results in an injury--is to eject the player who caused the injury. Eject him immediately and without exception or recourse to appeal. (This does not apply, of course, when two teammates collide--only when injuries occur on the basepaths.)

The "big painful mess," aside from Tejada's needless season-ending injury, is that Utley was not called out (for any of three legitimate reasons which were, against all odds, completely overlooked) and a double play imposed as a penalty for causing the injury. (Replays indicate that Tejada was attempting to position himself for a throw to first when Utley slammed into him.)

We've previously suggested that players who cause injury, whether this way or by charging the mound (Carlos Quentin-->Zack Greinke), should be suspended for the length of the time that it takes the injured player to return to action. Applying that in this case, Utley should be suspended for the rest of the post-season.

This will make teams think twice about condoning the type of behavior that creates unnecessary risk on the playing field--something that a smart player like Utley certainly knew was the case, and had the option to not "put into play" in last night's game.

Joe Torre, who is getting too old to be involved in such serious matters, is almost certain to whitewash this--his hands are tied by the shabby actions of everyone involved in last night's "painful mess." But MLB can institute a coherent, consistent policy with respect to this "terror on the basepaths."


In the previous post in this ongoing series, on the final day of the 2015 regular season, we noted that we had hit 99 on the complete game list and that we would have at least one more that day to keep the "quest for double figures" in play for another year.

And so it was...

10/4 Cole Hamels, TEX (vs. LAA, 3-hitter, 9-2 win)

Interestingly, there were only 98 games in 2015 where pitchers threw 8 2/3 or more IP, regardless of whether they had a complete game. That should remind us that a sub-set of CGs are games where the pitcher throws eight innings--and, of course, loses. There were 19 such games in 2015.

A look at the official stats will show you that there were actually 104 CGs in 2015...but once again we remind you that those extra four CGs were due to games called early due to rain. (We do not recognize any CGs of less than 8 IP as legitimate.)
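The filter we're applying is simple enough to spell out. Here's a sketch with made-up game records (the pitcher names and lines are hypothetical; we use a decimal stand-in for baseball's innings notation, so 8.2 represents 8 2/3 IP):

```python
# Hypothetical game records: (pitcher, innings_pitched, official_cg_flag).
games = [
    ("A", 9.0, True),    # garden-variety complete game
    ("B", 8.0, True),    # road loser's CG: the home team never bats in the ninth
    ("C", 5.0, True),    # rain-shortened "official" CG--we toss these out
    ("D", 8.2, False),   # 8 2/3 IP, pulled with two outs in the ninth: no CG
]

# Our rule: an official CG counts only if it ran at least eight innings.
legit_cgs = [g for g in games if g[2] and g[1] >= 8.0]
print(len(legit_cgs))  # 2
```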

So...clearly we can't get any closer without dipping down into double figures. Next year projects to be another down-to-the-wire type of affair with respect to this endangered species...