Baseball fans frequently debate whether high-spending Major League Baseball teams can “buy” championships. The New York Yankees inspire hatred, envy, and even satire on this front. Unlike most other professional sports in the US, baseball has no salary cap. Its weak “luxury tax” nudges teams to control spending rather than forcing them below a hard cap, leaving clubs free to break the bank for a roster of top talent, provided they have the revenue to support that spending. But can a high-spending team beat a poorer but more efficient team simply by going all out on salary? I’ll explore that question in R using data envelopment analysis (DEA).
Measuring team efficiency in baseball is difficult, and the media pundits who attempt it typically use unsophisticated methods. Google “baseball team efficiency” and you’ll find a number of articles that mostly divide salary by wins to get a cost-per-win figure, or otherwise rely on simple correlations between payroll and wins or championships. Data envelopment analysis instead estimates an efficient frontier over multiple inputs and outputs at once, which makes it well suited to assessing several measures of team performance against salary. I’ll quote from the Journal of Sports Economics paper that inspired this post to explain:
DEA was introduced to measure the relative efficiency of decision-making units (DMUs) that change inputs into outputs (Charnes, Cooper, & Rhodes, 1978). The DEA model is a linear programming technique that compares the levels of inputs and outputs of one DMU with the rest of its peer group. The DMUs that produce the highest outputs with their inputs are deemed efficient, and these efficient DMUs form a piecewise linear frontier. The frontier surface is a hyperplane with as many dimensions as there are inputs and outputs. All inefficient DMUs are evaluated relative to the efficient surface.
That paper, “Is Winning Everything?: A Data Envelopment Analysis of Major League Baseball and the National Football League,” was published in 2004, with data leading right up to the introduction of baseball’s luxury tax. The author, Prof. Karl W. Einolf of Mount St. Mary’s University, proposes pitcher salaries (as a measure of investment in defense) and all other salaries (as a measure of offensive investment) as inputs, and measures how efficiently teams convert those inputs into three outputs: team batting average, team ERA, and wins. I replaced AVG with OPS in my model and used slightly different data (from the Baseball Databank), but otherwise followed the paper’s methodology.
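The setup described above can be sketched as one small linear program per team-season. This is a minimal illustration, not the code behind the results in this post: it assumes an input-oriented model with constant returns to scale (the variant actually used is not stated here), it assumes the `lpSolve` package is installed, the function name `dea_crs_input` and every number in `toy` are hypothetical, and entering ERA as `1 / ERA` so that higher is better is just one assumed transform.

```r
# Minimal input-oriented, constant-returns-to-scale DEA sketch.
# Assumes the lpSolve package is installed; all numbers below are made up.
library(lpSolve)

# X: inputs, Y: outputs, one row per team-season (DMU).
# Returns one efficiency score per row; 1 means the DMU is on the frontier.
dea_crs_input <- function(X, Y) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  n <- nrow(X)
  sapply(seq_len(n), function(o) {
    # Decision variables: theta, then one lambda weight per DMU (all >= 0).
    objective <- c(1, rep(0, n))
    # Inputs: the weighted peer inputs must fit within theta * own inputs.
    input_rows <- cbind(-X[o, ], t(X))
    # Outputs: the weighted peer outputs must at least match own outputs.
    output_rows <- cbind(0, t(Y))
    sol <- lp(direction = "min",
              objective.in = objective,
              const.mat = rbind(input_rows, output_rows),
              const.dir = c(rep("<=", ncol(X)), rep(">=", ncol(Y))),
              const.rhs = c(rep(0, ncol(X)), Y[o, ]))
    sol$objval
  })
}

# Hypothetical payrolls in $ millions and made-up performance figures.
toy <- data.frame(
  pitcher_pay = c(40, 25, 60, 30),
  other_pay   = c(60, 30, 90, 45),
  ops         = c(0.760, 0.720, 0.780, 0.700),
  inv_era     = 1 / c(3.9, 4.2, 3.7, 4.6),  # assumed transform: higher is better
  wins        = c(92, 81, 95, 70)
)
toy$eff <- dea_crs_input(toy[, c("pitcher_pay", "other_pay")],
                         toy[, c("ops", "inv_era", "wins")])
print(round(toy$eff, 3))
```

Each score answers the question "by what fraction could this team shrink both payroll inputs and still match its own outputs using a weighted mix of peers?" A variable-returns-to-scale model would add the constraint that the lambda weights sum to 1, which generally raises scores.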
In this analysis, perfectly efficient teams have an efficiency factor of 1; perfect inefficiency is denoted by a factor of 0. I found Major League Baseball to have a mean efficiency of 0.76 during the period 1985-2013 with a standard deviation of 0.23, less efficient and more variable than Prof. Einolf’s results from 1985-2002. Interestingly, I found the mean efficiency factor of World Series-winning teams to be 0.86, higher than the MLB average, but the difference is not statistically significant given the high standard deviation of both the league’s efficiency factors and the World Series winners’ efficiency factors (0.23 and 0.22 respectively).
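For readers who want to run this kind of comparison themselves, here is one hedged sketch. The test behind the statement above is not specified, so this shows two common options: a Welch t-test and a Wilcoxon rank-sum test, the latter because efficiency scores pile up at 1 and are far from normal. The column names `eff` and `WSWin` follow the data frame used in this post; the function name `compare_champions` is hypothetical, and the data frame below is a randomly generated placeholder, not real results.

```r
# Sketch of two ways to compare champions' efficiency with everyone else's.
# `placeholder` is randomly generated stand-in data, NOT real team results.
set.seed(1)
placeholder <- data.frame(
  eff   = pmin(1, pmax(0.05, rnorm(200, mean = 0.8, sd = 0.25))),
  WSWin = sample(c(rep("Y", 10), rep("N", 190)))
)

compare_champions <- function(teams) {
  # %in% treats a missing WSWin value as "not a champion" instead of NA.
  is_champ <- teams$WSWin %in% "Y"
  champs <- teams$eff[is_champ]
  others <- teams$eff[!is_champ]
  list(
    mean_champs = mean(champs),
    mean_others = mean(others),
    welch_p     = t.test(champs, others)$p.value,
    wilcoxon_p  = wilcox.test(champs, others, exact = FALSE)$p.value
  )
}

print(compare_champions(placeholder))
```

With real data, the call would be `compare_champions(teams)`, given a data frame with those two columns.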
Taking a look at the distribution of league efficiency compared to the distribution of world champions’ efficiency factors is a little more illuminating:
As you can see, the distribution of all teams is concentrated toward the high end of the scale: even losing teams use their inputs relatively efficiently in most years. However, world champions are more efficient still, with no champions falling below 0.2 and a much higher proportion of World Series winners attaining perfect efficiency than of all teams (57% vs. 35%).
Returning to the New York Yankees: in three of the five years they won the World Series in the sample data, they were perfectly efficient, while in the other two they fell below the league average:
> teams[teams$WSWin == "Y" & teams$franchID == "NYY", c("franchID", "yearID", "eff")]
         franchID yearID       eff
1996.496      NYY   1996 0.6067819
1998.498      NYY   1998 1.0000000
1999.499      NYY   1999 1.0000000
2000.500      NYY   2000 0.2975017
2009.509      NYY   2009 1.0000000
This suggests it is possible to throw a lot of money at payroll and still win championships despite being fairly inefficient. In fact, the 2000 New York Yankees’ 0.30 efficiency factor is the worst of any World Series winner from 1985 to 2013. The next most inefficient World Series winner was the 2006 St. Louis Cardinals, also widely regarded as a “come from nowhere” success: their regular-season winning percentage of .516 is the worst ever for a World Series champion.
As the distributions above suggest, though, these are outliers. Luck and uncharacteristically strong performance in the post-season are both established features of championship runs in Major League Baseball, but efficiency in converting payroll into runners on base, strong defensive performance, and regular-season wins pays off in the post-season more often than not.
Even so, efficiency may be necessary in many years but not sufficient. Other analyses suggest that teams can buy wins, at least up to a point, meaning poorer, smaller-market teams may struggle to compete no matter how efficient they are. It’s unlikely that the competitive balance tax (“luxury tax”) can help level the playing field. I’ll explore competitive balance in my next post.