I thought it'd be interesting to see if there was any difference between clubs' MLB rank by earned runs and runs. As most of you know, errors are out of vogue in fielding statistics for a variety of reasons. Scorer bias and subjectivity are among them. If a Safeco scorer gives Ichiro a hit when the second baseman actually committed an error, and Ichiro comes around to score, that run is earned. Perhaps the exact same chain of events happens in Toronto, and the scorer there rules it an error. Unearned run. Combine subjectivity with the fact that errors really aren't a very good gauge of defence, and you start to see that looking at earned runs doesn't actually tell you how many runs a team's given up on the year. Case in point:

Table 1: 2009 Runs, Earned Runs, and MLB ranks. All data from Baseball-Reference.
Green denotes that the team was over-rated by using ERA, red denotes them being under-rated. Interesting, huh? Scanning that chart, I'm not seeing any correlation between green/red teams and actual defensive ability (and with this being a quick post I'm not inclined to dig too deep). I just thought it was interesting that although we can generally multiply ERA by 1.087 to get RA, that relationship doesn't even hold true at the team level. Note that only six teams don't see their rank change - the Mariners are one of them, holding steady at sixth in both categories. The most underrated team is the Minnesota Twins, who go from 25th by earned runs to 19th by runs, and at the opposite end of the spectrum we have the Florida Marlins, who tumble from 14th to 20th.
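For anyone who wants to reproduce the rank comparison, here is a minimal sketch. The helper names (`rank_by`, `rank_shift`, `ra_from_era`) are hypothetical, ties are broken by input order (an assumption), and the totals in the usage note are made-up placeholders, not the 2009 figures.

```python
def rank_by(totals):
    """Return {team: rank}, where rank 1 is the fewest runs allowed."""
    ordered = sorted(totals, key=totals.get)  # stable sort: ties keep input order
    return {team: i + 1 for i, team in enumerate(ordered)}


def rank_shift(runs, earned_runs):
    """Rank by earned runs minus rank by runs, per team.

    Positive: the team ranks better by runs than by earned runs
    (under-rated by ERA). Negative: over-rated by ERA.
    """
    by_r = rank_by(runs)
    by_er = rank_by(earned_runs)
    return {team: by_er[team] - by_r[team] for team in runs}


def ra_from_era(era, factor=1.087):
    """Rough rule of thumb quoted in the post: RA is about ERA * 1.087."""
    return era * factor
```

With placeholder totals `runs = {"A": 700, "B": 710, "C": 720}` and `earned_runs = {"A": 660, "B": 640, "C": 690}`, `rank_shift` gives A: +1, B: -1, C: 0. On this sign convention the Twins' move from 25th to 19th is +6 and the Marlins' move from 14th to 20th is -6.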
I looked at the correlation between Team UZR and R-ER
There is essentially no correlation: R^2 = 0.04. At least the sign is right, in that a better team UZR goes with a lower R-ER.
Edgar for Pres - January 24, 2010
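A minimal sketch of how an R^2 like the one quoted above could be computed for two team-level columns (say, team UZR and R-ER). The function names are hypothetical, and this is plain Pearson correlation squared, which matches R^2 only for a one-variable linear fit.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(xs, ys):
    """Share of variance in ys explained by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2
```

The sign of `pearson_r` carries the direction of the relationship, which `r_squared` throws away, so it is worth looking at both.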
BABIP is way better to use to predict Team UZR than R-ER
Edgar for Pres - January 24, 2010
Also I was playing around with some numbers and...
It's also kinda strange.
I did some correlations to predict total runs allowed. I had two linear regressions using either FIP and UZR or FIP and BABIP.
FIP and UZR had an R^2 of 0.89 and a standard error of 32.9 runs.
FIP and BABIP had an R^2 of 0.93 and a standard error of 25.3 runs.
If you use xFIP instead of FIP the difference gets even bigger.
xFIP and UZR had an R^2 of 0.80 and a standard error of 42.5 runs.
xFIP and BABIP had an R^2 of 0.91 and a standard error of 29.8 runs.
All these components are extremely significant (p-value is very small). I haven’t really thought about what all this means but it’s interesting.
If you are curious, the regression equation I came up with is RA = FIP*156.76 + 3389.54*BABIP - 927.44.
You can also use the equation RA-RA[avg] = (FIP-FIP[avg])x156.76 + (BABIP-BABIP[avg])x3289.54, where everything is expressed relative to league average. I set the constant to zero; if you don’t set it to zero, the regression puts it at -8×10^-14, which is effectively zero. With the constant set to zero, the R^2 is 0.935 and the standard error is 24.9 runs.
This kind of tells me that BABIP can be pretty much used to explain team defense. These equations could probably be cleaned up to make everything make more sense but I think it shows the value of BABIP at the team level.
Edgar for Pres - January 24, 2010
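The two equations in the comment above translate directly into code. This is only a sketch: the function and parameter names are hypothetical, and the coefficients are copied exactly as written in the comment, including the different BABIP coefficients in the two forms (3389.54 and 3289.54).

```python
def predicted_ra(fip, babip, fip_coef=156.76, babip_coef=3389.54,
                 intercept=-927.44):
    """Predicted team runs allowed from team FIP and team BABIP."""
    return fip * fip_coef + babip * babip_coef + intercept


def predicted_ra_vs_average(fip, babip, lg_fip, lg_babip,
                            fip_coef=156.76, babip_coef=3289.54):
    """Predicted runs allowed relative to league average (constant fixed at zero)."""
    return (fip - lg_fip) * fip_coef + (babip - lg_babip) * babip_coef
```

Read this way, holding FIP fixed, ten points of team BABIP (.010) is worth roughly 33 to 34 runs over a season, depending on which of the two coefficients is used.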
I'm not exactly sure what you're going for here.
lailaihei - January 24, 2010
I wasn't sure what I was going for either
But in the end, team BABIP might be a better predictor of team defense than team UZR.
Edgar for Pres - January 24, 2010
BABIP includes defense + luck
We don’t really care about the luck. UZR tries to isolate skill.
vivaelpujols - January 24, 2010
I was surprised UZR didn't predict RA better
It could be at the team level, BABIP doesn’t really have that much luck since the sample size is larger. I only put out all that data here because it surprised me.
Edgar for Pres - January 24, 2010
BABIP only really includes luck for individual hitters or pitchers,
because once the ball is put in play it’s largely out of their control whether it’s a hit or an out. But on a team-wide level over a whole season it’s a pretty damn good measure of overall team defensive ability/efficiency. What could be a better measure of team-wide defense than how many balls in play were converted to outs?
Terminator X - January 24, 2010
In other words:
For pitchers and (to a somewhat lesser degree) hitters, BABIP is a measure of things outside their control, because whether it turns into an out is mostly dependent upon how good the fielders are. The implication is that BABIP is mostly controlled by the opposing defense. As such, it makes perfect sense to use BABIP as a team-wide measure of defensive efficiency.
Terminator X - January 24, 2010
...
That’s not true. BABIP is more lucky/unlucky bounce, balls just out of reach, etc. than it is fielding skill, even at a team level. Studes did some research for the THT annual a couple years ago on this matter IIRC.
vivaelpujols - January 24, 2010
Link?
Terminator X - January 24, 2010
I can't find it
vivaelpujols - January 24, 2010
Fair enough.
I would suspect that unlucky bounces constitute a very small percentage of all BIP, and over an entire season each team would have a similar # of unlucky bounces. As far as balls just out of reach, isn’t that to some degree a measure of fielding skill? A ball just out of reach of Torii Hunter is a ball that Gutz probably gets, because he’s got better range. Sure there’s some luck involved there, but again I would think when comparing two teams over the course of an entire season most of it would balance out pretty well.
Terminator X - January 24, 2010
UZR is basically BABIP though
It just normalizes for how difficult it is to field a groundball in a zone. Since we are looking at the team perspective, and if we assume the distribution of balls in play is roughly the same for all zones on most teams, most of these factors probably average out in the end (or at least it appears like they do).
Edgar for Pres - January 24, 2010
I don't see why those things would even out
Most of them are going to be caused by biases (like a pitching staff that allows a lot of line drives, or pulled balls in play), and even if the biases are limited (which I don’t think they are), the difficulty of batted balls could vary simply by random chance.
vivaelpujols - January 24, 2010
I'll get to the first part of your comment in a second.
But regarding your last phrase “the difficulty of batted balls could vary by random chance” what I’m saying is that over a large enough sample size the difficulty of batted balls should balance out. How large that sample size would have to be I do not know.
Terminator X - January 24, 2010
I was going to post something more in-depth on this but I just confused myself with the math so I'm simplifying.
How do you feel about actual BABIP vs. expected BABIP from an average defensive team adjusted for GB/LD/FB percentages as a measure of team defensive ability? Taking it one step further we could convert to run values to account for the fact that LD and FB have higher run values than GB.
Terminator X - January 24, 2010
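One way the "actual vs. expected BABIP" idea above could be sketched. The function names are hypothetical, and the league BABIP by batted-ball type has to be supplied by the caller; the numbers in the usage note are illustrative placeholders, not real league figures.

```python
def expected_babip(mix, league_babip):
    """BABIP an average defense would allow given a batted-ball mix.

    mix: {"GB": ..., "LD": ..., "FB": ...} as shares or raw counts of balls in play.
    league_babip: league-average BABIP for each of the same batted-ball types.
    """
    total = sum(mix.values())
    if total <= 0:
        raise ValueError("batted-ball mix must have a positive total")
    return sum(amount / total * league_babip[kind] for kind, amount in mix.items())


def babip_vs_expected(actual_babip, mix, league_babip):
    """Expected minus actual BABIP.

    Positive: the team allowed fewer hits on balls in play than an average
    defense would have with the same GB/LD/FB mix.
    """
    return expected_babip(mix, league_babip) - actual_babip
```

For example, a staff with a 50/20/30 GB/LD/FB mix and placeholder league rates of .240/.700/.140 has an expected BABIP of .302; if the team actually allowed .290, the defense comes out 12 points better than average. Converting to runs would take one more step, weighting each batted-ball type by its own run value.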
It would be interesting to then compare that to FanGraphs' team UZR
since they should both be scaled to runs saved above an average team.
Terminator X - January 24, 2010
I think you'll see that they are very similar to UZR
That actually might be better, because it won’t fuss around with all of the adjustments UZR makes to individualize defense.
vivaelpujols - January 24, 2010
I'm going to work on it a little.
Might be a few days before I get it up but I’ll post my results in a fanpost or something.
Terminator X - January 24, 2010
Yeah if you had BABIP broken down by GB, FB, and LD you could
probably do pretty well.
Edgar for Pres - January 24, 2010
Any suggestions for where to get that on a team-by-team basis?
looking for it on BB-ref right now
Terminator X - January 24, 2010
BB-ref should have it
If not, ask someone with a Retrosheet database!
vivaelpujols - January 24, 2010
BB-ref is too confusing sometimes
major information overload right now.
Terminator X - January 24, 2010
If anyone has LahmanID mappings for Gameday player IDs, I could get you the data myself
vivaelpujols - January 24, 2010
I got it from the FanGraphs team page under pitching
You can export this to Excel which makes the data gathering faster.
Edgar for Pres - January 24, 2010
Oh I didn't see what you were asking for
I’m not sure.
Edgar for Pres - January 24, 2010
Yeah from the team perspective I think I might like BABIP more than UZR
because it’s so simple. That being said, UZR has to be more accurate, right? It includes all the stuff that should matter like the value of a fly/groundball. There must be something with FIP and BABIP/UZR or just random luck that made the regression screwy, right?
Edgar for Pres - January 24, 2010
What samples are you using for the regression
Just 2009 data, or like 02-09 data?
vivaelpujols - January 24, 2010
I only used 2009. I was lazy and didn't think it was gonna be this interesting so I didn't put that much effort into it.
Do you think it would make a difference? I might go back through at some point and see if a bigger sample would help.
Edgar for Pres - January 24, 2010
It might make a difference
Correlation coefficients are heavily dependent on sample size.
First off, you ran the regressions against ERA-FIP, not just FIP, right?
vivaelpujols - January 24, 2010
I basically wanted to predict runs allowed
I used FIP or xFIP and UZR or BABIP. It appears that FIP and BABIP are the best to use which is surprising to me. I don’t really understand these results though and it appears BABIP is a very good measure of team defense.
Edgar for Pres - January 24, 2010
Oh, I see
So you ran a multivariable regression on runs allowed, and BABIP+FIP came out better than UZR+FIP.
I’ll have to think about why that might be. First you should try getting more seasons (but I can understand why that would be a pain in the ass). But I’ll try to dig up the study I saw on BABIP and luck.
vivaelpujols - January 24, 2010
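For readers who want to rerun a multivariable regression like the one described above with more seasons, here is a standard-library sketch of ordinary least squares. The function names are hypothetical, and this is not necessarily how the original numbers were produced. Each row would be one team-season of predictors (for example FIP and BABIP) and `y` the runs allowed.

```python
from math import sqrt


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit_quality(rows, y, coefs):
    """Return (R^2, standard error of the estimate) for fitted coefficients."""
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    dof = len(y) - len(coefs)  # observations minus fitted parameters
    return 1 - ss_res / ss_tot, sqrt(ss_res / dof)
```

With only 30 team-seasons and three fitted parameters there are 27 degrees of freedom, so pooling several seasons should give a steadier read on both R^2 and the standard error.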
Don't use xFIP to retroactively 'predict' runs
Regressing home runs kills it for such a purpose
Graham MacAree - January 24, 2010
True, that makes a lot of sense.
Edgar for Pres - January 24, 2010
Park factors would probably help some
Edgar for Pres - January 24, 2010
Of course BABIP is going to be a better retroactive predictor of runs scored.
That should be obvious.
Matthew - January 24, 2010
Hah, thanks. Now that you mention it, it is pretty obvious.
I confused myself with too many numbers and not enough critical thinking.
FIP and BABIP tell us what happened. xFIP (or tRA) and UZR tell us how good a team actually was. I think it makes sense now.
Edgar for Pres - January 24, 2010
I'm glad others in the comment thread are making this post meaningful
Without a comparison of (R - ER) vs. UZR or a more meaningful fielding stat, we really don’t know how loose the correlation is, or if it is worth being dismissed as insignificant.
That a team’s ranking can change by anywhere from 0 to 6 spots based on R or ER proves what?
CBF - January 24, 2010
I'm sorry not every post is a thesis
Next time I’m working on something and have an idea for a post I can’t flesh out completely, I’ll be sure to run it by you first to see if it’s worth your time and effort.
Graham MacAree - January 24, 2010
Leading the discussion is tough and I think we all love the amount of content you guys have put out
This guy’s tone sucks.
Edgar for Pres - January 24, 2010
my tone is terrible
Actually Graham makes great posts which is why I read this site.
The mad pace of posts has uncannily had amazing content each time and nearly each post has in itself been a little thesis…..
It’s amazing the authors haven’t run out of content….
Edgar for Pres, you’ve done a magnificent job of looking deeper into the data…and noticing my tone sucks, I’ll be less prickly in the future.
CBF - January 25, 2010
Do you want your 10 bucks back?
OlSalty - January 24, 2010
please
CBF - January 25, 2010
point taken
my criticism is unconstructive
CBF - January 25, 2010
It could be constructive if put in different terms.
Take out the title, reword your critique, and it could easily have fostered a discussion more useful than “How big of an asshole is this guy?”.
Faux - January 25, 2010
It was, thank you for realizing that and correcting yourself.
To others, the piling on and repeating of the same comments was also completely unnecessary.
Matthew - January 25, 2010
To continue,
The point about tone had already been brought up and if you still felt compelled to respond in some way because you thought the comment was inappropriate, that would have been a perfect time to flag the comment rather than furthering disgusted replies which serve no purpose.
Matthew - January 25, 2010
Yay awesome moderating!
Terminator X - January 26, 2010
It might be interesting to look at the ratio of R to ER
for all the teams in the league. This would give you an idea of how much the ratio varies.
The “change in ranks” method is a bit funny since teams which are at the upper and lower end of the runs rankings won’t appear to move as much because they’re already in the tails of the distribution.
For example, the Marlins have R/ER = 1.101, and the Nationals have R/ER of 1.104. But the Marlins have a delta of 6 because they’re in the middle of the pack (with more teams close to them in runs), while the Nats have a delta of only 1 because they are in the upper tail.
cyberwulf - January 25, 2010
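The ratio approach suggested above is easy to sketch. The function names are hypothetical and the totals in the usage note are made-up placeholders, not any team's 2009 line.

```python
def r_to_er_ratio(runs, earned_runs):
    """Runs divided by earned runs for each team."""
    return {team: runs[team] / earned_runs[team] for team in runs}


def sorted_by_ratio(runs, earned_runs):
    """(team, ratio) pairs, highest R/ER first."""
    ratios = r_to_er_ratio(runs, earned_runs)
    return sorted(ratios.items(), key=lambda pair: pair[1], reverse=True)
```

With placeholder totals of 770 runs on 700 earned runs, the ratio is 1.100. Unlike the change in rank, the ratio does not depend on how many other teams happen to sit close by in the standings.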
Interestingly
when you run the numbers this way, the Mariners have the highest R/ER ratio in baseball, at 1.107. The lowest ratio is Philadelphia, at 1.053.
cyberwulf - January 25, 2010
Yep
The reason I used rankings was that I was intending to demonstrate how using the fairly arbitrary earned runs/total runs distinction warps perception, and what better way to show that than by comparing overall rankings?
R/ER ratio will obviously be correlated with errors, and we made a lot last year.
Graham MacAree - January 25, 2010
R/ER vs UZR
the correlation is nonexistent
example:
SEA – worst R/ER, best UZR
MIN – best R/ER, second worst UZR
CBF - January 25, 2010
Edgar for Pres already pointed that out in the comments above
Graham MacAree - January 25, 2010
essentially he did, yes
I’m looking at the ratio, R/ER, and not the difference, R-ER, but same conclusion.
the only reason to look at the ratio instead is that a team with bad pitching and good fielding (by the erroneous “error standard”) would conceivably have a higher difference, but lower ratio than some other teams
CBF - January 25, 2010
Please use the reply link in people's posts when responding to comments
it makes the conversation a lot easier to follow. Thanks!
pdb - January 25, 2010
In the spirit of what Graham really likes
Here is a recap of a thread at Tango many years ago about DIPS that I thought was pretty interesting. It’s from long ago, talking about pitchers’ year-to-year correlation of BABIP.
here I found it at Tango’s blog where he posted it recently.
Edgar for Pres - January 25, 2010