
Lookout Landing

Quick Thought: Runs vs. Earned Runs

I thought it'd be interesting to see if there was any difference between clubs' MLB rank by earned runs and runs. As most of you know, errors are out of vogue in fielding statistics for a variety of reasons. Scorer bias and subjectivity are among them. If a Safeco scorer gives Ichiro a hit when the second baseman actually committed an error, and Ichiro comes around to score, that run is earned. Perhaps the exact same chain of events happens in Toronto, for example. Unearned run. Combine subjectivity with the fact that errors really aren't a very good gauge of defence, and you start to see that looking at earned runs doesn't actually tell you how many runs a team's given up on the year. Case in point:

Table 1: 2009 Runs, Earned Runs, and MLB ranks. All data from Baseball-Reference.

Green denotes that the team was over-rated by using ERA, red denotes them being under-rated. Interesting, huh? Scanning that chart, I'm not seeing any correlation between green/red teams and actual defensive ability (and with this being a quick post I'm not inclined to dig too deep). I just thought it was interesting that although we can generally multiply ERA by 1.087 to get RA, that relationship doesn't even hold true at the team level. Note that only six teams don't see their rank change - the Mariners are one of them, holding steady at sixth in both categories. The most underrated team is the Minnesota Twins, who go from 25th by earned runs to 19th by runs, and at the opposite end of the spectrum we have the Florida Marlins, who tumble from 14th to 20th.
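To put a number on how far the 1.087 rule of thumb can drift at the team level, here is a minimal sketch. The 4.00 ERA is a made-up input, and the two team ratios are the highest and lowest R/ER figures a reader reports in the comments (1.107 and 1.053); the function name is hypothetical.

```python
LEAGUE_RA_PER_ERA = 1.087  # league-wide multiplier quoted in the post


def estimate_ra9(era, ratio=LEAGUE_RA_PER_ERA):
    """Estimate runs allowed per nine innings from ERA and an R/ER ratio."""
    return era * ratio


era = 4.00  # hypothetical ERA, for illustration only
league_guess = estimate_ra9(era)
high_ratio = estimate_ra9(era, 1.107)  # highest team R/ER reported in the comments
low_ratio = estimate_ra9(era, 1.053)   # lowest team R/ER reported in the comments

print(f"league rule: {league_guess:.3f}")
print(f"high-ratio team: {high_ratio:.3f}")
print(f"low-ratio team: {low_ratio:.3f}")
```

Since R and ER are divided by the same innings, a team's R/ER ratio is also its RA/ERA ratio, which is why the two can be swapped in here.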


Comments

I looked at the correlation between Team UZR and R-ER

There is essentially no correlation: R^2 = 0.04. At least the direction is right, with a better team UZR going along with a lower R-ER.

BABIP is way better to use to predict Team UZR than R-ER
Also I was playing around with some numbers and...

It's also kinda strange.

I did some correlations to predict total runs allowed. I had two linear regressions using either FIP and UZR or FIP and BABIP.

FIP and UZR had an R^2 of 0.89 and a standard error of 32.9 runs.

FIP and BABIP had an R^2 of 0.93 and a standard error of 25.3 runs.

If you use xFIP instead of FIP the difference gets even bigger.

xFIP and UZR had an R^2 of 0.80 and a standard error of 42.5 runs.

xFIP and BABIP had an R^2 of 0.91 and a standard error of 29.8 runs.

All these components are extremely significant (p-value is very small). I haven’t really thought about what all this means but it’s interesting.
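For anyone who wants to try this kind of fit themselves, here is a rough sketch of a two-predictor least-squares regression in plain Python. It is not the spreadsheet that produced the numbers above: the six data rows are invented, the helper names are hypothetical, and "standard error" is assumed to mean the residual standard error, sqrt(SS_res / (n - k)).

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Fit y on the columns of rows plus an intercept via the normal equations.

    Returns (coefficients, r_squared, standard_error); coefficients[0] is
    the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    n, k = len(y), len(X[0])
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot, (ss_res / (n - k)) ** 0.5


# Invented (FIP, BABIP) pairs and runs allowed, for illustration only.
rows = [(3.8, 0.290), (4.1, 0.305), (4.3, 0.295),
        (4.5, 0.310), (4.0, 0.300), (4.7, 0.298)]
runs_allowed = [650, 740, 735, 810, 700, 800]

coefficients, r_squared, std_error = fit_ols(rows, runs_allowed)
print(coefficients, r_squared, std_error)
```

Swapping the BABIP column for UZR gives the other regression; comparing the two R^2 and standard error values is the comparison described above.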

If you are curious, the regression equation I came up with is RA = 156.76*FIP + 3389.54*BABIP - 927.44.
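To get a feel for the size of those coefficients, here is a small sketch that just evaluates the equation as quoted. The function name and the sample inputs (a 4.00 FIP and a .300 BABIP) are made up for illustration.

```python
def predicted_runs_allowed(fip, babip):
    """Evaluate the quoted 2009 fit: RA = 156.76*FIP + 3389.54*BABIP - 927.44."""
    return 156.76 * fip + 3389.54 * babip - 927.44


baseline = predicted_runs_allowed(4.00, 0.300)  # hypothetical team

# Marginal effects implied by the quoted slopes.
ten_points_of_babip = predicted_runs_allowed(4.00, 0.310) - baseline
tenth_of_fip = predicted_runs_allowed(4.10, 0.300) - baseline

print(f"baseline: {baseline:.1f} runs")
print(f"+.010 BABIP: {ten_points_of_babip:+.1f} runs")
print(f"+0.10 FIP: {tenth_of_fip:+.1f} runs")
```

By these slopes, ten points of team BABIP are worth roughly 34 runs and a tenth of a run of FIP roughly 16, all within the range of the 2009 data the fit came from.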

You can also use the equation RA - RA[avg] = 156.76*(FIP - FIP[avg]) + 3289.54*(BABIP - BABIP[avg]), where everything is expressed relative to league average. I set the constant to zero; if you don’t, the regression estimates it at -8×10^-14, which is effectively zero. With the constant set to zero, the R^2 is 0.935 and the standard error is 24.9 runs.
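The near-zero constant is what you would expect once everything is measured relative to average. A least-squares fit with an intercept puts that intercept at mean(y) minus the sum of each slope times the mean of its predictor, and centering makes every one of those means zero up to floating-point rounding. A minimal sketch, using invented team values and the slopes as quoted for the relative form:

```python
def mean(values):
    return sum(values) / len(values)


def center(values):
    """Subtract the mean so values are expressed relative to average."""
    avg = mean(values)
    return [v - avg for v in values]


# Invented team values, for illustration only (not 2009 data).
fip = [3.8, 4.1, 4.3, 4.5, 4.0, 4.7]
babip = [0.290, 0.305, 0.295, 0.310, 0.300, 0.298]
ra = [650, 740, 735, 810, 700, 800]

# Slopes as quoted for the relative-to-average equation.
B_FIP, B_BABIP = 156.76, 3289.54

# intercept = mean(y) - sum(slope * mean(x)); after centering, every mean
# is zero or a tiny rounding residue, so the intercept is too, whatever
# the slopes are.
intercept = (mean(center(ra))
             - B_FIP * mean(center(fip))
             - B_BABIP * mean(center(babip)))
print(intercept)
```

A value like -8×10^-14 is that kind of rounding residue, not a meaningful estimate.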

This kind of tells me that BABIP can be pretty much used to explain team defense. These equations could probably be cleaned up to make everything make more sense but I think it shows the value of BABIP at the team level.

I'm not exactly sure what you're going for here.
I wasn't sure what I was going for either

But in the end, team BABIP might be a better predictor of team defense than team UZR.

BABIP includes defense + luck

We don’t really care about the luck. UZR tries to isolate skill.

I was surprised UZR didn't predict RA better

It could be that at the team level BABIP doesn’t really contain that much luck, since the sample size is larger. I only put out all that data here because it surprised me.

BABIP only really includes luck for individual hitters or pitchers,

because once the ball is put in play it’s largely out of their control whether it’s a hit or an out. But on a team-wide level over a whole season it’s a pretty damn good measure of overall team defensive ability/efficiency. What could be a better measure of team-wide defense than how many balls in play were converted to outs?

In other words:

For pitchers and (to a somewhat lesser degree) hitters, BABIP is a measure of things outside their control, because whether it turns into an out is mostly dependent upon how good the fielders are. The implication is that BABIP is mostly controlled by the opposing defense. As such, it makes perfect sense to use BABIP as a team-wide measure of defensive efficiency.

...
For pitchers and (to a somewhat lesser degree) hitters, BABIP is a measure of things outside their control, because whether it turns into an out is mostly dependent upon how good the fielders are.

That’s not true. BABIP is more lucky/unlucky bounce, balls just out of reach, etc. than it is fielding skill, even at a team level. Studes did some research for the THT annual a couple years ago on this matter IIRC.

Link?
I can't find it
Fair enough.

I would suspect that unlucky bounces constitute a very small percentage of all BIP, and over an entire season each team would have a similar # of unlucky bounces. As far as balls just out of reach, isn’t that to some degree a measure of fielding skill? A ball just out of reach of Torii Hunter is a ball that Gutz probably gets, because he’s got better range. Sure there’s some luck involved there, but again I would think when comparing two teams over the course of an entire season most of it would balance out pretty well.

UZR is basically BABIP though

It just normalizes for how difficult it is to field a groundball in a zone. Since we are looking at the team perspective and if we assume the distribution of balls in play is roughly the same for all zones on most teams, most of these factors probably average out in the end (or at least it appears like they do).

I don't see why those things would even out

Most of them are going to be caused by biases (like a pitching staff that allows a lot of line drives, or pulled balls in play), and even if the biases are limited (which I don’t think they are), the difficulty of batted balls could vary simply by random chance.

I'll get to the first part of your comment in a second.

But regarding your last phrase “the difficulty of batted balls could vary by random chance” what I’m saying is that over a large enough sample size the difficulty of batted balls should balance out. How large that sample size would have to be I do not know.

I was going to post something more in-depth on this but I just confused myself with the math so I'm simplifying.

How do you feel about actual BABIP vs. expected BABIP from an average defensive team adjusted for GB/LD/FB percentages as a measure of team defensive ability? Taking it one step further we could convert to run values to account for the fact that LD and FB have higher run values than GB.
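Roughly, the idea could be sketched like this. The league BABIP values by batted-ball type below are placeholders picked for illustration, not measured 2009 splits, and the team mix, function names, and ball-in-play count are hypothetical too.

```python
# Placeholder league BABIP by batted-ball type. Illustrative values only,
# not measured 2009 figures; substitute real league splits.
LEAGUE_BABIP = {"GB": 0.235, "LD": 0.720, "FB": 0.140}


def expected_babip(shares, league_babip=LEAGUE_BABIP):
    """BABIP an average defense would allow given a team's batted-ball mix.

    shares maps batted-ball type to its fraction of balls in play and
    should sum to 1.
    """
    return sum(shares[kind] * league_babip[kind] for kind in shares)


def hits_saved(actual_babip, shares, balls_in_play):
    """Hits prevented relative to an average defense (positive is good)."""
    return (expected_babip(shares) - actual_babip) * balls_in_play


team_shares = {"GB": 0.45, "LD": 0.19, "FB": 0.36}  # hypothetical team
saved = hits_saved(0.285, team_shares, 4400)

print(f"expected BABIP: {expected_babip(team_shares):.3f}")
print(f"hits saved vs. average: {saved:.1f}")
```

Converting hits saved to runs would need a run value per hit for each batted-ball type, which is left out here.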

It would be interesting to then compare that to FanGraphs' team UZR

since they should both be scaled to runs saved above an average team.

I think you'll see that they are very similar to UZR

That actually might be better, because it won’t fuss around with all of the adjustments UZR makes to individualize defense.

I'm going to work on it a little.

Might be a few days before I get it up but I’ll post my results in a fanpost or something.

Yeah if you had BABIP broken down by GB, FB, and LD you could

probably do pretty well.

Any suggestions for where to get that on a team-by-team basis?

looking for it on BB-ref right now

BB-ref should have it

If not, ask someone with a Retrosheet database!

BB-ref is too confusing sometimes

major information overload right now.

If anyone has LahmanID mappings for Gameday player IDs, I could get you the data myself
I got it from the FanGraphs team page under pitching

You can export this to Excel which makes the data gathering faster.

Oh I didn't see what you were asking for

I’m not sure.

Yeah from the team perspective I think I might like BABIP more than UZR

because it's so simple. That being said, UZR has to be more accurate right? It includes all the stuff that should matter like the value of a fly/groundball. There must be something with FIP and BABIP/UZR or just random luck that made the regression screwy right?

What samples are you using for the regression

Just 2009 data, or like 02-09 data?

I only used 2009. I was lazy and didn't think it was gonna be this interesting so I didn't put that much effort into it.

Do you think it would make a difference? I might go back through at some point and see if a bigger sample would help.

It might make a difference

Correlation coefficients are heavily dependent on sample size, and a single season gives you only 30 teams.

First off, you ran the regressions against ERA-FIP, not just FIP, right?

I basically wanted to predict runs allowed

I used FIP or xFIP and UZR or BABIP. It appears that FIP and BABIP are the best to use which is surprising to me. I don’t really understand these results though and it appears BABIP is a very good measure of team defense.

Oh, I see

So you ran a multivariable regression on runs allowed, and BABIP+FIP came out better than UZR+FIP.

I’ll have to think about why that might be. First you should try getting more seasons (but I can understand why that would be a pain in the ass). But I’ll try to dig up the study I saw on BABIP and luck.

Don't use xFIP to retroactively 'predict' runs

Regressing home runs kills it for such a purpose

True, that makes a lot of sense.
Park factors would probably help some
Of course BABIP is going to be a better retroactive predictor of runs allowed.

That should be obvious.

Hah, thanks. Now that you mention it, it is pretty obvious.

I confused myself with too many numbers and not enough critical thinking.

FIP and BABIP tell us what happened. xFIP (or tRA) and UZR tell us how good a team actually was. I think it makes sense now.

I'm glad others in the comment thread are making this post meaningful

Without a comparison of (R - ER) vs. UZR or a more meaningful fielding stat, we really don’t know how loose the correlation is, or if it is worth being dismissed as insignificant.

That a team’s ranking can change by anywhere from 0 to 6 spots based on R or ER proves what?

I'm sorry not every post is a thesis

Next time I’m working on something and have an idea for a post I can’t flesh out completely, I’ll be sure to run it by you first to see if it’s worth your time and effort.

Leading the discussion is tough and I think we all love the amount of content you guys have put out

This guy’s tone sucks.

my tone is terrible

Actually Graham makes great posts which is why I read this site.
The mad pace of posts has uncannily had amazing content each time and nearly each post has in itself been a little thesis…..
It’s amazing the authors haven’t run out of content….

Edgar for Pres, you’ve done a magnificent job of looking deeper into the data…and noticing my tone sucks, I’ll be less prickly in the future.

Do you want your 10 bucks back?
point taken

my criticism is unconstructive

It could be constructive if put in different terms.

Take out the title, reword your critique, and it could easily have fostered a discussion more useful than “How big of an asshole is this guy?”.

It was, thank you for realizing that and correcting yourself.

To others, the piling on and repeating of the same comments was also completely unnecessary.

To continue,

The point about tone had already been brought up and if you still felt compelled to respond in some way because you thought the comment was inappropriate, that would have been a perfect time to flag the comment rather than furthering disgusted replies which serve no purpose.

Yay awesome moderating!
It might be interesting to look at the ratio of R to ER

for all the teams in the league. This would give you an idea of how much the ratio varies.

The “change in ranks” method is a bit funny since teams which are at the upper and lower end of the runs rankings won’t appear to move as much because they’re already in the tails of the distribution.

For example, the Marlins have R/ER = 1.101, and the Nationals have R/ER of 1.104. But the Marlins have a delta of 6 because they’re in the middle of the pack (with more teams close to them in runs), while the Nats have a delta of only 1 because they are in the upper tail.
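A toy version of that effect, as a sketch only: the six-team league below is invented, not real data. Teams C and F have nearly the same R/ER ratio, but C sits in a crowded middle and moves two spots while F sits alone in the tail and does not move at all.

```python
# Hypothetical six-team league: team -> (earned runs, runs). Not real data.
league = {
    "A": (600, 650),
    "B": (640, 690),
    "C": (645, 712),
    "D": (650, 700),
    "E": (655, 705),
    "F": (800, 883),
}


def ranks(values):
    """Rank teams from fewest (1) to most; assumes no ties."""
    ordered = sorted(values, key=values.get)
    return {team: i + 1 for i, team in enumerate(ordered)}


er_rank = ranks({team: er for team, (er, r) in league.items()})
r_rank = ranks({team: r for team, (er, r) in league.items()})

ratio = {team: r / er for team, (er, r) in league.items()}
delta = {team: r_rank[team] - er_rank[team] for team in league}

for team in league:
    print(f"{team}: R/ER = {ratio[team]:.3f}, rank change = {delta[team]:+d}")
```

A positive rank change here means the team looks worse by runs than by earned runs.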

Interestingly

when you run the numbers this way, the Mariners have the highest R/ER ratio in baseball, at 1.107. The lowest ratio is Philadelphia, at 1.053.

Yep

The reason I used rankings was that I was intending to demonstrate how using the fairly arbitrary earned runs/total runs distinction warps perception, and what better way to show that than by comparing overall rankings?

R/ER ratio will obviously be correlated with errors, and we made a lot last year.

R/ER vs UZR

the correlation is nonexistent

example:
SEA – worst R/ER, best UZR
MIN – best R/ER, second worst UZR

Edgar for Pres already pointed that out in the comments above
essentially he did, yes

I’m looking at the ratio, R/ER, and not the difference, R-ER, but same conclusion.

the only reason to look at the ratio instead is that a team with bad pitching and good fielding (by the erroneous “error standard”) would conceivably have a higher difference, but lower ratio than some other teams
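A quick made-up example of that case, with hypothetical run totals and a hypothetical helper name: the staff that allows more earned runs ends up with the bigger R - ER difference but the smaller R/ER ratio.

```python
def diff_and_ratio(earned_runs, runs):
    """Return (R - ER, R / ER) for one team."""
    return runs - earned_runs, runs / earned_runs


# Hypothetical totals, for illustration only.
weak_staff = diff_and_ratio(800, 850)    # bad pitching, few unearned runs per ER
strong_staff = diff_and_ratio(600, 645)  # good pitching, ratio of about 1.075

print(f"weak staff: diff {weak_staff[0]}, ratio {weak_staff[1]:.4f}")
print(f"strong staff: diff {strong_staff[0]}, ratio {strong_staff[1]:.4f}")
```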

Please use the reply link in people's posts when responding to comments

it makes the conversation a lot easier to follow. Thanks!

In the spirit of what Graham really likes

Here is a recap of a thread at Tango many years ago about DIPS that I thought was pretty interesting. It’s from long ago and talks about pitchers’ year-to-year correlation of BABIP.

here I found it at Tango’s blog where he posted it recently.
