
Lookout Landing

Sabermetrics 101: BaseRuns

We're almost to #10 in our little series here, so I wanted to take stock of what people are liking and not liking. Too short? Too long? Too technical? Let me know!

Prerequisites for Understanding: Linear weights, Game state.

Prerequisites for Derivation: Game state, database.


Non-Linear Weights

In getting an understanding of how linear weights help us convert on-field events to runs, we also came across a rather interesting problem: in relying upon league average values to derive our run weightings, we start to lose accuracy at the fringes. This is because scoring runs is not actually a linear function - we can reduce it to one for a reasonable result, but the nature of relying on linear weights means that we'll never really overcome this deficit. Instead, we maintain accuracy where it's most important, and sacrifice some elsewhere in favour of having an easily-derived, simple system.

After all, non-linear relationships are a total pain to both derive and understand, right?

Not so fast!

Let's start with what we know must be true about run scoring. Home runs mean at least one run is scored. The rest of the runs are scored by runners on base who manage to advance to home plate. Many baserunners do not manage to score, because the inning ends with them left aboard, or they get caught stealing, or they're gunned down at third base on a single to right. Regardless of what actually happens to baserunners (note that our definition of 'baserunner' excludes those rounding the bases on a home run, simply for ease of writing's sake), these three things must be true.

Base Runs

With the three truths about run scoring above, we can actually construct an entirely theoretical run estimator. This is Base Runs (BsR), and it's a very simple idea: runs can only be scored on home runs or by driving runners in. Here it is, in equation form:
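Written out in words, with descriptive names standing in for any particular notation, the idea looks something like this:

```latex
\text{Runs} = \text{Baserunners} \times \text{Fraction of baserunners who score} + \text{Home Runs}
```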

Now, this appears to be trivial, and fairly unhelpful. We've shifted the problem around from trying to figure out runs as a whole to trying to determine the frequency of baserunners being driven in. Yes, if we stopped here, we wouldn't really have solved anything. Fortunately Base Runs does not stop here. There appears to be an empirical relationship between the fraction of scoring baserunners (we'll call this number S from now on) and baserunner advancement (A), as well as outs (B):
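In the symbols just defined, the relationship usually quoted for Base Runs takes the form below. The notation here is an assumption based on the definitions of S, A and B in the paragraph above:

```latex
S = \frac{A}{A + B}
```

Read this way, S climbs towards 1 as advancement grows and shrinks as outs pile up.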

As far as I know, there's no proof that this relationship must be true, but it makes sense, seeing as you need to advance baserunners along to home plate in order to score, and adding more outs must mean fewer runs scoring. Best of all, it is (for the most part) immune to environmental effects, so the relationship should hold true all across the possible baseball spectrum. By using Base Runs (of course, you have to calculate average runner advancement per type of play using game state, which is difficult), you avoid many of the problems we encounter with linear weights. It's especially useful for accurately measuring pitchers, who frequently stray into territory that linear weights finds a little daunting. Not many current statistics actually use Base Runs as their run estimator, but it's an important concept to grasp as part of the thought process of sabermetrics (and the ways our current crop of stats might be improved).
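To see why this is not a linear estimator, here is a minimal Python sketch. It assumes the scoring fraction takes the form S = A / (A + B); the function name `base_runs`, its parameter names and the input numbers are all made up for illustration and are not real advancement values derived from game state.

```python
def base_runs(baserunners, advancement, outs, home_runs):
    """Estimate runs as baserunners * S + home_runs, where S = A / (A + B)."""
    if advancement + outs <= 0:
        raise ValueError("advancement + outs must be positive")
    # S: the fraction of baserunners who come around to score (assumed form)
    score_rate = advancement / (advancement + outs)
    return baserunners * score_rate + home_runs


# Made-up inputs for two offences. The second has exactly twice the
# baserunners, advancement and home runs, with the same number of outs.
average = base_runs(baserunners=100, advancement=50, outs=50, home_runs=10)
strong = base_runs(baserunners=200, advancement=100, outs=50, home_runs=20)

print(average)               # 60.0
print(strong > 2 * average)  # True: S rises from 1/2 to 2/3, so runs more than double
```

A linear system would credit the second offence with exactly double the runs. Here, the extra advancement also raises the chance that each baserunner scores, which is the interaction linear weights cannot capture away from the league average.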


Comments

If baseruns are more accurate (especially for pitchers) why aren't they the basis for tRA rather than linear weights?

Is it because they are less accurate for the middle section (and are really just good for the fringes), because we have less data for them, or something else? It just seems that if linear weights have the most trouble with pitchers, and tRA is a statistic designed to measure pitchers, we should use the best measure we can.

That said, I bet I’m missing something here, and I hope this doesn’t come across as a criticism of tRA, because it certainly isn’t. Just a wee bit confused.

We should be using it for things like tRA

I didn’t really understand BsR when I was developing tRA (what was that, late 2006?), and since then I’ve just never got around to converting it, because it requires a rebuild from scratch, and if I’m going to be doing that there are some other things I want to incorporate… things I don’t have access to right now.

Makes perfect sense.

So does this mean that we need to adjust tRA for pitchers like Felix (or on the other end Washburn?)

Also, since I didn’t answer the questions: this is a phenomenal series at just about the right length. Is there any chance you could incorporate a “for the brave” section with some math in it?

We know we need to adjust a little, yes. tRA probably bunches pitchers up too tightly.

To answer your second question, right now there’s no chance of adding any real mathematics or graphs or anything like that. Sorry.

Sorry, I suppose I kind of asked the wrong question.

Does StatCorner do the adjusting for tRA, or is that something that should be taken into account when using it? If it’s not adjusted, how many pitchers (approx) would this impact?

As to the math, that’s fine. This series is still phenomenal. Thanks again!

We don't adjust tRA

It’s not really clear how much we need to be adjusting, or even in which direction. The effect isn’t huge, but it’s something to note for all pitchers significantly above and below the mean. Just be aware that our measures aren’t perfect, is all.

We talked about this at some point maybe a week back

After chatting, I found some data on the interwebs and have a spreadsheet/post that is going to come up at some point applying Baseruns to pitchers. It’s not going to be perfect in any way, but I think it’s going to turn out a little interesting. Sorry for being a tease bringing this up. There are a lot of little things I’m checking before I put it up. The basic framework is done, but number checking takes a little while and I’m trying to make sure I’m not doing something blatantly stupid, although I know there are a few small issues I can’t get past. I’ll make all the numbers available. I’m sure when you get around to it, you can make some much-needed changes by getting better numbers to input into the model and dealing with the regression of components.

Patriot's already beaten you to it with both FIP and SIERA
Patriot?

I thought it was Colin Wyers who introduced the BaseRuns-derived FIP on Statspeak either in 2008 or 2009.

Probably

I find it hard to keep up sometimes

I hadn't seen that. Still a tad different than what I have but differences are small

I can’t get his article to come up to verify this though. Is this it:

http://www.statspeak.net/2008/08/creating-a-dynamic-fip-with-baseruns.html

Page doesn’t load for me. I can only see the comments from the cached pages on google.

Yeah it's that

But I think statspeak went under with the rest of the MVN network of blogs, so I don’t really know how we can get to these articles.

Maybe the wayback machine saved it?
No luck

Maybe Colin will be kind enough to reproduce it elsewhere.

Wayback archives

Go through May 2008 for StatSpeak. I haven’t been able to find anything more recent on there. :(

If we can unearth it, I'd be interested in looking at it.
Sabermetrics 101

This stuff is great! I don’t understand it all, but the “101” is helping, Mucho! I wish someone back in high school had shown me this stuff applied to things I was passionate about – like baseball. I would have learned a lot more. Some advice for the confused reader – don’t try to understand it right away – just read and let it sink in slowly.
Keep it coming!

Feedback: I think the length of these is perfect

It’s enough to give a basic understanding, without being long enough to bore people like me who sometimes can’t stand reading excessively long articles

So, after reading these and the wOBA post...

I’m getting the impression that a lot of these advanced metrics are figured using league averages and not just the four true outcomes and the subsets thereof. Would that be a correct statement?

I'm confused by this question

What exactly do you mean?

I think I might understand your question (maybe not though)

wOBA is built off of linear weights which uses league average to compute the run values for events. Baseruns hypothetically does not use league average but things such as players advancing on hits are based on how often a baserunner advances on average.

The way I like to think about it is that wOBA depends on the talent/environment and Baseruns is a more universal metric. wOBA works great for last year’s MLB. Baseruns should be more universal and should work for 1940s MLB or your softball league (there are small points like baserunning that would cause errors).

This is the first time I have seen the phrase 'four true outcomes' applied to baseball

You may be thinking of advanced pitching metrics, many of which use only the three true outcomes. But putting a number on pitching talent is very different from putting one on hitting.

For pitchers the samples are smaller and they play in front of only one defense which can skew RA. Hitters are far easier, and the traditional metrics like OPS, SB, etc can tell you 80% of what the advanced ones tell you. You use wOBA and BsR because you want to be able to translate the slash stats into runs and consequently wins in a context-neutral fashion.

I might be wrong about some or all of what I’ve said.

One important note about Baseruns vs wOBA

is that for hitters, it’s not trivial to use Baseruns for an individual player. It works well for team hitting or for pitchers since they control the run environment. One of the ways people try to use Baseruns for hitters is to compute the runs scored for the team with the player and the runs scored for the team minus the player’s stats. The difference between these should be the value of the player’s production to the team.
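A rough Python sketch of that with-and-without comparison follows. It assumes the scoring fraction is S = A / (A + B) and that a player's baserunners, advancement, outs and home runs can simply be subtracted from the team totals. The names `base_runs` and `marginal_runs` and all the numbers are hypothetical.

```python
def base_runs(baserunners, advancement, outs, home_runs):
    """Estimate runs as baserunners * S + home_runs, where S = A / (A + B)."""
    score_rate = advancement / (advancement + outs)
    return baserunners * score_rate + home_runs


def marginal_runs(team, player):
    """Team estimate with the player minus the team estimate without him."""
    # Remove the player's contribution from every team total.
    team_without = {key: team[key] - player[key] for key in team}
    return base_runs(**team) - base_runs(**team_without)


# Made-up totals, not real data.
team = {"baserunners": 100, "advancement": 50, "outs": 50, "home_runs": 10}
player = {"baserunners": 20, "advancement": 10, "outs": 10, "home_runs": 2}

print(marginal_runs(team, player))  # 12.0
```

Because the estimate is run on whole-team totals both times, the player's value picks up his effect on how often his teammates score, not just his own line.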

“It works well for team hitting or for pitchers since they control the run environment.”

This isn’t why it works for team hitting or pitching. BsR works well because you’re looking at the whole system by using team stats (or pitcher stats). The environment doesn’t matter.
