
Lookout Landing

Sabermetrics 101: Batting

Only a few left! Yay!

Prerequisites for Understanding: The Isolation Problem, Linear Weights, Base Runs, Replacement Level, The Run-Win Conversion, Value, Regression, Correlation, Park Effects, Environment, WPA and LI, Data.


On Batting

Gone are the days when the triple crown defined greatness. Instead, we're faced with a dizzying array of statistics, from our old friends batting average and home runs to linear-weights based systems to some frankly impenetrable figures. The ideal, of course, is to weight what a batter does according to what it's actually worth in terms of wins and losses, so our goal is to extract real meaning from the myriad numbers at our disposal.

Usually, we would immediately begin isolating a player's contributions from those of his peers in order to get a better read on said player's actual abilities. However, there's a strong belief in many baseball circles that clutch hitting is a sustainable skill, so let's detour into some murky territory before we set off on our quest to eliminate any interference from other sources. First of all, we have a pretty handy definition of clutch in the form of Win Probability. Our putative clutch hitters will do better in high-leverage situations than our average hitters will, by definition. They should also be contributing less in low-leverage situations; otherwise they're not clutch hitters so much as good hitters. So, taking the difference between their actual WPA contributions and their expected contributions given their overall stat line, we have a metric for clutchness, and we can check to see how stable it is.

Short answer: It isn't.

Long answer: Clutch hitting does show up as a skill, but it is far, far, far more likely (5x or so) to be random than skill-based. It just doesn't correlate well, whether you're looking at split-season values or year-to-year. And after all, if a hitter is so good in the clutch, why doesn't he try that hard all the time?
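
The stability check described above can be sketched in a few lines of Python. This is a minimal illustration, not the method behind the 5x figure: `clutch_score` and `pearson_r` are hypothetical helper names, the expected-WPA input is assumed to have been computed elsewhere from each hitter's overall stat line, and the sample numbers are invented purely to show the mechanics.

```python
from statistics import mean


def clutch_score(actual_wpa, expected_wpa):
    """Clutch = actual WPA minus the WPA expected from the overall stat line."""
    return actual_wpa - expected_wpa


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented (actual WPA, expected WPA) pairs for four hitters, by half-season.
first_half = [(1.2, 0.9), (0.4, 0.6), (2.0, 2.1), (-0.3, 0.1)]
second_half = [(0.8, 1.0), (0.9, 0.5), (1.7, 1.9), (0.2, 0.0)]

clutch_1 = [clutch_score(a, e) for a, e in first_half]
clutch_2 = [clutch_score(a, e) for a, e in second_half]

# A correlation near zero would mean first-half clutch tells us little
# about second-half clutch, i.e. the metric is not stable.
print(round(pearson_r(clutch_1, clutch_2), 3))
```

With real data you would want far more than four hitters, and the same function works for year-to-year pairs as well as split seasons.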

So let's accept that we can ignore context in measuring our hitters. The clear next step is to convert events on the field to runs, using linear weights or Base Runs. I'm partial to Base Runs, myself, but linear weights is slightly easier to implement, and for the most part they give highly similar results. We now have a good idea of how many runs the average single, double, triple, or even error is worth, and we can sum up a player's hitting output in runs and compare it to average/convert to a win value. Of course, we should be remembering to park and league adjust as well.
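
As a rough sketch of the linear weights step, the snippet below sums event counts times run values and converts the result to wins. The weights are illustrative placeholders rather than figures from this series, the ten-runs-per-win conversion is an assumed rule of thumb, the function names are hypothetical, and park and league adjustments are left out to keep it short.

```python
# Illustrative run values per event, relative to average. These are
# placeholder numbers for the sketch, not weights from the article.
LINEAR_WEIGHTS = {
    "1B": 0.47,
    "2B": 0.77,
    "3B": 1.04,
    "HR": 1.40,
    "BB": 0.31,
    "OUT": -0.27,
}

RUNS_PER_WIN = 10.0  # assumed rule-of-thumb run-win conversion


def batting_runs(events, weights=LINEAR_WEIGHTS):
    """Sum run values over a dict of event counts, e.g. {"1B": 100}."""
    return sum(weights[event] * count for event, count in events.items())


def batting_wins(events, weights=LINEAR_WEIGHTS, runs_per_win=RUNS_PER_WIN):
    """Convert batting runs above average to wins above average."""
    return batting_runs(events, weights) / runs_per_win


# An invented season line, used only to exercise the functions.
season = {"1B": 100, "2B": 30, "3B": 5, "HR": 20, "BB": 60, "OUT": 400}
print(round(batting_runs(season), 1), round(batting_wins(season), 2))
```

Swapping in Base Runs would change how the run total is produced but not the final runs-to-wins step.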

Park-adjusted offensive statistics based on linear weights are essentially the cutting edge of evaluating hitters. But are we actually done? I would argue that we're not quite there yet. We've only looked at what happens at the end of a play. It might result in an out, or a double, or a home run, but there's another factor that comes into play between the bat striking the ball and the actual outcome of a play: the defence. In fact, we know this is a big factor - we hear about 'robbed' hits at least once a game, but the assumption is that a batter's defensive 'luck' evens out over the course of the year. There's no reason for this to be true though, so what can we do to mitigate the fact that even the most sweetly struck ball can find itself nestled in a fielder's glove?

Well, we can use the same techniques as we applied when looking at pitchers. With third generation data, we have a pretty good idea of what trajectory class any given batted ball falls into (standard caveats about the reliability of these data apply, of course), which means that applying linear weights or BsR figures based on batted balls to batters leaves you with... the wrong answer. Completely and totally. When we look at pitchers, we can safely assume that they face roughly the same calibre of batters over the course of a season, which justifies splitting information up by standard BIP data. Hitters are much, much quirkier, and they put their own stamp on things. Some are fast. Some are slow. Some hit the ball harder than others. Some hit the ball much harder than others. The run value of a line drive, for example, has a rather extreme range when comparing Albert Pujols to Miguel Cairo. It simply doesn't make sense to use league-wide linear weights on individual batters. A better way of doing things may be to simply generate linear weights on BIP data for a batter's recent career, regress those, and apply them to his batting line. This should give us an idea of what a batter has done without defence getting overly involved. Of course, HITf/x will be helpful as well, especially on line drives and fly balls.
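
One way the "generate, regress, apply" idea could look in code is sketched below. Everything here is an assumption made for illustration: the function names, the batted-ball labels, the choice of a simple weighted blend as the regression, and the regression constant `k` (how many league-average batted balls to mix in) are all hypothetical, and no numbers come from real players.

```python
def regress_to_mean(observed, n, league, k):
    """Shrink an observed rate toward the league rate.

    n is the batter's sample size and k is a hypothetical regression
    constant: the number of league-average batted balls to blend in.
    """
    return (observed * n + league * k) / (n + k)


def defence_neutral_runs(batted_balls, batter_values, league_values, k=200):
    """Value a batting line using regressed, batter-specific BIP weights.

    batted_balls:  {"LD": count, ...} for the season being evaluated
    batter_values: {"LD": (runs_per_ball, sample_size), ...} taken from
                   the batter's recent career
    league_values: {"LD": runs_per_ball, ...}
    """
    total = 0.0
    for ball_type, count in batted_balls.items():
        observed, n = batter_values[ball_type]
        weight = regress_to_mean(observed, n, league_values[ball_type], k)
        total += weight * count
    return total


# Invented inputs: a hitter whose line drives have been worth more than
# the league's, with ground balls close to league average.
season = {"LD": 90, "GB": 200}
career = {"LD": (0.50, 400), "GB": (0.05, 900)}
league = {"LD": 0.35, "GB": 0.04}
print(round(defence_neutral_runs(season, career, league), 1))
```

The smaller a batter's sample for a given batted-ball type, the closer his weight for that type stays to the league figure, which is the point of regressing before applying.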

One element of batting statistics that I've totally neglected thus far is the scale to put our measurements on. There are lots to choose from. Batting average is familiar, as are on-base percentage and OPS. They all suffer from not really being very meaningful in terms of runs scored, but we don't have an obvious scale to turn to - R/9 is out for individuals due to the fact that they have teammates, so the question is very much up in the air.

I also haven't touched on baserunning, which is a neat topic in its own right. I probably won't be able to do it justice here, but it's one of the few times we should be including leverage index in our calculations - you can always choose how aggressive to be depending on the situation. In essence, we want to measure baserunning based on a combination of stealing and advancing on other plays. Stolen bases and times caught stealing are easy to evaluate using run modellers (in fact, certain batting statistics sometimes embed them with hits, walks, and outs), but advancement is an entirely different kettle of fish. In essence, we take the chances for a runner to advance on a single, double, flyout, or groundout, find the league average extra bases and outs generated per chance, and compare it to what our baserunner actually did. This neglects to take into account any luck which might affect the ease of advancing on a play, but it's certainly better than nothing.
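
The advancement comparison described above might be sketched like this. The function name, the play labels, and all rates and counts are invented for illustration, and the sketch stops at extra bases and outs relative to average rather than converting them to runs or weighting by leverage.

```python
def advancement_above_average(chances, league_rates, actual_bases, actual_outs):
    """Compare a runner's advancement to league average over the same chances.

    chances:      {"single": count, ...} opportunities to advance
    league_rates: {"single": (extra_bases_per_chance, outs_per_chance), ...}
    actual_bases: extra bases the runner actually took
    actual_outs:  outs the runner actually made on the bases

    Returns (extra bases above average, outs above average).
    """
    expected_bases = sum(n * league_rates[play][0] for play, n in chances.items())
    expected_outs = sum(n * league_rates[play][1] for play, n in chances.items())
    return actual_bases - expected_bases, actual_outs - expected_outs


# Invented example: 40 chances on singles and 10 on doubles.
chances = {"single": 40, "double": 10}
league_rates = {"single": (0.30, 0.02), "double": (0.40, 0.03)}
bases, outs = advancement_above_average(chances, league_rates, 20, 1)
print(round(bases, 2), round(outs, 2))
```

A positive first number and a negative second number together describe a runner who took more extra bases than average while making fewer outs doing it.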

In any case, these are the avenues I'd be pursuing in order to evaluate total offensive performance. It's worth bearing in mind that we're by now very, very good at evaluating hitting, and improvements to our current top-of-the-line metrics (TAv/EqA, wOBA) are only going to result in very marginal advances in our understanding of talent level, so don't stress out if we never push much further into the weird world of extracting pure offensive value from the offence/defence dynamic.


Comments

You are awesome
Only somewhat related, but have you seen this yet?

From Dave Allen at FG, that’s Chipper Jones Z-Swing vs good to bad pitchers. I’m really curious if other good hitters have a similar strategy or if this is exclusive to Jones.

Yes.

I might post about it at USSM, but who knows.

I’d love to try this on others… but then, I would pay serious money just to see Dave’s MySQL queries and R entries. Not to have them explained; I’m too stupid – it would take forever. I just want to know how to do it.

(And again, that story and that chart are just amazing to me. I love it so, so much.)
Huh, the pic disappeared
I can see it
Yeah that was weird, it was gone for a little while then came back
Your discussion of "clutchness" made me look at Fangraphs again

Ichiro has had a positive clutch for every year of his career. If I’m reading it right he has also accumulated 6.3 wins from being clutch over his 8-year career. That comes out to around 0.8 wins per year that we are missing from his valuation.

The clutchness we measure is probably 90% luck; however, there are some cases I could see making that would change how we look at some players. It’s an interesting topic because I don’t think there is any way for us to pick out which players would be “clutch”, and it’s so hard to say with any certainty that it remains an area of unfinished business.

I personally prefer "clutchiness"
Question

How are plays like sacrifice flies and sac bunts factored into stats like wOBA?

StatCorner includes the former and subtracts out the latter

Not sure on FG.
