Sabermetric Series, Part 4: ERA Estimators

Welcome to Part 4 of this Sabermetric Series, where we will be moving from hitters to pitchers. We’re going to start with the basics and look into the benefits of using various ERA estimators. We’ll go over a whole bunch of fun acronyms, such as:

LOB%
K-BB%
FIP
xFIP
SIERA
ERA-/FIP-/xFIP-

OK, so why all these random letters? What’s wrong with good ol’ ERA (acknowledging that it, too, is just letters)? Well, ERA tells us a pitcher’s past results, but it doesn’t tell us whether those results were deserved. Nor is it very predictive of what we can expect from a pitcher going forward. These different ERA estimators take into account things such as BABIP, LOB%, BB%, and more. They give us a more complete picture of what a pitcher’s ERA should have been, which also gives us a better sense of what to expect in the future. After all, if we knew what was going to happen, we’d win all of our leagues. That’s the goal.

Previous Installments

Sabermetric Series, Part 1: Quality of Contact and Batted Balls

Sabermetric Series, Part 2: Applying Metrics to Splits

Sabermetric Series, Part 3: Plate Discipline

The Guts

We’ll start with what goes into most of the estimators. BABIP is a big one. We went over BABIP extensively in Part 1, so you know the deal there. ERA estimators take into account how far a pitcher’s BABIP is from league average and normalize it. This doesn’t always work, because certain pitchers can maintain a lower BABIP by inducing soft contact, but we’ll get to that in a later edition.

Another major spoke in the estimation wheel is LOB% (Left On Base Percentage), otherwise known as Strand Rate. This is the percentage of baserunners a pitcher is able to prevent from scoring, or “strand” on the bases. In 2017, the league average LOB% was 72.6%. Most estimators take into account how far above or below the average a pitcher is and normalize that rate, same as with BABIP. Of course, this doesn’t always work perfectly, especially with relievers. You almost have to treat them separately from starters. Their sample sizes are much smaller than those of starters, so you can get some pretty crazy numbers even over the course of 65 innings. For example, a reliever who walks a lot of batters but also strikes out a lot can maintain a well-above-average LOB% for extended periods of time. For the most part, though, if a pitcher is quite a bit above or below league average, you can expect regression toward the mean.
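For the curious, here is a rough sketch of how strand rate is commonly computed. The formula is the one FanGraphs publishes for LOB%, not something spelled out earlier in this series, and the function name and the season line in the example are made up purely for illustration.

```python
LEAGUE_LOB_PCT_2017 = 0.726  # league average quoted above


def lob_pct(hits, walks, hit_batters, runs, home_runs):
    """Strand rate as a fraction between 0 and 1.

    Uses the commonly published form:
    (H + BB + HBP - R) / (H + BB + HBP - 1.4 * HR)
    """
    baserunners = hits + walks + hit_batters
    stranded = baserunners - runs
    # The 1.4 * HR term adjusts the denominator for home runs.
    opportunities = baserunners - 1.4 * home_runs
    return stranded / opportunities


# Hypothetical season line, invented for illustration only.
example = lob_pct(hits=150, walks=50, hit_batters=5, runs=70, home_runs=20)

# Positive means more runners stranded than the league average.
gap_vs_league = example - LEAGUE_LOB_PCT_2017
```

A pitcher with a large positive or negative gap is the kind of regression candidate described above, with the usual caveat about relievers and small samples.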

Another factor that goes into some (not all) estimators is HR/FB%. As with BABIP, we’ve gone over HR/FB% before, so we don’t need to rehash it. However, it is important to know that these estimators regress a pitcher’s HR/FB% to the 12.7% league average. That can be misleading in some cases, but we’ll get into that in a minute.

Finally, before we get to the actual metrics, we should look at K-BB%. Strikeouts and walks independently factor into many of the ERA estimators, but K-BB% can be useful in itself. K-BB% is the pitcher’s strikeout percentage minus his walk percentage. Strikeouts and walks aren’t everything, but generally speaking, the higher the K-BB% the better. Strikeouts? Good! Walks? Bad! It’s a simple concept to comprehend, and one that can be very beneficial. You can even sort by K-BB% on the FanGraphs leaderboard to see who was best in that department over a certain period of time. The league average K-BB% in 2018 was 13.8%, with the average K% being 22.3% and BB% being 8.5%. Justin Verlander led the league at 30.4%, while the worst mark among qualified pitchers belonged to Lucas Giolito at 4.5%.
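If you want to compute K-BB% yourself, the arithmetic is about as simple as it gets. The sketch below assumes both rates are taken per batter faced, which is how K% and BB% are normally defined. The function name and the example pitcher are hypothetical.

```python
def k_minus_bb_pct(strikeouts, walks, batters_faced):
    """Return K-BB% in percentage points (K% minus BB%)."""
    k_pct = 100.0 * strikeouts / batters_faced
    bb_pct = 100.0 * walks / batters_faced
    return k_pct - bb_pct


# 2018 league rates quoted above.
LEAGUE_K_PCT_2018 = 22.3
LEAGUE_BB_PCT_2018 = 8.5
league_k_minus_bb = LEAGUE_K_PCT_2018 - LEAGUE_BB_PCT_2018  # about 13.8

# Hypothetical pitcher: 200 strikeouts and 50 walks against 800 batters.
example = k_minus_bb_pct(strikeouts=200, walks=50, batters_faced=800)
above_average = example > league_k_minus_bb
```

The made-up pitcher here works out to 25.0% minus 6.25%, comfortably above the 13.8% league mark.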

The Estimators

Now that we know what goes into the metrics, we can look at the individual estimators and see how they can benefit us. We’ll start with the most widely used metric, FIP. FIP, or Fielding Independent Pitching, is based solely on what the pitcher himself controls, without taking his defense into consideration. That means strikeouts, walks, hit batters, and home runs allowed. It also takes “luck” out of the equation by removing variance in BABIP. It’s on the same scale as ERA for ease of use. The league average varies by season, but in 2018 it was 4.15. The best mark belonged to Jacob deGrom at 1.99, while the worst was (again) Lucas Giolito at 5.56. Ouch. He was that bad over 32 starts. At least he was consistent.
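To make that concrete, here is a minimal sketch of the standard published FIP formula, which weights home runs by 13, walks and hit batters by 3, and strikeouts by 2, then adds a constant that is set each season so league FIP lines up with league ERA. The constant used below is only a placeholder in the usual range, not the actual 2018 value, and the stat line is invented.

```python
def fip(home_runs, walks, hit_batters, strikeouts, innings, fip_constant):
    """Fielding Independent Pitching on the ERA scale.

    innings should be a true decimal (200 1/3 innings is 200.333..., not 200.1).
    """
    weighted_events = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return weighted_events / innings + fip_constant


# Assumption: placeholder constant for illustration, not a real season's value.
ILLUSTRATIVE_CONSTANT = 3.20

# Hypothetical season line, invented for illustration only.
example = fip(
    home_runs=20,
    walks=50,
    hit_batters=5,
    strikeouts=200,
    innings=200.0,
    fip_constant=ILLUSTRATIVE_CONSTANT,
)
```

Notice that hits on balls in play never appear in the inputs, which is how FIP sidesteps BABIP and the defense behind the pitcher.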

Another helpful tool that FanGraphs offers is ERA-FIP, which sorts pitchers by the difference between ERA and FIP. This makes for a nice black-and-white list, telling us whose ERA is running well above or below what their FIP says it should be. You can use this tool to find potential buy-low targets, but you also have to take it with a grain of salt. A lot of the leaders have just been pitching badly, so who cares if their FIP is a full run lower than their ERA when their ERA is 6.25? That’s still a bad pitcher you don’t want.
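One way to apply that grain of salt is to sort by ERA minus FIP only after throwing out pitchers whose FIP is poor to begin with. The sketch below is just one possible filter, not how FanGraphs builds its leaderboard. The cutoff defaults to the 2018 league average FIP of 4.15 mentioned earlier, and the pitchers are placeholders.

```python
def buy_low_candidates(pitchers, max_fip=4.15):
    """Rank pitchers by ERA minus FIP, keeping only those with a decent FIP.

    pitchers is a list of (name, era, fip) tuples.
    """
    rows = [
        (name, round(era - fip, 2))
        for name, era, fip in pitchers
        if fip <= max_fip
    ]
    # Largest positive gap first: ERA furthest above FIP.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder pitchers, invented for illustration only.
sample = [
    ("Pitcher A", 4.80, 3.60),  # unlucky, but solid underneath
    ("Pitcher B", 6.25, 5.25),  # FIP a full run lower, still bad
    ("Pitcher C", 3.10, 3.40),  # ERA already better than FIP
]

ranked = buy_low_candidates(sample)
ranked_names = [name for name, _ in ranked]
```

Pitcher B has a big ERA-FIP gap but gets filtered out anyway, which is exactly the 6.25 ERA trap described above.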

xFIP is similar to FIP, but the main difference is that it accounts for fly balls and normalizes a pitcher’s HR/FB%, as opposed to just counting home runs like FIP. If a pitcher has a 20% HR/FB%, his xFIP will likely be lower than his ERA because it doesn’t think he’ll continue to give up so many homers. Conversely, if a pitcher has limited batters to a 5% HR/FB%, said pitcher’s xFIP will likely be higher than his ERA because xFIP thinks more homers are coming. While this generally gives you a good idea which way a pitcher can be expected to go in the future, there are several factors to take into account. Does this pitcher pitch in an extreme hitting or pitching environment? Does this pitcher induce a lot of fly balls? Does this pitcher have a history of limiting home runs or of allowing an excessive number of them? Given these contextual factors, there are circumstances where xFIP can be misleading.
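Here is a small sketch of that swap in action: xFIP uses the same weights as FIP but replaces actual home runs with fly balls times the league HR/FB rate. The 12.7% rate is the league average quoted earlier. The constant is a placeholder rather than a real season’s value, and the pitcher is hypothetical.

```python
LEAGUE_HR_PER_FB = 0.127  # league average HR/FB% quoted earlier
ILLUSTRATIVE_CONSTANT = 3.20  # assumption: placeholder, not a real season's value


def fip(home_runs, walks, hit_batters, strikeouts, innings, constant):
    """Standard FIP: actual home runs allowed go into the formula."""
    events = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return events / innings + constant


def xfip(fly_balls, walks, hit_batters, strikeouts, innings, constant,
         league_hr_per_fb=LEAGUE_HR_PER_FB):
    """xFIP: swap actual home runs for fly balls times the league HR/FB rate."""
    expected_home_runs = fly_balls * league_hr_per_fb
    return fip(expected_home_runs, walks, hit_batters, strikeouts, innings, constant)


# Hypothetical pitcher with a 20% HR/FB%: 40 homers on 200 fly balls.
actual = fip(40, 50, 5, 200, 200.0, ILLUSTRATIVE_CONSTANT)
expected = xfip(200, 50, 5, 200, 200.0, ILLUSTRATIVE_CONSTANT)
homer_luck = actual - expected  # positive: xFIP expects fewer homers going forward
```

With a 20% HR/FB%, the xFIP comes out almost a full run lower than the FIP for this made-up line. Whether that gap is bad luck or a real home run problem is where the park, fly ball rate, and track record questions above come in.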

The other primary ERA estimator is SIERA, or Skill-Interactive Earned Run Average. SIERA takes into account a pitcher’s batted ball profile while also rewarding high strikeout rates more than FIP does. If you generate an above-average number of ground balls, fly balls, or pop-ups, SIERA rewards you for that as well, since it’s a skill that can sustain a lower BABIP. It also doesn’t punish walks as much as FIP does if you have a high ground ball rate, because more ground balls lead to more double plays. SIERA is also park-adjusted, taking into account whether you’re pitching in a more extreme hitting or pitching environment.

It also bears mentioning that Baseball Prospectus has its own model called DRA, or Deserved Run Average. That metric also takes into account platoon splits, catcher framing, and quality of opponents, among other things. It’s an advanced tool, but it requires a paid subscription to use, so I won’t go into detail on it here.

Finally, there are ERA-, FIP-, and xFIP-. These are basically the pitching equivalents of wRC+ or OPS+, putting everything on a scale where 100 is league average. They adjust for park and league, and every point above or below 100 is one percentage point of deviation from the league average. Since these are run-prevention stats, lower is better: a 90 is 10 percent better than average. These aren’t especially beneficial in fantasy, but they can be good tools when comparing different pitchers.
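As a rough illustration of the scale, the sketch below indexes a stat against the league average. It deliberately skips the park adjustment that the real versions include, so treat it as a simplification and don’t expect it to match the published numbers. The function name is made up, and the inputs are the 2018 FIP figures quoted earlier.

```python
def minus_stat(pitcher_value, league_value):
    """Index a run-prevention stat so that 100 is league average.

    Simplification: no park adjustment, unlike the published ERA-/FIP-/xFIP-.
    Lower is better.
    """
    return 100.0 * pitcher_value / league_value


LEAGUE_FIP_2018 = 4.15  # league average FIP quoted earlier

average_pitcher = minus_stat(4.15, LEAGUE_FIP_2018)  # league average by definition
best_fip_2018 = minus_stat(1.99, LEAGUE_FIP_2018)    # best FIP quoted earlier
worst_fip_2018 = minus_stat(5.56, LEAGUE_FIP_2018)   # worst FIP quoted earlier
```

The handy part of the index is that it travels across seasons: a 4.15 FIP means something different in a high-offense year than in a low-offense one, but 100 is always average.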

That does it for this edition, our first foray into pitching Sabermetrics. Next time we’ll look at quality of contact and plate discipline for pitchers. That one is going to be a lot of fun. Hope to see you there!