Building A New ERA Estimator (Stuff-ERA)

Previously, I wrote a series of articles on two new metrics I developed, which I named Stuff and Command. While analyzing the numbers turned up a few interesting pitchers, I was disappointed to see that neither metric correlated strongly with ERA. To develop a better correlation with ERA, I realized I needed to include some form of contact results.

ERA Estimator Components

wOBA Influence

Using the same strategy I previously used to develop Stuff and Command, I decided to determine a location-based expected wOBA. To do this, I found all balls in play and broke them up into groups based on pitch type and count. Then I built a model that predicted wOBACon based solely on the location of the pitch. Using this model, I developed an xwOBACon value for every pitch thrown since 2015.
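To make the idea concrete, here is a minimal Python sketch of a location-based expected wOBACon. It is not the model used in this article, whose form is not specified here. It swaps in the simplest possible stand-in: the average wOBA value on contact within each pitch type, count, and location bucket. The field names (pitch_type, count, plate_x, plate_z, woba_value) and the half-foot grid size are assumptions for illustration.

```python
import math
from collections import defaultdict

def location_cell(plate_x, plate_z, size=0.5):
    """Map a pitch location (in feet) to a coarse grid cell."""
    return (math.floor(plate_x / size), math.floor(plate_z / size))

def bucket_key(pitch, size=0.5):
    """Group by pitch type, count, and location cell (hypothetical field names)."""
    cell = location_cell(pitch["plate_x"], pitch["plate_z"], size)
    return (pitch["pitch_type"], pitch["count"], cell)

def fit_xwobacon(balls_in_play, size=0.5):
    """Average wOBA value on contact for every bucket seen in the data."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [sum of wOBA values, count]
    for bip in balls_in_play:
        entry = totals[bucket_key(bip, size)]
        entry[0] += bip["woba_value"]
        entry[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}

def predict_xwobacon(model, pitch, size=0.5, fallback=None):
    """Look up the expected wOBACon for any pitch, put in play or not."""
    return model.get(bucket_key(pitch, size), fallback)
```

A real model would likely smooth across neighboring locations rather than use hard grid cells, but the lookup shows how every pitch can receive an expected value from location alone.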

Then, using the same random effects design, I determined the impact of each pitcher on the resulting wOBACon. The results were positive, with Mike Soroka being the best pitcher on contact in 2019 and Corbin Burnes being the worst within my sample.

While this did not match up perfectly with the other components I used to build my Stuff and Command metrics, I realized that to best correlate with ERA, I would need to include some form of contact management.

Stuff-ERA

Now that I had a contact management component, I set about building a model. In my initial runs of the model, I weighted all four underlying components equally: Whiffs, In Zone Swings, Out of Zone Swings, and wOBA. I decided instead to use a linear regression equation, fit on the sample of pitcher seasons above 120 innings, to help find ideal weightings of the various metrics. Unsurprisingly, the wOBA influence was most important, followed by Whiffs, In Zone, and lastly Out of Zone.
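For readers who want to see how regression weights like these can be found, below is a small sketch of ordinary least squares in plain Python. It assumes each pitcher season is a row of component scores and the target is that season's ERA. The function name and data layout are hypothetical, and the actual fit behind Stuff-ERA may have been done differently.

```python
def fit_linear_weights(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: one sequence of component scores per pitcher season.
    targets: the matching ERA values.
    Returns [intercept, weight_1, ..., weight_n].
    """
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("components are collinear; weights are not unique")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]
```

Restricting the input rows to seasons above an innings cutoff (120 in this article) before calling the function mirrors the sampling choice described above.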

Additionally, I rethought the way I determined the Command scores. Instead of scaling the different Command ratings to show percent better than average, I converted each of the four Command components to a Z-score, which shows how many standard deviations better than average a given pitcher was. This new Command became the fifth variable in my regression equation. Initially, the results were extremely positive, but I noticed a flaw when comparing to FIP.
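The Z-score conversion itself is simple. The sketch below assumes one list of values per Command component, with the league mean and standard deviation taken from that same list. How the four Z-scores are then combined into a single Command value is not shown, since that step is not spelled out here.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw component values to standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    if sigma == 0:
        # Every pitcher is identical on this component, so nobody stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

For a component where lower is better, the sign would need to be flipped so that a positive score always means better than average.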

One of the components of FIP is a seasonal constant, which helps to account for the scoring environment of a given season. For example, 2015 and 2019 had completely different run-scoring environments. Adding in a seasonal constant helps to improve the correlation with ERA. Within my 120-inning sample, my Stuff-ERA has a correlation of 0.81 with ERA, compared to 0.78 for FIP over the same period. Since I was curious whether this was related to the sample I picked, I plotted the correlation at different innings benchmarks.
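As a reference for how the seasonal constant works in FIP, here is a short sketch using the standard public FIP formula. The constant is chosen so that league-wide FIP equals league ERA for that season. The league totals in the usage note are made-up round numbers for illustration, not real data.

```python
def fip_core(hr, bb, hbp, k, ip):
    """The unscaled part of FIP: (13*HR + 3*(BB+HBP) - 2*K) / IP."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip

def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """Seasonal constant that lines league FIP up with league ERA."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)

def fip(hr, bb, hbp, k, ip, constant):
    """FIP for one pitcher, given that season's constant."""
    return fip_core(hr, bb, hbp, k, ip) + constant
```

With illustrative league totals of 100 HR, 300 BB, 0 HBP, 600 K, and 1000 IP at a 4.50 league ERA, fip_constant returns 3.5. A similar constant for another estimator could be built the same way, as league ERA minus the league-wide raw estimate, though that is an assumption rather than the exact approach used here.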

As you can see, as early as the 60-inning mark, my new estimator had a stronger correlation with ERA than FIP did. This is only descriptive, obviously, but the gap in correlation grows as the innings threshold increases.
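The comparison across innings benchmarks can be reproduced with a few lines of Python. This sketch assumes each pitcher season is a dict with hypothetical keys "ip", "era", and one key per estimator (for example "fip" or "stuff_era"). It computes the Pearson correlation with ERA for every season at or above each threshold.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

def correlation_by_innings(seasons, estimator, thresholds, min_sample=3):
    """Correlation between an estimator and ERA at each innings cutoff."""
    results = {}
    for threshold in thresholds:
        sample = [s for s in seasons if s["ip"] >= threshold]
        if len(sample) < min_sample:
            continue  # too few seasons to say anything
        results[threshold] = pearson(
            [s[estimator] for s in sample],
            [s["era"] for s in sample],
        )
    return results
```

Calling it once per estimator with the same thresholds gives two series that can be plotted against each other.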

Predictive

The power of any good ERA estimator lies not only in its descriptive ability but also in its predictive power. Using the same idea, I compared seasonal pairs to see how well FIP and my Stuff-ERA predicted ERA in the following season.

Unfortunately, the results are much less promising when we try to predict the future. FIP strongly outperforms my new metric in the predictability test. This is mostly due to the wOBA component of my analysis. The four other components of my Stuff-ERA estimator each have a year-to-year correlation of over 0.65, but the wOBA component only has a year-to-year correlation of 0.07. Essentially, this means that a pitcher's ability to control contact in one season tells us almost nothing about his ability to do so in the next. This seems to fit with everything we know about BABIP and the debate over whether contact management skills exist at all.
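A seasonal-pairs test like this one can be sketched as follows. The code assumes one row per pitcher season with hypothetical keys "pitcher" and "season" plus the metric columns. It pairs a value from season N with a value from season N+1 for the same pitcher. Passing the same column twice measures year-to-year stability, while pairing an estimator with next season's ERA measures predictive power.

```python
import math

def season_pairs(rows, x_metric, y_metric):
    """Pair x_metric in season N with y_metric in season N+1, per pitcher."""
    by_key = {(r["pitcher"], r["season"]): r for r in rows}
    pairs = []
    for (pitcher, season), row in by_key.items():
        following = by_key.get((pitcher, season + 1))
        if following is not None:  # skip pitchers with no next season
            pairs.append((row[x_metric], following[y_metric]))
    return pairs

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant series
    return sxy / math.sqrt(sxx * syy)

def paired_correlation(rows, x_metric, y_metric):
    """Correlation between this season's x_metric and next season's y_metric."""
    pairs = season_pairs(rows, x_metric, y_metric)
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])
```

In practice an innings minimum would be applied to both seasons in each pair. That filter is left out here to keep the sketch short.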

However, this does show that there may be some credible value in a regressed version of Stuff-ERA, similar to xFIP, which uses a league-average HR/FB rate to account for the fact that pitchers do not tend to possess repeatable contact management skills.
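One way such a regressed version could work is to pull each pitcher's wOBA component toward the league average before it enters the estimator. The sketch below is an assumption about how that might be done, not a finished method. The reliability parameter is hypothetical: 0.0 replaces the component with the league average entirely (the xFIP-style approach), while 1.0 leaves the observed value untouched.

```python
def regress_to_mean(observed, league_average, reliability):
    """Shrink an observed value toward the league average.

    reliability is the share of the pitcher's deviation from average
    that is kept, between 0.0 (none) and 1.0 (all of it).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return league_average + reliability * (observed - league_average)
```

Using the component's year-to-year correlation as a rough stand-in for reliability would keep only a small fraction of each pitcher's deviation, which is close to the full league-average substitution that xFIP makes for HR/FB.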

Conclusion

Overall, I was able to use the research I have done on Stuff and Command to build an ERA estimator that was more descriptive than FIP. While there are still some major flaws due to the fluctuating nature of wOBACon, I believe there is room for growth in both the metric and its predictive power. I think this opens up a whole new group of pitchers for us to analyze who may excel according to my metrics but lag behind in FIP. For the 2019 season, Reds ace Luis Castillo paced baseball in Stuff-ERA, but his FIP claimed that his 2019 ERA was better than his underlying numbers suggested. I plan to look deeper into specific cases like Castillo to see if there are biases in both FIP and my new Stuff-ERA and how we can use these biases to better evaluate pitching talent.
