Today we present Part III of our series on fielding metrics! We are joined by none other than Mitchel Lichtman, affectionately known as MGL, who will talk about his Ultimate Zone Rating and a whole lot more!!
Joe Hamrahi (JH): Generally speaking, but in as much detail as you can, please describe Ultimate Zone Rating.
Mitchel Lichtman (MGL): Thanks for asking. Being an analyst and not generally a promoter, I love these kinds of questions, rather than, “What do you think of the stats v. scouts debate?” or some such thing.
Actually, for a great description of UZR, read Dewan’s Fielding Bible. They use essentially the same methodology.
UZR uses a STATS Inc. database, rather than BIS, although I assume that the databases are very similar; at least they should be. I break the field down into “grids” or zones, much smaller than the zones that STATS uses for its zone ratings, based upon distance and location. The location is a “pie slice” from foul line to foul line, with 22 slices across the field, each one around 4 degrees wide of course.
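As a rough illustration of the grid described above, the following sketch maps a batted-ball location to a zone. Only the 22 slices come from the interview; the function name `zone_for`, the angle convention (0 degrees at the left-field line, 90 at the right-field line), and the fixed 30-foot ring depth are assumptions made for the example, not MGL's actual routine.

```python
NUM_SLICES = 22          # from the interview: 22 pie slices, foul line to foul line
FIELD_ANGLE_DEG = 90.0   # fair territory spans 90 degrees
DISTANCE_STEP_FT = 30.0  # assumed ring depth (hypothetical fixed value)


def zone_for(angle_deg, distance_ft):
    """Return a (slice_index, ring_index) zone for a batted ball.

    angle_deg is measured from the left-field foul line (0) to the
    right-field foul line (90); distance_ft is measured from home plate.
    """
    if not 0.0 <= angle_deg <= FIELD_ANGLE_DEG:
        raise ValueError("angle is outside fair territory")
    if distance_ft < 0:
        raise ValueError("distance cannot be negative")
    slice_width = FIELD_ANGLE_DEG / NUM_SLICES  # roughly 4.09 degrees
    # A ball exactly on the right-field line belongs to the last slice.
    slice_index = min(int(angle_deg // slice_width), NUM_SLICES - 1)
    ring_index = int(distance_ft // DISTANCE_STEP_FT)
    return slice_index, ring_index
```

With these assumptions, a ball hit 310 feet at 44 degrees lands in slice 10, ring 10.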
For example, let’s say that at point a, b, a line drive is caught 20% of the time, and at point c, d, a little further from a typical fielder location, it is caught 22% of the time, and at point e, f, a little further still, it is caught 11% of the time. Clearly you don’t want to use those exact numbers. You would want to use something like 22, 16, and 11, if you catch my drift. I don’t know whether John “smooths” out those probabilities or not. That is the proper way to do it.
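One simple way to sketch the idea of smoothing is to force the catch rates to be non-increasing as the ball gets further from the fielder, pooling neighboring points that violate that order. This is a swapped-in technique (pool adjacent violators) chosen for illustration; the interview does not say which smoothing function should be used, and the function name and equal default weights are assumptions.

```python
def smooth_non_increasing(rates, weights=None):
    """Pool adjacent violators so catch rates never rise with distance.

    rates: observed catch rates ordered from nearest to furthest point.
    weights: optional positive sample sizes per point (assumed equal if omitted).
    """
    if weights is None:
        weights = [1.0] * len(rates)
    # Each block holds [weighted mean, total weight, number of points pooled].
    blocks = []
    for rate, weight in zip(rates, weights):
        blocks.append([float(rate), float(weight), 1])
        # Merge while a further block has a higher rate than the nearer one.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            mean_b, w_b, n_b = blocks.pop()
            mean_a, w_a, n_a = blocks.pop()
            total = w_a + w_b
            pooled = (mean_a * w_a + mean_b * w_b) / total
            blocks.append([pooled, total, n_a + n_b])
    smoothed = []
    for mean, _, count in blocks:
        smoothed.extend([mean] * count)
    return smoothed
```

Applied to the raw 20, 22, 11 above, this pools the first two points and gives 21, 21, 11. That differs from the illustrative 22, 16, 11 in the text, which also reshapes the curve between points; a fuller smoothing function would do both.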
One reason that I absolutely cannot use the “coordinate methodology” (for lack of a better term) with no “smoothing” is that I have many more parameters than just distance and location. The sample sizes for each “bucket” would be minuscule if I did not use segments of the field that were a lot larger than one x, y coordinate. Of course, the proper way is to use a “smoothing function” and x, y coordinates, but alas I do not. Other analysts working for the Cardinals are presently working on this better methodology. That segues nicely into the next question.
Oh, before we get into that, I do essentially the same thing with the data and the baseline probabilities as John does. If a ball is turned into an out, that fielder gets credit for that portion of a “ball” that he normally does not convert into an out. For example, if he normally converts that ball (with all the concurrent parameters) into an out 60% of the time, he gets credit for .4 outs. Converting “balls” into runs is simple (and I don’t know why John did not do that before he wrote his book). An out is worth the value of an out (around .28 runs) plus the value of an average hit for that location/type of batted ball, around .5 for an IF’er and .6 for an OF’er (roughly), depending on the location and type of course. So a ball caught (as opposed to not caught) is worth around .8 runs.
If a ball is caught, no other fielder is “docked” any balls/runs. If a ball is not caught, then any fielder who typically (league-wise) catches that location/type of ball at least some of the time, gets docked some number of balls/runs (equal to the percentage of time an average fielder catches that type/location ball).
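The credit and debit rules in the last two paragraphs can be sketched as follows. The .28 run value of an out and the roughly .5 and .6 hit values are the approximate figures quoted above; the function name `score_play` and the dictionary layout are hypothetical, and a real implementation would look up the baseline rate for each ball's full set of parameters.

```python
OUT_VALUE = 0.28  # approximate run value of an out, as quoted in the interview


def score_play(avg_out_rates, fielder_who_made_out, hit_value):
    """Return runs credited (+) or docked (-) per fielder for one batted ball.

    avg_out_rates: position -> league-average rate of converting this
        type/location of ball into an out.
    fielder_who_made_out: position that recorded the out, or None for a hit.
    hit_value: average run value of a hit of this type/location
        (roughly .5 for infielders, .6 for outfielders).
    """
    run_swing = OUT_VALUE + hit_value  # value of an out versus a hit
    if fielder_who_made_out is not None:
        # Ball caught: only the fielder who made the play is credited,
        # and only for the share he does not normally convert.
        rate = avg_out_rates[fielder_who_made_out]
        return {fielder_who_made_out: (1.0 - rate) * run_swing}
    # Ball not caught: every fielder who sometimes makes this play is
    # docked in proportion to the league-average conversion rate.
    return {
        position: -rate * run_swing
        for position, rate in avg_out_rates.items()
        if rate > 0
    }
```

For the 60% example above, `score_play({"SS": 0.6}, "SS", 0.5)` credits the shortstop with .4 of a ball, or about .31 runs.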
For outfielders I track fly balls, line drives, and pop flies, although I group pop flies and fly balls together. For IF’ers, I only track ground balls, although there is some evidence that catching line drives is more of a skill than you might think, at least for middle infielders. I also like the way John tracks certain pop flies between the infield and outfield. I do not, but I may in the future (as well as track the line drives).
I don’t do bunts separately, which John does. I treat them the same as any ground ball. One reason I think that he treats them separately is that I don’t think he uses distance as a parameter for the ground balls. I do. So a “caught” bunt would be a short ground ball and even a bunt hit would probably be a short ground ball as well. So I am not sure I need to track bunts separately, although it would probably be better (because of the positioning of the IF for one thing, in potential bunt situations).
I also don’t track good and bad “catches” (scoops and the like) by the first baseman, although that can be done in a number of ways (for example, in the database, STATS indicates whether the first baseman made a good play or not on a thrown ball from another infielder). Again, maybe I will include this in the future.
It is technically not part of UZR, but I definitely track outfielder throws, in the same way that John does (runner advances and runner kills and other assists), although again, I present those results in terms of runs saved or cost above/below average so that it can simply be added to UZR runs.
I also treat errors (both roe and non-roe errors) separately. I am not sure how John handles them. They should be treated separately. Contrary to popular belief, even though the result is around the same, an error, as far as a fielder’s runs saved/cost, is not the same thing as a hit.
Finally, I have a separate rating for turning the double play for all infielders. Basically I split the responsibility 50/50 between the fielder and pivot man. I don’t know if a 50/50 split is correct, but I do it nonetheless. Again, an infielder “turning the DP” is expressed as runs saved/cost and can be added to their UZR’s. Since I track errors and “range” separately in UZR, I sometimes present them separately (to give the reader an idea of a fielder’s “range skill” and “error skill”), although UZR is technically “range plus errors.”
FWIW, a fielder’s error runs involve much more luck than their range runs, and the spread (variance) in runs saved/cost for any number of opportunities, is much smaller for errors than for range.
OK, now on to the question about the parameters.
JH: What factors do you consider when determining the probability of a ball being turned into an out?
MGL: I’m glad you asked! Type (hard, medium, soft, same as John and BIS I think), location (as I said, distance, in typically 30- to 35-foot increments, and 22 pie slices from foul line to foul line), the handedness of the batter (which implicates fielder positioning and sub-speeds within the three classes of speed), the baserunners and outs (which also implicate fielder positioning), and the G/F proclivity of the pitcher (which implicates speed). I hope I didn’t miss any, but I could have.
JH: Do ballparks play a role in your calculation? If so, to what extent?
MGL: Yes, I compute park factors, in the same way that regular offensive and pitching park factors are calculated, using up to 13 years of park data and using a regression formula which includes the physical characteristics and ambient conditions of a park.
It is most important for certain parks and locations within a park, like all of Coors, and left field at Fenway and Minute Maid. Also, some parks have much faster infields than others. For example, the ARI infield is lightning fast, due to both the altitude and the fact that it is hard and the grass is short, and some of the newer breeds of artificial turf (like Nexturf) are just as fast as grass. BTW, it is true that traditional artificial surfaces are faster than grass, but it is not true that there are more ground ball singles on turf than grass (they are about the same). Why is that, you ask? More infield singles and bunt singles on grass! But I digress.
I don’t think that John uses park factors. He should of course. It doesn’t make that much of a difference in most parks, but it makes quite a bit of difference in some. They are a little tricky to apply of course. Ideally, for any metric, if we had enough data, we would want to use road data only (plus perhaps a small percentage of home data adjusted for HFA), to get rid of any home park bias. But since we are always limited by the sample size of our data (until players start playing for 50 years or so with little change in talent), it helps to double our sample size by using home data as well, and then do the best we can to park adjust that home data.
JH: Describe the role of fielder positioning in your model and what effect it has on results.
MGL: Good question. Like most of the models and metrics out there (PMR, ZR, Dewan’s Plus/Minus System, UZR, etc.), unique (where a particular fielder likes to play or is directed to play by his coaches, or even is forced to play due to the park) fielder positioning is inherent in the results. We don’t track (no data that I am aware of does) fielder positioning before a play evolves. So really the results of UZR and all of the similar methodologies, as far as I am aware, really measure range and positioning. If one fielder is better at positioning than another, then he will likely have a better UZR even if he has the same or worse range.
For example, they say that positioning is what made Ripken so good. I don’t know, although I have seen him play quite a bit (I don’t trust my eyes very much at all when it comes to evaluating baseball player talent – at least not fielding talent).
The metric cannot separate the two (range and positioning). There is no way for us to know where a fielder is playing based on the data. Actually, I take that back. It is theoretically possible to estimate a fielder’s average position from the scatter plot of the balls fielded and not fielded, but it would be a fairly vague inference and it wouldn’t change a fielder’s UZR anyway.
Most people, myself included, feel that a fielder’s positioning, good or bad, should be part of his skill set. Of course, what if a coach misdirects where a fielder should play, or he only plays in a sub-optimal location because some other adjacent fielder is exceptionally good or bad? That fielder would unfairly get shortchanged in his UZR results. The converse is also true to some extent. If a fielder were correctly playing in an unusual location because his pitching staff has an unusual distribution of BIP’s, his fielding skill would actually be overrated. Some people have suggested that is why I have Swisher rated so good in RF.
As far as positioning due to baserunners and outs, I take care of that in my adjustments or parameters that I discussed earlier. Although Dewan does not do that, it should all even out in the long run.
In the short run, my numbers are going to be better than his because of all the adjustments I do. The most important adjustment that should be done (besides park factors), as it may not even out in the long run, especially if a player stays with a particular team for a long time, is the pitcher handedness adjustment. If a fielder plays on a team with a preponderance of lefty or righty pitchers (more than the norm), his numbers in John’s system will be a little “warped.”
JH: How does your model of UZR compare to Dewan’s plus/minus system? To David Pinto’s PMR?
MGL: I think I addressed the comparison with Dewan, and then some. PMR is essentially the same as both systems, although I don’t think that David used distance as a parameter in the OF (maybe he does now). That was an egregious error, and I don’t know why he did it that way. He stated that the “speed” parameter served as a proxy for distance for air balls. That is ridiculous of course (with all due respect to David, who is a great guy and a great researcher). For example, a hard hit line drive in the OF could be 150 feet or 350 feet, depending on the trajectory.
Again, in the long run none of these things are a big deal, but for limited data (1-2 years maybe), they can be. I am not sure off the top of my head what kinds of other adjustments he does or parameters he uses or how many years he uses to establish his baseline probabilities (nor am I sure how many years John uses – he may use one year only). I use 6 years of data to establish the baselines; however, I “zero out” everyone’s UZR combined at each position for each year (IOW, a player’s UZR is always relative to all other players at that position for that year only), for various reasons which I won’t go into.
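The "zero out" step can be sketched like this. The interview does not say how the adjustment is spread across players, so subtracting the position-year average in proportion to each player's chances is an assumption, as are the function name and the field names `position`, `year`, `runs`, and `chances`.

```python
from collections import defaultdict


def zero_out(rows):
    """Re-center runs so each (position, year) group sums to zero.

    rows: list of dicts with keys "player", "position", "year",
        "runs" (raw runs vs. the multi-year baseline) and "chances".
    Assumption: the group average is removed per chance, so players
    with more chances absorb more of the correction.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # (position, year) -> [runs, chances]
    for row in rows:
        key = (row["position"], row["year"])
        totals[key][0] += row["runs"]
        totals[key][1] += row["chances"]

    adjusted = []
    for row in rows:
        group_runs, group_chances = totals[(row["position"], row["year"])]
        per_chance = group_runs / group_chances if group_chances else 0.0
        new_runs = row["runs"] - per_chance * row["chances"]
        adjusted.append({**row, "runs": new_runs})
    return adjusted
```

After this step, a player's figure is relative only to the other players at his position in that year, which matches the description above.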
JH: When did you begin working on UZR and how has it evolved/improved over the years?
MGL: I really have forgotten how long I have worked with UZR. 10 or more years maybe. I started out with a simple ZR, which I developed independently from STATS ZR and Sherri Nichols’ Defensive Average (all of them essentially doing the same thing). I was using Project Scoresheet data at the time (I think Sherri was too). At some point I modified my simple “ZR” (I had no particular name for it) and converted it into UZR. That conversion was actually inspired by the old STATS Scoreboard, but I have forgotten exactly how and when.
JH: What modifications have you made to UZR as a result of enhanced data through improved technology?
MGL: Not much really. I started using STATS data about 3 or 4 years ago, rather than the Project Scoresheet (PS) data, although I used to convert the STATS zones into PS zones, which were quite a bit larger (and therefore less granular). I did this only because I did not feel like re-writing my computer routines, which were already set up for the PS data. After much badgering from Chris Dial, however, I decided to use the STATS data, unadulterated. I have also recently improved the methodology and over the years have added the aforementioned parameters and adjustments, and fine-tuned the park adjustments (originally, like Dewan’s system, I had no adjustments at all, except for maybe park factors – I am not sure).
JH: You recently mentioned in another interview that you are working on the UUZR. Can you elaborate on the new metric, what it will incorporate, and how it is different from UZR?
MGL: Boy, you can’t say anything around the sabermetric community without it being noticed and remembered by someone!
Basically two things. One, the subjective ratings that I alluded to when I was talking about the first baseman’s ability to receive throws. STATS provides a subjective rating on every play made, by all fielders involved. Using that would enhance a defensive metric like UZR (or any of them) a lot. For example, much of the data we use is noise. Who cares how many routine plays a fielder makes or how many impossible plays he misses? We really want to know which plays made are great or a little better than average (etc.), and which plays not made could have been made by an average, good, or great fielder (etc.). Unfortunately, the data does not include a subjective “rating” on plays not made. That would be nice, and maybe someday STATS or BIS will provide that, as it is easily done.
The second thing is using the x, y coordinate and smoothing function that I discussed earlier, as well as inferring a player’s average positioning (given certain parameters I guess) in order to separate range and positioning.
To me, that is a UUZR, given the data we have now. Of course once detailed “3D” data (as Tango likes to call it) is available through sophisticated video and other computer technologies installed at the ballparks, we can work on a UUUZR.
JH: The evaluation of defense continues to be a hot topic. With the data that we currently have available, how much closer can we expect to get to perfecting the study of defense?
MGL: Honestly, not a whole lot. We are maybe 90% of the way there. I understand and am sympathetic to the fact that defensive metrics are a lot harder (than offensive and pitching metrics) for the average person (and even some analysts) to get their arms around so to speak. Therefore they are often treated with skepticism and mistrust. Many people think that our methodologies are a “black box” even though almost all of us that have developed these advanced defensive metrics have explained them in detail in various forums (like this one).
Despite the skepticism, I think that we currently do a pretty good job of evaluating and quantifying defense. Not perfect of course and not as good as we do with offense and pitching, but light-years ahead of what was out there before (FA and Range and even Palmer’s Fielding Runs, with all due respect to him, a friend, colleague, and sabermetric pioneer). Books like Dewan’s, which was extremely well-written, will help to legitimize these types of metrics I think.
JH: Who would you say is the most overrated defensive player today in the eyes of the public? The most overrated player in the last 20 years?
MGL: Why Jeter of course! That was an easy one. Griffey, although he was once good, he has been atrocious for the last 5 years or even more. Ditto for Finley. Michael Young of Texas is atrocious although you don’t often hear that (I have even heard that he is good). Grissom (just retired?) has not been good for many years, although he always had a rep as being a good defensive outfielder. Going back some years, the only one that comes to mind (I focus mainly on recent players) is Sandberg. He has never rated that highly (around average I think) in my defensive metrics and of course he is generally known as a very good if not great defensive second baseman. Fielders who hit well and who have few errors (e.g., Jeter, Sandberg) are often overrated. Ditto for those who look “smooth.”
JH: Can you explain your role with the Cardinals and how you serve the front office? Do you consult on more than just fielding metrics and analysis?
MGL: Sure, I consult on everything, although my main focus is on projecting total player performance, including pitching. I have also done work on analyzing and projecting college players for the draft, recommending certain in-game strategies, lineup and bullpen construction, (such as those in our new book), and giving them my opinion (whether they ask for it or not) on player acquisitions, salaries, and even player development in the minor leagues.
JH: What other projects are you working on now or may be working on in the near future besides UUZR?
MGL: My golf game.
JH: Do you make any use out of your law degree anymore?
MGL: Yes, when friends and family come to me for legal advice: I tell them to go find a real lawyer and, in some cases, recommend one.
JH: Please feel free to expand on anything else you feel is important that we haven’t covered today.
MGL: I think we’ve covered a lot. Thanks for having me. I appreciate it a lot.