A reader wrote:
I read your EPV paper, and thought it's a really cool idea. But I think the way you guys approached it leaves a rather fundamental problem unresolved. Say player A is amazingly good. Giving him the ball is really valuable. Who gets the credit for passing to him? If your model realizes that player A is very good, then the credit goes to the person who gave him the pass. If your model doesn't know how good A is, then the credit goes to A (because after receiving the pass, he will of course do something very impressive, like score or whatever). This means that EPV, as defined, isn't a very robust metric of the contribution of individual players.
You can reformulate the question to make EPV a measure of the decision quality of each player (explicitly ignoring things like movement, shots, etc., and focusing only on decision making). Then I think you could build a more consistent model, since credit would only be allocated for instantaneous decisions. However, this is quite a different approach from the one you took.
The other concern I had (due to my inability to read complex math) is how you define the "benchmark" for the decision. Specifically, your EPV should, in theory, allow you to identify the "best" decision in each case - but clearly you're not using this as the benchmark (otherwise, any decision by the players would change EPV only by 0 or less). Instead, I assume you model an imperfect player decision process, and compare whatever "average" decision you think a typical player would make to the actual decision made. I'm not sure exactly how you defined the "typical" decision, but it seems like a very fragile part of the model.
In general, we appreciate the reader's critical eye, especially the questions about how robust or fragile our metrics are to particular choices. Long story short, we don't think these represent fragility, so much as they represent different questions that you might try to answer about basketball. We made some very specific choices about questions of exactly this type while designing the methodology, so this email gives us a great opportunity to talk about some of the deeper thought process that went into the EPV paper.
The reader's points strike at the heart of the problem of evaluating decisions and players in sports analysis. There are key distinctions between knowing how much a given situation in a basketball game is worth, knowing how much a single decision is worth, and knowing what a player is worth.
By definition, EPV directly answers the first question. This is the easiest of the three questions because it’s completely a function of observed data — essentially, the model finds a nice way to pool information about possessions that also passed through the situation in question, and look at the average outcome. EPV’s mission is to incorporate as much information as is available at a given moment to answer this question.
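Schematically (this compresses the paper's notation, so treat it as a sketch rather than the exact definition), EPV at any moment is just a conditional expectation: the expected point total of the current possession given everything observed up to that moment.

```latex
% EPV at time t: the expected point outcome X of the possession,
% conditional on the full history of the possession observed so far.
\nu_t = \mathbb{E}\left[ X \mid \mathcal{F}_t \right]
```

Everything interesting in the model goes into making the conditioning information $\mathcal{F}_t$ as rich as possible.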
(Aside: To answer the reader’s first point: we designed EPV precisely so that player skill would be “baked in” at the moment that a really good player A gains control of the ball — the scenario that he raises where EPV always spikes after player A makes a decision with the ball is exactly what we designed EPV to avoid. We incorporate a huge amount of information about how a player's skills and tendencies interact with the situations he’s faced with to make the underlying model as aware as possible of how “good” player A is.)
We do think that EPV can be a major building block in answering questions about individual decision value and player value, but having a good EPV estimate can’t answer these questions on its own. These questions are harder because, to give coherent answers to them, we have to compare the actions that we saw on the court to the actions that could have happened — we call these counterfactuals.
Evaluating a single decision is relatively simple here, and the model we used for EPV breaks this down (we think) very elegantly: at every moment in a possession, we look at all of the options that a player has and assign each one a point value. We can then ask whether the player took the highest-value option that was available to him. We perform exactly this kind of comparison in our ShotSatisfaction metric, and we think this speaks to the reader's point about evaluating the decision a player actually made against the "optimal" one.
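To make the per-decision comparison concrete, here is a minimal sketch (the option names and point values are made up for illustration, and are not output of our model): given the values the model assigns to each available option, we can check which option was best and how far the chosen one fell short.

```python
# Hypothetical option values at one moment in a possession: the point
# value the model assigns to each action available to the ball handler.
option_values = {
    "shoot": 0.92,
    "pass_to_corner": 1.08,
    "drive": 0.99,
}

def evaluate_decision(option_values, chosen):
    """Compare the chosen option against the best available one.

    Returns the highest-value option and the point-value gap ("regret")
    between it and the option the player actually took.
    """
    best_option = max(option_values, key=option_values.get)
    regret = option_values[best_option] - option_values[chosen]
    return best_option, regret

best, regret = evaluate_decision(option_values, chosen="shoot")
print(best)               # pass_to_corner
print(round(regret, 2))   # 0.16
```

A regret of zero means the player took the highest-value option available; anything positive measures how many expected points the decision left on the table, under the model's valuations.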
Evaluating a full player is by far the hardest question here, because the counterfactual is hard to define. When you say a player is worth $x$, this raises the question, “compared to whom?” A team isn’t going to go out and play with just 4 players, so you need to define a benchmark to compare a player against. Do you want to compare a player against whoever would come off the bench to replace him? The best trade prospect? The first pick in the draft? Each of these comparisons could be useful in the context of a given decision, and each of these gives a very different answer about a player’s value. The point is that there’s no single “true” value of a player without defining this comparison — this isn’t a matter of fragility, it’s just the fact that these are all different questions with different answers. In the EPV paper, we presented EPVA using a “default” baseline of the average player, defined in terms of the average tendencies of all NBA players who faced similar situations. It gives a general sense of a player’s overall decision-making, but isn’t optimal for any single decision that, say, a GM would want to make. In that sense, the metric is purely demonstrative, and true stakeholders should be substituting in whatever benchmark is most appropriate.
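The benchmark-relative structure of a metric like EPVA can be sketched in a few lines. Everything below is invented for illustration (the touch data and both baseline functions are hypothetical, not our fitted model); the point is only that the player's "value added" changes when you swap in a different benchmark, which is exactly why there is no single "true" number.

```python
# Sketch of a benchmark-relative value metric in the spirit of EPVA.
def value_added(touches, baseline_epv):
    """Sum, over a player's touches, of the EPV when he ends the touch
    minus what the benchmark player would be expected to produce from
    the same starting situation."""
    return sum(end_epv - baseline_epv(start_situation)
               for start_situation, end_epv in touches)

# Each touch: (situation the player received the ball in, EPV at release).
# These numbers are made up.
touches = [
    ("catch_on_wing", 1.10),
    ("post_up", 0.95),
    ("fast_break", 1.40),
]

# Two different (hypothetical) benchmarks: the league-average player,
# and a specific bench replacement a GM might care about.
league_average = lambda situation: 1.00
bench_replacement = lambda situation: 0.92

print(round(value_added(touches, league_average), 2))     # 0.45
print(round(value_added(touches, bench_replacement), 2))  # 0.69
```

Same player, same touches, two different answers: the choice of `baseline_epv` is the counterfactual, and each stakeholder should plug in the one that matches the decision they actually face.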