After considering whether Michael Jordan is the king of low variance, I want to discuss two other sports situations where people always argue about the right way to aggregate results.
All of the “Big 3” tennis players currently have 20 Grand Slam titles each. So who is actually the best? (Leaving aside how that will change in the future…)
An intuitive idea is that the data shows us not only the players’ totals but also how “versatile” or “specialized” each of them is. The majority of Nadal’s Grand Slam titles were won at the French Open. We can penalize that (or reward it) by choosing a different aggregation scheme. A common family of schemes, parameterized by p, is the Lp norms: raise each entry of the vector to the power p, sum the results, and take the 1/p-th power of that sum.
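Written out in symbols, and assuming the standard definition of the Lp norm, the aggregation reads as follows. The names x, n, and x_i are notation introduced here for a player's vector of title counts, its length, and its entries; they do not come from the original text.

```latex
\[
  \|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p},
  \qquad
  \|x\|_\infty = \lim_{p \to \infty} \|x\|_p = \max_{i} |x_i| .
\]
```

Here n = 4, with one entry per Grand Slam tournament.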
For p=1, this is just the total number of Grand Slam titles (for example, Federer’s vector, ordered as Australian Open, French Open, Wimbledon, US Open, is (6,1,8,5), and its L1 norm is 20). Other interesting parameter values are 0, 1/2, 2, and infinity (where infinity means the limit of the Lp norms as p->Inf). Strictly speaking, the formula defines a true norm only for p≥1, but it can still be evaluated for smaller p.
For p=0, the convention is that the “norm” is the number of non-zero vector elements. That is 4 for all of the players. If a player had “missed out” on one of the Grand Slam tournaments, he would be behind in the L0 norm. For p=Infinity, the Lp norm is the largest number in the vector. Under this norm Nadal wins, thanks to his 13 titles at the French Open (Djokovic comes second with his 9 titles at the Australian Open). For p=2, Nadal wins again, with a score of ~13.78 vs ~11.40 for Djokovic and ~11.22 for Federer. But for p=1/2, Djokovic wins with a score of ~73.89, Federer is second with ~72.49, and Nadal is last with ~64.32.
Generally speaking, low p values reward “well-balanced” performance more, while high p values reward extremes. We can actually map the regions of the parameter p in which each player comes out on top. More than that, we can actually show that:
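The scores for the parameter values discussed above can be recomputed with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard definition of the Lp norm, treats p=0 as a count of non-zero entries, and the helper names `lp_norm` and `scores` are hypothetical. Only Federer's vector is spelled out in the text; the vectors for Nadal and Djokovic are filled in from their records at 20 titles each, which agree with the maxima and the p=2 scores quoted above.

```python
import math

# Titles ordered as (Australian Open, French Open, Wimbledon, US Open).
# Federer's vector is quoted in the text; the other two are assumptions
# consistent with the totals, maxima, and L2 scores the text reports.
SLAMS = {
    "Federer": (6, 1, 8, 5),
    "Nadal": (1, 13, 2, 4),
    "Djokovic": (9, 2, 6, 3),
}


def lp_norm(vector, p):
    """Lp 'norm' of a vector, with the usual conventions for p=0 and p=inf."""
    if p == 0:
        # Convention: count the non-zero entries (0**0 would otherwise give 1).
        return float(sum(1 for x in vector if x != 0))
    if math.isinf(p):
        # Limit of the Lp norms as p grows: the largest absolute entry.
        return float(max(abs(x) for x in vector))
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)


def scores(p):
    """Score of every player under the given p."""
    return {name: lp_norm(vector, p) for name, vector in SLAMS.items()}


if __name__ == "__main__":
    for p in (0, 0.5, 1, 2, math.inf):
        row = ", ".join(f"{name}: {s:.2f}" for name, s in scores(p).items())
        print(f"p={p}: {row}")
```

What matters for the argument is the ordering of the three players at each p, not the absolute scale of the scores.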
There is no Lp norm for which Federer is the top player.
In fact, for every p>1 (the “specialized” regime), Nadal is on top, and for every p<1 (the “versatile” regime), Djokovic is on top. Because the proof is involved, we show something simpler: that for p>1, Nadal is always ahead of Federer.
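The claim about who leads in each regime can also be checked numerically. The sketch below is evidence on a grid of p values, not a proof. The helper names (`power_sum`, `leader`, `leaders_on_grid`) and the grid itself are hypothetical choices, and the vectors for Nadal and Djokovic are assumptions filled in from their records at 20 titles each. For p>0, raising to the power 1/p preserves order, so comparing the sums of x^p ranks the players exactly as the Lp norms do.

```python
# Titles ordered as (Australian Open, French Open, Wimbledon, US Open).
# Only Federer's vector is quoted in the text; the others are assumptions.
SLAMS = {
    "Federer": (6, 1, 8, 5),
    "Nadal": (1, 13, 2, 4),
    "Djokovic": (9, 2, 6, 3),
}


def power_sum(vector, p):
    """Sum of x**p; for p > 0 it orders vectors the same way as the Lp norm."""
    return sum(x ** p for x in vector)


def leader(p):
    """Name of the player with the largest score for this p (p > 0, p != 1)."""
    return max(SLAMS, key=lambda name: power_sum(SLAMS[name], p))


def leaders_on_grid(ps):
    """Set of all players who lead for at least one p in the grid."""
    return {leader(p) for p in ps}


# Illustrative grids: 0.05, 0.10, ..., 0.95 and 1.05, 1.10, ..., 10.0.
below_one = [k / 20 for k in range(1, 20)]
above_one = [1 + k / 20 for k in range(1, 181)]

if __name__ == "__main__":
    print("0 < p < 1:", leaders_on_grid(below_one))
    print("p > 1:    ", leaders_on_grid(above_one))
```

The value p=1 is left out of both grids on purpose, since all three players tie at 20 there.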
Aside from arithmetic, there are two important steps here. First, since x^p is a convex function for p>1, we can use Jensen’s inequality: (4^p + 2^p)/2 ≥ ((4+2)/2)^p = 3^p, and therefore 4^p + 2^p ≥ 2*3^p. More importantly, I don’t know of a general way to solve complicated exponential inequalities of the form
a * b^x + c*d^x…
Continue reading: https://towardsdatascience.com/lp-norms-grand-slams-and-olympic-medals-f005e002ae8e?source=rss----7f60cf5620c9---4