What we already know: Earth is locally flat. This is true not only for the (surface of the) Earth, but for any so-called Riemannian manifold, a generalization of surfaces to any number of dimensions. Even though there is curvature, if one zooms in far enough on a point, the curvature becomes negligible and the point's neighborhood looks flat. This is why some in... Read more 11 May 2023 - 28 minute read
Background Back in the second year of high school, a friend shared with me a problem that his geometry teacher had shown him. I was going through a small crisis regarding my future career. I couldn’t decide whether I wanted to pursue a major in the Humanities (Arts or Design were at the top of the list) or in STEM. Before eventually settling d... Read more 16 Apr 2023 - 4 minute read
The AUC in the real world A common application of binary classification models is ranking, more than classification itself. The difference between the two is subtle: in classification, you want to say how likely a point is to belong to class 1 or class 0; in ranking, you care whether point A, which is in class 1, is more likely than another... Read more 27 Jan 2023 - 13 minute read
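To make the ranking view concrete, here is a minimal sketch of the standard pairwise reading of the ROC AUC: the fraction of (class 1, class 0) pairs in which the class 1 point receives the higher score, with ties counted as half. The function name `pairwise_auc` and the toy scores are assumptions made for illustration only; they are not taken from the post.

```python
def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one point from each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(positives) * len(negatives))


# Hypothetical scores, chosen only for illustration.
scores = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0]
auc = pairwise_auc(scores, labels)
print(auc)
```

Only the ordering of the scores matters here: applying any strictly increasing transformation to `scores` leaves the result unchanged, which is what separates ranking quality from the accuracy of the predicted probabilities themselves.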
Context I recently published a short discussion (in Portuguese) on LinkedIn about how Jensen’s inequality complicates the process of building regressions for transformations of an original variable. More specifically, we discussed how \[\boxed{\exp(\mathbb E[\log(Y)]) \leq \mathbb E[Y]}\] This is due to $x\mapsto \log x$ being concave and both... Read more 25 Sep 2022 - 8 minute read
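As a quick sanity check of the inequality on an empirical sample, the sketch below compares the exponential of the mean log (the geometric mean) with the plain arithmetic mean. The helper names and the sample values are assumptions made for illustration, not code from the post.

```python
import math


def exp_mean_log(values):
    """exp(E[log Y]) for an empirical sample of positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def mean(values):
    """E[Y] for an empirical sample."""
    return sum(values) / len(values)


# Hypothetical positive sample, chosen only for illustration.
sample = [1.0, 2.0, 4.0, 8.0]
lhs = exp_mean_log(sample)   # back-transformed mean of the log
rhs = mean(sample)           # mean on the original scale
print(lhs, rhs, lhs <= rhs)
```

The gap between the two sides is why exponentiating the prediction of a regression fitted on $\log(Y)$ tends to underestimate $\mathbb E[Y]$. The two sides coincide only when the sample is constant.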
AUC status quo The ROC AUC is the most widely used statistic to assess the predictive power of a classification model. However, few working data scientists know the theoretical results about its statistical fluctuations. Here, we show in detail a derivation of a commonly found result on the variance of the ROC AUC. We have not found this demonstration do... Read more 22 Sep 2022 - 21 minute read