KakkoKari (仮) Another (data) science blog. By Alessandro Morita

Linear trees in LightGBM: how to use

This was originally written as a “Hello world” kind of program aimed at giving my team at the DataLab some help getting started with less noisy variants of GBDTs. What are linear trees? From this post: Not everybody knows simple yet effective variations of the Decision Tree algorithm. These are known as Model Trees. They learn an optim... Read more

A ROC AUC partial to misclassification cost

This was originally written as a quick intro to partial AUCs, aimed at giving my team at the DataLab some insights into cost-based classification. Below, we consider the standard binary classification problem. Assume we pay a cost $c_\mathrm{FN} > 0$ when we classify a point of the positive class as a negative, and, similarly, pay a ... Read more

The Carr-Madan decomposition of arbitrary payoff functions

The Carr-Madan decomposition is used in quant finance to break any payoff into a (continuous) combination of calls and puts, plus a forward. Namely, for any twice differentiable function: \[\boxed{ f(x) = f(y) + f'(y)(x-y) + \int_{-\infty}^y f''(z) (z-x)^+ dz + \int_{y}^\infty f''(z) (x-z)^+ dz }\] where $(x)^+ \equiv \max(0, x)$ is the positi... Read more
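
As a quick sanity check of the boxed formula, here is a minimal numerical sketch in plain Python. It is not taken from the full post: the function name `carr_madan`, the midpoint-rule integrator, and the finite bounds `lo` and `hi` (which stand in for $\pm\infty$ and must bracket both $x$ and $y$) are assumptions made for illustration.

```python
import math


def carr_madan(f, df, d2f, x, y, lo, hi, n=20000):
    """Approximate the right-hand side of the Carr-Madan formula.

    f, df, d2f: the payoff and its first and second derivatives.
    lo, hi: finite stand-ins for the infinite integration bounds
            (an assumption of this sketch); they must bracket x and y.
    """
    def integrate(g, a, b):
        # Midpoint rule on n equal sub-intervals of [a, b].
        h = (b - a) / n
        return h * sum(g(a + (i + 0.5) * h) for i in range(n))

    # Put-like term: integrand f''(z) (z - x)^+ for z below y.
    puts = integrate(lambda z: d2f(z) * max(z - x, 0.0), lo, y)
    # Call-like term: integrand f''(z) (x - z)^+ for z above y.
    calls = integrate(lambda z: d2f(z) * max(x - z, 0.0), y, hi)
    # Constant term plus the linear (forward) term plus both integrals.
    return f(y) + df(y) * (x - y) + puts + calls


if __name__ == "__main__":
    # f(x) = exp(x) is its own first and second derivative.
    for x in (0.3, 2.0):
        approx = carr_madan(math.exp, math.exp, math.exp, x, 1.0, -5.0, 5.0)
        print(x, abs(approx - math.exp(x)))
```

Only one of the two integrals is nonzero for a given $x$: the put-like term when $x < y$ and the call-like term when $x > y$, so truncating the bounds loses nothing as long as they contain both points.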

A vector space structure for probabilities

This post is based on this article. Does it even make sense to talk about adding two probabilities? Probabilities definitely look like vectors: they are arrays of numbers. For example, it could make sense that a coin toss would be described by an array with two numbers, something like $(0.5, 0.5)$. However, it is not obvious how they would... Read more