use CDFs not histograms

Sep 04, 2022

Histograms are a rightfully popular way to present data like latency, throughput, object size, and so on. Histograms avoid some of the difficulties of picking a summary statistic, or group of statistics, which is hard to do right. I think, though, that there's nearly always a better choice than histograms: the empirical cumulative distribution function (eCDF).
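For reference, an eCDF is simple to compute: sort the samples, then pair each value with the fraction of samples at or below it. The following is a minimal sketch in plain Python, not taken from the linked post; the names `ecdf`, `fraction_at_or_below`, and `latencies_ms` are hypothetical, and the sample data is made up for illustration.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return (xs, ps): the sorted samples and cumulative fractions for a step plot.

    ps[i] is (i + 1) / n, so the last point always reaches 1.0. Tied values
    appear as repeated x entries; the eCDF value at a tie is the highest
    fraction listed for that x.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    ps = [(i + 1) / n for i in range(n)]
    return xs, ps


def fraction_at_or_below(sorted_samples, x):
    """Evaluate the eCDF at x: the fraction of samples <= x."""
    return bisect_right(sorted_samples, x) / len(sorted_samples)


# Illustrative data only (assumed latencies in milliseconds, two outliers).
latencies_ms = [12, 15, 11, 250, 14, 13, 18, 16, 12, 900]
xs, ps = ecdf(latencies_ms)

# Fraction of requests that finished within 20 ms.
within_20ms = fraction_at_or_below(xs, 20)
```

Plotting `xs` against `ps` as a step function gives the eCDF. Unlike a histogram, there is no bin width to choose, and percentiles can be read directly off the vertical axis.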

Found via Dan Luu on Twitter, who also advocates including a PDF (probability density function; see the wiki).
