Bubble Entropy of fractional Gaussian noise and fractional Brownian motion

George Manis (1), Matteo Bodini (2), Massimo W Rivolta (3), Roberto Sassi (3)
(1) University of Ioannina, Dept. of Computer Science and Engineering; (2) Università degli Studi di Milano; (3) Dipartimento di Informatica, Università degli Studi di Milano


Aims: Bubble Entropy (bEn) is a metric that links the complexity of a series to the cost of sorting its samples, with limited dependence on parameters. It considers the entropy of the number of swaps that Bubble Sort needs to order segments of the series of two consecutive lengths, m and m+1. bEn is larger for sequences that display all possible orderings of their samples. Fractional Brownian motion (fBm) is a self-similar, long-memory process that has been widely used to model heart rate variability (HRV). fBm displays ephemeral regularities and periodicities at multiple time scales, which vanish and then re-form differently. In this work we tested whether the continuously growing or decaying trends in fBm, which hint at a broad range of swaps necessary for sorting, lead to maximal values of bEn.
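
The swap-counting idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the commonly cited definition of bEn as the difference between the order-2 Rényi entropies of the swap counts at lengths m+1 and m, normalized by log((m+1)/(m-1)), which this abstract does not state. The function names (`bubble_sort_swaps`, `renyi2_swap_entropy`, `bubble_entropy`) are hypothetical.

```python
import math
from collections import Counter


def bubble_sort_swaps(values):
    """Count the swaps Bubble Sort performs to sort `values` in ascending order."""
    a = list(values)
    swaps = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return swaps


def renyi2_swap_entropy(series, m):
    """Order-2 Renyi entropy of the swap counts over all windows of length m."""
    counts = Counter(
        bubble_sort_swaps(series[i:i + m]) for i in range(len(series) - m + 1)
    )
    total = sum(counts.values())
    return -math.log(sum((c / total) ** 2 for c in counts.values()))


def bubble_entropy(series, m):
    """Assumed definition: (H2(m+1) - H2(m)) / log((m+1)/(m-1)), for m >= 2."""
    if m < 2:
        raise ValueError("m must be at least 2")
    numerator = renyi2_swap_entropy(series, m + 1) - renyi2_swap_entropy(series, m)
    return numerator / math.log((m + 1) / (m - 1))
```

The quadratic swap counter is kept for readability. For m up to 200 and series longer than 1e6 samples, as used in this work, a faster way of updating the swap count between overlapping windows would be needed in practice.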

Methods: We synthetically generated realizations of fBm (length >1e6), along with its increments, fractional Gaussian noise (fGn), a discrete-time Gaussian process. The Hurst exponent H, which parameterizes both fBm and fGn, was varied over the entire range (0,1). bEn was computed with m ranging up to 200 (typically beyond the reach of other entropy metrics), verifying the convergence of the estimates.
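
The abstract does not say which generator was used. As one possible sketch, fGn can be drawn exactly from its known autocovariance, 0.5(|k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H}) at lag k for unit variance, using a Cholesky factor, and fBm obtained as its cumulative sum. All function names below are hypothetical. The Cholesky approach costs O(n^3) and is only practical for short series, so realizations longer than 1e6 samples would call for a faster method such as circulant embedding.

```python
import math
import random


def fgn_autocovariance(lag, hurst):
    """Autocovariance of unit-variance fGn at integer lag `lag`."""
    k = abs(lag)
    e = 2.0 * hurst
    return 0.5 * ((k + 1) ** e - 2.0 * k ** e + abs(k - 1) ** e)


def cholesky_lower(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def generate_fgn(n, hurst, seed=None):
    """Draw n samples of fGn with Hurst exponent `hurst` in (0, 1)."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("hurst must lie strictly between 0 and 1")
    rng = random.Random(seed)
    cov = [[fgn_autocovariance(i - j, hurst) for j in range(n)] for i in range(n)]
    lower = cholesky_lower(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Colour the white noise with the Cholesky factor of the target covariance.
    return [sum(lower[i][k] * white[k] for k in range(i + 1)) for i in range(n)]


def generate_fbm(n, hurst, seed=None):
    """fBm path as the running sum of fGn increments."""
    path, level = [], 0.0
    for increment in generate_fgn(n, hurst, seed):
        level += increment
        path.append(level)
    return path
```

For example, `generate_fbm(256, 0.7, seed=1)` returns a 256-sample path whose increments are positively correlated, since H > 1/2.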

Results: For fGn, a stationary process, bEn showed a very small, if any, dependence on m. Unexpectedly, it scaled as (H/2+3/4) times the bEn of white Gaussian noise (WGN). As expected, for fBm, a non-stationary process, the dependence on m was more significant at low values of m. As m grew, bEn approached a constant value of 4/3 times the bEn of WGN.
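
Written out, the two scaling relations read as follows. The subscripted symbols and the estimate \hat{H} are notation introduced here for illustration only. The second line is a consistency check: fGn with H = 1/2 is white noise, and the factor indeed reduces to 1. The third line is plain algebra on the first, not an estimator proposed in this abstract. Over H in (0,1) the fGn factor stays between 3/4 and 5/4.

```latex
\begin{align*}
\mathrm{bEn}_{\mathrm{fGn}}(H) &\approx \left(\frac{H}{2} + \frac{3}{4}\right)\mathrm{bEn}_{\mathrm{WGN}},
  \qquad 0 < H < 1,\\
\mathrm{bEn}_{\mathrm{fGn}}\!\left(\tfrac{1}{2}\right) &\approx \left(\tfrac{1}{4} + \tfrac{3}{4}\right)\mathrm{bEn}_{\mathrm{WGN}}
  = \mathrm{bEn}_{\mathrm{WGN}},\\
\hat{H} &\approx 2\,\frac{\mathrm{bEn}_{\mathrm{fGn}}}{\mathrm{bEn}_{\mathrm{WGN}}} - \frac{3}{2},\\
\mathrm{bEn}_{\mathrm{fBm}} &\to \frac{4}{3}\,\mathrm{bEn}_{\mathrm{WGN}}
  \quad \text{as } m \text{ grows}.
\end{align*}
```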

Conclusions: While the swap distribution of an fBm is not uniform, it covers the entire range of possible swaps. bEn behaves like a scaling estimator for stationary Gaussian long-memory processes, but less so when non-stationarity becomes relevant (as it is for HRV).