WORKING PAPER

Efficient Online Non-Parametric Density Estimation

Christophe G. Lambert
Department of Computer Science, Duke Unversity, Durham, NC 27708

Scott E. Harrington
Transworld Numerics Ltd., \#4 Nantucket Lane, Smiths Parish, FL05 Bermuda

Nathan D. Bronson
Transworld Numerics Ltd., \#4 Nantucket Lane, Smiths Parish, FL05 Bermuda

Campbell R. Harvey
Duke University, Durham, NC 27708
National Bureau of Economic Research, Cambridge, MA 02138

Arman Glodjo
Duke University, Durham, NC 27708
Transworld Numerics Ltd., \#4 Nantucket Lane, Smiths Parish, FL05 Bermuda

Abstract

Non-parametric density estimation has broad applications in computational finance especially in cases where high frequency data is available. However, the technique is often intractible given the run times necessary to evaluate a density. We present a new and efficient algorithm based on multipole techniques. Given the n kernels that estimate the density, current methods take O(n) time to directly sum the kernels to perform a single density query. The cumulative O(n2<\sup>) running time for n queries makes it very costly, if not impractical, to compute the density for large n. Our new Multipole-accelerated Online Density Estimation (MODE) algorithm is general in that it can be applied to any kernel (in arbitrary dimensions) that admits Taylor series expansion. The running time reduces to O(log(n)) or even constant time, depending on the kernel chosen, and hence, the cumulative running time is reduced to O(nlog(n)) or O(n), respectively. Our results show that the MODE algorithm provides dramatic advantages over the direct approach to density evaluation. For example, we show using a modest computing platform that on-line density updates and queries for one million points and two dimensions take 8 days to compute using the direct approach versus 40 seconds with the MODE approach.