Adaptation of a mixture of multivariate Bernoulli distributions
Ankur Kamthe, Miguel Carreira-Perpinan and Alberto Cerpa
The mixture of multivariate Bernoulli distributions (MMB) is a statistical model for high-dimensional binary data. Recently, the MMB has been used to model sequences of packet receptions and losses of wireless links in sensor networks. Given an MMB trained on long data traces recorded from links of a deployed network, one can then use samples from the MMB to test different routing algorithms for long periods. However, learning an accurate model for a new link requires collecting long traces over periods of hours, which is costly in practice (e.g., because of limited battery life). We propose an algorithm that adapts a preexisting MMB, trained with extensive data, to a new link from which only very limited data is available. Our approach constrains the new MMB's parameters to be a nonlinear transformation of the existing MMB's parameters. The transformation has a small number of parameters, which are estimated using a generalized EM algorithm with an inner loop of BFGS iterations. We demonstrate the efficacy of the approach on the MNIST dataset of handwritten digits and on wireless link data. We show that accurate models can be learned from data traces of about 1 minute, about 10 times shorter than would be needed to train an MMB from scratch.
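To make the model concrete, the following is a minimal sketch of a generic MMB, in which a binary vector x has density p(x) = sum_m pi_m prod_d p_md^x_d (1 - p_md)^(1 - x_d). It shows only how samples are drawn and how the log-likelihood is evaluated; it does not implement the adaptation algorithm. The function names `mmb_sample` and `mmb_log_likelihood` and the toy parameter values are assumptions made for illustration and do not come from the paper.

```python
import math
import random


def mmb_sample(weights, protos, n, rng):
    """Draw n binary vectors from an MMB (illustrative helper, not from the paper)."""
    samples = []
    for _ in range(n):
        # Pick a mixture component according to the mixing proportions.
        m = rng.choices(range(len(weights)), weights=weights)[0]
        # Given the component, each bit is an independent Bernoulli draw.
        samples.append([1 if rng.random() < p else 0 for p in protos[m]])
    return samples


def mmb_log_likelihood(x, weights, protos):
    """Return log p(x) under the MMB; assumes every p_md lies strictly in (0, 1)."""
    terms = []
    for pi_m, p_m in zip(weights, protos):
        log_term = math.log(pi_m)
        for x_d, p in zip(x, p_m):
            log_term += math.log(p if x_d else 1.0 - p)
        terms.append(log_term)
    # Log-sum-exp over components for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Hypothetical toy model: two components over 4-bit packet-reception windows
# (1 = packet received, 0 = packet lost). The values are made up.
weights = [0.7, 0.3]
protos = [[0.9, 0.9, 0.9, 0.9], [0.2, 0.2, 0.2, 0.2]]
rng = random.Random(0)
samples = mmb_sample(weights, protos, 5, rng)
```

In this toy setting, the first component stands for a mostly reliable link state and the second for a lossy one. Adapting a model to a new link would then mean re-estimating a small number of transformation parameters rather than every entry of `weights` and `protos`.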