Nonparametric Bayesian kernel models

Feng Liang, Kai Mao, Ming Liao, Sayan Mukherjee and Mike West

Duke University

Revised January 2008 (Original April 2007)

Kernel models for classification and regression have emerged as widely applied tools in statistics and machine learning. We discuss a Bayesian framework and theory for kernel methods, providing a new rationalisation of kernel regression based on nonparametric Bayesian models. Functional analytic results ensure that such a nonparametric prior specification induces a class of functions that span the reproducing kernel Hilbert space corresponding to the selected kernel. Bayesian analysis of the model allows for direct and formal inference on the uncertain regression or classification functions. Augmenting the model with Bayesian variable selection priors over kernel bandwidth parameters extends the framework to automatically address the key practical questions of kernel feature selection. Novel, customised MCMC methods are detailed and used in example analyses. The practical benefits and modelling flexibility of the Bayesian kernel framework are illustrated in both simulated and real data examples that address prediction and classification inference with high-dimensional data.


Some Key Words: Dirichlet process priors, Kernel parameter estimation, Kernel principal component regression, Reproducing kernel Hilbert space, Semi-supervised learning, Nonparametric Bayesian analysis.


The manuscript is available in PDF format.


Research partially supported by the National Science Foundation (DMS-0342172) and the National Institutes of Health (NCI U54-CA-112952-01). Any opinions, findings and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the NSF or NIH.