11:00 - 11:20
Scalable Large Margin Gaussian Process Classification (116)
Martin Wistuba (IBM Research), Ambrish Rawat (IBM Research)
We introduce a new Large Margin Gaussian Process (LMGP) model by formulating a pseudo-likelihood for a generalised multi-class hinge loss. We derive a highly scalable training objective for the proposed model using variational inference and an inducing point approximation. Additionally, we consider the joint learning of LMGP-DNN, which combines the proposed model with traditional deep learning methods to enable learning for unstructured data. We demonstrate the effectiveness of the Large Margin GP with respect to both training time and accuracy in an extensive classification experiment consisting of 68 structured and two unstructured data sets. Finally, we highlight the key capability and usefulness of our model in yielding prediction uncertainty for classification by demonstrating its effectiveness in the tasks of large-scale active learning and detection of adversarial images.
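As a rough illustration of the pseudo-likelihood idea the abstract describes, here is a minimal numpy sketch that turns a generalised multi-class hinge loss into a likelihood via exp(-loss). The Crammer-Singer margin form and all names are assumptions for illustration, not the authors' exact formulation:

```python
import numpy as np

def multiclass_hinge(f, y, margin=1.0):
    """Generalised (Crammer-Singer style) multi-class hinge loss.

    f : (N, C) array of latent function values, one column per class.
    y : (N,) integer class labels.
    Returns per-example loss max(0, margin + max_{c != y} f_c - f_y).
    """
    n = f.shape[0]
    f_true = f[np.arange(n), y]              # latent value of the true class
    f_masked = f.copy()
    f_masked[np.arange(n), y] = -np.inf      # exclude the true class from the max
    f_best_other = f_masked.max(axis=1)
    return np.maximum(0.0, margin + f_best_other - f_true)

def pseudo_likelihood(f, y, margin=1.0):
    """Pseudo-likelihood p(y | f) proportional to exp(-loss): margin
    violations are penalised exponentially, zero loss gives likelihood 1."""
    return np.exp(-multiclass_hinge(f, y, margin))

# Toy check: three points, true class 0 in each case.
f = np.array([[2.0, 0.1, -1.0],   # correct with margin -> likelihood 1
              [0.5, 0.4,  0.0],   # correct but inside the margin
              [0.0, 1.5,  0.2]])  # misclassified
y = np.array([0, 0, 0])
print(pseudo_likelihood(f, y))
```

In the paper this likelihood would sit on top of GP latent functions trained with variational inference and inducing points; the sketch shows only the loss-to-likelihood step.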
|
11:20 - 11:40
Integrating Learning and Reasoning with Deep Logic Models (182)
Giuseppe Marra (University of Florence; University of Siena), Francesco Giannini (University of Siena), Michelangelo Diligenti (University of Siena), Marco Gori (University of Siena)
Deep learning is very effective at jointly learning feature representations and classification models, especially when dealing with high-dimensional input patterns. Probabilistic logic reasoning, on the other hand, is capable of taking consistent and robust decisions in complex environments. The integration of deep learning and logic reasoning is still an open research problem, and it is considered to be the key to the development of real intelligent agents. This paper presents Deep Logic Models, deep graphical models that integrate deep learning and logic reasoning for both learning and inference. Deep Logic Models create an end-to-end differentiable architecture, where deep learners are embedded into a network implementing a continuous relaxation of the logic knowledge. The learning process jointly learns the weights of the deep learners and the meta-parameters controlling the high-level reasoning. The experimental results show that the proposed methodology overcomes the limitations of other approaches that have been proposed to bridge deep learning and reasoning.
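To make the "continuous relaxation of logic knowledge" concrete, a minimal numpy sketch of one relaxed rule attached to classifier outputs; the Reichenbach implication, the example rule, and the fixed weight lam are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def reichenbach_implies(a, b):
    """Continuous relaxation of the implication a -> b on truth degrees in [0, 1]."""
    return 1.0 - a + a * b

def rule_loss(smoke_prob, cancer_prob):
    """Penalty for violating the rule forall x: Smoker(x) -> Cancer(x),
    averaged over the data. Zero when the relaxed rule is fully satisfied."""
    truth = reichenbach_implies(smoke_prob, cancer_prob)
    return np.mean(1.0 - truth)

def total_loss(ce_loss, smoke_prob, cancer_prob, lam=0.5):
    """End-to-end objective: supervised loss plus the weighted logic penalty.
    In the paper's setting the rule weight is a learnable meta-parameter;
    here it is fixed for illustration."""
    return ce_loss + lam * rule_loss(smoke_prob, cancer_prob)

# Toy truth degrees produced by two (hypothetical) deep classifiers.
smoke = np.array([0.9, 0.2, 0.7])
cancer = np.array([0.8, 0.1, 0.1])   # the third point violates the rule
print(rule_loss(smoke, cancer))
```

Because the relaxed rule is differentiable in the classifier outputs, gradients flow through both the supervised term and the logic term, which is what makes the architecture trainable end to end.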
|
11:40 - 12:00
Data Association with Gaussian Processes (286)
Markus Kaiser (Siemens AG; Technical University of Munich), Clemens Otte (Siemens AG), Thomas A. Runkler (Siemens AG; Technical University of Munich), Carl Henrik Ek (University of Bristol)
The data association problem is concerned with separating data coming from different generating processes, for example when data come from different sources, contain significant noise, or exhibit multimodality. We present a fully Bayesian approach to this problem. Our model is capable of simultaneously solving the data association problem and the induced supervised learning problem. Underpinning our approach is the use of Gaussian process priors to encode the structure of both the data and the data associations. We present an efficient learning scheme based on doubly stochastic variational inference and discuss how it can be applied to deep Gaussian process priors.
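The association step itself can be pictured as computing soft responsibilities of each observation under a set of candidate functions. A minimal numpy sketch under i.i.d. Gaussian noise; in the paper both the functions and the assignments carry GP priors and are inferred variationally, so everything here is a simplified stand-in:

```python
import numpy as np

def responsibilities(y, means, noise_std):
    """Soft data association: probability that each observation y_n was
    generated by each of K candidate functions, given their predictions
    at the same inputs and i.i.d. Gaussian noise.

    y       : (N,) observations
    means   : (N, K) per-function predictions at the inputs of y
    returns : (N, K) rows summing to 1
    """
    log_lik = -0.5 * ((y[:, None] - means) / noise_std) ** 2
    log_lik -= log_lik.max(axis=1, keepdims=True)   # stabilise the softmax
    p = np.exp(log_lik)
    return p / p.sum(axis=1, keepdims=True)

# Two hypothetical generating processes: a smooth trend and a flat noise process.
x = np.linspace(0, 1, 5)
means = np.stack([np.sin(2 * np.pi * x), np.zeros_like(x)], axis=1)
y = np.sin(2 * np.pi * x) + 0.05 * np.random.randn(5)
print(responsibilities(y, means, noise_std=0.1))
```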
|
12:00 - 12:20
Incorporating Dependencies in Spectral Kernels for Gaussian Processes (510)
Kai Chen (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences; Radboud University; Shenzhen Engineering Laboratory of Ocean Environmental Big Data Analysis and Application), Twan van Laarhoven (Radboud University; Open University of The Netherlands), Jinsong Chen (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences; Shenzhen Engineering Laboratory of Ocean Environmental Big Data Analysis and Application), Elena Marchiori (Radboud University)
Gaussian processes (GPs) are an elegant Bayesian approach to modeling an unknown function. The choice of kernel characterizes one's assumptions about how the unknown function autocovaries. It is a core aspect of GP design, since the posterior distribution can vary significantly for different kernels. The spectral mixture (SM) kernel is derived by modeling a spectral density - the Fourier transform of a kernel - with a linear mixture of Gaussian components. As such, the SM kernel cannot model dependencies between components. In this paper we use cross convolution to model dependencies between components and derive a new kernel called the Generalized Convolution Spectral Mixture (GCSM). Experimental analysis of GCSM on synthetic and real-life datasets indicates the benefit of modeling dependencies between components for reducing uncertainty and for improving performance in extrapolation tasks.
Reproducible Research
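For context, a numpy sketch of the standard SM kernel the paper builds on, plus one illustrative cross term obtained by convolving two Gaussian spectral densities (which yields another Gaussian, N(mu_i + mu_j, v_i + v_j)). The cross term's scaling and parameterisation are assumptions standing in for GCSM's actual cross-convolution construction:

```python
import numpy as np

def sm_component(tau, w, mu, v):
    """One component of the spectral mixture (SM) kernel of Wilson & Adams:
    a Gaussian spectral density N(mu, v) transforms to
    w * exp(-2 pi^2 tau^2 v) * cos(2 pi mu tau)."""
    return w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * mu * tau)

def sm_kernel(tau, weights, mus, variances):
    """Standard SM kernel: an independent sum over components, no cross terms."""
    return sum(sm_component(tau, w, m, v)
               for w, m, v in zip(weights, mus, variances))

def cross_component(tau, wi, mi, vi, wj, mj, vj):
    """Illustrative dependency term between components i and j, from the
    convolution of their spectral densities: N(mi + mj, vi + vj). GCSM's
    exact cross-convolution form differs; this only shows the shape of
    the extension."""
    return np.sqrt(wi * wj) * sm_component(tau, 1.0, mi + mj, vi + vj)

tau = np.linspace(0, 2, 5)
print(sm_kernel(tau, [1.0, 0.5], [0.5, 1.5], [0.1, 0.2]))
print(cross_component(tau, 1.0, 0.5, 0.1, 0.5, 1.5, 0.2))
```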
|
12:20 - 12:40
Deep convolutional Gaussian processes (645)
Kenneth Blomqvist (Aalto University; Helsinki Institute for Information Technology HIIT), Samuel Kaski (Aalto University; Helsinki Institute for Information Technology HIIT), Markus Heinonen (Aalto University; Helsinki Institute for Information Technology HIIT)
We propose deep convolutional Gaussian processes, a deep Gaussian process architecture with convolutional structure. The model is a principled Bayesian framework for detecting hierarchical combinations of local features for image classification. We demonstrate greatly improved image classification performance compared to current convolutional Gaussian process approaches on the MNIST and CIFAR-10 datasets. In particular, we improve state-of-the-art CIFAR-10 accuracy by over 10 percentage points.
Reproducible Research
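The convolutional structure can be sketched as a shared patch-response kernel summed over image patches, in the spirit of (single-layer) convolutional GPs; the deep model stacks such layers. All helper names below are hypothetical, and this omits inducing points and variational inference entirely:

```python
import numpy as np

def extract_patches(img, size=3):
    """All size x size patches of a 2-D image, flattened to vectors."""
    h, w = img.shape
    return np.array([img[i:i + size, j:j + size].ravel()
                     for i in range(h - size + 1)
                     for j in range(w - size + 1)])

def rbf(a, b, lengthscale=1.0):
    """RBF kernel matrix between two sets of patch vectors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def conv_gp_kernel(img1, img2, size=3, lengthscale=1.0):
    """Additive convolutional kernel between two images: a shared
    patch-response kernel summed over all pairs of patches."""
    p1, p2 = extract_patches(img1, size), extract_patches(img2, size)
    return rbf(p1, p2, lengthscale).sum()

a, b = np.random.randn(8, 8), np.random.randn(8, 8)
print(conv_gp_kernel(a, b))
```

Because the patch-response kernel is shared across all image locations, the model gains the same translation-related weight sharing that makes convolutional networks effective, while remaining a GP.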
|