{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, I present a practical example of what is actually happening when you train a Gaussian process model. I don't suggest you ever use this implementation to do actual machine learning. I have coded everything with readability in mind, at the cost of efficiency. There are also a bunch of cool linear algebra and numerical stability tricks that I have skipped in the interest of simplicity.\n", "\n", "The leading libraries for working with GPs (in Python) are:\n", "* [GPy](https://sheffieldml.github.io/GPy/) - Probably the most popular\n", "* [George](https://george.readthedocs.io/en/latest/) - Written by an astrophysicist I had beers with once, so maybe I'm biased, but this is the one I used to use, as it gives the user a lot of flexibility and plays well with stochastic sampling methods\n", "* [Scikit-Learn](https://scikit-learn.org/stable/modules/gaussian_process.html) - Used to be considered a poor implementation; I'm not sure if that is still true" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, some notation. We have:\n", "\n", "* $x_t$ - the inputs for our training data\n", "* $y_t$ - the outputs for our training data\n", "* $x_p$ - the inputs for which we want to predict the output\n", "* $y_p$ - the (unknown) outputs that we want to predict\n", "\n", "We generate some training data:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "