Generative Models

Introduction

Generative model is a type of ML model that generates new data samples that are similar to the training data. It models the probability distribution of the training data and generates new data points accordingly. Generative models are often used in image generation, language modeling, speech synthesis, and music generation. In particular, they are especially useful when there is a limited amount of training data.

Generative models aim at developing the joint probability distribution $P(X,Y)$, either directly or by computing $P(Y,X)$ and $P(X)$ and then inferring the conditional probabilities required to classify newer data. This complete probabilistic description/structure of the dataset enables the model to generate data.

Overview

Generative models primarily interested in parametric approximations of the data distribution, which summarize all the information about the dataset $D$ in a finite set of parameters. We can think of the task of learning a generative model as picking the parameters within a family of model distributions that minimizes some notion of distance between the model distribution and the data distribution. Mathematically, our goal is the following optimization problem

$$ \min_{\theta\in M} d(p_{\text{data}}, p_{\theta}) $$

where $M$ is the model family, and $d$ is notion of distance. So we can focus on the following questions:

What is representation of $M$
What is $d$
What is the optimization procedure for minimizing $d$

Inference tasks

While there are a wide range of applications to which generative models have been used, we can identify three fundamental inference queries for evaluating a generative models:

Density estimation: Given a datapoint $x$, what is the probability assigned by the model?
Sampling: How can we generate novel data from the model distribution?
Unsupervised representation learning: How can we learn meaningful feature representations for a datapoint $x$?

For example, if we are learning dog images, we can expect a good generative model to work as follows. We expect the probability for dog images to be high and low otherwise. We expect the model to generate novel images of dogs beyond the ones we observe. Finally, representation learning can help discover high-level structure in the data such as breed of dogs.

Types of Generative Models

Variational Autoencoder (VAE)
Generative Adversarial Network (GAN)