Machine Learning

Definition

Machine Learning
: A computer program is said to learn from experience $E$ with respect to some class of tasks $T$ and performance measure $P$ , if its performance at tasks in $T$ , as measured by $P$ , improves with experience $E$

[r] Michell, Machine Learning 1997
From a scientific and philosophical point of view, machine learning is interesting because developing our understanding of machine learning entails developing our understanding of the principles that underlie intelligence

Task T

ML tasks are usually described in terms of how the ML system should process an example.

An example is a collection of features that have been quantitatively measured from some object or event that we want the ML system to process. We typically represent an example as a vector $x \in R^{n}$ where each entry $x_{i}$ of the vector is an feature.

The features of an image are usually the values of the pixels in the image.

Most common ML tasks:

Classification
Classification with missing inputs
- When some of the inputs may be missing, the learning algorithm must learn a set of functions, each corresponding to classifying x with a different subset of its inputs missing
- With n input variables, we can now obtain all 2^n^ different classification functions needed for each possible set of missing inputs, but we only need to learn a single function describing the joint probability distribution
Regression
Structured output
- Transcription
- Machine translation
Anomaly detection
Synthesis and sampling
Imputation of missing values
Denoising
Density estimation or probability mass function estimation
- Probability distribution estimated can be used to solve other tasks, such as the missing value imputation task

Performance Measure P

In order to evaluate the abilities of a machine learning algorithm, we must design a quantitative measure of its performance. Usually this performance measure P is specific to the task T being carried out by the system.

Most common performance measures:

Accuracy or error rate: classification, structured output
average log-probability (i.e. Cross-Entropy) the model assigns to some examples: density estimation

Usually we are interested in how well the machine learning algorithm performs on data that it has not seen before. We therefore evaluate these performance measures using a test set of data that is separate from the data used for training the machine learning system.

Experience E

The experience of most ML algorithms is the dataset, a collection of many examples/data points, and an example is a collection of features.

ML algorithms can be broadly categorized as Unsupervised Learning or Supervised Learning by what kind of experience they are allowed to have during the learning process.

Unsupervised Learning
- learn useful properties of the structure of this dataset
- Examples: density estimation, synthesis, denoising, clustering
Supervised Learning
- each example is also associated with a label/target
- Example: classification, regression, structured output
Semi-Supervised Learning

Blurry line between UL and SL

The chain rule of probability states that for a vector $x \in R^{n}$ , the joint distribution can be decomposed as

p (x) = \prod_{i = 1}^{n} p (x_{i} ∣ x_{1}, \dots, x_{i - 1})

This decomposition means that we can solve the ostensibly unsupervised problem of modeling $p (x)$ by splitting it into $n$ SL problems.

Alternatively we can solve the SL problem of learning $p (y | x)$ by applying UL to learn the joint distribution $p (x, y)$ and inferring

p (y ∣ x) = \frac{p (x, y)}{\sum_{y^{'}} p (x, y^{'})}

Describing a Dataset

Matrix - a different example in each row, a different feature in each column
- Requires that every example has the same features vectors of the same size
- With a label vector in SL
Set - examples as elements

Methods

General Concepts & Techniques

Course Log

2023-01-17

Supervised Learning
- Two tasks
  - Regression: Using a set of inputs, predict real-valued output
  - Classification: Using a set of inputs, predict a discrete label (aka class)
- Tools
  - Deep Learning
Unsupervised Learning
- Task: uncover the structure in the data
  - NO correct answer; no supervise
- Applications
  - predictions
  - recommendations
  - efficient data exploration
    - Learn the dominant topics from a set of news articles.
A probabilistic model is a set of probability distributions, $p (x | θ)$

graph TD
B[0. Build Model] --> C[2. Infer hidden variables]
A[1. Data] --Optimization--> C
C --> D[3. Predict & Explore]
A --Supervise--> D

Differences between Supervised Learning and Unsupervised Learning mainly lie in blocks 1 and 3
Differences between probabilistic and non-probabilistic approaches mainly lie in blocks 0 and 2

Gaussian Distribution

We want to estimate the distribution of some data.

Block 0. We assume it is a multivariate Gaussian distribution
Block 1. Sample data
Block 2. Maximum Likelihood Estimation
Block 3. Predict the properties of the whole dataset

2023-01-19

Linear Regression

2023-01-24

Ridge Regression

2023-01-26

2023-01-31

2023-02-02

2023-02-07

Classification
- Nearest Neighbor
Statistical Learning
- How to find an accurate regression function/ classifier?
- Key assumption: data (feature and label) are i.i.d. from a distribution, then the past should look like the future
Generative vs Discriminative Model

2023-02-09

Linear Classifier
- Perceptron Algorithm

2023-02-14

2023-02-016

2023-02-21/23

Support Vector Machine

2023-02-28

by zcysxy

Machine Learning

Definition

Task T

Performance Measure P

Experience E

Describing a Dataset

Methods

General Concepts & Techniques

Course Log

2023-01-17

2023-01-19

2023-01-24

2023-01-26

2023-01-31

2023-02-02

2023-02-07

2023-02-09

2023-02-14

2023-02-016

2023-02-21/23

2023-02-28

2023-03-2

2023-03-21

2023-03-23

2023-03-28

2023-03-30

2023-04-04

2023-04-06

2023-04-11

2023-04-13

2023-04-18

2023-04-20