Machine Learning
Definition
Machine Learning
: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E
-
[r] Mitchell, Machine Learning, 1997
-
From a scientific and philosophical point of view, machine learning is interesting because developing our understanding of machine learning entails developing our understanding of the principles that underlie intelligence
Task T
ML tasks are usually described in terms of how the ML system should process an example.
An example is a collection of features that have been quantitatively measured from some object or event that we want the ML system to process. We typically represent an example as a vector
- The features of an image are usually the values of the pixels in the image.
Most common ML tasks:
- Classification
- Classification with missing inputs
- When some of the inputs may be missing, the learning algorithm must learn a set of functions, each corresponding to classifying x with a different subset of its inputs missing
- With n input variables, there are 2^n^ different classification functions, one for each possible set of missing inputs, but we only need to learn a single function describing the joint probability distribution and marginalize out the missing inputs
- Regression
- Structured output
- Transcription
- Machine translation
- Anomaly detection
- Synthesis and sampling
- Imputation of missing values
- Denoising
- Density estimation or probability mass function estimation
- Probability distribution estimated can be used to solve other tasks, such as the missing value imputation task
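The point about a single joint distribution covering every pattern of missing inputs can be sketched with a toy discrete table (the probabilities below are made up for illustration):

```python
import numpy as np

# Hypothetical joint distribution p(x1, x2, y) over binary variables,
# stored as a 2x2x2 table indexed as joint[x1, x2, y].
joint = np.array([[[0.10, 0.05], [0.05, 0.20]],
                  [[0.20, 0.05], [0.05, 0.30]]])
assert np.isclose(joint.sum(), 1.0)

def classify(x1=None, x2=None):
    """Return argmax_y p(y | observed inputs), marginalizing missing ones."""
    p = joint
    # Condition on each observed variable; sum out each missing one.
    p = p[x1] if x1 is not None else p.sum(axis=0)
    p = p[x2] if x2 is not None else p.sum(axis=0)
    return int(np.argmax(p))  # p is now an unnormalized distribution over y

print(classify(x1=1, x2=0))  # both inputs observed
print(classify(x2=0))        # x1 missing: uses p(x2, y) = sum_x1 p(x1, x2, y)
print(classify())            # all inputs missing: falls back to the prior p(y)
```

One table thus serves all 2^n^ missing-input patterns; each classifier is obtained by marginalization rather than being learned separately.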
Performance Measure P
In order to evaluate the abilities of a machine learning algorithm, we must design a quantitative measure of its performance. Usually this performance measure P is specific to the task T being carried out by the system.
Most common performance measures:
- Accuracy or error rate: classification, structured output
- average log-probability the model assigns to some examples (the negative of the cross-entropy): density estimation
Usually we are interested in how well the machine learning algorithm performs on data that it has not seen before. We therefore evaluate these performance measures using a test set of data that is separate from the data used for training the machine learning system.
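Both performance measures are one-liners on a held-out test set; a minimal sketch with made-up predictions and model probabilities:

```python
import numpy as np

# Classification: accuracy / error rate on a (hypothetical) test set.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])
accuracy = np.mean(y_true == y_pred)
error_rate = 1.0 - accuracy

# Density estimation: average log-probability the model assigns to
# held-out examples (made-up model probabilities here).
p_model = np.array([0.8, 0.6, 0.9, 0.7])
avg_log_prob = np.mean(np.log(p_model))

print(accuracy, error_rate)   # fraction right vs. fraction wrong
print(avg_log_prob)           # higher (closer to 0) is better
```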
Experience E
The experience of most ML algorithms is a dataset: a collection of many examples (data points), where each example is a collection of features.
ML algorithms can be broadly categorized as Unsupervised Learning or Supervised Learning by what kind of experience they are allowed to have during the learning process.
- Unsupervised Learning
- learn useful properties of the structure of this dataset
- Examples: density estimation, synthesis, denoising, clustering
- Supervised Learning
- each example is also associated with a label/target
- Example: classification, regression, structured output
- Semi-Supervised Learning
The chain rule of probability states that for a vector $\mathbf{x} \in \mathbb{R}^n$, the joint distribution decomposes as $p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$.
This decomposition means that we can solve the ostensibly unsupervised problem of modeling $p(\mathbf{x})$ by splitting it into $n$ supervised learning problems.
Alternatively we can solve the SL problem of learning $p(y \mid \mathbf{x})$ by using unsupervised learning to model the joint distribution $p(\mathbf{x}, y)$ and then inferring $p(y \mid \mathbf{x}) = \frac{p(\mathbf{x}, y)}{\sum_{y'} p(\mathbf{x}, y')}$
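Both directions — chain-rule factorization of a joint, and conditional inference $p(y \mid x) = p(x, y) / \sum_{y'} p(x, y')$ — can be checked numerically on a toy joint table (hypothetical numbers):

```python
import numpy as np

# Toy joint p(x, y) over binary variables, indexed as p[x, y].
p = np.array([[0.1, 0.3],
              [0.2, 0.4]])

# Chain rule: p(x, y) = p(x) * p(y | x).
p_x = p.sum(axis=1)                      # marginal p(x)
p_y_given_x = p / p_x[:, None]           # conditional p(y | x)
reconstructed = p_x[:, None] * p_y_given_x
assert np.allclose(reconstructed, p)     # factorization recovers the joint

# Supervised from unsupervised: infer p(y | x) = p(x, y) / sum_y' p(x, y').
print(p_y_given_x)                       # each row sums to 1
```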
Describing a Dataset
- Matrix - a different example in each row, a different feature in each column
- Requires that every example be a feature vector of the same size
- With a label vector in SL
- Set - examples as elements
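The matrix layout (the design matrix) is a one-liner in NumPy; the values below are hypothetical:

```python
import numpy as np

# Design matrix: one example per row, one feature per column.
X = np.array([[5.1, 3.5],
              [4.9, 3.0],
              [6.2, 3.4]])   # 3 examples, 2 features each

# In supervised learning, a label vector accompanies X: y[i] labels X[i].
y = np.array([0, 0, 1])

print(X.shape)  # rows index examples, columns index features
```

The set representation is needed when examples have feature vectors of different sizes (e.g. images of different resolutions), where a single matrix cannot hold them.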
Methods
- Supervised Learning
- Ensemble Learning
- Unsupervised Learning
- Meta-Learning
- Semi-Supervised Learning
- Reinforcement Learning
General Concepts & Techniques
- Mercer Kernel & Reproducing Kernel Hilbert Space
- Dimensionality Reduction
- Overfitting and Underfitting
- Hyperparameter
- Cross-Validation
- Estimation
Course Log
2023-01-17
- Supervised Learning
- Two tasks
- Regression: Using a set of inputs, predict real-valued output
- Classification: Using a set of inputs, predict a discrete label (aka class)
- Tools
- Unsupervised Learning
- Task: uncover the structure in the data
- NO correct answer; no supervision
- Applications
- predictions
- recommendations
- efficient data exploration
- Learn the dominant topics from a set of news articles.
- A probabilistic model is a set of probability distributions, one of which is assumed to have generated the observed data
```mermaid
graph TD
    B[0. Build Model] --> C[2. Infer hidden variables]
    A[1. Data] --Optimization--> C
    C --> D[3. Predict & Explore]
    A --Supervise--> D
```
- Differences between Supervised Learning and Unsupervised Learning mainly lie in blocks 1 and 3
- Differences between probabilistic and non-probabilistic approaches mainly lie in blocks 0 and 2
We want to estimate the distribution of some data.
- Block 0. We assume it is a multivariate Gaussian distribution
- Block 1. Sample data
- Block 2. Maximum Likelihood Estimation
- Block 3. Predict the properties of the whole dataset
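The four blocks can be sketched end to end for this Gaussian example (the "true" parameters below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Block 0: build model — assume the data follow a multivariate Gaussian.
# Block 1: data — here sampled from a hypothetical "true" Gaussian.
true_mu = np.array([1.0, -2.0])
true_cov = np.array([[2.0, 0.5],
                     [0.5, 1.0]])
X = rng.multivariate_normal(true_mu, true_cov, size=5000)

# Block 2: maximum likelihood estimation. For a Gaussian, the MLE is the
# sample mean and the (1/N) sample covariance.
mu_hat = X.mean(axis=0)
cov_hat = (X - mu_hat).T @ (X - mu_hat) / len(X)

# Block 3: use the fitted distribution to describe/predict properties
# of the whole dataset.
print(mu_hat)   # close to true_mu
print(cov_hat)  # close to true_cov
```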
2023-01-19
2023-01-24
2023-01-26
2023-01-31
2023-02-02
2023-02-07
- Classification
- Statistical Learning
- How do we find an accurate regression function or classifier?
- Key assumption: the data (features and labels) are drawn i.i.d. from some distribution, so the past should look like the future
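The i.i.d. assumption is what makes training error predictive of test error; a minimal sketch with a hypothetical 1-D threshold classifier:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    """Draw n i.i.d. (feature, label) pairs from one fixed distribution."""
    y = rng.integers(0, 2, size=n)             # labels 0/1
    x = rng.normal(loc=2.0 * y - 1.0, size=n)  # feature mean -1 or +1
    return x, y

# "Past" (training) and "future" (test) data come from the same distribution.
x_train, y_train = sample(2000)
x_test, y_test = sample(2000)

# Learn a simple threshold: midpoint between the class-conditional means.
thr = (x_train[y_train == 0].mean() + x_train[y_train == 1].mean()) / 2
train_err = np.mean((x_train > thr) != y_train)
test_err = np.mean((x_test > thr) != y_test)

print(train_err, test_err)  # close to each other, because data are i.i.d.
```

If the test data came from a different distribution, the two error rates could diverge arbitrarily, and nothing learned from the past would be guaranteed to transfer.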