Take-Away Notes for Machine Learning by Stanford University on Coursera.
Week 1, Lecture 1-3
"Introduction"
Introduces the core idea of teaching a computer to learn concepts from data without being explicitly programmed.
Definition of Machine Learning
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Supervised & Unsupervised Learning
- Supervised Learning
  - Trained with labeled data
  - Classification (discrete), Regression (continuous)
    - Multi-Class Classification
    - Decision Tree
    - Bayesian Logic
    - Linear Regression
    - Logistic Regression
    - Support Vector Machine
- Unsupervised Learning
  - Trained with unlabeled data
  - Clustering
    - KNN
    - Apriori Algorithm
"Linear Regression with One Variable"
Linear regression predicts a real-valued output based on an input value.
Model & Cost Function
Model Representation
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
Parameters: $\theta_0, \theta_1$
Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$
Model Notations:
- $x^{(i)}$ - input variable / features
- $y^{(i)}$ - output / target variable
- $m$ - number of training examples
- e.g. $(x^{(i)}, y^{(i)})$ denotes the $i$-th training example
Cost Function: Least Mean Square / Mean Squared Error
Purpose: Choose $\theta_0, \theta_1$ so that $h_\theta(x)$ is close to $y$ for the training examples $(x, y)$.
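A minimal Octave sketch of this cost function, using a toy training set made up for illustration (the variable names x, y, theta are my own, not prescribed by the lecture):
% Squared-error cost J(theta_0, theta_1) on a toy training set
x = [1; 2; 3];                 % input features x^(i)
y = [2; 4; 6];                 % target values y^(i)
theta = [0; 1.5];              % candidate parameters [theta_0; theta_1]
m = length(y);                 % number of training examples
h = theta(1) + theta(2) .* x;  % hypothesis h_theta(x) for every example
J = (1 / (2 * m)) * sum((h - y) .^ 2)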
Parameter Learning
Gradient Descent Algorithm
Mathematical Algorithm: Repeat Simultaneous Update Algorithm, i.e.
repeat until convergence {
$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ (simultaneously for $j = 0$ and $j = 1$)
}
where $\alpha$ is the learning rate. For linear regression the partial derivatives work out to
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$ if $j = 0$
$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$ if $j = 1$
The intuition behind the convergence is that as $\theta_j$ approaches a local minimum, the derivative term shrinks, so gradient descent automatically takes smaller steps even with a fixed learning rate $\alpha$.
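A short Octave sketch of batch gradient descent for this model, reusing the toy data above; the learning rate and iteration count are arbitrary choices of mine, not values from the lecture:
% Batch gradient descent with simultaneous update of theta_0 and theta_1
x = [1; 2; 3]; y = [2; 4; 6];    % toy training set
theta = [0; 0];                  % initial parameters [theta_0; theta_1]
alpha = 0.1;                     % learning rate
m = length(y);
for iter = 1:1000
  h = theta(1) + theta(2) .* x;            % current predictions
  grad0 = (1 / m) * sum(h - y);            % partial derivative w.r.t. theta_0
  grad1 = (1 / m) * sum((h - y) .* x);     % partial derivative w.r.t. theta_1
  theta = theta - alpha * [grad0; grad1];  % simultaneous update
end
theta   % approaches [0; 2], the exact fit for this toy data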
"Linear Algebra Review"
A basic understanding of linear algebra is necessary for the rest of the course, especially as we begin to cover models with multiple variables.
Matrices & Vectors
Notations:
- $A_{ij}$ refers to the element in the $i$-th row and $j$-th column of matrix $A$.
- A vector with $n$ rows is referred to as an $n$-dimensional vector.
- Octave/Matlab matrices and vectors are 1-indexed. Note that in some programming languages (e.g. Python), arrays are 0-indexed.
- Matrices are usually denoted by uppercase names while vectors are lowercase.
- "Scalar" means that an object is a single value, not a vector or matrix.
Coding Kit:
% Initializing Matrix
A = [1, 2, 3; 4, 5, 6; 7, 8, 9; 10, 11, 12]
% Initializing Vector
v = [1; 2; 3; 4; 5; 6; 7]
% Size of Matrix
[m, n] = size(A)
% Size of Vector
l = size(v)
% Indexed Term
A_23 = A(2, 3)
Addition & Scalar Multiplication
Element-wise operations. To add or subtract two matrices, their dimensions must be the same.
Coding Kit:
% Initialize Matrix A and B
A = [1, 2, 4; 5, 3, 2]
B = [1, 3, 4; 1, 1, 1]
% Initialize Scalar s
s = 2
% Element-wise Addition
add_AB = A + B
% Element-wise Subtraction
sub_AB = A - B
% Scalar Multiplication
mult_As = A * s
% Scalar Division
div_As = A / s
% What happens if we have a Matrix + scalar?
add_As = A + s % Element-wise addition, e.g. A_ij = A_ij + s
Matrix-Matrix (Vector) Multiplication
The general rule is that an m x n matrix (or vector) multiplied by an n x p matrix (or vector) results in an m x p matrix (or vector).
Matrices are NOT commutative: in general $A \times B \neq B \times A$.
Matrices are associative: $(A \times B) \times C = A \times (B \times C)$.
The identity matrix $I$ satisfies $A \cdot I = I \cdot A = A$.
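A quick Octave check of these properties, using small matrices chosen only for illustration:
% Matrix multiplication: not commutative, but associative; eye(2) is the identity
A = [1, 2; 3, 4];
B = [0, 1; 1, 0];
C = [2, 0; 0, 2];
A * B              % differs from B * A in general
B * A
(A * B) * C        % equals A * (B * C)
A * (B * C)
A * eye(2)         % multiplying by the identity returns A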
See "Essence of Linear Algebra" for more tricks of fast calculation of matrix multiplication.
Coding Kit:
% Initialize Matrix A
A = [1, 2, 3; 4, 5, 6; 7, 8, 9]
% Initialize Vector v
v = [1; 1; 1]
% Initialize 3 by 3 Identity Matrix
I = eye(3)
% Multiply A * v
Av = A * v
% Multiply A * A
AA = A * A
% Multiply A * I
AI = A * I
Inverse & Transpose
The inverse of a matrix $A$ is denoted as $A^{-1}$, and it satisfies $A A^{-1} = A^{-1} A = I$.
The transposition of a matrix $A$ is denoted as $A^T$, where $(A^T)_{ij} = A_{ji}$.
Note: An invertible (also non-singular or non-degenerate) matrix is one for which the matrix inverse exists; equivalently, its determinant is non-zero.
Coding Kit:
% Initialize Matrix A
A = [1, 2, 0; 0, 5, 6; 7, 0, 9]
% Transpose A
A_trans = A'
% Take the inverse of A
A_inv = inv(A)
% What is A^(-1)*A?
A_invA = inv(A)*A % Identity Matrix
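As a follow-up to the invertibility note above, a small sketch (with a matrix I chose to be singular) showing why the determinant matters; pinv computes the Moore-Penrose pseudoinverse, which exists even when inv does not:
% A singular matrix: the second row is a multiple of the first
S = [1, 2; 2, 4];
det(S)      % 0, so S is non-invertible
% inv(S)    % would warn that the matrix is singular to working precision
pinv(S)     % the pseudoinverse still exists and is often used instead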