Machine Learning | by Andrew Ng | Coursera

Brief Information

Key Words


Lectures

Week 1

Welcome to Machine Learning! This week, we introduce the core idea of teaching a computer to learn concepts using data—without being explicitly programmed.

We are going to start by covering linear regression with one variable. Linear regression predicts a real-valued output based on an input value. We discuss the application of linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning.

We’ll also have optional lessons that provide a refresher on linear algebra concepts. Basic understanding of linear algebra is necessary for the rest of the course, especially as we begin to cover models with multiple variables. If you feel confident in your understanding of linear algebra, feel free to take a break or help other students out in the forums.

Introduction
  • Welcome
    • 1-1 Welcome to Machine Learning!
  • Introduction
    • 1-2 Welcome
    • 1-3 What is Machine Learning?
    • 1-4 Supervised Learning
    • 1-5 Unsupervised Learning
  • Quiz 1-1: Introduction
Linear Regression with One Variable

Linear regression predicts a real-valued output based on an input value. We discuss the application of linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning.

  • Model and Cost Function
    • 1-6 Model Representation
    • 1-7 Cost Function
    • 1-8 Cost Function – Intuition I
    • 1-9 Cost Function – Intuition II
  • Parameter Learning
    • 1-10 Gradient Descent
    • 1-11 Gradient Descent Intuition
    • 1-12 Gradient Descent For Linear Regression
  • Quiz 1-2: Linear Regression with One Variable
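
The cost function and gradient descent topics listed above can be sketched in a few lines. This is a minimal illustration in Python rather than the course's Octave/MATLAB; the function names, the toy data, the learning rate `alpha` and the iteration count are all assumptions made for the example, not taken from the lectures.

```python
def compute_cost(xs, ys, theta0, theta1):
    """Squared-error cost J = (1 / 2m) * sum((h(x) - y)^2) with h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent for one-variable linear regression."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously from the same error vector.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Made-up toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1, compute_cost(xs, ys, theta0, theta1))
```

On this toy data the learned parameters approach the intercept 1 and slope 2, and the cost approaches zero. A learning rate that is too large would make the updates diverge instead.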
Linear Algebra Review

This optional module provides a refresher on linear algebra concepts. Basic understanding of linear algebra is necessary for the rest of the course, especially as we begin to cover models with multiple variables.

  • Linear Algebra Review
    • Matrices and Vectors
    • Addition and Scalar Multiplication
    • Matrix Vector Multiplication
    • Matrix Matrix Multiplication
    • Matrix Multiplication Properties
    • Inverse and Transpose
  • Practice Quiz: Linear Algebra
Week 2

Welcome to week 2! I hope everyone has been enjoying the course and learning a lot! This week we’re covering linear regression with multiple variables. We’ll show how linear regression can be extended to accommodate multiple input features. We also discuss best practices for implementing linear regression.

We’re also going to go over how to use Octave. You’ll work on programming assignments designed to help you understand how to implement the learning algorithms in practice. To complete the programming assignments, you will need to use Octave or MATLAB.

As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

Linear Regression with Multiple Variables
  • Multiple Linear Regression
    • 2-1 Multiple Features
    • 2-2 Gradient Descent for Multiple Variables
    • 2-3 Gradient Descent in Practice I – Feature Scaling
    • 2-4 Gradient Descent in Practice II – Learning Rate
    • 2-5 Features and Polynomial Regression
  • Computing Parameters Analytically
    • 2-6 Normal Equation
    • 2-7 Normal Equation Noninvertibility
  • Submitting Programming Assignments
    • 2-8 Working on and Submitting Programming Assignments
  • Quiz 2-1: Linear Regression with Multiple Variables
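
As a rough illustration of the feature scaling topic listed above, the sketch below rescales each feature column to mean 0 and standard deviation 1. The helper name `scale_features`, the use of the population standard deviation and the house data are assumptions for this example; dividing by the range of each feature is a common alternative.

```python
from statistics import mean, pstdev


def scale_features(rows):
    """Return (scaled_rows, means, stds), scaling every column to mean 0 and std 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for a constant column so we never divide by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    scaled = [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


# Hypothetical houses: [size in square feet, number of bedrooms].
houses = [[2000.0, 3.0], [1000.0, 2.0], [3000.0, 4.0]]
scaled, means, stds = scale_features(houses)
print(scaled)
```

After scaling, both columns cover a similar range, which usually lets gradient descent converge in fewer iterations. The stored `means` and `stds` would be needed to scale any new example in the same way before making a prediction.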
Octave/Matlab Tutorial

This course includes programming assignments designed to help you understand how to implement the learning algorithms in practice. To complete the programming assignments, you will need to use Octave or MATLAB. This module introduces Octave/Matlab and shows you how to submit an assignment.

  • Octave/Matlab Tutorial
    • 2-9 Basic Operations
    • 2-10 Moving Data Around
    • 2-11 Computing on Data
    • 2-12 Plotting Data
    • 2-13 Control Statements: for, while, if statements
    • 2-14 Vectorization
  • Quiz 2-2: Octave/Matlab Tutorial
  • Programming Assignment 1: Linear Regression
Week 3

Welcome to week 3! This week, we’ll be covering logistic regression. Logistic regression is a method for classifying data into discrete outcomes. For example, we might use logistic regression to classify an email as spam or not spam. In this module, we introduce the notion of classification, the cost function for logistic regression, and the application of logistic regression to multi-class classification.

We are also covering regularization. Machine learning models need to generalize well to new examples that the model has not seen in practice. We’ll introduce regularization, which helps prevent models from overfitting the training data.

As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

Logistic Regression
  • Classification and Representation
    • 3-1 Classification
    • 3-2 Hypothesis Representation
    • 3-3 Decision Boundary
  • Logistic Regression Model
    • 3-4 Cost Function
    • 3-5 Simplified Cost Function and Gradient Descent
    • 3-6 Advanced Optimization
  • Multiclass Classification
    • 3-7 Multiclass Classification: One-vs-all
  • Quiz 3-1: Logistic Regression
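
The hypothesis and cost function for logistic regression can be written down compactly. The following is a minimal sketch, assuming each example `x` already carries a leading 1 for the intercept term; the function names and the toy data are made up for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta, x):
    """h_theta(x) = sigmoid(theta . x); x[0] is assumed to be the constant 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def logistic_cost(theta, xs, ys):
    """Average of -y * log(h) - (1 - y) * log(1 - h) over the training set."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = predict_proba(theta, x)
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    return total / len(xs)


# Toy data: one negative and one positive example, each with a single feature.
xs = [[1.0, 0.5], [1.0, 2.0]]
ys = [0, 1]
print(logistic_cost([0.0, 0.0], xs, ys))   # every prediction is 0.5
print(logistic_cost([-2.5, 2.0], xs, ys))  # a better fit gives a lower cost
```

With all parameters at zero every prediction is 0.5, so the cost equals log 2 regardless of the labels; parameters that separate the two examples give a lower cost.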
Regularization

Machine learning models need to generalize well to new examples that the model has not seen in practice. In this module, we introduce regularization, which helps prevent models from overfitting the training data.

  • Solving the Problem of Overfitting
    • 3-8 The Problem of Overfitting
    • 3-9 Cost Function
    • 3-10 Regularized Linear Regression
    • 3-11 Regularized Logistic Regression
  • Quiz 3-2: Regularization
  • Programming Assignment 2: Logistic Regression
Week 4

Welcome to week 4! This week, we are covering neural networks. A neural network is a model inspired by how the brain works. Neural networks are widely used today in many applications: when your phone interprets and understands your voice commands, it is likely that a neural network is helping to understand your speech; when you cash a check, the machines that automatically read the digits also use neural networks.

Neural Networks: Representation
  • Motivations
    • 4-1 Non-linear Hypotheses
    • 4-2 Neurons and the Brain
  • Neural Networks
    • 4-3 Model Representation I
    • 4-4 Model Representation II
  • Applications
    • 4-5 Examples and Intuitions I
    • 4-6 Examples and Intuitions II
    • 4-7 Multi-class Classification
  • Quiz 4-1: Neural Networks: Representation
  • Programming Assignment 3: Multi-class Classification and Neural Networks
Week 5

In Week 5, you will be learning how to train Neural Networks. The Neural Network is one of the most powerful learning algorithms (when a linear classifier doesn’t work, this is what I usually turn to), and this week’s videos explain the ‘backpropagation’ algorithm for training these models. In this week’s programming assignment, you’ll also get to implement this algorithm and see it work for yourself.

The Neural Network programming exercise will be one of the more challenging ones of this class. So please start early and do leave extra time to get it done, and I hope you’ll stick with it until you get it to work! As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

Neural Networks: Learning
  • Cost Function and Backpropagation
    • 5-1 Cost Function
    • 5-2 Backpropagation Algorithm
    • 5-3 Backpropagation Intuition
  • Backpropagation in Practice
    • 5-4 Implementation Note: Unrolling Parameters
    • 5-5 Gradient Checking
    • 5-6 Random Initialization
    • 5-7 Putting It Together
  • Application of Neural Networks
    • 5-8 Autonomous Driving
  • Quiz 5-1: Neural Networks: Learning
  • Programming Assignment 4: Neural Network Learning
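
Gradient checking, listed above, compares an analytically computed gradient (such as one produced by backpropagation) with a numerical estimate. The sketch below uses a two-sided difference; the toy cost function stands in for a real neural network cost and, like the value of `epsilon`, is an assumption for this example.

```python
def numerical_gradient(cost, theta, epsilon=1e-4):
    """Estimate each partial derivative as (J(theta + eps) - J(theta - eps)) / (2 * eps)."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += epsilon
        minus[i] -= epsilon
        grad.append((cost(plus) - cost(minus)) / (2 * epsilon))
    return grad


# Toy cost with a gradient that is easy to derive by hand.
def toy_cost(theta):
    return theta[0] ** 2 + 3 * theta[0] * theta[1]


def toy_gradient(theta):
    return [2 * theta[0] + 3 * theta[1], 3 * theta[0]]


theta = [1.0, 2.0]
approx = numerical_gradient(toy_cost, theta)
exact = toy_gradient(theta)
max_diff = max(abs(a - e) for a, e in zip(approx, exact))
print(approx, exact, max_diff)
```

A very small `max_diff` suggests the analytic gradient is implemented correctly. Because the numerical estimate evaluates the cost twice per parameter, it is typically used only as a one-off check and switched off during actual training.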
Week 6

In Week 6, you will be learning about systematically improving your learning algorithm. The videos for this week will teach you how to tell when a learning algorithm is doing poorly, and describe the ‘best practices’ for how to ‘debug’ your learning algorithm and go about improving its performance.

We will also be covering machine learning system design. To optimize a machine learning algorithm, you’ll need to first understand where the biggest improvements can be made. In these lessons, we discuss how to understand the performance of a machine learning system with multiple parts, and also how to deal with skewed data.

When you’re applying machine learning to real problems, a solid grasp of this week’s content will easily save you a large amount of work.

Key words: skewed data, accuracy, precision, recall, F_1 score

Advice for Applying Machine Learning
  • Evaluating a Learning Algorithm
    • 6-1 Deciding What to Try Next
    • 6-2 Evaluating a Hypothesis
    • 6-3 Model Selection and Train/Validation/Test Sets
  • Bias vs. Variance
    • 6-4 Diagnosing Bias vs. Variance
    • 6-5 Regularization and Bias/Variance
    • 6-6 Learning Curves
    • 6-7 Deciding What to Do Next Revisited
  • Quiz 6-1: Advice for Applying Machine Learning
  • Programming Assignment 5: Regularized Linear Regression and Bias/Variance
Machine Learning System Design
  • Building a Spam Classifier
    • 6-8 Prioritizing What to Work On
    • 6-9 Error Analysis
  • Handling Skewed Data
    • 6-10 Error Metrics for Skewed Classes
    • 6-11 Trading Off Precision and Recall
  • Using Large Data Sets
    • 6-12 Data For Machine Learning
  • Quiz 6-2: Machine Learning System Design
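
To see why accuracy alone can mislead on skewed data, the following sketch computes accuracy, precision, recall and the F_1 score for two made-up classifiers. The labels and function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F_1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Skewed toy labels: only 2 positives among 10 examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
some_model = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy(y_true, always_negative), precision_recall_f1(y_true, always_negative))
print(accuracy(y_true, some_model), precision_recall_f1(y_true, some_model))
```

Both classifiers reach the same accuracy of 0.8, but the one that always predicts the negative class has an F_1 score of 0, while the other scores 0.5.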
Week 7

Welcome to week 7! This week, you will be learning about the support vector machine (SVM) algorithm. SVMs are considered by many to be the most powerful ‘black box’ learning algorithm and, because they pose a cleverly chosen optimization objective, are among the most widely used learning algorithms today.

As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

  • Large Margin Classification
    • 7-1 Optimization Objective
    • 7-2 Large Margin Intuition
    • 7-3 Mathematics Behind Large Margin Classification
  • Kernels
    • 7-4 Kernels I
    • 7-5 Kernels II
  • SVMs in Practice
    • 7-6 Using an SVM
  • Quiz 7-1: Support Vector Machines
  • Programming Assignment 6: Support Vector Machines
Week 8
Unsupervised Learning

Hello all! I hope everyone has been enjoying the course and learning a lot! This week, you will be learning about unsupervised learning. While supervised learning algorithms need labeled examples (x,y), unsupervised learning algorithms need only the input (x). You will learn about clustering, which is used for market segmentation and text summarization, among many other applications.

We will also be introducing Principal Components Analysis, which is used to speed up learning algorithms, and is sometimes incredibly useful for visualizing and helping you to understand your data.

As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

  • Clustering
    • 8-1 Unsupervised Learning: Introduction
    • 8-2 K-means Algorithm
    • 8-3 Optimization Objective
    • 8-4 Random Initialization
    • 8-5 Choosing the Number of Clusters
  • Quiz 8-1: Unsupervised Learning
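
The K-means algorithm listed above alternates between two steps: assigning each point to its nearest centroid, then moving each centroid to the mean of its assigned points. Here is a minimal sketch; the points, the fixed starting centroids and the iteration count are assumptions chosen to keep the example deterministic, whereas in practice the initial centroids are usually picked at random.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, centroids, iterations=10):
    """Run a fixed number of K-means iterations from the given starting centroids."""
    centroids = [list(c) for c in centroids]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Cluster assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Move-centroid step: mean of the points assigned to each cluster.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Two made-up, well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignments = k_means(points, centroids=[points[0], points[3]])
print(centroids, assignments)
```

On this data the first three points end up in one cluster and the last three in the other. With less clearly separated data, different starting centroids can lead to different final clusters.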
Dimensionality Reduction

In this module, we introduce Principal Components Analysis, and show how it can be used for data compression to speed up learning algorithms as well as for visualizations of complex datasets.

  • Motivation
    • 8-6 Motivation I: Data Compression
    • 8-7 Motivation II: Visualization
  • Principal Component Analysis
    • 8-8 Principal Component Analysis Problem Formulation
    • 8-9 Principal Component Analysis Algorithm
  • Applying PCA
    • 8-10 Reconstruction from Compressed Representation
    • 8-11 Choosing the Number of Principal Components
    • 8-12 Advice for Applying PCA
  • Quiz 8-2: Principal Component Analysis
  • Programming Assignment 7: K-Means Clustering and PCA
Week 9
Anomaly Detection

Hello all! I hope everyone has been enjoying the course and learning a lot! This week, we will be covering anomaly detection, which is widely used in fraud detection (e.g. ‘has this credit card been stolen?’). Given a large number of data points, we may sometimes want to figure out which ones vary significantly from the average. For example, in manufacturing, we may want to detect defects or anomalies. We show how a dataset can be modeled using a Gaussian distribution, and how the model can be used for anomaly detection.

We will also be covering recommender systems, which are used by companies like Amazon, Netflix and Apple to recommend products to their users. Recommender systems look at patterns of activities between different users and different products to produce these recommendations. In these lessons, we introduce recommender algorithms such as the collaborative filtering algorithm and low-rank matrix factorization.

As always, if you get stuck on the quiz and programming assignment, you should post on the Discussions to ask for help. (And if you finish early, I hope you’ll go there to help your fellow classmates as well.)

  • Density Estimation
    • 9-1 Problem Motivation
    • 9-2 Gaussian Distribution
    • 9-3 Algorithm
  • Building an Anomaly Detection System
    • 9-4 Developing and Evaluating an Anomaly Detection System
    • 9-5 Anomaly Detection vs. Supervised Learning
    • 9-6 Choosing What Features to Use
  • Multivariate Gaussian Distribution (Optional)
    • 9-7 Multivariate Gaussian Distribution
    • 9-8 Anomaly Detection using the Multivariate Gaussian Distribution
  • Quiz 9-1: Anomaly Detection
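
Modeling a dataset with a Gaussian distribution and using that model for anomaly detection can be sketched as follows. Each feature is fitted with its own univariate Gaussian and an example is flagged when its density falls below a threshold. The readings, the threshold `epsilon` and the function names are hypothetical; in practice the threshold would be tuned on labeled validation examples.

```python
import math


def fit_gaussian(rows):
    """Per-feature mean and variance (maximum-likelihood estimate, dividing by m)."""
    m = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / m for col in columns]
    variances = [sum((v - mu) ** 2 for v in col) / m for col, mu in zip(columns, means)]
    return means, variances


def density(x, means, variances):
    """p(x) as a product of independent univariate Gaussian densities.

    Assumes every variance is non-zero.
    """
    p = 1.0
    for v, mu, var in zip(x, means, variances):
        p *= math.exp(-((v - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p


def is_anomaly(x, means, variances, epsilon):
    """Flag x as anomalous when its estimated density is below epsilon."""
    return density(x, means, variances) < epsilon


# Made-up sensor readings from normally behaving units.
normal_readings = [[10.0, 5.0], [10.2, 5.1], [9.8, 4.9], [10.1, 5.0], [9.9, 5.0]]
means, variances = fit_gaussian(normal_readings)
epsilon = 1e-3  # hypothetical threshold

print(is_anomaly([10.0, 5.0], means, variances, epsilon))
print(is_anomaly([12.0, 5.0], means, variances, epsilon))
```

A reading close to the fitted means has a high density and is not flagged, while a reading far from the mean in even one feature has a density close to zero and is flagged.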
Recommender Systems

When you buy a product online, most websites automatically recommend other products that you may like. Recommender systems look at patterns of activities between different users and different products to produce these recommendations. In this module, we introduce recommender algorithms such as the collaborative filtering algorithm and low-rank matrix factorization.

  • Predicting Movie Ratings
    • 9-9 Problem Formulation
    • 9-10 Content Based Recommendations
  • Collaborative Filtering
    • 9-11 Collaborative Filtering
    • 9-12 Collaborative Filtering Algorithm
  • Low Rank Matrix Factorization
    • 9-13 Vectorization: Low Rank Matrix Factorization
    • 9-14 Implementational Detail: Mean Normalization
  • Quiz 9-2: Recommender Systems
  • Programming Assignment 8: Anomaly Detection and Recommender Systems
Week 10
Large Scale Machine Learning

Welcome to week 10! This week, we will be covering large scale machine learning. Machine learning works best when there is an abundance of data to leverage for training. With the amount of data that many websites/companies are gathering today, knowing how to handle ‘big data’ is one of the most sought after skills in Silicon Valley.

  • Gradient Descent with Large Datasets
    • 10-1 Learning With Large Datasets
    • 10-2 Stochastic Gradient Descent
    • 10-3 Mini-Batch Gradient Descent
    • 10-4 Stochastic Gradient Descent Convergence
  • Advanced Topics
    • 10-5 Online Learning
    • 10-6 Map Reduce and Data Parallelism
  • Quiz 10-1: Large Scale Machine Learning
Week 11
Application Example: Photo OCR

Congratulations on making it to the eleventh and final week! This week, we will walk you through a complex, end-to-end application of machine learning: Photo OCR. Identifying and recognizing objects, words, and digits in an image is a challenging task. We discuss how a pipeline can be built to tackle this problem and how to analyze and improve the performance of such a system.

  • Photo OCR
    • 11-1 Problem Description and Pipeline
    • 11-2 Sliding Windows
    • 11-3 Getting Lots of Data and Artificial Data
    • 11-4 Ceiling Analysis: What Part of the Pipeline to Work on Next
  • Quiz 11-1: Application: OCR

Unsolved Questions

  • [Week 3] Why is the cost function of logistic regression $\mathrm{cost}(h_{\theta}(x), y) = -y \log(h_{\theta}(x)) - (1-y) \log(1-h_{\theta}(x))$?
  • [Week 5] How to derive the backpropagation algorithm?
  • [Week 11] How to compute the accuracy of each component in ceiling analysis?

My Supplementary References
