(CS/CNS/EE 155) Machine Learning & Data Mining

2015/2016 Winter Term (previous year)

Course Description

Prerequisite: background in algorithms and statistics (CS/CNS/EE/NB 154 or CS/CNS/EE 156a or instructor’s permission)

This course covers popular methods in machine learning and data mining, with an emphasis on developing a working understanding of how to apply them in practice. It also covers the core foundational concepts that underpin and motivate modern machine learning and data mining approaches. The course is research-oriented and will cover recent research developments.

Course Survey Results: link

Course Details

Late Homework Policy

Students are allowed 48 free late hours for submitting homework and miniprojects. Once the free late hours are used up, submissions up to one day late receive a 50% penalty, and submissions more than one day late will not be accepted. Please state at the top of your submission how many late hours you are using.


Yisong Yue yyue@caltech.edu

Teaching Assistants

Lucy Yin lyin@caltech.edu
Ritvik Mishra rmishra@caltech.edu
Kevin Tang ktang@caltech.edu
Fabian Boemer fboemer@caltech.edu

Office Hours

Optional Textbooks

  • Machine Learning: a Probabilistic Perspective, by Kevin Murphy
  • Convex Optimization: Algorithms and Complexity (Free Version), by Sebastien Bubeck
  • A Course in Machine Learning, by Hal Daume III
Because this is an advanced-level course, all relevant course material can be learned from research papers and supplementary lecture notes. However, these books are excellent references, and I will refer to various chapters throughout the course.


    Lectures & Recitation Schedule

    Note: schedule is subject to change.

    1/5/2016 Lecture: Administrivia, Basics, Bias/Variance, Overfitting [slides]
    1/7/2016 Lecture: Perceptron, Gradient Descent [slides]
        Further Reading: Daume Chapter 3; Mistake Bounds for Perceptron [link]; AdaGrad [link]; Stochastic Gradient Descent Tricks [link]; Bubeck Chapter 3
    1/7/2016 Recitation: Introduction to Python for Machine Learning [slides][SciPy Tutorial]
    1/12/2016 Lecture: SVMs, Logistic Regression, Neural Nets, Loss Functions [slides]
    1/14/2016 Lecture: Regularization, Lasso [slides]
        Further Reading: Murphy 13.3
    1/14/2016 Recitation: Linear Algebra [slides]
        Further Reading: The Matrix Cookbook [link]
    1/19/2016 Lecture: Decision Trees, Bagging, Random Forests [slides]
        Further Reading: Overview of Decision Trees [pdf]; Overview of Bagging [pdf]; Overview of Random Forests [pdf]
    1/21/2016 Lecture: Boosting, Ensemble Selection [slides]
        Further Reading: Schapire's Overview of Boosting [pdf]
    1/21/2016 Recitation: Probability [slides]
    1/26/2016 Lecture: Probabilistic Models, Naive Bayes [slides]
        Further Reading: Murphy 3.5
    1/28/2016 Lecture: Sequence Prediction, Hidden Markov Models [slides][notes]
        Further Reading: Murphy 17.3-17.5
    1/28/2016 Recitation: Viterbi Review [slides]
    2/2/2016 Lecture: Conditional Random Fields [slides][notes]
        Further Reading: Hanna Wallach's intro to CRFs [link]
    2/4/2016 Lecture: Conditional Random Fields Continued, General Structured Prediction [slides][notes]
        Further Reading: Hanna Wallach's intro to CRFs [link]
    2/4/2016 Recitation: NO RECITATION
    2/9/2016 Lecture: Recent Applications [slides]
        Further Reading: Tutorial on Learning Reductions [link]; Data-Driven Animation Project [link]
    2/11/2016 Lecture: NO LECTURE
    2/11/2016 Recitation: Conditional Random Field Gradient Descent [slides]
    2/16/2016 Lecture: Unsupervised Learning, Clustering, Dimensionality Reduction [slides]
    2/18/2016 Lecture: Latent Factor Models, Non-Negative Matrix Factorization [slides]
        Further Reading: Original Netflix Paper [link]
    2/18/2016 Recitation: NO RECITATION
    2/23/2016 Lecture: Embeddings [slides]
        Further Reading: Locally Linear Embedding [link]; Playlist Embedding [link]; word2vec [link]
    2/25/2016 Lecture: Deep Learning [slides]
    2/25/2016 Recitation: Advanced Optimization [notes]
    3/1/2016 Lecture: Recent Applications [slides]
        Further Reading: Sparse Multiclass Cancer Detection [link]; Badge Dictionary Learning from Twitter [link]; Learning Embedding of Visual Style [link]
    3/3/2016 Lecture: Survey of Advanced Topics [slides]
    3/3/2016 Recitation: NO RECITATION
    3/8/2016 Lecture: NO LECTURE

    Additional References