2015/2016 Winter Term (previous year)

Prerequisite: background in algorithms and statistics (CS/CNS/EE/NB 154 or CS/CNS/EE 156a or instructor's permission)

This course will cover popular methods in machine learning and data mining, with an emphasis on developing a working understanding of how to apply these methods in practice. It will also cover the core foundational concepts that underpin and motivate modern machine learning and data mining approaches. The course will be research-oriented and will cover recent research developments.

**Course Survey Results:** link

- Lectures on Tu/Th at 2:30pm-4pm in Annenberg 105
- Recitations on Th at 7:30pm-9pm (usually lasting 1 hour), in Annenberg 105
- We will be using Moodle for managing homeworks and grades [link]
- We will be using Piazza for discussion forums and announcements [link]
- 6 Homeworks (worth approximately 60% of final grade)
- 2 Miniprojects (worth approximately 24% of final grade)
- Final Exam (worth approximately 16% of final grade)
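As a rough illustration of how the approximate weights above combine, here is a minimal sketch. The function name `course_grade`, the use of fractional scores in [0, 1], and equal weighting of items within a category are assumptions made for illustration; the page itself only gives the three category weights and calls them approximate.

```python
# Category weights from the list above (the page calls them approximate).
WEIGHTS = {"homework": 0.60, "miniproject": 0.24, "final": 0.16}


def course_grade(homework_scores, miniproject_scores, final_score):
    """Return a weighted course score in [0, 1].

    Assumptions (not stated on this page): every score is a fraction
    in [0, 1], and items within a category count equally.
    """
    hw_avg = sum(homework_scores) / len(homework_scores)
    mp_avg = sum(miniproject_scores) / len(miniproject_scores)
    return (WEIGHTS["homework"] * hw_avg
            + WEIGHTS["miniproject"] * mp_avg
            + WEIGHTS["final"] * final_score)


# Hypothetical scores: six homeworks, two miniprojects, one final exam.
example = course_grade([0.9] * 6, [0.8, 1.0], 0.75)
```

Under these assumptions each homework carries about 10% of the final grade and each miniproject about 12%.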

Students are allowed 48 free late hours for submitting homeworks and miniprojects. After the free late hours are used up, a 50% penalty will be applied to submissions that are one day late, and submissions more than one day late will not be accepted. Please state how many late hours you are using at the top of your submission.
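One way to read the late policy is sketched below. This is an interpretation, not an official rule: it assumes the 48 free hours are a single bank shared across all assignments, that free hours are spent first, that "one day late" means up to 24 hours beyond the free hours, and that a rejected submission does not consume free hours. The function name `apply_late_policy` is hypothetical.

```python
FREE_LATE_HOURS = 48       # total free late hours stated in the policy
PENALTY_WINDOW_HOURS = 24  # "one day late" (assumed to mean 24 hours)
PENALTY_FACTOR = 0.5       # 50% penalty


def apply_late_policy(hours_late, free_hours_left):
    """Return (score_multiplier, free_hours_left_afterwards).

    Assumption: free late hours are spent first, and only the lateness
    beyond them counts toward the one-day penalty window.
    """
    if hours_late <= 0:
        return 1.0, free_hours_left          # on time
    used = min(hours_late, free_hours_left)
    excess = hours_late - used
    if excess == 0:
        return 1.0, free_hours_left - used   # covered by free hours
    if excess <= PENALTY_WINDOW_HOURS:
        return PENALTY_FACTOR, free_hours_left - used
    # More than one day past the free hours: not accepted.
    # Assumption: the free-hour bank is left untouched in this case.
    return 0.0, free_hours_left


# A submission 10 hours late with a full bank keeps full credit.
multiplier, remaining = apply_late_policy(10, FREE_LATE_HOURS)
```

Whether the bank is per-assignment or per-term, and how partial days are rounded, are exactly the details worth confirming with the course staff.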

Yisong Yue yyue@caltech.edu

- Lucy Yin (lyin@caltech.edu)
- Ritvik Mishra (rmishra@caltech.edu)
- Kevin Tang (ktang@caltech.edu)
- Fabian Boemer (fboemer@caltech.edu)

- Homework 1 (due Jan 12th at 2pm via Moodle) [assignment][dataset]
- Homework 2 (due Jan 19th at 2pm via Moodle) [assignment][datasets]
- Homework 3 (due Jan 26th at 2pm via Moodle) [assignment][dataset]
- Homework 4 (due Feb 2nd at 2:30pm via Moodle) [assignment][datasets]
- Miniproject 1 (report due Feb 11th at 7pm via Moodle) [link] [description]
- Homework 5 (due Feb ~~16th~~ **18th** at 2:30pm via Moodle) [assignment][dataset]
- Homework 6 (due March 1st at 2:30pm via Moodle) [assignment][dataset][word2vec paper]
- Miniproject 2 (due March 10th at 5pm via Moodle) [assignment][datasets]

Note: schedule is subject to change.

| Date | Session | Topic | Materials | Further Reading |
|---|---|---|---|---|
| 1/05/2016 | Lecture | Administrivia, Basics, Bias/Variance, Overfitting | [slides] | |
| 1/07/2016 | Lecture | Perceptron, Gradient Descent | [slides] | Daume Chapter 3; Mistake Bounds for Perceptron [link]; AdaGrad [link]; Stochastic Gradient Descent Tricks [link]; Bubeck Chapter 3 |
| 1/07/2016 | Recitation | Introduction to Python for Machine Learning | [slides][SciPy Tutorial] | |
| 1/12/2016 | Lecture | SVMs, Logistic Regression, Neural Nets, Loss Functions | [slides] | |
| 1/14/2016 | Lecture | Regularization, Lasso | [slides] | Murphy 13.3 |
| 1/14/2016 | Recitation | Linear Algebra | [slides] | The Matrix Cookbook [link] |
| 1/19/2016 | Lecture | Decision Trees, Bagging, Random Forests | [slides] | Overview of Decision Trees [pdf]; Overview of Bagging [pdf]; Overview of Random Forests [pdf] |
| 1/21/2016 | Lecture | Boosting, Ensemble Selection | [slides] | Schapire's Overview of Boosting [pdf] |
| 1/21/2016 | Recitation | Probability | [slides] | |
| 1/26/2016 | Lecture | Probabilistic Models, Naive Bayes | [slides] | Murphy 3.5 |
| 1/28/2016 | Lecture | Sequence Prediction, Hidden Markov Models | [slides][notes] | Murphy 17.3--17.5 |
| 1/28/2016 | Recitation | Viterbi Review | [slides] | |
| 2/2/2016 | Lecture | Conditional Random Fields | [slides][notes] | Hanna Wallach's intro to CRFs [link] |
| 2/4/2016 | Lecture | Conditional Random Fields Continued, General Structured Prediction | [slides][notes] | Hanna Wallach's intro to CRFs [link] |
| 2/4/2016 | Recitation | NO RECITATION | | |
| 2/9/2016 | Lecture | Recent Applications | [slides] | Tutorial on Learning Reductions [link]; Data-Driven Animation Project [link] |
| 2/11/2016 | Lecture | NO LECTURE | | |
| 2/11/2016 | Recitation | Conditional Random Field Gradient Descent | [slides] | |
| 2/16/2016 | Lecture | Unsupervised Learning, Clustering, Dimensionality Reduction | [slides] | |
| 2/18/2016 | Lecture | Latent Factor Models, Non-Negative Matrix Factorization | [slides] | Original Netflix Paper [link] |
| 2/18/2016 | Recitation | NO RECITATION | | |
| 2/23/2016 | Lecture | Embeddings | [slides] | Locally Linear Embedding [link]; Playlist Embedding [link]; word2vec [link] |
| 2/25/2016 | Lecture | Deep Learning | [slides] | |
| 2/25/2016 | Recitation | Advanced Optimization | [notes] | |
| 3/1/2016 | Lecture | Recent Applications | [slides] | Sparse Multiclass Cancer Detection [link]; Badge Dictionary Learning from Twitter [link]; Learning Embedding of Visual Style [link] |
| 3/3/2016 | Lecture | Survey of Advanced Topics | [slides] | |
| 3/3/2016 | Recitation | NO RECITATION | | |
| 3/8/2016 | Lecture | NO LECTURE | | |

- Stochastic Gradient Descent Tricks [link]
- Papers on Ensemble Selection. [paper1][paper2][KDD Cup Report]
- Practical Bayesian Optimization for Efficient Grid Search of Tuning Parameters. [paper][software]
- Reasonably Accessible Paper on Regularized Multi-Task Learning. [paper]
- Overview of Topic Models. [paper]
- Overview of Structural SVMs. [paper]
- A Brief Overview of Deep Learning. [link]
- Tutorial on Learning Reductions. [pdf]
- The Matrix Cookbook (a lot of useful properties of matrices). [link]
- Learning Reductions Overview. [paper]