CS 159: Advanced Topics in Machine Learning (Spring 2016)

Course Description

This course will cover a mixture of the following topics:

Course Details


Instructor

Yisong Yue       yyue@caltech.edu

Teaching Assistants

Stephan Zheng    stephan@caltech.edu
Hoang Le         hmle@caltech.edu

Office Hours

Datasets & Testbeds

These may be of interest for the final project.

Presentation Schedule

Note: schedule is subject to change.

Date Papers Presenters                     Materials
3/29/2016  Introduction & Administrivia; Follow the Leader Algorithm & Perceptron
Presenter: Yisong Yue [slides]
  • Perceptron Mistake Bounds (Sections 1 & 2)
  • Online Learning (Chapter 1)

3/31/2016  Online Learning with Experts & Multiplicative Weights Algorithm
Presenter: Stephan Zheng [slides]
  • The Multiplicative Weights Update Method: A Meta-Algorithm and Applications (Section 1.1, Section 2.0 & Section 2.1)

4/5/2016  Online Convex Optimization
Presenters: Ellen Feldman, Gautam Goel, Milan Cvitkovic
Mentor: Yisong
  • Online Learning and Online Convex Optimization (primarily Section 2.4, although you may need to read the beginning of Section 2 for notation)

4/7/2016  Multi-armed Bandits & UCB1 Algorithm
Presenters: Connor Lee, Ritvik Mishra, Hoang Le
Mentor: Hoang
  • Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems (Section 1)
  • Finite-time Analysis of the Multiarmed Bandit Problem (Primarily UCB1 Algorithm & Theorem 1)

4/12/2016  Linear Bandits & Applications
Presenters: Feng Bi, Joon Sik Kim, Leiya Ma, Pengchuan Zhang
Mentor: Yisong
  • A Contextual-Bandit Approach to Personalized News Article Recommendation
  • course notes
  • Improved Algorithms for Linear Stochastic Bandits (Theorem 2 & Theorem 3)

4/14/2016  Monte Carlo Tree Search & Applications
Presenters: Suraj Nair, Peter Kundzicz, Vansh Kumar, Kevin An
Mentor: Stephan
  • A Survey of Monte Carlo Tree Search Methods (Chapter 3, although you may need parts of Chapter 2 for background)
  • Mastering the game of Go with deep neural networks and tree search (focus on the application of tree search, not the details of deep learning)

4/19/2016  Q-Learning for Reinforcement Learning & Applications
Presenters: Timothy Chou, Charlie Tong, Vincent Zhuang
Mentor: Stephan
  • course notes
  • Playing Atari with Deep Reinforcement Learning (focus on the application of Q-learning and epsilon-greedy exploration, not the details of deep learning)
  • Convergence of Stochastic Iterative Dynamic Programming Algorithms

4/21/2016  Apprenticeship Learning for Reinforcement Learning & Applications
Presenters: Nick Haliday, Audrey Huang, Ritwik Anand, Dryden Bouamalay
Mentor: Hoang
  • course notes
  • An Application of Reinforcement Learning to Aerobatic Helicopter Flight
  • Exploration and Apprenticeship Learning in Reinforcement Learning (theory reference)
  • Apprenticeship Learning via Inverse Reinforcement Learning (theory reference)

4/26/2016  Imitation Learning
Presenters: Richard Zhu, Andrew Kang
Mentor: Hoang
  • A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

4/28/2016  Active Learning for Supervised Learning
Presenters: Daniel Gu, Matthew Morgan, Keegan Ryan, Matthew Clark
Mentor: Hoang
  • course notes overviewing active learning
  • Importance Weighted Active Learning

5/3/2016  Active Learning for Decision Making
Presenters: Joe Marino, Grant Van Horn, Alvita Tran, Remy Yang
Mentor: Yisong
  • Near Optimal Bayesian Active Learning for Decision Making
  • Jupyter Python Demo

5/5/2016  Crowdsourcing
Presenters: Madhav Mohandas, Vincent Zhuang, Richard Zhu
Mentor: Yisong
  • Optimistic Knowledge Gradient Policy for Optimal Budget Allocation in Crowdsourcing [appendix][journal version]

5/10/2016  Machine Teaching
Presenters: Justin Leong, Kevin Tang, Zilong Chen, Kaikai Sheng
Mentor: Yisong
  • How Do Humans Teach: On Curriculum Learning and Teaching Dimension
  • Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education
  • (supplemental application paper) Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners
  • (supplemental application paper) Becoming the Expert - Interactive Multi-Class Machine Teaching

5/12/2016  Machine Teaching for Crowdsourcing
Presenters: Nancy Cao, Andrew Chico, Betsy Fu, Daniel Wang
Mentor: Yisong
  • Near-Optimally Teaching the Crowd to Classify

5/17/2016  Modeling Human Decision Making
Presenters: Zachary Fein, Eric Gorlin, Emily Mazo, Kc Emezie
Mentor: Hoang
  • Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting

5/19/2016  Combinatorial Action Spaces & Adaptive Routing
Presenters: Luciana Cendon, Tobias Bischoff, Jiyun Ivy Xiao, Brennan Young
Mentor: Yisong
  • Adaptive Collective Routing Using Gaussian Process Dynamic Congestion Models [journal version]

5/24/2016  Dueling Bandits
Presenters: Fabian Boemer, Kushal Agarwal, Jialin Song, Aman Agarwal
Mentor: Yisong
  • The K-armed Dueling Bandits Problem (no need to read the theoretical analysis in detail)
  • How Does Clickthrough Data Reflect Retrieval Quality? (primarily Section 5) [journal version]

5/26/2016  Coactive Learning
Presenters: Rohan Batra, Avishek Dutta, Nand Kishore, Siddarth Murching
Mentor: Hoang
  • Online Structured Prediction via Coactive Learning [journal version]
  • Learning Trajectory Preferences for Manipulators via Iterative Improvement

5/31/2016  Bayesian Optimization
Presenters: Dimitar Ho, Danni Ma
Mentor: Stephan
  • Practical Bayesian Optimization of Machine Learning Algorithms

6/2/2016  Off-Policy Evaluation
Presenters: Miguel Aroca-Ouellete, Akshta Athawale, Mannat Singh
Mentor: Hoang
  • Exploration Scavenging

Reading List

Presentation Signup Sheet

Extended Reference Material (could be useful for picking a final project)

Note: some papers belong to multiple categories.

Basic Online Learning

Online Learning with Experts

More Papers on Full Information Online Learning

Basic Multi-Armed Bandits (Partial Information Online Learning)

Bandit Convex Optimization

Bandits with Dependent Arms

Pure Exploration in Multi-Armed Bandits

Contextual Bandits

Bayesian Optimization

Online Learning in Combinatorial Action Spaces

Active Learning

Online Learning from Preference Feedback

Reinforcement Learning and Imitation Learning

Off-Policy Evaluation and Learning

Machine Teaching

Modeling Human Decision Making & Interpreting Human Feedback

Safe Exploration

Connections to Game Theory

Related Courses, Tutorials, and Textbooks