(CS 101) Projects in Machine Learning

2016 Fall Term

Course Description

Prerequisite: CS 155 or equivalent

This is a project-based course for students looking to gain practical experience in machine learning. Students are expected to be proficient in basic machine learning. Students will work in groups. Each group will be provided a project topic to work on along with domain expert advisors. Alternatively, students can propose their own projects, subject to approval by course instructors.

Course Details


  • SIGN UP FORM: link.
  • Piazza link, for Q&A.


    Yisong Yue               yyue@caltech.edu
    Omer Tamuz            omertamuz@gmail.com

    Teaching Assistants


    List of Projects

    subject to change

    Visualizing & Analyzing Twitter Data. Mentored by Mike Alvarez. Analyzing Twitter data is extremely challenging, given the streaming nature of the data as well as its sheer volume. This project will involve creating a pipeline that presents analyses of Twitter data in an easy-to-view way. Examples range from simple analyses, such as distributions of Twitter words by geography, to more complicated ones, such as visualizing latent-factor models of Twitter data.
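
    As a rough illustration of the simplest analysis mentioned above (word distributions by geography), here is a minimal sketch. It assumes tweets have already been collected and reduced to hypothetical (region, text) pairs; the function name, the tokenizer, and the sample data are illustrative assumptions, not part of the project specification.

```python
from collections import Counter, defaultdict
import re


def word_distributions_by_region(tweets):
    """Return {region: {word: relative frequency}} for (region, text) pairs.

    The (region, text) input format is an assumption made for this sketch.
    """
    counts = defaultdict(Counter)
    for region, text in tweets:
        # Lowercase and keep simple word tokens, including hashtags and mentions.
        counts[region].update(re.findall(r"[#@]?\w+", text.lower()))

    distributions = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        distributions[region] = {w: c / total for w, c in counter.items()}
    return distributions


# Made-up sample data, for illustration only.
sample = [
    ("CA", "Traffic on the 210 again"),
    ("CA", "Sunny again in Pasadena"),
    ("NC", "Rain again today"),
]
dist = word_distributions_by_region(sample)
```

    A streaming pipeline would update the per-region counters incrementally rather than rebuilding them from a list, but the per-region normalization step would look the same.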

    Data Forensics for Voter Registration Data. Mentored by Mike Alvarez. Develop tools to collect and analyze voter registration data. A number of states (e.g., NC and OH) post their entire state’s voter registration and voter history file online, with weekly updates. One can use machine learning to identify simple problems such as duplicate entries and minor errors, or harder problems such as ballot stuffing. See this paper for an early example.
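
    To make the "duplicate entries and minor errors" case concrete, here is a minimal sketch of near-duplicate detection using string similarity from the Python standard library. The record layout (name, date of birth, address), the 0.9 threshold, and the sample records are all assumptions made for illustration; real voter files have different fields, and this is a simple baseline rather than a machine learning method.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    """Join the fields of a record into one lowercase, whitespace-normalized string."""
    name, dob, address = record
    return " ".join(f"{name} {dob} {address}".lower().split())


def find_near_duplicates(records, threshold=0.9):
    """Return index pairs (i, j) whose normalized text similarity is >= threshold."""
    keys = [normalize(r) for r in records]
    pairs = []
    # Compare every pair of records; quadratic, so only suitable for small samples.
    for i, j in combinations(range(len(keys)), 2):
        if SequenceMatcher(None, keys[i], keys[j]).ratio() >= threshold:
            pairs.append((i, j))
    return pairs


# Made-up records, for illustration only.
records = [
    ("Jane Q. Smith", "1980-02-14", "12 Oak St"),
    ("Jane Q Smith", "1980-02-14", "12 Oak St"),
    ("Robert Jones", "1975-07-01", "400 Elm Ave"),
]
duplicates = find_near_duplicates(records)
```

    Because the all-pairs comparison grows quadratically, a statewide file would likely need a blocking step first (for example, only comparing records that share a date of birth) before applying a similarity measure like this one.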

    Computational Humor. Write a program that makes people laugh (with it, not at it). Some options include an automatic meme generator, a Twitter bot, and a reaction GIF suggester.

    Adaptive Teaching. Teach children the alphabet / basic reading / basic math by learning on the fly what they know and don't know and choosing learning tasks accordingly.

    Minimum Intelligent Signal Test. This test is a variant of the Turing test in which a tester asks a responder questions, and the responder is only allowed to answer "yes" or "no". Build a system in which users are randomly assigned to be either testers or responders, and testers chat with responders who are randomly chosen to be either machines or humans. Use the answers given by human responders to train the machine responders.

    AI for Quantum Chess (Requires CS 159). Mentored by Spiros Michalakis and Chris Cantwell. Quantum Chess is a new game developed by Caltech, USC, and industry partners. Students must be familiar with concepts such as Monte Carlo tree search.

    AI for Classical Games (Requires CS 159). The project will use interactive machine learning to train AI systems to play various classical games. Students must be familiar with concepts such as Monte Carlo tree search. Candidate games include:

    (No Longer Available) Machine learning techniques applied to data from Mars Science Laboratory (MSL). Mentored by Michela Munoz Fernandez.

    Machine Learning for Sports. Mentored by Hoang Le. Develop an ML interface to the RoboCup simulator. Work on ML algorithms to learn defensive formations in soccer. May involve tracking data from professional games.

    Machine Learning for Mobile Platforms. Mentored by Grant van Horn. Mobile devices pose unique challenges for machine learning pipelines: not only do the systems still need to achieve low error, but they must also satisfy strict memory and performance constraints. Many state-of-the-art deep convolutional neural networks used for image classification do not readily fit on mobile devices. The goal of this project is to develop a computer vision model for bird species classification that runs efficiently on mobile phones with anywhere from 512 MB to 2+ GB of memory, degrading gracefully in accuracy as available memory shrinks. Possible directions:

    Improve Open Source ML Platforms. Develop extensions of existing ML platforms (such as deep learning). Examples include: