ELEN0062 - Introduction to machine learning (iML)

Random ML quote

He uses statistics as a drunken man uses lamp posts—for support rather than for illumination.

Andrew Lang

Information

Schedule

18 Sep. 2019
Assignment 25 Sep. 2019

First assignment

Python resources:

On how to present results

Q&A 02 Oct. 2019
Q&A 09 Oct. 2019
Q&A 16 Oct. 2019
Deadline 20 Oct. 2019

Don't forget to submit your first assignment.

Assignment 23 Oct. 2019

Second assignment (Antonio Sutera is the reference TA for this assignment)

Deadline 17 Nov. 2019

Don't forget to submit your second assignment.

How to present data

Presenting data well is key to efficient communication. Here are a few pointers:

Assignments

First assignment

Installing Anaconda

There are many ways to install Python on a computer and get all the libraries needed. One quick way is to install Anaconda, which comes with all the libraries we will need.

  1. Get the Anaconda installer for your operating system. Make sure you install a Python 3.5+ version.
  2. Open a Python console:
    • From a Unix command line: python
    • Or open the Spyder IDE, which comes with Anaconda
    Note that you can also use the IPython interpreter, which is much easier to work with.
  3. Run the following commands:
      
    import numpy as np
    import pandas as pd
    import sklearn
    import scipy
    
    print(np.__version__)
    print(pd.__version__)
    print(sklearn.__version__)
    print(scipy.__version__)
      
    
    If there is no error, the installation went fine.
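
The same check can be wrapped in a small script that reports missing packages instead of stopping at the first failed import. This is only a sketch: the function name check_environment and its parameters are made up for illustration and are not part of the course material.

```python
import importlib
import sys


def check_environment(packages=("numpy", "pandas", "sklearn", "scipy"),
                      min_python=(3, 5)):
    """Map each package name to its version string, or to None
    when the package cannot be imported (hypothetical helper)."""
    if sys.version_info < min_python:
        raise RuntimeError("Python %d.%d+ is required" % min_python)
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Some modules do not expose __version__; report "unknown" then.
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(name, version if version is not None else "MISSING")
```

If a package shows up as MISSING, installing it through Anaconda and running the script again should be enough.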

Cheat sheet for ML in Python

Check out DataCamp for more.

Supplementary material

Here is a very short list of supplementary material related to the field of machine learning. I tend to update this section when I come across interesting stuff, but if you feel like you need more material on some topic, do not hesitate to ask!

Machine learning in general

There is a wealth of accessible online material in the domain of machine learning:

Linear regression

The geometry of Least Squares (1 variable)

Note that ANOVA is a special case of linear models where the input variables are dummy (one-hot) class variables. Consequently, the basis vectors of the column space are orthogonal and the problem reduces to many one-variable least squares problems.
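
To make the remark about ANOVA concrete, here is a minimal NumPy sketch on made-up toy data (the labels and values are assumptions chosen for illustration). It checks that the one-hot columns are orthogonal, and that the full least squares fit matches both the separate one-variable fits and the per-class means.

```python
import numpy as np

# Toy data (made up for illustration): 6 observations in 3 classes.
labels = np.array([0, 0, 1, 1, 1, 2])
y = np.array([1.0, 3.0, 4.0, 5.0, 6.0, 10.0])

# One-hot (dummy) design matrix: column j is 1 where the class is j.
X = np.eye(3)[labels]

# Distinct one-hot columns never overlap, so X^T X is diagonal
# (its diagonal holds the class counts).
gram = X.T @ X
assert np.allclose(gram, np.diag(np.diag(gram)))

# Full multi-variable least squares fit (no intercept).
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-variable least squares per column: <x_j, y> / <x_j, x_j>.
beta_single = (X.T @ y) / np.diag(gram)

# Both coincide with the per-class means of y.
class_means = np.array([y[labels == j].mean() for j in range(3)])
print(beta_full, beta_single, class_means)
```

Because the columns are orthogonal, projecting y onto the column space amounts to projecting it onto each column separately, which is why the three vectors printed above agree.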

Artificial neural networks

There have been three waves of hype around ANNs. The first was about perceptrons in the 60s, until it was shown that a single perceptron cannot solve the XOR problem. The second started with the discovery of backpropagation, but it soon became clear that large and/or deep neural nets were very hard to train. We are in the midst of the third one right now with "deep learning": neural nets with several (many) hidden layers. As a consequence, the internet is bursting with resources on the topic, from the simplest models (multi-layer perceptrons) to the most advanced architectures (such as GANs), going through more classical ones (such as ConvNets and LSTMs).
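
The XOR limitation is easy to see on a toy example. The sketch below uses plain Python with hand-picked weights; the grid search is only an illustration and not a proof (the actual argument is that XOR is not linearly separable). It shows a single threshold unit stuck at 3 correct answers out of 4, while one hidden layer of two units gets all four right.

```python
from itertools import product

# Truth table of XOR.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


def perceptron(x, w1, w2, b):
    """Single threshold unit on a 2-dimensional input."""
    return step(w1 * x[0] + w2 * x[1] + b)


def accuracy(predict):
    """Fraction of the four XOR cases that predict gets right."""
    return sum(predict(x) == t for x, t in XOR.items()) / len(XOR)


# Coarse grid of candidate weights and biases: -4.0, -3.5, ..., 4.0.
grid = [i / 2 for i in range(-8, 9)]
best_single = max(
    accuracy(lambda x: perceptron(x, w1, w2, b))
    for w1, w2, b in product(grid, repeat=3)
)


def two_layer(x):
    """Hand-built network: XOR = OR and not AND."""
    h_or = perceptron(x, 1.0, 1.0, -0.5)    # fires if at least one input is 1
    h_and = perceptron(x, 1.0, 1.0, -1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)


print("best single perceptron on the grid:", best_single)
print("two-layer network:", accuracy(two_layer))
```

The hidden units here are set by hand; learning such weights from data is what backpropagation made practical.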

Learning theory (Bias/Variance...)

Support Vector Machines

Unsupervised learning

Misc.

There are many YouTube channels about ML. Here are a few:

Pre-requisites

Machine learning requires a solid background in maths, especially in linear algebra, (advanced) probability theory and (multivariable) calculus. There are even more resources on those than on deep learning. Here is a short selection, which emphasizes intuition.

Linear algebra

  • 3Blue1Brown series on linear algebra
  • If you prefer paper (or PDF): Practical Linear Algebra: A Geometry Toolbox, 2nd Edition, by Gerald Farin and Dianne Hansford. A K Peters/CRC Press (2004)

Calculus

Last modified on October 23 2019 09:23