# ELEN0062 - Introduction to machine learning (iML)

He uses statistics as a drunken man uses lamp posts—for support rather than for illumination.

Andrew Lang

## Information

## Schedule

| Session | Date | Details |
| --- | --- | --- |
| / | 18 Sep. 2019 | |
| Assignment | 25 Sep. 2019 | First assignment: Statement [Updated 06/10/19], Code, Cheat sheet. Python resources: |
| Q&A | 02 Oct. 2019 | |
| Q&A | 09 Oct. 2019 | |
| Q&A | 16 Oct. 2019 | |
| Deadline | 20 Oct. 2019 | Don't forget to submit your first assignment. |
| Assignment | 23 Oct. 2019 | Second assignment (Antonio Sutera is the reference TA for this assignment) |
| Deadline | 17 Nov. 2019 | Don't forget to submit your second assignment. |

## How to present data

Presenting data well is key to efficient communication. Here are a few pointers:

- How to Present Scientific Data
- A few additional thoughts
- A more thorough tour
- How To Present Research Data?
- Principles of data visualization

## Assignments

### First assignment

#### Installing Anaconda

There are many ways to install Python on a computer and get all the libraries needed. One quick way is to install Anaconda, which comes with all the libraries we will need.

- Get the Anaconda installer for your operating system. Make sure you install a Python 3.5+ version.
- Open a Python console:
  - From a Unix command line: `python`
  - Or open the `spyder` IDE, which comes with Anaconda
- Run the following commands, one per line:
  - `import numpy as np`
  - `import pandas as pd`
  - `import sklearn`
  - `import scipy`
  - `print(np.__version__)`
  - `print(pd.__version__)`
  - `print(sklearn.__version__)`
  - `print(scipy.__version__)`
- You can also use the `ipython` interpreter, which is much easier to work with.

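
If you would rather run the version check as a script, here is a minimal sketch. The function name `installed_versions` is my own (it is not part of Anaconda or of any library), and the script only relies on the standard `importlib` module. It reports a missing package instead of stopping at the first failed import.

```python
import importlib


def installed_versions(packages=("numpy", "pandas", "sklearn", "scipy")):
    """Return a dict mapping each package name to its version string.

    A package that cannot be imported maps to None.
    `installed_versions` is a hypothetical helper written for this page.
    """
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Some modules do not expose __version__; report that explicitly.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        status = version if version is not None else "NOT INSTALLED"
        # str.format keeps the script compatible with Python 3.5.
        print("{}: {}".format(name, status))
```

If one of the four libraries shows up as `NOT INSTALLED`, the Anaconda installation probably did not complete or another Python is first on your path.
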
## Cheat sheet for ML in Python

Check out DataCamp for more.

## Supplementary material

Here is a very sparse list of supplementary material related to the field of machine learning. I tend to update this section when I come across interesting stuff, but if you feel like you need more material on some topic, do not hesitate to ask!

### Machine learning in general

There is a lot of accessible online material on machine learning:

- Andrew Ng's online course (Stanford): The most popular online course on ML. Archived from Coursera.
- Pedro Domingos' online course (Washington).
- Reza Shadmehr (Baltimore) and his slides.
- Jeffrey Ullman's course on mining massive datasets (Stanford), based on his reference book. Not everything is related to the course though.

### Linear regression

The geometry of Least Squares (1 variable)
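
As a small numerical companion to the one-variable geometry, here is a sketch with NumPy on made-up data (the data, the seed and all variable names are assumptions of mine, not course material). It checks two things: the one-variable least-squares fit is the orthogonal projection of the target onto the input vector, and with one-hot class inputs (the ANOVA case discussed in the note that follows) the joint fit coincides with separate one-variable fits, which are simply the class means.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- One variable, no intercept: project y onto the direction x. ---
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=0.1, size=50)
beta = (x @ y) / (x @ x)      # projection coefficient
residual = y - beta * x       # what is left after projecting
print("residual orthogonal to x:", abs(residual @ x) < 1e-9)

# --- One-hot inputs: 3 classes with 20 samples each. ---
labels = np.repeat([0, 1, 2], 20)
X = np.eye(3)[labels]         # one-hot design matrix, shape (60, 3)
t = rng.normal(loc=labels.astype(float), size=60)

# Joint least-squares fit on all three columns.
coef_full, *_ = np.linalg.lstsq(X, t, rcond=None)

# Separate one-variable fits, one per column.
coef_sep = np.array([(X[:, j] @ t) / (X[:, j] @ X[:, j]) for j in range(3)])

# Class means of the target.
group_means = np.array([t[labels == j].mean() for j in range(3)])

print("columns orthogonal:", np.allclose(X.T @ X, 20 * np.eye(3)))
print("joint fit equals separate fits:", np.allclose(coef_full, coef_sep))
print("separate fits equal class means:", np.allclose(coef_sep, group_means))
```

The one-hot columns never overlap, so `X.T @ X` is diagonal and the normal equations decouple into one equation per class.
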

Note that ANOVA is a special case of linear models where the input variables are dummy one-hot class variables. Consequently, the basis vectors of the column space are orthogonal and the problem reduces to many one-variable least-squares problems.

### Artificial neural networks

There have been three waves of hype about ANNs. The first one was about perceptrons in the 60s, until it was discovered that a perceptron could not solve the XOR problem. The second started with the discovery of backpropagation, but it soon became clear that large and/or deep neural nets were very hard to train. We are in the midst of the third one right now with "deep learning": neural nets with several (many) hidden layers. As a consequence, the internet is bursting with resources on the topic, from the simplest models (multi-layer perceptron) to the most advanced architectures (such as GANs), going through more classical ones (such as ConvNets and LSTMs).

- Graham Taylor: An Overview of Deep Learning and Its Challenges for Technical Computing (2014)
- Geoffrey Hinton: Introduction to Deep Learning and Deep Belief Nets (2012)
- Geoffrey Hinton: The Next Generation of Neural Networks (2007)
- Leon Bottou: Multilayer Networks series
- A simplified version of Backprop illustrated.
- An illustrated taxonomy of learning networks.
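
To make the XOR limitation mentioned above concrete, here is a minimal NumPy sketch. The helper names (`step`, `train_perceptron`, `two_layer_xor`) and the hand-picked weights are mine, chosen for illustration only. A single perceptron trained with the classical update rule never classifies all four XOR points correctly, because they are not linearly separable, while a tiny network with one hidden layer of two threshold units does.

```python
import numpy as np

# The four XOR input points and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])


def step(z):
    """Threshold activation: 1 where z > 0, else 0."""
    return (np.asarray(z) > 0).astype(int)


def train_perceptron(X, y, epochs=100, lr=1.0):
    """Train a single perceptron; return the best accuracy seen over epochs."""
    w = np.zeros(X.shape[1])
    b = 0.0
    best_acc = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = int(xi @ w + b > 0)
            # Classical perceptron update: only changes weights on mistakes.
            w += lr * (yi - pred) * xi
            b += lr * (yi - pred)
        acc = float(np.mean(step(X @ w + b) == y))
        best_acc = max(best_acc, acc)
    return best_acc


def two_layer_xor(X):
    """Hand-built network: XOR = OR and not AND, using two hidden units."""
    h_or = step(X[:, 0] + X[:, 1] - 0.5)    # fires if at least one input is 1
    h_and = step(X[:, 0] + X[:, 1] - 1.5)   # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)


print("best perceptron accuracy:", train_perceptron(X, y))
print("two-layer net solves XOR:", bool((two_layer_xor(X) == y).all()))
```

The hidden-layer weights are set by hand here; learning them from data is what backpropagation made possible.
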

### Learning theory (Bias/Variance...)

### Support Vector Machines

- Visualizing the kernel trick
- A couple of videos about constrained optimization (by Khan Academy):

### Unsupervised learning

### Misc.

There are many YouTube channels about ML. Here are a few:

- sentdex: A bit of everything
- Derek Kane: A bit of everything
- Welch Labs: A few videos about neural nets
- Two Minute Papers: Many videos relate to (applications of) ML
- Siraj Raval (this guy is crazy)
- Introductory online course on ML (covers linear/logistic regression, decision trees/random forests, the basics of neural networks and clustering).

## Pre-requisites

Machine learning requires a solid background in maths, especially in linear algebra, (advanced) probability theory and (multivariable) calculus. There are even more resources on those than on deep learning. Here is a short selection, which emphasizes intuition.

### Linear algebra

- 3Blue1Brown series on linear algebra
- If you prefer paper (or PDF): Practical Linear Algebra: A Geometry Toolbox 2nd Edition by Farin, Gerald, Hansford, Dianne. A K Peters/CRC Press (2004)

### Calculus

- 3Blue1Brown series on calculus. Sadly, it does not go on to multivariable calculus.
- Khan Academy series
- If you prefer paper (or PDF): Calculus: Concepts and Contexts 4th Edition by Stewart, James. (Also available in French)