This is a chapter from my upcoming book, Meta-Learning: Powerful Mental Models for Deep Learning and Thriving in the Digital Age. It is currently available for pre-order and you can pick it up at 50% off.
Before you can learn to do deep learning, you must become a developer. Being a developer is not only about programming. In the larger scheme of things, being able to write code is just a small part of what a developer can do.
If I were to start all over, I would optimize this part of the journey for fun. I would want to…
We write machine learning code in a very specific context. But from what I have seen so far, nothing has convinced me that machine learning code is fundamentally different from any other kind of code.
This means that standard development practices apply, with testing being a particularly important one.
The rewards of testing can be immense, but so can the price of testing poorly or not at all.
Let’s take a closer look at what testing looks like in the context of various machine learning applications.
This is the bread and butter of…
Imagine you live in the mountains. One of your kin has fallen sick and you volunteer to get medicine.
You stop by your house to grab the necessities — a map of the city along with a marble-shaped rock you claim brings you good fortune. You hop onto your dragon and fly north.
All that matters initially is a general sense of direction. The details on the ground are barely visible and you cover distance quickly.
As is widely known, dragons need a lot of space to land. …
Data science is a conspiracy.
“Hi, my name is Bob and I’ll be your instructor. I’ll teach you how to drive a car. Open your books on page 147 and let’s learn about different types of exhaust manifolds. Here is the formula for the essential Boyle’s Law…”
This would never happen, right?
And yet in teaching data science, elaborating on complex topics is commonplace while no love is given to the fundamentals. We are not told when to accelerate and when to slow down. …
For me, the appeal is simple. I get to listen and occasionally talk to people doing amazing things in the field I care about (Machine Learning with a focus on Deep Learning). I also sometimes fantasize that what I write is helpful to others. Mostly that is just me dreaming things up.
But Twitter is not without its quirks, and some things are not what they appear to be.
Below I present the missing manual…
I have just come out of a project where, 80% of the way in, I felt I had very little to show for it. I invested a lot of time and in the end it was a total fiasco.
The math that I know or do not know, my ability to write code — all of this has been secondary. The way I approached the project was what was broken.
I now believe that there is an art, or craftsmanship, to structuring machine learning work, and none of the math-heavy books I tended to binge on seem to mention this.
I started using PyTorch a couple of days ago. Below I outline key PyTorch concepts along with a couple of observations that I found particularly useful as I was getting my feet wet with the framework (and which can lead to a lot of frustration if you are not aware of them!).
Tensor — (like) a numpy.ndarray, but it can live on the GPU.
Variable — wraps a tensor to make it part of a computation graph. If created with requires_grad = True, it will have gradients calculated for it during the backward phase.
You perform calculations by writing them out…
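The concepts above fit in a few lines. This is my own minimal sketch, not code from the article; the tensor shapes and values are arbitrary illustrations.

```python
import torch
from torch.autograd import Variable

# Tensor: (like) a numpy.ndarray, but it can be moved to the GPU.
t = torch.ones(2, 2)

# Variable: wraps a tensor so it can take part in a computation graph.
# requires_grad=True asks autograd to compute gradients for it.
x = Variable(t, requires_grad=True)

# Calculations are simply written out, much like with numpy.
y = (x * 3).sum()

# The backward phase populates x.grad: dy/dx = 3 for every element.
y.backward()
print(x.grad)
```

(In PyTorch 0.4 and later, Variable was merged into Tensor, but the wrapper above still works.)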
In the first lecture of the outstanding Deep Learning Course (linking to version 1, which is also superb; v2 to become available early 2018), we learned how to train a state-of-the-art model using very recent techniques (for instance, the optimal learning rate estimation as described in the Cyclical Learning Rates for Training Neural Networks paper from 2015).
While explaining stochastic gradient descent with restarts, Jeremy Howard made a very interesting point — upon convergence, we would like to find ourselves in a part of the weight space that is resilient, meaning where small changes to the weights…
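The restart schedule behind this idea can be written down directly. The sketch below is my own illustration of SGDR-style cosine annealing with warm restarts; the function name and the default rates are made up for the example.

```python
import math

def sgdr_lr(step, cycle_len, lr_max=0.1, lr_min=0.001):
    """Cosine-annealed learning rate with warm restarts (SGDR).

    Within each cycle of cycle_len steps, the rate decays from lr_max
    to lr_min along a cosine curve, then jumps back up to lr_max --
    the "restart" that can kick the model out of a sharp minimum
    and toward a more resilient part of the weight space.
    """
    t = step % cycle_len  # position within the current cycle
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle_len))
```

A cycle starts near lr_max, decays toward lr_min, and the next cycle restarts near lr_max again.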
In this article we will take a look at two ideas that can help you make the most of your training data.
To get a better feel for the techniques, we will apply them to beating the 2013 state of the art at distinguishing cats from dogs in images. The plot twist is that we will only use 5% of the original training data.
We will compete against an accuracy of 82.7% achieved using 13,000 training images. Our train set will consist of 650 images selected at random.
Models will be constructed from scratch and we…
In 2013 Kaggle ran the very popular dogs vs cats competition. The objective was to train an algorithm to detect whether an image contains a cat or a dog.
At that time, as stated on the competition website, the state-of-the-art algorithm was able to tell a cat from a dog with an accuracy of 82.7% after having been trained on 13,000 cat and dog images.
I applied transfer learning, a technique where you take a model trained to carry out a different but related task and retrain it to do…
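The freeze-and-replace-the-head recipe behind transfer learning can be sketched in PyTorch. The tiny Sequential below stands in for a real pretrained network (in practice something like a torchvision ResNet); the layer sizes are arbitrary and exist only to keep the sketch self-contained.

```python
import torch
import torch.nn as nn

# Stand-in for a network pretrained on a large, related task.
pretrained = nn.Sequential(
    nn.Linear(128, 64),   # "body": learned general-purpose features
    nn.ReLU(),
    nn.Linear(64, 1000),  # "head": the original 1000-way classifier
)

# 1. Freeze the pretrained weights so they are not updated.
for p in pretrained.parameters():
    p.requires_grad = False

# 2. Swap the head for one matching the new task (cats vs dogs: 2 classes).
#    The fresh layer requires gradients by default.
pretrained[-1] = nn.Linear(64, 2)

# 3. Optimize only the parameters that still require gradients.
optimizer = torch.optim.SGD(
    (p for p in pretrained.parameters() if p.requires_grad), lr=0.01
)
```

Only the new head trains, which is why so few labeled images (650 here) can be enough.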
I ❤️ ML / DL ideas — I tweet about them / write about them / implement them. Self-taught RoR developer by trade.