Learn Python & Machine Learning Quickly
Learning Python and Machine Learning will take your data career to the next level and open up a whole new swath of job opportunities.
Four years ago, I started that journey. Below are the exact steps I took.
Learn Python
Learn Pandas
Learn Machine Learning libraries
Learn Statistics & Probability
Final Project
Firstly, if I can do it, then anyone can.
The only programming I’d done before was some Pascal at university, 25 years ago.
Also, when I say I “learnt” it, I mean I can write and understand Python, then take Machine Learning code templates and adapt them to what I need to do.
Learn Python – 6 weeks
I read the book “A Smarter Way to Learn Python” by Mark Myers ($12).
It starts you at the very beginning, as in “hello world”.
His online practice questions, based on the book, are great (free).
Next, the free online book “Automate the Boring Stuff with Python” by Al Sweigart is also great.
Learn Pandas – 4 weeks
Pandas is a must if you want to use Python for machine learning. It lets you manipulate data quickly in table format.
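To give a feel for what “manipulating data in table format” means, here is a minimal sketch. The column names and numbers are made up for illustration; they are not taken from any real dataset.

```python
import pandas as pd

# Made-up ball-by-ball rows, purely for illustration.
balls = pd.DataFrame(
    {
        "match_id": [1, 1, 1, 2, 2, 2],
        "over": [1, 1, 2, 1, 1, 2],
        "runs": [4, 0, 6, 1, 2, 0],
        "wicket": [0, 1, 0, 0, 0, 1],
    }
)

# Filter rows: keep only the balls that went for a boundary (4 or 6).
boundaries = balls[balls["runs"].isin([4, 6])]

# Aggregate: total runs and wickets per match.
totals = balls.groupby("match_id")[["runs", "wicket"]].sum()

# Derive a new column: the running score within each match.
balls["score_so_far"] = balls.groupby("match_id")["runs"].cumsum()

print(totals)
```

Filtering, grouping and adding derived columns like this cover a large share of everyday data preparation.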
I just watched a lot of YouTube videos.
This series by Corey Schafer is good. The whole time, I was practicing Python.
Learn Machine Learning libraries – 8 weeks
I did this Udemy course “Machine Learning A-Z” ($20).
It’s great, with a fundamentals/concept module followed by a live coding example, for each model.
You can also download all their code templates and adapt them to your own project. They give lessons in both Python and R (I did the Python modules only).
It’s a massive course, covering most of the popular machine learning models, including classification, reinforcement learning, natural language processing and deep learning.
The main library they use is scikit-learn.
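For anyone who hasn’t seen scikit-learn before, most models follow the same fit-then-predict pattern, which is why templates are so easy to adapt. The sketch below is not one of the course templates; it uses synthetic data and a logistic regression purely as an example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset (an assumption for this sketch).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold back a quarter of the rows to check the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Every scikit-learn classifier exposes the same fit / predict methods.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping in a different model usually means changing only the import and the line that creates `model`.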
Learn Statistics & Probability – 4 weeks
Halfway through the previous step, I felt I needed more training on Stats & Probability (last time I touched it was in high school).
I did this Udemy course ($20), “Become a Probability & Statistics Master”
It’s very good, and I understood most of it.
Final Project – 10 weeks
Practice projects where a teacher hands you everything and you just solve it are boring (also, I had no teacher). I like to discover new things on my own.
I wanted to find some original data that, as far as I could see, was understudied. Then, eureka, I found this.
Cricsheet has ball-by-ball data for every T20 cricket match played. All games: India, Australia, England... everywhere.
BTW, thanks Stephen for putting this data together.
Goal: Build a model that can predict the odds of a side winning a match BALL BY BALL! This is exactly what bookmakers do: they adjust the odds of a match after every ball. It’s called “In-Play” or “Live” betting.
Then, use this model to look for inefficiencies in betting markets, to make a profit.
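To make “looking for inefficiencies” concrete, here is a small sketch of the arithmetic, assuming decimal odds (where odds of 1.80 pay back 1.80 for every 1 staked, stake included). The function names and the example numbers are made up for illustration.

```python
def implied_probabilities(decimal_odds):
    """Turn the decimal odds of every outcome into probabilities that sum to 1."""
    raw = [1.0 / odds for odds in decimal_odds]
    total = sum(raw)  # above 1.0 by the bookmaker's margin
    return [p / total for p in raw]


def expected_profit_per_unit(model_probability, decimal_odds):
    """Expected profit on a 1-unit bet, if the model's probability is correct."""
    return model_probability * decimal_odds - 1.0


# Hypothetical two-team market: Team A at 1.80, Team B at 2.10.
market = implied_probabilities([1.80, 2.10])

# Suppose the model gives Team A a 60% chance of winning.
edge = expected_profit_per_unit(0.60, 1.80)

print([round(p, 3) for p in market])
print(round(edge, 3))
```

A positive expected profit means the model thinks the market is underrating that side. Whether that turns into real profit depends entirely on how good the model’s probabilities are.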
Process:
Each game was in a separate CSV file. I had to write a Python script that would read each file, strip out everything I didn’t need, and then write a row to a master CSV file.
Source the pre-game betting odds for each match and write them to the master CSV.
Prepare the data in a pandas DataFrame so it is machine learning ready.
Test every machine learning model I had learnt to find which ones were the best predictors.
I chose Random Forest, then tested it with different parameters.
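The last two steps can be sketched roughly as below. This is not the project code: the data is synthetic, the three candidate models and the parameter grid are arbitrary choices for illustration, and cross-validation is just one reasonable way to compare models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the prepared match data (assumption):
# one row per ball, target = 1 if the batting side went on to win.
X, y = make_classification(n_samples=400, n_features=6, random_state=42)

# Step 1: score a few candidate models with 5-fold cross-validation.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbours": KNeighborsClassifier(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=42),
}
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")

# Step 2: try Random Forest with different parameters.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    cv=3,
)
search.fit(X, y)
print(search.best_params_)

# The tuned model can output a win probability rather than just a yes/no.
win_probability = search.predict_proba(X[:1])[0, 1]
```

With real ball-by-ball data, rows from the same match are strongly related, so a split that keeps each match together (for example scikit-learn’s `GroupKFold`) would probably give a more honest score than a random split.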
Ten weeks later, I was left with a working model that looked promising.
So, the project was complete, or was it?
At the time, I felt I had learnt everything I set out to learn, and I was busy running my other business. So, that’s where it ended.
When I have spare time, I want to properly complete this project and test it with real-life trading. I could be sitting on a fortune!
Still to do:
Write a script that will take live odds from Betfair (a sports betting exchange). They have API access.
Get API access to ball-by-ball live cricket scores.
Run the model live after each ball, calculating the odds.
Use the Betfair API to place programmatic bets when the model says the odds are in my favour.
Do it over hundreds of games and make millions of $$$. That’s the idea anyway. 😂
I’m pretty sure it would be profitable, because after each ball, the odds tend to fluctuate wildly, as traders/punters tend to overreact to events on the pitch.
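The decision logic for that live loop could look something like the sketch below. Everything here is hypothetical: the updates are a hard-coded list rather than a live feed, `stub_model` is a crude placeholder for the trained model, and no real Betfair or scoring API is called.

```python
def stub_model(state):
    """Placeholder for the trained model: a crude guess from the required run rate."""
    required_rate = state["runs_needed"] / state["balls_left"]
    return max(0.05, min(0.95, 1.0 - required_rate / 3.0))


def find_value_bets(updates, predict_win_probability, min_edge=0.05):
    """Flag balls where the model's probability beats the market by min_edge."""
    bets = []
    for update in updates:
        probability = predict_win_probability(update["state"])
        # Expected profit per unit staked at the current decimal odds.
        edge = probability * update["odds"] - 1.0
        if edge >= min_edge:
            bets.append({"ball": update["ball"], "odds": update["odds"], "edge": edge})
    return bets


# Made-up updates for a chasing side: the market drifts sharply after a wicket.
updates = [
    {"ball": "18.1", "odds": 2.10, "state": {"runs_needed": 20, "balls_left": 12}},
    {"ball": "18.2", "odds": 3.20, "state": {"runs_needed": 20, "balls_left": 11}},
    {"ball": "18.3", "odds": 1.70, "state": {"runs_needed": 14, "balls_left": 10}},
]

bets = find_value_bets(updates, stub_model)
for bet in bets:
    print(bet["ball"], bet["odds"], round(bet["edge"], 3))
```

The `min_edge` threshold is there so the script only acts when the gap between model and market is comfortably large, which leaves some room for model error and commission.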
Final Word – Learn Python & Machine Learning. It’s easier than you think; it just needs time and practice.
That’s all for today!
Enjoyed this newsletter? Share it with a friend. It only takes 30 seconds.
Cheers, Shano.