Updates from the week of November 26, 2019
- Gilbert Strang playlist on MIT OCW (Matrix Algebra): https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
- The 100 page ML Book: http://themlbook.com/wiki/doku.php?id=start
- Clean Code ML repo: https://github.com/davified/clean-code-ml
- Transformers:
  - Simple Transformers blog: https://medium.com/swlh/simple-transformers-multi-class-text-classification-with-bert-roberta-xlnet-xlm-and-8b585000ce3a
  - Simple Transformers repo: https://github.com/ThilinaRajapakse/simpletransformers
  - Nvidia Apex: https://github.com/NVIDIA/apex
  - Hugging Face blog: https://medium.com/tensorflow/using-tensorflow-2-for-state-of-the-art-natural-language-processing-102445cda54a
  - Hugging Face examples: https://github.com/huggingface/transformers/tree/master/examples
  - Hugging Face docs: https://huggingface.co/transformers/quickstart.html
  - Illustrated BERT: http://jalammar.github.io/illustrated-bert/
  - Allen NLP: https://github.com/allenai/allennlp
- Financial Models:
  - Notebook collection: https://github.com/cantaro86/Financial-Models-Numerical-Methods
  - In particular, its notebook on Kalman filters: https://github.com/cantaro86/Financial-Models-Numerical-Methods/blob/master/5.1%20Linear%20regression%20-%20Kalman%20filter.ipynb
  - Another gem on Kalman filters: https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python and PDF: https://drive.google.com/file/d/0By_SW19c1BfhSVFzNHc0SjduNzg/view
- Data Sampling in Presto: https://ragrawal.wordpress.com/2017/08/11/data-sampling-in-presto/
- Clean PyTorch implementation of Style Transfer: https://github.com/shivamswarnkar/Style-Transfer/tree/871b2607d68d7dfa46c0242e4fdd9e98f77bbd93
- Kaggle class on TWIML AI:
  - GitHub: https://github.com/philpackmohr/kaggle-twimlai
  - Kaggle Winning Solutions & Pipeline: http://kagglesolutions.com/r/?ref=headerlinkh
  - Incredible Glossary: https://www.kaggle.com/shivamb/data-science-glossary-on-kaggle/
  - Winning Kaggle Solutions: https://www.kaggle.com/sudalairajkumar/winning-solutions-of-kaggle-competitions/notebook
  - Short clean kernel: https://www.kaggle.com/lopuhin/mercari-golf-0-3875-cv-in-75-loc-1900-s
  - Pavel Pleskov secrets: https://www.youtube.com/watch?v=fXnzjJMbujc
  - Model stacking (Kaggle Blog): http://blog.kaggle.com/2016/12/27/a-kagglers-guide-to-model-stacking-in-practice/
  - No Free Hunch: http://blog.kaggle.com/
  - Feature Selection via Target Permutations: https://www.kaggle.com/ogrellier/feature-selection-target-permutations and https://www.kaggle.com/ogrellier/feature-selection-with-null-importances
  - Feature Importances: https://medium.com/the-artificial-impostor/feature-importance-measures-for-tree-models-part-i-47f187c1a2c3
  - Slide decks with tricks:
    - https://www.slideshare.net/markpeng/general-tips-for-participating-kaggle-competitions
    - https://www.slideshare.net/HJvanVeen/kaggle-presentation?qid=9945759e-a06f-447d-bcfb-2a15592f30b6&v=&b=&from_search=11
    - https://www.slideshare.net/DariusBaruauskas/tips-and-tricks-to-win-kaggle-data-science-competitions?qid=2ea2c741-a9af-4c84-9292-d11725c0c68c&v=&b=&from_search=5
    - https://www.slideshare.net/gabrielspmoreira/feature-engineering-getting-most-out-of-data-for-predictive-models-tdc-2017
    - https://www.slideshare.net/jeongyoonlee/winning-data-science-competitions-74391113
  - Good AMA: https://towardsdatascience.com/ask-me-anything-session-with-a-kaggle-grandmaster-vladimir-i-iglovikov-942ad6a06acd
- MLCourse.ai:
  - Resources: https://mlcourse.ai/resources
  - Kernels: https://www.kaggle.com/kashnitsky/mlcourse/kernels
  - GitHub: https://github.com/Yorko/mlcourse.ai
  - Open Data Science courses: https://medium.com/open-machine-learning-course
- Optuna vs HyperOpt: https://neptune.ml/blog/optuna-vs-hyperopt
- Free PyTorch Intro Book: https://pytorch.org/assets/deep-learning/Deep-Learning-with-PyTorch.pdf
- Makefiles everywhere: https://blog.mindlessness.life/makefile/2019/11/17/the-language-agnostic-all-purpose-incredible-makefile.html
- DVC for model version control: https://dvc.org/
- ELI5 Model Explainability: https://github.com/TeamHG-Memex/eli5
- Autoencoders: https://www.kaggle.com/shivamb/how-autoencoders-work-intro-and-usecases