KAGGLE ENSEMBLING GUIDE

Model ensembling is a very powerful technique for increasing accuracy on a variety of ML tasks. In this article I will share my ensembling approaches for Kaggle competitions. In the first part we look at creating ensembles from submission files. The second part looks at creating ensembles through stacked generalization/blending. I answer why ensembling …
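The idea of building an ensemble directly from submission files can be sketched as follows. This is a minimal illustration, not the method from the article itself: it assumes each submission is a CSV with an `id` column and a `prediction` column (both column names are hypothetical), and it takes a plain unweighted mean of the predictions for each id. The numbers in the toy files are made up.

```python
import csv
import io
from collections import defaultdict


def average_submissions(files, id_col="id", pred_col="prediction"):
    """Average predictions per id across several submission CSVs.

    `files` is a sequence of open text file objects. The column names
    are hypothetical defaults; each competition defines its own format.
    """
    files = list(files)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for f in files:
        for row in csv.DictReader(f):
            sums[row[id_col]] += float(row[pred_col])
            counts[row[id_col]] += 1
    # Keep only ids that appear in every file, so each mean uses all models.
    return {k: sums[k] / len(files) for k in sums if counts[k] == len(files)}


# Toy in-memory "submission files" (illustrative values only).
sub_a = io.StringIO("id,prediction\n1,0.9\n2,0.2\n")
sub_b = io.StringIO("id,prediction\n1,0.7\n2,0.4\n")
averaged = average_submissions([sub_a, sub_b])
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles, and the resulting dictionary written back out in the competition's submission format.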

Winning solution of Kaggle Higgs competition: what a single model can do?

This post describes the winning solution of the Kaggle Higgs competition. It has a public score of 3.75+ and a private score of 3.73+, which ranked 26th. This solution uses a single classifier with some feature work from basic high-school physics …

Glmnet Vignette

Trevor Hastie and Junyang Qian, Stanford, June 26, 2014. Contents: Introduction; Installation; Quick Start; Linear Regression; Logistic Regression; Poisson Models; Cox Models; Sparse Matrices; Appendix: Internal Parameters. Introduction: Glmnet is a package that fits a generalized linear model via penalized maximum likelihood. The regularization path is computed for the lasso or elasticnet penalty at a grid …
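For reference, the penalized maximum likelihood problem mentioned in the excerpt is commonly written as below. The symbols are not part of the excerpt and are stated here as assumptions: N observations (x_i, y_i), observation weights w_i, a negative log-likelihood contribution l, a penalty strength λ ≥ 0, and a mixing parameter α in [0, 1].

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{N}\sum_{i=1}^{N} w_i\, l\!\left(y_i,\; \beta_0 + \beta^{T} x_i\right)
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2} + \alpha\,\lVert\beta\rVert_1\right]
```

Setting α = 1 gives the lasso penalty and α = 0 gives the ridge penalty; values in between give the elastic net. The regularization path is the sequence of solutions obtained as λ varies over a grid.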

Click-Through-Rate Prediction with GraphLab Create Feature Engineering Transformers

Feature engineering is well known to be one of the key ingredients in a successful intelligent application. In this notebook, we introduce what feature engineering is, why it is important, and how to engineer features. For this, we work through a Kaggle competition on click-through prediction (CTR) using Avazu’s anonymized dataset. More specifically, you will …

Leaving Academia: How To Get A Job In Industry After Your PhD

Getting a job in industry after your PhD is an honorable alternative to an academic career. Despite its appeal, many PhD students seem terrified to take the jump. I want to share with you the one thing you have to do if you want to successfully get a job in industry after your PhD. Warning …

Top 10 R Packages to be a Kaggle Champion

Kaggle top ranker Xavier Conort shares insights on the “10 R Packages to Win Kaggle Competitions”. By Anmol Rajpurohit, @hey_anmol. Across all major surveys, R has clearly dominated as one of the top programming choices for data scientists. Thus, it is no wonder that knowing the important R packages can be a vital advantage in …

Discover Feature Engineering, How to Engineer Features and How to Get Good at It

Feature engineering is an informal topic, but one that is absolutely known and agreed to be key to success in applied machine learning. In creating this guide I went wide and deep and synthesized all of the material I could. You will discover what feature engineering is, what problem it solves, why it matters, how …

Profiling Top Kagglers: Owen Zhang, Currently #1 in the World

Next up in our series on top Kagglers is the #1: Owen Zhang (Zhonghua Zhang). Owen comes from an engineering background and currently works as the Chief Product Officer at DataRobot. Owen Q&A: How did you start with Kaggle competitions? Back in 2011 I had just switched to analytics as a full-time job (after several …

Profiling Top Kagglers: KazAnova, Currently #2 in the World

There are Kagglers, there are Master Kagglers, and then there are top 10 Kagglers. Who are these people who consistently win Kaggle competitions? In this series we try to find out how they got to the top of the leaderboards. First up is KazAnova (Marios Michailidis), the current number 2 out of nearly …