Deep Learning is severely overrated!

If I were not working in this field, and instead worked as a general tech guy at a tech company, I would have been swept up by this trend as well, seriously. While the world (tech companies in particular) is promoting AI, few people really understand the techniques at the center of it all.

Machine learning, AKA modeling, has been around for much longer, and its foundation in mathematics and statistics has made it a powerful tool for statisticians and engineers to train computers to help their business. Deep learning, marked by the growth of computing power applied to neural network models, has become the hottest topic in recent years, propelled by success stories like AlphaGo beating one of the best Go players in human history.

If you look closely, though, or if you work as a data scientist like I do in one of those “big” tech corporations, you would soon realize that deep learning can quite often give you worse results in reality. In other words, deep learning is not for everyone in every situation. Neural networks have been great in three specific fields. First, they are an excellent tool for computer vision: new network structures such as convolutional neural networks have transformed the way machines see pictures, achieving impressive accuracy in tasks like object detection and image recognition. Furthermore, text analysis, such as machine translation and word prediction, has been enhanced by recurrent neural networks, whose structure can remember previous occurrences in a sequence. Lastly, reinforcement learning (which is basically a machine learning new things through exploration and exploitation) has seen its biggest leap thanks to deep learning. AlphaGo uses a more complex variant of this setup to overcome the difficulties of defeating top-notch human Go players.

However, if you are in a traditional field such as anti-fraud, and you have about 20 features with slightly over 100,000 observations, you would be amazed that a model as simple as logistic regression can serve you better. In theory, deep learning has the power to approximate any linear or non-linear model, but setting the hyperparameters just right is an art rather than a science. Quite often, at least in my experience, tree models (gradient boosting machines, random forests) or linear models (logistic regression, elastic net regression) have better predictive power and are easier to interpret. Does that mean I made mistakes in my deep learning experiments? I used to think so, until I realized an inherent drawback of deep learning: it can’t replace math and statistics in modeling! Especially when you are dealing with a highly imbalanced dataset, deep learning models easily overfit or end up less predictive than classical statistical models, and this has been borne out by my own practice in my daily work.

So don’t be fooled by any crazy promotion of AI. It has changed some fundamental ways machines learn new things, but it can’t guarantee you good results when it comes to modeling. A lot of companies are using it as a trick to attract new funding, just like what Bitcoin has been to our world. You wouldn’t assume there’s a Swiss Army knife for modeling, would you? LOL.
