Modern deep learning libraries such as Keras allow you to define and start fitting a wide range of neural network models in minutes with just a few lines of code.
Nevertheless, it is still challenging to configure a neural network to get good performance on a new predictive modeling problem.
The challenge of getting good performance can be broken down into three main areas: problems with learning, problems with generalization, and problems with predictions.
Once you have diagnosed the specific type of problem that you are having with a network, a suite of classical and modern techniques can then be selected to address the issue and improve performance.
In this post, you will discover a framework for diagnosing performance problems with deep learning models and techniques that you can use to target and improve each specific performance problem.
After reading this post, you will know:
- Defining and fitting neural networks has never been easier, although getting good performance on new problems remains challenging.
- Neural network performance problems can be decomposed into learning, generalization, and prediction problems.
- Decades of established techniques, as well as modern methods, can be used to target each type of model performance problem.
Let’s get started.
This tutorial is divided into seven parts; they are:
- Neural Network Renaissance
- Challenge of Configuring Neural Networks
- Framework for Systematically Better Deep Learning
- Better Learning Techniques
- Better Generalization Techniques
- Better Predictions Techniques
- How to Use the Framework
Neural Network Renaissance
Historically, neural network models had to be coded from scratch.
You might spend days or weeks translating poorly described mathematics into code and days or weeks more debugging your code just to get a simple neural network model to run.
Those days are in the past.
Today, you can define and begin fitting most types of neural networks in minutes with just a few lines of code, thanks to open source libraries such as Keras built on top of sophisticated mathematical libraries such as TensorFlow.
This means that standard models such as Multilayer Perceptrons can be developed and evaluated rapidly, as can more sophisticated models that were previously beyond the ability of most practitioners to implement, such as Convolutional Neural Networks and Recurrent Neural Networks like the Long Short-Term Memory network.
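To make "a few lines of code" concrete, here is a minimal sketch of defining and fitting a small Multilayer Perceptron with Keras. The synthetic dataset, the layer sizes, the number of epochs, and the choice of optimizer are all assumptions made for illustration. They are not a recommended configuration for any real problem.

```python
import numpy as np
from tensorflow import keras

# Synthetic binary classification data (a hypothetical stand-in for a real dataset).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8)).astype("float32")
y = (X.sum(axis=1) > 0).astype("float32")

# Define a small Multilayer Perceptron: one hidden layer and one output unit.
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Configure the learning process, then fit the model.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history = model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# Evaluate on the training data, only to confirm the model runs end to end.
loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"loss={loss:.3f} accuracy={accuracy:.3f}")
```

The model is defined, compiled, and fit in a handful of statements. Everything that makes it perform well on a real problem is hidden in the arguments chosen above.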
As deep learning practitioners, we live in amazing and productive times.
Nevertheless, even though new neural network models can be defined and evaluated rapidly, there remains little guidance on how to actually configure neural network models in order to get the most out of them.
Challenge of Configuring Neural Networks
Configuring neural network models is often referred to as a “dark art.”
This is because there are no hard and fast rules for configuring a network for a given problem. We cannot analytically calculate the optimal model type or model configuration for a given dataset.
Instead, there are decades' worth of techniques, heuristics, tips, tricks, and other tacit knowledge spread across code, papers, blog posts, and people's heads.
A shortcut to configuring a neural network on a problem is to copy the configuration of another network for a similar problem. But this strategy rarely leads to good results, as model configurations are not transferable across problems. It is also likely that you work on predictive modeling problems that are unlike most other problems described in the literature.
Fortunately, there are techniques that are known to address specific issues when configuring and training a neural network, and they are available in modern deep learning libraries like Keras.
Further, discoveries have been made in the past 5 to 10 years in areas such as activation functions, adaptive learning rates, regularization methods, and ensemble techniques that have been shown to dramatically improve the performance of neural network models regardless of their specific type.
The techniques are available; you just need to know what they are and when to use them.
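As a rough illustration of where these four families of techniques show up in Keras, the sketch below uses one example of each: a ReLU activation, the Adam optimizer for adaptive learning rates, a Dropout layer for regularization, and a simple averaging ensemble. The specific choices, the `build_model` helper, the synthetic data, and all hyperparameter values are assumptions picked for the example, not recommendations.

```python
import numpy as np
from tensorflow import keras

# Synthetic binary classification data (hypothetical, for illustration only).
rng = np.random.default_rng(7)
X = rng.normal(size=(300, 10)).astype("float32")
y = (X[:, 0] * X[:, 1] > 0).astype("float32")

def build_model():
    """Hypothetical helper: builds one small network using three of the techniques."""
    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(32, activation="relu"),    # activation function
        keras.layers.Dropout(0.2),                    # regularization
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Adam adapts the step size for each weight during training.
    optimizer = keras.optimizers.Adam(learning_rate=0.001)
    model.compile(optimizer=optimizer, loss="binary_crossentropy")
    return model

# Ensemble: fit several independently initialized models on the same data.
members = []
for _ in range(3):
    member = build_model()
    member.fit(X, y, epochs=3, batch_size=32, verbose=0)
    members.append(member)

# Combine the members by averaging their predicted probabilities.
ensemble_probs = np.mean([m.predict(X, verbose=0) for m in members], axis=0)
ensemble_labels = (ensemble_probs > 0.5).astype("int32")
```

Each technique is a one-line change here, which is the point: the tools are easy to apply, and the difficulty lies in knowing which one addresses the problem you actually have.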