
The Dying ReLU Problem, Clearly Explained
Keeping your neural network alive by understanding the downsides of ReLU
Kenneth Leung · 5 min read

Contents
(1) What is ReLU and what are its advantages?
(2) What's the Dying ReLU problem?
(3) What causes the Dying ReLU problem?
(4) How to solve the Dying ReLU problem
Activation functions take an input signal and convert it to an output signal. Because they introduce non-linearity into a network, they are often referred to simply as non-linearities.
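As a concrete illustration of the activation function this article focuses on, here is a minimal NumPy sketch of ReLU (the exact variable names are illustrative, not from the original post):

```python
import numpy as np

def relu(x):
    # ReLU passes positive inputs through unchanged and zeroes out
    # negatives; this piecewise behaviour is what makes it non-linear.
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # negative entries become 0, positive entries are unchanged
```

Note that any input less than or equal to zero maps to exactly zero, which is the behaviour at the heart of the Dying ReLU problem discussed below.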
Recently, a colleague of mine asked me a few questions along the lines of "why do we have so many activation functions?", "why does one work better than another?", "how do we know which one to use?", "is it hardcore maths?" and so on.
The universal approximation theorem implies that a neural network with at least one hidden layer and a non-linear activation can approximate any continuous function mapping inputs (X) to outputs (y). This ability to represent virtually any function is what makes neural networks so powerful and widely used.
The choice of final-layer activation function and loss function also matters: which combination to use in a neural network depends on the business goal, such as whether the task is binary classification, multi-class classification, or regression.
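As one example of such a pairing (a sketch for illustration, not taken from the original post), binary classification typically combines a sigmoid final-layer activation with binary cross-entropy loss:

```python
import numpy as np

def sigmoid(z):
    # Squashes raw scores (logits) into probabilities in (0, 1),
    # which is why it suits a binary-classification output layer.
    return 1 / (1 + np.exp(-z))

def binary_cross_entropy(y_true, y_pred):
    # Clip predictions away from exactly 0 or 1 to avoid log(0).
    eps = 1e-12
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

logits = np.array([2.0, -1.0, 0.5])   # illustrative raw model outputs
y_true = np.array([1.0, 0.0, 1.0])    # illustrative ground-truth labels

probs = sigmoid(logits)
loss = binary_cross_entropy(y_true, probs)
print(loss)
```

The same pattern generalises: softmax with categorical cross-entropy for multi-class problems, and a linear (identity) output with mean squared error for regression.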
The activation function is one of the many design choices you must make to get optimal performance from your neural network.