Inspired by the self-gating property of the Swish activation function, Mish was designed as a self-regularized, non-monotonic activation function similar to Swish.
f(x) = x · tanh(softplus(x))
where x is an input data point and softplus(x) = ln(1 + eˣ).
import tensorflow as tf

# Mish function using TensorFlow math ops
def mish(x):
    return x * tf.math.tanh(tf.math.softplus(x))

y = mish(x)
plot_graph(x, y, 'Mish')
It is used in hidden layers and can serve as an alternative to ReLU.
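As a minimal sketch, a custom mish function can be passed directly as the activation of a Keras layer; the layer sizes and input shape below are illustrative only:

```python
import tensorflow as tf

def mish(x):
    return x * tf.math.tanh(tf.math.softplus(x))

# A Python callable works wherever Keras accepts an activation
model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation=mish),
    tf.keras.layers.Dense(1),
])
```

This avoids registering a custom activation class; the function is applied elementwise to the layer's pre-activations.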
SELU induces a self-normalizing property in neural networks: neuron activations converge towards zero mean and unit variance.
It is not prone to vanishing and exploding gradient problems.
It needs more computational power while training the network.
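A minimal sketch of SELU, using the fixed alpha and scale constants from the original paper (TensorFlow also ships a built-in `tf.keras.activations.selu`; this manual version is only for illustration):

```python
import tensorflow as tf

# Constants from Klambauer et al. (2017); fixed, not learned
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    # scale * x for x > 0; scale * alpha * (exp(x) - 1) otherwise
    return SCALE * tf.where(x > 0, x, ALPHA * (tf.math.exp(x) - 1))

x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])
print(selu(x).numpy())
```

Note that the scaled exponential branch for negative inputs is what drives activations back towards zero mean and unit variance.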