Deep Neural Networks for Classification
Deep neural networks can be used for classification. The last activation function is usually a softmax, which maps a vector of logits $z \in \mathbb{R}^{K}$ to a probability distribution over the $K$ classes:

$$\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$
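As a minimal sketch of the softmax described above (the function name and example logits are illustrative, not from the original text):

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability; this leaves the
    # result unchanged because softmax is invariant to additive shifts.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
# probs sums to 1 and each entry lies in (0, 1)
```

Note that the max-subtraction trick is standard practice: exponentiating large raw logits would otherwise overflow in floating point.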
Binary classification
One typically uses the binary cross-entropy function as the loss function to train the neural network.
- In the discrete case, where each true label $y_i \in \{0, 1\}$, the loss function becomes

  $$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \hat{p}_i + (1 - y_i)\log(1 - \hat{p}_i)\right]$$

  where $\hat{p}_i$ is the predicted probability that instance $i$ belongs to the positive class.
- In the continuous case, when the network outputs probabilities that are not strictly 0 or 1, the same formula applies; it is interpreted as a measure of dissimilarity between the predicted probability distribution and the true binary labels.
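A minimal sketch of binary cross-entropy under the convention above (the function name, clipping constant, and sample values are illustrative assumptions):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    # Clip predictions away from exactly 0 and 1 so log() stays finite.
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1.0, 0.0, 1.0, 1.0])   # ground-truth binary labels
y_pred = np.array([0.9, 0.1, 0.8, 0.6])   # predicted probabilities
loss = binary_cross_entropy(y_true, y_pred)
```

Confident, correct predictions (e.g. 0.9 for a true 1) contribute little to the loss, while confident mistakes are penalized heavily by the logarithm.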
Multiple classes
For classification problems with more than two classes, binary cross-entropy extends to the categorical cross-entropy loss:

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c} \log \hat{p}_{i,c}$$
where:
- $y_{i,c}$ is an indicator function ($y_{i,c} = 1$ if the actual class of instance $i$ is $c$, $0$ otherwise)
- $\hat{p}_{i,c}$ is the predicted probability that instance $i$ belongs to class $c$
- $N$ is the number of instances and $C$ the number of classes
The goal during training is to minimize this average categorical cross-entropy loss across all instances in the training dataset.
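The categorical loss above can be sketched as follows, assuming one-hot encoded labels (the function name and example arrays are illustrative):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    # y_true: one-hot label matrix of shape (N, C)
    # y_pred: predicted probability matrix of shape (N, C), rows sum to 1
    p = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    # The inner sum picks out -log(probability of the true class) per row;
    # the mean averages over the N instances.
    return -np.mean(np.sum(y_true * np.log(p), axis=1))

y_true = np.array([[1, 0, 0],
                   [0, 1, 0]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])
loss = categorical_cross_entropy(y_true, y_pred)
```

Because each row of `y_true` is one-hot, only the log-probability assigned to the correct class contributes, so minimizing this loss pushes the softmax output toward the true class for every instance.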