machine learning - Classification with noisy labels? - Cross Validated Let p t be a vector of class probabilities produced by the neural network and ℓ ( y t, p t) be the cross-entropy loss for label y t. To explicitly take into account the assumption that 30% of the labels are noise (assumed to be uniformly random), we could change our model to produce p ~ t = 0.3 / N + 0.7 p t instead and optimize