I think the current implementation normalizes both the classification and consistency losses by minibatch_size instead of using the appropriate denominator for each loss component (unless I didn't understand the paper correctly).
Current code:
class_loss = class_criterion(class_logit, target_var) / minibatch_size
But isn't the cross-entropy loss supposed to be normalized by labeled_minibatch_size (the number of samples that contribute to supervised learning)?
class_loss = class_criterion(class_logit, target_var) / labeled_minibatch_size
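To make the difference concrete, here is a small framework-free sketch of what I mean. It assumes that class_criterion returns the summed (not averaged) cross-entropy and skips unlabeled samples. I have not confirmed that this is how the repository configures it. NO_LABEL, summed_cross_entropy, and the toy logits and targets are hypothetical names and values used only for illustration.

```python
import math

NO_LABEL = -1  # hypothetical marker for unlabeled samples (assumption)


def summed_cross_entropy(logits, targets, ignore_index=NO_LABEL):
    """Sum of per-sample cross-entropy, skipping unlabeled samples."""
    total = 0.0
    for row, target in zip(logits, targets):
        if target == ignore_index:
            continue  # unlabeled samples add nothing to the supervised loss
        # Numerically stable log-sum-exp of the logits.
        m = max(row)
        log_norm = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_norm - row[target]
    return total


# Toy minibatch: 4 samples, only the first 2 are labeled.
logits = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [0.5, -0.5]]
targets = [0, 1, NO_LABEL, NO_LABEL]

minibatch_size = len(targets)
labeled_minibatch_size = sum(1 for t in targets if t != NO_LABEL)

loss_sum = summed_cross_entropy(logits, targets)

# Current normalization: divides by all samples, labeled or not.
loss_per_batch = loss_sum / minibatch_size

# Proposed normalization: divides only by the samples that contributed.
loss_per_labeled = loss_sum / labeled_minibatch_size
```

Under these assumptions, dividing by minibatch_size scales the supervised loss by the labeled fraction of the batch (one half in this toy case), so the effective weight of the classification term changes with the labeled/unlabeled ratio.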