Abstract: Albeit the Dice loss is one of the dominant loss functions in medical image segmentation, most research omits a closer look at its derivative, i.e. the real motor of the optimization when using gradient descent. In this paper, we highlight the peculiar action of the Dice loss in the presence of missing or empty labels. First, we formulate a theoretical basis that gives a general description of the Dice loss and its derivative. It turns out that the choice of the reduction dimensions $\Phi$ and the smoothing term $\epsilon$ is non-trivial and greatly influences its behavior. We find and propose heuristic combinations of $\Phi$ and $\epsilon$ that work in a segmentation setting with either missing or empty labels. Second, we empirically validate these findings in a binary and multiclass segmentation setting using two publicly available datasets. We confirm that the choice of $\Phi$ and $\epsilon$ is indeed pivotal. With $\Phi$ chosen such that the reductions happen over a single batch (and class) element and with a negligible $\epsilon$, the Dice loss deals with missing labels naturally and performs similarly compared to recent adaptations specific for missing labels. With $\Phi$ chosen such that the reductions happen over multiple batch elements or with a heuristic value for $\epsilon$, the Dice loss handles empty labels correctly. We believe that this work highlights some essential perspectives and hope that it encourages researchers to better describe their exact implementation of the Dice loss in future work.

What problem does this paper attempt to address?

This paper addresses problems that arise when the Dice loss is used for medical image segmentation, especially with missing or empty labels. It focuses on two main issues:

1. **Handling of missing or empty labels**: In many practical applications, the training data may contain missing labels (an image has no annotation for a certain class) or empty labels (a class is absent from an image, so its ground-truth map is empty). In both cases the gradient of the Dice loss can behave unexpectedly during optimization, which affects the performance of the model.
2. **Configuration of the Dice loss function**: The paper explores how two key choices in the Dice loss, the **reduction dimensions $\Phi$** and the **smoothing term $\epsilon$**, affect the performance of the model. These choices are crucial for handling missing or empty labels, but existing research has paid little attention to them.

### Main contributions of the paper

1. **Theoretical analysis**:
   - The paper first analyzes the behavior of the Dice loss and its derivative in the presence of missing or empty labels. The authors point out that the choice of the reduction dimensions $\Phi$ and the smoothing term $\epsilon$ is non-trivial and greatly influences this behavior.
   - The authors propose heuristic combinations of $\Phi$ and $\epsilon$ that work in a segmentation setting with either missing or empty labels.
2. **Experimental verification**:
   - The authors run experiments on two publicly available datasets, in a binary and a multiclass segmentation setting, to validate the theoretical findings.
   - The results show that, with $\Phi$ reducing over a single batch (and class) element and a negligible $\epsilon$, the Dice loss deals with missing labels naturally and performs similarly to recent adaptations designed specifically for missing labels. With $\Phi$ reducing over multiple batch elements, or with a heuristic value for $\epsilon$, it handles empty labels correctly.
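To make the role of the reduction dimensions concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes tensors laid out as (batch, class, height, width). The function name `smoothed_dice_loss` and the parameter `reduce_axes` (standing in for $\Phi$) are hypothetical names chosen for illustration.

```python
import numpy as np

def smoothed_dice_loss(y, p, reduce_axes, eps):
    """Smoothed Dice loss.

    y, p        : ground truth and prediction, same shape (assumed B, C, H, W)
    reduce_axes : axes summed over before the ratio is formed (plays the role of Phi)
    eps         : smoothing term added to numerator and denominator
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    inter = (y * p).sum(axis=reduce_axes)   # soft intersection per reduction set
    total = (y + p).sum(axis=reduce_axes)   # soft cardinalities per reduction set
    dice = (2.0 * inter + eps) / (total + eps)
    return 1.0 - np.mean(dice)              # average over the remaining sets

# Toy batch: image 0 contains foreground, image 1 has an empty label map.
y = np.zeros((2, 1, 2, 2))
y[0, 0, 0, :] = 1.0
p = np.full((2, 1, 2, 2), 0.5)              # uninformative prediction

# Reduce over the spatial axes only: one Dice term per image and class.
loss_per_image = smoothed_dice_loss(y, p, reduce_axes=(2, 3), eps=1e-7)
# Reduce over batch and spatial axes: one Dice term per class.
loss_batch = smoothed_dice_loss(y, p, reduce_axes=(0, 2, 3), eps=1e-7)
print(loss_per_image, loss_batch)
```

The two calls differ only in `reduce_axes`, yet they return different loss values for the same prediction, which illustrates why the exact reduction should be reported alongside $\epsilon$.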
### Key formulas

In the formulas below, $\phi \in \Phi$ is one set of elements over which the reductions happen, $\varphi$ indexes the elements of such a set, and $\phi_\omega$ is the set that contains element $\omega$.

- **Dice Similarity Coefficient (DSC)**:
  \[ \text{DSC}(Y_\phi, \tilde{Y}_\phi)=\frac{2|Y_\phi\cap\tilde{Y}_\phi|}{|Y_\phi| + |\tilde{Y}_\phi|} \]
- **Smoothed Dice Loss (DL)**:
  \[ \text{DL}(Y, \tilde{Y}) = 1-\frac{1}{|\Phi|}\sum_{\phi\in\Phi}\frac{2\sum_{\varphi\in\phi}y_\varphi\tilde{y}_\varphi+\epsilon}{\sum_{\varphi\in\phi}(y_\varphi+\tilde{y}_\varphi)+\epsilon} \]
- **Derivative of Dice Loss**:
  \[ \frac{\partial\text{DL}(Y, \tilde{Y})}{\partial\tilde{y}_\omega}=-\frac{1}{|\Phi|}\left(\frac{2y_\omega\left(\sum_{\varphi\in\phi_\omega}(y_\varphi+\tilde{y}_\varphi)+\epsilon\right) - \left(2\sum_{\varphi\in\phi_\omega}y_\varphi\tilde{y}_\varphi+\epsilon\right)}{\left(\sum_{\varphi\in\phi_\omega}(y_\varphi+\tilde{y}_\varphi)+\epsilon\right)^2}\right) \]

### Conclusion

Through theoretical analysis and experimental verification, the paper shows the importance of carefully selecting the reduction dimensions $\Phi$ and the smoothing term $\epsilon$ of the Dice loss when handling missing or empty labels. This provides a valuable reference for future research and encourages...
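The derivative listed under Key formulas can be checked numerically. Below is a minimal NumPy sketch of the analytic gradient, assuming a (batch, class, height, width) layout. The name `smoothed_dice_loss_grad` and the parameter `reduce_axes` (standing in for $\Phi$) are hypothetical, and the value `eps=1.0` is an arbitrary illustrative choice, not the heuristic proposed in the paper.

```python
import numpy as np

def smoothed_dice_loss_grad(y, p, reduce_axes, eps):
    """Analytic derivative of the smoothed Dice loss with respect to p."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    # keepdims=True lets the per-set sums broadcast back onto every element.
    inter = (y * p).sum(axis=reduce_axes, keepdims=True)
    total = (y + p).sum(axis=reduce_axes, keepdims=True)
    n_sets = inter.size                      # |Phi|: number of reduction sets
    num = 2.0 * y * (total + eps) - (2.0 * inter + eps)
    return -num / (total + eps) ** 2 / n_sets

# Image 0 has foreground, image 1 has an empty label map.
y_demo = np.zeros((2, 1, 2, 2))
y_demo[0, 0, 0, :] = 1.0
p_demo = np.full((2, 1, 2, 2), 0.5)

# Per-image reduction, negligible eps: the empty image receives almost no gradient.
g_image = smoothed_dice_loss_grad(y_demo, p_demo, (2, 3), eps=1e-7)
# Per-image reduction, larger eps (illustrative): the empty image is penalized.
g_image_eps = smoothed_dice_loss_grad(y_demo, p_demo, (2, 3), eps=1.0)
# Batch reduction, negligible eps: the empty image shares a set with image 0.
g_batch = smoothed_dice_loss_grad(y_demo, p_demo, (0, 2, 3), eps=1e-7)

print("empty image, per-image, tiny eps :", g_image[1, 0, 0, 0])
print("empty image, per-image, eps = 1  :", g_image_eps[1, 0, 0, 0])
print("empty image, batch, tiny eps     :", g_batch[1, 0, 0, 0])
```

For a reduction set whose ground truth is entirely empty, the formula reduces to $\epsilon / \left(|\Phi| (\sum_{\varphi}\tilde{y}_\varphi + \epsilon)^2\right)$, which vanishes as $\epsilon \to 0$. This is consistent with the abstract's statement that a per-element reduction with negligible $\epsilon$ suits missing labels, while a batch-level reduction or a non-negligible $\epsilon$ is needed for empty labels to produce a gradient.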

The Dice loss in the context of missing or empty labels: Introducing $Φ$ and $ε$
