CondSeg: Ellipse Estimation of Pupil and Iris via Conditioned Segmentation

Zhuang Jia, Jiangfan Deng, Liying Chi, Xiang Long, Daniel K. Du
2024-08-30
Abstract: Parsing of eye components (i.e. pupil, iris and sclera) is fundamental for eye tracking and gaze estimation in AR/VR products. Mainstream approaches tackle this problem as a multi-class segmentation task, providing only the visible part of the pupil/iris; other methods regress elliptical parameters using human-annotated full pupil/iris parameters. In this paper, we consider two priors: the projected full pupil/iris circle can be modelled with an ellipse (ellipse prior), and the visibility of the pupil/iris is controlled by the openness of the eye region (condition prior). We design a novel method, CondSeg, to estimate elliptical parameters of the pupil/iris directly from segmentation labels, without explicitly annotating full ellipses, and use the eye-region mask to control the visibility of the estimated pupil/iris ellipses. A conditioned segmentation loss is used to optimize the parameters by transforming parameterized ellipses into pixel-wise soft masks in a differentiable way. Our method is tested on public datasets (OpenEDS-2019/-2020) and shows competitive results on segmentation metrics, while simultaneously providing accurate elliptical parameters for further eye-tracking applications.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
In augmented reality (AR) and virtual reality (VR) products, accurate eye tracking and gaze estimation are crucial for functions such as foveated rendering and user interaction. However, existing methods for parsing the pupil and iris have limitations:

1. **Multi-class segmentation methods**: Mainstream methods treat eye-component parsing (pupil, iris and sclera) as a multi-class segmentation task, which yields only the visible part of the pupil or iris.
2. **Ellipse-parameter regression methods**: Other methods regress ellipse parameters from human-annotated full pupil or iris ellipses, which requires a large amount of annotation work through a complex annotation process.

To address these problems, the authors propose CondSeg, which estimates the ellipse parameters of the pupil and iris directly from segmentation labels and uses the eye-region mask to control the visibility of the estimated ellipses. The method relies on two priors:

- **Ellipse prior**: The projected full pupil or iris circle can be modeled as an ellipse.
- **Condition prior**: The visibility of the pupil and iris is controlled by the openness of the eye region.

With these priors, CondSeg can estimate the ellipse parameters of the pupil and iris without explicitly annotated full ellipses, and it shows competitive results on public datasets (OpenEDS-2019/-2020).

### Main contributions

1. **Introducing conditioned segmentation**: Recasting the problem from multi-class segmentation to conditioned segmentation by decoupling the prediction of the eye region from the prediction of the pupil/iris ellipses.
2. **Explicitly encoding the ellipse prior**: Converting 5D ellipse parameters into a soft segmentation mask, so that the network optimizes the ellipse parameters through a pixel-level segmentation loss instead of direct regression.
3. **Simplifying the annotation process**: Eliminating the need for explicit ellipse-parameter annotation, thereby reducing the annotation burden.

### Method overview

The CondSeg network has two main components:

- **Eye-region mask prediction**: Generates the segmentation mask of the eye region.
- **Ellipse-parameter prediction**: Predicts the ellipse parameters of the full pupil and iris.

The predicted ellipse parameters are converted into soft segmentation masks and combined with the eye-region mask. This gives the visible pupil and iris regions, that is, the full ellipses with the parts occluded by the eyelids masked out. These regions are compared with the segmentation labels to compute the loss that optimizes the network.

### Experimental results

On OpenEDS-2019/-2020, CondSeg achieves competitive results on segmentation metrics while also estimating the full pupil and iris ellipse parameters without explicit ellipse annotation. The method thus provides ellipse parameters for downstream eye-tracking applications and removes the need for full-ellipse annotation.
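The conversion from 5D ellipse parameters to a soft mask, and its combination with the eye-region mask, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The summary only states that parameterized ellipses are turned into pixel-wise soft masks in a differentiable way, so the sigmoid form, the `sharpness` parameter, the binary cross-entropy loss, and the function names `ellipse_soft_mask` and `conditioned_bce` are all assumptions made for illustration. In actual training the same arithmetic would be written in an autodiff framework so that gradients reach the ellipse parameters.

```python
import numpy as np

def ellipse_soft_mask(params, height, width, sharpness=10.0):
    """Render a 5D ellipse (cx, cy, a, b, theta) as a soft mask in (0, 1).

    Assumed formulation: sigmoid of (1 - normalized squared distance),
    which is ~1 inside the ellipse, 0.5 on its boundary, ~0 outside.
    """
    cx, cy, a, b, theta = params
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    dx, dy = xs - cx, ys - cy
    # Rotate pixel offsets into the ellipse's own axes.
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    u = dx * cos_t + dy * sin_t
    v = -dx * sin_t + dy * cos_t
    d = (u / a) ** 2 + (v / b) ** 2
    # Clip the logit to keep np.exp from overflowing far outside the ellipse.
    z = np.clip(sharpness * (1.0 - d), -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(-z))

def conditioned_bce(ellipse_mask, eye_mask, visible_label, eps=1e-7):
    """Binary cross-entropy between (full ellipse * eye region) and the
    visible-part segmentation label (assumed loss, for illustration)."""
    pred = np.clip(ellipse_mask * eye_mask, eps, 1.0 - eps)
    loss = visible_label * np.log(pred) + (1.0 - visible_label) * np.log(1.0 - pred)
    return float(-np.mean(loss))

# Toy example: a half-closed eye exposes only rows 24..39 of the iris.
h, w = 64, 64
eye_mask = np.zeros((h, w))
eye_mask[24:40, :] = 1.0
iris_full = ellipse_soft_mask((32.0, 32.0, 14.0, 12.0, 0.0), h, w)
iris_visible = iris_full * eye_mask
```

In this toy setup `iris_full` covers the whole ellipse, while `iris_visible` is zero wherever the eyelids close the eye region. That is the sense in which only visible-part labels are needed to supervise a full ellipse.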