Abstract
Cataract surgery is the primary treatment for cataracts, and successful outcomes depend on precise surgical skill. To enhance the training of new surgeons, automated analysis of surgical videos can provide valuable feedback, particularly by identifying instruments and anatomical structures. However, current segmentation methods often fail to identify these components accurately and consistently, which limits their effectiveness as training tools. To address this, we propose CatSeg, a surgical scene segmentation model designed to improve the accuracy of both instrument and anatomical segmentation during cataract surgery. CatSeg employs a pyramidal convolution backbone and a Multi-Scale Instrument Feature Attention (MSIFA) module, which enhances edge feature preservation for better instrument identification. Evaluated on the Cataract-1k and CaDIS benchmarks, CatSeg improved mean IoU by 8.13 percentage points (0.8107 → 0.8920) and 1.35 percentage points (0.8670 → 0.8805), respectively. Operating at 6.35 to 7.66 frames per second, CatSeg is well suited to surgical video analysis, paving the way for enhanced training and analysis in the development of new surgeons. Source code is available at: https://github.com/Muraam-Abdel-Ghani/CatSeg/.
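For readers unfamiliar with the metric, the following is a minimal sketch of how a mean IoU score and the reported gains can be computed. It is not taken from the CatSeg repository: the function names `mean_iou` and `gain_in_points`, the toy label lists, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for illustration, and the benchmarks' own evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes.

    `pred` and `target` are flat sequences of integer class labels
    (one per pixel). Classes absent from both are skipped; this is an
    assumption of the sketch, not necessarily the paper's protocol.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def gain_in_points(baseline, improved):
    """Absolute gain between two scores, in percentage points."""
    return (improved - baseline) * 100


# Hypothetical 6-pixel example with 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 2, 2, 2]
print(round(mean_iou(pred, target, 3), 4))

# Gains reported in the abstract (mean IoU values from the text).
print(round(gain_in_points(0.8107, 0.8920), 2))  # Cataract-1k
print(round(gain_in_points(0.8670, 0.8805), 2))  # CaDIS
```

The differences come out to 8.13 and 1.35, which matches the figures in the abstract when they are read as absolute differences in mean IoU rather than relative changes.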
| Original language | English |
|---|---|
| Article number | 110079 |
| Journal | Biomedical Signal Processing and Control |
| Volume | 120 |
| DOIs | |
| Publication status | Published - 1 Jul 2026 |
Keywords
- Anterior segment structures
- Cataract surgical segmentation
- Deep learning
- Holistic scene segmentation
- Intraocular lens
- Multi-scale attention
- Pyramidal convolution
Title: CatSeg: A holistic cataract surgical scene segmentation model