MAGAZINE
medRxiv
mBRSET: A Portable Retina Fundus Photos Benchmark Dataset for Clinical and Demographic Prediction
AUTHORS & DATE
Chenwei Wu, David Restrepo, Luis Filipe Nakayama, Lucas Zago Ribeiro, Zitao Shuai, Nathan Santos Barboza, Maria Luiza Vieira Sousa, Raul Dias Fitterman, Alexandre Durao Alves Pereira, Caio Vinicius Saito Regatieri, Jose Augusto Stuchi, Fernando Korn Malerbi, Rafael E. Andrade
12/07/2024
Abstract
Background
Imaging exams, particularly retinal fundus photos, are crucial for diagnosing and monitoring ophthalmological pathologies. Traditional tabletop fundus cameras, however, are expensive, cumbersome, and hard to access, especially in Low- and Middle-Income Countries (LMICs), which has led to a significant scarcity of ophthalmological data from these regions. Compact, portable cameras offer a cost-effective alternative for screening and telemedicine, both of which are crucial for preventing visual impairment in resource-limited settings. However, existing Artificial Intelligence (AI) algorithms designed for these purposes often lack accuracy and fairness because representative, generalizable data are scarce, both from LMICs and from the growing portable imaging modality. To address this gap, the paper introduces mBRSET, the first publicly available diabetic retinopathy dataset captured using handheld retinal cameras in real-life, high-burden scenarios in Brazil. The dataset provides diverse data and comprehensive metadata to advance the development of fair and accurate AI-assisted diagnostic tools.
Methods
The study was approved by the Institutional Review Board of Instituto de Ensino Superior Presidente Tancredo de Almeida Neves (IPTAN) under protocol number CAAE 64219922.3.0000.9667. The dataset comprises retinal fundus photos alongside clinical and demographic data. All identifiable patient information was removed from the images to ensure confidentiality and adherence to ethical standards. All patients gave written consent to image capture and open publication.
Results
The study introduces mBRSET, the first publicly available dataset of retina images captured using handheld retinal cameras in real-life, high-burden scenarios in Brazil. It contains 5,164 images from 1,291 diverse patients (an average of four images per patient), complete with extensive clinical and demographic metadata. The dataset's utility was validated by benchmarking state-of-the-art deep learning models (ConvNeXt V2, DINOv2, and Swin Transformer V2) on clinical tasks, including diagnosing diabetic retinopathy (DR) and macular edema (ME). F1 scores reached 87.4% for binary DR classification and 83.06% for ME detection, a diagnostic performance comparable to that obtained with traditional tabletop cameras. The models could also predict demographic and socioeconomic factors (gender, education, and insurance status) from the retinal images, with F1 scores up to 84.38% for gender and 76.11% for insurance status, pointing to previously unexplored associations between socioeconomic disparities and retinal health. This resource is intended to drive the development of fair, accurate, and generalizable AI algorithms to enhance ophthalmological care, especially in resource-constrained environments such as LMICs.
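Because every benchmark result above is reported as an F1 score, a short sketch of how that metric is computed may help in reading the numbers. The example below is illustrative only: the labels are made up, the function names `binary_f1` and `macro_f1` are hypothetical, and this abstract does not state which F1 averaging (binary, macro, or weighted) the authors used.

```python
from typing import Sequence


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int], pos_label: int = 1) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    denom = 2 * tp + fp + fn
    # By convention, return 0.0 when the class is neither present nor predicted.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(binary_f1(y_true, y_pred, pos_label=c) for c in labels) / len(labels)


# Made-up labels (1 = referable DR, 0 = no DR); NOT data from mBRSET.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print(binary_f1(y_true, y_pred))  # 0.75 (3 true positives, 1 false positive, 1 false negative)
print(macro_f1(y_true, y_pred))   # 0.75 (both classes score 0.75 here)
```

On imbalanced labels the binary and macro variants can differ substantially, so the averaging scheme should be checked in the full paper before comparing these figures with other benchmarks.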
Conclusion
The mBRSET dataset is a publicly available resource that addresses the shortage of representative ophthalmological data from LMICs. It provides 5,164 retina images captured using portable, handheld cameras in a high-burden clinical setting in Brazil, complete with comprehensive clinical and demographic metadata. The technical validation demonstrated that state-of-the-art deep learning models, such as ConvNeXt V2, DINOv2, and Swin Transformer V2, can achieve high diagnostic performance, comparable to that obtained with traditional tabletop cameras, in clinical tasks including diabetic retinopathy (DR) and macular edema (ME) classification. The study also showed that the models can infer demographic and socioeconomic factors (such as gender, education, and insurance status) from the retinal images, which highlights previously unexplored associations between socioeconomic disparities and retinal health. mBRSET is intended to facilitate the development of fair, accurate, and generalizable AI algorithms for screening, diagnosis, and monitoring, thereby improving access to ocular care and addressing healthcare inequalities in resource-constrained environments globally.