Deep learning models trained on retinal images from routine retinopathy of prematurity (ROP) screening were able to predict bronchopulmonary dysplasia and pulmonary hypertension in premature infants, according to a diagnostic study published in JAMA Ophthalmology.
Researchers evaluated whether posterior pole fundus photographs – captured as part of standard neonatal intensive care unit screening – contained nonocular signals associated with cardiopulmonary disease. The work builds on the emerging field of “oculomics,” in which retinal imaging is used to infer systemic disease risk. In this setting, the potential advantage is practical: extremely premature infants already undergo repeated retinal imaging for ROP screening, offering a built-in pathway for deployment if predictive models prove clinically useful.
The analysis used images from infants enrolled in the multicenter Imaging and Informatics in Retinopathy of Prematurity (i-ROP) study, which recruited infants from seven neonatal intensive care units between 2012 and 2020. For the current analysis, images were limited to those taken at or before 34 weeks' postmenstrual age, to ensure that the retinal data preceded the clinical diagnosis of bronchopulmonary dysplasia or pulmonary hypertension.
The study included a total of 493 infants in the bronchopulmonary dysplasia cohort, of whom 184 were also included in the pulmonary hypertension cohort. Bronchopulmonary dysplasia was defined by oxygen requirement at 36 weeks' postmenstrual age. Pulmonary hypertension was defined by echocardiography at 34 weeks' postmenstrual age at a single site, Oregon Health & Science University, where the additional cardiology labels were available.
Researchers compared three approaches: a model trained on image features alone, a model trained on demographic risk factors alone (gestational age, birth weight, and postmenstrual age), and a multimodal model that combined both image features and demographics. Retinal image features were extracted using a ResNet18 deep learning architecture and then classified using a support vector machine.
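The general shape of such a pipeline can be sketched as follows. This is an illustrative approximation, not the study's actual code: random vectors stand in for real ResNet18 embeddings (the network's penultimate layer is 512-dimensional), the demographic values and labels are synthetic, and the hyperparameters are placeholders.

```python
# Hypothetical sketch: CNN-derived image embeddings, optionally
# concatenated with demographic features, classified by an SVM.
# Synthetic data stands in for real ResNet18 outputs and labels.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 600
image_feats = rng.normal(size=(n, 512))   # stand-in for 512-d ResNet18 embeddings
demographics = rng.normal(size=(n, 3))    # gestational age, birth weight, PMA (standardized)

# Synthetic outcome loosely driven by one dimension of each modality,
# so the classifier has a recoverable signal to learn.
logits = 2.0 * image_feats[:, 0] + demographics[:, 0]
y = (logits + rng.normal(scale=0.3, size=n) > 0).astype(int)

# Multimodal model: concatenate image and demographic features.
X = np.hstack([image_feats, demographics])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear", probability=True).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"held-out AUC: {auc:.2f}")
```

Separating the fixed feature extractor from a lightweight classical classifier, as here, is a common design when labeled cohorts are small: the SVM can be retrained per outcome without fine-tuning the deep network.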
For bronchopulmonary dysplasia, the multimodal model performed better than either demographics alone or imaging alone. In the held-out test set, the multimodal model achieved an area under the receiver operating characteristic curve (AUC) of 0.82 compared with 0.72 for both the demographics-only and imaging-only models.
For pulmonary hypertension, imaging appeared to carry especially strong diagnostic signal. The imaging-only model achieved an AUC of 0.91, outperforming the demographics-only model, which had an AUC of 0.68. The multimodal model also achieved an AUC of 0.91, suggesting that adding demographics did not meaningfully improve performance beyond the retinal features.
A key concern in this type of work is confounding by ROP itself, since prematurity-related conditions tend to cluster. To address this, researchers trained secondary models restricted to images without visible ROP signs and reported that results remained consistent.
In the discussion, researchers suggested several possible biological explanations. For bronchopulmonary dysplasia, they proposed that oxygen exposure, mechanical ventilation, or continuous positive airway pressure might influence retinal or choroidal vasculature in ways that are detectable in fundus images. For pulmonary hypertension, they hypothesized that elevated right-sided pressures could contribute to venous congestion or altered retinal vascular drainage, similar to changes described in adults with pulmonary hypertension.
The researchers emphasized that the findings are proof-of-concept and hypothesis generating. They noted several limitations, including the relatively small pulmonary hypertension cohort (reducing statistical power), the lack of external validation across different imaging devices, and the absence of model explainability analyses. They also cautioned that deep learning models often perform poorly when applied to out-of-distribution images, such as those acquired in low-resource settings or with different camera systems.
Nevertheless, the study raises a clinically relevant question for both neonatology and pediatric ophthalmology: whether retinal imaging already embedded in care pathways could eventually support earlier identification of infants at high risk for severe cardiopulmonary complications, potentially prompting earlier echocardiography or more aggressive pulmonary management.