Colon to Rectum
Gastroenterology. 2025;168(4):741-753
Preclinical protein signatures of Crohn’s disease and ulcerative colitis: A nested case-control study within large population-based cohorts
Background and aims: Biomarkers are needed to identify individuals at elevated risk of inflammatory bowel disease. This study aimed to identify protein signatures predictive of inflammatory bowel disease.
Methods: Using large population-based cohorts (n ≥ 180,000), blood samples were obtained from individuals who later in life were diagnosed with inflammatory bowel disease and compared with age- and sex-matched controls, free from inflammatory bowel disease during follow-up. A total of 178 proteins were measured on Olink platforms. The authors used machine-learning methods to identify protein signatures of preclinical disease in the discovery cohort (n = 312). Their performance was validated in an external preclinical cohort (n = 222) and assessed in an inception cohort (n = 144) and a preclinical twin cohort (n = 102).
Results: In the discovery cohort, a signature of 29 proteins differentiated preclinical Crohn’s disease (CD) cases from controls, with an area under the curve (AUC) of 0.85. Its performance was confirmed in the preclinical validation (AUC, 0.87) and the inception cohort (AUC, 1.0). In preclinical samples, downregulated (but not upregulated) proteins related to gut barrier integrity and macrophage functionality correlated with time to diagnosis of CD. The preclinical ulcerative colitis signature had a significant, albeit lower, predictive ability in the discovery (AUC, 0.77), validation (AUC, 0.67), and inception cohorts (AUC, 0.95). The preclinical signature for CD demonstrated an AUC of 0.89 when comparing twins with preclinical CD with matched external healthy twins, but its predictive ability was lower (AUC, 0.58; p = 0.04) when comparing them with their healthy twin siblings, that is, when accounting for genetic and shared environmental factors.
Conclusion: The authors identified protein signatures for predicting a future diagnosis of Crohn’s disease (CD) and ulcerative colitis, validated across independent cohorts. In the context of CD, the signature offers potential for early prediction.
DOI: 10.1053/j.gastro.2024.11.006
Dr. Lena Sophie Mayer
Dr. Lena Sophie Mayer, Specialist Internal Medicine, University Medical Center Freiburg, Department of Internal Medicine II, Hugstetter Str. 55, 79106 Freiburg, Germany
Prediction of inflammatory bowel disease based on peripheral blood preclinical protein signatures?
The inflammatory bowel diseases (IBD) Crohn’s disease (CD) and ulcerative colitis (UC) are caused by a complex interplay of genetic and environmental factors that lead to dysregulation of the mucosal immune response. At the time of diagnosis, a significant proportion of patients with CD already present with complications such as fistulas, strictures, or stenoses. Patients with UC may experience treatment-refractory disease during their initial flare, sometimes necessitating colectomy. To date, no validated predictive biomarkers are available for early diagnosis of IBD. The clinical manifestation of IBD is preceded by a phase of subclinical inflammation, which must be detected before preventive measures can be taken.
In this Swedish case-control study, the authors determined preclinical protein signatures in peripheral blood to predict the later manifestation of CD or UC in a preclinical cohort (n = 28,000), in which each individual with subsequent CD or UC was matched to a healthy control. The signatures were then validated in an independent preclinical cohort (n = 143,000). An external, population-based cohort (n = 519) was used to determine the predictive accuracy of the signatures. Furthermore, the influence of genetic and shared environmental risk factors on the predictive protein signatures was investigated using a preclinical population-based twin cohort (n = 12,591).
In the preclinical cohorts, the median time from sample collection to IBD diagnosis was 8.7 years (interquartile range [IQR], 3.2–14.3 years) for CD and 7.2 years (IQR, 3.4–14.2 years) for UC. Of 34 proteins associated with the later onset of CD in the preclinical screening cohort, 9 proteins were confirmed in the larger validation cohort. Six of these proteins were upregulated and 3 were downregulated. Correlation analyses between protein levels and time to diagnosis showed a significant negative correlation only for the 3 downregulated proteins, which are associated with mucosal barrier integrity and macrophage functionality. Of 45 proteins identified as predictive of a UC diagnosis, 12 upregulated proteins but no downregulated proteins could be validated. Only MMP-10 levels correlated with time to diagnosis.
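The correlation analysis described above can be illustrated with a rank correlation between protein level and time to diagnosis. The following is a minimal sketch only: the commentary does not state which correlation coefficient the authors used, so Spearman's rank correlation is an assumption here, and the values in `years_to_diagnosis` and `protein_level` are invented for illustration, not study data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical toy data (NOT from the study): one value per preclinical case.
years_to_diagnosis = [1, 3, 5, 8, 12]
protein_level = [2.0, 1.8, 1.1, 1.4, 0.9]

rho = spearman(years_to_diagnosis, protein_level)
print(round(rho, 3))  # approximately -0.9 for this toy data
```

A rank-based coefficient is used in this sketch because it does not assume a linear relationship between protein level and time to diagnosis; a significance test would be needed in addition to the coefficient itself.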
Regularized logistic regression was used to determine a signature of 29 proteins that distinguished with high predictive capacity between individuals who later developed CD and healthy controls (AUC 0.85; 95% CI: 0.78–0.93 in the screening cohort and AUC 0.87; 95% CI: 0.77–0.97 in the validation cohort). The sensitivity was 77% and the specificity 87%. The predictive capacity increased with proximity to the time of diagnosis, was higher for male participants than for female participants, and was independent of age at study inclusion. The capacity of the identified signature to predict manifestation of UC was higher in the screening cohort than in the validation cohort (AUC 0.77, 95% CI: 0.71–0.83 and AUC 0.67, 95% CI: 0.59–0.76, respectively), with a moderate sensitivity and specificity of 61%. The predictive capacity did not increase with proximity to the time of diagnosis, and it was higher for older participants than for younger participants.
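To make the reported metrics concrete, the sketch below fits an L2-penalized logistic regression by plain gradient descent and then computes AUC, sensitivity, and specificity from the predicted scores. It is an illustration under stated assumptions, not the authors' pipeline: the commentary does not specify the penalty type, tuning procedure, or classification threshold, so the L2 penalty, the parameters `lam`, `lr`, and `steps`, the 0.5 threshold, and all data values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_l2_logistic(X, y, lam=0.1, lr=0.1, steps=2000):
    """Gradient descent on mean log-loss + (lam/2)*||w||^2.
    The intercept b is not penalized. All hyperparameters are assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Predicted probability of being a (future) case for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def auc(scores, labels):
    """AUC as the probability that a random case scores higher than a
    random control (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical one-protein toy data (NOT study data): 1 = later diagnosed.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0, 0, 1, 1]
w_weak, b_weak = fit_l2_logistic(X, y, lam=0.01)
w_strong, _ = fit_l2_logistic(X, y, lam=1.0)
print(w_weak[0] > w_strong[0])  # a stronger penalty shrinks the coefficient

# Hypothetical scores with one misranked case, to show a non-perfect AUC.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(round(auc(scores, labels), 3))            # 8 of 9 pairs ranked correctly
print(sensitivity_specificity(scores, labels))  # 2 of 3 cases, 2 of 3 controls
```

Note that AUC is threshold-free, whereas sensitivity and specificity depend on the chosen cut-off; this is why a signature can be summarized by one AUC but by many possible sensitivity/specificity pairs.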
The predictive signatures for CD and UC were each well suited for discriminating between individuals with a future diagnosis of IBD and healthy controls in the external validation cohort, but they did not distinguish well between CD and UC due to some overlap of proteins in both signatures.
A cohort of 111 healthy twin pairs (monozygotic n = 35, dizygotic n = 76) was used to estimate the heritability of the protein markers. None of the proteins downregulated in preclinical CD showed high heritability. Comparative analyses of twins with preclinical CD against external twin controls on the one hand and against their healthy twin siblings on the other showed that genetic and shared environmental factors influence the predictive protein signature. For the signature predictive of UC, this was shown only to a limited extent.
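How heritability can be estimated from monozygotic (MZ) and dizygotic (DZ) twin pairs is sketched below using Falconer's classical formula, h² = 2(r_MZ − r_DZ). This is an assumption for illustration: the commentary does not say which heritability model the authors applied, the Pearson correlation between co-twins is used here as a simple stand-in for an intraclass correlation, and all protein values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def falconer_heritability(mz_pairs, dz_pairs):
    """Falconer's estimate h^2 = 2 * (r_MZ - r_DZ), clamped to [0, 1].
    Each argument is a list of (twin_1_value, twin_2_value) tuples."""
    r_mz = pearson([a for a, _ in mz_pairs], [b for _, b in mz_pairs])
    r_dz = pearson([a for a, _ in dz_pairs], [b for _, b in dz_pairs])
    h2 = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, h2))

# Hypothetical protein levels per twin pair (NOT study data).
mz_pairs = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]  # r_MZ = 1.0
dz_pairs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]  # r_DZ = 0.6

print(round(falconer_heritability(mz_pairs, dz_pairs), 3))  # approximately 0.8
```

The logic is that MZ twins share essentially all of their genetic variation and DZ twins on average half, so a protein whose levels are much more similar within MZ pairs than within DZ pairs is likely to be under stronger genetic influence. With only 35 MZ and 76 DZ pairs, such estimates would carry wide uncertainty.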
Limitations of this study include the reliance on a case-control design, the lack of longitudinal samples to investigate individual changes in the signatures over time, differing cohort sizes, and the small number of cases with preclinical CD or UC in the twin cohort. The relatively high patient age at diagnosis limits the transferability of the signatures to younger populations. Plasma was used in one cohort and serum in the others, which could explain the lack of replication of some dysregulated proteins. Finally, the protein markers investigated were preselected, which, given the different underlying pathophysiological mechanisms in CD and UC, could explain the different robustness of the respective signatures.
In summary, the authors identified and validated a protein signature predictive of the later manifestation of CD many years before diagnosis in large and demographically heterogeneous population-based cohorts. Results from the twin cohort suggest that genetic and environmental factors partly influence the dysregulation of the signature proteins. The signature determined for UC showed a lower, though significant, predictive capacity. Downregulation of protective proteins associated with intestinal barrier integrity and macrophage function correlated with the time to diagnosis of CD. The long preclinical phase of more than 16 years before onset of CD allows for preventive measures such as dietary adjustments and early drug therapy to positively influence the course and prognosis of the disease. However, some aspects remain to be addressed, including the target population for preclinical screening in light of health economic considerations, the optimal timing of protein signature screening relative to the typical age of IBD onset, and the ideal timepoint at which to initiate monitoring diagnostics, preventive measures, or pharmacological therapy in preclinical IBD.