AI in Clinical Medicine, ISSN 2819-7437 online, Open Access
Article copyright, the authors; Journal compilation copyright, AI Clin Med and Elmer Press Inc
Journal website https://aicm.elmerpub.com

Original Article

Volume 2, 2026, e15


A Machine Learning Model to Guide Computed Tomography Angiography Use in Acute Gastrointestinal Bleeding: A Decision-Support Tool for Gray-Zone Cases

Figures

Figure 1. Proposed clinical workflow for implementation of the model. This diagram outlines a step-by-step integration of the machine learning model into clinical practice. Starting with patient presentation and lab draw, the model ingests key laboratory values (e.g., hemoglobin, INR, BUN) to generate a predicted probability of a positive CTA. Based on this probability and clinical context, the model stratifies patients into high- and low-likelihood categories, guiding providers on whether to pursue immediate CTA or consider alternative diagnostics. This approach supports real-time, evidence-informed decision-making for intermediate-risk cases. BUN: blood urea nitrogen; CTA: computed tomography angiography; INR: international normalized ratio.
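The probability-then-stratify step in this workflow can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the coefficients are the standardized values reported in Table 3, while the intercept, the decision threshold, the feature key names, and the assumption that inputs arrive as z-scores are all hypothetical placeholders, since the article does not report them.

```python
import math

# Standardized coefficients as reported in Table 3 of this article.
# The dictionary keys are hypothetical names chosen for this sketch.
COEFFICIENTS = {
    "max_inr": -0.5441,
    "closest_bun": -0.3422,
    "max_hct": -1.4636,
    "min_hct": -0.5466,
    "delta_hb": 1.2389,
    "delta_hct": -1.1907,
    "min_hb": 0.6127,
    "max_hb": 1.0741,
    "min_plt": -0.1214,
}
INTERCEPT = 0.0  # ASSUMPTION: the intercept is not reported in the article
THRESHOLD = 0.5  # ASSUMPTION: the article's operating threshold is not reported


def predict_probability(z_scores):
    """Return the logistic probability of a positive CTA from z-scored labs.

    Features missing from z_scores are treated as 0, i.e. at the cohort mean.
    """
    logit = INTERCEPT + sum(
        beta * z_scores.get(name, 0.0) for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def triage(z_scores, threshold=THRESHOLD):
    """Stratify a patient into a likelihood category and return (label, p)."""
    p = predict_probability(z_scores)
    label = "high-likelihood" if p >= threshold else "low-likelihood"
    return label, p


# Hypothetical patients: one SD above the mean on a single feature.
print(triage({"delta_hb": 1.0}))
print(triage({"max_hct": 1.0}))
```

Because the model was tuned for sensitivity, a deployed threshold would likely sit below 0.5; the value used here only makes the two categories visible.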
Figure 2. Workflow for developing an interpretable logistic regression model to assist CTA decision-making. The flowchart outlines the sequential development steps. Using routinely available laboratory values from the 24 h before imaging, the workflow includes data preprocessing, statistical evaluation of features, model selection with SMOTE upsampling, and final performance tuning focused on sensitivity to minimize missed bleeds. CTA: computed tomography angiography; SMOTE: Synthetic Minority Oversampling Technique.
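The SMOTE upsampling step named in this workflow creates synthetic minority-class rows by interpolating between a minority sample and one of its nearest minority neighbors. The sketch below shows that core idea using only the standard library; it is not the authors' pipeline (which presumably used an existing library implementation), and the function name, the toy INR and hematocrit values, and the parameters `k` and `seed` are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic rows by interpolating toward minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base row (excluding the row itself)
        others = [row for row in minority if row is not base]
        neighbors = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation fraction in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical minority-class rows: [max INR, max hematocrit]
minority_rows = [[1.0, 30.0], [1.4, 28.0], [1.2, 35.0], [2.0, 25.0]]
new_rows = smote_sketch(minority_rows, n_new=6, k=2, seed=1)
for row in new_rows:
    print([round(value, 2) for value in row])
```

Each synthetic row lies on the line segment between two real minority rows, so upsampling of this kind should be applied to the training split only, never to held-out data.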
Figure 3. Logistic regression coefficients for predicting positive CTA. This plot displays the relative importance of laboratory features in the logistic regression model. Permutation importance was computed with a random forest predictor; those findings therefore rely on random forest assumptions, including independence of variable shuffling. Negative coefficients (left of zero) are associated with a decreased likelihood of a positive CTA, while positive coefficients (right of zero) increase the predicted probability. Max hematocrit and delta hemoglobin emerged as highly influential predictors despite being statistically non-significant on univariate analysis, highlighting the value of multivariable modeling. CTA: computed tomography angiography.
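The variable-shuffling idea behind permutation importance can be shown with a model-agnostic sketch: shuffle one feature column, re-score the model, and record the average drop in accuracy. This is a simplified illustration rather than the authors' random forest analysis; the stand-in classifier, the toy rows, and the function names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(labels)


def permutation_importance_sketch(predict, rows, labels, column, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with the shuffled value in place of the original one
        permuted = [
            row[:column] + [value] + row[column + 1:]
            for row, value in zip(rows, shuffled)
        ]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats


# Hypothetical stand-in model that looks only at column 0.
def toy_predict(row):
    return int(row[0] > 0.5)


toy_rows = [[0.1, 3.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
            [0.6, 2.5], [0.7, 3.5], [0.8, 1.5], [0.9, 0.5]]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

importance_used = permutation_importance_sketch(toy_predict, toy_rows, toy_labels, column=0)
importance_unused = permutation_importance_sketch(toy_predict, toy_rows, toy_labels, column=1)
print(importance_used, importance_unused)
```

Shuffling a column breaks its relationship with the other columns as well as with the outcome, which is why correlated features (such as minimum and maximum hematocrit) can receive misleading importance values under this method.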
Figure 4. Projected reduction in unnecessary CTA scans with model implementation. The left column represents baseline imaging practice, where 24,000–40,000 of every 100,000 CTAs are estimated to be unnecessary. The right column shows the anticipated effect of model-assisted triage, which reduces avoidable CTAs by 20% while preserving all necessary scans. This change corresponds to a direct reduction in imaging volume, with estimated savings between 24,000 and 40,000 scans per 100,000 patients. CTA: computed tomography angiography.

Tables

Table 1. Univariate Statistical Findings and Model-Level Impact for Candidate Predictors (24-h Pre-CTA)
 
| Feature | Definition (window) | t-test P | Mann-Whitney P | Cohen’s d | Bayesian P(mean diff > 0) | Mean diff (95% CI) | Permutation importance | Importance in final model |
|---|---|---|---|---|---|---|---|---|
| Max INR | Peak INR | 0.0098 | 0.0836 | 0.3443 | 0.00 | −0.31 (−0.54, −0.08) | 0.052 | High (top tier) |
| Closest BUN | Closest BUN to order time | 0.1196 | 0.6319 | 0.2529 | 0.06 | −4.99 (−11.22, 1.23) | 0.052 | Moderate |
| Max Hct | Highest Hct | 0.2480 | 0.1609 | 0.2450 | 0.12 | −1.43 (−3.83, 0.97) | 0.090 | Highest (rank-1) |
| Min Hct | Lowest Hct | 0.2250 | 0.1494 | 0.2537 | 0.11 | −1.56 (−4.05, 0.93) | | Moderate |
| Δ-Hb | Max−Min Hb | 0.4174 | 0.4036 | −0.1754 | 0.79 | 0.21 (−0.30, 0.72) | 0.041 | High (top tier) |
| Δ-Hct | Max−Min Hct | 0.8652 | 0.8993 | −0.0356 | 0.57 | 0.13 (−1.36, 1.62) | | High (top tier) |
| Min Hb | Lowest Hb | 0.4573 | 0.2501 | 0.1608 | 0.23 | −0.35 (−1.27, 0.57) | | Low |
| Max Hb | Highest Hb | 0.7525 | 0.5128 | 0.0679 | 0.38 | −0.14 (−1.01, 0.73) | | Low |
| Min Plt | Lowest Plt | 0.6674 | 0.9706 | 0.0773 | 0.33 | −8.40 (−46.57, 29.76) | < 0.01 | Low |

Continuous features were assessed with independent-samples t-tests and Mann–Whitney U tests; effect size (Cohen’s d) and Bayesian posterior probability of a positive mean difference are reported. Non-linear predictive contribution was estimated via random-forest permutation importance. Notably, Δ-hemoglobin demonstrated a moderate predictive trend in isolation, while Δ-hematocrit was not significant univariately but later emerged as a high-impact contributor in the final model. The feature Max INR consistently emerged as the most robust predictor across statistical tests, Bayesian analysis, and effect size calculations. This illustrates how certain features may lack standalone power yet contribute meaningfully in multivariable contexts. The discrepancy between raw statistical significance and feature importance highlights the potential for complex collinearity and reinforces the value of structured feature engineering when developing interpretable predictive models. In the final logistic model, maximum hematocrit was the strongest predictor, with Δ-hemoglobin and Δ-hematocrit also highly influential, whereas maximum INR was the only feature with significant univariate separation. BUN: blood urea nitrogen; CI: confidence interval; CTA: computed tomography angiography; Hb: hemoglobin; Hct: hematocrit; INR: international normalized ratio; Plt: platelets.
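The effect-size column in Table 1 can be illustrated with a short calculation of Cohen's d using a pooled standard deviation. This is a sketch under stated assumptions: the article does not say which group is subtracted from which or which variant of d was used, so the sign convention, the pooled-SD formula, and the toy INR values below are all hypothetical.

```python
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d (mean of A minus mean of B) using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_var ** 0.5


# Hypothetical peak INR values for CTA-positive and CTA-negative patients.
inr_cta_positive = [1.0, 1.1, 1.2, 1.3]
inr_cta_negative = [1.2, 1.4, 1.6, 1.8]

mean_diff = statistics.mean(inr_cta_positive) - statistics.mean(inr_cta_negative)
effect_size = cohens_d(inr_cta_positive, inr_cta_negative)
print(f"mean difference = {mean_diff:.2f}, Cohen's d = {effect_size:.2f}")
```

Reversing the order of the two groups flips the sign of both the mean difference and d, so the direction of subtraction matters when comparing signs across columns.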

 

Table 2. Performance Metrics Across Classification and Anomaly Detection Models for Predicting CTA-Positive GI Bleeding
 
Evaluation metrics for classification models

| Model | Accuracy | Precision | Recall | F1 | ROC-AUC |
|---|---|---|---|---|---|
| Naive logistic regression | 0.64 | 0.72 | 0.64 | 0.67 | 0.73 |
| Logistic regression with ADASYN upsampling | 0.64 | 0.81 | 0.64 | 0.67 | 0.70 |
| Logistic regression with SMOTE upsampling | 0.68 | 0.82 | 0.68 | 0.71 | 0.71 |
| Random forest | 0.75 | 0.61 | 0.75 | 0.67 | 0.61 |
| XGBoost | 0.64 | 0.59 | 0.64 | 0.61 | 0.5 |

Evaluation metrics for anomaly detection models

| Model | Accuracy | Precision | Recall | F1 | ROC-AUC |
|---|---|---|---|---|---|
| AutoEncoder | 0.95 | 0.99 | 0.95 | 0.97 | 0.54 |
| One-class SVM | 0.99 | 0.99 | 0.99 | 0.99 | 0.51 |
| Isolation forest | 0.99 | 0.99 | 0.99 | 0.99 | 0.50 |

This table presents accuracy, precision, recall, F1 score, and ROC-AUC for both traditional classification models and unsupervised anomaly detection approaches. Among classifiers, logistic regression with SMOTE upsampling yielded the best overall balance across metrics, achieving the highest precision (0.82) and F1 score (0.71) with a ROC-AUC of 0.71, close to the 0.73 of naive logistic regression. While random forest had higher recall (0.75), it showed lower precision, suggesting over-identification of positives. In contrast, anomaly detection models demonstrated near-perfect precision and recall but low ROC-AUC, indicating poor discrimination likely due to class imbalance and lack of labeled signal. These findings emphasize the challenge of low-prevalence detection in small datasets and highlight the potential utility of structured upsampling in improving model robustness. CTA: computed tomography angiography; ROC-AUC: area under the receiver operating characteristic curve; SMOTE: Synthetic Minority Oversampling Technique.
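In every row of Table 2 the Accuracy and Recall values coincide, which is the pattern that support-weighted averaging of per-class recall produces. The article does not state its averaging scheme, so reading the precision, recall, and F1 columns as weighted averages is an assumption. The sketch below computes such weighted metrics on hypothetical labels to show the identity.

```python
def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        support = sum(t == cls for t in y_true)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / support
        f1_c = 2 * prec_c * rec_c / (prec_c + rec_c) if (prec_c + rec_c) else 0.0
        weight = support / n  # each class contributes in proportion to its size
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 8 of one class, 2 of the other.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

metrics = weighted_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under weighted averaging, recall always equals accuracy, so the Recall column carries no information beyond Accuracy; class-specific sensitivity for CTA-positive cases would have to be reported separately.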

 

Table 3. Multivariable Standardized Logistic Regression Coefficients and Odds Ratios for Model Predicting CTA Positivity
 
| Feature | Definition (window) | Logistic coefficient (β) | Standard error | Z statistic | P-value | Odds ratio (OR) | 95% CI (lower) | 95% CI (upper) |
|---|---|---|---|---|---|---|---|---|
| Max INR | Peak INR | −0.5441 | 0.3362 | −1.6184 | 0.1056 | 0.5803 | 0.3003 | 1.1217 |
| Closest BUN | Closest BUN to order time | −0.3422 | 0.2120 | −1.6140 | 0.1065 | 0.7102 | 0.4687 | 1.0761 |
| Max Hct | Highest Hct | −1.4636 | | | | 0.2314 | | |
| Min Hct | Lowest Hct | −0.5466 | | | | 0.5789 | | |
| Δ-Hb | Max−Min Hb | 1.2389 | | | | 3.4518 | | |
| Δ-Hct | Max−Min Hct | −1.1907 | | | | 0.3040 | | |
| Min Hb | Lowest Hb | 0.6127 | | | | 1.8454 | | |
| Max Hb | Highest Hb | 1.0741 | | | | 2.9275 | | |
| Min Plt | Lowest Plt | −0.1214 | 0.2093 | −0.5801 | 0.5619 | 0.8857 | 0.5877 | 1.3348 |

Table 3 reports coefficients and odds ratios from the final multivariable standardized logistic regression model used to predict CTA positivity. All continuous predictors were standardized prior to model fitting; therefore, odds ratios correspond to a one-standard-deviation increase in each laboratory variable rather than a one-unit change. Standard errors, confidence intervals, and P-values are reported only for variables with reliably estimable variance. For several hemoglobin- and hematocrit-derived features, substantial multicollinearity prevented stable estimation of standard errors, and corresponding inferential statistics are not shown. Odds ratios for these variables should therefore be interpreted as descriptive of model behavior rather than independent inferential estimates. BUN: blood urea nitrogen; CI: confidence interval; CTA: computed tomography angiography; Hb: hemoglobin; Hct: hematocrit; INR: international normalized ratio; Plt: platelets.
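The relationship between the coefficient, standard error, and odds-ratio columns of Table 3 can be checked with a short calculation: the odds ratio is exp(β), and a Wald-type 95% interval is exp(β ± 1.96 × SE). The article does not state how its intervals were computed, so the Wald form and the 1.96 multiplier are assumptions; the inputs are the rounded values printed in the table, so results agree only to about three decimal places.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to (OR, CI lower, CI upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# (coefficient, standard error) pairs as reported in Table 3
reported = {
    "Max INR": (-0.5441, 0.3362),
    "Closest BUN": (-0.3422, 0.2120),
}

results = {name: odds_ratio_with_ci(beta, se) for name, (beta, se) in reported.items()}
for name, (odds_ratio, lower, upper) in results.items():
    print(f"{name}: OR = {odds_ratio:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the predictors were standardized, an odds ratio of about 0.58 for Max INR means the odds of a positive CTA fall by roughly 42% per one-standard-deviation increase in peak INR, with an interval that crosses 1.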