Abstract
Aim
The purpose of this study was to use artificial intelligence (AI) to predict the risk of pulmonary embolism (PE) in patients with suspected PE admitted to the emergency room, based on physical examination findings, laboratory results, and clinical probability prediction scores, without computed tomography angiography.
Materials and Methods
A total of 156 patients admitted to the emergency room with suspected PE were analyzed. Seventy-eight patients were diagnosed with PE through anamnesis, physical examination, clinical probability prediction scores, investigations, and imaging, and were included in the PE group; the 78 patients in whom PE was ruled out formed the control group. The data set includes gender, age, shock index, vital signs, complaints at arrival to the emergency department, comorbidities, medications used, medical history, radiological examinations, presence of deep vein thrombosis, electrocardiography, echocardiography findings, Wells score, Geneva score, PERC score, and laboratory tests performed.
Results
The mean age of the patients in the study was 69.46±15 years. Dyspnea was the most prevalent presentation, affecting 88 patients (56.4%). The most prevalent comorbidities were hypertension in 52 patients (33.1%), cancer in 51 patients (32.7%), and coronary artery disease in 35 patients (22.4%). The Wells score, D-dimer, low partial carbon dioxide pressure, and tachycardia were found to be important factors in the diagnosis of PE. Parameters that differed significantly between the groups were used as inputs to a multilayer perceptron AI model, which classified PE with 96% accuracy and 89% specificity.
Conclusion
According to the findings of our study, a thorough review of the patient’s anamnesis, physical examination, laboratory and imaging data, and the application of scores are all crucial in the diagnosis of PE. Furthermore, it was determined that artificial intelligence can be used to diagnose PE before using imaging modalities.
Introduction
Venous thromboembolism (VTE) comprises deep vein thrombosis (DVT) and pulmonary embolism (PE). PE is a clinical condition that occurs when a thrombus travels from the venous circulation to the pulmonary arteries and occludes them. The clinical presentation varies from asymptomatic to fatal, which makes it difficult to determine the true incidence of PE. Nevertheless, the incidence has increased over the years (1). PE is a critical condition that, along with myocardial infarction and stroke, is among the leading causes of cardiovascular death (2). PE-related mortality may vary depending on the patient’s age, comorbid diseases, disease burden, and duration of effective treatment. Thirty-day all-cause mortality in patients with PE has been reported as 6.6%, with PE-related mortality of 1.1% at seven days and 1.8% at thirty days (3). The annual cost of PE to the European Union countries was found to be 8.5 billion Euros, including indirect costs such as pre-hospital prevention, in-hospital treatment, and post-hospital care. The aging of the population, the increase in incidence, and the decrease in mortality will together increase the financial burden of VTE events on governments in Europe and other countries of the world (4).
Although PE is a common disease, it has no pathognomonic findings or diagnostic tests, so the diagnosis rests primarily on the clinician’s judgment. Diagnosis is difficult because the clinical presentation ranges from asymptomatic to fatal. PE can be fatal if diagnosis and treatment are delayed (5). Its relatively high prevalence makes it a common and potentially life-threatening disease (6). Early diagnosis of PE is crucial, as even patients with minor symptoms are at risk of recurrent PE (7). Currently, the gold standard diagnostic method is computed tomography pulmonary angiography (CTPA) (8). Because of the risks and costs associated with CTPA, including contrast agent allergy, contrast nephropathy, and radiation exposure, diagnostic algorithms have been proposed and clinical probability prediction scores have been developed to diagnose PE before imaging (9-15). Two of these scores are the Wells clinical score and the Geneva score. The Wells clinical score is a widely recognized and validated tool for assessing the clinical probability of PE. It includes physical findings and risk factors such as DVT, lack of alternative diagnoses, tachycardia, immobilization or recent surgery, history of DVT or PE, hemoptysis, and malignancy (16). The Geneva score is likewise a standardized tool for determining the clinical probability of PE, based on several criteria including heart rate, clinical signs of DVT, hemoptysis, and previous PE or DVT (17).
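As an illustration of how the Wells criteria listed above combine into a clinical probability, the following minimal sketch sums the point values of the original Wells rule and applies the commonly used three-level interpretation (low <2, moderate 2-6, high >6). The dictionary keys, function names, and example patient are hypothetical and are not fields or cases from the study data set.

```python
# Point values of the original Wells rule for PE.
# Key names are illustrative assumptions, not variables from the study data set.
WELLS_POINTS = {
    "clinical_signs_of_dvt": 3.0,
    "alternative_diagnosis_less_likely": 3.0,
    "heart_rate_over_100": 1.5,
    "immobilization_or_recent_surgery": 1.5,
    "previous_dvt_or_pe": 1.5,
    "hemoptysis": 1.0,
    "malignancy": 1.0,
}


def wells_score(findings):
    """Sum the points of every criterion marked True in `findings`."""
    unknown = set(findings) - set(WELLS_POINTS)
    if unknown:
        raise KeyError(f"unknown criteria: {sorted(unknown)}")
    return sum(WELLS_POINTS[name] for name, present in findings.items() if present)


def wells_category(score):
    """Three-level interpretation: low (<2), moderate (2-6), high (>6)."""
    if score < 2:
        return "low"
    if score <= 6:
        return "moderate"
    return "high"


# Hypothetical patient: tachycardia, previous DVT, and hemoptysis.
patient = {"heart_rate_over_100": True, "previous_dvt_or_pe": True, "hemoptysis": True}
score = wells_score(patient)
print(score, wells_category(score))  # 4.0 moderate
```

The score is a bedside aid for estimating pretest probability; it does not replace clinical judgment or confirmatory imaging.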
With the increasing awareness of PE among physicians and the increasing availability of diagnostic tests and imaging, the need to avoid unnecessary tests has become evident, both to prevent test-related complications and to reduce excessive cost and length of hospitalization. For this purpose, the pulmonary embolism rule-out criteria (PERC) were defined. PERC is a rule based on clinical criteria that is used to exclude PE in patients suspected of having it; it aims to prevent unnecessary additional testing in low-risk patients by assessing whether they have certain clinical characteristics. Apart from these algorithms, only D-dimer has been validated as a biomarker to aid in the decision to exclude PE. Elevated white blood cell count, serum lactate dehydrogenase (LDH), C-reactive protein (CRP), aspartate aminotransferase (AST), and erythrocyte sedimentation rate may also be detected, although none of these is specific for PE. A timely and accurate diagnosis, which can significantly affect patient outcomes, requires a high index of suspicion and rapid intervention together with advanced imaging techniques such as CTPA and the application of diagnostic algorithms. In this study, our aim is to investigate the feasibility of using artificial intelligence (AI) approaches in the diagnosis of PE; to identify possible risk factors; and to ensure that CTPA, the gold standard in the diagnosis of PE, is used in appropriate patients based on AI findings.
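The rule-out logic described above can be sketched as a simple checklist. The sketch below assumes the eight published PERC criteria (age under 50 years, heart rate under 100 bpm, oxygen saturation of at least 95%, no hemoptysis, no estrogen use, no prior DVT or PE, no unilateral leg swelling, and no surgery or trauma within the previous 4 weeks); the class name, field names, and example patients are hypothetical and are not taken from the study data set.

```python
from dataclasses import dataclass


@dataclass
class PercInputs:
    # Field names are illustrative assumptions, not study variables.
    age_years: int
    heart_rate_bpm: int
    oxygen_saturation_pct: float
    hemoptysis: bool
    estrogen_use: bool
    prior_dvt_or_pe: bool
    unilateral_leg_swelling: bool
    recent_surgery_or_trauma: bool  # within the previous 4 weeks


def perc_positive_criteria(p):
    """Return the list of PERC criteria that the patient fails."""
    checks = {
        "age >= 50 years": p.age_years >= 50,
        "heart rate >= 100 bpm": p.heart_rate_bpm >= 100,
        "oxygen saturation < 95%": p.oxygen_saturation_pct < 95,
        "hemoptysis": p.hemoptysis,
        "estrogen use": p.estrogen_use,
        "prior DVT or PE": p.prior_dvt_or_pe,
        "unilateral leg swelling": p.unilateral_leg_swelling,
        "recent surgery or trauma": p.recent_surgery_or_trauma,
    }
    return [name for name, failed in checks.items() if failed]


def perc_negative(p):
    """PERC is negative only when none of the eight criteria is present."""
    return not perc_positive_criteria(p)


# Two hypothetical patients.
young = PercInputs(35, 82, 98.0, False, False, False, False, False)
older = PercInputs(69, 110, 91.0, False, False, True, False, False)
print(perc_negative(young))           # True
print(perc_positive_criteria(older))  # four failed criteria
```

The rule is intended for patients whose clinical probability is already judged to be low; a single positive criterion means PE cannot be excluded on clinical grounds alone.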
Materials and Methods
This study was approved by the İnönü University Scientific Research and Publication Ethics Committee (decision number: 2022/45, date: 20.04.2022). In addition, the study was supported by the İnönü University Scientific Research Projects Unit (project number 3002).
Dataset
In this study, 156 patients admitted to the Department of Emergency Medicine of İnönü University Faculty of Medicine Turgut Özal Medical Center between 13.10.2022 and 14.10.2024 with PE symptoms were prospectively analyzed. Adult patients presenting to the emergency department with PE symptoms were included in the study. Pediatric patients under 18 years of age, as well as pregnant and recently delivered patients, were excluded. Anamnesis, physical examination, clinical probability prediction scores (COTS), and laboratory tests were evaluated. All patients underwent bolus-tracking CTPA, the gold standard imaging method in PE. Seventy-eight patients were diagnosed with PE and enrolled in the PE group. The 78 patients with alternative diagnoses in whom PE was ruled out were enrolled as the control group.
The following data were examined from the medical records of patient admissions: admission number, name-surname, gender, age, shock index, vital signs (temperature, pulse, systolic and diastolic blood pressures, oxygen saturation), complaints at presentation to the emergency department, comorbidities, medications used, medical history, radiological examinations, presence of DVT, electrocardiography (ECG), echocardiography findings [ejection fraction (EF), pulmonary artery pressure (PAP), right ventricular volume (RVV)], Wells score, Geneva score, PERC score, and laboratory tests. The laboratory tests comprised hemoglobin, hematocrit, mean corpuscular volume, monocyte count, platelet count, activated partial thromboplastin time, international normalized ratio, CRP, prothrombin time, platelet distribution width, red cell distribution width, liver enzymes [alanine aminotransferase (ALT) and AST], creatine kinase (CK), CK myocardial band, renal function tests [blood urea nitrogen (BUN) and creatinine], total protein, albumin, LDH, triglycerides, total cholesterol, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, blood gas parameters (pH, PCO2, PO2, lactate, HCO3), D-dimer, fibrinogen, pro-brain natriuretic peptide, procalcitonin (PCT), high-sensitivity troponin (HS troponin), plasmin, vitamin K, fibrinopeptide A, factor V Leiden, and protein S.
Artificial Intelligence
AI is increasingly being integrated into various aspects of healthcare, revolutionizing the field and providing new opportunities for better patient care and outcomes. AI applications in healthcare cover a wide range of areas, from diagnosis and treatment to administrative tasks and patient engagement. Machine learning (ML) techniques such as support vector machines, neural networks, and deep learning have been instrumental in leveraging structured and unstructured data to improve decision-making in healthcare (18). In the field of medical imaging, AI has played an important role in improving diagnostic accuracy and treatment strategies. ML algorithms have been used to predict outcomes and help analyze medical images such as magnetic resonance imaging and computed tomography scans, leading to improved diagnostic capabilities (19). The integration of AI into healthcare has been positively received by both patients and healthcare professionals, highlighting the potential benefits of AI in improving healthcare delivery and patient outcomes (20). However, it is crucial to ensure the interpretability and ethical use of AI in healthcare to protect patient safety and data privacy (21). Overall, the development of AI in healthcare holds great promise for transforming the sector, increasing diagnostic accuracy, improving treatment outcomes, and optimizing healthcare delivery processes.
Statistical Analysis
Data analysis was performed using IBM© SPSS© Statistics (version 25 for Windows, IBM Corporation, Armonk, New York, USA). The Shapiro-Wilk test, histogram distribution, and skewness-kurtosis parameters were used for normality analysis. Descriptive statistics are presented as mean ± standard deviation for variables with normal distribution, median (minimum-maximum) for variables with non-normal distribution, and number of cases (%) for nominal variables. The chi-square test and Fisher’s exact test were used to analyze the relationship between categorical variables. Continuous variables were compared between groups using the Student’s t-test if they were normally distributed and the Mann-Whitney U test if they were not. Results were considered statistically significant for p<0.05.
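To make the categorical comparison concrete, the sketch below computes a Pearson chi-square test for a 2x2 table in plain Python, using the sex distribution reported in the Results (45 women and 33 men in the PE group; 32 women and 46 men in the control group). It is not the SPSS procedure used in the study; it assumes no continuity correction, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: PE group, control group. Columns: female, male.
chi2, p = chi_square_2x2(45, 33, 32, 46)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under these assumptions the statistic is about 4.33 with p of about 0.037, which agrees with the p-value reported for sex in the comparison of the PE and control groups.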
AI Modeling
The multilayer perceptron (MLP) artificial neural network model was built with the variables that were statistically different between the PE and control groups. Gradient descent was used as the optimization function for the model. Seventy percent of the data were used to train the model and 30% to test it.
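The modeling steps above (a 70/30 split followed by an MLP trained with gradient descent) can be sketched in plain Python as follows. This is not the implementation used in the study: the architecture (one hidden layer of four sigmoid units), learning rate, number of epochs, and the two synthetic input features are assumptions made only for illustration, and no patient data are involved.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and test parts."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


class TinyMLP:
    """One hidden layer, sigmoid activations, full-batch gradient descent."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
                  for ws, b in zip(self.w1, self.b1)]
        output = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, output

    def fit(self, data, epochs=500, lr=0.5):
        n = len(data)
        for _ in range(epochs):
            g_w1 = [[0.0] * len(ws) for ws in self.w1]
            g_b1 = [0.0] * len(self.b1)
            g_w2 = [0.0] * len(self.w2)
            g_b2 = 0.0
            for x, y in data:
                hidden, out = self.forward(x)
                d_out = out - y  # cross-entropy gradient at the output unit
                g_b2 += d_out
                for j, h in enumerate(hidden):
                    g_w2[j] += d_out * h
                    d_hid = d_out * self.w2[j] * h * (1.0 - h)
                    g_b1[j] += d_hid
                    for i, xi in enumerate(x):
                        g_w1[j][i] += d_hid * xi
            # Gradient descent step with the mean gradient over the batch.
            self.b2 -= lr * g_b2 / n
            for j in range(len(self.w2)):
                self.w2[j] -= lr * g_w2[j] / n
                self.b1[j] -= lr * g_b1[j] / n
                for i in range(len(self.w1[j])):
                    self.w1[j][i] -= lr * g_w1[j][i] / n

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


# Synthetic, standardized stand-ins for two predictors; 78 rows per class.
rng = random.Random(42)
rows = [([rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)], 1) for _ in range(78)]
rows += [([rng.gauss(-1.0, 1.0), rng.gauss(-1.0, 1.0)], 0) for _ in range(78)]

train, test = train_test_split(rows)
model = TinyMLP(n_inputs=2)
model.fit(train)
accuracy = sum(model.predict(x) == y for x, y in test) / len(test)
print(len(train), len(test), f"test accuracy = {accuracy:.2f}")
```

With 156 rows, the split yields 109 training and 47 test cases. Because the held-out set is small, a performance estimate from a single split of this size has wide uncertainty, which is one reason validation in a larger cohort is desirable.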
Results
Biostatistical Analysis
The mean age of the patients included in the study was 69.46±15 years. Of the patients, 79 were male (50.6%) and 77 were female (49.4%). When the presenting complaints of the patients were analyzed, 88 patients (56.4%) presented with dyspnea, 11 patients (7.1%) with palpitations, 8 patients (5.1%) with chest pain, 4 patients (2.6%) with syncope, 3 patients (1.9%) with hemoptysis, and 40 patients (25.6%) with other reasons. The general data of the patients included in the study are shown in Table 1.
The mean age of the patients with PE was 68.48±13.4 years, while the mean age of our control group was 70.44±13.4 years. In the PE group, 45 patients (57.7%) were female and 33 patients (42.3%) were male. In the control group, 32 patients (41%) were female and 46 patients (59%) were male. The comparison of risk factors for PE and control groups is given in Table 2.
According to Table 2, when PE patients were compared with the control group, PE patients were more likely to be female (p=0.037), and complaints such as palpitations (p=0.005) and shortness of breath (p=0.001) were more common. In terms of vital signs, PE patients had lower systolic and diastolic blood pressure (p<0.001) and higher heart rate (p<0.001). Comorbidities such as diabetes (p<0.001), coronary artery disease (p=0.004), and DVT (p<0.001) were more common in PE patients. These data suggest that PE patients differ from the control group in terms of certain demographic and clinical characteristics. The examination of cardiac markers in our study between PE patients and the control group is given in Table 3.
According to Table 3, normal sinus rhythm on ECG was found in 65.4% of PE patients and 74.4% of the control group; this difference was not statistically significant (p=0.222). Syncope or tachycardia was present in 17.9% of the PE group and 6.4% of the control group, and this difference was significant (p=0.028). There were no significant differences between the groups in terms of atrial fibrillation, block, and other ECG findings.
Mean EF did not differ significantly between the PE and control groups (p=0.069). Mean PAP was also similar between the groups (p=0.545). RVV was 29.5% in the PE group and 18.4% in the control group, and this difference was not statistically significant (p=0.108). These data show that the ECG and echocardiography findings of PE patients differed somewhat from those of the control group, but most of these differences were not statistically significant. The results of the statistical analysis of the clinical scores used in PE prediction are given in Table 4.
The results of hemogram, coagulation, and blood gas parameters of the patients in PE and control groups are given in Table 5.
According to Table 5, the coagulation parameter D-dimer and the blood gas parameter PaCO2 were statistically different between the groups. D-dimer was found to be high in the PE group, while PaCO2 was found to be low. The descriptive statistics of the biochemical parameters between the groups included in the study are shown in Table 6.
According to Table 6, statistically significant differences were observed in some parameters in the biochemistry analysis between PE patients and the control group. BUN levels were significantly lower in PE patients (26.72±18.63) compared to the control group (32.40±22.70, p=0.020). CK level was higher in the PE group (194.77±650.76) compared to the control group (120.02±151.06, p=0.038). Total protein level was lower in PE patients (6.23±1.11) than in the control group (6.60±0.83, p=0.023). Similarly, albumin level was lower in PE patients (3.39±0.64) compared to the control group (3.63±0.54, p=0.013). CRP level was higher in the PE group (7.51±6.94) than in the control group (6.02±7.92, p=0.015). In addition, triglyceride levels were significantly higher in PE patients (136.74±91.60) compared to the control group (112.69±59.70, p=0.020). These findings reveal that there are significant differences in the biochemical profiles of PE patients compared to the control group. The comparison of coagulation factors in PE and the control group is given in Table 7.
In addition, the effect of PE on 3-month mortality was examined: 34 patients (43.6%) in the PE group and 19 patients (24.4%) in the control group died within 3 months. Three-month mortality was significantly higher in the PE group than in the control group (p<0.05).
AI Modeling
To classify PE, an MLP artificial neural network model was created using the variables that were statistically different between the PE and control groups as the independent variables. Gradient descent was used as the optimization function for the model. The performance metrics of the classification model are given in Table 8.
Considering the performance metrics in Table 8, the MLP classification model was successful in distinguishing the PE and control groups. Figure 1 shows the variable importance values obtained from the MLP model, with the most important variables in classifying PE displayed in order. According to Figure 1, Wells score and D-dimer were the two most important variables in predicting PE.
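Performance metrics of this kind are derived from the confusion matrix of the test set. The following minimal sketch shows the calculation; the counts are hypothetical values chosen only for illustration and are not the study’s results, and the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # PE cases correctly identified
        "specificity": tn / (tn + fp),  # controls correctly identified
    }


# Hypothetical test-set counts (47 cases), not taken from Table 8.
metrics = classification_metrics(tp=22, fp=3, tn=20, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity alongside accuracy and specificity matters for a condition such as PE, where a missed diagnosis (false negative) is more harmful than an unnecessary scan.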
Discussion
The mean age of the PE group was 68.48±16.4 years; the female rate was 57.7%, and the most common presenting complaint was dyspnea. Palpitation was a significant symptom in the PE group. Diabetes mellitus and DVT were significantly higher in the PE group, whereas hypertension and coronary artery disease were significantly higher in the control group. Among the vital signs, tachycardia and low mean systolic and diastolic blood pressure were significant in PE. We found both Wells and Geneva scores to be significant in the diagnosis of PE. When we compared the scores, the specificity and sensitivity of the Wells score were higher than those of the Geneva score. An elevated shock index was significant for the diagnosis of PE. Low PaCO2 in blood gas was a significant finding in patients with PE. Elevated D-dimer, CRP, and triglycerides, and low BUN, total protein, and albumin were significant in the diagnosis of PE. In addition, fibrinopeptide A, factor V Leiden, protein S, vitamin K, and plasminogen from the thrombophilia panel were not significantly associated with PE.
In previous studies, the mean age of PE patients was 50.5±18.4 years in Wells et al. (9), 53±18 years in van der Hulle et al. (11), 60.6±19.4 years in Le Gal et al. (15), 52±18.5 years in Roy et al. (22), and 63.9 years in Penaloza et al. (23). In the present study, the mean age of the PE group was 68.48±16.4 years, and the mean age of the control group was 70.44±13.4 years. No significant difference was found between them. The female rate was 62.7% in Wells et al. (9), 62% in van der Hulle et al. (11), 58.2% in Le Gal et al. (15), 60.8% in Roy et al. (22), and 62% in Penaloza et al. (23). In this study, consistent with the literature, the percentage of females in the PE group was found to be 57.7%, while the percentage of males was 42.3%. Many studies have demonstrated the relationship between tachycardia and PE (24-26). The significantly higher rate of tachycardia in the PE group is compatible with the literature.
PE is most often a complication of DVT. According to the literature, the rate of DVT in patients diagnosed with PE varies between 21% and 37%. Even in patients with suspected PE, if lower extremity Doppler ultrasonography is positive, anticoagulant treatment can be started without the need for further examination (27). In the current study, the detection rate of DVT in the PE group was found to be significantly higher than in the control group. Thus, the relationship between PE and DVT is similar to that described in the literature.
In studies comparing the diagnostic accuracy of the most commonly used clinical probability scores (Wells and modified Geneva), the diagnostic accuracy of the Wells score was found to be higher than that of the modified Geneva and simplified Geneva scores. In the studies of Shen et al. (17) and Wong et al. (28), the specificity and sensitivity of the Wells score were found to be significantly higher than those of the modified Geneva score. In the current study, high Wells and Geneva scores were significant in the diagnosis of PE. In this respect, the study aligns with the previous literature.
Thrombophilia is an inherited risk factor for VTE. Factor V Leiden mutation and protein C deficiency are two common causes. Depending on the characteristics of the population selected in studies, the thrombophilia detection rate is between 10% and 50% (29). In this study, we measured factor V Leiden, fibrinopeptide A, protein S, vitamin K, and plasminogen levels as the thrombophilia panel. No significant difference was found in these parameters between the PE group and the control group. We think that this lack of significance, unlike in the literature, is due to the high mean age of the patient population in the study. Additionally, it would have been appropriate to perform the thrombophilia examination 3-6 weeks after the diagnosis of PE, but this was not possible.
D-dimer is the fibrin breakdown product resulting from the degradation of thrombus formed during thrombotic events (30). In acute PE, the D-dimer level increases. Studies have shown that D-dimer has high sensitivity but low specificity in VTE. It has a high negative predictive value, as it excludes VTE events with >95% sensitivity in ambulatory patients and in patients with low or medium COTS, unless the latter have any comorbidities. Because its sensitivity and negative predictive value are high while its specificity and positive predictive value are low, it is more meaningful in excluding PE than in making a diagnosis (31). In a meta-analysis, the D-dimer test was found to be high in 94% and normal in 6% of patients with a preliminary diagnosis of PE. In our study, consistent with the literature, the positive predictive value of D-dimer in the diagnosis of PE was found to be significantly high.
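The reason a highly sensitive but poorly specific test is better at excluding than confirming disease can be shown with a short calculation of predictive values. In the sketch below, the sensitivity of 0.95 follows the figure quoted above, whereas the specificity of 0.40 and the prevalence of 0.20 are assumed values chosen only for illustration; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Specificity and prevalence are assumptions, not values from the study.
ppv, npv = predictive_values(sensitivity=0.95, specificity=0.40, prevalence=0.20)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these assumed inputs the negative predictive value is about 0.97 while the positive predictive value is about 0.28, so a normal result argues strongly against PE, but an elevated result alone does not establish it. Both values shift with the prevalence of PE in the tested population.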
There are publications on the use of BUN levels for predicting mortality in PE patients. In a study conducted in our country, a BUN value of 34.5 mg/dL at the time of diagnosis was significantly associated with mortality, with 85% sensitivity and 91% specificity, in patients diagnosed with acute PE and treated aggressively with t-PA (32). In another study, the ratio of BUN to serum albumin (B/A) was investigated to predict the mortality of patients hospitalized in the intensive care unit with a diagnosis of PE. This study showed that as the B/A ratio increases, the intensive care mortality of PE patients also increases (33). In the current study, the BUN value at the time of admission was found to be significantly higher in the control group. Since the current study addresses diagnosis rather than mortality prediction, we found no comparable studies in the literature. More studies are needed on this subject.
Aujesky et al. (34) investigated the benefits of using CRP in combination with COTS in diagnosing PE and concluded that while a CRP value >5 mg/dL was significant in excluding PE when combined with low COTS, CRP alone could not exclude PE. Roumen-Klappe et al. (35) also reported that CRP increased in PE. A study comparing D-dimer and CRP levels in the diagnosis and exclusion of PE found that a standard CRP test using a cut-off level of 5 mg/dL can be used alone or in combination with COTS to safely exclude PE (36). In the current study, the CRP value was significantly higher in the PE group compared to the control group. Considering similar studies in the literature, we think that elevated CRP, combined with COTS at medium to high risk, may be meaningful. However, we believe that studies with a larger number of patients are needed on this subject.
Among the studies investigating the use of AI in the diagnosis of PE, Müller-Peltzer et al. (37) found that common false positives in diagnosing PE with AI originated from soft tissue and pulmonary veins. Li et al. (38) and Douillet et al. (39) stated that AI with ML algorithms will be a future tool to guide the physician regarding suspected acute PE. With the model developed in the current study, PE was classified with very high success and possible risk factors were identified. According to the variable importance values obtained from the model, Wells score and D-dimer were the most important risk factors. The current study has shown, in line with the literature, that AI can be used in PE prediction. To better evaluate the results of our study and the usability of AI in the clinic, we think that further studies with a larger number of patients are needed.
Study Limitations
The first limitation of our study is that the accuracy of the presented AI model was not tested prospectively. The second limitation is that the patients diagnosed with PE or alternative diagnoses were not examined in terms of survival, so the contribution of the AI model to survival could not be examined. Another limitation is that it is not known whether the clinicians who collected the qualitative data of the study had been trained extensively about PE. A further limitation is that the study was conducted in a single center and with a limited number of patients. Therefore, it is recommended that the study be repeated as a multicenter study with a larger number of patients, conducted by clinicians who have standard knowledge about PE and its exclusion, that the AI model be studied prospectively in diagnosis, and that the diagnosed patients be followed up for survival.
Conclusion
In conclusion, this study confirmed the importance of carefully evaluating patients’ anamnesis, physical examination, laboratory and imaging findings, and of using clinical scores in the diagnosis of PE. Furthermore, it has been determined that AI can be used in the diagnosis of PE before imaging methods are requested.