Automated Computer Evaluation of Acute Ischemic Stroke and Large Vessel Occlusion
Large vessel occlusion (LVO) plays an important role in the diagnosis of acute ischemic stroke. Identifying LVO of patients in the early stage on admission would significantly lower the probabilities of suffering from severe effects due to stroke or …
Authors: Jia You, Philip L.H. Yu, Anderson C.O. Tsang
Automated Computer Evaluation of Acute Ischemic Stroke and Large Vessel Occlusion *

Jia You 1, Philip L.H. Yu 1, Anderson C.O. Tsang 2, Eva L.H. Tsui 3, Pauline P.S. Woo 3, and Gilberto K.K. Leung 2

1 Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong
2 Division of Neurosurgery, Department of Surgery, The University of Hong Kong, Hong Kong
3 Department of Statistics and Workforce Planning, Hospital Authority, Hong Kong
{plhyu}@hku.hk

Abstract. Large vessel occlusion (LVO) plays an important role in the diagnosis of acute ischemic stroke. Identifying LVO in the early stage of admission would significantly lower patients' probability of suffering severe effects of stroke, or even save their lives. In this paper, we utilized both structural and imaging data from all recorded acute ischemic stroke patients in Hong Kong. A total of 300 patients (200 for training and 100 for testing) are used in this study. We established three hierarchical models based on patients' demographic data, clinical data and features obtained from computerized tomography (CT) scans. The first two levels of modeling are based solely on demographic and clinical data, while the third model additionally uses CT imaging features obtained from a deep learning model. The optimal cutoff is determined at the maximal Youden index based on 10-fold cross-validation. With both clinical and imaging features, the Level-3 model achieved the best performance on testing data. The sensitivity, specificity, Youden index, accuracy and area under the curve (AUC) are 0.930, 0.684, 0.614, 0.790 and 0.850, respectively.
Keywords: Large vessel occlusion · ischemic stroke · brain CT · machine learning · deep learning

1 Introduction

Acute ischemic stroke (AIS) has become a leading cause of morbidity and mortality worldwide, and recent advances in endovascular thrombectomy (EVT) for the treatment of AIS caused by large vessel occlusion (LVO) have been widely accepted around the world [1]. As with intravenous thrombolysis, rapid access to EVT remains paramount: the effectiveness of EVT is far better ensured when treatment is initiated within 6 hours of symptom onset than when the diagnosis is made late [1]. Prehospital care focuses on rapid identification of life-threatening emergencies and primary transport to a hospital ideally suited to care for that patient, thereby avoiding the lengthy time delays of interfacility transfers [2].

* Thanks to the Hong Kong Hospital Authority for data preparation.

Recent decades have witnessed the development of prehospital LVO prediction scales that differentiate LVO from milder strokes and thus allow paramedics to make a rapid diagnosis in the prehospital setting. Popular scales include the 3-item Stroke Scale [3], the Los Angeles Motor Scale [4], the Rapid Arterial Occlusion Evaluation Scale [5], the Cincinnati Prehospital Stroke Severity Scale [6], the Field Assessment Stroke Triage for Emergency Destination [7] and the Prehospital Acute Stroke Severity scale [8]. Some of the above scales specifically aim to identify stroke patients with LVO rather than all AIS patients. These scales are derived from items of the National Institutes of Health Stroke Scale (NIHSS), a criterion standard for stroke. Their scoring algorithms are mainly based on the hypothesis of a linear correlation between patients' clinical features and the presence of stroke [3].
The drawback of these scales is that they ignore some potential stroke-related features, such as age and clinical history. In contrast to the previous standard scales, this study combines multiple demographic data, structural clinical data and CT imaging data to construct a model for LVO prediction. The model uses the eXtreme Gradient Boosting (XGBoost) classifier, an efficient and scalable implementation of the gradient boosting framework of J. Friedman [9], [10]. XGBoost is now a widely used and popular machine learning technique in the data science community. It is an ensemble technique that builds the model in a stage-wise manner: new models are added to correct the errors made by the previously trained models. In the past several years, deep learning has been widely applied to computer vision tasks, demonstrating state-of-the-art performance. There are a variety of applications to different medical imaging problems as well. In this study, we also adopted a deep learning model as a feature extractor for brain CT scans.

2 Methods

2.1 Study Population & Data Preparation

The Hong Kong Hospital Authority's Clinical Management System (CMS) has well-established records of all patients admitted to the public hospitals for all types of acute ischemic stroke. Patients selected in this study had to satisfy all of the following inclusion criteria: (a) aged 18 years or above; (b) a principal diagnosis of cerebral embolism with mention of cerebral infarction (ICD9cm = 434.11) or cerebral artery occlusion, unspecified, with mention of cerebral infarction (ICD9cm = 434.91); (c) emergency admission; and (d) a computed tomography (CT) brain scan performed within 24 hours of AED admission. For other detailed criteria and sampling properties, please refer to [11].
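The four inclusion criteria above amount to a conjunction of simple checks on each admission record. The following is a minimal sketch of that filter, not the registry's actual query: the `Admission` record and its field names are hypothetical, and treating "within 24 hours" as a non-negative delay of at most 24 hours is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# ICD-9-CM principal diagnosis codes named in criterion (b).
ELIGIBLE_ICD9CM = {"434.11", "434.91"}


@dataclass
class Admission:
    """Hypothetical record layout; field names are illustrative, not the CMS schema."""
    age_years: int
    principal_icd9cm: str
    is_emergency: bool
    hours_to_ct: Optional[float]  # hours from AED admission to brain CT; None if no CT


def meets_inclusion_criteria(adm: Admission) -> bool:
    """Return True only when criteria (a)-(d) all hold."""
    return (
        adm.age_years >= 18                            # (a) adult patient
        and adm.principal_icd9cm in ELIGIBLE_ICD9CM    # (b) principal diagnosis
        and adm.is_emergency                           # (c) emergency admission
        and adm.hours_to_ct is not None                # (d) a CT scan exists ...
        and 0 <= adm.hours_to_ct <= 24                 # ... within 24 h (assumed non-negative)
    )
```

Keeping the diagnosis codes in a set makes criterion (b) easy to audit against the text, and each remaining criterion maps to exactly one condition.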
A total of 300 subjects were selected and then randomly split into 200 for model training and 100 for model testing. The data in this study contain basic demographic data, structural clinical data, discharge notes and the corresponding CT scans. The demographic data contain basic information about the patients, such as gender and age. The structural clinical data include patients' pre-existing clinical records and other clinical symptoms and signs at A&E admission, such as the existence of diabetes mellitus, hypertension and smoking history. Discharge notes are reports written by doctors at the time of discharge from the A&E department or hospital. All patients have their corresponding brain CT scans as well.

The diagnosis of anterior circulation LVO was independently verified by 2 cerebrovascular disease specialists based on available admission notes, neuroimaging including CT, and discharge records. Any discrepancies were resolved by consensus. Among the 300 patients, 130 suffered from LVO and the remaining 170 did not. The hyperdense middle cerebral artery (MCA) sign is a direct visualization of thromboembolic material within the lumen, and it is a critical factor when making a diagnosis of LVO. The hyperdense MCA dot sign was also verified by 2 cerebrovascular disease specialists. The segmentation label was then manually drawn using the FSL software [12]. The CT images have similar quality, spatial resolution and field of view. The in-plane resolution is 0.426*0.426 mm. The slice thickness is 5.0 mm for most cases. Each axial slice has an identical size of 512*512 pixels. Hyperdense MCA dot signs were observed in 74 cases.

2.2 Hierarchical Modeling

The present study aimed to develop machine learning models for LVO prediction based on a data hierarchy of three different levels.
The Level-1 model utilizes only basic features that can easily be obtained by any individual. These features include age, gender, and the existence of speech deficits, facial weakness, left-facial weakness, right-facial weakness, limb weakness, left-side weakness and right-side weakness.

In addition to the features used in Level-1, Level-2 includes all structural clinical data, such as the existence of diabetes mellitus and hypertension, whether the patient is or was a smoker, current smoker (or quit <= 2 years), previous smoker (quit > 2 years), diastolic and systolic blood pressure, Glasgow coma scale (GCS), and the corresponding eye, verbal and motor sub-scales. Previous diagnoses of atrial fibrillation, atherosclerosis and valvular heart disease were included as well. Many of these features contain missing values.

Together with the features used in the Level-1 and Level-2 models, the Level-3 model uses image features from patients' CT scans. Since the hyperdense MCA dot sign is an important factor in the diagnosis of LVO [13], we aimed to localize the MCA in the Level-3 model. A deep learning model works as a feature extractor to obtain additional useful features from CT scans. Each patient has 26 to 30 CT images, and we choose the slice that contains the largest segmented MCA dot sign, which is considered to contain the most relevant information for LVO. The selected image is fed into the trained architecture, and a total of 16384 features are extracted for each patient. All features were divided into two subgroups based on whether the patient suffered from LVO. Then, by applying a two-sample t-test to each feature, we could identify significant features that distinguish between the LVO and non-LVO groups. The top-10 image features with the smallest p-values were extracted and combined with the Level-1 and Level-2 features to build the Level-3 models for the final LVO prediction.
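The t-test screening step can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the paper does not say which variant of the two-sample t-test was used, so the sketch assumes the pooled-variance (Student) form, and the function names are hypothetical. Because every feature is tested on the same patients, the degrees of freedom are identical across features, so ranking by the largest |t| gives the same order as ranking by the smallest two-sided p-value.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance (assumed variant)."""
    n_a, n_b = len(a), len(b)
    pooled = ((n_a - 1) * statistics.variance(a)
              + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    diff = statistics.mean(a) - statistics.mean(b)
    if pooled == 0.0:
        # Both groups are constant: infinitely separated if the means differ.
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))


def top_k_features(features, is_lvo, k=10):
    """Return indices of the k features with the largest |t|.

    features: list of per-patient feature vectors of equal length.
    is_lvo:   list of booleans, one per patient (True = LVO).
    """
    n_features = len(features[0])
    scores = []
    for j in range(n_features):
        lvo = [row[j] for row, y in zip(features, is_lvo) if y]
        non_lvo = [row[j] for row, y in zip(features, is_lvo) if not y]
        scores.append(abs(pooled_t_statistic(lvo, non_lvo)))
    # Largest |t| first, which corresponds to the smallest p-value.
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

In the setting described above, `features` would hold the 16384 extracted values for each training patient and `k` would be 10.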
We chose the extreme gradient boosting model as our classifier, which is superior to off-the-shelf classifiers such as random forest and support vector machine (SVM) [14]. Another important factor in the choice of the XGBoost classifier is the handling of missing data. Compared with other traditional machine learning methods, XGBoost is capable of handling missing values automatically. Thus, data imputation is not required, and the model can give a robust prediction when new observations contain incomplete features.

2.3 Deep Learning

We applied deep learning in the Level-3 model since hyperdense MCA dot signs contribute substantially to patients' LVO diagnosis. The proposed architecture belongs to the category of fully convolutional networks (FCN) [15], which extend the convolution process across the entire image and predict the segmentation mask as a whole. The architecture works as a feature extractor: the high-level features at the end of the encoding part are extracted, as shown in Fig. 1. The encoding part resembles a traditional convolutional neural network (CNN) that extracts a hierarchy of image features from low to high complexity. The decoding part then transforms the features and reconstructs the segmentation label map from coarse to fine resolution. The model contains skip connections, similar to the U-Net architecture [16].

Fig. 1. U-Net Architecture

The CT scans were preprocessed using a fully automatic preprocessing pipeline built on FSL and the NiBabel library. As shown in the preprocessing flow chart (Fig. 2), the first step is brain extraction to strip the skull. In the second step, all CT scans are rotated and translated through a rigid-body 2D registration procedure to make sure that all brains within the images are horizontally symmetric. All the MCA dot signs have H.U.
index between 30 and 70; thus in step 3, a threshold of 20 to 80 is applied in order to eliminate irrelevant image information. Since the hyperdense MCA dot signs largely course in a plane perpendicular to the transverse plane of imaging, the recognition of the MCA dot signs can be localized within a specified area of the scans. To better specify the region where the MCA dot sign lies, we localize a bounding box in step 4. The colored bounding box has a size of 128*128, with two colors indicating the left and right hemispheres. In addition, given clinical information on the side of limb weakness, we can then extract the infarcted hemisphere, as shown in the last step.

Fig. 2. Preprocessing Flow Chart

2.4 Performance Evaluation

To evaluate the performance of these models, the 10-fold cross-validation method was adopted. The prediction performance was evaluated by sensitivity, specificity, Youden index [17] and area under the curve (AUC) of the receiver operating characteristic (ROC) for LVO classification at all three levels. Aiming to maximize both the sensitivity and the specificity of the fitted predictive model, the cutoff was chosen using the Youden index, $\gamma$, which is derived from sensitivity and specificity and has a linear correspondence with balanced accuracy, given as:

$$\gamma = \text{sensitivity} + \text{specificity} - 1 \tag{1}$$

The Youden index has been commonly used to evaluate predictive model performance and has shown good performance in model assessment, especially for imbalanced datasets [18]. The best cutoff was the one giving the largest Youden index based on 10-fold cross-validation.

3 Results

Apart from the imaging features in Level-3, the first two models involved a total of 24 features. To give an intuitive view of all variables, Student's t-test and Pearson's Chi-square test were performed for continuous and discrete variables, respectively.
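As a concrete illustration of the cutoff selection described in Section 2.4, the sketch below scans candidate cutoffs and keeps the one with the largest Youden index. It rests on assumptions the paper does not spell out: `scores` stands for predicted LVO probabilities pooled from the held-out folds, a score at or above the cutoff is classified as LVO, both classes are present, and all function names are hypothetical.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as LVO and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity, specificity):
    """Equation (1): sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Try each observed score as a cutoff; return (cutoff, Youden index) of the best."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = youden_index(sens, spec)
        # Strict comparison keeps the lowest cutoff among ties.
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best
```

Applied to the reported Level-3 test metrics, `youden_index(0.930, 0.684)` evaluates to approximately 0.614, which agrees with the value quoted in the abstract.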
Based on these tests, a wide range of factors were found to be associated with an increased risk of LVO. These include demographic factors (older age and female gender), clinical symptoms and signs at A&E admission (current or previous smoking, existence of left or right facial weakness and limb weakness), clinical structural data and test scores (systolic blood pressure, GCS, and the corresponding verbal, motor and eye scores) and historical clinical records (previously diagnosed cardioembolism, atrial fibrillation and atherosclerosis). Compared with non-LVO stroke patients, LVO patients are more likely to be female than male (67.4% vs 31.6%, p=0.003) and older (mean of 80.5 years vs 71.4 years, p < 0.001). The Glasgow coma scale is also an important indicator for LVO: a lower score indicates a higher chance of anterior circulation LVO (mean of 10.7 vs 13.68, p < 0.001); in addition, its corresponding eye, verbal and motor scores all play critical roles, with small p-values. The prevalence rates are higher in LVO patients than in non-LVO patients for limb weakness (99.2% vs 74.1%, p < 0.001), facial weakness (31.5% vs 25.9%, p < 0.001) and previous atrial fibrillation (36.9% vs 18.8%, p < 0.001). There is no difference in the prevalence of speech deficits, diabetes mellitus, hypertension, prior stroke and ischemic heart disease between the LVO and non-LVO groups. Owing to space limitations, please refer to [11] for detailed tables and summary statistics.

Fig. 3. ROC Curve

The ROC curves on the testing data are shown in Fig. 3 and the detailed measurements are in Table 1. Based on the results, the Level-3 model achieves the best performance, with all demographic, clinical and imaging features involved in training. Consistent with the ROC curve in Fig.
3, Table 1 shows a great improvement from the Level-1 model to the Level-2 model on all the evaluation measurements. This indicates that the clinical symptoms and test scores at A&E made a great contribution to LVO prediction. With the extra imaging features, the Level-3 model does not improve significantly on the Level-2 model. This is mainly because the extra features in the Level-3 model do not contribute enough. According to the Level-2 results, the sensitivity is 93.0% while the specificity is 64.9%, indicating that most of the wrongly predicted subjects are false positive cases. We aimed to improve the specificity by adding extra information on MCA dot signs; however, on checking, none of those false positive subjects had been diagnosed as having MCA dot signs. Hence, the features extracted by the deep learning model cannot contribute much to correcting those wrongly predicted cases.

Table 1. Testing Performance of all Three-Level Models

Hierarchical Modelling   Sensitivity (%)   Specificity (%)   Youden Index (%)   Accuracy (%)   AUC
Level-1                  69.8              57.9              27.7               63.0           0.647
Level-2                  93.0              64.9              57.9               77.0           0.809
Level-3                  93.0              68.4              61.4               79.0           0.850

Among the 100 testing subjects, 24 patients were diagnosed with MCA dot signs, and the deep learning model correctly predicted 22 cases, with only 2 subjects missed. Though the sensitivity is fairly good, the specificity is not quite satisfactory, which is largely due to some false positive segmented dots. These inaccurate segmentations are largely due to the proximity of bone and the similarity to normal age-related vascular calcification.

4 Discussion

It is a common problem that patients' demographic and structural clinical data contain missing values. We initially applied the K-nearest neighbors method to impute the missing values, and implemented other machine learning techniques.
However, missing value imputation does not make sense in real-world application and prediction: the model would need to retain all the training data in case a new patient has any incomplete attributes. In this situation, only decision tree and XGBoost methods can handle missing values; taking the model's complexity into consideration as well, we finally chose XGBoost as the classifier.

It is known that obtaining an adequate amount of medical data is not easy, especially given the demand for well-diagnosed labels provided by expert clinicians. Deep learning is a data-hungry approach that requires huge quantities of images to keep updating the weights. In this study, among the total of 300 subjects only 74 have MCA dot signs, and among these cases, merely 89 scan slices contain ground truth. Further study may involve more data to enhance our models' robustness.

5 Conclusion

In this paper, we present three hierarchical machine learning models that are capable of fully automatically predicting large vessel occlusion for suspected acute ischemic stroke patients. Combining all features from patients' demographic and clinical information and brain CT imaging, our final model gives a promising result compared with previous methods. Moreover, the data involved in this study are representative of territory-wide emergency and in-patient healthcare services for a population of 7.3 million. Thus, the model can work as an effective tool to assist doctors in making a quick and accurate diagnosis at the early stage of patients' admission.

References

1. Powers, W.J., Derdeyn, C.P., Biller, J., Coffey, C.S., Hoh, B.L., Jauch, E.C., et al.: 2015 American Heart Association/American Stroke Association focused update of the 2013 guidelines for the early management of patients with acute ischemic stroke regarding endovascular treatment. Stroke 46(10), 3020-3035 (2015)
2. Prabhakaran, S., Ward, E., John, S., Lopes, D.K., Chen, M., Temes, R.E., et al.: Transfer delay is a major factor limiting the use of intra-arterial treatment in acute ischemic stroke. Stroke 42(6), 1626-1630 (2011)
3. Singer, O.C., Dvorak, F., du Mesnil de Rochemont, R., Lanfermann, H., Sitzer, M., Neumann-Haefelin, T.: A simple 3-item stroke scale: comparison with the National Institutes of Health Stroke Scale and prediction of middle cerebral artery occlusion. Stroke 36(4), 773-776 (2005)
4. Nazliel, B., Starkman, S., Liebeskind, D.S., Ovbiagele, B., Kim, D., Sanossian, N., et al.: A brief prehospital stroke severity scale identifies ischemic stroke patients harboring persisting large arterial occlusions. Stroke 39(8), 2264-2267 (2008)
5. Pérez de la Ossa, N., Carrera, D., Gorchs, M., Querol, M., Millán, M., Gomis, M., et al.: Design and validation of a prehospital stroke scale to predict large arterial occlusion: the rapid arterial occlusion evaluation scale. Stroke 45(1), 87-91 (2014)
6. Katz, B.S., McMullan, J.T., Sucharew, H., Adeoye, O., Broderick, J.P.: Design and validation of a prehospital scale to predict stroke severity: Cincinnati Prehospital Stroke Severity Scale. Stroke STROKEAHA-115 (2015)
7. Lima, F.O., Silva, G.S., Furie, K.L., Frankel, M.R., Lev, M.H., Camargo, É.C., et al.: Field assessment stroke triage for emergency destination: a simple and accurate prehospital scale to detect large vessel occlusion strokes. Stroke 47(8), 1997-2002 (2016)
8. Hastrup, S., Damgaard, D., Johnsen, S.P., Andersen, G.: Prehospital acute stroke severity scale to predict large artery occlusion: design and comparison with other scales. Stroke STROKEAHA-115 (2016)
9. Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. The Annals of Statistics 28(2), 337-407 (2000)
10. Friedman, J.H.: Stochastic gradient boosting. Computational Statistics & Data Analysis 38(4), 367-378 (2002)
11. Tsang, A.C.O., You, J., Li, L.F., Tsang, F.C.P., Woo, P.P.S., Tsui, E.L.H., et al.: Burden of large vessel occlusion stroke and the service gap of thrombectomy: a population-based study using a territory-wide public hospital system registry. To appear in International Journal of Stroke (2018)
12. Jenkinson, M., Beckmann, C.F., Behrens, T.E., Woolrich, M.W., Smith, S.M.: FSL. NeuroImage 62(2), 782-790 (2012)
13. Lim, J., Magarik, J.A., Froehler, M.T.: The CT-Defined Hyperdense Arterial Sign as a Marker for Acute Intracerebral Large Vessel Occlusion. Journal of Neuroimaging 28(2), 212-216 (2018)
14. Chen, T., Guestrin, C.: XGBoost: A scalable tree boosting system. In: Conference on Knowledge Discovery and Data Mining, pp. 785-794 (2016)
15. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: IEEE Conference on CVPR, pp. 3431-3440 (2015)
16. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI, pp. 234-241 (2015)
17. Youden, W.J.: Index for rating diagnostic tests. Cancer 3(1), 32-35 (1950)
18. Bekkar, M., Djemaa, H.K., Alitouche, T.A.: Evaluation measures for models assessment over imbalanced datasets. Journal of Information Engineering and Applications 3(10) (2013)