Chagas Disease Vectors Identification using Data Mining and Deep Learning Techniques

University of Detroit Mercy Dissertation, Thesis, and Student Project Collections → Engineering & Science Thesis Collection

dc.contributor.author	Ghasemi, Zeinab
dc.date.accessioned	2021-04-23T18:17:50Z
dc.date.available	2021-04-23T18:17:50Z
dc.date.issued	2021-04-23
dc.identifier.uri	http://hdl.handle.net/10429/2146
dc.description	Department of Electrical and Computer Engineering and Computer Science Thesis	en_US
dc.description.abstract	Chagas Disease (CD) is a vector-borne infectious disease transmitted from animals to humans and vice versa. It is caused by the parasite Trypanosoma cruzi (abbreviated T. cruzi). It imposes an enormous social burden on public health and counts as one of the major threats to human health. According to WHO statistics from 2019, CD affects about 7 million people and is responsible for nearly 50,000 deaths per year worldwide. In addition, about 80 million people live in areas at risk of infection in different parts of the world. The disease has two phases, acute and chronic, and it can be diagnosed in either phase. Diagnosis involves analyzing clinical, epidemiological, and laboratory data. Since controlling and treating CD is easier in the early stages, detecting it in the acute phase plays an essential role in overcoming and controlling it. Many clinical trials are dedicated to this problem, but progress in computational research (automatic identification) has been limited. Therefore, this work presents four automated CD vector identification approaches that classify several different kissing bug vectors with acceptable accuracy rates. Classifying the different CD vectors is important because the carriers of CD belong to different species that are unevenly distributed across the world. Differentiating all species of CD vectors therefore plays an important role in designing a robust global system for automatic identification. Three of the proposed methods consist of preprocessing, feature extraction, feature selection, data balancing, and classification phases. The preprocessing steps are background removal, gray-scaling, and down-sizing. Principal component analysis (PCA) is used for feature extraction. Correlation-based subset selection is used for feature selection. The classes are balanced by oversampling the minority classes. 
Finally, the classification techniques employed are Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). These three methods are named "PCA+DT", "PCA+RF", and "PCA+SVM". In the fourth approach, we applied two deep convolutional neural networks (CNNs) to our preprocessed dataset and omitted the feature extraction and feature selection steps. The two convolutional neural networks, VGG16 and a 7-layer CNN, are trained on the same oversampled image dataset. The average accuracy using the 150-feature dataset for Brazilian vectors is 100% for PCA+DT and PCA+RF; 98.20% for PCA+SVM; 88.60% for VGG16; and 97.57% for the 7-layer CNN. The Brazilian vectors belong to 39 species of kissing bugs, with 1620 images in the dataset used. The average accuracy using the 150-feature dataset for Mexican vectors is 100% for PCA+DT and PCA+RF; 98.40% for PCA+SVM; 89.20% for VGG16; and 96.48% for the 7-layer CNN. The Mexican vectors belong to 12 species of kissing bugs, with 410 images in the dataset used. Our results are promising and outperform previously developed systems. Given that the dataset is small, the tree-based algorithms (DT and RF) perform better than SVM and the convolutional neural networks. As larger datasets of kissing bugs become available, the results of SVM and CNN are likely to improve.	en_US
dc.language.iso	en_US	en_US
dc.subject	Engineering, Computer, Chagas, Disease, Data Mining, Deep Learning	en_US
dc.title	Chagas Disease Vectors Identification using Data Mining and Deep Learning Techniques	en_US
dc.type	Image	en_US
dc.type	Map	en_US
dc.type	Thesis	en_US
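
The abstract names gray-scaling, down-sizing, and oversampling of minority classes as steps in the pipeline but does not say how they were implemented. The following is a minimal Python sketch of those three steps under stated assumptions, not the thesis's actual code. The function names (`to_grayscale`, `downsize`, `oversample_minority`) are hypothetical, the luminance weights are the common ITU-R BT.601 values, down-sizing is done by block averaging, and oversampling is done by random duplication; the record confirms none of these choices.

```python
import random
from collections import defaultdict


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values.

    Assumption: ITU-R BT.601 weights; the thesis record does not name a formula.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def downsize(gray_image, factor):
    """Shrink an image by averaging non-overlapping factor x factor blocks.

    Trailing rows or columns that do not fill a whole block are dropped.
    Assumption: block averaging stands in for whatever resizing was used.
    """
    height = len(gray_image) // factor
    width = len(gray_image[0]) // factor
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            block = [gray_image[i * factor + di][j * factor + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def oversample_minority(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest.

    Assumption: plain random oversampling; the abstract does not say which
    oversampling variant was applied.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

In a pipeline like the one described, oversampling would normally be applied to the training split only, so that duplicated images do not also appear in the test data. The record does not state how the reported accuracies were evaluated.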