Prediction of Bone Marrow Transplantation Survivability
Using Machine Learning Techniques
Major Project Phase II (CS692)
SUBMITTED TO
Indian Institute of Information Technology Bhagalpur
Submitted in Partial Fulfilment of the Requirements for the Award of the Degree of
Master of Technology
IN
COMPUTER SCIENCE AND ENGINEERING
(ARTIFICIAL INTELLIGENCE AND DATA SCIENCE)
by
Mr. Rishabh Hanselia (2102020)
Under the guidance of
Dr. Dilip Kumar Choubey
Assistant Professor
Department of Computer Science & Engineering
IIIT BHAGALPUR, BIHAR-813210, INDIA
July-June, 2022
INDIAN INSTITUTE OF INFORMATION TECHNOLOGY BHAGALPUR
Bhagalpur, Bihar 813210, INDIA
Department of Computer Science and Engineering
APPROVAL OF THE GUIDE
Recommended that the work reported in this Major Project Phase II (CS692) on the topic
“An Efficient Detection of Lung and Colon Cancer using Deep Learning Techniques”
prepared by Ms. DEEPALI SHARMA, Roll No. 2202004 under my supervision and
guidance be accepted as fulfilling this part of the requirements for the degree of Master of
Technology.
To the best of my knowledge, the contents of this thesis did not form a basis for the award of
any previous degree to anybody else.
Date: Dr. Dilip Kumar Choubey
Assistant Professor
Department of Computer Science and Engineering
Place: Indian Institute of Information Technology
Bhagalpur
Bhagalpur, Bihar
INDIAN INSTITUTE OF INFORMATION TECHNOLOGY BHAGALPUR
Bhagalpur, Bihar 813210, INDIA
Department of Computer Science and Engineering
DECLARATION
I hereby declare that the work reported in this Major Project Phase II (CS692) on the topic
“An Efficient Detection of Lung and Colon Cancer using Deep Learning Techniques” is
original and has been carried out by me independently in the Department of Computer
Science and Engineering, Indian Institute of Information Technology Bhagalpur, Bihar,
India under the supervision of Dr. Dilip Kumar Choubey, Assistant Professor, Department
of Computer Science and Engineering, Indian Institute of Information Technology
Bhagalpur, Bihar, India. I also declare that this work has not formed the basis for the award
of any other Degree, Diploma, or similar title of any university or institution.
Date: Rishabh Hanselia
2102020
Place:
INDIAN INSTITUTE OF INFORMATION TECHNOLOGY BHAGALPUR
Bhagalpur, Bihar 813210, INDIA
Department of Computer Science and Engineering
CERTIFICATE
This is to certify that the Major Project Phase II (CS692) entitled “An Efficient Detection
of Lung and Colon Cancer using Deep Learning Techniques”, presented by
Ms. Deepali Sharma, Roll No. 2202004,
M. Tech. student of IIIT Bhagalpur, was prepared under my supervision and guidance. This project has
been submitted in partial fulfilment for the award of the “Master of Technology in Computer
Science and Engineering (Artificial Intelligence and Data Science)” degree at Indian Institute
of Information Technology Bhagalpur, Bihar, India.
No part of this project has been submitted for the award of any previous degree to the best of
my knowledge.
Dr. Dilip Kumar Choubey
(Supervisor)
Assistant Professor
Computer Science and Engineering
INDIAN INSTITUTE OF INFORMATION TECHNOLOGY
Bhagalpur, Bihar-813210, INDIA
Department of Computer Science and Engineering
ACKNOWLEDGEMENT
During the preparation of this major project report, I received a lot of support,
encouragement, advice and assistance from many people, and I am deeply
grateful to them all.
It is with great pleasure that I express my cordial thanks and indebtedness to my
esteemed guide, Dr. Dilip Kumar Choubey, Assistant Professor, Department of
Computer Science and Engineering. His vast knowledge, expert supervision and
enthusiasm continuously challenged and motivated me to achieve my goal. I will be
eternally grateful to him for giving me the opportunity to work on this project.
I have great pleasure in expressing my sincere gratitude to Prof. Pradip Kumar
Jain, Director, Indian Institute of Information Technology Bhagalpur and all the
faculty members of the Department of Computer Science and Engineering, IIIT Bhagalpur.
I express my sincere gratitude to Dr. Pradeep Kumar Biswal, Assistant Professor and
Head of Department, Computer Science and Engineering, for valuable help and
suggestions and for providing me with all the relevant facilities that allowed the work to be completed on
time. The present work certainly would not have been possible without the help of my
friends and the blessings of my parents.
IIIT Bhagalpur Deepali Sharma
June 2024 2202004
ABSTRACT
Sample Abstract
List of Figures
Figure No. Name Page No.
Figure 4.2.1.1 Accuracy 16
Figure 4.2.2.1 Precision 17
Figure 4.2.3.1 Recall 17
Figure 4.2.4.1 F1 Score 17
Figure 4.3.1.1 Comparison of Accuracy 19
Figure 4.3.1.2 Comparison of Precision, Recall & F1 Score 19
List of Tables
Table No. Table Name Page No.
Table 2.2.1 Comparison of existing research work 6
Table 4.3.1.1 Experimental results 18
List of Acronyms
Acronyms Definition
ABi-LSTM Adaptive Bi-Long Short-Term Memory
MBi-LSTM Modified Bi-Long Short-Term Memory
Bi-LSTM Bi-Long Short-Term Memory
IDN Inverse Difference Normalized
ARO Adaptive Rain Optimization
RFE Recursive Feature Elimination
OANN Optimized Artificial Neural Network
OSVM Optimized Support Vector Machine
ANN Artificial Neural Network
ARFA Adaptive Red Fox Algorithm
AROA Adaptive Rain Optimization Algorithm
SVM-RFE Support Vector Machine-Recursive Feature Elimination
ASFO Adaptive Sunflower Optimization algorithm
RGB Red, Green, Blue
GLCM Gray-Level Co-occurrence Matrix
GA Genetic Algorithm
PSO Particle Swarm Optimization
CS Cuckoo Search
ASFOA Adaptive Sunflower Optimization Algorithm
SFO Sunflower Optimization
LF Levy Flight
KNN K-Nearest Neighbour
CNN Convolutional Neural Networks
SVM Support Vector Machine
CNN-BS Convolutional Neural Networks-Binary Search
ROC Receiver Operating Characteristic
IoT Internet of Things
CSA Crow Search Algorithm
PCA Principal Component Analysis
PFO-DNN Patent Foramen Ovale - Deep Neural Networks
BFO-DNN Bitterling Fish Optimization-Deep Neural Network
DCNN Deep Convolutional Neural Networks
RF Random Forest
NB Naive Bayes
DT Decision Tree
HOG Histogram of Oriented Gradients
LBP Local Binary Patterns
Table of Contents
Approval of the Guide i
Declaration ii
Certificate iii
Acknowledgement iv
Abstract v
List of Figures ix
List of Tables xi
List of Acronyms xii
Chapter 1 Introduction 1
1.1 Background 1
1.2 Objective 2
1.3 Motivation 2
1.4 Deep Learning 3
1.5 Lung Cancer and Colon Cancer 3
1.5.1 Lung Cancer 3
1.5.2 Colon Cancer 4
1.6 Problem Statement 5
1.7 Contribution to the thesis 5
1.8 Structure of the thesis 5
1.9 Summary 6
Chapter 2 Literature Review 7
2.1 Background 7
2.2 Literature Review 7-17
2.3 Summary 18
Chapter 3 Proposed Methodology 19
3.1 Background 19
3.2 Tools used and system specifications 19
3.2.1 Google Colaboratory 19
3.2.2 Jupyter Notebook 19
3.2.3 Python 19
3.2.4 Libraries 19
3.2.4.1 NumPy 19
3.2.4.2 Pandas 20
3.2.4.3 Matplotlib 20
3.2.4.4 TensorFlow 20
3.2.4.5 Keras 20
3.3 Proposed Methodology 21
3.3.1 LC25000 Histopathological Dataset Information 22
3.3.2 Data Pre-processing 23
3.3.3 Train, Test Split 24
3.3.4 Feature Extraction and Classification techniques 24
3.3.4.1 CNN with SVM 25
3.3.4.2 VGG16 26
3.3.4.3 VGG16 with Random Forest 27
3.3.4.4 VGG16 with XGBoost 29
3.3.4.5 ResNet-50 31
3.3.4.6 MobileNet_V2 33
3.3.4.7 GLCM with LightGBM 34
3.4 System Requirements 37
3.4.1 Software Requirements 37
3.4.2 Hardware Requirements 38
3.5 Summary 38
Chapter 4 Implementation Details and Experimental Results 39
4.1 Introduction 39
4.2 Evaluation Metrics used 39
4.2.1 Precision 39
4.2.2 Recall 39
4.2.3 Accuracy 39
4.2.4 F1-score 39
4.2.5 Confusion Matrix 40
4.3 Experimental Results 40
4.3.1 Results of CNN + SVM 40
4.3.2 Results of VGG16 42
4.3.3 Results of VGG16 with Random Forest 44
4.3.4 Results of VGG16 with XGBoost 46
4.3.5 Results of ResNet-50 48
4.3.6 Results of MobileNet_V2 50
4.3.7 Results of GLCM with LightGBM 52
4.4 Comparison report on the performance of all the models in this study 54
4.5 Evaluation metrics of Different models 56
4.6 Streamlit web application built using PyCharm 60
4.7 Summary 61
Chapter 5 Discussion and Future Work 62
5.1 Conclusion 62
5.2 Future Direction 63
References 65
Appendices 69