Introduction To Bioinformatics

SIO1003 is a Bioinformatics Concepts course for Semester 1 of the 2024/2025 session, focusing on the integration of biology, computer science, and information technology. The course covers various topics including DNA sequencing, bioinformatics tools, and applications in genomics, with a structure comprising lectures and practical assessments. Students are required to manage their attendance online and participate in a WhatsApp group for communication.

SIO1003

BIOINFORMATICS CONCEPTS

Semester 1 Session 2024/2025

SIO1003 | Bioinformatics Concepts


ATTENDANCE → via SPECTRUM
MARK YOUR ATTENDANCE ONLINE
WE WON’T ENTERTAIN REQUESTS TO REOPEN THE ATTENDANCE
IF YOU MISSED SIGNING IN.

SO, PLEASE TAKE RESPONSIBILITY FOR YOUR OWN ATTENDANCE


ATTENDANCE

DO IT NOW!



ADD YOURSELF INTO THIS WHATSAPP GROUP NOW

TOTAL STUDENTS SO FAR IN WEEK 1 = 100 STUDENTS



Lectures W1-W7 and Occ 2 Practicals (Biotech, MGM)
Dr. Nikman Adli Nor Hashim
Practical: Online
[email protected]

Lectures W8-W14 and Occ 1 Practicals (Biotech)
Dr. Vijayan Manickam Achari
Practical: Online
[email protected]

Occ 3 Practicals (SPAS)
Dr. Farahaniza Supandi
Practical: Online
[email protected]

Lecture venue: Online (all)

Timetable:
Mon 12.00 pm – 12.50 pm (Lectures)
Tue 2.00 pm – 4.50 pm (Occ1 & Occ2)
Wed 2.00 pm – 4.50 pm (Occ3)



Course Learning Outcomes

• Describe the basic concepts of bioinformatics
• Manipulate suitable bioinformatics resources to solve biological problems
• Operate common bioinformatics software and applications



Course structure
Week | Topic | Assessment
Week 1 | Course introduction. Central Dogma. Introduction to Bioinformatics | -
Week 2 | DNA sequencing & The Human Genome Project | -
Week 3 | Biological and Bioinformatics tools and databases. Part 1 | -
Week 4 | Biological and Bioinformatics tools and databases. Part 2 | CA – Practical 1 (8%)
Week 5 | Gene ontology | CA – Practical 2 (8%)
Week 6 | Molecular evolution | -
Week 7 | Pairwise sequence alignment | CA – Practical 3 (8%)
Mid-semester Break: 25.11.2024 - 01.12.2024
Week 8 | Database similarity search BLAST. Part 1 | CA – Test 1 (10%)
Week 9 | Database similarity search BLAST. Part 2 | CA – Practical 4 (8%)
Week 10 | Molecular phylogenetics | -
Week 11 | Multiple sequence alignment | CA – Practical 5 (8%)
Week 12 | Introduction to structural biology and Computer-Aided Drug Design | -
Week 13 | Application and future in Bioinformatics | AA – TBC (20%); CA – Recorded group presentation (10%)
Week 14 | Revision | AA – TBC (20%)



Course assessment
• 60% continuous assessment
• Weeks 4, 5, 7, 9, 11 – Practical reports (8% x 5 = 40%)
• Week 8 – Mid-Sem Test (10%)
• Week 13 – Recorded group presentation (10%)

• 40% alternative assessment
• Week 13 – TBC (20%)
• Week 14 – TBC (20%)



Introduction to Bioinformatics

Bioinformatics in the post-genomic era



What is Bioinformatics?
• Bioinformatics is an interdisciplinary field of science in which biology, computer
science, and information technology merge to form a single discipline
• … Bioinformatics is a hybrid of biology and computer science
• … Bioinformatics is computer aided biology!

[Figure: BIOINFORMATICS at the centre, surrounded by COMPUTER SCIENCE, ENGINEERING, CHEMISTRY, MATHEMATICS, BIOCHEMISTRY, STATISTICS and BIOLOGY]



Definition
• Historically, the term bioinformatics did not mean what it means today.

• Paulien Hogeweg and Ben Hesper initially coined the term “bioinformatica” in 1970 to refer
to the study of information processes in biotic systems. This definition placed bioinformatics
as a field parallel to biochemistry (the study of chemical processes in biological systems).



Definition
Today’s definition:

“Bioinformatics is the research, development, or application of computational tools and
approaches for expanding the use of biological, medical, behavioral or health data, including
those tools and approaches to acquire, store, organize, archive, analyze or visualize such data”

Computer-based management and analysis of biological and biomedical data with useful
applications in many disciplines, particularly genomics, proteomics, metabolomics, etc…



More definitions..
“Bioinformatics is conceptualizing biology in terms of macromolecules and then applying
"informatics" techniques (derived from disciplines such as applied maths, computer science, and
statistics) to understand and organize the information associated with these molecules, on a
large-scale.”
Luscombe NM, et al. Methods Inf Med. 2001;40:346.

“Bioinformatics is a subdiscipline of biology and computer science concerned with the
acquisition, storage, analysis, and dissemination of biological data, most often DNA and amino
acid sequences. Bioinformatics uses computer programs for a variety of applications, including
determining gene and protein functions, establishing evolutionary relationships, and predicting
the three-dimensional shapes of proteins.”
National Institutes of Health (NIH)

Key point: Bioinformatics is Computer Aided Biology



Where did Bioinformatics come from?
• Bioinformatics arose as molecular biology began to be transformed by the emergence of
molecular sequence and structural data.

[Figure: Computational alignment of experimentally determined sequences of a class of related proteins]
[Figure: 3D structure of hemoglobin]

• Because bioinformatics depends on the collection and availability of biological data, the
question that emerges is why there is so much interest in the storage, retrieval and
analysis of this data.



Various types of Bioinformatics data

• DNA and RNA sequence
• DNA and RNA structure
• Genomes
• Gene expression
• Protein sequence
• Protein structure
• Protein families, motifs, domains
• Protein interaction
• Chemical entities
• Pathways
• Systems
• Ontologies
• Literature



Recap: The key dogmas of molecular biology
• DNA sequence determines protein sequence.
• Protein sequence determines protein structure.
• Protein structure determines protein function.
• Regulatory mechanisms (e.g. gene expression) determine the amount of a particular
function in space and time.

• Bioinformatics is now essential for the archiving, organization and analysis of data related to
these processes.



“The Central Dogma” Francis Crick, 1957
• Genetic Information Flow:

• The central dogma of molecular biology is an explanation of the flow of genetic
information within a biological system.
• It is often stated as "DNA makes RNA, and RNA makes protein"



“Basic” central dogma
• Genome (DNA): DNA replication (DNA -> DNA) by DNA Polymerase
• Transcription (DNA -> RNA) by RNA Polymerase
• Transcriptome ((+) Sense RNA)
• Translation (RNA -> Protein) by Ribosome
• Proteome (Protein)

Example from the figure (codons separated by spaces):
DNA strand (5’ to 3’): CGT GGA TAC ACT TTT GCC GTT
DNA template strand (3’ to 5’): GCA CCT ATG TGA AAA CGG CAA
mRNA (5’ to 3’): CGU GGA UAC ACU UUU GCC GUU
Polypeptide (amino terminus to carboxy terminus): Arg Gly Tyr Thr Phe Ala Val
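
The example in the figure (DNA strand CGT GGA TAC ACT TTT GCC GTT) can be traced in a few lines of Python. This is only a minimal sketch and not part of the course material: the function names `complement`, `transcribe` and `translate` are illustrative, and the codon table is the standard genetic code written in the compact T, C, A, G ordering.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def complement(dna):
    """Base-by-base complement of a DNA strand (orientation is not reversed)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


def transcribe(dna_strand):
    """mRNA with the same sequence as the given DNA strand, with U replacing T."""
    return dna_strand.replace("T", "U")


def translate(mrna):
    """Translate an mRNA in reading frame 1 into one-letter amino acid codes."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


dna_strand = "CGTGGATACACTTTTGCCGTT"  # 5' to 3', taken from the figure
template = complement(dna_strand)      # template strand, read 3' to 5'
mrna = transcribe(dna_strand)
protein = translate(mrna)

print(template)
print(mrna)
print(protein)
```

The one-letter result corresponds to the figure's Arg, Gly, Tyr, Thr, Phe, Ala, Val. In practice an established library would normally be used for this rather than a hand-written table.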



“Unusual” central dogma

• DNA replication (DNA -> DNA): DNA Polymerase
• Transcription (DNA -> RNA): RNA Polymerase
• Reverse transcription (RNA -> DNA): Reverse Transcriptase
• RNA replication (RNA -> RNA), between (+) Sense RNA and (-) Sense RNA: RNA Dependent RNA Polymerase
• Translation (RNA -> Protein): Ribosome
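
The two "unusual" flows, reverse transcription and RNA replication, are both template-directed copying, so they can be sketched as complement operations. The code below is an illustrative assumption rather than course material: the function names are made up, and both functions take and return sequences written 5' to 3'.

```python
# Complement rules for copying from an RNA template.
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")  # reverse transcription
RNA_TO_RNA = str.maketrans("ACGU", "UGCA")  # RNA replication


def reverse_transcribe(rna):
    """Complementary DNA (cDNA) for an RNA template, written 5' to 3'."""
    return rna.translate(RNA_TO_DNA)[::-1]


def replicate_rna(rna):
    """Complementary RNA strand (e.g. (-) sense from (+) sense), written 5' to 3'."""
    return rna.translate(RNA_TO_RNA)[::-1]


plus_strand = "CGUGGA"
cdna = reverse_transcribe(plus_strand)
minus_strand = replicate_rna(plus_strand)

print(cdna)
print(minus_strand)
print(replicate_rna(minus_strand))  # copying the (-) strand restores the (+) strand
```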



Genomes (genetics?)
• The genome of an organism – the collection of DNA within that organism, including the set of
genes that encode RNA molecules and proteins
• 1st complete genome of a free-living organism
• 1995 – bacterium Haemophilus influenzae
• Publicly available databanks now hold more than 6 trillion nucleotide bases of sequence data
• These have been collected from over 450,000 different species of organisms
• DNA sequencing technologies
• 1977 – Sanger sequencing
• 2005 – next generation sequencing
• Acquisition, storage, analysis and distribution of data

Post-genomic era
• Omics technologies have transformed molecular biology into a data-rich discipline by
enabling scientists to simultaneously measure large numbers of molecular components that
operate together through a network of interactions to generate cellular functions
and phenotypic states

• Extraction of this knowledge is not easy
• Incompleteness of data
• Variability between experimental platforms
• Multiple hypothesis testing with few replicates

• Functional genomics – investigates how genes function in the body
• Personalised medicine – develops treatments tailored to an individual’s unique genetic makeup
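
To illustrate why multiple hypothesis testing is listed as a difficulty: if a hypothetical experiment tested 20,000 genes at p < 0.05, about 1,000 would be expected to pass by chance alone even if no gene truly differed. The sketch below shows two common corrections, Bonferroni and Benjamini-Hochberg, in plain Python. The p-values are invented for illustration and the function names are assumptions.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for five hypothetical genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)

for p, b, q in zip(raw, bonf, bh):
    print(f"raw={p:.3f}  bonferroni={b:.3f}  bh={q:.5f}")
```

Bonferroni is the stricter of the two. With few replicates the raw p-values are themselves imprecise, which is part of why the slide flags this as a problem.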



Genome & Genomics
Genome
• Complete genetic information in an organism (incl. ALL organs, tissues, cells, genes, nc,
variants)
• Eukaryotes can have 2 or 3 genomes:
• Nuclear genome (what “genome” usually refers to, if not specified)
• Mitochondrial genome
• Plastid genome
Genomics
• The study of genomes, incl. large chromosomal segments containing many genes
• Aims:
• To map and sequence the entirety of a genome (general)
• To deduce information about the functions of DNA sequences (functional genomics)



-Omics
“Omics is a general term for a broad discipline of science and engineering for
analyzing the interactions of biological information objects in various ‘omes’.”

• The main focus is on:
1. Mapping information objects such as genes, proteins, and ligands
2. Finding interaction relationships among the objects
3. Engineering the networks and objects to understand and manipulate the regulatory mechanisms
4. Integrating various omes and omics subfields

–The Omics Wiki



-Omics studies
• Genomics- The study of the structure, function and expression of all the genes in an
organism
• Transcriptomics- The study of transcriptomes, their structures and functions
• Proteomics- The large-scale study of proteins, including their structure and function,
within a cell/system/organism.
• Metabolomics- The study of global metabolite profiles in a system (cell, tissue or
organism) under a given set of conditions
• Epigenomics, Interactomics, Phenomics, Lipidomics, Fluxomics, …



Why do we need Bioinformatics?
• Bioinformatics is necessitated by the rapidly expanding quantities and complexity of
biomolecular data

• Bioinformatics provides methods for the efficient:
• storage
• annotation
• search and retrieval
• data integration
• data mining and analysis

Bioinformatics is essential for the archiving, organization and analysis of data
from sequencing, structural genomics, microarrays, proteomics and new
high-throughput assays.



How do we do Bioinformatics?
A “bioinformatics approach” involves the application of computer algorithms, computer
models and computer databases with the broad goal of understanding the action of both
individual genes, transcripts and proteins, and large collections of these entities



How do we actually do Bioinformatics?
• Pre-packaged tools and databases
• Many online
• New tools and time-consuming methods frequently require downloading
• Most are free to use

• Tool development
• Mostly in a UNIX environment
• Knowledge of programming languages frequently required (Python, Perl, R, C, Java,
Fortran)
• May require specialized or high-performance computing resources…
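
As a flavour of what small-scale tool development can look like, here is a minimal Python sketch that reads sequences in FASTA format and reports their GC content. It is an illustration only: the function names and the two example records are made up, and real projects would normally rely on an established library rather than a hand-written parser.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first word after '>'.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Hypothetical input; sequences may span several lines in FASTA files.
EXAMPLE = """>seq1 made-up example record
CGTGGATACACT
TTTGCCGTT
>seq2
ATATATAT
"""

sequences = parse_fasta(EXAMPLE)
for record_name, sequence in sequences.items():
    print(record_name, len(sequence), round(gc_content(sequence), 3))
```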



Skepticism & Bioinformatics
• We have to approach computational results the same way we do wet-lab results:
• Do they make sense?
• Is it what we expected?

• Do we have adequate controls, and how did they come out?

• Modeling is modeling, but biology is different...


• What does this model actually contribute? (to the biological function)

• Avoid the misuse of ‘black boxes’

• Replicability is a cornerstone of scientific research



Challenges in Bioinformatics
• Explosion of information
• Need for faster, automated analysis to process large amounts of data
• Need for integration between different types of information (sequence, literature,
annotations, protein levels, RNA levels, etc)
• Need for “smarter” software to identify interesting relationships in very large datasets

• Lack of bioinformaticians/bioinformaticists
• Software needs to be easier to access, use and understand
• Biologists need to learn about the software, its limitations and how to interpret its results



Challenges in Bioinformatics
• Confusing multitude of tools available
• Each with many options and settable parameters

• Most tools and databases are written by and for nerds
• Same is true of documentation - if any exists!
• Most are developed independently

Notable exceptions are found at the:
• EBI (European Bioinformatics Institute) and
• NCBI (National Center for Biotechnology Information)



Bioinformatics research areas
• Include but are not limited to:
• Organization, classification, dissemination and analysis of biological and biomedical data
(particularly ‘-omics' data)
• Biological sequence analysis and phylogenetics
• Genome organization and evolution
• Regulation of gene expression and epigenetics
• Biological pathways and networks in healthy and disease states
• Protein structure prediction from sequence
• Modeling and prediction of the biophysical properties of biomolecules for binding
prediction and drug design
• Design of biomolecular structure and function

…With applications to Biology, Medicine, Agriculture and Industry



SUMMARY
• Bioinformatics is computer aided biology.
• Bioinformatics deals with the collection, archiving, organization, and interpretation of a
wide range of biological data.
