Posts

Creating Custom Database using Standalone NCBI BLAST+

Basic Local Alignment Search Tool (BLAST) is a collection of programs, written in C++, that use a heuristic algorithm to compare DNA, RNA, and protein sequences. The standalone command-line interface (CLI) of BLAST is named BLAST+. The latest version of NCBI BLAST+ can be downloaded from the NCBI FTP server (ftp://ftp.ncbi.nih.gov/blast/executables/blast+/LATEST). This is a simple tutorial for creating a custom database, accessing the database, and performing a sequence search using BLAST+.

1. Creating a Custom Database

A nucleotide (nucl) or protein (prot) database can be created using the -dbtype parameter of the makeblastdb program. We can create two types of database using the command lines below.

Non-indexed Database:

    ./makeblastdb -in DBX.fasta -out DBX -dbtype prot

    Building a new DB, current time: 12/04/2020 10:10:06
    New DB name: C:\NCBI\blast-2.6.0+\bin\DBX
    New DB title: DBX.fasta
    Sequence type: Protein
    Keep MBits: T
    Maximum file size: 1000000000B
    Adding sequences
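The non-indexed makeblastdb call can also be assembled from a script, which helps when several FASTA files need the same treatment. The Python sketch below is only an illustration: the helper name makeblastdb_command is hypothetical, it uses just the -in, -out, and -dbtype options from the command in this tutorial, and it assumes makeblastdb is on the PATH and DBX.fasta is in the working directory before it runs anything.

```python
import os
import shutil
import subprocess


def makeblastdb_command(fasta_path, db_name, db_type="prot", executable="makeblastdb"):
    """Build the argument list for a non-indexed makeblastdb run (hypothetical helper)."""
    if db_type not in ("nucl", "prot"):
        raise ValueError("db_type must be 'nucl' or 'prot'")
    return [executable, "-in", fasta_path, "-out", db_name, "-dbtype", db_type]


if __name__ == "__main__":
    cmd = makeblastdb_command("DBX.fasta", "DBX", "prot")
    print(" ".join(cmd))
    # Run only when both the BLAST+ binary and the input file are present.
    if shutil.which(cmd[0]) and os.path.isfile("DBX.fasta"):
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.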

Synthesis and Retrosynthesis of Molecule(s) for Drug Development using IBM RXN

Synthesis is a step-by-step process of constructing complex molecules from a set of simpler molecules and specific reagents, while retrosynthesis (retro-, backwards) is the process of deconstructing a molecule into readily available simple molecules. IBM RXN is the first free artificial intelligence (AI) web service for predicting reactions and retrosynthesis routes. Moreover, the AI model is independent of atom-mapping and returns a confidence level for each prediction. In this video tutorial, I explain the synthesis and retrosynthesis of a molecule using IBM RXN. The resources used in this video tutorial are IBM RXN and NCBI PubChem.

2D Plot and 3D Visualization of Protein-Ligand and Protein-Protein Interactions

Intermolecular interactions are weak, non-covalent forces, such as hydrogen bonds and hydrophobic contacts, that hold two or more molecules together in a complex. In molecular modelling or computer-aided drug design (CADD), intermolecular interaction studies are performed to analyze the stability/energy of docked molecules. This video tutorial demonstrates two-dimensional (2D) plots and three-dimensional (3D) molecular visualization of protein-ligand and protein-protein intermolecular interactions using the LigPlot+/LigPlus tool. LigPlot+ is the successor of the original LigPlot program. Protein-ligand interactions are computed using the LigPlot program, and protein-protein or domain-domain interactions using the DimPlot program. Moreover, the computed result can be viewed interactively in 3D using RasMol or PyMOL. The software used in this tutorial is LigPlot+, LigPlus, RasMol, PyMOL, and JDK.

Splitting PDB File into Chains and Ligands Without using Tools

The Protein Data Bank (PDB) file format represents the three-dimensional (3D) structure of proteins and ligands, derived computationally from experimental results such as X-ray crystallography or Nuclear Magnetic Resonance (NMR). The standard PDB file format was introduced by the Protein Data Bank during the 1970s (the archive is now managed by RCSB PDB and its wwPDB partners) so that entries could be parsed by software. The atomic coordinate entries of a protein are represented by the ATOM and TER fields, and those of a ligand by the HETATM field (given in the table below).

Record Type   Columns   Description
ATOM          1 - 4     "ATOM"
              7 - 11    Atom serial number
              13 - 16   Atom name
              17        Alternate location indicator
              18 - 20   Residue name
              22        Chain identifier
              23 - 26   Residue sequence number
              27        Code for insertions of residues
              31 - 38   X orthogonal Å coordinate
              39 - 46   Y orthogonal Å coordinate
              47 - 54   Z orthogonal Å coordinate
              55 - 60   Occupancy
              61 - 66   Temperature factor
              73 - 76   Segment identifier
              77 - 78   Element symbol
              79 -
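Because the chain identifier, residue name, and residue number sit in fixed columns, a PDB file can be split with plain string slicing. The Python sketch below is one possible way to do it and is not necessarily the method used in the tutorial: the function name split_pdb, the skip_water option (which drops HOH records), and the "_" placeholder for a blank chain identifier are all assumptions made for illustration.

```python
from collections import defaultdict


def split_pdb(lines, skip_water=True):
    """Group ATOM/TER lines by chain and HETATM lines by ligand (hypothetical helper)."""
    chains = defaultdict(list)   # chain ID -> ATOM and TER lines
    ligands = defaultdict(list)  # (residue name, chain ID, residue number) -> HETATM lines
    last_chain = "_"
    for line in lines:
        line = line.rstrip("\n")
        record = line[0:6].strip()
        if record == "ATOM":
            # Column 22 (index 21) holds the chain identifier.
            last_chain = line[21:22].strip() or "_"
            chains[last_chain].append(line)
        elif record == "TER":
            # A TER line may be truncated, so attach it to the chain just read.
            chains[last_chain].append(line)
        elif record == "HETATM":
            res_name = line[17:20].strip()  # columns 18 - 20
            if skip_water and res_name == "HOH":
                continue
            key = (res_name, line[21:22].strip() or "_", line[22:26].strip())
            ligands[key].append(line)
    return dict(chains), dict(ligands)


# Example usage (assumes a local file named "input.pdb"):
# with open("input.pdb") as handle:
#     chains, ligands = split_pdb(handle)
# for chain_id, records in chains.items():
#     with open(f"chain_{chain_id}.pdb", "w") as out:
#         out.write("\n".join(records) + "\nEND\n")
```

One limitation of this sketch: modified residues stored as HETATM records inside a chain (for example selenomethionine, MSE) are grouped with the ligands, so the result should be checked by eye.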

Prediction of 3D Structure (Folding) of a RNA Sequence

Ribonucleic acid (RNA) is a linear single-stranded molecule that takes part in translation to protein. The intramolecular interactions (a.k.a. folding) of RNA base pairs (A=U and C≡G) form a secondary structure. The Nussinov and Zuker algorithms are dynamic programming approaches used to predict the secondary structure of RNA. The dot-bracket structure with the minimum free energy is taken as the stable secondary structure. Prediction of the three-dimensional (3D) structure of RNA using the mFold and RNAComposer tools is demonstrated in this tutorial. The mFold tool predicts the RNA folding result in dot-bracket notation from the RNA sequence, while the RNAComposer tool predicts the 3D structure from the dot-bracket notation. The resources used in this tutorial are NCBI GenBank, mFold, and RNAComposer. Note: The length of RNA sequence input in mFold is limited to 4000 bases and sequence/dot-bracket notation input in RNAComposer is limited to 500 bases, due to the comple
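The dynamic programming idea behind Nussinov-style folding can be sketched in a few lines of Python. This is a minimal illustration and not what mFold runs: it maximizes the number of A-U and C-G pairs instead of minimizing free energy, and the function name nussinov and the min_loop parameter (minimum number of unpaired bases inside a hairpin) are assumptions.

```python
def nussinov(seq, min_loop=3):
    """Return (number of base pairs, dot-bracket string) for an RNA sequence."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs in the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Trace back through the table to recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = dp[i][k - 1] if k > i else 0
                if dp[i][j] == left + 1 + dp[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


print(nussinov("GGGAAAUCCC"))  # (3, '(((....)))')
```

The resulting dot-bracket string has the same form as the notation described above, but for real work an energy-based predictor remains the better choice.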

Constructing Entropy Plot from Multiple Sequence Alignment

In sequence analysis, entropy is a measure of the variation of characters within a column of a multiple sequence alignment. An entropy plot can be computed from a multiple sequence alignment using different entropy formulas, namely Shannon's Entropy, Schneider's Entropy, Shenkin's Entropy, Gerstein's Entropy, and Gap normalized Entropy. Constructing an entropy plot consists of two phases: (i) performing the multiple sequence alignment and deriving its consensus, and (ii) calculating the entropy value of each column of the alignment. The entropy plot is generated by drawing vertical lines, with the positions of the consensus sequence on the x-axis and the entropy value on the y-axis. This simple video tutorial demonstrates how to construct an entropy plot from a multiple sequence alignment. The tools used in this tutorial are ClustalW and Entropy Plotter. Note: We can choose any multiple sequence alignment tool, but the alignment output must
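For the Shannon variant, the entropy of a column is H = -Σ p_i log2(p_i), where p_i is the fraction of sequences carrying symbol i in that column. The Python sketch below computes this value for each column of an alignment. It is a minimal illustration, not the Entropy Plotter implementation: the function names are hypothetical, and ignoring gap characters ("-") is an assumption that can be switched off.

```python
import math
from collections import Counter


def column_entropy(column, ignore_gaps=True):
    """Shannon entropy (in bits) of one alignment column given as a string."""
    symbols = [c for c in column.upper() if not (ignore_gaps and c == "-")]
    if not symbols:
        return 0.0
    total = len(symbols)
    counts = Counter(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Entropy of every column; all sequences must already be aligned."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


alignment = ["ACGT", "ACGA", "ACTA", "AC-A"]
for position, value in enumerate(alignment_entropy(alignment), start=1):
    print(position, round(value, 3))
```

A fully conserved column scores 0, and a DNA column split evenly among all four bases scores 2 bits, so taller lines in the plot mark more variable positions.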

Compound Name to 3D Structure Prediction

The three-dimensional (3D) structure of a compound can be retrieved by its standard chemical name from the most popular databases, namely PubChem, ChEBI, ChEMBL, ChemSpider, CSD, ZINC, DrugBank, etc. If the 3D structure or chemical data is not available in a database, this simple tutorial helps you to predict an optimal structure. The tools used in this tutorial are OPSIN, CACTUS, and Chimera.
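A name lookup of this kind can also be scripted. The Python sketch below assumes that the CACTUS tool mentioned here is the NCI/CADD Chemical Identifier Resolver and that it is reached through its structure/<name>/<representation> URL scheme. The helper names cactus_url and resolve_name are hypothetical, and the "sdf" representation is assumed to be available alongside "smiles".

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the NCI/CADD Chemical Identifier Resolver.
CACTUS_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cactus_url(name, representation="smiles"):
    """Build the resolver URL for a compound name (hypothetical helper)."""
    return f"{CACTUS_BASE}/{quote(name, safe='')}/{representation}"


def resolve_name(name, representation="smiles", timeout=30):
    """Fetch the requested representation; needs network access."""
    with urlopen(cactus_url(name, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


print(cactus_url("acetic acid", "sdf"))
# Example usage (requires an internet connection):
# print(resolve_name("aspirin"))
```

The name is percent-encoded before it is placed in the URL, so names containing spaces or brackets are passed through safely.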