Splitting a PDB File into Chains and Ligands Without Using Tools

October 09, 2020

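The Protein Data Bank (PDB) file format is a plain-text representation of the three-dimensional (3D) structure of a protein and its bound ligands. The coordinates are derived computationally from experimental data, typically X-ray crystallography or Nuclear Magnetic Resonance (NMR) spectroscopy. The format dates from the 1970s, when the Protein Data Bank was established at Brookhaven National Laboratory (the archive is now maintained by RCSB PDB), and its fixed-column layout was designed to be parsed by software.

The atomic coordinates of the protein are stored in ATOM records, with a TER record marking the end of each chain. Ligands and other non-polymer groups, including water molecules, are stored in HETATM records. The column layout of each record type is given in the table below.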

Record Type	Columns	Description
ATOM	1 - 4	"ATOM"
	7 - 11	Atom serial number
	13 - 16	Atom name
	17	Alternate location indicator
	18 - 20	Residue name
	22	Chain identifier
	23 - 26	Residue sequence number
	27	Code for insertions of residues
	31 - 38	X orthogonal Å coordinate
	39 - 46	Y orthogonal Å coordinate
	47 - 54	Z orthogonal Å coordinate
	55 - 60	Occupancy
	61 - 66	Temperature factor
	73 - 76	Segment identifier
	77 - 78	Element symbol
	79 - 80	Charge
HETATM	1 - 6	"HETATM"
	7 - 80	Same as ATOM records
TER	1 - 3	"TER"
	7 - 11	Serial number
	18 - 20	Residue name
	22	Chain identifier
	23 - 26	Residue sequence number
	27	Code for insertions of residues
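
Because every field sits at a fixed column, a record can be read by slicing the line at the positions listed in the table. The following is a minimal Python sketch of that idea, not an official parser: the function name parse_atom_line and the sample record are made up for illustration, and the column numbers from the table are converted to Python's zero-based slices (column 22 becomes index 21).

```python
def parse_atom_line(line):
    """Slice one ATOM or HETATM record using the column ranges in the table.

    Hypothetical helper for illustration; columns are 1-based in the table,
    so column N maps to index N - 1 here.
    """
    line = line.rstrip("\n").ljust(80)  # pad short lines to 80 columns
    return {
        "record": line[0:6].strip(),        # columns 1 - 6
        "serial": int(line[6:11]),          # columns 7 - 11
        "name": line[12:16].strip(),        # columns 13 - 16
        "alt_loc": line[16].strip(),        # column 17
        "res_name": line[17:20].strip(),    # columns 18 - 20
        "chain": line[21].strip(),          # column 22
        "res_seq": int(line[22:26]),        # columns 23 - 26
        "i_code": line[26].strip(),         # column 27
        "x": float(line[30:38]),            # columns 31 - 38
        "y": float(line[38:46]),            # columns 39 - 46
        "z": float(line[46:54]),            # columns 47 - 54
        "occupancy": float(line[54:60]),    # columns 55 - 60
        "temp_factor": float(line[60:66]),  # columns 61 - 66
        "element": line[76:78].strip(),     # columns 77 - 78
        "charge": line[78:80].strip(),      # columns 79 - 80
    }


# Made-up sample record, laid out to match the column table.
sample = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N  "
record = parse_atom_line(sample)
print(record["res_name"], record["chain"], record["x"])
```

The same slices are what you read by eye in a text editor: the character in column 22 tells you which chain a line belongs to, and columns 18 - 20 give the residue or ligand name.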

Separating a PDB file into its chains and ligands is a common first step in comparative (homology) modelling, molecular modelling, and computer-aided drug design. In this tutorial, a text editor is used to do the separation by hand. The resource and software used in the video tutorial are RCSB PDB and Notepad++.

Note: Use a plain-text editor (not a rich-text editor or word processor), and save each output file with the .pdb extension.
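
The manual procedure amounts to sorting lines by record type and chain identifier, so it can also be written down as a short script, which may be handy for cross-checking a hand-split file. The sketch below is an assumption-laden illustration rather than the method used in the video: the function name split_pdb is hypothetical, ligands are grouped by residue name and chain, and water (residue name HOH) is skipped by choice.

```python
from collections import defaultdict


def split_pdb(lines):
    """Group ATOM/TER lines by chain and HETATM lines by (residue name, chain).

    Hypothetical helper for illustration. Water molecules (HOH) are skipped,
    which is a choice made for this sketch, not a rule of the PDB format.
    """
    chains = defaultdict(list)
    ligands = defaultdict(list)
    last_chain = " "
    for raw in lines:
        line = raw.rstrip("\n")
        padded = line.ljust(80)          # guard against short lines such as a bare TER
        record = padded[0:6].strip()     # columns 1 - 6
        if record == "ATOM":
            last_chain = padded[21]      # column 22: chain identifier
            chains[last_chain].append(line)
        elif record == "TER":
            # TER closes the chain whose atoms came just before it.
            chains[last_chain].append(line)
        elif record == "HETATM":
            res_name = padded[17:20].strip()  # columns 18 - 20
            if res_name == "HOH":
                continue
            ligands[(res_name, padded[21])].append(line)
    return dict(chains), dict(ligands)


# Made-up records, laid out to match the column table.
demo = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N  ",
    "TER       2      MET A   1",
    "ATOM      3  N   GLY B   1      10.000  11.000  12.000  1.00 20.00           N  ",
    "TER",
    "HETATM    5  C1  LIG A 101",
    "HETATM    6  O   HOH A 201",
]
chains, ligands = split_pdb(demo)
print(sorted(chains), sorted(ligands))
```

Each list in chains or ligands can then be written to its own .pdb file, one line per record, which is the same result as copying the matching lines into a new file in the text editor.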
