DESCRIPTION
HELM, which was originally developed by Pfizer, provides a way to represent molecules that are too large to represent atomically or that contain non-natural chemical modifications that make it impractical to represent them as sequences. HELM's structure hierarchy consists of complex polymers, simple polymers, monomers, and atoms. Monomers are described using atoms and bonds, simple (single-type) polymers as sequences of monomers, and complex (multi-type) polymers as connected polymers. A detailed description of HELM is available in a paper published in the Journal of Chemical Information and Modeling.
http://pistoiaalliance.org @PistoiaAlliance
Pistoia Alliance HELM Project - What About the Big Guys?
The emerging HELM standard for macromolecular representation
Domain Lead – Sergio Rotstein
Business Technology, Pfizer
What is a “Biomolecule”?
Peptides
Therapeutic Proteins
ADCs
Antibodies
Vaccines
ASOs
siRNAs
For our purposes, anything that is not a small molecule is a biomolecule
Goal
• Eliminate biomolecule penalty
• Make these entities first-class citizens of the Informatics tool portfolio
So what’s the problem?
[Figure labels: Small Molecules, Biomolecules, Sequences; Small Molecule Tools, GAP, Sequence-Based Tools. Chemical structure drawing omitted.]
“Fit-for-Purpose” Structure Representation
We need to enable the representation, manipulation and visualization of each molecule type in a way that is appropriate for its size and complexity
Fit for Purpose: “Monomer” Level
• While you could draw out an oligonucleotide like this:
• The representation is likely more intuitive / practical:
Fit for Purpose: Sequence Level
• But even the monomer-level representation would not scale well to proteins with hundreds of amino acids. Larger molecules require a more sequence-oriented representation:
Fit for Purpose: Component Level
• For multi-component structures such as antibody-drug conjugates, component-level representations are required to enable each component to be dealt with separately.
[Figure: “Collapsed” Antibody (Ab) with Expanded Drug; chemical structure drawing omitted.]
Hierarchical Editing Language for Macromolecules
– Hierarchical – Amenable to the various “levels”
• Complex Polymer ⇒ Simple Polymer ⇒ Monomer ⇒ Atom
– Extensible
• Allowing addition of new biopolymer types
– (Reasonably) comprehensive
• e.g. Allowing representation of oligonucleotide hybridization
– Canonicalizable
• Facilitating uniqueness checking
– (Somewhat) human-readable
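The levels listed above can be pictured as nested data types. The following is only a minimal sketch in Python, not the HELM reference implementation: every class and field name is hypothetical, chosen to mirror the Complex Polymer ⇒ Simple Polymer ⇒ Monomer ⇒ Atom hierarchy.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Atom:
    element: str  # e.g. "C", "N", "O"


@dataclass
class Monomer:
    code: str  # short identifier used in the notation, e.g. "A" or "dF"
    atoms: List[Atom] = field(default_factory=list)
    # Bonds are pairs of indices into `atoms` (hypothetical layout).
    bonds: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class SimplePolymer:
    polymer_type: str  # hypothetical label, e.g. "PEPTIDE" or "RNA"
    monomers: List[Monomer] = field(default_factory=list)


@dataclass
class ComplexPolymer:
    polymers: List[SimplePolymer] = field(default_factory=list)
    # Each connection links two simple polymers by their index in `polymers`.
    connections: List[Tuple[int, int]] = field(default_factory=list)


def level_counts(cp: ComplexPolymer) -> dict:
    """Count the entities present at each level of the hierarchy."""
    monomers = [m for p in cp.polymers for m in p.monomers]
    return {
        "simple_polymers": len(cp.polymers),
        "monomers": len(monomers),
        "atoms": sum(len(m.atoms) for m in monomers),
    }
```

A tool can then work at whichever level suits the molecule: a sequence viewer might read only `monomers`, while a chemistry toolkit would expand down to `atoms`.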
HELM Example: Simple polymer
• HELM notation: A.R.G.[dF].C.K.[ahA].E.D.A
– Non-natural amino acid codes are enclosed in square brackets
• Natural equivalent: ARGFCKXEDA
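To make the mapping from the notation to its natural equivalent concrete, here is a small Python sketch. It is an illustration under stated assumptions, not part of any HELM toolkit: the function names are hypothetical, monomers are assumed to be separated by `.`, bracketed codes are assumed to contain no `.`, and the natural-analog table holds only the two non-natural codes from the example.

```python
# Hypothetical natural-analog table (an assumption for illustration only);
# a real system would take this information from its monomer database.
NATURAL_ANALOG = {"dF": "F", "ahA": "X"}


def split_monomers(notation: str) -> list:
    """Split a simple-polymer string such as 'A.R.[dF]' into monomer codes.

    Assumes monomers are separated by '.' and that bracketed codes
    contain no '.' themselves.
    """
    codes = []
    for token in notation.split("."):
        if token.startswith("[") and token.endswith("]"):
            codes.append(token[1:-1])  # strip the square brackets
        else:
            codes.append(token)
    return codes


def natural_equivalent(notation: str) -> str:
    """Map each monomer to a one-letter natural code, using 'X' when unknown."""
    letters = []
    for code in split_monomers(notation):
        if len(code) == 1:
            letters.append(code)  # already a natural one-letter code
        else:
            letters.append(NATURAL_ANALOG.get(code, "X"))
    return "".join(letters)


print(natural_equivalent("A.R.G.[dF].C.K.[ahA].E.D.A"))  # ARGFCKXEDA
```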
HELM Example: Complex Polymer
Monomer Database
• Each monomer used in the notation needs to be predefined in a monomer database
• The database includes the chemical structure of the monomer and a description of all acceptable attachment points
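The requirement that every monomer be predefined can be sketched as a lookup table plus a validation step. This is a hedged Python illustration, not the actual monomer database schema: the `MonomerEntry` type, the `R1`/`R2` attachment point labels, and the structure strings are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MonomerEntry:
    code: str
    structure: str  # illustrative structure string, not taken from the source
    attachment_points: Tuple[str, ...]  # hypothetical labels such as ("R1", "R2")


# Tiny illustrative database; a real one would hold every monomer in use.
MONOMER_DB: Dict[str, MonomerEntry] = {
    "A": MonomerEntry("A", "CC(N)C(=O)O", ("R1", "R2")),
    "G": MonomerEntry("G", "NCC(=O)O", ("R1", "R2")),
}


def undefined_monomers(notation: str, db: Dict[str, MonomerEntry]) -> List[str]:
    """Return the monomer codes in a simple-polymer string that the database lacks.

    Assumes monomers are separated by '.' and that non-natural codes are
    wrapped in square brackets, as in the example notation.
    """
    codes = [token.strip("[]") for token in notation.split(".")]
    return [code for code in codes if code not in db]
```

With this toy database, `undefined_monomers("A.G.[dF].A", MONOMER_DB)` reports `dF` as missing, which is the point at which a registration tool would ask for the monomer to be defined first.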
J. Chem. Inf. Model. 2012, 52, 2796-2806