Molecular similarity for machine learning in drug development : poster presentation
- Poster presentation. In pharmaceutical research and drug development, machine learning methods play an important role in virtual screening and ADME/Tox prediction. Applying such methods requires a formal measure of similarity between molecules. Such a measure, in turn, depends on the underlying molecular representation. Input samples have traditionally been modeled as vectors; consequently, molecules are presented to machine learning algorithms in vectorized form using molecular descriptors. While this approach is straightforward, it has shortcomings. Among others, the learned model can be difficult to interpret, e.g., when using fingerprints or hashing. Structured representations of the input are an alternative to vector-based representations and have been a trend in machine learning in recent years. For molecules, there is a rich choice of such representations. Popular examples include the molecular graph, molecular shape, and the electrostatic field. We have developed a molecular similarity measure defined directly on the (annotated) molecular graph, a long-established topological model for molecules. It is based on the concepts of optimal atom assignments and iterative graph similarity. In the latter, two atoms are considered similar if their neighbors are similar. This recursive definition leads to a non-linear system of equations. We show how to solve these equations iteratively and give bounds on the computational complexity of the procedure. Advantages of our similarity measure include interpretability (atoms of two molecules are assigned to each other, each pair with a score expressing local similarity; this can be visualized to show similar regions of two molecules and the degree of their similarity) and the possibility of introducing knowledge about the target where available.
We retrospectively tested our similarity measure using support vector machines for virtual screening on several pharmaceutical and toxicological datasets, with encouraging results. Prospective studies are under way.
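The recursive definition above (two atoms are similar if their neighbors are similar) can be illustrated with a small fixed-point iteration followed by an optimal atom assignment. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the exact-label atom comparison, the weighting parameter `alpha`, the normalization by the larger neighbor count, and the brute-force assignment (feasible only for very small graphs) are all illustrative choices that do not come from the abstract.

```python
from itertools import permutations


def best_assignment(score, rows, cols):
    """Brute-force optimal assignment: match every item of the smaller
    set to a distinct item of the larger set, maximizing the summed score.
    Returns (total, list of (row, col) pairs). Small inputs only."""
    if not rows or not cols:
        return 0.0, []
    swap = len(rows) > len(cols)
    small, large = (cols, rows) if swap else (rows, cols)
    best_total, best_pairs = -1.0, []  # scores are non-negative
    for perm in permutations(large, len(small)):
        pairs = [(p, s) if swap else (s, p) for s, p in zip(small, perm)]
        total = sum(score[r][c] for r, c in pairs)
        if total > best_total:
            best_total, best_pairs = total, pairs
    return best_total, best_pairs


def iterative_similarity(labels_a, adj_a, labels_b, adj_b,
                         alpha=0.5, iters=100, tol=1e-9):
    """Iterate x[i][j] = (1 - alpha) * base[i][j]
                         + alpha * (best neighbor assignment) / max degree
    until the largest change falls below tol (assumed update rule)."""
    na, nb = len(labels_a), len(labels_b)
    # Assumed atom comparison: 1 if element labels match, else 0.
    base = [[1.0 if labels_a[i] == labels_b[j] else 0.0 for j in range(nb)]
            for i in range(na)]
    x = [row[:] for row in base]
    for _ in range(iters):
        new = [[0.0] * nb for _ in range(na)]
        for i in range(na):
            for j in range(nb):
                total = best_assignment(x, adj_a[i], adj_b[j])[0]
                denom = max(len(adj_a[i]), len(adj_b[j]), 1)
                new[i][j] = (1 - alpha) * base[i][j] + alpha * total / denom
        delta = max(abs(new[i][j] - x[i][j])
                    for i in range(na) for j in range(nb))
        x = new
        if delta < tol:
            break
    return x


def molecule_similarity(labels_a, adj_a, labels_b, adj_b, alpha=0.5):
    """Assign atoms optimally using the converged atom-pair scores and
    normalize by the size of the larger molecule."""
    x = iterative_similarity(labels_a, adj_a, labels_b, adj_b, alpha)
    total, pairs = best_assignment(
        x, list(range(len(labels_a))), list(range(len(labels_b))))
    return total / max(len(labels_a), len(labels_b)), pairs


# Heavy-atom graphs (hypothetical toy inputs): ethanol C-C-O, propane C-C-C.
ethanol = (["C", "C", "O"], [[1], [0, 2], [1]])
propane = (["C", "C", "C"], [[1], [0, 2], [1]])

self_score, self_pairs = molecule_similarity(*ethanol, *ethanol)
cross_score, cross_pairs = molecule_similarity(*ethanol, *propane)
```

The returned pairs carry the interpretability mentioned above: each atom pair in the assignment has its own local score in the converged matrix. For realistic molecule sizes, the brute-force search would need to be replaced by a polynomial-time assignment algorithm.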
Kernel learning for ligand-based virtual screening: discovery of a new PPARgamma agonist
- Poster presentation at the 5th German Conference on Cheminformatics: 23. CIC-Workshop, Goslar, Germany, 8-10 November 2009. We demonstrate the theoretical and practical application of modern kernel-based machine learning methods to ligand-based virtual screening by successful prospective screening for novel agonists of the peroxisome proliferator-activated receptor gamma (PPARgamma). PPARgamma is a nuclear receptor involved in lipid and glucose metabolism, and related to type-2 diabetes and dyslipidemia. Applied methods included a graph kernel designed for molecular similarity analysis, kernel principal component analysis, multiple kernel learning, and Gaussian process regression. In the machine learning approach to ligand-based virtual screening, one uses the similarity principle to identify potentially active compounds based on their similarity to known reference ligands. Kernel-based machine learning uses the "kernel trick", a systematic approach to the derivation of non-linear versions of linear algorithms like separating hyperplanes and regression. Prerequisites for kernel learning are similarity measures with the mathematical property of positive semidefiniteness (kernels). The iterative similarity optimal assignment graph kernel (ISOAK) is defined directly on the annotated structure graph, and was designed specifically for the comparison of small molecules. In our virtual screening study, its use improved results, e.g., in principal component analysis-based visualization and Gaussian process regression. Following a thorough retrospective validation using a data set of 176 published PPARgamma agonists, we screened a vendor library for novel agonists. Subsequent testing of 15 compounds in a cell-based transactivation assay yielded four active compounds. The most interesting hit, a natural product derivative with a cyclobutane scaffold, is a selective full PPARgamma agonist (EC50 = 10 ± 0.2 microM, inactive on PPARalpha and PPARbeta/delta at 10 microM).
We demonstrate how the interplay of several modern kernel-based machine learning approaches can successfully improve ligand-based virtual screening results.
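To make the role of positive semidefiniteness concrete, the sketch below computes a Gaussian process posterior mean from a kernel (Gram) matrix using a Cholesky factorization, which only succeeds when the matrix is positive definite. It is an illustration under stated assumptions, not the study's pipeline: a Gaussian (RBF) kernel on plain numeric vectors stands in for the graph kernel, and the function names, the `noise` value, and the toy data are hypothetical.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel, used here as a stand-in similarity measure."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * length_scale ** 2))


def cholesky(a):
    """Lower-triangular factor L with a = L L^T.
    Raises ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k]
                           for k in range(i + 1, n))) / low[i][i]
    return x


def gp_posterior_mean(train_x, train_y, test_x, noise=1e-2,
                      kernel=rbf_kernel):
    """Posterior mean k_*^T (K + noise * I)^{-1} y of a zero-mean GP."""
    n = len(train_x)
    gram = [[kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    weights = solve_cholesky(cholesky(gram), train_y)
    return [sum(w * kernel(x, t) for w, x in zip(weights, train_x))
            for t in test_x]


# Hypothetical toy data: one descriptor per sample, one activity value each.
train_x = [[0.0], [1.0], [2.0]]
train_y = [0.0, 1.0, 4.0]
predictions = gp_posterior_mean(train_x, train_y, train_x, noise=1e-6)
```

Any similarity measure can be passed as `kernel`; if it is not positive semidefinite, the factorization can fail, which is one practical reason the property is a prerequisite for kernel learning.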
Oxidative stress induces CHIP-mediated ubiquitination and proteasomal degradation of soluble guanylyl cyclase : oral presentation
Peter M. Schmidt
Harald HHW Schmidt
- Oxidative stress attenuates the NO-cGMP pathway, e.g. in the vascular system, through scavenging of free NO radicals by superoxide O2•-, by inactivation of soluble guanylyl cyclase (sGC) via oxidation of its central Fe2+ ion, and by down-regulation of sGC protein levels. While the former pathways are well established, the molecular mechanisms underlying the latter are still obscure. Using the oxidative sGC inhibitor ODQ, we demonstrate rapid down-regulation of sGC protein in mammalian cells. Co-incubation with the proteasomal inhibitor MG132 results in accumulation of ubiquitinated sGC, whereas the sGC activator BAY 58–2667 prevents ubiquitination. ODQ-induced down-regulation of sGC is mediated through selective ubiquitination of its β subunit, and BAY 58–2667 abrogates this effect. Ubiquitination of sGC-β is dramatically enhanced by the E3 ligase CHIP. Our data indicate that oxidative stress promotes ubiquitination of the sGC β subunit through the E3 ligase CHIP, and that the sGC activator BAY 58–2667 reverts this effect, most likely through stabilization of the heme-free β subunit. Thus the deleterious effects of oxidative stress can be counterbalanced by an activator of a key enzyme of vascular homeostasis.
Interplay of ‘induced fit’ and preorganization in the ligand induced folding of the aptamer domain of the guanine binding riboswitch
Hamid Reza Nasiri
- Riboswitches are highly structured elements in the 5′-untranslated regions (5′-UTRs) of messenger RNA that control gene expression by specifically binding to small metabolite molecules. They consist of an aptamer domain responsible for ligand binding and an expression platform. Ligand binding in the aptamer domain leads to conformational changes in the expression platform that result in transcription termination or abolish ribosome binding. The guanine riboswitch binds with high specificity to guanine and hypoxanthine and is among the smallest riboswitches described so far. The X-ray structure of its aptamer domain in complex with guanine/hypoxanthine reveals an intricate RNA fold consisting of a three-helix junction stabilized by long-range base pairing interactions. We analyzed the conformational transitions of the aptamer domain induced by binding of hypoxanthine using high-resolution NMR spectroscopy in solution. We found that the long-range base pairing interactions are already present in the free RNA and preorganize its global fold. The ligand binding core region is lacking hydrogen bonding interactions and is therefore likely to be unstructured in the absence of ligand. Mg2+ ions are not essential for ligand binding and do not change the structure of the RNA-ligand complex but stabilize the structure at elevated temperatures. We identified a mutant RNA where the long-range base pairing interactions are disrupted in the free form of the RNA but form upon ligand binding in an Mg2+-dependent fashion. The tertiary interaction motif is stable outside the riboswitch context.
Base-specific spin-labeling of RNA for structure determination
Thomas F. Prisner
Joachim W. Engels
- To facilitate the measurement of intramolecular distances in solvated RNA systems, a combination of spin-labeling, electron paramagnetic resonance (EPR), and molecular dynamics (MD) simulation is presented. The fairly rigid spin label 2,2,5,5-tetramethyl-pyrrolin-1-yloxyl-3-acetylene (TPA) was introduced base- and site-specifically into RNA through an on-column Sonogashira palladium-catalyzed cross-coupling. For this purpose, 5-iodo-uridine, 5-iodo-cytidine and 2-iodo-adenosine phosphoramidites were synthesized and incorporated into RNA sequences. The main advantage of applying the recently developed ACE® chemistry was that it limited the reduction of the nitroxide to an amine during automated oligonucleotide synthesis, and thus substantially increased the reliability of the synthesis and the yield of labeled oligonucleotides. 4-Pulse Electron Double Resonance (PELDOR) was then successfully used to measure the intramolecular spin–spin distances in six doubly labeled RNA duplexes. Comparison of these results with our previous work on DNA showed that A- and B-form can be differentiated. Using an all-atom force field with explicit solvent, MD simulations gave results in good agreement with the measured distances and indicated that the RNA A-form was conserved despite a local destabilization effect of the nitroxide label. The applicability of the method to more complex biological systems is discussed.
David versus Goliath : how viruses outwit the immune system
- Infections with herpesviruses have been known since antiquity. Hippocrates, for example, already described herpes simplex lesions spreading across the skin in his "Corpus Hippocraticum" and gave the disease the name it still bears today. It is also documented that, about 2000 years ago, the Roman emperor Tiberius banned kissing at public ceremonies by decree during an outbreak of herpes labialis. Shakespeare, too, was well acquainted with the periodically recurring herpes blisters; in his play "Romeo and Juliet", Mercutio says to Romeo: "O’er ladies lips, who straight on kisses dream, which oft the angry Mab with blisters plagues, …." Yet the viral origin of the disease was not recognized until the 1960s.
"Taking Freud's path in reverse" : Eric Kandel searches for memory and argues for a biology of mind ; [review]
Predicting olfactory receptor neuron responses from odorant structure
Marien De Bruyne
- Background: Olfactory receptors work at the interface between the chemical world of volatile molecules and the perception of scent in the brain. Their main purpose is to translate chemical space into information that can be processed by neural circuits. Assuming that these receptors have evolved to cope with this task, the analysis of their coding strategy promises to yield valuable insight into how to encode chemical information efficiently. Results: We mimicked olfactory coding by modeling responses of primary olfactory neurons to small molecules using a large set of physicochemical molecular descriptors and artificial neural networks. We then tested these models by recording in vivo receptor neuron responses to a new set of odorants and successfully predicted the responses of five out of seven receptor neurons. Correlation coefficients ranged from 0.66 to 0.85, demonstrating the applicability of our approach for the analysis of olfactory receptor activation data. The molecular descriptors that are best suited for response prediction vary for different receptor neurons, implying that each receptor neuron detects a different aspect of chemical space. Finally, we demonstrate that receptor responses themselves can be used as descriptors in a predictive model of neuron activation. Conclusions: The chemical meaning of molecular descriptors helps to understand structure-response relationships for olfactory receptors and their 'receptive fields'. Moreover, it is possible to predict receptor neuron activation from chemical structure using machine-learning techniques, although this is still complicated by a lack of training data.
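As an illustration of how predicted and recorded neuron responses can be compared, the sketch below computes a correlation coefficient between two response series. It assumes the Pearson coefficient, which the abstract does not specify; the function name and the response values are hypothetical and are not data from the study.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between predicted and measured responses."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    mean_p = sum(predicted) / len(predicted)
    mean_m = sum(measured) / len(measured)
    cov = sum((p - mean_p) * (m - mean_m)
              for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0.0 or var_m == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical responses of one receptor neuron to five test odorants.
predicted = [12.0, 40.0, 5.0, 88.0, 60.0]
measured = [20.0, 35.0, 0.0, 95.0, 45.0]
r = pearson_r(predicted, measured)
```

Evaluating on odorants that were not used for training, as described above, is what makes such a coefficient a measure of predictive rather than descriptive performance.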
Optimization and antiviral analysis of peptide ligands for the HIV-1 packaging signal PSI
- Oral presentations. Background: We selected peptide ligands for the HIV-1 packaging signal PSI by screening phage-displayed peptide libraries. Peptide ligands were optimized by screening spot synthesis peptide membranes. The aim of this study is the functional characterization of these peptide ligands with respect to inhibition of HIV-1 replication. Methods: Phage-displayed peptide libraries were screened with PSI-RNA structures. The Trp-rich peptide motifs were optimized for specific binding on spot synthesis peptide membranes. The best binding peptide was expressed intracellularly in fusion with RFP or linked to a protein transduction domain (PTD) for intracellular delivery. The effects on virion production were analyzed using pseudotyped lentiviral particles. Results: After positive and negative selection rounds, phages binding specifically to PSI-RNA were identified by ELISA. Peptide inserts contained conserved motifs of aromatic amino acids known to be implicated in binding of PSI-RNA by the natural Gag ligand. The filter assay identified HKWPWW as the best binding ligand for PSI-RNA, which is delivered into several cell lines by addition of a PTD. Compared to a control peptide, the HKWPWW peptide inhibited HIV-1 replication, as deduced from reduced titers of culture supernatants. As HKWPWW also binds to the TAR-RNA, like the natural nucleocapsid PSI-RNA ligand, the effect on Tat-TAR inhibition will also be analyzed. T-cell lines that stably express HKWPWW or a control peptide are currently being established; they will be infected with HIV-1 to monitor the ability of HKWPWW to inhibit wild-type HIV-1 replication. Conclusion: The selection of a peptide ligand for PSI-RNA that is able to inhibit HIV-1 replication proves the suitability of phage display technology for the selection of peptides binding to RNA structures. This enables the identification of peptides serving as leads to interfere with additional targets in the HIV-1 replication cycle.
L11 domain rearrangement upon binding to RNA and thiostrepton studied by NMR spectroscopy
Hendrik R. A. Jonker
S. Kaspar Grimm
- Ribosomal proteins are assumed to stabilize specific RNA structures and promote compact folding of the large rRNA. The conformational dynamics of the protein between the bound and unbound state play an important role in the binding process. We have studied those dynamical changes in detail for the highly conserved complex between the ribosomal protein L11 and the GTPase region of 23S rRNA. The RNA domain is compactly folded into a well defined tertiary structure, which is further stabilized by the association with the C-terminal domain of the L11 protein (L11ctd). In addition, the N-terminal domain of L11 (L11ntd) is implicated in the binding of the natural thiazole antibiotic thiostrepton, which disrupts the elongation factor function. We have studied the conformation of the ribosomal protein and its dynamics by NMR in the unbound state, the RNA bound state and in the ternary complex with the RNA and thiostrepton. Our data reveal a rearrangement of the L11ntd, placing it closer to the RNA after binding of thiostrepton, which may prevent binding of elongation factors. We propose a model for the ternary L11–RNA–thiostrepton complex that is additionally based on interaction data and conformational information of the L11 protein. The model is consistent with earlier findings and provides an explanation for the role of L11ntd in elongation factor binding.