Title

Sequences and draft annotations of computationally predicted proteins from Balamuthia mandrillaris

Description

This file contains the sequences and draft annotations of computationally predicted proteins from Balamuthia mandrillaris. The sequences were reconstructed from RNA sequencing of logarithmic-phase trophozoites, the infective form of the amoeba. After clipping of the adaptor sequences, reads were quality-filtered with Trimmomatic and assembled de novo with Trinity v2.8 (k-mer = 25) and SPAdes v3.13 (k-mer = 29 and 33). In addition, the quality-filtered reads were aligned to the published B. mandrillaris genome LFUI01 with STAR v2.6 and assembled with Trinity. The three resulting assemblies were combined with EvidentialGene v19jan01 (EviGene), using BUSCO homology scores as input for the classifier. This data set consists of the EviGene ‘main’ proteins. FASTA headers are derived from the annotations predicted with Blast2GO, or with PANNZER2 where Blast2GO failed.
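Because the data set is distributed as protein FASTA with annotation-derived headers, a reader may want to separate each record's identifier from its annotation text. The following is a minimal sketch using only the Python standard library. It assumes the common FASTA convention that the first whitespace-delimited token of a header is the sequence identifier and the remainder is the description; the exact header layout of this file is not documented here, so treat that split as an assumption. The function names and the example records are hypothetical and are not taken from the data set.

```python
from typing import Iterable, Iterator, List, Tuple


def _split_header(header: str) -> Tuple[str, str]:
    """Split a header into (identifier, description).

    Assumption: the identifier is the first whitespace-delimited token
    and everything after it is free-text annotation.
    """
    parts = header.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each FASTA record."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield _split_header(header) + ("".join(chunks),)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        yield _split_header(header) + ("".join(chunks),)


# Hypothetical records for illustration only, not taken from the data set.
example_lines = [
    ">prot0001 hypothetical protein",
    "MKTAYIAK",
    "QRQISFVK",
    ">prot0002",
    "MVLSPADK",
]
records = list(read_fasta(example_lines))
lengths = {identifier: len(sequence) for identifier, _, sequence in records}
```

Since `read_fasta` accepts any iterable of lines, an open file handle for the downloaded FASTA file can be passed in place of `example_lines`. For production work, an established parser such as Biopython's `SeqIO` is likely the safer choice.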

This dataset is a byproduct of the study described in: The transcriptome of Balamuthia mandrillaris trophozoites for structure-based drug design. https://doi.org/10.1038/s41598-021-99903-8

Balamuthia mandrillaris, a pathogenic free-living amoeba (FLA), causes cutaneous lesions as well as the brain-eating disease Balamuthia granulomatous amoebic encephalitis (GAE). These diseases, and diseases caused by the other pathogenic FLA, Naegleria fowleri and Acanthamoeba species, are minimally studied. Chemotherapies for CNS disease caused by B. mandrillaris need substantial improvement. Current therapeutics are limited to a small number of drugs that were discovered in the last century through in vitro testing or identified after use in the small pool of reported survivors. Using our recently published methodology to identify potentially useful therapeutics, we screened a collection of 85 compounds that have previously been reported to have antiparasitic activity. We identified 59 compounds that impacted growth at concentrations below 220 µM. Since there is no fully annotated genome or proteome, we used RNA-Seq to determine the gene products of the specific genes potentially targeted by the compounds in B. mandrillaris trophozoites. We identified the sequences of 17 of these target genes and obtained expression clones for 15, which we validated by direct sequencing.

Publication Date

1-1-2021

Publisher

National Institutes of Health (NIH)

DOI

10.35092/yhjc.12478733.v2

Funder

National Institute of General Medical Sciences

Document Type

Data Set

Identifier

10.35092/yhjc.12478733.v2

Embargo Date

1-1-2021
