Gene Expression
We are interested in the regulation of gene expression in P. falciparum with respect to important biological processes. Two of the hallmark features of this parasite are a complex life cycle that includes both insect and vertebrate hosts, and rapidly increasing resistance to many antimalarial compounds. We have been investigating these processes by analyzing the expression of important genes in these pathways. Serial analysis of gene expression (SAGE) was first used to define the transcriptome of the P. falciparum asexual, red blood cell stage, followed by more recent analyses using microarrays. Here, high-density oligonucleotide arrays were used to assay perturbations in gene expression among drug-treated cultures.
We are also interested in determining how regulation is achieved at the transcriptional and post-transcriptional levels in these parasites. Indeed, the sequence elements necessary for control of gene expression have been identified for only a few of the predicted 5300 genes of P. falciparum, and thus we are using bioinformatic, transient transfection and electrophoretic mobility shift assays (EMSA) to study regulatory elements in gene clusters sharing similar expression profiles. Our work suggests that post-transcriptional control plays a more important role than previously believed in this system. We have also discovered antisense RNAs in P. falciparum using SAGE, and are working toward dissecting the regulation of these novel RNAs. The long-term goal of this project is to understand the complex life cycle and drug resistance in malarial parasites.
Antisense transcription
Authors: Kevin T. Militello, Ph.D., Anusha M. Gunasekera, Ph.D., Jennifer S. Sims
Antisense transcripts were first detected in the malaria system through the application of serial analysis of gene expression (SAGE) to asexual blood stage parasites. An inherent advantage of the technique lies in its ability to uncover novel ORFs that cannot be predicted from sequence information alone, as well as to provide directional data regarding tag (and hence transcript) orientation in relation to its corresponding locus. These features led to the important discovery of antisense transcripts in Plasmodium falciparum. In an initial analysis of 45 annotated ORFs, we noted that eight possessed antisense SAGE tags at relatively high counts; these included housekeeping genes such as calmodulin and ldh, as well as stage-specific ones, such as msp-3 and rap-1. Alternate methods such as strand-specific northern blots and strand-specific RT-PCR confirmed that highly abundant antisense SAGE tags do reflect transcription from the minus strand of these loci.
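The directional information carried by a tag can be illustrated with a short sketch. It assumes, for illustration only, that a tag is compared against the coding strand of a single annotated ORF: a tag found on the coding strand is called sense, and a tag whose reverse complement is found there is called antisense. The function names and the toy sequence are hypothetical and are not taken from our annotation pipeline.

```python
# Minimal sketch: classify a SAGE tag as sense or antisense relative to one ORF.
# Assumption: `coding_strand` is the 5'->3' coding-strand sequence of the locus.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tag_orientation(tag, coding_strand):
    """Return 'sense', 'antisense', 'ambiguous' or 'unmatched' for a tag."""
    tag = tag.upper()
    coding = coding_strand.upper()
    in_sense = tag in coding                          # transcript reads like the coding strand
    in_antisense = reverse_complement(tag) in coding  # transcript reads like the opposite strand
    if in_sense and in_antisense:
        return "ambiguous"
    if in_sense:
        return "sense"
    if in_antisense:
        return "antisense"
    return "unmatched"


# Toy ORF (hypothetical sequence, not a real P. falciparum gene).
toy_orf = "ATGGCCAAAGGTTTCCATGACGTTAAGCTTGGATAA"
sense_tag = "ACGTTAAGCT"
antisense_tag = reverse_complement(toy_orf[3:13])
```

In this toy case, `tag_orientation(sense_tag, toy_orf)` gives "sense" and `tag_orientation(antisense_tag, toy_orf)` gives "antisense". A real annotation would also need to handle tags that match more than one locus.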
A growing number of eukaryotic genes are now known to be regulated, at least in part, by endogenous cis-encoded RNA transcribed from the non-coding strand of their ORFs. By sequence-specifically targeting its complementary mRNA, an antisense RNA may effect its degradation or translational arrest. The biological role of antisense RNAs in P. falciparum, however, remains to be defined. We surveyed the global distribution of both sense and antisense transcripts across the asexual stage transcriptome for the first time, following the comprehensive annotation of a total of 17,245 SAGE tags, extending over a 350-fold expression range. Here, antisense RNAs were largely derived from nuclear-encoded loci, where approximately 18% of all transcripts expressed at detectable levels were found in the reverse orientation. Importantly, this abundance represents both low levels of antisense transcription from numerous genes and high levels of antisense RNAs from a few loci. The latter include genes specifically involved in translation and proteolysis functions. Interestingly, antisense RNAs were virtually absent from the mitochondrial genome. We also note that sense and antisense tag counts from single loci across the transcriptome are inversely related. Taken together, these results suggest that antisense transcription is initiated from certain groups of loci in P. falciparum and, furthermore, may play a role in regulating expression in a gene-specific manner.
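The two summary statistics described above, the share of transcripts in the reverse orientation and the inverse relationship between sense and antisense counts, can be sketched as follows. This is only an illustration: the per-locus counts are invented, and the use of a Spearman rank correlation is our assumption for the example rather than a statement of the test applied to the actual SAGE data.

```python
# Minimal sketch: antisense share and rank correlation of sense vs. antisense counts.
# All counts below are hypothetical.


def antisense_fraction(counts):
    """counts: list of (sense, antisense) tag counts per locus."""
    total_sense = sum(s for s, _ in counts)
    total_antisense = sum(a for _, a in counts)
    total = total_sense + total_antisense
    return total_antisense / total if total else 0.0


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


toy_counts = [(10, 0), (8, 1), (5, 3), (1, 9)]  # (sense, antisense) per locus
fraction = antisense_fraction(toy_counts)
rho = spearman([s for s, _ in toy_counts], [a for _, a in toy_counts])
```

A negative `rho` corresponds to the inverse relationship noted above; in this deliberately extreme toy set the ranks are perfectly reversed.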
Elucidation of the mechanism of antisense RNA synthesis in P. falciparum is critical in order to demonstrate the origin and function of these transcripts. Therefore, a systematic analysis of antisense and sense RNA synthesis was performed using direct labeling experiments. Nuclear run-on experiments with single-stranded DNA probes demonstrated that antisense RNA is synthesized in the nucleus at several genomic loci. Antisense RNA synthesis is sensitive to the potent RNA polymerase II inhibitor alpha-amanitin. Antisense and sense transcription was also detected in nuclei isolated from synchronized parasites, suggesting concurrent synthesis. In summary, our experiments directly demonstrate that antisense RNA synthesis is a common transcriptional phenomenon in P. falciparum, and is catalyzed by RNA polymerase II.
Bioinformatics
Authors: Anusha M. Gunasekera, Ph.D., Kevin T. Militello, Ph.D., Jennifer S. Sims
Bioinformatic approaches, such as the AlignACE and K-means clustering algorithms, were used to identify regulatory sequences in the genome of P. falciparum. Upstream regions of coregulated genes as well as gene families were examined, and sequence motifs over-represented in these regions were chosen for further functional study. Our lab was among the first to utilize bioinformatics to find biologically relevant motifs without a priori knowledge. Functional examination of motifs involved in potential regulation of heat shock and chloroquine-responsive genes has been carried out.
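The notion of a motif being over-represented in a set of upstream regions can be made concrete with a small sketch. Note that this is not AlignACE, which discovers motifs by Gibbs sampling; it is a simpler stand-in that tests a motif that is already known, using a hypergeometric tail probability. The gene names, sequences, and the `both_strands` option are assumptions made for the example.

```python
# Minimal sketch: is a given motif enriched in the upstream regions of a gene set?
# Swapped-in technique: hypergeometric test on presence/absence (not AlignACE).
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_motif(sequence, motif, both_strands=False):
    """True if the motif occurs in the sequence (optionally on either strand)."""
    sequence, motif = sequence.upper(), motif.upper()
    if motif in sequence:
        return True
    return both_strands and reverse_complement(motif) in sequence


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the motif."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def motif_enrichment(motif, upstream, gene_set, both_strands=False):
    """upstream: dict gene -> upstream sequence; gene_set: genes of interest."""
    carriers = {g for g, seq in upstream.items() if has_motif(seq, motif, both_strands)}
    N, K, n = len(upstream), len(carriers), len(gene_set)
    k = len(carriers & set(gene_set))
    return k, K, n, N, hypergeom_tail(k, K, n, N)


# Hypothetical upstream regions: 10 genes, 3 of them in the gene set.
toy_upstream = {
    "geneA": "TTTGAGAGAATTT",
    "geneB": "AAGAGAGAACC",
    "geneC": "GAGAGAATATAT",
    "geneD": "ATATGAGAGAAT",
}
toy_upstream.update({f"bg{i}": "ATATATATATAT" for i in range(6)})
toy_set = {"geneA", "geneB", "geneC"}
result = motif_enrichment("GAGAGAA", toy_upstream, toy_set)
```

Here all three genes in the set carry the motif while only one of the seven remaining genes does, so the tail probability is small. With real upstream regions, base composition (especially the high A-T content of the parasite genome) would need to be accounted for before drawing conclusions.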
Transient transfection experiments demonstrated that both the 5' and 3' flanking regions of the heat shock 86 (hsp86) gene are required for reporter gene activity. The AlignACE algorithm was utilized to uncover over-represented sequence elements in the 5' flanking region of all eighteen P. falciparum heat shock genes, as promoter, enhancer, and untranslated elements are often found in this region in model eukaryotic organisms. The top-scoring motif identified by this analysis was a G-rich sequence element named the G-box. The hsp86 gene has two palindromic copies of the G-box located 195 base pairs upstream from the transcription initiation site, suggesting a role in transcription. A comparative genomic analysis also identified the two palindromic G-box elements upstream of the P. y. yoelii hsp90 gene. The G-box elements are required for maximal reporter gene expression in transient transfection experiments.
Microarrays were used to monitor the steady-state RNA levels of P. falciparum genes across various cell states and under various exposures to the antimalarial drug chloroquine. The K-means clustering algorithm (GeneSpring 6.0) was then used to define co-regulated gene clusters within this array data set. Clusters were examined for over-represented sequence motifs in the 5' upstream regions of genes. A single sequence element, 5'-GAGAGAA-3', was significantly over-represented among a cluster of chloroquine-responsive genes, while two additional 5' motifs, 5'-ACTATAAAGA-3' and 5'-TGCAC-3', were found among loci exhibiting shared transcript profiles across varying growth states. Following the identification of only three motifs in silico, we next determined the functional relevance of each in regulating gene expression. This was achieved by transient transfection assays utilizing reporter gene constructs and by gel retardation assays. The 5'-GAGAGAA-3' and 5'-TGCAC-3' motifs were both active in the transient transfection assay, while the 5'-GAGAGAA-3' and 5'-ACTATAAAGA-3' elements were active in the gel retardation assay. Together these studies demonstrate the utility of combining whole-genome and bioinformatic strategies, followed by standard molecular approaches aimed at functional verification, to dissect gene regulatory mechanisms in the malarial system.
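For readers unfamiliar with K-means, the clustering step can be sketched in a few lines. This is a plain Lloyd's-algorithm illustration, not the GeneSpring 6.0 implementation used for the array data; the initialization from the first k profiles, the squared Euclidean distance, and the toy expression profiles are all assumptions made for the example.

```python
# Minimal sketch of K-means (Lloyd's algorithm) on expression profiles.
# Each profile is a list of values for one gene across conditions (hypothetical data).


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(rows):
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def kmeans(profiles, k, max_iterations=100):
    """Return (labels, centroids); centroids start at the first k profiles."""
    if not 0 < k <= len(profiles):
        raise ValueError("k must be between 1 and the number of profiles")
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each gene joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_profile(members)
    return labels, centroids


toy_profiles = [
    [2.0, 2.1, 1.9],     # gene 1: up in all conditions
    [1.8, 2.0, 2.2],     # gene 2: up
    [-2.0, -1.9, -2.1],  # gene 3: down
    [-2.2, -2.0, -1.8],  # gene 4: down
    [2.1, 1.9, 2.0],     # gene 5: up
]
labels, centroids = kmeans(toy_profiles, k=2)
```

In this toy set the three "up" genes end in one cluster and the two "down" genes in the other. K-means results depend on the choice of k and on initialization, so clusters from real data are best treated as hypotheses to be tested, as was done here with the functional assays.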
Pfmdr1 Transcription
Author: Alissa Myrick, Ph.D.
The goal of this work was to commence a detailed examination of the regulation of the Plasmodium falciparum multidrug resistance gene (pfmdr1) at the transcriptional level by first determining the 5' end of its transcript and thus establishing the 3' limit of the promoter. Previous studies of pfmdr1 expression have shown that transcript levels are increased in drug-resistant isolates. However, a detailed examination of the transcriptional regulation of this gene has not been completed.
RT-PCR and 5'-RACE mapping showed that the 5' UTR has a length of 1.94 kb. A putative promoter has been identified via transient transfection. Northern analysis revealed a 2.1- to 2.7-fold increase in pfmdr1 expression in 3D7 parasites treated with 50 nM chloroquine for 6 h, confirming results from serial analysis of gene expression (SAGE). 3D7 parasites were subsequently treated with experimentally derived IC50 concentrations of mefloquine, quinine and pyrimethamine. Pfmdr1 transcript levels specifically increased 2.5-fold at 6 h in mefloquine-treated parasites and threefold in parasites treated with quinine for 30 min. There was no evidence of transcript induction in pyrimethamine-treated parasites. This is the first evidence of induction of pfmdr1 expression in sensitive cells, and it suggests a novel method of transcriptional control for this gene.
Post-transcriptional regulation of gene expression
Author: Jennifer S. Sims
During its asexual life cycle, Plasmodium falciparum develops into several distinct morphological forms, occupies various compartments in its human host, and often faces drug treatment. Microarray studies of the asexual stages show dramatic changes in the steady-state mRNA levels of many genes, suggesting that differential gene expression is important to development. However, evidence for gene-specific promoter-driven transcriptional control is limited, and such control may not be the dominant mode of gene regulation (see Transcriptional Regulation of Gene Expression). Post-transcriptional regulation may play a major role in the control of gene expression in P. falciparum.
Steady-state RNA transcript levels were measured by microarray in parasites in various cell states and under different chloroquine exposures, and clusters of co-regulated genes were identified. A short consensus motif was found in the 5'-untranslated regions of one cluster and (in repeated context) was functionally active in transient transfection assays. The motif was bound by protein as RNA, but not as DNA, in electrophoretic mobility shift assays (EMSA), suggesting that binding may play a regulatory role post-transcriptionally. Furthermore, binding to this RNA motif appears to be chloroquine-responsive, with decreased binding at increasing exposures to the drug. This project aims to identify the proteins responsible for binding this RNA motif, and to characterize the relationship between motif binding and regulation of the target transcripts in response to chloroquine exposure.
Transcription of Heat Shock Proteins
Author: Kevin T. Militello, Ph.D.
The malaria parasite, Plasmodium falciparum, has a complex life cycle requiring dramatic changes in gene expression. Nonetheless, the sequence elements necessary for gene expression are poorly understood in P. falciparum. Thus, sequences required for expression of the P. falciparum heat shock genes are being elucidated as a model system. The heat shock genes are highly transcribed in asexual, blood stage parasites, which are amenable to transient transfection analysis. Heat shock proteins may be important for parasite adaptation to new temperatures endured throughout the life cycle. Another critical reason for choosing the heat shock gene family is that its members may be coordinately regulated and thus contain common regulatory elements.
Transient transfection experiments demonstrated that both the 5' and 3' flanking regions of the heat shock 86 (hsp86) gene are required for reporter gene activity. A bioinformatic strategy identified a G-rich sequence element named the G-box in the 5' flanking region of all eighteen P. falciparum heat shock genes (see Bioinformatics section). The hsp86 gene has two palindromic copies of the G-box. The G-box elements are required for maximal reporter gene expression in transient transfection experiments. The transient transfection analysis also revealed several regions of the hsp86 gene 5' flanking sequence that are required for robust gene expression, including an upstream sequence element (USE), the region containing the transcription start site, and the 5' untranslated region. Remarkably, removal of the hsp86 5' untranslated region resulted in a loss of almost all reporter gene expression.
Our experiments demonstrate that expression of the P. falciparum hsp86 gene is dependent upon multiple sequence elements, the most notable being the G-boxes and the 5' UTR. The G-box is not homologous to known eukaryotic elements, and is one of only a few functional elements elucidated from Plasmodium species.
Transcriptional regulation of gene expression
Author: Jennifer S. Sims
Plasmodium falciparum transitions through several distinct morphological stages while replicating in human red blood cells (RBCs). Transcriptomic studies of these stages report dramatic changes in the steady-state mRNA levels of many genes, suggesting that differential gene expression is important for development. Yet, evidence of promoter-driven regulation is limited to a small number of P. falciparum genes. This project seeks to evaluate the relative influences of gene-specific and global regulation of transcription on differential gene expression during the asexual RBC life cycle.
By nuclear run-on of nuclei harvested from synchronized parasites, we observe a sharp increase in the total incorporation of radiolabeled nucleotide by late-trophozoite/early-schizont nuclei, indicating a peak in global transcriptional activity during these stages of development. A similar trend was seen among several genes when nuclear run-on-labeled RNA was hybridized to filters carrying gene-specific probes. Concurrent transcription from both the sense and antisense strands of genes was evident, in agreement with previous SAGE studies, which demonstrated transcription from both strands of many genes.
These findings support a model in which transcription during the RBC life cycle is globally regulated and occurs predominantly during a distinct period in the cycle. We propose that gene regulation in RBC-stage P. falciparum may consist of a bulk transcriptional event characteristic of the majority of genes, from which differential expression of a minority of genes is distilled by a combination of pre-transcriptional (e.g., chromatin structure), promoter-driven, and post-transcriptional mechanisms. The ongoing goals of this project include elucidating the mechanisms by which transcription is "turned on" in bulk and identifying determinants of genes whose transcription is regulated distinctly from this event.
Whole Genome Analysis - In vivo vs. in vitro
Author: Johanna P. Daily, M.D.
Fundamental questions regarding the molecular basis of virulence and immune evasion in the human malaria parasite, P. falciparum, have been only partially answered. Because of the parasite's intracellular location and complex life cycle, standard genetic approaches to the study of the pathogenesis of malaria have been limited. We used a novel approach to the identification of the biological processes involved in host-pathogen interactions, based on the analysis of in vivo P. falciparum transcripts using oligoarrays.
We demonstrate that a sufficient quantity of P. falciparum RNA transcripts can be derived from a small blood sample from infected patients for whole-genome microarray analysis. Overall, excellent correlation was observed between the transcriptomes derived from in vivo samples and in vitro samples of the ring-stage P. falciparum 3D7 reference strain. However, gene families that encode surface proteins are overexpressed in vivo. Moreover, our analysis has identified a new family of hypothetical genes that may encode surface variant antigens.
We are currently characterizing the in vivo transcriptomes from several additional clinical samples, amongst which we have preliminarily observed significant diversity, and we are currently determining the biological significance of these differences. Comparative studies of the transcriptomes derived from in vivo samples and in vitro 3D7 samples may identify important strategies used by the pathogen for survival in the human host and highlight new candidate vaccine antigens that were not previously identified through the sole use of in vitro cultures.
Whole Genome Analysis: Oligonucleotide arrays
Authors: Anusha M. Gunasekera, Ph.D., Alissa Myrick, Ph.D.
Oligonucleotide arrays are a powerful tool that can generate an expression profile of an entire genome. In collaboration with the Winzeler laboratory at the Scripps Institute in San Diego, our lab has made use of these arrays to study various aspects of the P. falciparum transcriptome. The whole-genome approach informs our studies of parasite gene expression by enabling us to map patterns of gene expression amongst multiple gene clusters. We were also able to identify putative regulatory regions among genes with similar expression profiles (see Bioinformatics section) and then test these regions utilizing functional assays.
More specifically, we performed a high-density oligonucleotide array analysis of chloroquine (CQ) treated asynchronous P. falciparum cultures to examine the drug's effects on multiple cell states. Our group previously profiled steady-state RNA levels quantitatively in a single cell state using SAGE (serial analysis of gene expression) technology. We identified 100 CQ responders among highly expressed loci in the P. falciparum transcriptome, but found that overall changes in expression under drug treatment were weak compared to those observed in other eukaryotic systems. The current array study extends our analysis to the entire genome in multiple cell states, following numerous CQ treatments. Consistent with the SAGE data, the magnitude of transcriptional changes observed among drug-treated samples in the current analysis was weak. These findings are important, given the striking contrast to similar studies in other organisms, which characteristically identify drug-responsive genes and networks with large fold-changes in expression.
In total, 600 genes were differentially regulated in response to CQ, over half of which are of unknown function. Overall, their fold change (FC) values were weak: only 3% of the responders exhibited FC values greater than 4. Differences in parasite staging appear to have a more pronounced effect on gene expression than chloroquine. Highlighting the dependence of gene expression on cell state, only 23 genes found to be differentially regulated by SAGE were also CQ responsive in this study. These data lead us to believe that there is no single signature response to CQ in Plasmodium, unlike the case of drugs such as isoniazid in Mycobacterium, for example. Collectively, the data may indicate that the primary response to chloroquine does not occur at the level of transcriptional control or RNA stability in the malarial parasite. Alternatively, we postulate that these relatively small changes in the expression of groups of related genes may be biologically significant, acting synergistically to affect cell physiology. One way to delineate such networks would be the identification of common sequence motifs in noncoding regions of related clusters that might play a role in coordinating their expression (see Bioinformatics section).
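The fold-change summary above amounts to simple arithmetic, sketched here for clarity. The symmetric definition of fold change used below (the larger of treated/control and control/treated) is an assumption for the example, since the exact definition applied to the array data is not given in this section, and the intensity values are invented.

```python
# Minimal sketch: fold change per gene and the share of genes above a cutoff.
# Assumption: fold change is symmetric, so 4-fold up and 4-fold down both give 4.0.


def fold_change(treated, control):
    if treated <= 0 or control <= 0:
        raise ValueError("signal intensities must be positive")
    ratio = treated / control
    return ratio if ratio >= 1 else 1 / ratio


def fraction_above(pairs, cutoff):
    """pairs: list of (treated, control) intensities; strict '>' comparison."""
    changes = [fold_change(t, c) for t, c in pairs]
    return sum(fc > cutoff for fc in changes) / len(changes)


# Hypothetical (treated, control) intensities for four responder genes.
toy_pairs = [(200, 100), (50, 200), (900, 100), (110, 100)]
share = fraction_above(toy_pairs, cutoff=4)
```

With these toy values the fold changes are 2, 4, 9 and 1.1, so one gene in four exceeds the cutoff of 4; the gene at exactly 4 is not counted because the comparison is strict.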
Whole Genome Analysis: SAGE
Authors: Anusha M. Gunasekera, Ph.D., Swati Patankar, Ph.D.
We have applied SAGE (serial analysis of gene expression) to P. falciparum asexual blood stages in order to characterize the parasite transcriptome. The SAGE methodology generates a comprehensive transcript profile by quantitatively and simultaneously analyzing thousands of cDNA tags from a given population. The technique is based on three experimentally confirmed principles. First, a short cDNA tag (10 bp), which is derived from a defined position within a transcript, contains sufficient information to uniquely identify that locus. Second, concatenation of several tags into a single molecule for sequencing allows parallel processing of a large number of transcripts. Finally, amplification bias is readily recognized and eliminated during the PCR steps of SAGE. As such, the relative abundance of each SAGE tag accurately reflects that of its corresponding transcript in the population.
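The first principle, a 10 bp tag taken from a defined position, can be illustrated with a small sketch. It assumes the anchoring site is CATG (the NlaIII recognition site commonly used in SAGE protocols) and that the tag is the 10 bp immediately downstream of the 3'-most site; the enzyme is not named in this section, so treat that choice as an assumption. The transcripts are invented.

```python
# Minimal sketch: derive a SAGE-style tag from a transcript and tally abundances.
# Assumptions: anchoring site CATG, tag = 10 bp directly 3' of the last site.
from collections import Counter

ANCHOR_SITE = "CATG"
TAG_LENGTH = 10


def extract_tag(transcript):
    """Return the 10 bp tag after the 3'-most anchoring site, or None."""
    sequence = transcript.upper()
    site = sequence.rfind(ANCHOR_SITE)
    if site == -1:
        return None  # no anchoring site, so the transcript yields no tag
    start = site + len(ANCHOR_SITE)
    tag = sequence[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


def tag_abundance(transcripts):
    """Percent abundance of each tag among all tags recovered."""
    tags = [extract_tag(t) for t in transcripts]
    counts = Counter(tag for tag in tags if tag is not None)
    total = sum(counts.values())
    return {tag: 100 * n / total for tag, n in counts.items()}


# Hypothetical transcript population: three copies of one mRNA, one of another.
transcript_a = "AAACATGTTTTCATGAAATTTAAATTTGG"
transcript_b = "GGCATGCCGGAATTCCGGAA"
abundance = tag_abundance([transcript_a] * 3 + [transcript_b])
```

In this toy population the tag from `transcript_a` accounts for 75 percent of all tags and the tag from `transcript_b` for 25 percent, mirroring the relative transcript abundance. In the real protocol the tags are read from sequenced concatemers rather than from full transcripts.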
We demonstrated that P. falciparum is amenable to this technique, despite the remarkably high A-T content of its genome. SAGE tags as short as 10 nucleotides were sufficient to identify parasite transcripts from both nuclear and mitochondrial genomes. Moreover, the skewed A-T content of parasite sequence did not preclude the use of enzymes that are crucial for generating representative SAGE libraries. Finally, a few modifications to DNA extraction and cloning steps of the SAGE protocol proved useful for circumventing specific problems presented by A-T rich genomes.
A truly comprehensive assessment of the malarial transcriptome requires that tags be efficiently and routinely assigned to parasite ORFs. We constructed a relational database that integrates SAGE expression data with genome sequence information in PlasmoDB (www.PlasmoDB.org) for this purpose. Comprehensive annotation of transcripts extending over a ~350-fold expression range indicates that over 1500 of the 5000 predicted ORFs in P. falciparum are expressed in mixed erythrocytic stage parasites. Transcripts present at abundances as low as 0.015 percent could be detected with this approach, confirming that a substantial and comprehensive description of the 3D7 transcriptome was obtained. Major contributors to the sense transcriptional profile of all three libraries included ORFs encoding membrane-associated proteins and factors involved in carbohydrate metabolism, mitochondrial metabolism and signal transduction. These results are likely of physiological relevance, given the unique contribution of each of these pathways to cellular life in the parasite.
In summary, we have established the necessary modifications required for the successful adaptation of SAGE in this system, and developed the bioinformatic tools essential for subsequent data compilation and analysis of P. falciparum 3D7 strain SAGE tags. The ability to quantify parallel gene expression within a population has allowed us to determine the relative contribution of different functional pathways to the parasite transcriptome. High-throughput profiling techniques in malaria can be used to address different aspects of parasite biology, which remain largely unexplored. In fact, the open platform profiling nature of SAGE has led to the important discovery of antisense transcription in this system and also allowed us to carry out a preliminary characterization of transcriptional changes influenced by drug pressure.
Here, the comprehensive annotation of SAGE libraries derived from an asexual stage population exposed to drug and its matched control was used to assess the modulation of gene expression by chloroquine. We observed a constellation of changes, with the differential regulation of over 100 transcripts, and have confirmed the data by alternate methods. A few responsive loci, including PfMDR1, have previously been implicated in the mechanism of chloroquine action/resistance. Several others, however, were derived from unexpected categories, including a large number of unknown open reading frames (ORFs), whose induction after drug exposure may provide first hints to their possible function.