In vitro propagation and genome sequencing of three ‘atypical’ Ehrlichia ruminantium isolates

Three isolates of Ehrlichia ruminantium (Kümm 2, Omatjenne and Riverside), the causative agent of heartwater in domestic ruminants, were isolated in Ixodes scapularis (IDE8) tick cell cultures using the leukocyte fraction of infected sheep blood. All stocks were successfully propagated in IDE8 cells, whereas initiation attempts using endothelial cell cultures were unsuccessful. Therefore, the new technique should be included in any attempt to isolate field strains of E. ruminantium to enhance the probability of getting E. ruminantium isolates which might not be initiated in endothelial cells. Draft genome sequences of all three isolates were generated and compared with published genomes. The data confirmed previous phylogenetic studies that these three isolates are genetically very close to each other, but distinct from previously characterised E. ruminantium isolates. Genome comparisons indicated that the gene content and genomic synteny were highly conserved, with the exception of the membrane protein families. These findings expand our understanding of the genetic diversity of E. ruminantium and confirm the distinct phenotypic and genetic characteristics shared by these three isolates.


Introduction
The intracellular rickettsial agent Ehrlichia ruminantium causes a disease commonly known as heartwater or cowdriosis. It is an infectious, non-contagious disease that mainly affects cattle, sheep, goats and some wild ruminants. It is transmitted by ticks of the genus Amblyomma and has been reported from almost all African countries south of the Sahara, from the adjacent islands of the Indian Ocean and Atlantic Ocean (Uilenberg 1983) and from some Caribbean islands (Birnie et al. 1984; Perreau et al. 1980).
The method of choice for in vitro isolation of E. ruminantium is infection of endothelial cells (Bezuidenhout, Paterson & Barnard 1985), the cell type in which organisms occur in infected animals (Cowdry 1926). However, several isolates (Du Plessis & Kümm 1971; Steyn 2009) have failed to establish in endothelial cell cultures (Bezuidenhout & Brett 1992; Bezuidenhout et al. 1988). Besides endothelial cells, tick cell lines have been used to initiate E. ruminantium in in vitro cultures and have even allowed the establishment of infection directly from the leukocytes of sheep blood (Zweygarth, Josemans & Steyn 2008). Therefore, attempts were made to isolate 'atypical' E. ruminantium stocks in tick cells, atypical in the sense that they could not be initiated by the classical method of infecting endothelial cells (Bezuidenhout et al. 1985; Byrom et al. 1991).
Since the early reports of the E. ruminantium Omatjenne and Kümm stocks, it has been clear that these stocks share many phenotypic and genetic characteristics, but differ from all other isolates (Allsopp et al. 1997, 2001; Du Plessis 1985, 1990). The Kümm stock was prepared from a goat diagnosed with heartwater in the Northern Province of South Africa, a heartwater-endemic area (Du Plessis & Kümm 1971). Sheep injected with a lymph node suspension from this goat developed heartwater symptoms. After more than 100 passages in mice, the stock was still pathogenic in mice, sheep and goats, but non-pathogenic to cattle (Du Plessis 1982). Many attempts were made to culture this organism in endothelial cells; however, it was only established in culture in 2002 using different monocyte cell lines (Zweygarth et al. 2002). It was observed that the Kümm stock comprised two 16S rRNA (16S ribosomal ribonucleic acid) genotypes: a 16S genotype typical of West African isolates (Kümm1), isolated in a canine macrophage-monocyte cell line (DH82), and a 16S genotype identical to E. ruminantium (Omatjenne) (Kümm2), isolated in a sheep blood mononuclear cell line (E2).
The Omatjenne genotype originated from the farm Omatjenne in the Otjiwarongo district of Namibia, a heartwater- and Amblyomma-free area (Du Plessis 1990). Healthy cattle on this farm reacted positively to E. ruminantium antigen in an indirect fluorescent antibody (IFA) test. Subsequently, ticks were collected from cattle on the farm and homogenates of individual ticks were injected into mice. The serum of a single mouse, inoculated with homogenate prepared from a Hyalomma truncatum tick, tested positive in the IFA test. The original infective agent was non-pathogenic to mice, calves and sheep. Only after passaging through three generations of A. hebraeum did it become fatal to sheep and mice (Du Plessis 1990). The organisms observed in brain smears of the sheep closely resembled those of the Kümm stock: fewer colonies of small size compared with those typically observed in animals infected with other E. ruminantium isolates.
Both stocks were atypical in that they were highly pathogenic to mice, but apparently non-pathogenic to cattle, and could not be cultured in endothelial cells. The Kümm stock was also described as atypical in that it infected mouse peritoneal macrophages (Du Plessis 1982). Because of the differences in pathogenicity and anomalous behaviour in cell culture, it was questioned whether the Kümm stock belonged to the species E. ruminantium (Du Plessis 1982). Likewise, Allsopp et al. (1997) suggested that E. ruminantium (Omatjenne) (then Cowdria ruminantium [Omatjenne]), not to be confused with Ehrlichia sp. (Omatjenne) later renamed Anaplasma sp. (Omatjenne), may belong to a different species because of its difference in vector specificity and virulence.
Phylogenetic studies revealed that all E. ruminantium stocks analysed routinely grouped into one of two major clades, a West African clade and a southern/eastern African clade, except for Kümm2 and/or Omatjenne, which clustered either as a unique group or in one of the major clades (Allsopp et al. 1997, 2001; Steyn 2009; Van Heerden et al. 2004). Isolates from several other geographical areas of Africa, the Indian Ocean islands and the Caribbean cluster with the southern and eastern African isolates and can be included in a worldwide clade (Cangi et al. 2016). All these studies, however, were conducted with a limited number of genes, which do not necessarily allow the identification of recombination events. Furthermore, only small numbers of E. ruminantium isolates with different genotypes have been isolated in cell culture, which limits studies that link variation in DNA sequence to phenotypic characteristics. Therefore, there is a need to establish more isolates in cell culture to enable experiments that determine virulence and cross-protection between isolates, and to produce whole genome sequences.
This study reports on the isolation of 'atypical' E. ruminantium in tick cell cultures. In addition, we generated draft genome sequences of all three 'atypical' isolates and determined the differences between them and previously characterised E. ruminantium genomes.

Infective agents
Three stocks of E. ruminantium were used. The Kümm2 genotype (Zweygarth et al. 2002), which was derived from the Kümm stock (Du Plessis 1982), was originally isolated from a naturally infected goat in Rust de Winter (Limpopo Province, South Africa). The E. ruminantium Omatjenne isolate was isolated from a single H. truncatum tick collected from a heartwater- and Amblyomma-free area of Namibia (Du Plessis 1990). Its complete 16S rDNA (16S ribosomal RNA gene) sequence was determined and submitted to GenBank™ (accession number C. ruminantium [Omatjenne] U03776) (Allsopp et al. 1997). The original inocula from which the Kümm and Omatjenne stocks were isolated are no longer available, and the complete history of the blood stabilates used in this study is unknown. The third isolate was derived from the blood of a sick angora goat from the farm Riverside (26.83°E, 33.45°S; Grahamstown, Eastern Cape Province, South Africa) (Steyn 2009).

Infection of Ixodes scapularis (IDE8) cell cultures
Each of the E. ruminantium stocks was used to infect Merino sheep by intravenous injection of 5 mL of an infectious blood stabilate. The sheep subsequently developed symptoms associated with E. ruminantium infection. The body temperature of each sheep was monitored daily and a blood sample was collected when it exceeded 41.0 °C. Blood collected at the peak febrile reaction was used to initiate cell cultures according to the method described by Zweygarth et al. (2008). Thereafter the sheep were treated with tetracycline.
Blood was collected by venipuncture into sterile Vac-u-test® tubes containing heparin (lithium heparin, 14.3 United States Pharmacopoeia (USP) units per mL blood) as anticoagulant and stored on ice. The cooled blood was centrifuged (800 × g; 10 min; 4 °C) and the buffy coat collected and washed with cold physiological phosphate-buffered saline (PBS). The buffy coat was re-collected, and the red blood cells were lysed for approximately 30 seconds in 18 mL sterile distilled water followed by the addition of 2 mL of 10-fold concentrated Hanks' balanced salt solution to restore physiological tonicity. The lysate was centrifuged for 5 min at 290 × g at 18 °C. The resulting cell pellet was re-suspended in 5 mL DF-12 and distributed into 25 cm² culture flasks containing IDE8 cells. The cultures were incubated at 32 °C.
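The restoration of tonicity in the lysis step follows from a simple dilution calculation. As a worked check, and assuming the small volume of the buffy coat itself is negligible (an assumption not stated in the protocol), the final salt concentration relative to 1-fold Hanks' balanced salt solution is:

```latex
C_{\text{final}}
  = \frac{V_{\text{HBSS}} \, C_{\text{HBSS}}}{V_{\text{water}} + V_{\text{HBSS}}}
  = \frac{2\ \text{mL} \times 10\text{-fold}}{18\ \text{mL} + 2\ \text{mL}}
  = 1\text{-fold}
```

The symbols are introduced here for illustration only; the point is that the 9:1 volume ratio brings the 10-fold concentrate back to working strength.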

DNA preparation and sequencing
The E. ruminantium elementary bodies were isolated from the cell culture material on discontinuous Percoll density gradients (Mahan et al. 1995) and the bacterial DNA was extracted with the DNeasy Blood & Tissue kit (Qiagen, GmbH, Hilden, Germany). The genomes were sequenced using Illumina Nextera paired-end libraries on the Illumina MiSeq and/or HiSeq platforms (Illumina, San Diego, CA, US).

De novo assembly
Sequencing reads were processed and assembled using CLC Genomics Workbench version 7.0 (https://www.qiagenbioinformatics.com/). Default parameters were used for quality trimming, and adapter sequences were removed. Trimmed reads < 50 bp were discarded. Several de novo assemblies using different combinations of parameters were performed for each data set. The following stringent parameters were used in all assemblies: mismatch cost, insertion cost and deletion cost of three, and length fraction and similarity fraction = 0.9. We varied the minimum contig length, whether to use global alignment and whether to perform scaffolding. Contigs < 500 bp and < 10-fold coverage were discarded; the remaining contigs were searched with the National Center for Biotechnology Information's (NCBI) nucleotide BLAST (blastn, https://blast.ncbi.nlm.nih.gov/Blast.cgi) to identify and remove contaminating data (phiX, tick, and Mycoplasma spp.).
The sequences of the contigs from different CLC assemblies for each data set were compiled and joined in GAP4 (Bonfield, Smith & Staden 1995). Discrepancies between different assemblies of the same data sets were checked and incorrect contig sequences were removed. The resulting contigs were ordered against the Welgevonden strain (CR767821) using progressiveMauve (Darling, Mau & Perna 2010) of Mauve version 2.4.0 (Darling et al. 2004).
The aligned concatenated sequences were analysed in CLC Genomics Workbench 7.0. A maximum likelihood phylogenetic tree (Unweighted Pair Group Method using Arithmetic averages [UPGMA], bootstrap analysis with 100 replicates) was created and pairwise comparisons were performed to illustrate the relationships between the E. ruminantium isolates and the other Ehrlichia and Anaplasma species.
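The read and contig thresholds described above can be illustrated with a short sketch. This is not the CLC Genomics Workbench implementation; the class and function names are hypothetical, and the sketch assumes that a contig failing either the length or the coverage threshold was discarded, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import List

MIN_READ_LEN = 50      # bp; shorter trimmed reads were discarded
MIN_CONTIG_LEN = 500   # bp
MIN_COVERAGE = 10.0    # mean fold coverage


@dataclass
class Contig:
    """Hypothetical record for one assembled contig."""
    name: str
    length: int       # contig length in bp
    coverage: float   # mean fold coverage


def keep_read(trimmed_read: str) -> bool:
    """Keep a trimmed read only if it is at least MIN_READ_LEN bases long."""
    return len(trimmed_read) >= MIN_READ_LEN


def keep_contig(contig: Contig) -> bool:
    """Assumption: a contig must pass BOTH thresholds to be retained."""
    return contig.length >= MIN_CONTIG_LEN and contig.coverage >= MIN_COVERAGE


def filter_contigs(contigs: List[Contig]) -> List[Contig]:
    """Return the contigs that pass the length and coverage filters."""
    return [c for c in contigs if keep_contig(c)]
```

Contigs that pass this filter would still need the contamination screen by blastn described above, which is not modelled here.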

Whole genome comparisons
The ordered contig sequences of each genome were concatenated and submitted to NCBI. Protein-coding genes were annotated using the NCBI Prokaryotic Genomes Annotation Pipeline (PGAP) (Tatusova et al. 2016) and the resulting GenBank files were used in whole genome comparisons with the Welgevonden and Gardel sequences. Whole-chromosome alignments were performed locally with Blastall (ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/); the tabular output option (-m 8) allowed visualisation of the alignments in the Artemis Comparison Tool (ACT) programme (Carver et al. 2005). In addition, BLAST Ring Image Generator (BRIG) (Alikhan et al. 2011) was used to compare whole genomes and subsets of genes. We used the locus tags for coding sequences (CDS) from the original annotation and publication (Collins et al. 2005).
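For readers unfamiliar with the tabular BLAST output used as input for ACT, the following minimal sketch parses the standard 12-column format (query, subject, percent identity, alignment length, mismatches, gap openings, query start and end, subject start and end, E-value, bit score). The `Hit` type and the function names are illustrative and are not part of BLAST or ACT.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    """One row of 12-column tabular BLAST output (illustrative type)."""
    query: str
    subject: str
    pident: float
    length: int
    mismatches: int
    gap_opens: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield a Hit per data row, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != 12:
            raise ValueError(f"expected 12 tab-separated fields, got {len(f)}")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
                  int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                  float(f[10]), float(f[11]))


def is_reverse(hit: Hit) -> bool:
    """In nucleotide searches, subject start > subject end marks a reverse-strand match."""
    return hit.sstart > hit.send
```

Reverse-strand matches of this kind are what appear as inverted blocks in a comparison viewer, for example when genes lie in opposite orientation in two genomes.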

Ethical consideration
Experiments were performed in accordance with the stipulations of the animal ethics committee at ARC-Onderstepoort Veterinary Research and approved by the South African Department of Agriculture, Forestry and Fisheries under Section 20 of the Animal Disease Act of 1984.

Infection of IDE8 cell cultures
Leukocytes isolated from the blood of infected sheep were used as infective inoculum. All three E. ruminantium stocks were established successfully in IDE8 cell cultures by this method (Table 1). The Kümm2 stock was detected in Giemsa-stained smears 28 days after initiation. Infection with the Omatjenne stock was first demonstrated on day 47 post-infection. The Riverside isolate was detected in stained smears of IDE8 cell cultures after 25 days. It was not possible to initiate in vitro cultures of any of the three isolates using bovine endothelial cells as host cells (data not shown). http://www.ojvr.org Open Access

Genome assembly
The average coverage of all assemblies was greater than 100-fold, and the draft genome sequences of the Kümm2, Omatjenne and Riverside isolates comprised 6, 7 and 9 contigs, respectively (Table 2). All contigs of these three genomes were successfully mapped to the reference genome (Figure 1-A1).
The total length of the joined contig sequences ranged between 1448 kilobases (kb) and 1455 kb. The remaining gaps were in repeat regions, including both tandem repeats and dispersed repeats with large repeat units.

In silico multi-locus sequence typing analysis
Multi-locus sequence typing was used to illustrate the relationship between different E. ruminantium isolates. The South African isolates (Welgevonden, Ball3, Mara87/7, Blaauwkrans and Kwanyanga), as well as the isolate from the Caribbean (Gardel), shared more than 99% identity across the eight genes selected for multi-locus sequence typing (MLST), whereas the West African isolates shared 98% identity with the South African isolates (Table 2-A1). The MLST sequences of the three 'atypical' isolates were identical, but had only 88% identity compared with other E. ruminantium isolates and were separated into a distinct clade in the phylogenetic tree (Figure 1). The other three species of Ehrlichia shared 88%-89% identity (genetic distances 0.11-0.13), whereas different Anaplasma species are more diverse with 68.5%-92.5% identity (distance of 0.08-0.15). In fact, the genetic distance between A. marginale and A. centrale (0.08, 92.5% identity) is considerably smaller than that between the 'atypical' and other E. ruminantium isolates (0.12, 88% identity).
In each genome, one set of rRNA genes, 36 transfer RNA (tRNA) genes, one transfer-messenger RNA (tmRNA) gene and two non-coding RNA (ncRNA) genes were identified.
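The relation between percentage identity and genetic distance reported above can be illustrated with a simple calculation on aligned sequences. The sketch below computes pairwise identity over ungapped alignment columns and the corresponding uncorrected distance (the proportion of differing sites). This is an assumption for illustration: the distance measure applied by the software used in this study is not specified in the text and may include a substitution-model correction.

```python
def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical bases over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


def p_distance(a: str, b: str) -> float:
    """Uncorrected distance: proportion of compared sites that differ."""
    return 1.0 - pairwise_identity(a, b)
```

Under this simple measure, 88% identity corresponds to a distance of 0.12 and 92.5% identity to 0.075, in line with the values of 0.12 and 0.08 quoted for the 'atypical' isolates and for A. marginale versus A. centrale.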
The genomes generated in this study still have gaps because of dispersed repeats and large tandem repeats, and certain of the assembled tandem repeats are missing repeat units. These facts could contribute to the smaller genome size (1.45 Mb) of the draft genomes compared with 1.5 Mb reported for the Welgevonden and Gardel sequences (Collins et al. 2005;Frutos et al. 2006). It could also explain the slightly higher protein-coding capacity calculated for the assemblies because the repeat regions are mostly located in non-coding regions.
The gene content and genomic synteny were highly conserved between the three genomes as well as compared with previously sequenced genomes. A high rate of DNA reorganisation in the terminus region is often observed between closely related bacteria, which may be associated with the mechanism of chromosome separation after replication (Hughes 2000).
Single nucleotide polymorphism (SNP) and insert or deletion

Comparison of membrane protein families
Although gene content and genomic synteny are highly conserved, variation was observed between the genomes generated in this study and those reported previously. The variation was mainly limited to genes predicted to encode hypothetical proteins or membrane proteins and specifically the members of the four membrane protein families described previously (Collins et al. 2005). We identified the orthologous families in the Kümm2, Omatjenne and Riverside genomes, and found that their arrangement and number were identical in all three genomes sequenced in this study. There were, however, variations compared with other E. ruminantium isolates. The major antigenic protein 1 (map1) family has been described in various isolates. The nucleotide sequences of the members of the map1 family in Kümm2, Omatjenne and Riverside show a high degree of similarity to Welgevonden (Figure 3a), except for map1-5. In the syntenic locus of map1-5, two smaller open reading frames (ORFs) were identified in the 'atypical' genomes (Figure 3b).
Two other families described in Collins et al. (2005), here designated membrane family 1 (Erum2240-Erum2340; Erum2400; Erum2410) and membrane family 2 (Erum2750-Erum2800; Erum3600-Erum3630), were less conserved at nucleotide level (Figure 4a) and both of these families have one member fewer compared with the Welgevonden annotation (Figure 4b). Three of the genes in membrane family 2 are in opposite orientation in the 'atypical' strains as compared with Welgevonden.
Orthologs of four predicted membrane proteins (Erum7990, Erum8000, Erum8010 and Erum8020; membrane family 3) were identified in the same relative location in the three newly sequenced genomes (Figure 4-A1). In all three genomes, however, membrane family 3 was expanded to seven members at this location, and five more members were identified 82 kb upstream (Figure 4-A1, Table 3-A1).

be attributed to other experimental factors. The experiments were conducted using uncharacterised inocula passaged many times in different animals during the pre-PCR era, when it was difficult to confirm that animals were pathogen-free. Hence, either the original Kümm stock comprised two organisms or one of them was introduced over the years during passaging. There is also a possibility that the initial Omatjenne agent is not the same as the organism we have cultured in this study. The pathogenicity and vector specificity of the cultured organisms need to be verified experimentally.
The sequences of the concatenated MLST loci of the three 'atypical' isolates were identical; however, they formed a distinct clade in tree topology. These results were confirmed by genome sequences. Although these three sequences only differed by a few SNPs and INDELs, as well as variation and small gaps in repeat regions that may be ascribed to assembly errors, they are markedly different from other E. ruminantium genomes. The 16S rRNA and map1 gene sequences identified the Omatjenne agent, and later Kümm2, as E. ruminantium in the southern African clade (Allsopp et al. 1997;Van Heerden et al. 2004). In contrast, tree topology and pairwise comparison of eight genes presented in this study may support an argument for the 'atypical' isolates to be classified as a separate species. The genetic distances and identity shared between the 'atypical' isolates and other E. ruminantium isolates are similar to the distances and percentage identity between the different species of Ehrlichia and Anaplasma. Whether these three isolates indeed represent a unique species needs to be validated.
All the E. ruminantium genomes sequenced thus far are syntenic (Frutos et al. 2006; Nakao, Jongejan & Sugimoto 2016), and it is known that Anaplasma spp. and Ehrlichia spp. share conserved gene order (Dunning Hotopp et al. 2006; Pierlé et al. 2012). The synteny is also observed for the three isolates sequenced in this study with a few exceptions in the membrane protein families. Of the four families analysed, the map1 family was the most conserved one across all E. ruminantium genomes. The paralogs are maintained in the same order in all genomes, but in place of map1-5, two small ORFs were detected in the genomes presented here. The map1-5 gene is truncated compared with the other paralogs in all isolates analysed thus far. It was also identified as one of the paralogs that undergoes balancing selection, a type of selection that is reported to maintain genetic variation in genes that are involved in evasion of host immune response (Salim et al. 2019).
Most variation was detected in membrane family 1 and membrane family 2. The nucleotide sequences differed significantly between orthologous genes, and the number, order and, in some cases, orientation of the paralogs were different in new annotations. In membrane family 3, the nucleotide sequences between orthologs were conserved, but this family was expanded from four to 12 members in the 'atypical' genomes. At present, the function of these putative proteins is unknown, and it has not been shown that all members of these two protein families are expressed. Therefore, it is not known whether these variations have an effect on the expression or function of the predicted proteins encoded by these ORFs. It is known that Anaplasma and Ehrlichia spp. present a wide range of paralogous genes encoding various functions that ensure survival in diverse host and vector environments (Dunning Hotopp et al. 2006).
Several studies have reported a high level of genetic diversity among E. ruminantium isolates (Cangi et al. 2016; Nakao et al. 2011). In contrast, here we found that the genome sequences of the three 'atypical' isolates from distant geographical areas and diverse habitats are almost identical. Excluding repeat regions, Kümm2, from the Limpopo Province of South Africa, and Omatjenne, from the much drier and heartwater-free Otjiwarongo district of Namibia, differ by only four SNPs and four INDELs. In fact, more substitutions and small deletions were detected between the parental Welgevonden strain and its daughter strain after 11-13 passages in a different cell culture environment (Frutos et al. 2006).
Although the current results have not connected any genetic variation to the phenotypes that distinguish these isolates, variations in the membrane protein families may contribute to the ability of these organisms to infect different cells.
A comprehensive SNP analysis, including all genomes sequenced, may elucidate the determinants of diversity. The synteny conservation in E. ruminantium genomes suggests that at least some of the phenotypes are associated with small polymorphisms. In view of this, we are in the process of generating whole genome sequences of all the E. ruminantium isolates we have in cell culture. In addition, we need to establish more field isolates in culture to conduct phenotypic and genotypic analyses in future work. To date, no standard cell line has been designated for isolation of E. ruminantium, and it is clear that the 'atypical' isolates cannot be easily isolated in bovine endothelial cells. We therefore recommend the concurrent use of both ruminant endothelial cells and tick cell cultures. The two methods complement each other and should be used when isolating field strains of Ehrlichia spp.
contributed to experimental laboratory work and reviewed the article. All authors read and approved the final article.

Funding information
Prof. Katherine M. Kocan, Oklahoma State University, Stillwater, OK, US, provided the IDE8 cells. This work was funded by the Gauteng Department of Agriculture and Rural Development, South Africa (project: Identification of Ehrlichia ruminantium proteins involved in host and vector cell invasion).

Data availability statement
The nucleotide sequences and annotation of the three genomes are available from NCBI (https://www.ncbi.nlm.nih.gov/genome/microbes/). The accession numbers are listed in Table 3.

Disclaimer
The views expressed in this article are those of the authors and do not necessarily reflect the official policy or position of any affiliated agency of the authors or funder.