Title

Overview

Ruminant Genome Database (RGD) is a comprehensive repository that integrates genomic, transcriptomic, epigenomic, and phenotypic data for Ruminantia. This suborder includes traditional livestock such as cattle, sheep, and goat; endangered species such as milu; arctic and desert-dwelling species such as reindeer and gemsbok; and species of biomedical interest, such as antlered deer. Ruminants are therefore of great significance to agriculture, conservation, adaptation research, and biomedicine. RGD currently hosts 78 published ruminant genomes; 1,936 RNA-seq datasets (goat: 300, sheep: 832, cattle: 461, zebu: 128, yak: 88, water buffalo: 69, roe deer: 20, and sika deer: 38); epigenomic signals predicted from 220 ruminant and 833 human epigenomic datasets; comparative genomic analysis results such as synteny blocks and orthologous gene clusters annotated with Gene Ontology terms and pathways; and trait data from AnimalQTLdb and GWAS Atlas. In addition, a set of analysis tools (including BLAT, BLAST, and Table Browser), visualization tools (the UCSC Genome Browser and geneHeatmap), and user-friendly query interfaces have been implemented in RGD to help the community use these large-scale data.

01. Phylogenetic tree

The ruminant phylogenetic tree supports a sister-group relationship between Antilocapridae and Giraffidae, as well as between Moschidae and Bovidae. Based on fossil calibrations, crown Ruminantia emerged 39.1 - 32.3 Mya (late Eocene to early Oligocene), and Pecora originated 23.3 - 20.8 Mya (Neogene).

Figure 1. Phylogeny and trait evolution of ruminants (Chen et al.).

02. Comparative Genomics and Epigenomic Signals

The UCSC Genome Browser is a feature-rich graphical viewer that is especially suited to displaying comparative genomic data. RGD provides "Genes and Gene Predictions", "Expression and Regulation", and "Comparative Genomics" tracks for the cattle (ARS-UCD1.2_Btau5.0.1Y), sheep (Oar_rambouillet_v1_0_addY), and goat (ARS1) genome assemblies. Here, we use the goat genome as a representative example. Its tracks include the improved goat gene annotation (1), gene expression profile (2), epigenomic signals (3), five conservation scores (4), and a multiple sequence alignment of 110 species (5) (78 ruminants and 32 mammalian outgroup species). Users can enter a genomic region, a gene symbol, or a transcript name in the search box and click the "go" button to visualize these data. In particular, by zooming in on the alignments to the nucleotide or amino acid level, users can find clade-specific or species-specific sequence differences related to livestock production or ruminant morphological characteristics. Users can also click the "PDF/PS" item under the "View" menu of the navigation bar to generate a high-quality image in PostScript or PDF format.

Figure 2. Main Genome Browser display page on the goat assembly (ARS1), showing tracks of gene annotations, expreBar, epigenomics signals, five conservation scores, and 110-way multiple alignment.
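The genomic region typed into the search box can be thought of as a chromosome name plus a coordinate range. The sketch below shows one way to parse such a string, assuming the UCSC-style "chrom:start-end" form with optional thousands separators. The function name parse_region, the Region type, and the chromosome name "chr1" are illustrative assumptions and not part of RGD.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive, as typed in the search box
    end: int    # 1-based, inclusive


# Assumed pattern for "chrom:start-end"; commas inside numbers are allowed.
_REGION_RE = re.compile(r"^\s*([\w.]+):([\d,]+)-([\d,]+)\s*$")


def parse_region(text: str) -> Region:
    """Parse a position string such as 'chr1:1,000-2,000' into a Region."""
    match = _REGION_RE.match(text)
    if match is None:
        raise ValueError(f"not a genomic region: {text!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {text!r}")
    return Region(chrom, start, end)


print(parse_region("chr1:1,000-2,000"))
```

Gene symbols and transcript names do not match this pattern, so a search front end could try the region parser first and fall back to a name lookup when it raises ValueError.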

Because more epigenomic data are available for cattle than for other ruminants, we used the ChromHMM software to train a chromatin state prediction model that incorporates six bovine epigenomic signals across nine tissues. The labels and colors assigned to each state are shown in Figure 3.

Figure 3. A, Definitions and abbreviations of the 15 chromatin states. B, Chromatin states displayed in the Genome Browser.

03. Gene Expression Heatmap

The expression atlas includes 1,936 RNA-seq datasets (goat: 300, sheep: 832, cattle: 461, zebu: 128, yak: 88, water buffalo: 69, roe deer: 20, and sika deer: 38). Select a species and enter a gene symbol in the search box to view the gene's expression as a heatmap and in the Genome Browser.
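As a quick consistency check, the per-species counts listed above can be summed to confirm the stated total. The dictionary below simply restates those figures; the variable names are illustrative.

```python
# Per-species RNA-seq dataset counts as listed in the text above.
rna_seq_counts = {
    "goat": 300,
    "sheep": 832,
    "cattle": 461,
    "zebu": 128,
    "yak": 88,
    "water buffalo": 69,
    "roe deer": 20,
    "sika deer": 38,
}

total = sum(rna_seq_counts.values())
print(total)  # 1936
```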

04. Gene Ontology & Pathway Annotation

Users can enter a gene symbol to retrieve three sets of results: orthologous genes, the corresponding Gene Ontology annotations, and KEGG and WikiPathways pathways.

Results:

  • Orthologous Genes includes:
    Gene symbol, Transcript name, Species, CDS Length, CDS sequence, and Protein sequence.
  • Gene Ontology includes:
    Molecular Function: GO ID, GO term, and Evidence
    Cellular Component: GO ID, GO term, and Evidence
    Biological Process: GO ID, GO term, and Evidence
    Each GO ID links to the AmiGO 2 database.
  • Pathway includes:
    KEGG pathway items and Pathway ID. Users can click the Pathway ID to view a detailed KEGG pathway figure.
    WikiPathway items. Users can click "Pathway show" to view a detailed WikiPathway figure.

05. Quantitative Trait Locus & Genome Wide Association Study

5.1 QTL

We removed QTL entries that were not anchored to a chromosome, leaving 150,617 cattle QTLs and 2,350 sheep QTLs. The QTL data can be queried in three ways: search by gene symbol, find QTLs by genome location, and find associated genes by trait name or keyword. The results list traits together with their annotated genes, which helps users assess gene function.
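Finding QTLs by genome location amounts to an interval-overlap query. The following is a minimal sketch of that idea, assuming each QTL is stored with a chromosome name and 1-based inclusive start and end coordinates. The Qtl type, the function name, and the example records are made up for illustration and do not reflect RGD's actual schema or data.

```python
from typing import List, NamedTuple


class Qtl(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    trait: str


def qtls_in_region(qtls: List[Qtl], chrom: str, start: int, end: int) -> List[Qtl]:
    """Return the QTLs on `chrom` that overlap the closed interval [start, end]."""
    return [
        q for q in qtls
        if q.chrom == chrom and q.start <= end and q.end >= start
    ]


# Made-up records for illustration only; these are not RGD data.
example_qtls = [
    Qtl("chr1", 100, 500, "trait A"),
    Qtl("chr1", 800, 900, "trait B"),
    Qtl("chr2", 100, 500, "trait C"),
]

hits = qtls_in_region(example_qtls, "chr1", 450, 850)
print([q.trait for q in hits])  # ['trait A', 'trait B']
```

Two closed intervals overlap exactly when each one starts no later than the other ends, which is the condition tested in the list comprehension.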

5.2 GWAS

We integrated GWAS data from GWAS Atlas, yielding 1,630 cattle, 539 sheep, and 38 goat GWAS entries. Users can search the GWAS data by keyword to follow up on their genes of interest.

06. Tools

6.1 Local UCSC Table Browser

The UCSC Table Browser is a powerful tool for retrieving raw data and for computing intersections and unions between data in different tracks. For basic queries, users select the clade, genome, assembly, group, track, table, regions of interest, output format, and output file name, and receive the results as tab-delimited text, optionally compressed. For advanced queries, users can filter and refine the query, intersect results from different tables, and configure the output. The Table Browser can retrieve and download all data from the cattle, sheep, and goat genome tracks for downstream analysis.
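To make the idea of intersecting two tracks concrete, the sketch below computes the overlapping pieces between two lists of intervals. It assumes BED-style 0-based, half-open coordinates and uses a simple pairwise comparison; it illustrates the concept only and is not how the Table Browser is implemented. The track contents are invented.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open


def intersect(track_a: List[Interval], track_b: List[Interval]) -> List[Interval]:
    """Return the overlapping pieces between two lists of intervals."""
    pieces = []
    for chrom_a, start_a, end_a in track_a:
        for chrom_b, start_b, end_b in track_b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # non-empty overlap
                pieces.append((chrom_a, start, end))
    return sorted(pieces)


# Invented intervals standing in for two tracks.
genes = [("chr1", 100, 200), ("chr1", 300, 400)]
peaks = [("chr1", 150, 350), ("chr2", 100, 200)]

print(intersect(genes, peaks))  # [('chr1', 150, 200), ('chr1', 300, 350)]
```

The pairwise loop is quadratic in the number of intervals, which is fine for a small example; large tracks would normally be sorted and swept, or indexed by bin.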

6.2 Blat

webBlat is a web-based version of BLAT, which was developed by Jim Kent. Users can submit a DNA, mRNA, or protein sequence to search against the genome assembly of goat (ARS1), sheep (Oar_v4.0 and Oar_rambouillet_v1.0_addY), cattle (ARS-UCD1.2_Btau5.0.1Y), zebu (UOA_Brahman_1), yak (BosGru3.0), or water buffalo (UOA_WB_1). The search returns a list of links to all genome positions that share 95% or greater identity with the input sequence. Clicking the "Gbrowse" link displays the aligned region in the Genome Browser.
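The 95% identity threshold can be illustrated with a small filter over alignment hits. This sketch assumes each hit records its counts of matching and mismatching bases and defines identity as matches divided by aligned bases. That is a simplification chosen for illustration, not BLAT's exact scoring, and the field names and hit records are hypothetical.

```python
def percent_identity(matches: int, mismatches: int) -> float:
    """Simplified identity: matching bases as a percentage of aligned bases."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    return 100.0 * matches / aligned


def filter_hits(hits, min_identity=95.0):
    """Keep hits whose simplified identity is at least `min_identity` percent."""
    return [
        h for h in hits
        if percent_identity(h["matches"], h["mismatches"]) >= min_identity
    ]


# Hypothetical hits for illustration only.
example_hits = [
    {"name": "hit1", "matches": 98, "mismatches": 2},
    {"name": "hit2", "matches": 90, "mismatches": 10},
    {"name": "hit3", "matches": 95, "mismatches": 5},
]

kept = filter_hits(example_hits)
print([h["name"] for h in kept])  # ['hit1', 'hit3']
```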

6.3 Blast

ViroBLAST is also available in RGD as an online tool. Users can enter DNA query sequences to search against the goat (ARS1), sheep (Oar_v4.0 and Oar_rambouillet_v1.0_addY), cattle (ARS-UCD1.2_Btau5.0.1Y), zebu (UOA_Brahman_1), yak (BosGru3.0), and water buffalo (UOA_WB_1) genome databases and obtain a set of high-scoring pairwise alignments.

6.4 LiftOver

RGD hosts a local instance of the LiftOver tool, which was created by the UCSC Genome Browser Group. The tool converts genome coordinates and genome annotation files between assemblies. The following conversions are available:

  • Cattle (ARS_UCD1.2_addY) --> Human (Hg38)
    Cattle (ARS_UCD1.2_addY) --> Goat (ARS1)
    Cattle (ARS_UCD1.2_addY) --> Cattle (Btau_5.0.1)
    Cattle (ARS_UCD1.2_addY) --> Cattle (UMD_3.1.1)
    Cattle (ARS_UCD1.2_addY) --> Cattle (UMD_3.1)
    Cattle (Btau_5.0.1) --> Cattle (ARS_UCD1.2_addY)
    Cattle (Btau_5.0.1) --> Cattle (UMD_3.1.1)
    Cattle (Btau_5.0.1) --> Cattle (UMD_3.1)
    Cattle (UMD_3.1.1) --> Cattle (ARS_UCD1.2_addY)
    Cattle (UMD_3.1.1) --> Cattle (Btau_5.0.1)
    Cattle (UMD_3.1.1) --> Cattle (UMD_3.1)
    Cattle (UMD_3.1) --> Cattle (ARS_UCD1.2_addY)
    Cattle (UMD_3.1) --> Cattle (Btau_5.0.1)
    Cattle (UMD_3.1) --> Cattle (UMD_3.1.1)
  • Sheep (Oar_rambouillet_v1.0) --> Human (Hg38)
    Sheep (Oar_rambouillet_v1.0) --> Sheep (Oar_v4.0)
    Sheep (Oar_rambouillet_v1.0) --> Sheep (Oar_v3.1)
    Sheep (Oar_v4.0) --> Sheep (Oar_rambouillet_v1.0)
    Sheep (Oar_v4.0) --> Sheep (Oar_v3.1)
    Sheep (Oar_v3.1) --> Sheep (Oar_rambouillet_v1.0)
    Sheep (Oar_v3.1) --> Sheep (Oar_v4.0)
  • Goat (ARS1) --> Human (Hg38)
    Goat (ARS1) --> Goat (CHIR_1.0)
    Goat (ARS1) --> Goat (CHIR_2.0)
    Goat (CHIR_2.0) --> Goat (ARS1)
    Goat (CHIR_2.0) --> Goat (CHIR_1.0)
    Goat (CHIR_1.0) --> Goat (ARS1)
    Goat (CHIR_1.0) --> Goat (CHIR_2.0)
  • Buffalo (UOA_WB_1) --> Human (Hg38)
  • Human (Hg38) --> Cattle (ARS_UCD1.2_addY)
    Human (Hg38) --> Sheep (Oar_rambouillet_v1.0)
    Human (Hg38) --> Goat (ARS1)
    Human (Hg38) --> Buffalo (UOA_WB_1)
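Conceptually, converting a coordinate between assemblies means finding the aligned block that contains the position and applying that block's offset. The sketch below illustrates this with a simplified model: ungapped blocks on the forward strand of a single chromosome. The real LiftOver tool works from UCSC chain files and also handles strand, gaps, and multiple chromosomes; the block values and function name here are invented.

```python
from typing import List, Optional, Tuple

# Each block is (source_start, target_start, size): an ungapped stretch where
# source position source_start + i aligns to target position target_start + i.
Block = Tuple[int, int, int]


def lift_position(blocks: List[Block], pos: int) -> Optional[int]:
    """Map a 0-based source position to the target assembly, or None if unmapped."""
    for source_start, target_start, size in blocks:
        if source_start <= pos < source_start + size:
            return target_start + (pos - source_start)
    return None  # position falls between blocks and cannot be converted


# Made-up blocks for illustration; real conversions use UCSC chain files.
example_blocks = [(0, 1000, 100), (150, 1100, 50)]

print(lift_position(example_blocks, 20))   # 1020
print(lift_position(example_blocks, 160))  # 1110
print(lift_position(example_blocks, 120))  # None
```

Positions that fall between blocks have no counterpart in the target assembly, which is why some coordinates cannot be converted.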

6.5 Batch Download

RGD data can be downloaded with the Table Download tool or directly from the Download page.