Gene Search

This section describes the gene annotation and the other features shown for genes of interest.

1. Gene annotation

We collected and integrated public gene information from the Ensembl Database and the NCBI Gene Database. The NCBI data were downloaded on 2022-08-26. Human genes (based on GRCh38.p13) and mouse genes (based on GRCm39) were taken from Ensembl v107.

The gene datasets contain 68,324 human gene records, 56,748 mouse gene records and 463,409 transcript records, with the following attributes:

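Dataset attributes and sources

| Dataset | Attribute | Source |
|---|---|---|
| Gene Info | Symbol | Ensembl Database |
| Gene Info | Ensembl ID | Ensembl Database |
| Gene Info | Description | Ensembl Database |
| Gene Info | Gene Type | Ensembl Database |
| Gene Info | Organism | Ensembl Database |
| Gene Info | Chromosome, Start, End, Strand | Ensembl Database |
| Gene Info | Gene Source | Ensembl Database |
| Gene Info | Gene Version | Ensembl Database |
| Gene Info | Entrez ID | NCBI Gene Database |
| Gene Info | Aliases / Gene Synonyms | NCBI Gene Database |
| Gene Info | Chromosomal Location | NCBI Gene Database |
| Gene Info | Other Designations | NCBI Gene Database |
| Gene Info | Identifiers in Other DB | NCBI Gene Database |
| Transcript | Transcript ID | Ensembl Database |
| Transcript | Name | Ensembl Database |
| Transcript | Length | Ensembl Database |
| Transcript | Type | Ensembl Database |
| Transcript | Transcription Start Sites (TSS) | Ensembl Database |
| Transcript | Refseq mRNA ID | Ensembl Database |
| Transcript | Refseq ncRNA ID | Ensembl Database |
| Transcript | Version | Ensembl Database |
| Transcript | Start - End | Ensembl Database |
| Transcript | Count | Ensembl Database |
| Transcript | Transcript Support Level (TSL) | Ensembl Database |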

The gene attributes are explained below:

| Attribute | Description |
|---|---|
| Symbol | Official short-form abbreviation for a particular gene |
| Entrez ID | Identifier for the gene in the NCBI Entrez database |
| Description | A descriptive name for the gene; the text in square brackets gives the source of the description |
| Gene Type | The gene classification from the Ensembl Database, for example protein coding, lncRNA, processed pseudogene, unprocessed pseudogene, miRNA, TEC, snRNA, misc_RNA, snoRNA and so on |
| Organism | The species the gene comes from; only two are included: Homo sapiens and Mus musculus |
| Gene Synonyms | Comma-delimited set of unofficial symbols and descriptions that have been used for the gene, from the NCBI Entrez Database |
| Other Designations | Semicolon-delimited set of alternate descriptions that have been assigned to a GeneID; '-' indicates that none is reported |
| Identifiers in Other DB | Comma-delimited set of identifiers for the gene in other databases, each in the form database:value. HGNC and MGI include 'HGNC' and 'MGI', respectively, in the value part of their identifiers, so these entries look like HGNC:HGNC:1100 (database='HGNC', value='HGNC:1100') or MGI:MGI:104537 (database='MGI', value='MGI:104537') |
| Location | Chromosome and coordinates of the gene; the start coordinate is 0-based |
| Chromosome Location | Cytogenetic location |
| Gene Version | Gene version from the Ensembl Database |
| Gene Source | The annotation source for the gene, from the Ensembl Database |
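
The database:value convention for "Identifiers in Other DB" can be illustrated with a short Python sketch. It splits each entry on the first colon only, so that values which themselves contain a colon (HGNC, MGI) stay intact. The function name `parse_db_xrefs` is hypothetical, and treating '-' as "no identifiers" is an assumption carried over from the Other Designations field; neither is part of this site's API.

```python
from typing import Dict, List


def parse_db_xrefs(field: str) -> Dict[str, List[str]]:
    """Parse a comma-delimited 'database:value' string into {database: [values]}."""
    xrefs: Dict[str, List[str]] = {}
    # Assumption: an empty field or '-' means that no identifiers are reported.
    if not field or field.strip() == "-":
        return xrefs
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the FIRST colon only, so 'HGNC:HGNC:1100' yields
        # database='HGNC' and value='HGNC:1100'.
        database, _, value = item.partition(":")
        xrefs.setdefault(database, []).append(value)
    return xrefs


human = parse_db_xrefs("HGNC:HGNC:1100")
mouse = parse_db_xrefs("MGI:MGI:104537")
print(human)  # {'HGNC': ['HGNC:1100']}
print(mouse)  # {'MGI': ['MGI:104537']}
```

Splitting on every colon instead would break the HGNC and MGI values into three parts, which is the pitfall the note in the table warns about.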

The transcript attributes are explained below:

| Attribute | Description |
|---|---|
| Transcript ID | A stable identifier for the transcript from Ensembl |
| Name | A name for the transcript from Ensembl |
| Length | Length of the transcript (bp) |
| Type | The transcript classification from the Ensembl Database, for example protein coding, lncRNA, processed pseudogene, unprocessed pseudogene, miRNA, TEC, snRNA, misc_RNA, snoRNA and so on |
| Transcription Start Sites (TSS) | The transcription start sites of the transcript |
| Refseq mRNA ID | The corresponding ID of this mRNA in NCBI's Reference Sequences (RefSeq) database |
| Refseq ncRNA ID | The corresponding ID of this non-coding RNA in NCBI's Reference Sequences (RefSeq) database |
| Version | The version of the transcript from Ensembl |
| Start - End | The start and end coordinates of the transcript |
| Count | The expression count |
| Transcript Support Level (TSL) | A method that highlights well-supported and poorly supported transcript models for users, based on the type and quality of the alignments used to annotate the transcript |

TSL levels:
  • TSL 1: A transcript where all splice junctions are supported by at least one non-suspect mRNA
  • TSL 2: A transcript where the best supporting mRNA is flagged as suspect or the support is from multiple ESTs
  • TSL 3: A transcript where the only support is from a single EST
  • TSL 4: A transcript where the best supporting EST is flagged as suspect
  • TSL 5: A transcript where no single transcript supports the model structure
  • TSL NA: A transcript that was not analysed for TSL

2. Search rules

The search lets users choose the organism and ID type of the genes of interest.

Organisms:
  • Human: genes from Homo sapiens
  • Mouse: genes from Mus musculus
  • All: genes from human or mouse
ID type:
  • Symbol: short-form abbreviation for a particular gene
  • Ensembl ID: identifier for a gene from the Ensembl (European Bioinformatics Institute and the Wellcome Trust Sanger Institute) database
  • Entrez ID: identifier for a gene from the NCBI Entrez database

The search is case-insensitive. Genes that partially match the query are also returned, with exact matches listed first.
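
The matching behaviour described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the site's implementation: "partial match" is taken to mean a substring match, the order among partial matches is simply the input order, and the function name `search_genes` and the symbol list are hypothetical.

```python
from typing import List


def search_genes(query: str, names: List[str]) -> List[str]:
    """Case-insensitive search that lists exact matches before partial ones."""
    q = query.strip().lower()
    if not q:
        return []
    # Assumption: a partial match is any name that contains the query.
    hits = [name for name in names if q in name.lower()]
    # sorted() is stable, and False sorts before True, so exact matches
    # come first while the remaining hits keep their input order.
    return sorted(hits, key=lambda name: name.lower() != q)


symbols = ["TP53BP1", "TP53", "TP53I3", "BRCA1"]
print(search_genes("tp53", symbols))  # ['TP53', 'TP53BP1', 'TP53I3']
```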

3. Spatially variable gene

Please see Identification of spatially variable gene for more details.

4. Expression Rank Score

The expression rank score is defined as the percentile of log-transformed CPM (natural logarithm) in each ST section.
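
As a rough illustration of this definition, the Python sketch below computes a percentile score per gene for one section. Several details are assumptions that the text above does not specify: counts are taken as already summed per gene over the section, a pseudocount of 1 is added before the natural logarithm (`log1p`), and the percentile is the share of genes whose value is less than or equal to the gene's own value. The function name and the gene labels are hypothetical.

```python
import bisect
import math
from typing import Dict


def expression_rank_scores(gene_counts: Dict[str, float]) -> Dict[str, float]:
    """Percentile (0-100) of ln(CPM + 1) for each gene within one ST section."""
    total = sum(gene_counts.values())
    if total == 0:
        return {gene: 0.0 for gene in gene_counts}
    # CPM = counts per million; log1p(x) = ln(x + 1) (pseudocount is an assumption).
    log_cpm = {g: math.log1p(c / total * 1e6) for g, c in gene_counts.items()}
    ordered = sorted(log_cpm.values())
    n = len(ordered)
    # Share of genes whose value is <= this gene's value, as a percentage.
    return {g: 100.0 * bisect.bisect_right(ordered, v) / n for g, v in log_cpm.items()}


section_counts = {"GeneA": 0, "GeneB": 10, "GeneC": 90}  # hypothetical counts
scores = expression_rank_scores(section_counts)
print(scores)
```

Because the logarithm is monotonic, the ranking within a section is the same as the ranking of the raw CPM values; the transform affects the values being ranked, not their order.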


Copyright © 2021-2024. College of Future Technology (CFT), Peking University. All Rights Reserved.
