Meaning of the NCBI accession symbols

Source: 动视网 | Editor: 小OO | Date: 2025-10-02 10:55:20


Categorized | Bioinformatics

Tags | NCBI, refseq, format

Detailed explanation of the NCBI RefSeq naming format

Posted on 23 April 2009 by 柳城, 4,194 views

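NCBI RefSeq (the Reference Sequence database of the US National Center for Biotechnology Information) is currently the most authoritative sequence database in the world. NCBI's Reference Sequence project (RefSeq) provides reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins. The RefSeq standards provide a foundation for the functional annotation of the human genome, and a stable reference point for mutation analysis, gene expression studies, and polymorphism discovery. Because some sequences derive from transcripts produced by aberrant splicing, or from incorrect intron-exon splice predictions made computationally, the reference sequences in the database are continually being revised. Even so, NCBI RefSeq remains the most reliable database of human gene mRNA sequences currently available.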

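General RefSeq naming format: a two-letter prefix followed by an underscore ('_'). This distinguishes RefSeq accessions from other GenBank accession formats.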
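The prefix rule can be checked mechanically. The following is a minimal sketch, not an official NCBI validator: the pattern and the function name `is_refseq_accession` are assumptions made for illustration, `U12345` is a hypothetical placeholder for an accession without the RefSeq prefix, and a trailing version suffix is deliberately not handled.

```python
import re

# Assumption: two uppercase letters, an underscore, then a non-empty run of
# uppercase letters or digits. Version suffixes are not handled here.
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+$")


def is_refseq_accession(accession: str) -> bool:
    """Return True if the string has the RefSeq-style 'XX_' prefix layout."""
    return REFSEQ_PATTERN.match(accession.strip()) is not None


if __name__ == "__main__":
    # "U12345" is a hypothetical accession without the two-letter prefix.
    for acc in ["NP_015325", "NZ_ABCD12345678", "U12345"]:
        print(acc, is_refseq_accession(acc))
```

The check only looks at the shape of the string; it does not confirm that the prefix is one of the defined RefSeq prefixes or that the record exists.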

Accession	Molecule	Method @	Note	Explanation
AC_123456	Genomic	Mixed	Alternate complete genomic molecule. This prefix is used for records that are provided to reflect an alternate assembly or annotation. Primarily used for viral, prokaryotic records.	Genomic sequence, mainly viral and prokaryotic.
AP_123456	Protein	Mixed	Protein products; alternate protein record. This prefix is used for records that are provided to reflect an alternate assembly or annotation. The AP_ prefix was originally designated for bacterial proteins but this usage was changed.	Protein sequence; AP_ was originally used only for bacterial proteins.
NC_123456	Genomic	Mixed	Complete genomic molecules including genomes, chromosomes, organelles, plasmids.	Complete genomic sequence, including organelles, plasmids, etc.
NG_123456	Genomic	Mixed	Incomplete genomic region; supplied to support the NCBI genome annotation pipeline. Represents either non-transcribed pseudogenes, or larger regions representing a gene cluster that is difficult to annotate via automatic methods.	Incomplete genomic sequence.
NM_123456 NM_1234567	mRNA	Mixed	Transcript products; mature messenger RNA (mRNA) transcripts.	Mature mRNA.
NP_123456 NP_1234567	Protein	Mixed	Protein products; primarily full-length precursor products but may include some partial proteins and mature peptide products.	Full-length protein sequence; may also include partial proteins or mature peptide sequences.
NR_123456	RNA	Mixed	Non-coding transcripts including structural RNAs, transcribed pseudogenes, and others.	Non-coding RNA, pseudogenes, or others.
NT_123456	Genomic	Automated	Intermediate genomic assemblies of BAC and/or Whole Genome Shotgun sequence data.	Genomic sequence obtained by BAC or shotgun sequencing.
NW_123456 NW_1234567	Genomic	Automated	Intermediate genomic assemblies of BAC or Whole Genome Shotgun sequence data.	Genomic sequence obtained by BAC or shotgun sequencing.
NZ_ABCD12345678	Genomic	Automated	A collection of whole genome shotgun sequence data for a project. Accessions are not tracked between releases. The first four characters following the underscore (e.g. 'ABCD') identify a genome project.	'ABCD' stands for the specific genome project.
XM_123456 XM_1234567	mRNA	Automated	Transcript products; model mRNA provided by a genome annotation process; sequence corresponds to the genomic contig.	Transcript sequence.
XP_123456 XP_1234567	Protein	Automated	Protein products; model proteins provided by a genome annotation process; sequence corresponds to the genomic contig.	Protein sequence.
XR_123456	RNA	Automated	Transcript products; model non-coding transcripts provided by a genome annotation process; sequence corresponds to the genomic contig.	Non-coding transcript sequence.
YP_123456 YP_1234567	Protein	Mixed	Protein products; no corresponding transcript record provided. Primarily used for bacterial, viral, and mitochondrial records.	Protein sequence with no corresponding transcript sequence; used for bacteria, viruses, and mitochondria.
ZP_12345678	Protein	Automated	Protein products; annotated on NZ_ accessions (often via computational methods).	Protein sequence, derived from the corresponding NZ_ nucleotide sequence.
NS_123456	Genomic	Automated	Genomic records that represent an assembly which does not reflect the structure of a real biological molecule. The assembly may represent an unordered assembly of unplaced scaffolds, or it may represent an assembly of DNA sequences generated from a biological sample that may not represent a single organism.	A more complicated case.

@ Method:

Mixed: indicates the process flow includes both automated processing and expert review for some of the records; curation analysis may be provided either by NCBI staff or collaborators. In short: includes manual checking by experts.

Automated: indicates records that are not individually reviewed; updates are released in bulk for a genome. In short: automatic annotation.
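The NZ_ entry states that the first four characters after the underscore identify a genome project. A minimal sketch of extracting that code follows, assuming exactly the layout of the placeholder `NZ_ABCD12345678` (four uppercase letters, then digits); the function name `nz_project_code` is hypothetical, and accessions with any other layout simply yield `None`.

```python
import re
from typing import Optional

# Assumed layout, taken from the placeholder NZ_ABCD12345678:
# "NZ_", four uppercase letters (the project code), then one or more digits.
NZ_PATTERN = re.compile(r"^NZ_([A-Z]{4})(\d+)$")


def nz_project_code(accession: str) -> Optional[str]:
    """Return the four-letter project code of an NZ_ accession, else None."""
    match = NZ_PATTERN.match(accession)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(nz_project_code("NZ_ABCD12345678"))
    print(nz_project_code("NM_123456"))
```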

Original source for this article: http://liucheng.name/379/

RefSeq accession numbers can be distinguished from GenBank accessions by their distinct prefix format of 2 characters followed by an underscore character ('_'). For example, a RefSeq protein accession is NP_015325.

Accession	Molecule	Method @	Note
AC_123456	Genomic	Mixed	Alternate complete genomic molecule. This prefix is used for records that are provided to reflect an alternate assembly or annotation. Primarily used for viral, prokaryotic records.
AP_123456	Protein	Mixed	Protein products; alternate protein record. This prefix is used for records that are provided to reflect an alternate assembly or annotation. The AP_ prefix was originally designated for bacterial proteins but this usage was changed.
NC_123456	Genomic	Mixed	Complete genomic molecules including genomes, chromosomes, organelles, plasmids.
NG_123456	Genomic	Mixed	Incomplete genomic region; supplied to support the NCBI genome annotation pipeline. Represents either non-transcribed pseudogenes, or larger regions representing a gene cluster that is difficult to annotate via automatic methods.
NM_123456 NM_1234567	mRNA	Mixed	Transcript products; mature messenger RNA (mRNA) transcripts.
NP_123456 NP_1234567	Protein	Mixed	Protein products; primarily full-length precursor products but may include some partial proteins and mature peptide products.
NR_123456	RNA	Mixed	Non-coding transcripts including structural RNAs, transcribed pseudogenes, and others.
NT_123456	Genomic	Automated	Intermediate genomic assemblies of BAC and/or Whole Genome Shotgun sequence data.
NW_123456 NW_1234567	Genomic	Automated	Intermediate genomic assemblies of BAC or Whole Genome Shotgun sequence data.
NZ_ABCD12345678	Genomic	Automated	A collection of whole genome shotgun sequence data for a project. Accessions are not tracked between releases. The first four characters following the underscore (e.g. 'ABCD') identify a genome project.
XM_123456 XM_1234567	mRNA	Automated	Transcript products; model mRNA provided by a genome annotation process; sequence corresponds to the genomic contig.
XP_123456 XP_1234567	Protein	Automated	Protein products; model proteins provided by a genome annotation process; sequence corresponds to the genomic contig.
XR_123456	RNA	Automated	Transcript products; model non-coding transcripts provided by a genome annotation process; sequence corresponds to the genomic contig.
YP_123456 YP_1234567	Protein	Mixed	Protein products; no corresponding transcript record provided. Primarily used for bacterial, viral, and mitochondrial records.
ZP_12345678	Protein	Automated	Protein products; annotated on NZ_ accessions (often via computational methods).
NS_123456	Genomic	Automated	Genomic records that represent an assembly which does not reflect the structure of a real biological molecule. The assembly may represent an unordered assembly of unplaced scaffolds, or it may represent an assembly of DNA sequences generated from a biological sample that may not represent a single organism.

@ Method:

Mixed: indicates the process flow includes both automated processing and expert review for some of the records; curation analysis may be provided either by NCBI staff or collaborators.

Automated: indicates records that are not individually reviewed; updates are released in bulk for a genome.
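The table lends itself to a small lookup structure. The sketch below transcribes only the Molecule and Method columns as listed above; the names `PrefixInfo`, `REFSEQ_PREFIXES`, and `describe_accession` are hypothetical, and the mapping reflects this table rather than any current NCBI release.

```python
from typing import NamedTuple, Optional


class PrefixInfo(NamedTuple):
    molecule: str  # value of the "Molecule" column
    method: str    # value of the "Method" column


# Transcribed from the table above; not checked against current NCBI releases.
REFSEQ_PREFIXES = {
    "AC": PrefixInfo("Genomic", "Mixed"),
    "AP": PrefixInfo("Protein", "Mixed"),
    "NC": PrefixInfo("Genomic", "Mixed"),
    "NG": PrefixInfo("Genomic", "Mixed"),
    "NM": PrefixInfo("mRNA", "Mixed"),
    "NP": PrefixInfo("Protein", "Mixed"),
    "NR": PrefixInfo("RNA", "Mixed"),
    "NT": PrefixInfo("Genomic", "Automated"),
    "NW": PrefixInfo("Genomic", "Automated"),
    "NZ": PrefixInfo("Genomic", "Automated"),
    "XM": PrefixInfo("mRNA", "Automated"),
    "XP": PrefixInfo("Protein", "Automated"),
    "XR": PrefixInfo("RNA", "Automated"),
    "YP": PrefixInfo("Protein", "Mixed"),
    "ZP": PrefixInfo("Protein", "Automated"),
    "NS": PrefixInfo("Genomic", "Automated"),
}


def describe_accession(accession: str) -> Optional[PrefixInfo]:
    """Look up molecule type and method from the prefix before the underscore."""
    prefix, sep, _rest = accession.partition("_")
    if not sep:
        return None  # no underscore, so not a RefSeq-style accession
    return REFSEQ_PREFIXES.get(prefix)


if __name__ == "__main__":
    for acc in ["NP_015325", "XM_1234567", "NZ_ABCD12345678"]:
        print(acc, describe_accession(acc))
```

A lookup like this makes it easy to separate records whose process flow includes expert review (Mixed) from model records that are not individually reviewed (Automated), for example by filtering on `info.method`.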
