HGVS Variant Format

Reposted from: https://www.cnblogs.com/timeisbiggestboss/p/7988377.html

Symbols:

[Figure: HGVS variant format symbols]

1. An HGVS variant description consists of two parts:

1.1 The two parts are joined by a colon: reference sequence file identifier (accession.version-number) “:” actual description of a variant

For example: NG_012232.1(NM_004006.2):c.357+1G>A

NG_012232.1(NM_004006.2) is the reference sequence file identifier.

c.357+1G>A is the actual description of the variant.
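As a minimal sketch of this two-part structure, the snippet below splits an HGVS string at the first colon. The function name `split_hgvs` is made up for illustration and is not part of any HGVS library.

```python
def split_hgvs(hgvs: str) -> tuple[str, str]:
    """Split an HGVS string into (reference identifier, variant description)."""
    # The reference identifier never contains a colon, so the first one is the separator.
    reference, sep, description = hgvs.partition(":")
    if not sep or not reference or not description:
        raise ValueError(f"not a 'reference:description' string: {hgvs!r}")
    return reference, description


reference, description = split_hgvs("NG_012232.1(NM_004006.2):c.357+1G>A")
print(reference)    # NG_012232.1(NM_004006.2)
print(description)  # c.357+1G>A
```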

1.2 Detailed explanation

Reference sequence file identifier: only public files from NCBI or EBI are accepted. These include NC_# (e.g. NC_000023.10), LRG_# (e.g. LRG_199), NG_# (e.g. NG_012232.1), ENSG00000182533.6, NM_# (e.g. NM_004006.2), ENST00000343849.2, NR_# (e.g. NR_002196.1) and NP_# (e.g. NP_003997.1). Note that the number after the dot is the version number; every format except LRG_ requires a version number.
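The version rule can be checked with a short sketch like the one below. The regular expressions are assumptions fitted to the example identifiers listed above, not an official grammar, and the helper name `has_required_version` is hypothetical.

```python
import re

# Accession followed by ".<version>", e.g. NM_004006.2 or ENST00000343849.2
VERSIONED = re.compile(r"^[A-Za-z]+_?\d+\.\d+$")
# LRG identifiers carry no version, e.g. LRG_199 (optionally with a t1/p1 suffix)
LRG = re.compile(r"^LRG_\d+(?:[tp]\d+)?$")


def has_required_version(accession: str) -> bool:
    """Return True if the accession is an LRG or carries a version number."""
    if LRG.match(accession):
        return True
    return VERSIONED.match(accession) is not None


for accession in ["NC_000023.10", "LRG_199", "ENSG00000182533.6", "NM_004006"]:
    print(accession, has_required_version(accession))
```

Only the last identifier in the loop, `NM_004006`, is rejected, because it lacks the `.version` suffix.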

Actual description of a variant: this consists of two parts, the type of the reference sequence and the specific variant information.

2. Contents of the actual description of a variant

2.1. Type of reference sequence

[Figure: reference sequence types]
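As a rough guide, the one-letter prefix before the dot names the reference sequence type. The mapping below lists the prefixes as they are commonly given in the HGVS recommendations; it is an assumption added here for illustration and should be checked against varnomen.hgvs.org. The helper `reference_type` is hypothetical.

```python
# Prefix -> reference sequence type (assumed from the HGVS recommendations).
REFERENCE_TYPES = {
    "g": "linear genomic DNA",
    "m": "mitochondrial DNA",
    "c": "coding DNA",
    "n": "non-coding DNA",
    "r": "RNA",
    "p": "protein",
}


def reference_type(description: str) -> str:
    """Look up the reference sequence type from a description such as 'c.357+1G>A'."""
    prefix, sep, _ = description.partition(".")
    if not sep or prefix not in REFERENCE_TYPES:
        raise ValueError(f"unknown reference type prefix in {description!r}")
    return REFERENCE_TYPES[prefix]


print(reference_type("c.357+1G>A"))  # coding DNA
print(reference_type("p.Trp24Cys"))  # protein
```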

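2.2. Specific variant information

2.2.1. Protein:

1.1 Substitution:

Format: “prefix”“amino_acid”“position”“new_amino_acid”

LRG_199p1:p.Trp24Cys	missense	one amino acid is replaced by another
LRG_199p1:p.Trp24Ter (p.Trp24*)	nonsense	an amino acid is replaced by a stop codon
NP_003997.1:p.Cys188=	silent	the amino acid is unchanged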
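The three cases in the table can be told apart mechanically. The sketch below is one possible way to do it, assuming three-letter amino acid codes; the pattern and the function name `classify_protein_substitution` are illustrative, not taken from an HGVS tool.

```python
import re

# "p.", reference amino acid, position, then a new amino acid, "Ter"/"*", or "="
PROTEIN_SUB = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|\*|=)$")


def classify_protein_substitution(description: str) -> str:
    """Classify a protein substitution as missense, nonsense, or silent."""
    match = PROTEIN_SUB.match(description)
    if match is None:
        raise ValueError(f"not a protein substitution: {description!r}")
    ref_aa, _position, new_aa = match.groups()
    if new_aa == "=" or new_aa == ref_aa:
        return "silent"
    if new_aa in ("Ter", "*"):
        return "nonsense"
    return "missense"


print(classify_protein_substitution("p.Trp24Cys"))  # missense
print(classify_protein_substitution("p.Trp24*"))    # nonsense
print(classify_protein_substitution("p.Cys188="))   # silent
```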

1.2 Deletion:

Format: “prefix”“amino_acid(s)+position(s)_deleted”“del”

p.Ala3del	the third amino acid, Ala, is deleted
p.Ala3_Ser5del	the third through fifth amino acids (Ala3 to Ser5) are deleted

1.3 Duplication:

Format: “prefix”“amino_acid(s)+position(s)_duplicated”“dup”

p.Ala3dup (one amino acid): a duplication of amino acid Ala3, changing the sequence MetGlyAlaArgSerSerHis to MetGlyAlaAlaArgSerSerHis

1.4 Insertion:

Format: “prefix”“amino_acids+positions_flanking”“ins”“inserted_sequence”

p.His4_Gln5insAla: the insertion of amino acid Ala between amino acids His4 and Gln5, changing MetLysGlyHisGlnGlnCys to MetLysGlyHisAlaGlnGlnCys
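To make the protein deletion, duplication, and insertion rules above concrete, the sketch below applies them to sequences written in three-letter codes. The helper names are hypothetical. The deletion example reuses the duplication sequence as an assumed input, since the text gives no sequence for it.

```python
import re


def split_residues(seq: str) -> list[str]:
    """Split a three-letter-code string such as 'MetGlyAla' into residues."""
    return re.findall(r"[A-Z][a-z]{2}", seq)


def delete(seq: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive)."""
    res = split_residues(seq)
    return "".join(res[:start - 1] + res[end:])


def duplicate(seq: str, start: int, end: int) -> str:
    """Copy residues start..end (1-based, inclusive) directly after the original."""
    res = split_residues(seq)
    return "".join(res[:end] + res[start - 1:end] + res[end:])


def insert_between(seq: str, left: int, inserted: str) -> str:
    """Insert residues between positions left and left + 1 (1-based)."""
    res = split_residues(seq)
    return "".join(res[:left] + split_residues(inserted) + res[left:])


# p.Ala3dup
print(duplicate("MetGlyAlaArgSerSerHis", 3, 3))          # MetGlyAlaAlaArgSerSerHis
# p.His4_Gln5insAla
print(insert_between("MetLysGlyHisGlnGlnCys", 4, "Ala"))  # MetLysGlyHisAlaGlnGlnCys
# p.Ala3_Ser5del on an assumed sequence
print(delete("MetGlyAlaArgSerSerHis", 3, 5))             # MetGlySerHis
```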

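1.5 Frameshift: a frameshift is a special case of an insertion or deletion.

Format: “prefix”“amino_acid”“position”“new_amino_acid”“fs”“Ter”“position_termination_site”

p.Arg97ProfsTer23: a variant with Arg97 as the first amino acid changed, shifting the reading frame, replacing it with a Pro and terminating at position Ter23

Explanation: the Arg at position 97 is changed to Pro, and translation now terminates at the 23rd position counted from that site, with the changed amino acid counted as position 1.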
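The frameshift format can be taken apart with a regular expression. The sketch below is an illustration, not an official parser. The `stop_position` field rests on the assumption that the changed amino acid counts as position 1, so the new stop codon lies 22 positions after residue 97.

```python
import re

# "p.", reference amino acid, position, new amino acid, "fs", "Ter" or "*", offset
FRAMESHIFT = re.compile(
    r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})fs(?:Ter|\*)(\d+)$"
)


def parse_frameshift(description: str) -> dict:
    """Parse a frameshift such as 'p.Arg97ProfsTer23' into its components."""
    match = FRAMESHIFT.match(description)
    if match is None:
        raise ValueError(f"not a frameshift description: {description!r}")
    ref_aa, position, new_aa, ter_offset = match.groups()
    position, ter_offset = int(position), int(ter_offset)
    return {
        "ref_aa": ref_aa,
        "position": position,
        "new_aa": new_aa,
        "ter_offset": ter_offset,
        # Assumption: the changed residue is counted as 1, so subtract 1 here.
        "stop_position": position + ter_offset - 1,
    }


print(parse_frameshift("p.Arg97ProfsTer23")["stop_position"])  # 119
```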

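2.2.2 Gene (DNA)

When coding DNA is used as the reference sequence, it has its own coordinate system, defined in the following figure:

[Figure: coding DNA coordinate numbering]

2.1 Substitution:

Format: “prefix”“position_substituted”“reference_nucleotide”“>”“new_nucleotide”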

- NC_000023.10:g.33038255C>A
  
  a substitution of the C nucleotide at g.33038255 for an A
- NG_012232.1(NM_004006.1):c.93+1G>T
  
  a substitution of the G nucleotide at c.93+1 (coding DNA reference sequence) with a T
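The two substitution examples can be parsed with a small regular expression. This sketch covers only the forms shown above (a plain position, optionally followed by an intronic offset such as `+1`); the pattern and the name `parse_substitution` are assumptions made for illustration.

```python
import re

DNA_SUB = re.compile(
    r"^(?P<prefix>[gc])\."        # g. (genomic) or c. (coding DNA)
    r"(?P<position>\d+)"          # position of the substituted nucleotide
    r"(?P<offset>[+-]\d+)?"       # optional intronic offset, e.g. +1
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(hgvs: str) -> dict:
    """Parse a DNA substitution, with or without a leading reference identifier."""
    description = hgvs.rpartition(":")[2]  # keep only the part after the last colon
    match = DNA_SUB.match(description)
    if match is None:
        raise ValueError(f"not a DNA substitution: {hgvs!r}")
    return {
        "prefix": match["prefix"],
        "position": int(match["position"]),
        "offset": int(match["offset"] or 0),
        "ref": match["ref"],
        "alt": match["alt"],
    }


print(parse_substitution("NC_000023.10:g.33038255C>A"))
print(parse_substitution("NG_012232.1(NM_004006.1):c.93+1G>T"))
```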

2.2 Deletion

Format: “prefix”“position(s)_deleted”“del”

NG_012232.1:g.19_21del (several nucleotides)

a deletion of nucleotides g.19 to g.21 in the sequence AGAATCACA to AGAA___CA

2.3 Duplication

Format: “prefix”“position(s)_duplicated”“dup”

NM_004006.2:c.20_23dup (NC_000023.10:g.33229407_33229410dup)

a duplication from position c.20 to c.23 in the sequence AGAAGTAGAGG to AGAAGTAGATAGAGG

2.4 Insertion:

Format: “prefix”“positions_flanking”“ins”“inserted_sequence”

NC_000023.10:g.32862923_32862924insCCT (LRG_199t1:c.240_241insAGG)

the insertion of nucleotides CCT between nucleotides g.32862923 and g.32862924

2.5 Conversion: a stretch of sequence is replaced by another stretch of sequence from the reference genome

Format: “prefix”“positions_converted”“con”“positions_replacing_sequence”

NC_000022.10:g.42522624_42522669con42536337_42536382

conversion in exon 9 of the CYP2D6 gene replacing exon 9 nucleotides g.42522624 to g.42522669 with those of the 3’ flanking CYP2D7P1 gene, nucleotides g.42536337 to g.42536382 from the same genomic reference sequence (NC_000022.10)

2.6 Deletion-insertion

Format: “prefix”“position(s)_deleted”“delins”“inserted_sequence”

g.6775delinsGA

a deletion of nucleotide g.6775 (a T, not described), replaced by nucleotides GA, changing ..AGGCTCATT.. to ..AGGCGACATT..
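The DNA deletion, duplication, insertion, and deletion-insertion examples above can be reproduced on the short sequence snippets with plain string slicing. In this sketch the positions are local to each snippet (1-based), not the real g. or c. coordinates, and the function names are hypothetical. The insertion input is an invented sequence, since the text shows no flanking sequence for that example.

```python
def delete(seq: str, start: int, end: int) -> str:
    """Remove nucleotides start..end (1-based, inclusive)."""
    return seq[:start - 1] + seq[end:]


def duplicate(seq: str, start: int, end: int) -> str:
    """Copy nucleotides start..end directly after the original copy."""
    return seq[:end] + seq[start - 1:end] + seq[end:]


def insert(seq: str, left: int, inserted: str) -> str:
    """Insert nucleotides between positions left and left + 1."""
    return seq[:left] + inserted + seq[left:]


def delins(seq: str, start: int, end: int, inserted: str) -> str:
    """Replace nucleotides start..end with the inserted sequence."""
    return seq[:start - 1] + inserted + seq[end:]


print(delete("AGAATCACA", 5, 7))         # AGAACA  (g.19_21del example)
print(duplicate("AGAAGTAGAGG", 6, 9))    # AGAAGTAGATAGAGG  (c.20_23dup example)
print(insert("ACGT", 2, "CCT"))          # ACCCTGT  (invented input)
print(delins("AGGCTCATT", 5, 5, "GA"))   # AGGCGACATT  (g.6775delinsGA example)
```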

References:

http://varnomen.hgvs.org/recommendations/general/

http://www.sohu.com/a/158915410_603295
