Antibody Numbering Schemes [1]
IMGT
IMGT [2, 3] defines 128 possible positions for all antigen receptor types (IG and TR), and positions with the same number are intended to be structurally equivalent. In theory, these positions account for every residue, with a single insertion point between positions 111 and 112 in CDR3 for lengths exceeding 13 amino acids, e.g. 111-ABCD DCBA-112.
As the length of a CDR increases, the IMGT numbers of the added residues do not simply increment. For example, an HCDR1 of length 5 is numbered "27 28 29 37 38"; each additional residue then takes the next number from the sequence "36 30 35 31 34 32 33" rather than "30 31 32 33 34 35 36". An HCDR1 of length 6 is therefore numbered "27 28 29 36 37 38".
When filling the gap in the minimal CDR numbering (see the table below), larger numbers take precedence over smaller ones (e.g. 36 before 30, 112A before 111A).
Region | Length range | Minimal numbering | New numberings as length increases |
---|---|---|---|
CDR1 | 5-12 | 27 28 29 -- 37 38 | 36 30 35 31 34 32 33 |
CDR2 | 0-10 | -- | 56 65 57 64 58 63 59 62 60 61 |
CDR3 | 5-91 | 105 106 107 -- 116 117 | 115 108 114 109 113 110 112 111 112A 111A 112B 111B ... |
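The fill rule in the table can be sketched in Python. This is a minimal illustration, not ANARCI's or IMGT's implementation: the names `IMGT_CDR_RULES` and `imgt_cdr_labels` are hypothetical, and the sketch assumes single-letter insertion codes only, so it stops before the longest CDR3 lengths listed in the table.

```python
import string

# Minimal numbering and fill order for each CDR, copied from the table above.
IMGT_CDR_RULES = {
    "CDR1": ([27, 28, 29, 37, 38], [36, 30, 35, 31, 34, 32, 33]),
    "CDR2": ([], [56, 65, 57, 64, 58, 63, 59, 62, 60, 61]),
    "CDR3": ([105, 106, 107, 116, 117], [115, 108, 114, 109, 113, 110, 112, 111]),
}


def imgt_cdr_labels(region, length):
    """Return the IMGT position labels for a CDR of the given length."""
    minimal, fill = IMGT_CDR_RULES[region]
    extra = length - len(minimal)
    if extra < 0:
        raise ValueError(f"{region} needs at least {len(minimal)} residues")

    # Positions are taken in fill order, then reported in sequence order.
    labels = [str(p) for p in sorted(minimal + fill[:extra])]

    n_insertions = extra - len(fill)
    if n_insertions <= 0:
        return labels
    if region != "CDR3":
        raise ValueError(f"{region} has no insertion point in this sketch")

    # Insertions alternate 112A, 111A, 112B, 111B, ... so position 112
    # receives the extra one when the count is odd.
    n_112 = (n_insertions + 1) // 2
    n_111 = n_insertions // 2
    letters = string.ascii_uppercase
    if n_112 > len(letters):
        raise ValueError("single-letter insertion codes exhausted (sketch limit)")

    after_111 = ["111" + c for c in letters[:n_111]]
    before_112 = ["112" + c for c in reversed(letters[:n_112])]
    split = labels.index("111") + 1
    return labels[:split] + after_111 + before_112 + labels[split:]
```

For example, `imgt_cdr_labels("CDR1", 7)` yields 27 28 29 30 36 37 38, and `imgt_cdr_labels("CDR3", 15)` places 111A and 112A between 111 and 112.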
Kabat
The Kabat scheme [4] is defined for heavy and light antibody chains only, and positions in the two chain types are not equivalent. The maximum position number is 113 for the heavy chain and 109 for the light chain. Insertions occur only at specific positions, in both the framework and the CDRs, and are annotated with letters from A to Z, e.g. 100ABCDEFGHIJK 101.
Region | Numbering | Possible insertion codes |
---|---|---|
[H]CDR1 | 35 | ABCD |
[H]CDR2 | 52 | ABC |
[H]FR3 | 82 | ABC |
[H]CDR3 | 100 | ABCDEFGHIJK |
[L]CDR1 | 27 | ABCDEF |
[L]CDR3 | 95 | ABCDEF |
[L]FR4 | 106 | A |
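The shorthand used in this table (for example, 100ABCDEFGHIJK standing for 100, 100A, ..., 100K) can be expanded and checked with a small sketch. The helper names `expand_shorthand` and `is_valid_kabat_label` are hypothetical, and the lookup data is simply the table above plus the maximum numbers quoted in the text; it is not taken from any library.

```python
import re

# Kabat insertion sites copied from the table above: (chain, position) -> codes.
KABAT_INSERTIONS = {
    ("H", 35): "ABCD",
    ("H", 52): "ABC",
    ("H", 82): "ABC",
    ("H", 100): "ABCDEFGHIJK",
    ("L", 27): "ABCDEF",
    ("L", 95): "ABCDEF",
    ("L", 106): "A",
}
# Maximum position numbers quoted in the text.
KABAT_MAX = {"H": 113, "L": 109}


def expand_shorthand(text):
    """Expand e.g. '100ABC' into ['100', '100A', '100B', '100C']."""
    match = re.fullmatch(r"(\d+)([A-Z]*)", text)
    if match is None:
        raise ValueError(f"not a position shorthand: {text!r}")
    number, codes = match.groups()
    return [number] + [number + code for code in codes]


def is_valid_kabat_label(chain, label):
    """Check a label such as '100K' against the table for chain 'H' or 'L'."""
    match = re.fullmatch(r"(\d+)([A-Z]?)", label)
    if match is None or chain not in KABAT_MAX:
        return False
    number, code = int(match.group(1)), match.group(2)
    if not 1 <= number <= KABAT_MAX[chain]:
        return False
    if code == "":
        return True
    # An insertion code is only valid at a listed site and within its codes.
    return code in KABAT_INSERTIONS.get((chain, number), "")
```

With this table, `is_valid_kabat_label("H", "100K")` is accepted while `"100L"` is rejected, which is the HCDR3 length limit in concrete form.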
Despite its global popularity, the Kabat scheme is now known to have limitations:
- Undefined for very long HCDR3s: In the potentially very long HCDR3, insertions are numbered between residues H100 and H101 with letters up to K (i.e. 100ABCDEFGHIJK 101). We now know it is possible to have more residues than that, but there is no defined way of numbering them.
- Structurally uninformed: The Kabat numbering scheme was developed from a limited sequence dataset, without knowledge of structure. We now know that the positions at which insertions are placed in LCDR1 and HCDR1 do not match the structural insertion positions. Thus topologically equivalent residues in these loops do not get the same number.
Chothia
The Chothia scheme [5, 6, 7] is defined for heavy and light antibody chains only, and numbering in the two chain types is not equivalent. The maximum position number is 113 for the heavy chain and 109 for the light chain (the same as Kabat). Insertions occur only at specific positions, in both the framework and the CDRs, and are annotated with letters from A to Z, e.g. 100ABCDEFGH 101.
The Chothia numbering scheme is based on Kabat, but places the insertions in CDR-L1 and CDR-H1 at the structurally correct positions. This means that topologically equivalent residues in these loops do get the same label (unlike the Kabat scheme).
Region | Numbering | Possible insertion codes |
---|---|---|
[H]CDR1 | 31 | AB |
[H]CDR2 | 52 | ABC |
[H]FR3 | 82 | ABC |
[H]CDR3 | 100 | ABCDEFGHIJK |
[L]CDR1 | 30 | ABCDEF |
[L]CDR3 | 95 | ABCDEF |
[L]FR4 | 106 | A |
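The difference between the two schemes in heavy-chain CDR1 can be made concrete with a short sketch. It assumes that both schemes label the same stretch of residues (positions 31 to 35 plus insertions) and differ only in where the insertion codes are attached, so labels can be paired by sequence order; the function names are hypothetical.

```python
# Heavy-chain CDR1 insertion site and allowed codes, from the two tables above.
H1_INSERTION_SITE = {"kabat": (35, "ABCD"), "chothia": (31, "AB")}


def h1_labels(scheme, n_insertions):
    """Labels for heavy-chain positions 31-35 with n inserted residues."""
    site, codes = H1_INSERTION_SITE[scheme]
    if not 0 <= n_insertions <= len(codes):
        raise ValueError(f"{scheme} allows at most {len(codes)} insertions here")
    labels = []
    for position in range(31, 36):
        labels.append(str(position))
        if position == site:
            # Attach the insertion codes directly after the insertion site.
            labels.extend(f"{position}{code}" for code in codes[:n_insertions])
    return labels


def kabat_to_chothia_h1(n_insertions):
    """Pair Kabat and Chothia labels residue by residue (sequence order)."""
    return dict(zip(h1_labels("kabat", n_insertions),
                    h1_labels("chothia", n_insertions)))
```

With two inserted residues, Kabat labels the stretch 31 32 33 34 35 35A 35B while Chothia labels the same residues 31 31A 31B 32 33 34 35, so Kabat 32 pairs with Chothia 31A under the stated assumption.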
AHo
AHo [8] defines 149 possible positions for all antigen receptor types (IG and TR), and positions with the same number are intended to be structurally equivalent. The large number of positions is meant to account for every residue without the need to specify insertion positions.
Similar to IMGT, insertions and deletions in the AHo numbering are placed symmetrically around key positions, rather than growing from one direction. For example, the gap in immunoglobulin CDR2s is centered around position 63, resulting in an 8-residue gap in the VL domains (L59-L66, placed between L50 and L51 according to Kabat) and a 1-to-4-residue gap in VH domains (between H52 and H53 according to Kabat). For more details, please refer to the AHo official website.
AHo 1-33 is aligned to IMGT 1-33, AHo 36-119 is aligned to IMGT 34-117, AHo 139-149 to IMGT 118-128.
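The three aligned blocks translate into a simple offset lookup. The sketch below uses hypothetical function names and only the ranges stated above; AHo positions outside those blocks (34-35 and 120-138) have no IMGT counterpart given here, so the function returns `None` for them.

```python
# (aho_start, aho_end, imgt_start) for each aligned block stated above.
AHO_IMGT_BLOCKS = [(1, 33, 1), (36, 119, 34), (139, 149, 118)]


def aho_to_imgt(aho_position):
    """Map an AHo position to IMGT, or None if it lies outside the blocks."""
    for aho_start, aho_end, imgt_start in AHO_IMGT_BLOCKS:
        if aho_start <= aho_position <= aho_end:
            return imgt_start + (aho_position - aho_start)
    return None


def imgt_to_aho(imgt_position):
    """Map an IMGT position (1-128) to AHo, or None if out of range."""
    for aho_start, aho_end, imgt_start in AHO_IMGT_BLOCKS:
        imgt_end = imgt_start + (aho_end - aho_start)
        if imgt_start <= imgt_position <= imgt_end:
            return aho_start + (imgt_position - imgt_start)
    return None
```

For example, `aho_to_imgt(119)` gives 117 and `imgt_to_aho(118)` gives 139.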
CDR definition comparison
Note that these ranges include residues with insertion codes; for example, Kabat HCDR1 (31-35) includes residues 31-34 and 35ABCD (i.e. 35, 35A, 35B, 35C, and 35D).
Scheme | HCDR1 | HCDR2 | HCDR3 | LCDR1 | LCDR2 | LCDR3 |
---|---|---|---|---|---|---|
IMGT | 27-38 | 56-65 | 105-117 | 27-38 | 56-65 | 105-117 |
IMGT (Chothia-numbered) | 26-35 | 51-57 | 93-102 | 27-32 | 50-52 | 89-97 |
Kabat | 31-35 | 50-65 | 95-102 | 24-34 | 50-56 | 89-97 |
Chothia | 26-32 | 52-56 | 96-101 | 26-32 | 50-52 | 91-96 |
North (Chothia-numbered) | 23-35 | 50-58 | 93-102 | 24-34 | 49-56 | 89-97 |
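As an illustration of how such a table is typically used, the sketch below classifies a numbered position as CDR or framework. It encodes only the IMGT, Kabat, and Chothia rows (the other rows could be added the same way), the names `CDR_RANGES` and `region_of` are hypothetical, and the label is assumed to be numbered in the scheme that the chosen row refers to.

```python
import re

# CDR ranges copied from the IMGT, Kabat, and Chothia rows of the table above.
CDR_RANGES = {
    "IMGT": {"HCDR1": (27, 38), "HCDR2": (56, 65), "HCDR3": (105, 117),
             "LCDR1": (27, 38), "LCDR2": (56, 65), "LCDR3": (105, 117)},
    "Kabat": {"HCDR1": (31, 35), "HCDR2": (50, 65), "HCDR3": (95, 102),
              "LCDR1": (24, 34), "LCDR2": (50, 56), "LCDR3": (89, 97)},
    "Chothia": {"HCDR1": (26, 32), "HCDR2": (52, 56), "HCDR3": (96, 101),
                "LCDR1": (26, 32), "LCDR2": (50, 52), "LCDR3": (91, 96)},
}


def region_of(definition, chain, label):
    """Return e.g. 'HCDR1' or 'framework' for a label such as '35B'."""
    match = re.fullmatch(r"(\d+)[A-Z]?", label)
    if match is None:
        raise ValueError(f"not a position label: {label!r}")
    # Insertion codes belong to the range of their base number.
    number = int(match.group(1))
    for name, (start, end) in CDR_RANGES[definition].items():
        if name.startswith(chain) and start <= number <= end:
            return name
    return "framework"
```

For example, `region_of("Kabat", "H", "35B")` returns `"HCDR1"`, while `region_of("Kabat", "H", "36")` returns `"framework"`.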
You can find a more detailed comparison of the IMGT and Kabat/Chothia CDR numbering for VH, VK, and VL genes on the IMGT website. Here is an example with some real-world sequences.
Conserved sequences around CDRs
The following set of rules will allow you to locate the CDR region in an antibody sequence through the neighboring conserved residues.
Region | IMGT | Kabat/Chothia | Amino acids |
---|---|---|---|
Before HCDR1 | 23 | 22 | C |
After HCDR1 | 41-42 | 36-37 | WV, WI, WA |
Before HCDR2 | 50-54 | 45-49 | LEWIG |
After HCDR2 | 75-77 | 66-68 | K/R - L/I/V/F/T/A - T/S/I/A |
Before HCDR3 | 104-106 | 92-94 | CAR, CXX |
After HCDR3 | 118-121 | 103-106 | WGXG |
Before LCDR1 | 23 | 23 | C |
After LCDR1 | 41-43 | 35-37 | WYQ, WLQ, WFQ, WYL |
Before LCDR2 | 54-55 | 48-49 | IY, VY, IK, IF |
After LCDR2 | | | |
Before LCDR3 | 104 | 88 | C |
After LCDR3 | 118-121 | 98-101 | FGXG |
The bold residues are conserved in most cases, while the italic residues are less frequent.
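The heavy-chain CDR3 rows can be turned into a rough motif search. The sketch below is a heuristic only, with a hypothetical function name: it assumes the input is a single VH domain, takes the last WGXG motif as the start of FR4, treats the nearest preceding cysteine as Kabat position 92, and skips the two residues after it (Kabat 93-94) so that the returned span corresponds to the Kabat HCDR3 (95-102). The test sequence is purely illustrative.

```python
import re


def find_hcdr3_kabat(vh_sequence):
    """Heuristically extract the Kabat HCDR3 from a single VH sequence."""
    # "After HCDR3" motif from the table: W-G-X-G.
    motifs = list(re.finditer(r"WG[A-Z]G", vh_sequence))
    if not motifs:
        return None
    fr4_start = motifs[-1].start()

    # "Before HCDR3" motif: C followed by two residues (CAR, CXX).
    cys_index = vh_sequence.rfind("C", 0, fr4_start)
    if cys_index == -1:
        return None
    cdr3_start = cys_index + 3
    if cdr3_start > fr4_start:
        return None
    return vh_sequence[cdr3_start:fr4_start]


# Illustrative VH-like sequence used only to exercise the function.
example_vh = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKG"
    "RFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS"
)
hcdr3 = find_hcdr3_kabat(example_vh)
```

A cysteine inside the CDR3 itself would mislead this search, so the result should be checked against a proper numbering tool before it is relied on.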
Vernier Zone
Vernier zone residues [9] are located in the framework regions and underlie the complementarity-determining regions (CDRs). These residues can affect the conformation of the CDR loops and are therefore frequently back-mutated during CDR grafting to prevent affinity loss.
Region | Kabat/Chothia | IMGT |
---|---|---|
Before HCDR1 | H2,27-30 | H2,28-31 |
Before HCDR2 | H47-49 | H52-54 |
After HCDR2 | H67,69,71,73 | H76,78,80,82 |
Before HCDR3 | H93-94 | H105-106 |
After HCDR3 | H103-104 | H118-119 |
Before LCDR1 | L2,L4 | L2,L4 |
After LCDR1 | L35-36 | L41-42 |
Before LCDR2 | L46-49 | L52-55 |
After LCDR2 | L64,66,67,69,71 | L78,80,84,85,87 |
After LCDR3 | L98 | L118 |
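One way to use this table during CDR grafting is to list the Vernier positions at which the acceptor framework differs from the donor antibody, since those are the positions the text describes as back-mutation candidates. The sketch below is only an illustration: the function name and the input format (a dict from Kabat/Chothia position number to one-letter amino acid) are assumptions, and the residues in the usage example are made up.

```python
# Vernier zone positions (Kabat/Chothia numbering) copied from the table above.
VERNIER_KABAT = {
    "H": (2, 27, 28, 29, 30, 47, 48, 49, 67, 69, 71, 73, 93, 94, 103, 104),
    "L": (2, 4, 35, 36, 46, 47, 48, 49, 64, 66, 67, 69, 71, 98),
}


def back_mutation_candidates(chain, acceptor, donor):
    """List (position, acceptor_aa, donor_aa) where Vernier residues differ.

    `acceptor` and `donor` map position numbers to one-letter amino acids;
    positions missing from either mapping are skipped.
    """
    candidates = []
    for position in VERNIER_KABAT[chain]:
        acceptor_aa = acceptor.get(position)
        donor_aa = donor.get(position)
        if acceptor_aa is None or donor_aa is None:
            continue
        if acceptor_aa != donor_aa:
            candidates.append((position, acceptor_aa, donor_aa))
    return candidates


# Made-up residues for illustration only.
acceptor_vh = {2: "V", 48: "V", 50: "A", 71: "R"}
donor_vh = {2: "V", 48: "I", 50: "G", 71: "A"}
candidates = back_mutation_candidates("H", acceptor_vh, donor_vh)
```

Here position 50 is ignored because it is not in the Vernier zone, and position 2 is ignored because the two sequences agree.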
1. This document is adapted from the ANARCI documentation and documentation by Prof. Andrew Martin's group.
2. IMGT unique numbering for V-REGION: Lefranc, M.-P. et al., Dev. Comp. Immunol., 27, 55-77 (2003).
3. IMGT unique numbering for V-DOMAIN and V-LIKE-DOMAIN: Lefranc, M.-P. et al., Dev. Comp. Immunol., 27, 55-77 (2003).
4. Kabat, E.A. et al., In: Sequences of Proteins of Immunological Interest, NIH Publication, 91-3242 (1991).
5. Chothia, C. and Lesk, A.M., J. Mol. Biol., 196, 901-917 (1987).
6. Chothia, C. et al., Nature, 342, 877-883 (1989).
7. Al-Lazikani, B. et al., J. Mol. Biol., 273, 927-948 (1997).
8. Honegger, A. and Plückthun, A., J. Mol. Biol., 309, 657-670 (2001).
9. Foote, J. and Winter, G., J. Mol. Biol., 224, 487-499 (1992).