Antibody Optimization¶
Harness the power of advanced AI models to evaluate, screen, and optimize your lead antibodies efficiently.
If you have a single lead antibody, you can humanize it in one click and perform virtual screening on the generated sequences to select the best candidates for experimental validation.
If you have multiple lead antibodies, our task allows comprehensive scoring based on humanness, surface charge, hydrophobicity, isoelectric point, expression levels, structural confidence, and PTM risks. You can rank and screen molecules based on these scores, with intuitive color-coded visualizations at the amino acid level displayed on both sequence and structure.
If you need to modify existing or AI-generated antibodies, the Antibody Mutation Advisor provides intelligent mutation recommendations to guide your antibody engineering efforts.
Inputs¶
To submit an Antibody Optimization task, open the Project Editor and select "Antibody Optimization" from the "Antibody Design" dropdown menu.
- Antibodies: Input one or more antibodies for optimization, evaluation, or screening. Enter sequences directly or upload a FASTA file (click the upload button).
  - Antibody Name: Defaults to "Antibody". Hover over the name and click the button that appears to change it.
  - Sequence: The VH and VL sequences of each antibody.
  - Upload button: Upload a FASTA file containing two chains whose labels share the same prefix and end in "_H" and "_L". A valid example is shown above.
- Job Name: The name of the job. Note that the job name must be unique within the project.
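The "_H" / "_L" labeling convention for uploaded FASTA files can be checked before submission. The following is a minimal sketch, not the platform's own parser: the function names `parse_fasta` and `pair_chains` are hypothetical, and the sequences are made-up fragments used only for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Parse FASTA text into a dict of {record label: sequence}."""
    records = {}
    label = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-separated token as the label.
            parts = line[1:].split()
            label = parts[0] if parts else ""
            records[label] = ""
        elif label is not None:
            records[label] += line
    return records


def pair_chains(records):
    """Group records that share a prefix and end in "_H" or "_L"."""
    antibodies = defaultdict(dict)
    for label, seq in records.items():
        if label.endswith("_H"):
            antibodies[label[:-2]]["VH"] = seq
        elif label.endswith("_L"):
            antibodies[label[:-2]]["VL"] = seq
    # Keep only antibodies for which both chains were found.
    return {name: chains for name, chains in antibodies.items() if len(chains) == 2}


# Made-up fragments, for illustration only.
example = """>mAb1_H
EVQLVESGGGLVQPGG
>mAb1_L
DIQMTQSPSSLSASVG
"""
paired = pair_chains(parse_fasta(example))
```

Here `paired` maps the shared prefix `mAb1` to its VH and VL sequences. A record without a matching partner is dropped, which makes unpaired chains easy to spot before upload.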
Parameters¶
You can click the "Show Parameters" button below the job submission form to expand the parameter list and edit job parameters. The parameters for this task include:
- Scheme: The protocol to number each antibody's amino acids, aligning equivalent amino acids across different antibodies. Options include "Kabat", "IMGT", "Chothia", or "AHo".
- CDR def.: The definition of CDR regions, affecting CDR grafting and one-click humanization results. Options include "Kabat", "IMGT", "Chothia", or "North". Here we compare different CDR definitions. We recommend using "North" for one-click humanization.
- Mutator: The protocol used to mutate the starting sequence. Only applicable when the input contains a single antibody. Options include:
  - None: No mutation. The result table will show the input antibodies as the initial outputs.
  - GeoHumAb: The proprietary humanization algorithm developed by BioGeometry, which transforms the antibody into a more human-like molecule with fewer mutations. See here for details. The result table will show one humanized sequence in addition to the input sequence.
  - CDR Grafting: The traditional humanization technique that grafts the CDR regions of the input antibody onto human germline sequences. See here for details. The result table will show one humanized sequence in addition to the input sequence.
  - One-click humanization: The proprietary one-click humanization algorithm developed by BioGeometry, which performs VH-VL pairing, CDR grafting, and selective backmutation to produce multiple humanized sequences for subsequent calculations. The result table will show multiple humanized sequences in addition to the input sequence.
Result Panel¶
In the File & Job Manager panel, click the name of the task you previously created to view the task results.
Result Table¶
In the Antibody Properties table in the result panel (above), the ID, Comment, and properties of all antibody molecules in the task are listed. These antibodies may come from task inputs, or may be generated by mutation algorithms, or may be results from editing on the Edit Page. The properties listed include:
Antibody Properties
- Humanness (percentile): The humanness score of the antibody and its percentile in our antibody library. Generally, antibodies with a percentile below 10% are considered to have a high immunogenicity risk.
- Germline content: The percentage of amino acids in the antibody that are identical to the germline sequences. Generally, it should be greater than 80%.
- FR Germline content: The percentage of amino acids in the FR region of the antibody that are identical to the germline sequences. Generally, it should be greater than 85%.
- Germline allele: The germline genes of the antibody. The heavy and light chains are separated by "|", such as "IGHV1-2*02 | IGKV4-1*01".
- Optimal pH: The pH value at which the antibody structure is most stable.
- pI: The isoelectric point of the antibody calculated based on the structure.
- Positive patch (percentile): The number of positive charge patches on the antibody surface and its percentile. A percentile greater than 90% indicates a high risk of aggregation.
- Negative patch (percentile): The number of negative charge patches on the antibody surface and its percentile. A percentile greater than 90% indicates a high risk of aggregation.
- Charge symmetry (percentile): The charge symmetry of the VH and VL regions (the product of their net surface charges) and its percentile. A percentile less than 10% indicates a high risk of aggregation.
- Hydrophobic patch (percentile): The number of hydrophobic patches on the antibody surface and its percentile. A percentile less than 10% or greater than 90% indicates a high risk of aggregation.
- Expression (percentile): The expression level in CHO cells of the antibody and its percentile in our antibody library. A percentile less than 10% indicates a high risk of low expression.
- Purity (percentile): The purity of the antibody after one step of A-column purification (detected by SEC-HPLC). A percentile less than 10% indicates a high risk of low purity.
- #liabilities: The number of post-translational modification (PTM) sites and non-specific binding risk sites on the antibody.
- Energy (Amber14SB): The energy of the antibody structure calculated using the Amber14SB force field.
- plDDT: The predicted local distance difference test (pLDDT) score of the entire antibody, ranging from 0 to 100. The higher the value, the more confident the antibody structure prediction. It correlates positively with antibody structural stability.
- CDR plDDT: The pLDDT score of the CDR regions of the antibody, ranging from 0 to 100. The higher the value, the more confident the antibody CDR structure prediction.
- wpTm: The weighted TM score of the antibody VH-VL complex, ranging from 0 to 1. The higher the value, the more confident the antibody VH-VL complex structure prediction. It has a positive correlation with the VH-VL pairing propensity.
- RMSD to {{ref}}: The root mean square deviation (RMSD) of the antibody structure compared to the target structure.
- CDR RMSD to {{ref}}: The RMSD of the CDR region of the antibody compared to the target CDR region.
The two columns above do not appear by default. To display them, click the Align Structures button in the table header and specify the target structure, which also sets the content of the "{{ref}}" placeholder in the column names.
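The percentile and percentage thresholds quoted in the property list can be collected into a simple screening rule set. The sketch below is an illustration under stated assumptions: the dictionary keys (such as `humanness_pct`) are hypothetical names, not the column names of the exported CSV, and the rules only restate the thresholds given above.

```python
# Each rule: (hypothetical property key, predicate on the value, message).
# Thresholds restate the property descriptions above.
RULES = [
    ("humanness_pct", lambda p: p < 10, "high immunogenicity risk"),
    ("germline_content", lambda v: v <= 80, "germline content not above 80%"),
    ("fr_germline_content", lambda v: v <= 85, "FR germline content not above 85%"),
    ("positive_patch_pct", lambda p: p > 90, "aggregation risk (positive patches)"),
    ("negative_patch_pct", lambda p: p > 90, "aggregation risk (negative patches)"),
    ("charge_symmetry_pct", lambda p: p < 10, "aggregation risk (charge symmetry)"),
    ("hydrophobic_patch_pct", lambda p: p < 10 or p > 90,
     "aggregation risk (hydrophobic patches)"),
    ("expression_pct", lambda p: p < 10, "low expression risk"),
    ("purity_pct", lambda p: p < 10, "low purity risk"),
]


def flag_risks(props):
    """Return a message for every rule that the given properties trigger.

    Properties missing from `props` are skipped rather than flagged.
    """
    return [
        f"{key}: {message}"
        for key, is_risky, message in RULES
        if key in props and is_risky(props[key])
    ]


# Made-up values, for illustration only.
candidate = {
    "humanness_pct": 7.5,
    "germline_content": 86.0,
    "positive_patch_pct": 95.0,
    "expression_pct": 40.0,
}
flags = flag_risks(candidate)
```

For this made-up candidate, `flags` contains the humanness and positive patch warnings. Keeping the rules in one table makes it easy to adjust a threshold for a particular project.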
Each property column contains two operation buttons:
- Sort: Click the up/down triangles to sort the current column in ascending/descending order, respectively.
- Filter: Click to filter the numerical values in the current column.
The table header has four operation buttons:
- (Custom Columns): Since there are many properties in the result table, you may want to view only certain ones. Click "Columns" to customize which properties to display and their order.
- (Batch Operations): Click to perform batch operations on multiple antibodies in the table. Supported operations include:
  - Copy as FASTA: Copy the sequences of the selected antibodies to the clipboard in FASTA format.
  - Export to CSV: Export the sequences and properties of the selected antibodies to a CSV file on your local machine.
- (Align Structures): Click to align the structures of all antibody molecules (and future added molecules) to the specified antibody structure. After confirmation, two columns related to structure similarity will be added to the table: RMSD to {{ref}} and CDR RMSD to {{ref}}.
- (Compare Sequences): Align and compare the sequences of the selected antibodies. Click to go to the Compare Page, which shows the molecule-level and amino acid-level scores of each antibody.
Compare Page¶
This page (titled “Antibody Property Comparison”) will align the selected sequences and compare their molecule-level and residue-level scores.
Click the job name in the breadcrumb at the top of the page or the left arrow "←" (enclosed in red in the figure above) to return to the Result Table. Click the button at the top right of the page to change the antibodies being compared.
This page consists of two sections: "Antibody Properties" which displays molecule-level scoring and "Per-residue scores" which displays residue-level scoring.
Antibody properties¶
In the "Antibody Properties" section, the properties of the selected antibodies are listed as a table. As in the Result Table, you can click the button to customize which properties to display and their order, or click the button to perform batch operations on the selected antibodies, such as copying the FASTA sequences or exporting the CSV table.
Per-residue scores¶
In the "Per-residue scores" section, the selected antibody sequences are aligned and the multi-dimensional scores of each amino acid are displayed as background colors. The models available for scoring can be switched in the tabs.
- The CDR regions are underlined with dark gray, while the Vernier zones and VH-VL interface regions are highlighted with light gray.
- The background color of each residue represents a certain model's score for that residue.
The top right corner of the region has a button that can switch between displaying the heavy chain or light chain. Hovering over an amino acid residue will display detailed scoring information for that position, such as "H113 (IMGT) | CDR3 | GeoHumanness: 0.889".
Amino Acid Scoring Color Scheme
- Shows the predicted risk of inducing an immune response in humans. No risk is white; the deeper the red, the higher the risk. More specifically, a 9-peptide segment is considered risky if it occurs in fewer than 10% of the humans in our antibody sequence repertoire. The more risky 9-peptide segments an amino acid belongs to, the deeper the red.
- Shows the predicted pKa of each amino acid's side chain. Blue indicates a higher pKa (less acidic), while red indicates a lower pKa (more acidic).
- Shows the predicted net surface charge of each amino acid. Positive charges are blue, while negative charges are red.
- Shows the size of the charged patches on the antibody surface. The larger the patch, the higher the risk of charge-induced aggregation.
- Shows the hydrophobicity of each amino acid. Yellow indicates higher hydrophobicity, while blue indicates lower hydrophobicity.
- Shows the size of the hydrophobic patches on the antibody surface. The larger the patch, the higher the risk of hydrophobicity-induced aggregation.
- Shows the predicted risk of post-translational modifications (PTMs) and non-specific binding at each site on the antibody.
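The 9-peptide rule behind the humanness coloring can be sketched as a sliding-window count: each residue is scored by the number of risky 9-residue windows that cover it. This is an illustration only, not the platform's implementation. The `is_risky` callback is a hypothetical stand-in for the repertoire lookup, and the toy sequence and risky set are made up.

```python
def risky_window_counts(sequence, is_risky, k=9):
    """Count, for every residue, how many risky k-residue windows cover it.

    `is_risky` is a caller-supplied function that takes a k-residue peptide
    and returns True if it is considered risky (hypothetical stand-in for a
    lookup against a human antibody repertoire).
    """
    counts = [0] * len(sequence)
    for start in range(len(sequence) - k + 1):
        if is_risky(sequence[start:start + k]):
            for i in range(start, start + k):
                counts[i] += 1
    return counts


# Toy data: two overlapping windows are declared risky.
toy_sequence = "ACDEFGHIKLMN"
toy_risky = {"CDEFGHIKL", "DEFGHIKLM"}
counts = risky_window_counts(toy_sequence, lambda peptide: peptide in toy_risky)
```

A residue in the middle of a sequence can be covered by up to nine windows, so the count ranges from 0 to 9 and could be mapped to a white-to-red scale.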
Edit Page¶
This page (titled “Editing {antibody_name}”) allows you to edit a specific antibody sequence from the Result Table with AI assistance, generate molecules in which several properties are improved, and save them to the Result Table.
Click the job name in the breadcrumb at the top of the page or the left arrow "←" (enclosed in red in the figure above) to return to the Result Table.
Antibody Properties¶
In the "Antibody Properties" section, the properties of the selected antibodies are listed as a table. As in the Result Table, you can click the button to customize which properties to display and their order.
Per-residue scores¶
In the "Per-residue scores" section, the selected antibody sequences are aligned and the multi-dimensional scores of each amino acid are displayed as background colors. The models available for scoring can be switched in the tabs. You may click the button in the section title to view the scoring model's color scheme. For detailed explanations of all color schemes, see here.
- The CDR regions are underlined with dark gray, while the Vernier zones and VH-VL interface regions are highlighted with light gray. These residues are likely to be affected by mutation and should be modified with caution.
- If you have enabled the "Show PTM Risk" option, residues with a risk of post-translational modification (PTM) or non-specific binding will be underlined with orange curves. When you hover over a risk site, the risk name and frequency will be displayed, e.g. "Asn deamidation (3.2%)". The higher the frequency, the more likely the risk is to occur at this site.
- If you have enabled the "Show solvent accessibility" option, the solvent accessibility of each amino acid is indicated by the text color. Black indicates that the amino acid is fully exposed to the solvent, while light gray indicates that it is completely buried.
The top right corner of the region has a button that can switch between displaying the heavy chain or light chain. The button can open the settings window (as shown below), where you can modify the following configuration items:
- Show PTM Risk (show/hide sequence risk)
- Show solvent accessibility (show/hide solvent accessibility)
- Mutation advisor (configure your mutation advisor, described below)
Mutation advisor¶
The Mutation advisor provides intelligent mutation recommendations to guide your antibody engineering efforts. You can click the button to open the settings window and configure the following "Mutation advisor" options:
Mutation advisor settings
An AI assistant will provide up to 5 mutation suggestions for each site based on the current antibody sequence. The current residue will also participate in ranking. For each site, the sum of the scores of all residue types is 1. The higher the score of a residue type, the more likely it is to occur at this site according to the model (see figure below). To make the suggestions clearer and more readable, the current residues in the antibody sequence are represented by dots ("·"); furthermore, residues with confidence below 1% are omitted.
The available assistant models (Copilot model) include:
- ESM1v-650M: A language model specifically designed to predict the impact of mutations.
- ESM2-3B: An updated and more powerful general protein language model.
- GeoHumAb: Our proprietary algorithm for humanizing antibodies.
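The display rules described for the AI assistant (at most 5 suggestions per site, scores summing to 1, the current residue shown as a dot, and residues below 1% omitted) can be sketched as follows. The function name `top_suggestions` and the example scores are hypothetical; the real scores come from the selected Copilot model.

```python
def top_suggestions(scores, current, max_items=5, min_score=0.01):
    """Rank residue types at one site and apply the display rules.

    scores:  dict mapping residue type to a score; scores sum to 1 per site.
    current: the residue currently present at this site.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    shown = []
    for residue, score in ranked[:max_items]:
        if score < min_score:
            continue  # residues with confidence below 1% are omitted
        # The current residue takes part in the ranking but is shown as a dot.
        shown.append(("·" if residue == current else residue, score))
    return shown


# Made-up scores for a single site whose current residue is E.
site_scores = {"N": 0.62, "E": 0.30, "D": 0.05, "S": 0.025, "G": 0.005}
suggestions = top_suggestions(site_scores, current="E")
```

In this made-up example, `suggestions` lists N first, then the current residue as a dot, then D and S; G falls below the 1% cutoff and is dropped.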
Displays up to five germline sequences most similar to the current antibody sequence. The residues that agree with the current sequence are highlighted in light blue. To reduce the immunogenicity of the current sequence, you might want to mutate the current residues to the corresponding residues in the germline sequence.
You can customize the source species of the germline sequence in the settings (see below). The available options include: Human (default), Cat, Mouse, Rat, Rabbit, Alpaca, Pig, Rhesus.
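The germline comparison amounts to listing the positions where the current sequence and a germline sequence disagree. Below is a minimal sketch, assuming the two sequences are already aligned to the same length with "-" for gaps. The function name is hypothetical, the sequences are made-up fragments, and positions are plain 1-based alignment columns rather than Kabat or IMGT numbers.

```python
def germline_differences(current, germline):
    """List (position, current residue, germline residue) for each mismatch.

    Both sequences must be pre-aligned to the same length. Columns with a
    gap ("-") in either sequence are skipped.
    """
    if len(current) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, cur, germ)
        for position, (cur, germ) in enumerate(zip(current, germline), start=1)
        if cur != germ and "-" not in (cur, germ)
    ]


# Made-up fragments, for illustration only.
differences = germline_differences("EVQLVESGA", "EVQLVQSGA")
```

Here `differences` holds a single entry, `(6, "E", "Q")`, which is a candidate back-to-germline mutation. Whether to apply it still depends on the region, since CDR, Vernier zone, and VH-VL interface residues should be modified with caution.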
When you hover over a mutation suggestion, the detailed information of the mutation suggestion will be displayed in the sequence viewer at the bottom right. It includes:
- Position: The number of the amino acid in the antibody sequence, formatted as "{chain type}{residue number}{insertion code}", e.g., "H52A".
- Region: The antibody segment to which the residue belongs, with possible values of FR1-4 or CDR1-3. The Vernier zones and VH-VL interface regions are also marked.
- Advisor confidence: The top 10 mutation suggestions and their predicted likelihoods, with the hovered residue highlighted in blue. The higher the predicted likelihood, the more likely the mutation is to improve the antibody properties.
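The position format "{chain type}{residue number}{insertion code}" can be split into its parts with a regular expression. This sketch assumes that the chain type is a single letter "H" or "L" and that the optional insertion code is a single uppercase letter; neither assumption is confirmed beyond the "H52A" example above.

```python
import re

# Chain type (H or L), residue number, optional one-letter insertion code.
POSITION_RE = re.compile(r"^([HL])(\d+)([A-Z]?)$")


def parse_position(label):
    """Split a label such as "H52A" into (chain, number, insertion code)."""
    match = POSITION_RE.match(label)
    if match is None:
        raise ValueError(f"not a valid position label: {label!r}")
    chain, number, insertion = match.groups()
    return chain, int(number), insertion


with_insertion = parse_position("H52A")
without_insertion = parse_position("L3")
```

Parsing the label into a tuple gives a natural sort key: positions order by chain, then number, then insertion code, so "H52" sorts before "H52A".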
Ways to edit the current sequence
If you want to adopt a mutation suggestion, simply click on it, and then click the "Confirm" button in the pop-up window. For example, in the figure below, we mutate H64 from E to N, based on the most similar germline sequence.
If you want to customize a single-point mutation, you can click the amino acid you want to mutate in the "Current (mt)" sequence, and then select a mutation in the pop-up window.
If you want to customize multiple-point mutations, you can hover over the "Current (mt)" sequence and select "Change multiple residues" to enter the sequence editing mode. At this point, an editable sequence will appear below the "Current (mt)" sequence, where you can:
- Click the position you want to mutate, and type the replacement amino acid. Press Tab or Shift+Tab to move right or left between amino acids, respectively.
- Click the leftmost position, and press Ctrl+V to paste the mutated sequence.
- Click any amino acid in the per-residue scores or mutation suggestion sections, and the current residue will be mutated to that amino acid.
When you are done editing, click the OK button on the right to update the current sequence.
If you want to humanize the current sequence, you can hover over the "Current (mt)" sequence and select "Humanize sequence" to open the humanization popup. Then, you may set the humanization parameters and click "Confirm" to trigger the humanization calculation. After the calculation is complete, the humanized sequence will be displayed as an editable sequence below the "Current (mt)" sequence, where you can further edit the sequence. When you are done editing, click the OK button on the right to update the current sequence.
You can view the mutations you have applied in the "Mutations" area. Click the button in this area to select the mutations you want to revert.
Structure viewer¶
The structure viewer displays the structure of the current antibody. When this job's result panel is open, the structure object panel will be automatically synchronized with the structure in this panel.
The amino acid residues in the structure are colored according to the same scheme as the per-residue scores.
Acknowledgements¶
This task is supported by the following models:
- Our proprietary antibody structure prediction model,
- Our proprietary humanization and humanness prediction models,
- Our proprietary expression level/aggregation prediction model,
- ESM series models from Meta AI, used for modeling and optimizing protein sequences,
- The Therapeutic Antibody Profiler (TAP) model proposed by the University of Oxford and improved by us, for predicting antibody side chain pKa, surface charge, hydrophobicity, CDR length, etc., and comparing them with clinical-stage antibodies,
- Our proprietary sequence risk analysis model, for predicting post-translational modifications (PTMs) and non-specific binding risk sites.