Last updated: 2025-12-10
In this practical session, you will learn how to use ChimeraX in order to visualise AlphaFold outputs interactively.
Why ChimeraX?
ChimeraX is developed by the Resource for Biocomputing, Visualization, and Informatics (RBVI) at UCSF. It is released under a non-commercial license for academic, government, nonprofit, and personal use, and its source code is available on GitHub. It is particularly well suited to analysing AlphaFold outputs thanks to its many features for interactive score visualisation.
Installing ChimeraX
ChimeraX can be downloaded from its website for any OS (Windows, Linux, Mac): https://www.cgl.ucsf.edu/chimerax/download.html
Additional information
ChimeraX help pages are very complete: https://www.rbvi.ucsf.edu/chimerax/docs/user/index.html
The ChimeraX window is composed of the Menu Bar & Tool Bar across the top, the command line area at the bottom and several detachable and movable panels in between, including:
Log Panel: Any action is recorded here and is clickable to open its help page
Models Panel: All objects (e.g. molecules) are listed here and are given an ID
Working Panel: This is where you will visualise your objects
We will be using the pre-computed data from the previous practical “Application de MassiveFold à une protéine monomère inconnue” (Applying MassiveFold to an unknown monomeric protein).
Download the pre-computed outputs (zip folder called A0A2U7UDN4.zip) from NextCloud
Unzip the folder
You should have the following folder architecture:
A0A2U7UDN4/
├── af3_default/ # AlphaFold3 outputs
│ └──...
├── afm_default/ # AFmassive outputs
│ └──...
├── cf_default/ # ColabFold outputs
│ └──...
├── msas/ # MSAs used in predictions
│ ├── bfd_uniref_hits.a3m
│ ├── mgnify_hits.sto
│ ├── pdb_hits.hhr
│ └── uniref90_hits.sto
├── msas_alphafold3/ # AF3-formatted MSAs -> json format
│ └── msas_alphafold3_data.json
└── msas_colabfold/ # colabfold-formatted MSAs
    └── 0.a3m
The files that are important for this practical are:
structure files with a .pdb or .cif extension (depending on which predictor was used)
score files in the light_pkl sub-folders
⚠️ WARNING: make sure to match rank, seed and model numbers between the .cif/.pdb and .pkl files when importing them into ChimeraX!
Open the most confident prediction of AFmassive in ChimeraX
Hint:
Look for a .pdb file within the afm_default folder with “ranked_0” in its name.
Change the lighting of the protein and the background colour
Answer:
The 3 relevant icons are in the “Graphics” tab. This tab groups quick-access tools to change the background colour and lighting effects of your window.
Don’t hesitate to play around a little with the different effects.
Play around with the following commands to get familiar with them & check they work for you
You don’t have to use the command line in ChimeraX for basic functions, but it offers more options and is quite easy to master, especially thanks to the messages in the Log panel.
Whenever you change something in ChimeraX through the buttons and menus, the corresponding command appears in the Log panel. Each command in the Log panel is a clickable link that takes you straight to the corresponding help page and detailed usage. All help pages can also be found in the ChimeraX user guide linked above.
The command syntax is very simple: it always starts with the command name, and spaces are used as separators. Each command has its own options, which may or may not take values.
command name + option not needing a value, e.g. color bychain
command name + option needing a value, e.g. set bgColor white
Most commands can be applied to a specific selection only. For this, you can use the following syntax: # followed by a model ID, / followed by a chain ID, : followed by a residue number or range, and @ followed by an atom name (e.g. color #1/A:10-50 red).
Colour the model by pLDDT (pLDDT scores are often saved in the bfactor field)
Answer:
You will find this icon within the “Molecular display” tab. This tab groups quick-access tools to change the structure's level of detail (atomic, secondary structure and surface representation) as well as its colouring (by heteroatom, by chain, by position in the sequence, electrostatics & hydrophobicity…).
Don’t hesitate to play around a little with the different colours & representations.
To change the default colour palette used for B-factor colouring, you can instead use the command line to colour your structure and specify the alphafold colour palette:
color bfactor palette alphafold
Identify the least and most confidently-predicted portions of the structure according to the pLDDT
Answer:
As a reminder, pLDDT is a per-residue score between 0 and 100. The higher the score, the higher the confidence in the local environment of a residue.
As you can see, loops and terminal regions have a low confidence whereas well-structured regions have a high confidence i.e. AlphaFold is more confident in its prediction of the well-structured regions, and is not sure about the structure of the loop and terminal regions.
This is not at all surprising: Helices and sheets have well-defined, repeating backbone geometries and stabilising interactions. They are also often more evolutionarily conserved. All this information makes these regions easier to predict, thus AlphaFold is more confident in what it outputs.
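The B-factor convention mentioned in this exercise can also be read outside ChimeraX. Below is a minimal Python sketch, assuming a PDB-format file whose B-factor column (columns 61-66 of each ATOM record) holds the pLDDT, as is usual for AlphaFold-style outputs. The function names, the toy records and the four confidence bands (cut-offs at 90, 70 and 50, the ones commonly used for AlphaFold colouring) are illustrative assumptions, not something defined in this practical.

```python
def plddt_per_residue(pdb_lines):
    """Return {residue_number: pLDDT} read from the B-factor column of CA atoms."""
    scores = {}
    for line in pdb_lines:
        # Fixed-width PDB columns: atom name 13-16, residue number 23-26, B-factor 61-66
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_class(plddt):
    """Map a pLDDT value to a band (assumed cut-offs: 90, 70, 50)."""
    if plddt >= 90:
        return "very high"
    if plddt >= 70:
        return "confident"
    if plddt >= 50:
        return "low"
    return "very low"


def toy_atom_line(serial, name, resseq, bfactor):
    """Hypothetical helper: format a fixed-width ATOM record for the demo only."""
    return (f"ATOM  {serial:>5} {name:<4} ALA A{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{bfactor:6.2f}")


# Invented records: residue 1 is poorly predicted, residue 2 is well predicted
toy_lines = [
    toy_atom_line(1, "N", 1, 35.20),
    toy_atom_line(2, "CA", 1, 35.20),
    toy_atom_line(3, "CA", 2, 93.10),
]
toy_scores = plddt_per_residue(toy_lines)
for resnum, score in toy_scores.items():
    print(resnum, score, confidence_class(score))
```

To try this on a real prediction, you could pass the lines of the ranked_0 .pdb file (for example read with open(...).readlines()) instead of the toy records.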
Import the scores associated with the model you have in ChimeraX
The current model is ranked_0_unrelaxed_model_3_ptm_pred_2.pdb. Its associated score file can be found within afm_default/light_pkl/. Pick the file with the same model and pred numbers.
You can import the scores into ChimeraX through the Menu bar in Tools > Structure Prediction > AlphaFold Error Plot.
Help:
The correct pickle file to import is result_model_3_ptm_pred_2.pkl within the afm_default/light_pkl/ folder.
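The matching of model and pred numbers can be checked programmatically. The following is a minimal Python sketch, assuming (from the two file names given in this practical) that AFmassive structure files end with model_<n>_ptm_pred_<m>.pdb and that the matching score file is called result_model_<n>_ptm_pred_<m>.pkl. The helper name expected_pkl_name is hypothetical, and the pattern may not hold for the ColabFold or AlphaFold3 outputs.

```python
import re

# Pattern assumed from the file names used in this practical
MODEL_PRED = re.compile(r"(model_\d+_ptm_pred_\d+)\.pdb$")


def expected_pkl_name(pdb_name: str) -> str:
    """Return the score file name expected for an AFmassive structure file."""
    match = MODEL_PRED.search(pdb_name)
    if match is None:
        raise ValueError(f"unrecognised structure file name: {pdb_name}")
    return f"result_{match.group(1)}.pkl"


print(expected_pkl_name("ranked_0_unrelaxed_model_3_ptm_pred_2.pdb"))
# prints: result_model_3_ptm_pred_2.pkl
```

Raising an error on an unrecognised name, rather than guessing, avoids silently pairing a structure with the wrong score file.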
After importing the scores, an extra panel should appear, which you can add to your ChimeraX main window if you wish.
Note the buttons at the bottom of this new panel:
Highlight the lighter regions in the PAE matrix
Answer:
As a reminder, the PAE plot is a square matrix whose width and height equal the number of residues in your structure. It does not show scores but estimated errors on the distance between 2 residues of the structure, with values between 0 and 32 Ångström. You can see it as a “±” value that you can add to the actual distance you see in the predicted structure. Thus, the lower it is, the better (i.e. AlphaFold is more confident in the distance that it has predicted between 2 residues).
The default colour scale of the PAE matrix is explained when you click on the Help button. The colour scale tries to reflect the pLDDT colours: blue for low error values, yellow-orange for medium values and grey-white regions for high error values.
When you highlight a light area in the PAE outside of the diagonal, it will show the corresponding regions in the structure as pink (y-axis) and green (x-axis) areas.
Unsurprisingly, these light areas are seen between residues of well-structured regions and predicted loop regions with very poor pLDDT scores. Indeed, loops have generally fewer structural constraints (they are often linkers or turns), so their exact position relative to other parts of the protein is harder to predict (i.e. more prone to error).
Colour the structure according to domains in the PAE
Answer:
Well-packed domains typically appear as blocks of low error along the diagonal of the plot. These blocks represent regions where residues within the domain have low predicted positional error relative to each other, indicating a stable, well-folded structure.
ChimeraX has many built-in features, including the identification of predicted well-packed domains from a PAE plot (first button at the bottom of the PAE plot). You can then right-click on the PAE plot to colour it like the structure while keeping the PAE values as a background grey scale. This is a useful tool for finding correspondences between the structure and the PAE plot more easily, and for reading the PAE plot.
As the colours show, ChimeraX found 3 different domains from the plot: 2 loops and the rest of the protein as a third, well-packed domain. This is quite consistent with the structure.
PAE plots become particularly interesting with multi-domain and/or multi-chain proteins as they will help identify if 2 domains are likely to interact (lower PAE values) or not (higher PAE values). Unless the interaction is very strong (e.g. obligate oligomer), you should not expect the PAE values to match the intra-domain ones (the signal will be more diluted).
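The comparison between intra-domain and inter-domain PAE values can be illustrated numerically. The following is a minimal Python sketch on an invented 4x4 matrix; the function name mean_pae, the toy values and the domain boundaries are all assumptions made for illustration, not data from this practical.

```python
def mean_pae(pae, rows, cols):
    """Mean of pae[i][j] over the given 0-based residue indices."""
    values = [pae[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


# Toy PAE matrix in Ångström (invented): residues 0-1 and 2-3 form two blocks
toy_pae = [
    [0.5, 1.0, 20.0, 22.0],
    [1.0, 0.5, 21.0, 23.0],
    [19.0, 20.0, 0.5, 1.5],
    [22.0, 21.0, 1.5, 0.5],
]
domain_a, domain_b = range(0, 2), range(2, 4)

intra_a = mean_pae(toy_pae, domain_a, domain_a)   # block on the diagonal
inter_ab = mean_pae(toy_pae, domain_a, domain_b)  # off-diagonal block
print(intra_a, inter_ab)
```

In this toy case the off-diagonal block has a much higher mean error than the diagonal block, which is the pattern expected for two domains whose relative placement is uncertain. Since a PAE matrix is not necessarily symmetric, the block (domain_b, domain_a) may give a slightly different mean.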
Colour the PAE plot according to the pLDDT score
Answer:
To help you answer this question, you can use the same trick as before: first colour the structure by pLDDT, then right-click on the PAE plot to update its colouring to match the structure.
The resulting plot isn’t the easiest to read, but if you look closely, it depicts the pLDDT score along the diagonal, together with the PAE error values in grey scale.
PAE values and pLDDT scores represent 2 different types of confidence levels, but as you can see, in this prediction, they are quite in agreement in the predicted domains.
Open the most confident ColabFold and AlphaFold3 predictions
af3_default/ranked_0_af3_seed_409255_sample_2_pred_17.cif
cf_default/ranked_0_unrelaxed_model_3_ptm_pred_0.pdb
NB: you can colour all structures by model by typing color bymodel in the command line at the bottom of the window
Align and compare them to the AFmassive model we have analysed previously
You can use the “Matchmaker” tool for this through the Menu bar in Tools > Structure Analysis > Matchmaker.
Help:
NB: you can also restrict to a given selection by selecting the regions before running matchmaker, then by ticking the “Also restrict to selection” boxes.
As you can see, the structures do not superimpose that cleanly.
To get a split view, you can use the tile function in the command line:
tile columns 3 spacing_factor 1
In the above command, we specify that we want 3 columns and we reduce the default spacing between models to a factor 1. To remove tiling, you can just type: tile off
Many proteins share the same fold despite having <20% sequence identity, and since structure is closely linked to function, it can be useful to find structurally similar proteins in order to better characterise a protein of interest. In this case, our protein (UniProt ID: A0A2U7UDN4_9VIRU) from Pandoravirus neocaledonia is not very well characterised (see UniProt page).
Go back to your Foldseek output page and have a look at the top hits and their scores
Fetch one of the top-scoring PDB structures identified through Foldseek into ChimeraX
Open > Fetch by ID > PDB and type the PDB id.
For 4FVJ, only chain B is of interest.
You can split the 4FVJ model into its individual chains using the split command:
split #4 chains
This will create 8 sub-models in the Models panel. You can hide all chains with hide #4 atoms and then just show chain B with show #4.2 cartoon, or just tick the corresponding boxes in the Models panel.
Align the PDB structure to the best prediction with Matchmaker
Compare the two structures
Answer:
Tiling with the tile command can be useful here to see the structures more clearly. If you set both structures side by side and colour the model by pLDDT, you can see that the confidence score is not too bad for the region that overlaps with 4FVJ.