# Tutorial on protein–small molecule and lipid–protein interactions
In this tutorial, we will set up a simulation system containing a membrane protein, the bound small molecule, the host lipid membrane, and the solvent environment. This will be performed using the [CHARMM-GUI](https://www.charmm-gui.org/) portal. The process of setting up simulations for lipid systems, soluble proteins, or nanomaterils is very similar. Then, we will skip the simulation stage (~2 weeks on Puhti per system), download pre-computed trajectories for this system, and analyze them. Towards the end of the tutorial, we will also analyze an additional systems, listed below.
## The workflow in a nutshell
* Generate a simulation setup with the help of CHARMM-GUI
* Set up simulations for (membrane) proteins and ligands in CHARMM-GUI
* Perform basic analyses using the built-in tools of GROMACS
* Perform more advanced analyses using the MDAnalysis library
## The science behind
This tutorial is based on [the recent publication](https://doi.org/10.1101/2022.07.03.498529) describing the mechanism of inhibition of the Sec61 translocon by a cotransin derivative. During the biogenesis of membrane proteins and secreted proteins, the translating ribosome sits attached to the transmembrane Sec61 protein complex and pushes the nascent polypeptide chain (unfolded protein) through the Sec61 tunnel either into the ER lumen (secreted and thus soluble proteins exit Sec61 from the end of the tunnel) or to the ER membrane (membrane proteins escape via the lateral gate between the two halves of a clam-like Sec61 structure facing the membrane). As many proteins processed by Sec61 are of therapeutic interest (viral proteins, inflammatory cytokines, growth factor receptors related to cancer to name a few), the substrate-selective inhibition of Sec61 is desireable.
Cotransin is a cyclic depsipeptide that has been demonstrated to inhibit the translocation of proteins from the ribosome into the ER during co-translational translocation. The cotransin derivative, KZR8445 used in this work, was found to be substrate-selective using biochemical assays. Moreover, it was efficacious towards rheumatoid arthritis in mice, and inhibited the production of COVID-19 capsid protein (CHECK!). Thus, to understand its mechanism of function could help further refine the selectivity of cotransins towards new and specific protein targets.
To explain the substrate-selective inhibition, cryo-EM of KZR8445-bound Sec61 was performed, and additional density was found at the Sec61 lateral gate, i.e. at the site where the transmembrane proteins exit the channel. KZR8445 was modelled into that density. Here, we will use this structure of the Sec61 bound by KZR8445. We will study the effects of the inhibitor on the protein structure, notably its dynamics and the conformation of the lateral gate. Then, we will find the key hydrogen bonding partners for KZR8445, and confirm their importance using computational mutagenesis. We will also study the penetration of water into the Sec61 tunnel.
(Rehan et al., bioRxiv (2022), DOI: https://doi.org/10.1101/2022.07.03.498529)
%% Cell type:code id: tags:
``` python
importnumpyasnp
importMDAnalysisasmda
fromMDAnalysis.analysisimportdistances
importmatplotlib.pyplotasplt
importnglviewasnv
fromipywidgetsimportwidgets
fromIPython.displayimportdisplay
importwarnings
warnings.filterwarnings("ignore")
```
%% Cell type:markdown id: tags:
Let's visualize the protein & the inhibitor first!
%% Cell type:code id: tags:
``` python
u=mda.Universe('Sec61_KZR8445.pdb')
view=nv.show_mdanalysis(u)
view
```
%% Cell type:markdown id: tags:
## 1) Now let's learn how to set the system in CHARMM-GUI:
* Go to [CHARMM-GUI](https://www.charmm-gui.org/) and log in
* Note: along the process, you can always look at the generated structure in the browser!
* Go to input generator -> Membrane builder -> Bilayer builder
* Typically, we could use a PDB identifier here. We will use a structure coresponding to [PDB entry 7ZL3](https://www.rcsb.org/structure/7ZL3), but to have matching atom names for the ligand parametrization, we will upload the local PDB "Sec61_KZR8445.pdb" and **continue**
* Select all chains (Protein has 3 subunits, the "Hetero" is the bound small molecule (inhibitor) and **continue**
* Parametrize the ligand: upload the corresponding MOL2 file
* Now all sorts of protein modifications are possible, but we'll omit them and **continue**
* We use PPM2.0 to orient the protein to the membrane, and use the main subunit (Sec61α, "PROA") for this alignment and **continue**
* Let's hydrate the system with 80 waters per lipid, and place it in a membrane containing a total of 400 POPC molecules. Try to balance the lipid numbers in the leaflets so that the leaflet areas agree and we don't have much tension. Here, we can create as complex of a membrane as we like, but we are not going to do a literature review on the lipidomics of the ER membrane now :) When you reach balanced leaflets, **continue**
* Let's include 150 mM (0.15 M) of NaCl and neutralizing ions and **continue**
* Confirm that there is no penetration (e.g. lipid tails through ring structures) and **continue**
* Select a suitable for field combination (we will now use CHARMM36m) and simulation program (GROMACS) and **continue**
* Download the .tgz file containing the simulation system and the files required to run an MD simulation for it.
%% Cell type:markdown id: tags:
Let's first decompress the tgz file and see whats inside.
%% Cell type:code id: tags:
``` python
!mv~/Downloads/charmm-gui.tgz.
!tar-xvfcharmm-gui.tgz
!lscharmm*/gromacs/
```
%% Cell type:markdown id: tags:
CHARMM-GUI outputs handy equilibration scripts that we can now run to relax the protein and lipid structures, and get the proper density of the water. This information is included in the README file. Let's take a look at its contents.
%% Cell type:code id: tags:
``` python
!mvcharmm*/gromacs/*.
!rm-rfcharmm*
!ls
```
%% Cell type:markdown id: tags:
For a demonstration, we can run it for a while and then cancel it with CTRL+C
%% Cell type:code id: tags:
``` python
!catREADME
!chmodu+xREADME
!./README
```
%% Cell type:markdown id: tags:
Now we could run the entire README script, and it would eventually generate 100 ns of trajectory. But this would take 8 nodes for a day, so we will instead download 1000 ns-long trajectories from a public repository. The pre-computed trajectories are in [Zenodo](https://doi.org/10.5281/zenodo.7303653). Read the description carefully so you know what is in the files. For the next steps, we'll need a run input file (.tpr), a trajectory (.xtc), and an energy file (.edr).
We will now start with some basic calculations of protein structure. First, we calculate the root-mean-squared deviation (RMSD) of the protein backbone structure as a function of time with the reference taken from the TPR file (initial configuration). We will do it for the protein with the inhibitor bound, but the same calculation can be repeated for the system without the bound inhibitor.
It seems that the region around residue 200 is more mobile without the inhibitor. You can go back to the visualization and find out which part of the protein that is.
Next, we will check how much the presence of the inhibitor affects the protein conformation. The additional density found at the Sec61 lateral gate between helices 2 and 7 was used to fit the inhibitor. At this position, the inhibitor can affect the lateral gate conformation of Sec61. We will define the openness of the bate as the distance between the centers of mass of helices 2 and 7. The final conformations of these helices are visualized below for the system with the inhibitor (green, more open gate) and for the system without the inhibitor (yellow, more closed gate).
The inhibitor will clearly push the gate open and occypy that space.
We will then move on to study the interactions of the inhibitor with the protein. Experiments on various Sec61 inhibitors have identified N300 and Q127 as residues that maintain the lateral gate in a closed conformation. Since the inhibitor occupies this space, it could well interact with these two residues. We will first do a simple distance calculation between these two residues and the ligand.
%% Cell type:code id: tags:
``` python
# Define the atom selections for the ligand and residue 300 of the protein
ligand=u_inhi.select_atoms('resname KZC')
residue300=u_inhi.select_atoms('resnum 300')
residue127=u_inhi.select_atoms('resnum 127')
# Create an empty list to store the minimum distances
min_distances_127=[]
min_distances_300=[]
time=[]
# Loop over all frames in the trajectory
fortsinu_inhi.trajectory:
# Calculate the distance between residue 300 and the ligand using the MDAnalysis analysis.distances module
# Find the minimum distance and append it to the list of minimum distances
min_distance_127=np.min(distances_127)
min_distance_300=np.min(distances_300)
min_distances_127.append(min_distance_127)
min_distances_300.append(min_distance_300)
time.append(u_inhi.trajectory.time)
# Convert the minimum distances list to a NumPy array
min_distances_127=np.array(min_distances_127)
min_distances_300=np.array(min_distances_300)
time=np.array(time)
plt.plot(time/1e6,min_distances_127/10,'r-')
plt.plot(time/1e6,min_distances_300/10,'b-')
plt.legend(['Q127','N300'])
plt.xlabel('Time (microseconds)')
plt.ylabel('Distance (nm)')
plt.show()
```
%% Cell type:markdown id: tags:
N300 looks like a promising hydrogen bonding candidate, whereas Q127 seems to drift further as the lateral gate is pushed open. Let's do a hydrogen bonding analysis for N300. We'll simply generate GROMACS index groups for residues 127 and 300 and calculate the time evolution of hydrogen bonds between these residues and the inhbitor.
The N300A mutation clearly leads to the detachment of the inhibitor. This is also evidenced by functional assays in which the inhibitory effect of KZR8445 is significantly reduced upon the same mutation. Here, the inhibitor clearly inhibits protein translocation as evidenced by Gaussia luciferase reporter constructs. With the N300A mutation, the inhibition decreases, and based on the MD simulations, this is due to the inhibitor leaving its binding site.
(Rehan et al., bioRxiv (2022), DOI: https://doi.org/10.1101/2022.07.03.498529)
%% Cell type:markdown id: tags:
Now let's take a look at some lipid motions. The ER membrane has some cholesterol, which is known to flip-flop between the bilayer leaflets. We'll make an excellent guess here and look at the movement of cholesterol with a residue number of 490 and plot the z coordinates of two atoms as a function of time. These are one oxygen in its polar end and one carbon at the bottom of the ring structure, and they can be used to also characterize cholesterol orientation during a flip-flop. We'll also use the list comprehension here for a more compact code.