Basic tutorial 0: use VMD to do basic structural analysis of the trp-cage mini-protein

This tutorial will help you check your knowledge of VMD (make sure to look at the info on the VMD web site first!). It will also help you get used to the structure of the trp-cage mini-protein (variant tc5b), which will be helpful for the first tutorial that does MD simulations and analysis for trp-cage.

Read the experimental paper

Find the original research article from 2002 describing the design and structural characterization of the tc5b trp-cage mini-protein. The title of the article is "Designing a 20-residue protein". Because of its small size and the features it shares with larger proteins, this has become one of the most important model systems for experimental and computational studies of protein structure, stability and folding.

When you read a structure paper, you'll want to learn to pick out the features that the authors think are most important. Here it is relatively easy since the protein is small and the paper is short. Look at this section of the paper, and refer back to this section when we are doing structure analysis below:

This highly conservative data treatment produced a well-converged structure (Fig. 3a; Table 2), with an α-helix from Leu 21 to Lys 27 and a short 3₁₀-helix (residues 30–33). The unusual features are a Gly30-NH hydrogen bond to the i-5 backbone carbonyl, an indole-NHɛ1 hydrogen bond to the i+10 backbone carbonyl and the placement of Pro rings on both faces of the indole ring, with the Tyr ring completing the Trp-cage hydrophobic cluster.

Please don't just read this excerpt, though. Read the entire experimental paper, because there are other important points that the authors make. This is just given to you as an example of the kind of language you should be on the lookout to read carefully and follow up with a visual analysis.

If you don't understand certain descriptions of structure (such as secondary structure types), then you should review the protein structure reading that was discussed in the new student checklist.

Read the Simmerling Lab's original folding paper for trp-cage

Next, read the short article from the Simmerling lab, "All-Atom Structure Prediction and Folding Simulations of a Stable Protein". This was also published in 2002, shortly after the first experimental paper listed above. What was done there? What measures of folding were used? What observations were made about the structural details?

Download the experimental NMR structure

Use the RCSB site (PDB) to find the coordinates from the experimental paper listed above. The experimental paper lists the PDB code at the very end; use that to access the coordinates. Click on "download files" in the upper right, and select PDB format.


One thing to note is that the numbering scheme for amino acids ("residues") differs between the paper and the PDB file. The two are offset, so the residue number won't match the paper when you label an atom in VMD. You need to subtract 19 from the numbers in the paper to match the PDB file. For example, Asp 28 in the paper is Asp-9 in the PDB file.
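The offset is easy to get wrong when switching between the paper and VMD, so a small helper can be useful. The following is only a sketch: the function names are made up for illustration, and the offset of 19 is the one stated above.

```python
# Offset between the paper's residue numbering and the PDB file's numbering,
# as stated in this tutorial (paper number - 19 = PDB number).
PAPER_TO_PDB_OFFSET = 19


def paper_to_pdb(paper_resnum):
    """Convert a residue number used in the paper to the PDB file numbering."""
    return paper_resnum - PAPER_TO_PDB_OFFSET


def pdb_to_paper(pdb_resnum):
    """Convert a residue number from the PDB file back to the paper numbering."""
    return pdb_resnum + PAPER_TO_PDB_OFFSET


# Residues named in the paper excerpts quoted in this tutorial.
for name, paper_num in [("Leu", 21), ("Lys", 27), ("Asp", 28), ("Gly", 30), ("Arg", 35)]:
    print(f"{name} {paper_num} (paper) -> {name} {paper_to_pdb(paper_num)} (PDB)")
```

Keeping a conversion table like this next to you while reading the paper makes it easier to find the right residue when labeling atoms in VMD.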

Load the coordinates into VMD

Open VMD on your computer and go to the main window. Click on File/New molecule. In the new window, click browse and choose the file you downloaded from the PDB. It should automatically recognize it as a PDB file. Click on "load".


You should see the structure in your VMD Display window. Go to the main menu and turn off perspective view (Display/Orthographic). It's still a little difficult to make out the features in a line drawing.


Adding a cartoon representation

Go to the main window and choose Graphics/Representations. A window will open that gives you extensive control over the graphical visualization.


We will add a few new representations. First, let's add a cartoon (like a ribbon, but a little nicer). Click on "Create Rep" and it will add a duplicate to the list of representations (since the existing one was a line drawing, the duplicate will also be Lines and will have the same atom selection). Select the second Lines representation (the new one) and then choose NewCartoon from the Drawing Method pull-down menu. Next, we will color the cartoon segments by secondary structure type (make sure to read about these in the protein structure book!). Click the Coloring Method pull-down and select Secondary Structure. Trp-cage has several different secondary structures that you should have read about in the experimental paper listed above. We will also make the cartoon a little narrower than standard. Look at the image below and change your NewCartoon properties (Aspect and Thickness) to match these.


Your structure should now look like this. Look at the cartoon colors: do you remember what secondary structure elements these are? If not, go back and scan the experimental paper. Do these segments match what was stated in the paper?


Highlighting the Trp side chain

A key feature of the trp-cage fold is the tryptophan side chain buried in the hydrophobic core (in the "cage" of proline side chains). Let's highlight Trp and the packing of its indole side chain by adding a van der Waals sphere representation for the side chain. Go back to the representations window and add a new representation. Set its drawing method to VDW, and restrict it to the Trp side chain by typing something other than "all" in the selection box (for example, resname TRP and sidechain). The selection box is powerful and allows you to use Boolean logic to select groups of atoms: "and" means an atom must match both groups, while "or" means it can match either group. You can also use parentheses to build more complex selections.
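If the Boolean logic is unfamiliar, it may help to see the same idea written out as ordinary filters. The sketch below is not VMD code: it uses a few hand-written, hypothetical atom records and a simplified definition of "side chain" (any atom not named N, CA, C or O) as a stand-in for the corresponding selection keywords.

```python
# Hypothetical, hand-written atom records (NOT read from the PDB file).
atoms = [
    {"resname": "TRP", "resid": 6, "name": "CA"},
    {"resname": "TRP", "resid": 6, "name": "CB"},
    {"resname": "TRP", "resid": 6, "name": "NE1"},
    {"resname": "PRO", "resid": 12, "name": "N"},
    {"resname": "PRO", "resid": 12, "name": "CB"},
    {"resname": "TYR", "resid": 3, "name": "OH"},
]

# Simplified assumption: backbone atoms are those with these four names.
BACKBONE_NAMES = {"N", "CA", "C", "O"}


def is_sidechain(atom):
    """True if the atom is not one of the (assumed) backbone atom names."""
    return atom["name"] not in BACKBONE_NAMES


# Like "resname TRP and sidechain": both conditions must hold.
trp_sidechain = [a for a in atoms if a["resname"] == "TRP" and is_sidechain(a)]

# Like "resname TRP or resname PRO": either condition may hold.
trp_or_pro = [a for a in atoms if a["resname"] == "TRP" or a["resname"] == "PRO"]

# Like "(resname TRP or resname PRO) and sidechain": parentheses group the "or".
cage_sidechains = [
    a for a in atoms
    if (a["resname"] == "TRP" or a["resname"] == "PRO") and is_sidechain(a)
]

print([a["name"] for a in trp_sidechain])
print(len(trp_or_pro), len(cage_sidechains))
```

Note how the parentheses matter in the last selection: without them, "and" and "or" could be combined in a different order and pick a different set of atoms.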


Your structure should now look like this:


Highlighting the salt bridge

Another feature of the trp-cage fold is the pairing of a positively charged side chain and negatively charged side chain on the side of the protein that faces away from view on the images above. In tc5b, this pair is Asp-9 and Arg-16.

From the experimental paper: This is attributed to the ionization of Asp 28 and fold stabilization through formation of a salt bridge with Arg 35.

Let's use a licorice representation on these so we can see them more clearly, and determine if they interact in this structure. Let's do this with another new representation, this time using residue numbers. In VMD, the keyword "resid" selects by the residue number in the PDB file (the similar keyword "residue" is a separate index that starts at 0, so it will not match the PDB numbering). A list of numbers (resid 8 16 5) will pick each one in the list, while the word "to" will pick a range including the outer numbers ("resid 8 to 10" will pick 8, 9 and 10). Let's pick residues 9 and 16, and also restrict the selection to either the sidechain or the atom named CA, for example: resid 9 16 and (sidechain or name CA). This way the licorice will be shown connecting to the backbone, which it wouldn't do if we left out the CA.
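To make the list and range rules concrete, here is a small Python sketch that expands the numeric part of such a selection into the residue numbers it covers. It only mimics the two rules described above (plain lists and inclusive "to" ranges); the function name is hypothetical and this is not how VMD itself parses selections.

```python
from typing import List


def expand_residue_numbers(text: str) -> List[int]:
    """Expand text such as "8 16 5" or "8 to 10" into a list of residue numbers.

    A bare number selects that residue; "A to B" selects every residue
    from A through B, including both ends.
    """
    tokens = text.split()
    numbers: List[int] = []
    i = 0
    while i < len(tokens):
        # Look ahead for the pattern: <number> to <number>
        if i + 2 < len(tokens) and tokens[i + 1] == "to":
            start, end = int(tokens[i]), int(tokens[i + 2])
            numbers.extend(range(start, end + 1))  # inclusive range
            i += 3
        else:
            numbers.append(int(tokens[i]))
            i += 1
    return numbers


print(expand_residue_numbers("8 16 5"))
print(expand_residue_numbers("8 to 10"))
print(expand_residue_numbers("9 16"))
```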


Here's what you should see after you rotate 180 degrees to see the other side. Keep in mind that a salt bridge is just a strong hydrogen bond between charged residues (sometimes it can be water bridged, but the strongest involve direct hydrogen bonding). See this article for discussion of hydrogen bond lengths if you need to review. Do these look like they interact closely enough to be a hydrogen bond? Probably not. This structure by itself does not really demonstrate that a salt bridge is present (though other evidence might; for example, experiments can detect interactions by studying the impact of mutations on stability). However, it's an NMR structure, and sometimes NMR structures are not high resolution even though atomic detail is shown. This is something that you can explore in Tutorial 1. In the next step we'll learn to measure distances rather than just visually decide if atoms are close.


Making distance measurements in VMD

Let's use the VMD distance measuring tool to do more quantitative analysis of the structure. You may have noticed that the N-terminus of the protein is positively charged (the NH3+ group), and the C-terminus is negatively charged (the COO- group). This is typical for proteins, though sometimes neutral caps are added in experiments. You may also have noticed that the N- and C-terminal ends are near each other in space, meaning that the start and end of the protein chain aren't that far away from each other. Could these opposite charges interact, and stabilize the folded structure? Let's measure the distance and see.

Rotate the structure so that you have a good view of the termini, and then go to the main VMD window and select Mouse/Label/Bonds. "Bonds" doesn't mean just bonds; it really means the distance between 2 atoms. You can also get to this mode by pressing "2" on the keyboard (2 gives the distance between the next 2 atoms you click, 1 gives info for a single atom, 3 measures an angle for the next 3 atoms, and 4 measures the dihedral angle for the next 4 atoms). After choosing Bonds, click on the N in the N-terminal Asn, and then the backbone C in the C-terminal Ser. VMD should draw a line that looks like the following image. To get back to rotate mode, either go to the Mouse menu or press "r".


This distance is too long for a hydrogen bond, so again we conclude that either this interaction is not direct, or else the NMR structure is not high enough quality in this region to define the structure well. Perhaps this interaction is simply not present in trp-cage tc5b (did the experimental paper discuss it?).
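The number that the label shows is just the straight-line (Euclidean) distance between the two atoms' coordinates, in angstroms. The sketch below shows that calculation in Python. The coordinates are hypothetical placeholders rather than values from the PDB file, and the 3.5 angstrom cutoff is an assumed rule of thumb for the distance between hydrogen-bonded heavy atoms, not a number from this tutorial or the paper.

```python
import math

# Assumed rule-of-thumb cutoff (angstroms) between donor and acceptor heavy
# atoms; illustrative only, not taken from the tutorial or the paper.
HBOND_CUTOFF = 3.5


def distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return math.dist(a, b)


def could_be_hbond(a, b, cutoff=HBOND_CUTOFF):
    """True if two heavy atoms are close enough to possibly be hydrogen bonded."""
    return distance(a, b) <= cutoff


# Hypothetical coordinates standing in for the two atoms you clicked.
n_terminal_n = (0.0, 0.0, 0.0)
c_terminal_c = (3.0, 4.0, 0.0)

d = distance(n_terminal_n, c_terminal_c)
print(f"distance = {d:.2f} A, possible hydrogen bond: {could_be_hbond(n_terminal_n, c_terminal_c)}")
```

A distance check like this is only a first filter; a real hydrogen bond also depends on the geometry (angle) of the donor, hydrogen and acceptor.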

Examining the Trp indole to Arg backbone interaction

One feature discussed in the experimental paper is the hydrogen bond between the NH group on the Trp-6 side chain indole ring system and the backbone carbonyl of Arg-16. Let's look at the structure and see if it is observed. Go back to distance selection mode (press 2), and select the atoms corresponding to this interaction. It can be easier to pick the atoms if you turn off the VDW representation first. Do this by going to the representations window and double clicking on the Trp VDW representation. That entry will turn gray and it will not be shown. Later you can double click it again to turn it back on.

The figure is shown below. Is this a good distance for a hydrogen bond?


Looking at other models in the PDB file

This PDB file actually includes more than 1 coordinate set. If you look at your main VMD window, under the "Frames" column you'll see that it has 38 frames. Each one is an independent model generated from the NMR data. This is common in NMR structure determination: the experiment gives a set of distance ranges for pairs of atoms, and the researchers use simulations to generate a structure model consistent with the experiment. It's important to repeat that again: the NMR structures are CONSISTENT with the experiment. It's possible that some parts of the structure were poorly defined by the experiment (maybe they are very flexible, or the peaks in the NMR spectrum could not be assigned, and so on). That means that parts of this model may have been unconstrained by the experiment, and maybe they aren't a good model (or maybe they are; we don't really know just by looking at the NMR data).

One way researchers reduce this uncertainty is to run the refinement simulation multiple times, in the hope that parts of the structure that are not strongly constrained by the experimental data will end up in different places with each refinement run. This is why multiple models are included, and it can be very important to look at all of the models, not just 1. Places where the models differ are less reliable than places where they match. Let's see how to do that in VMD.
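One way to put a number on "places where the models differ" is to measure how far an atom's position scatters across the models. The sketch below computes the root-mean-square distance of one atom's positions from their average position. The coordinates are hypothetical and exaggerated for illustration, and the calculation assumes the models have already been superimposed on each other.

```python
import math


def positional_spread(positions):
    """RMS distance of one atom's positions (one per model) from their mean."""
    n = len(positions)
    mean = tuple(sum(p[i] for p in positions) / n for i in range(3))
    return math.sqrt(sum(math.dist(p, mean) ** 2 for p in positions) / n)


# Hypothetical positions of two atoms in three models (NOT from the PDB file).
atom_positions = {
    "well_defined_atom": [(1.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.9, 0.0, 0.0)],
    "poorly_defined_atom": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (-3.0, 0.0, 0.0)],
}

for label, positions in atom_positions.items():
    print(f"{label}: spread = {positional_spread(positions):.2f} A")
```

An atom with a small spread sits in nearly the same place in every model, which suggests the experimental data defined it well; a large spread suggests that part of the structure is less reliable.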

Using animation mode to look at structures

A simple way is to turn the 38 frames into a "movie", and watch to see which parts of the model seem to move. Those are the parts whose structure the experiment leaves uncertain. Go to the main VMD window, and in the bottom right corner you'll see a triangle pointing right. Click it and VMD will start cycling through the frames (NMR models in this case). If they go too fast, you can use the mouse to drag the "speed" slider to the left; it's located right next to the arrow that started the animation. Slow it down, and then look to see which of the features you explored above stay the same across the models, and which do not. For example, do any of the models show a close interaction between Asp-9 and Arg-16? It's good that we drew these side chains in licorice mode, since they stand out more as the animation plays. While it is playing you still have full control to rotate and zoom your structure. VMD will also update measurements shown on the screen, such as your distances. Click the arrow again to stop the animation if you want to spend more time examining a particular model. You can also use the large horizontal scroll bar above the animation arrow to move back and forth between frames instead of having the animation play. Try to get a sense for which parts of the model are probably not reliable.

Showing multiple frames for individual representations

Let's say that you want to see multiple models overlaid at the same time, instead of animated. You can do this in the representations window. Go back to that window, and then click on the representation for which you want to overlay multiple frames (you can do this for more than one representation if you want, here we will just do 1 as an example). Let's select the Asp-Arg interaction. Click on "Trajectory" underneath the list of representations, then look down to see the "Draw multiple frames" box. Here is where you say which frames to draw. Let's draw every other frame (38 is a lot to see at once). Type in 0:2:37 (this means draw from 0 to 37 with a stride of 2, meaning every 2nd frame).
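If it is not obvious which frames 0:2:37 selects, the following Python sketch expands a specification of that first:stride:last form into the list of frame numbers. It handles only this three-part form (the function name is hypothetical), and it assumes that frames are numbered from 0, so 38 models are frames 0 through 37.

```python
def frames_from_spec(spec, num_frames):
    """Expand a "first:stride:last" string into the list of frame numbers drawn.

    Frames are assumed to be numbered from 0, and "last" is inclusive.
    """
    first, stride, last = (int(part) for part in spec.split(":"))
    last = min(last, num_frames - 1)  # never go past the final frame
    return list(range(first, last + 1, stride))


frames = frames_from_spec("0:2:37", num_frames=38)
print(frames)
print(f"{len(frames)} of 38 frames will be drawn")
```

With a stride of 2 starting at frame 0, only the even-numbered frames are drawn, so frame 37 itself is not included even though it is the upper limit.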


One last thing before we look at the structure - overlaying lots of snapshots can get messy, and for pairs of sidechains we won't know which Asp that is shown goes with which Arg. Let's change the coloring scheme. VMD lets us color each frame a different color. Go to the representations window and make sure you highlight the Asp-Arg pair representation, then under the atom selection box click back on Draw Style (you may have left it on Trajectory). Pull down the Coloring Method and select Trajectory/Timestep. Each frame will now be drawn using a different shade, so you know that side chains with the same shade correspond to the same frame (the same NMR model).


It does indeed look like some of the models might have a close pairing of Asp and Arg, though the Arg in particular is very poorly defined in the NMR models.

Conclusions and suggestions for additional analysis

Congratulations on learning some basic protein structure analysis! It can be much easier to understand structure when you can rotate it and zoom in as compared to looking at a static snapshot in a research article. These visualization skills (and the observations you made for trp-cage) will be useful in Tutorial 1.

Make sure to look at the rest of the experimental paper, and spend a little time looking at the structure to make sure you understand any other features discussed in the paper (for example, didn't it say something about a Tyr side chain ring stacking on the Trp indole? And an unusual hydrogen bond for a glycine? Maybe you should look at those...).