MOLECULAR DYNAMICS SIMULATION OF WATER IN GROMACS

In this lab you will learn the basics of using the GROningen MAchine for Chemical Simulations (GROMACS) package. The first system to study is liquid water by using a simple SPC/E model. You will learn how to obtain thermodynamic observables, such as the potential energy, as well as structural and dynamic properties of the system. This is arguably the most important lab in the course. Much of the information needed to complete this lab is available in the GROMACS online manual. Refer to the manual first, before asking technical questions.

First, take some time to familiarize yourself with the GROMACS manual

You may also find useful information in some of the GROMACS tutorials

Scholar’s User Guide may also be convenient:

GROMACS unit system:

Length: nm = 10 -9 m = 10 Å
Mass: u = 1.6605402×10 -27 kg (1/12 the mass of a 12C atom)
Time ps = 10 -12 s = 1000 fs
Charge: e = 1.60217733×10 -19 C (electron charge)
Temperature: K
Energy: kJ mol -1
Force: kJ mol -1nm -1
Pressure: kJ mol -1nm -3 = 16.6054 bar
Velocity: nm ps -1 = 1000 m/s
Dipole Moment: e nm
Electric Potential: kJ mol -1e -1
Electric Field: kJ mol -1nm -1e -1

Lab Procedure:

The first part of this procedure will serve as an introduction to the use of GROMACS. On the second part you will carry out your first Molecular Dynamics simulation using GROMACS!

Step 1

From the /scratch/carter/g/gchopra/class/CHM579 directory on Scholar copy lab2b/ to your current directory located in your home directory.

To copy a folder and all its contents: cp -r /scratch/carter/g/gchopra/class/CHM579/lab4a $PWD

To print the current working directory: pwd

Step 2:

Before running any GROMACS commands, you need to load the GROMACS module on the node that you are using. By loading a module you will add paths to GROMACS into your environment settings. In other words, as GROMACS is not a conventional program that you use every day, you need to tell the supercomputer that you want to run GROMACS, so that it will prepare all the things it needs to do it.

To load the GROMACS module: module load gromacs

IMPORTANT: You can print on the terminal all the programs, or modules, that are available in the supercomputer by introducing the module avail command. You can verify if you did loaded the module by using the module list command.

Step 3:

Similarly to the Monte Carlo program of Assignment 3, GROMACS uses an input file (.tpr). However, the input file (.tpr) does not exist and has to be generated. In order to prepare GROMACS’ input file (.tpr) you will need to execute the grompp command, which needs three files to run: the topology file (.top), molecular structure file (.gro), and the parameters file (.mdp).

IMPORTANT: It is strongly recommended that you go over the details of each of these files on the GROMACS manual webpage.

For this lab, the topology file (.top) and molecular structure file (.gro) are provided so you do not need to worry about them for now. Nevertheless, you will need to modify the parameters file (.mdp) at various steps of this procedure. IMPORTANT: On the GROMACS manual webpage you will find a thorough explanation of the parameters file format and of its general use. Refer to: MDP

Take a look at these files. They are located at the ~/CHM579/lab2b/ directory: topol.top, conf.gro, and grompp.mdp. It is suggested that you use either a text editor, such as vi or nano, or just the more command. Find explanations of all the options you see in your parameters file (.mdp). Understanding these options is the key to realize what this lab is all about.

Step 4:

Now that we understand what the simulation is supposed to do, let us prepare the input file by executing the gmx grompp command.

To generate the input file: gmx_mpi_d grompp -f grompp.mdp -c conf.gro -p topol.top -o step04.tpr

After the command is executed several lines of information will come up on the terminal, a funny quote should be the last line. Take a look at the notes, they always provide interesting information. More importantly, if the execution was successful, two files will be produced: step04.tpr and mdout.mdp. The latter contains a summary of parameters of the simulation you prepared.

Do not try to open the input file with a standard visualizer or text editor, it is a binary file.

IMPORTANT: The structure of the previous command starts with gmx, followed by the name of the command, ( grompp in this case), followed by a series parameters of execution, or options, which are denoted by the hyphen symbol ( -). These options must be specified in case your files have non-default names. For example, the parameter -f means that an explicit parameters file (.mdp) will be used, -c indicates the name of the molecular structure file (.gro), -p is to use a specific topology file (.top), and -o is to give a name to the input file that is going to be generated. There are other options that may be useful for you in this lab, check the manual.

Step 5:

Let us take a look at input file (.tpr). It is a binary, which means that you will not be able to use neither your text editor nor the more command. Instead, you can check what is inside using the gmx_mpi_d dump command:

To check the input file: gmx_mpi_d dump -s step04.tpr | less

IMPORTANT: The pipe “|” character is usually typed using Shift + ∖ . You can quit the less command by hitting 'q'.

Step 6:

In order to run the simulation you just prepared in the input file (.tpr) we will use the mdrun command.

This will submit a GROMACS Molecular Dynamics simulation as a front-end application, meaning that we are going to carry out the simulation on the same front-end node that is in charge of handling all the instructions we input and projecting the terminal on our screen. This may seem trivial, but think of how many people is doing similar things on the same front-end node at the same time. That is why, we will not be allowed to run any program or script as a front-end application for more than 30 minutes. We will learn how to run longer GROMACS jobs in a while, for now:

To run a Molecular Dynamics simulation: gmx_mpi_d mdrun -s step04.tpr (Be patient!)

After the completion of the command, you will find several new files.

confout.gro New molecular structure file: A text file with the final structural configuration of the system.
ener.edr Energy file: It is a binary. Used by the gmx energy command.
md.log Log file: A log file that includes information about the simulation.
state.cpt Checkpoint file: It is a binary. to restart the simulation from the point it was stopped.
traj_comp.xtc Trajectory file: It is a binary. Used by many gmx commands, e.g., gmx msd.

When the simulation finishes, the new configuration information –basically coordinates and velocities– is saved to the confout.gro file. The default name confout.gro will be used unless you specified another name via the -c option when executing gmx_mpi_d mdrun.

Step 7:

Now we are going to use some of the GROMACS utilities for trajectory analysis. The gmx energy command analyzes the ener.edr file and can provide a lot of useful information.

To run the gmx energy command: gmx_mpi_d energy -f ener.edr

IMPORTANT: When the gmx_mpi_d energy command is executed a series of options will come up on the terminal, on the prompt type “4” (potential), and press Enter twice. The results will be printed and saved into an energy.xvg file.

Take a look at the new energy.xvg file using the text editor or the less command. This (.xvg) file has the proper format to be plotted with the plotting program Grace.

QtGrace is a native Grace for Windows, Linux, and Mac OS. This is a good option if you plan on using your personal computer.

To plot the energy.xvg file, download the file using BitVise or Filezilla and open it in QtGrace. You can use this program to export images of the plots. You maybe also interested in using RStudio to make the plots if you have a strong computer science background

Step 8:

In GROMACS, the diffusion coefficient can be obtained from mean-square displacements using the Einstein relationship. The gmx msd is the tool for that. When executed it takes information from both, the trajectory file and the input:

To run the gmx msd command: gmx_mpi_d msd -f traj_comp.xtc -s step04.tpr

IMPORTANT: Again, a series of options will come up on the terminal. Type either “0” system or “1” water on prompt. The results will be saved in the msd.xvg file. This has the same Grace format and you can plot it just like you plotted the energy.xvg file.

Step 9:

Using gmx_mpi_d rdf you can obtain pair correlations functions, or radial distribution functions (RDF), for the different types of atoms present in your system, i.e O···O, H···H, and O···H. In order to use gmx_mpi_d rdf, you need to create an index file with groups first:

To create a new index file with groups: gmx_mpi_d make_ndx -f conf.gro

IMPORTANT: When prompted type “a OW”, then “a HW*”, and then “q” to exit. The “*” character represents a wild card matching all characters, in this case 1 and 2 (there are two hydrogen atoms in water!). The information is saved in the index.ndx file.

After successful execution a new file index.ndx will be produced. Explore the information inside with a visualizer or a text editor. Sometimes it is convenient save previous index files. If you have already made an index file and wish to read in previous groups:

To use an existing index file with groups: gmx_mpi_d make_ndx -f conf.gro -n index.ndx

Now you can use this information to build radial distribution functions:

To run the gmx msd command with an existing index file(Whenever you see the ${·} notation in the instructions, it means that you have to replace it by inputting characters of your own!): gmx_mpi_d rdf -n index.ndx -o RDF_${atoms} -f traj_comp.xtc -s step04.tpr

IMPORTANT: Specify different RDF_${atoms} for O···O, O···H, and H···H RDF. On the prompt, pick the groups between which RDF will be calculated. For instance, “4” and “4” will prepare an H···H RDF, which should be named RDF_HH.xvg.

Plot the RDF for H···H, O···O, and O···H. Make sure you understand them.

Step 10:

It is possible to extend or continue a simulation that has completed, and even those that have crashed. This is a good technique for the handling of longer simulations reducing the time lost due to crashes of the computer being utilized. It is only possible to continue seamlessly from the point at which the last full coordinates and velocities are available for the system.

Now you will learn how to extend a simulation, with the exact same conditions and parameters, by restarting it from a previous point in the trajectory.

IMPORTANT: It is strongly recommended that you go over the details of this procedure on the GROMACS manual webpage.

WARNING: When following any of these directions, be careful to not overwrite files when executing mdrun for the second time. Make sure you have a consistent method to rename, save, and organize backup files.

Let us go over a bit of the theory first. There are several ways to prepare equilibrated configurations consistent with a given

temperature. One way is to run two subsequent Molecular Dynamics simulations at the same temperature and to use the first simulation for equilibration purposes only (no data analysis), and the second for data analysis (production run).

In order to determine if the system is equilibrated, check the average temperature (and pressure, if appropriate) using gmx_mpi_d energy as described above to make sure the system has the desired values.

A simulation that has completed can be extended using the commands convert-traj and mdrun, and checkpoint files (.cpt). First, the number of steps or time has to be changed in the (.tpr) file, then the simulation is continued from the last checkpoint with mdrun. This will produce a simulation that will be the same as if a continuous run was made.

To prepare an extended simulation: gmx_mpi_d convert-tpr -s step04.tpr -extend 200 -o step10.tpr

Here, step04.tpr should be the name of the (.tpr) file of the completed run. 200 should be the time of the new simulation in picoseconds (ps) and step10.tpr is the new file for the second Molecular Dynamics simulation.

To run the extended simulation: srun -I gmx_mpi_d mdrun -s step10.tpr -cpi state.cpt

IMPORTANT: The second simulation will start as a continuation of the first one. Be careful to specify these new files in analysis commands, using -f, -s, etc. options.

Step 11:

If you desire to make some changes to your run parameters, such as changing ensembles or temperatures, you will need to start by making a new parameters file (.mdp). Then you will need to rerun the grompp command.

IMPORTANT the gen_vel keyword in the new parameters file (.mdp) should be set to no in all consecutive calculations as yes resets velocities and destroys equilibrated trajectory.

WARNING: The following commands may overwrite your files; make sure you have a backup.

To create a new parameters file: gmx_mpi_d grompp -f ${new}.mdp -c ${old}.tpr -o ${new}.tpr -t ${old}.cpt

Files named ${old} correspond to those from the finished run and those named ${new} correspond to those named for the new run.

To execute mdrun with the new input file (.tpr) file: gmx_mpi_d mdrun -s ${new}.tpr

Now you have learned all the elements you need to run your own simulations.

Discussion

(A) Temperature-dependent dynamics

Perform Molecular Dynamics simulations, in the canonical ensemble (NVT), fixing the temperature at 300 K, 280 K, and 260 K. You need to be sure that the simulations start from a conformation consistent with the chosen temperature. The production run must be of at least 200 ps. Use the Nosé-Hoover thermostat to keep the temperature fixed. Calculate the average potential energy, diffusion coefficient, and pair correlation functions for the O···O, H···H, and O···H pairs.

  1. Present your results using tables and figures.
  2. At 260 K, the temperature is lower than the freezing point for water. Do you observe formation of ice? Explain what happens with the system as you lower the temperature.
  3. Compare the value you get for the diffusion coefficient at 300 K with the value reported by Mahoney and Jorgensen (J. Chem. Phys. 114, p 363, 2001).
  4. A more complete description of water molecules is availible here.
  5. Are the conditions of your simulation similar to those of Mahoney and Jorgensen?
(B) Diffusion experiments

Prepare an input file (grompp.mdp) to simulate water at 25 °C and 1 atm, in an isothermic-isobaric ensemble (NPT). For that you need to use pressure coupling.

You can do that by adding the following lines to your grompp.mdp file:

    pcoupl          =   Parrinello-Rahman
    pcoupltype      =   isotropic
    tau_p           =   1.0
    compressibility =   4.5e-5
    ref_p           =   1.0
    

Figure out the meaning of each of those lines. You can use the online manual.

Additionally, you may need to adjust the interactions cut-off of your system. Figure out how to do it.

IMPORTANT: The cut-offs of your simulation must be shorter than half the length of your simulation box! Why?

  1. Calculate the diffusion coefficient
  2. Compare with the published value.
  3. Why are you using Molecular Dynamics instead of Metropolis Monte Carlo?

For your lab report explain the methodology you followed to perform your simulations. Include all your tables and figures with their corresponding discussion. Answer the questions providing your reasoning. Do not forget your concluding remarks.