Assignment 6: Use of docking methods for drug design

Sub part 1: Docking FDA approved compounds to a target


The objective of this part of the assignment is to teach you how to submit CANDOCK jobs. These jobs find the interactions between a library of small molecules and a receptor (or target).


Step 01: Setup CANDOCK

Since CANDOCK's location is non-standard, you will need to tell the computer where to find the executables for CANDOCK. To do so, use the following command:

export PATH=/depot/gchopra-class/apps/v0.6.0_chm579/modules:$PATH

You can add this line to the ~/.bashrc file so that you do not have to run it every time you log into the computer.

Step 02: Obtain a receptor

For this assignment, we will use the kinase FLT3 as our target. We first need to download this structure and place it in a new folder located in your $HOME directory. Rename the file you downloaded to receptor.pdb. If you have completed the previous step, use the following command to download the structure for the PDB ID 4XUF. Note that we will use this structure as the example in later steps.

4xuf > 4xuf.pdb

4XUF already has a ligand in it (RCSB will tell you this), and you must remove it. Each ligand is identified by a three-character alphanumeric code; in our case it is P30. You can remove it using a text editor, or with this command, which also keeps only the lines containing ' A ' (chain A):

4xuf | grep -v P30 | grep ' A ' > receptor.pdb
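The same clean-up can be sketched in Python for cases where a plain text match is too loose (for example, when the ligand code also appears elsewhere on a line). This is only an illustration, not part of CANDOCK: the function name `strip_ligand_keep_chain` is hypothetical, and the sketch assumes a standard fixed-column PDB file, where the residue name sits in columns 18-20 and the chain ID in column 22. Unlike the grep filter, it keeps only ATOM and HETATM records.

```python
def strip_ligand_keep_chain(pdb_lines, ligand_code="P30", chain_id="A"):
    """Return the ATOM/HETATM lines of one chain, without the named ligand.

    Hypothetical helper: it relies on the fixed-column PDB layout instead of
    matching text anywhere on the line.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue  # header, TER, CONECT, etc. are dropped in this sketch
        res_name = line[17:20].strip()  # PDB columns 18-20
        chain = line[21:22]             # PDB column 22
        if res_name == ligand_code or chain != chain_id:
            continue
        kept.append(line)
    return kept
```

Waters and cofactors in the chosen chain are kept, since only the residue name given as `ligand_code` is removed.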

Note: CANDOCK is able to dock with crystal waters and cofactors (such as metal ions, NAD, etc.). Thus, these do NOT need to be removed.

Also note: If you plan on using CANDOCK for your final project, pick a target that may be implicated in a disease pathway that you are interested in studying, or one that comes from a previous assignment for this course. Once this target is selected, search for it and find a solved structure for this target (make sure you select a structure with the appropriate domains).

Step 03: Determine the binding site

In order to dock compounds to a target, one must specify a binding site on that target. Fortunately, CANDOCK is able to predict binding sites de novo if the user gives none. Binding sites are stored as a text file, typically named receptor/site.cen (since we called our target receptor); that is, a file called site.cen placed in a directory named after the target.

If you wish for CANDOCK to predict a binding site for you, use the following command, but note that running it yourself is not needed, as this step will be run automatically.

find_centroids

This will create a SLURM job and submit it with sbatch automatically. If this step fails (no receptor/site.cen file is present once the job finishes), you must give a binding site yourself. Contact the instructor if this is the case.

If you are using CANDOCK for your final project, then the next section may be useful to you. Otherwise, continue to Step 04.

If you know of, or are interested in, a particular binding site for your target, create a text file named 'receptor/site.cen' with the following contents. Each line represents a sphere centered at (x,y,z) with radius r. The number in the left-hand column should always be 1 unless you are specifying more than one binding site. The file should look like the following example:

1 x1 y1 z1 r1

1 x2 y2 z2 r2

1 x3 y3 z3 r3
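If a co-crystallized ligand such as P30 marks the pocket you care about, one possible way to obtain a sphere is to center it on that ligand in the original (uncleaned) structure file. The sketch below is an illustration under stated assumptions, not CANDOCK's own method: the function names `ligand_sphere` and `format_site_cen`, the 4 Å padding, and the three-decimal number formatting are all hypothetical choices, and the input is assumed to be a standard fixed-column PDB file.

```python
import math


def ligand_sphere(pdb_lines, ligand_code, padding=4.0):
    """Return (x, y, z, r) for a sphere around the HETATM atoms of one ligand.

    The center is the plain average of the atom coordinates; the radius is the
    largest center-to-atom distance plus `padding` (an assumed margin).
    """
    coords = []
    for line in pdb_lines:
        if line.startswith("HETATM") and line[17:20].strip() == ligand_code:
            # x, y, z occupy PDB columns 31-38, 39-46 and 47-54
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"no HETATM records found for ligand {ligand_code!r}")
    n = len(coords)
    center = tuple(sum(c[i] for c in coords) / n for i in range(3))
    radius = max(math.dist(center, c) for c in coords) + padding
    return (*center, radius)


def format_site_cen(spheres, site_id=1):
    """Format spheres as 'id x y z r' lines, one sphere per line."""
    return "".join(
        f"{site_id} {x:.3f} {y:.3f} {z:.3f} {r:.3f}\n" for x, y, z, r in spheres
    )
```

The returned text can then be written to receptor/site.cen; check the result in PyMOL before docking, since a sphere that is too small or too large changes what is searched.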


If you do not know the binding site, do not create any new files. CANDOCK will attempt to determine the binding site automatically and save the result to receptor/site.cen.

Step 04: Prepare the ligands

Copy the following file to the directory that contains receptor.pdb:

cp /depot/gchopra-class/data/lab_7_compounds.mol2 ligands.mol2

When the command completes, you should have the following new files: prepared_ligands.pdb, seeds.txt, and seeds.pdb. You can open seeds.pdb in PyMOL to see how CANDOCK fragmented your compounds of interest. If you did this step, skip to Step 05.

If you are using CANDOCK for your final project, the next section may be useful to you. Otherwise, skip to Step 05.

Note: submitting your own compounds

If you would like to use your own ligands (possibly ones that you find interesting), you can submit your files in the Tripos mol2 format. An online tool exists to convert most formats to mol2. Place your ligands in a file named ligands.mol2 in the same directory as receptor.pdb. Let's use P30 from 4XUF as an example. Search for the ligand P30 and open its ligand page. Use the button in the top right corner to download the Ideal SDF file. Submit this file to the conversion web server and download the result. Copy the result as ligands.mol2 into the directory that contains receptor.pdb. Ensure that you do NOT have a prepared_ligands.pdb file or a seeds.txt file in this directory! Finally, run the following command (note that this assumes you are docking a small number of ligands):

Step 05: Dock the ligand fragments

Now that you have a receptor (receptor.pdb), a binding site (receptor/site.cen, which will exist only if you ran find_centroids or provided your own), and ligands to dock (prepared_ligands.pdb), you can start docking! We're going to do this in stages so that you understand how the process works.
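Before submitting a docking job, it can save a failed queue slot to confirm that the inputs named so far are actually in place. The following is a minimal sketch of such a check; the helper name `missing_inputs` is hypothetical, and the file list is taken from the steps above. The binding site file receptor/site.cen is deliberately not required, since CANDOCK can generate it.

```python
from pathlib import Path

# Files that the earlier steps are expected to have produced.
REQUIRED_FILES = ("receptor.pdb", "prepared_ligands.pdb", "seeds.txt", "seeds.pdb")


def missing_inputs(workdir, required=REQUIRED_FILES):
    """Return the names of required files that are absent from `workdir`."""
    workdir = Path(workdir)
    return [name for name in required if not (workdir / name).is_file()]
```

Calling `missing_inputs(".")` from the docking directory returns an empty list when everything is ready, and otherwise lists what still needs to be prepared.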

To dock the fragments, use the following command. As with binding site determination, you do not need to run this explicitly, as it will be done automatically in Step 06.

dock_fragments

(These are notes for troubleshooting and do not need to be done! Feel free to move on to Step 06.)

You will get a new directory called receptor/top_seeds/ which contains the results of fragment docking performed on the seeds in seeds.pdb. You can open the files in this directory using PyMOL to see how CANDOCK placed these fragments in your binding pocket.

Step 06: Link the fragments

Once you have the fragments docked, you can complete the docking procedure by linking the fragments together. Before you begin this step, make sure that the directory top_seeds exists in the directory that you are doing docking in. Also, check the file slurm-###### for any errors.

Technically speaking, this is the only command you need to submit a CANDOCK job, as it will run any required previous steps if needed. This is the final step in running CANDOCK and will produce a directory called receptor/docked that will contain the docked conformations of the ligands.

link_fragments

Note that for the sake of time, CANDOCK has been configured for this assignment in a way that it will generate fewer conformations than normal. If you are using CANDOCK for your final project, you may want to configure it to generate more. Ask an instructor if this is the case.

Step 07: Look at the results

After the linking step has completed, there will be a directory called receptor/docked in the directory that you submitted jobs from. This directory contains one file for each compound that you docked. These files contain CANDOCK's predicted conformations and their corresponding scores. Although these are standard PDB files, PyMOL has issues opening them directly. Thus, it is recommended to follow these steps instead:

cd receptor/

You will get two new files in this directory: scores.csv, a comma-separated file (which can be opened in Excel) with CANDOCK's predicted scores, and all.pse, a PyMOL session. The compounds in this session are ordered by decreasing binding affinity. If a compound is missing from either file, CANDOCK has predicted that this compound is a non-binder.
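For the writeup it can be handy to rank the compounds outside of Excel. The sketch below reads CSV text and sorts the compounds by score. It is an illustration only: the function name `rank_compounds` is hypothetical, the column names are passed in because the actual headers of scores.csv are not given here, and whether a lower score means tighter binding is treated as an assumption you can switch with `lower_is_better`.

```python
import csv
import io


def rank_compounds(csv_text, name_column, score_column, lower_is_better=True):
    """Return (name, score) pairs sorted from best to worst.

    `name_column` and `score_column` must match the header row of the file;
    check scores.csv for the real column names before using this.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    scored = [(row[name_column], float(row[score_column])) for row in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=not lower_is_better)
```

A call would look like `rank_compounds(open("scores.csv").read(), "compound", "score")`, where "compound" and "score" are placeholder column names.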

Sub part 2: Design of new compounds


This assignment will teach you how to use CANDOCK to make changes to existing binders (lead.mol2) for a given target using a collection of chemical fragments (additional.mol2).

Lead optimization

Against a single target

A recent paper published by the pharmaceutical company Hoffmann-La Roche (citation at the end of this section) describes an effort to optimize a DPP-IV inhibitor. The goal was to optimize a lead by replacing an R group with various other fragments. Two files, lead.mol2 and additional.mol2, have been prepared with this lead and the additional fragments, respectively. Download them and place them in an empty directory. Next, download the protein structure mentioned in the paper (3OC0) and place it into the subdirectory targets/ (remember to clean it up first!). Then create the following script and run it.

cp /depot/gchopra-class/data/additional.mol2 .
cp /depot/gchopra-class/data/lead.mol2 .

mkdir targets
cp /depot/gchopra-class/data/target.pdb targets/

export CANDOCK_ligand=lead.mol2
export CANDOCK_fragment_mol=additional.mol2
export CANDOCK_target_dir=targets
export CANDOCK_lipinski_nhs=5
export CANDOCK_lipinski_ohs=5

design_ligands

You will get the files designed_1.mol2 and designed2.mol2. Do they match the compounds suggested by the authors?

Kuhn, B. et al. A Real-World Perspective on Molecular Design. Journal of Medicinal Chemistry 2016, 59 (9), 4087-4102. DOI: 10.1021/acs.jmedchem.5b01875

Against multiple targets

The procedure for profile-based design is similar to the one for single targets; the only difference is that one places more targets in the targets/ directory. To give CANDOCK targets that you wish to decrease binding affinity to, create an additional folder called atargets. Create a new directory from scratch (i.e., separate from any previous jobs) and run the following commands:

cp /depot/gchopra-class/data/additional.mol2 .
cp /depot/gchopra-class/data/lead.mol2 .

mkdir targets
cp /depot/gchopra-class/data/target.pdb targets/

mkdir atargets
cp /depot/gchopra-class/data/antitarget.pdb atargets/

export CANDOCK_ligand=lead.mol2
export CANDOCK_fragment_mol=additional.mol2

export CANDOCK_target_dir=targets
export CANDOCK_antitarget_dir=atargets
export CANDOCK_lipinski_nhs=5
export CANDOCK_lipinski_ohs=5
export CANDOCK_seeds_to_avoid=2

design_ligands
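The single-target and multi-target runs differ only in a few environment variables, which becomes easier to see when the settings are built in one place. The sketch below collects them in a dictionary and hands it to the design_ligands command. The variable names and values are the ones used in the scripts above; the function names `design_environment` and `run_design` are hypothetical, and the sketch assumes design_ligands is on your PATH as set up in Step 01.

```python
import os
import subprocess


def design_environment(ligand, fragments, target_dir, antitarget_dir=None,
                       lipinski_nhs=5, lipinski_ohs=5, seeds_to_avoid=None,
                       base_env=None):
    """Build the environment for a design run from the CANDOCK_* settings."""
    env = dict(os.environ if base_env is None else base_env)
    env["CANDOCK_ligand"] = ligand
    env["CANDOCK_fragment_mol"] = fragments
    env["CANDOCK_target_dir"] = target_dir
    env["CANDOCK_lipinski_nhs"] = str(lipinski_nhs)
    env["CANDOCK_lipinski_ohs"] = str(lipinski_ohs)
    # The next two settings are only used for the multi-target case.
    if antitarget_dir is not None:
        env["CANDOCK_antitarget_dir"] = antitarget_dir
    if seeds_to_avoid is not None:
        env["CANDOCK_seeds_to_avoid"] = str(seeds_to_avoid)
    return env


def run_design(env):
    """Launch design_ligands with the given environment (must be on PATH)."""
    return subprocess.run(["design_ligands"], env=env, check=True)
```

For the antitarget run this would be `run_design(design_environment("lead.mol2", "additional.mol2", "targets", antitarget_dir="atargets", seeds_to_avoid=2))`; leaving out the last two arguments reproduces the single-target settings.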

Design of new scaffolds

Note: you can ignore this section of the assignment. It is presented only in the hope that it may help you with your final project. It is not required for this lab.

CANDOCK has the ability to design new compounds from the seeds database generated in Step 05. Create a new directory (separate from the one you did docking in) and copy the files seeds.pdb and seeds.txt into this new directory. Create a subdirectory called targets in this location and copy the receptor directory from your previous run into this subdirectory along with receptor.pdb. The directory structure should look like the following:

$ ls

seeds.txt seeds.pdb targets/

$ ls targets/

receptor.pdb receptor/

$ ls targets/receptor/

top_seeds/ site.cen gridpdb_hcp.pdb

export CANDOCK_target_dir=targets

design_ligands

Congrats! Now you're creating new designs! When the job completes, you will get files with names like designed#.pdb. These files contain the designs generated by the program. The docked versions of these compounds are in targets/receptor/docked/. You can use the extract script to get the scores of these compounds.




Sub part 1

  1. For your writeup, you should include images of the docked poses and the docking scores of the compounds. Which drugs bind best to your protein of choice? Did you expect this? Why or why not? Note that the three-letter codes (P30, KIM, etc.) are all kinase inhibitors!
  2. Include images of the interactions between some of the high scoring molecules and the low scoring ones. What interactions do you believe helped make the higher scoring compounds bind better than the low scoring ones?
  3. Open all.pse along with the original 4XUF file. How well did CANDOCK predict the binding pose of P30? Using the arrows in PyMOL's window, check out the multiple poses generated by CANDOCK. Report the CANDOCK scores of the best-predicted pose and of the top-ranked pose of the P30 molecule.
  4. Part of CANDOCK's linking procedure includes an energy minimization procedure. Did your target's conformation change while docking? If so, by how much?
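Questions 3 and 4 both ask how far a set of atoms moved, and a common way to put a number on that is the root-mean-square deviation (RMSD) between two sets of coordinates. The sketch below is one minimal way to compute it, under stated assumptions: both structures list the same atoms in the same order, both share the same coordinate frame (no superposition is performed), and the files use the standard fixed-column PDB layout. The function names `atom_coords` and `rmsd` are hypothetical.

```python
import math


def atom_coords(pdb_lines, records=("ATOM",)):
    """Extract (x, y, z) tuples from the chosen PDB record types."""
    coords = []
    for line in pdb_lines:
        if line[:6].strip() in records:
            # x, y, z occupy PDB columns 31-38, 39-46 and 47-54
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For the receptor, pass `records=("ATOM",)` for both the input and the minimized structure; for P30, pass `records=("HETATM",)` and make sure only the ligand atoms are present in each list.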

Sub part 2

  1. Examine the designs created by CANDOCK. Do they match what is listed in the paper?
  2. How does the addition of the antitarget change the designs created?
  3. Include images of some of the designed ligands and discuss their interactions at the binding site.