What is the KoBaMIN server?

The KoBaMIN server is an internet service for very fast protein structure refinement. The refinement protocol is composed of two steps: (1) energy minimization using a knowledge-based potential of mean force and (2) stereochemistry correction. Users submit one or more protein structures, and in most cases KoBaMIN brings them closer to a native-like conformation.

How does it work?

Step 1. The user provides initial protein structure(s) for refinement and a reference structure (optional). The reference structure is used in calculation of CαRMSD, GDT-HA and GDT-TS. If no reference structure is provided, the initial structure is used for these calculations.
All structures are checked for the correct file type. If the check succeeds, the job is queued and the user receives a notification message, provided an email address was given. Otherwise, the user is shown a message detailing the error.

Step 2. Structural integrity checks are done using Biopython (Bio.PDB module), and incorrectly formatted PDB files are filtered out. We follow the Protein Data Bank convention for the PDB format. Insertion codes are handled as follows: only the first instance of every residue number is kept (e.g. of 55 and 55A, only 55 is kept; of 61A and 61B, only 61A is kept). Insertion codes are then removed from the PDB file.
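
The insertion-code rule above can be illustrated with a minimal sketch. This is not the server's code: the server uses Bio.PDB, whereas this example works directly on the fixed columns of ATOM/HETATM records (chain ID in column 22, residue number in columns 23-26, insertion code in column 27) and assumes the records are already well formed. The function name strip_insertion_codes is hypothetical.

```python
def strip_insertion_codes(pdb_lines):
    """Keep the first residue seen for each (chain, residue number)
    and blank out the insertion-code column of the atoms that remain.

    Assumes well-formed, fixed-column PDB ATOM/HETATM records.
    """
    first_icode = {}  # (chain, resSeq) -> insertion code of the first residue seen
    kept = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            kept.append(line)  # other record types pass through unchanged
            continue
        key = (line[21], line[22:26])  # chain ID, residue number
        icode = line[26]               # insertion code (blank if none)
        if first_icode.setdefault(key, icode) != icode:
            continue  # a later residue sharing this number, e.g. 55A after 55
        kept.append(line[:26] + " " + line[27:])
    return kept
```

Applied to a file containing residues 55, 55A, 61A and 61B, the sketch keeps the atoms of 55 and 61A and writes both without an insertion code.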

Step 3. Structures undergo refinement through energy minimization using a statistical knowledge-based potential of mean force (KB01). This is the major refinement process of the protocol.

Step 4. (optional) MESHI is used to produce stereochemically correct models. This step can be turned off by unchecking the box on the input page. MESHI focuses on four criteria:

  • Number of clashes, indicating van der Waals violations
  • Number of angle and bond outliers
  • Number of side-chain chi1 and chi2 outliers
  • Number of backbone angles that are outliers on the Ramachandran map

Figure 1. Schematic Representation of the KoBaMIN protocol.

Step 5. For each structure, CαRMSD, GDT-HA, and GDT-TS are calculated against the provided reference structure or, in its absence, against the initial structure. These values are purely informational. Although a structure is highly likely to move towards the native state, the protocol sometimes fails to achieve this. Please refer to the Proteins paper below for more details.
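
For readers unfamiliar with these measures, the following is a minimal sketch of how they are commonly defined, not the server's implementation. It assumes the two structures have the same number of residues and that their Cα coordinates have already been superimposed; full GDT programs additionally search over many superpositions to maximize each count, which this sketch does not do. The distance cutoffs are the standard ones (1, 2, 4, 8 Å for GDT-TS; 0.5, 1, 2, 4 Å for GDT-HA), and all function names are hypothetical.

```python
import math

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # Angstroms
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)  # Angstroms


def ca_distances(model, reference):
    """Per-residue distances between two lists of (x, y, z) C-alpha coordinates.

    Assumes the coordinates are already superimposed and in the same order.
    """
    if len(model) != len(reference):
        raise ValueError("structures must have the same number of residues")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def ca_rmsd(distances):
    """Root-mean-square of the per-residue distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt(distances, cutoffs):
    """Average, over the cutoffs, of the percentage of residues within each cutoff."""
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

With these definitions, gdt(distances, GDT_TS_CUTOFFS) and gdt(distances, GDT_HA_CUTOFFS) give scores on a 0 to 100 scale, where higher means closer to the reference, while a lower ca_rmsd(distances) means closer to the reference.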

Step 6. All data is organized, archived, and made available for download. If an email address was provided, the user is notified by email when the job completes. The results are deleted from our server seven days after job completion.

What is the output of the KoBaMIN server?

Upon completion of the job, the server generates a results page displaying all the submitted structures and their respective analysis values. An example results page is available here.

If an email address was provided on submission, the user is sent links to the results page described above and to two ZIP archives (an example is shown in the links below):

  • results.zip - Contains only the final refined and stereochemically optimized (if requested) structures
  • request.zip - Contains all data pertaining to the job. The archive is named after the job (jobname.zip).

The full data archive (jobname.zip) is structured in several folders:

  • Initial_Structures - Contains the structures that the user submitted which passed the structural integrity tests (Step 2 of the workflow).
  • Reference_Structure - Contains the reference structure if the user submitted one. If none is provided, this directory is absent.
  • Final_Structures - Link to the folder containing the final refined structures.
  • Refined_Structures - Contains folders named after the refinement steps.
    • 1_KB_on_Initial - Contains structures successfully refined by KB01 minimization
    • 2_MESHI_on_KB_Minimized - Contains structures successfully optimized by MESHI stereochemistry correction step (if requested)
  • Refinement_Analysis - Contains the results of the calculation step (Step 5 of the workflow).
    • koba_energies.txt - ENCAD KB01 Potential Energy (kcal/mol) for each KB01 minimized structure
    • koba_gdt.txt - GDT-TS and GDT-HA with respect to the reference or initial structure for each initial and final structure
    • koba_rms.txt - CαRMSD with respect to the reference and/or initial structure for each initial and final structure
    • koba_time.txt - Time spent during each refinement step (in seconds)
  • Log_Files - Contains individual log files of protein structure(s) for all steps in the refinement protocol and calculations.
    • koba.log - General log file for correct functioning of server workflow
    • calculations.log - General log file for structural analysis in Step 5 of the workflow
    • ENCAD_*.log - Individual log file for KB01 Refinement of each protein.
    • MESHI_*.log - Individual log file for MESHI stereochemistry optimization of each protein (if requested)
  • Problematic_Files - Contains all the structures that caused errors throughout the protocol.
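
After downloading the full archive, one may want to check which of the folders listed above it actually contains (for example, Reference_Structure is absent when no reference was submitted). The sketch below uses only Python's standard zipfile module. It is an illustration under assumptions: it does not know whether the folders sit at the archive root or under a job-named directory, so it matches the folder name anywhere in a member path, and Final_Structures, described above as a link, may not appear as a regular folder. The function names are hypothetical.

```python
import zipfile

# Folder names as listed in the archive description above.
DOCUMENTED_FOLDERS = (
    "Initial_Structures",
    "Reference_Structure",
    "Final_Structures",
    "Refined_Structures",
    "Refinement_Analysis",
    "Log_Files",
    "Problematic_Files",
)


def folders_present(member_names):
    """Map each documented folder to True if any member path contains it."""
    seen = set()
    for name in member_names:
        seen.update(part for part in name.split("/") if part)
    return {folder: folder in seen for folder in DOCUMENTED_FOLDERS}


def inspect_archive(path_or_file):
    """Open a ZIP archive (path or file-like object) and report its folders."""
    with zipfile.ZipFile(path_or_file) as archive:
        return folders_present(archive.namelist())
```

Calling inspect_archive("jobname.zip") on a downloaded archive returns a dictionary that shows at a glance whether, for instance, Problematic_Files or Reference_Structure is present.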

How to cite the KoBaMIN server?

Further questions or suggestions?

Please email us at choprait [.at.] purdue [.dot.] edu