RAPTOR Frequently Asked Questions

1. Question: How much memory and storage space does RAPTOR need? How long does it take to install?

Answer: Threading large proteins may need a lot of memory, preferably 1 GB or more. A large cache can also speed up computation significantly. The installation needs 2.2 GB of disk space and may take up to 30 minutes, depending on your hardware setup.


2. Question: How does RAPTOR compare to MODELLER/MODELER?

Answer: They are complementary products rather than replacements for each other. In short, they differ in the scale of their tasks. RAPTOR weeds out large numbers of unlikely proteins and identifies the templates in the database that are closest in 3D shape to the query sequence. It is good at picking up similarities that might go undetected by other methods. MODELLER/MODELER (or another comparative modelling tool) specializes in building a model by "mapping" the query onto one template while satisfying spatial constraints. MODELLER/MODELER needs to know which template to build the model on, which is precisely what RAPTOR provides.

MODELLER/MODELER is useful when the purpose of running RAPTOR is to get an accurate model, based on templates from threading, for possible structural analysis later. If, for example, only the query-template alignments are of interest, the MODELLER/MODELER step can be skipped.

RAPTOR writes its output in XML format. The module modellerProspect in the suite generates an alignment file from the XML output, extracts PDB atom coordinates from the template, and prepares a script runnable by MODELLER/MODELER. All three (alignment, PDB atom file, script) are required inputs to MODELLER/MODELER.


3. Question: I noticed that the NR database that comes with BLAST is smaller than the one I originally installed on my machine. Which one should I use?

Answer: Due to the installer's size constraint, the version of the NR database (required by PSI-BLAST) included in RAPTOR is not the most current one. You are encouraged to download the latest release from NCBI's FTP site:

ftp://ftp.ncbi.nih.gov/blast/db/

Please put your copy under $RAPTOR_HOME/data/nr/ and name it nr.db. Don't forget to format it if it is the unformatted version. Consult $RAPTOR_HOME/blast/README.formatdb.
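
As a quick sanity check after copying the file, a small script can confirm that something is present at the expected location. This is only a sketch: the function names are made up for illustration, the path is the one given above, and the check does not tell you whether the database has been formatted.

```python
import os
from pathlib import Path


def expected_nr_path(raptor_home):
    """Return the NR database location described in this FAQ."""
    return Path(raptor_home) / "data" / "nr" / "nr.db"


def check_nr_database(raptor_home):
    """Report whether a file exists at the expected NR location.

    This does not verify that the database has been formatted.
    """
    path = expected_nr_path(raptor_home)
    if path.is_file():
        print(f"Found NR database: {path} ({path.stat().st_size} bytes)")
        return True
    print(f"No NR database at {path}")
    return False


if __name__ == "__main__":
    home = os.environ.get("RAPTOR_HOME")
    if home is None:
        print("RAPTOR_HOME is not set")
    else:
        check_nr_database(home)
```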


4. Question: When we invoke SS_PRED in the shell, a PSI-BLAST calculation is started first, and then SS_PRED itself runs, creating a .ss file in the desired location. Does the SS_PRED result depend on the PSI-BLAST analysis, or can SS_PRED be used alone, without a previous PSI-BLAST calculation or anything similar? Where can I find out more about the SS_PRED procedure?

Answer: SS_PRED is a wrapper for ss_predictor (in $RAPTOR_HOME/bin). It forces a PSI-BLAST calculation because the prediction is much more accurate with PSI-BLAST profile data than with the query sequence alone as input. ss_predictor is an in-house secondary structure predictor, as in PROSPECT. It accepts either profile data from PSI-BLAST or a sequence as input. See its usage information for details.

Again, PSI-BLAST is run because it produces better results. If you find it too slow (which I understand), let me know and we may add an option to IPThread.conf that allows PSI-BLAST to be turned off.

Alternatively, you can try PSIPRED in /bin, which is a wrapper for the well-established secondary structure predictor PSIPRED. We cannot include PSIPRED itself in RAPTOR due to license restrictions, but it is free for academic users. That said, our ss_predictor is comparable in performance to PSIPRED.


5. Question: What is the difference between the template types (-t argument: fssp/xml)? Are there other possibilities?

Answer: The template type indicates whether a template was generated from the FSSP database or the SCOP database. fssp/xml actually refer to the file suffixes (.fssp or .xml). The "xml" type exists only for historical reasons, as templates end with .xml in PROSPECT. Different "template types" have the same format but represent different categorization schemes (i.e., the database type) on which they were based. The templates we supply with RAPTOR are all FSSP-based (in /data/templates) and cover the majority of the SCOP-based ones, so you don't need SCOP templates.

We generate the templates ourselves, as building a large collection of them takes considerable effort. We provide new templates as periodic product updates.


6. Question: What format should the weight file follow? (-w argument)

Answer: Weight files are in /WEIGHT and specify how much each component of the energy function contributes to the final score. These files are set in /data/param/IPThread.conf (look for PWWeightFile and NPWeightFile). Note that the weights have been trained and tested for optimal results, so for now using the values in the default files is fine.
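
To illustrate what "how much each component contributes" means, here is a minimal sketch of a weighted sum. It assumes the final score is a plain linear combination of the components; the component names and numbers are hypothetical, and this is not RAPTOR's actual weight-file format or energy function.

```python
from typing import Mapping


def weighted_score(components: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Combine energy components into one score as a weighted sum."""
    missing = set(components) - set(weights)
    if missing:
        raise KeyError(f"no weight for: {sorted(missing)}")
    return sum(weights[name] * value for name, value in components.items())


# Hypothetical component names and values, for illustration only.
components = {"term_a": -12.0, "term_b": -3.5, "term_c": 4.0}
weights = {"term_a": 1.0, "term_b": 0.5, "term_c": 2.0}

score = weighted_score(components, weights)
print(score)
```

Raising the weight of one term makes that term count for more in the final score, which is why trained defaults are usually best left alone.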


7. Question: What do the -l and -r arguments mean? Are they related to the terms "templateName" and "sequenceName"? What exactly do templateName and sequenceName mean? Is each a list, a single file, or a name?

Answer: -l: This option is for weight factor training purposes only, so you don't need to change it. For your information: if the IP (integer programming) method is used, this option generates only the linear solution rather than the integral solution. What is a linear solution? When solving an integer program, we always first relax it to a linear program. The linear program is solved, and its solution is called the linear solution. A branch-and-bound method is then used to convert the linear solution into an integral solution (the solution of the integer program). If the -l option is used, the branch-and-bound process is not applied, so the solution is just the linear one.
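
The difference between the linear and the integral solution can be seen on a toy problem. The sketch below uses a small 0/1 knapsack, not RAPTOR's threading formulation, and all function names are made up for illustration. The relaxation lets each variable take any value between 0 and 1; branch-and-bound then fixes fractional variables to 0 or 1 until an all-integer solution is found.

```python
from fractions import Fraction


def solve_relaxation(values, weights, capacity, fixed):
    """Solve the LP relaxation (0 <= x_i <= 1) of a 0/1 knapsack.

    `fixed` maps an item index to 0 or 1 for variables already decided by
    branching. Returns (objective, x), or None if the fixed items do not fit.
    """
    n = len(values)
    x = [Fraction(0)] * n
    remaining = Fraction(capacity)
    for i, v in fixed.items():
        if v == 1:
            x[i] = Fraction(1)
            remaining -= weights[i]
    if remaining < 0:
        return None
    free = [i for i in range(n) if i not in fixed]
    # Greedy by value density is optimal for the relaxed knapsack.
    free.sort(key=lambda i: Fraction(values[i], weights[i]), reverse=True)
    for i in free:
        take = min(Fraction(1), remaining / weights[i])
        x[i] = take
        remaining -= take * weights[i]
        if remaining == 0:
            break
    objective = sum(values[i] * x[i] for i in range(n))
    return objective, x


def branch_and_bound(values, weights, capacity):
    """Turn the linear solution into an integral one by branching."""
    best_obj, best_x = Fraction(-1), None
    stack = [{}]
    while stack:
        fixed = stack.pop()
        result = solve_relaxation(values, weights, capacity, fixed)
        if result is None:
            continue  # infeasible branch
        obj, x = result
        if obj <= best_obj:
            continue  # bound: this branch cannot beat the best found so far
        fractional = [i for i, xi in enumerate(x) if xi.denominator != 1]
        if not fractional:
            best_obj, best_x = obj, x  # all-integer solution
            continue
        i = fractional[0]
        stack.append({**fixed, i: 0})
        stack.append({**fixed, i: 1})
    return best_obj, best_x


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50

lin_obj, lin_x = solve_relaxation(values, weights, capacity, {})
int_obj, int_x = branch_and_bound(values, weights, capacity)
print("linear solution:  ", lin_obj, [str(xi) for xi in lin_x])
print("integral solution:", int_obj, [str(xi) for xi in int_x])
```

Here the linear solution takes the third item at a fraction of 2/3 for an objective of 240, while the integral solution takes the second and third items whole for an objective of 220. Stopping after the first step, as -l does, corresponds to keeping only the linear result.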

-r: If the SARF result for this threading pair is available, this option calculates the accuracy of the alignment that thread generates for the pair. Again, it is for training and assessment purposes only.

templateName is the name of the template file (no suffix) to run threading against. sequenceName is the name of the input sequence file (no suffix). For example, suppose I have a sequence file called 256ba.seq and I want to thread it against the template file 7rsa.xml (which is in /data/templates/) to see whether they have a similar fold. I type (ignoring other options):

thread 7rsa 256ba
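
If you drive thread from a script, the "no suffix" rule can be applied automatically. The helper below is a hypothetical sketch: it only strips the suffixes and builds the two positional arguments shown above, leaving all other options out.

```python
from pathlib import Path


def thread_command(template_file, sequence_file):
    """Build the argument list for `thread` from two file names.

    Only the positional arguments from the example above are produced.
    """
    template_name = Path(template_file).stem  # "7rsa.xml"  -> "7rsa"
    sequence_name = Path(sequence_file).stem  # "256ba.seq" -> "256ba"
    return ["thread", template_name, sequence_name]


command = thread_command("7rsa.xml", "256ba.seq")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run once any other options you need have been added.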