1. Question: How does PROSPECT Pro compare to MODELLER/MODELER?
Answer: They are complementary products rather than replacements for each other. In short, they differ in the scale of their tasks. PROSPECT Pro weeds out large numbers of unlikely proteins and identifies the templates in the database that are closest in 3D shape to the query sequence. PROSPECT Pro is good at picking up similarities that might go undetected by other methods. MODELLER/MODELER (or another comparative modelling tool) specializes in modelling by "mapping" the query onto one template while satisfying spatial constraints. It needs to know which template to use as the basis for the model, which is precisely what PROSPECT Pro provides. MODELLER/MODELER is useful when the purpose of running PROSPECT Pro is to get an accurate model based on templates from threading, for possible structural analysis later. If, for example, only the query-template alignments are of interest, this step can be skipped. PROSPECT Pro writes its output in XML format. The module modellerProspect in the suite generates an alignment file from the XML output, extracts the PDB atom coordinates from the template, and prepares a script that MODELLER/MODELER can run. All three (alignment, PDB atom file, script) are required inputs for MODELLER/MODELER.

2. Question: What is PROSPECT PRO 3.0? How is it different from PROSPECT 2.0?
Answer: It is the latest release of PROSPECT Pro. It has several additional features including:
3. Question: I have received the demo on CD-ROM. How can I get started?
Answer: Please consult the installation page and the quick guide to get started.

4. Question: Who developed PROSPECT and on what technique is it based?
Answer: PROSPECT was developed by Ying Xu and Dong Xu of Oak Ridge National Laboratory (Tennessee). It is based on a technique called protein threading, but it uses an improved algorithm based on discoveries about the nature of protein folding.

5. Question: What platform does PROSPECT PRO run on?
Answer: PROSPECT PRO is multi-platform. It runs on Linux, Windows 2000/XP, Solaris and SGI environments. A Mac OS X version will be released soon.

6. Question: How does PROSPECT PRO differ from other 3D protein structure prediction programs?
Answer: PROSPECT PRO uses a very efficient energy function, a pre-programmed database and a smart threading algorithm to predict the 3-D structure of a protein with higher accuracy than other programs. The system is guaranteed to find the globally optimal alignments for a given energy function with any combination of the following terms:
7. Question: What makes PROSPECT PRO run efficiently?
Answer: The efficiency is achieved mainly by discovering and exploiting the "topological complexity" of a protein fold.

8. Question: Has PROSPECT received any international recognition?
Answer: PROSPECT is award-winning software. It won a 2001 R&D 100 Award, which recognizes the 100 most promising innovations in science and technology of the year (http://www.rdmag.com/features/0109100bio.asp). PROSPECT also took top place among 3-D protein prediction software in the threading category at the international CASP4 competition in 2000.

9. Question: What is the CASP4 competition?
Answer: CASP is a biennial international competition, the Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction. See the webpage at http://predictioncenter.llnl.gov/casp4/Casp4.html.

10. Question: How does PROSPECT fare in distant homology searches?
Answer: PROSPECT did significantly better than PSI-BLAST in distant homology detection in CASP4.

11. Question: Does the PROSPECT PRO program take constraints?
Answer: Yes. The system allows users to easily put biological knowledge and constraints into the threading process, so that it finds the optimal alignment under the specified constraints: specified disulfide bonds, specified active sites, specified secondary structures, and specified gaps with no gap penalties.

12. Question: Do I need a supercomputer to run PROSPECT PRO?
Answer: No. PROSPECT PRO can be run on any desktop computer for sequences shorter than 700 amino acids. A supercomputer would increase the efficiency of the program and significantly decrease the running time on longer proteins.

13. Question: How much RAM do I need on my computer?
Answer: PROSPECT PRO can run on any desktop that has 4 GB of virtual memory for sequences of up to 700 amino acids. Longer sequences need more memory.

14. Question: How can I evaluate PROSPECT PRO?
Answer: A free demo version of PROSPECT PRO is available at our Internet site, http://www.BioinformaticsSolutions.com/downloads/prospect-demo/. After you fill out the form, the program can be downloaded. It is fully operational for sequences of up to 100 amino acids. You may purchase the full version of PROSPECT PRO by calling (519) 885-8288, ext. 10.

15. Question: How long does it take to download PROSPECT PRO?
Answer: It could take up to 6 hours of uninterrupted time to download PROSPECT PRO via a dial-up connection, because it contains a large protein structure database.

16. Question: How can I get a price quote for PROSPECT PRO?
Answer: Send an email to Info@BioinformaticsSolutions.com.

17. Question: When I run PROSPECT with the terminal command "prospect", the shell reports "prospect: Command not found". What is the problem?
Answer: The command "prospect" is an alias defined in the ".prospect" file. If the $System variable is not defined in your Unix environment, you need to add the following line to your ".cshrc" file:
    setenv System `uname -a | cut -d' ' -f1`
Alternatively, you can specify the machine type manually in the ".prospect" file, e.g.:
    setenv System SunOS

18. Question: PROSPECT PRO complains that I do not have enough memory to run a threading. Can I still get some results?
Answer: There are several solutions. An expensive one is to get a new machine or add more memory. If you have some idea about the sequence domains, you can cut the protein into several domains (e.g. with the tool ProDom) and thread the subsequences one by one. If you have no idea about the domain partition of the query sequence, you can use overlapping subsequences, such as 1-500, 400-900, 800-1300, 1200-1700, etc. You can also run the threading without pairwise interactions, which saves memory and computing time substantially but may compromise the threading accuracy:
    prospect -np -ss my.seq > log

19. Question: Is there a size limit for the query protein?
Answer: In principle there is none.
However, threading a very long sequence may not be practical in terms of computing time. We recommend using sequences of fewer than 1000 amino acids when you do not use pairwise interactions, and fewer than 500 amino acids when you do. Larger proteins can be split into subsequences (e.g. using a tool like ProDom; see the previous question).

20. Question: PROSPECT Pro seemed to do a good job on one of the protein families I tested, but did poorly on another. What caused the inconsistency?
Answer: Firstly, no tool can be perfectly accurate in its predictions. Moreover, we strongly recommend that users supply secondary structure and profile information as input to the threading process whenever possible, as it may drastically improve the quality of the prediction. The built-in tools prospect_ssp and get_chk_file can do these jobs for you. Check the usage page for more information.

21. Question: Running PROSPECT Pro with no z-scores is fairly fast. With z-scores it is terribly slow, as the documentation states. I tend to want z-scores, though. But that makes PROSPECT Pro too slow.
Answer: We generally recommend threading against all templates in the database without z-score calculation first. Then turn on the z-score calculation for only the top 50 or 100 hits on the list, sorted by either SVM score (recommended) or raw score, since the remaining templates are very unlikely to have a good z-score. This speeds things up considerably.

22. Question: Why is SVM the preferred sorting criterion over raw score or z-score?
Answer: It is a more comprehensive confidence assessment that takes into account a lot of other information from the database as well as general knowledge of proteins. It is more reliable than the "local" scores when the correct fold is not yet in the database; in that case, a top raw score or z-score does not mean that the template is a likely fold.

23. Question: What threading method should I use?
Answer: In general, -global performs better. Try that first, then thread again with -global_local and inspect the discrepancies.

24. Question: Should I use the sequence file or the frequency/check point file to perform the secondary structure prediction?
Answer: We recommend using the profile information as input to prospect_ssp whenever possible. It summarizes the evolutionary knowledge about the protein, and it also contains the sequence information, since the sequence is the basis from which the profile was generated in the first place.

25. Question: The protein seems to have two domains. Should I split it into two sequences and thread them separately or thread the whole sequence at the same time? Which way is better?
Answer: If you know the protein has two domains, it is better to split it into two sequences and thread them separately. This not only saves computing time but also gives a better chance of finding the correct fold/alignment, since the search space is much smaller. In particular, sometimes only one domain has a native-like fold in the database. However, if you are not sure about the partition of the two domains, it is better to run threading on both the whole sequence and the two subsequences. By comparing the two sets of results, you may find some clues about the domain partition and the structure.

26. Question: If more than one template turns out to have good scores, should I only use the one with the best score for MODELLER? In homology modeling, I was told that more than one good template is better and MODELLER will weight them according to the sequence similarity. What would you suggest in the case of multiple templates from threading?
Answer: The current interface between PROSPECT PRO and MODELLER does not support homology modeling with multiple templates. You may use the alignments generated by PROSPECT PRO to help with multiple-template modeling. However, the focus of PROSPECT PRO is the initial structure model.
Hence, we recommend picking one template to build the model. If you would like to refine the model, please check the instructions of the homology modeling tools.

27. Question: I can't seem to get MODELLER installed to do the comparative modelling, but I have rasmol ready. Can I at least get some crude information on the prediction from PROSPECT Pro?
Answer: Yes. Include -3d as an option when you do your threading. Then run rasmol on the resulting xml file. It gives you the rough fold of the backbone, with some loop regions missing.

28. Question: Why do all the templates get a Z-score of -9.99? Should I rank the templates according to their raw scores?
Answer: You need to use -reliab on the command line to turn on the z-score calculation. Otherwise, z-scores are displayed with their default value of -9.99. However, we found that the Z-score is not a good indicator for ranking templates. Several other threading groups reached the same conclusion. To save computing time, the Z-score calculation is turned off in the default threading. If you want to get a Z-score, you can set the following in the common.conf file:
    ZscoreCycles 10000
You can rank the templates according to their raw scores. Sometimes you may get a better ranking by using the reliability score based on a neural network (see the Quick Guide). It is preferable to check both rankings when you are not certain about the threading result from one ranking.

29. Question: PROSPECT PRO complains about not finding something in ~/prospect_templates. I don't have that directory in my home directory.
Answer: This means PROSPECT PRO cannot find a named template in the standard template libraries that come with it. The last place it looks is ~/prospect_templates, which is meant for storing user-defined templates.

30. Question: I'm running PROSPECT 1.0 / 1.1. How can I identify specific residues in our protein structures?
Answer: After running threading against all the templates in 1.0/1.1, you will see two output files:
    seq_2_template.out
    seq_2_template.pdb
where seq is the target sequence being queried, and template is the template that PROSPECT predicts to be "closest" in structure to seq. Click on the .pdb file to open a new window that displays the predicted structure. Click anywhere within the figure, and the bottom panel shows something like:
    THIS IS A HELIX (51-56)
followed by the sequence with a specific region highlighted in red. That region is the secondary structure fragment you clicked on, so you can read off which residues lie within it.

31. Question: Is there a way for PROSPECT 1.0/1.1 to analyze and overlay 3-D structural similarities between proteins of similar types?
Answer: Rasmol / RasTop can give more accurate and detailed information. However, it must be emphasized that PROSPECT is a structure *prediction* program rather than a structure *analysis* tool. It bridges the gap between the amino acid sequence and the possible structure, but it does not tell you much about the structure itself, since that is not the focus of the software. A tool called Swiss-Pdb Viewer (http://us.expasy.org/spdbv/mainpage.html) can do this kind of overlay.
There is another utility called DALI, which is an online server that does
structure analysis for a submitted pdb file. It's free to academics. You can
read more at
32. Question: PROSPECT uses templates to predict the protein structure, e.g. 1qhaa, 1erja, 1b4ka, etc. What do these terms mean?
Answer: Those template names are the PDB IDs of the specific proteins. Each structure in the PDB is represented by a four-character alphanumeric identifier, assigned upon its deposition. For example, 4hhb and 9ins are the identification codes of the PDB entries for hemoglobin and insulin, respectively. Many of the PDB Web site pages, including the PDB home page, allow you to enter a PDB ID and retrieve information for the corresponding structure. The PDB's page is at http://www.rcsb.org/pdb/index.html. The template names mentioned above have more than four characters; the last letter of such a longer ID is the chain identifier, used for proteins that consist of multiple polypeptide chains.

33. Question: Do you have any references about PROSPECT?
Answer: See the list of references on our PROSPECT product information page and the reference section in the PROSPECT Pro Tutorial.

34. Question: Do you have any other documentation where I can read more on PROSPECT PRO?
Answer: Yes. See the PROSPECT Pro Tutorial.
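As a rough illustration of the overlapping-subsequence approach suggested in question 18, the Python sketch below computes 1-based residue ranges such as 1-500, 400-900, 800-1300, 1200-1700 for a query of a given length. The function names and the window and step defaults are assumptions made for this example and are not part of PROSPECT Pro; each piece would still have to be written to its own sequence file and threaded separately.

```python
def overlapping_ranges(length, window=500, step=400):
    """Return 1-based inclusive (start, end) ranges that cover 1..length.

    With the default values (chosen here only to mirror the FAQ example)
    consecutive ranges overlap by about 100 residues.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0 < step <= window:
        raise ValueError("step must be positive and no larger than window")

    ranges = []
    offset = 0
    while True:
        start = max(1, offset)               # first range starts at residue 1
        end = min(offset + window, length)   # last range is clipped to the end
        ranges.append((start, end))
        if end == length:
            break
        offset += step
    return ranges


def split_sequence(sequence, window=500, step=400):
    """Return (start, end, subsequence) tuples for a one-letter sequence string."""
    return [
        (start, end, sequence[start - 1:end])
        for start, end in overlapping_ranges(len(sequence), window, step)
    ]


if __name__ == "__main__":
    for start, end in overlapping_ranges(1700):
        print(f"{start}-{end}")
```

Shorter windows reduce the memory needed per threading run, while the overlap lowers the risk that a domain is cut in half by every window that contains it.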
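The template naming convention described in question 32 can be illustrated with a short Python sketch that splits a template name into its PDB ID and chain identifier. The function name is hypothetical, and the check assumes only what the answer states: a four-character alphanumeric PDB ID, optionally followed by a chain identifier.

```python
def split_template_name(name):
    """Split a template name such as '1qhaa' into (pdb_id, chain).

    The chain is None when the name is a bare four-character PDB ID.
    """
    name = name.strip()
    if len(name) < 4 or not name[:4].isalnum():
        raise ValueError(f"not a recognizable template name: {name!r}")
    pdb_id = name[:4]
    chain = name[4:] or None   # anything after the PDB ID is the chain
    return pdb_id, chain


if __name__ == "__main__":
    for template in ("1qhaa", "1erja", "1b4ka", "4hhb"):
        print(template, split_template_name(template))
```

The four-character part is what you would enter on the PDB Web site to look up the template structure.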