CodonCode Corporation offers the programs Phred and Phrap for Windows and Mac OS X. Phred and Phrap were developed by Dr. Phil Green and co-workers at the University of Washington in Seattle. CodonCode Corporation has acquired the distribution rights for Phred and Phrap.
This page gives a brief description of Phred and Phrap. The Phred-Phrap programs were developed for use by automated scripts, and therefore do not have a graphical user interface. For scientists who prefer to use Phred and Phrap from a graphical user interface on OS X or Windows, we offer the sequence assembly and editing software CodonCode Aligner. CodonCode Aligner makes base calling with Phred and sequence assembly with Phrap easy, and also offers functions for contig editing and mutation detection.
Phred: Better Base Calling
Phred is a base-calling program for DNA sequence traces. The program was developed by Drs. Phil Green and Brent Ewing, and is copyrighted by the University of Washington. It is widely used by the largest academic and commercial sequencing laboratories. Two major reasons why Phred is used by leading sequencers are:
- High base calling accuracy. In an initial study, Phred achieved 40-50% lower error rates than ABI software on large test data sets (Ewing, Hillier, Wendl & Green (1998), Genome Research 8: 175-185).
- Error probabilities for each base call. The highly accurate error probabilities Phred calculates for each base enable increased automation of the sequencing process, for example:
  - More accurate consensus sequences.
  - Automatic identification of areas that require "finishing" efforts.
  - Drastically lower false positive error rates in mutation detection.
  - Effective quality control immediately after sequence production.
  - Quantitative benchmarking of different sequencing methods and protocol changes.
  - Identification of repeat sequences during assembly.
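To make the link between quality values and error probabilities concrete, here is a minimal Python sketch. It assumes the standard Phred quality definition, Q = -10 * log10(P), where P is the estimated probability that a base call is wrong. The function names and the sample quality values are hypothetical and are not part of Phred itself.

```python
import math


def error_probability(quality):
    """Convert a Phred quality value Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def phred_quality(p_error):
    """Convert an error probability P to a Phred quality value Q = -10*log10(P)."""
    return -10 * math.log10(p_error)


def expected_errors(qualities):
    """Expected number of miscalled bases in a read: the sum of per-base error probabilities."""
    return sum(error_probability(q) for q in qualities)


if __name__ == "__main__":
    # Hypothetical quality values for a short read (illustration only).
    read_qualities = [40, 40, 30, 20, 10, 30]
    for q in read_qualities:
        print(f"Q{q}: error probability {error_probability(q):.4f}")
    print(f"Expected errors in read: {expected_errors(read_qualities):.4f}")
```

Under this definition, a quality value of 20 corresponds to a 1 in 100 chance that the base call is wrong, and a quality value of 30 to a 1 in 1000 chance. Summing the per-base probabilities gives a simple quality-control figure for a whole read.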
Phred was developed for the Human Genome Project, where large amounts of sequence data were processed by automated scripts; therefore, Phred's processing options are set by command line parameters. For Windows and OS X users who would like to use Phred through an easy-to-use graphical user interface, we have developed the sequence analysis software CodonCode Aligner. CodonCode Aligner greatly simplifies using Phred for base calling and Phrap for sequence assembly, and also offers a number of additional functions often needed in DNA sequencing projects, for example contig alignment and editing, reference sequence alignments, and mutation detection.
Corporate users who wish to use Phred-Phrap can purchase Phred and/or Phrap licenses together with CodonCode Aligner, which allows use of Phred and Phrap through a graphical user interface. Purchasing information for CodonCode Aligner can be found here.
Academic users who plan to use Phred from scripts or the command line can obtain source code for Phred-Phrap free of charge directly from the authors. For academic users who prefer a graphical user interface and purchase licenses for CodonCode Aligner, use of the workstation versions of Phred and Phrap that are included with CodonCode Aligner is free of charge.
To learn more about how Phred works or about Phred quality values, visit our PHRED page.
Phrap: Better Sequence Assemblies
Phrap is a leading program for DNA sequence assembly. Phrap is routinely used in some of the largest sequencing projects in the Human Genome Sequencing Project and in the biotech industry. Some of Phrap's features include:
- Fast assemblies. Assemblies of cosmid- to BAC-sized projects with several hundred to two thousand reads typically take only minutes to complete on high-powered workstations or personal computers.
- Accurate consensus sequences. Phrap uses Phred's quality scores to determine highly accurate consensus sequences. Phrap examines all individual sequences at a given position, and generally uses the highest quality sequence to build the consensus - similar to the way scientists would correct consensus sequences during "contig editing". Compared to the simple majority rules used in older sequence assembly programs, Phrap's approach can give significantly more accurate consensus sequences, especially in regions of low coverage or regions of systematic errors like compressions.
- Consensus quality estimates. Phrap uses the quality information of individual sequences to estimate the quality of the consensus sequence. In addition, Phrap uses available information about sequencing chemistry (dye terminator or dye primer) and confirmation by "other strand" reads in estimating the consensus quality. This often allows scientists to ignore random errors, and to focus finishing efforts exclusively on regions where the data quality is insufficient. Consensus quality estimates can also be very helpful in mutation detection by DNA sequencing (see Rieder, Taylor, Tobe & Nickerson (1998), Nucleic Acids Research 26: 967-973).
- Ability to assemble very large projects. Phrap has been used routinely to assemble bacterial genomes sequenced by the "shotgun" approach, where each project contained tens of thousands of reads. Smaller bacterial genomes (2 million bases or less) could often be assembled in less than three hours.
- Improved identification and handling of repeats. Phrap uses quality scores to estimate whether discrepancies between two overlapping sequences are more likely to arise from random errors, or from different copies of a repeated sequence. For repeats with 95 to 98% identity (like human Alu sequences) and high quality sequence data, this typically yields correct assemblies.
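The difference between a majority rule and a quality-based choice can be illustrated with a small Python sketch. This is a deliberately simplified illustration of the ideas described above, not Phrap's actual algorithm: all function names, the sample data, and the quality threshold of 30 are hypothetical assumptions.

```python
from collections import Counter


def majority_consensus(column):
    """Pick the most frequent base in an alignment column of (base, quality) pairs."""
    counts = Counter(base for base, _quality in column)
    return counts.most_common(1)[0][0]


def highest_quality_consensus(column):
    """Pick the base from the read with the highest quality at this position."""
    return max(column, key=lambda call: call[1])[0]


def likely_repeat_difference(call_a, call_b, min_quality=30):
    """Flag a discrepancy between two overlapping reads as a possible repeat difference.

    Assumption for illustration: if both base calls are high quality and still
    disagree, a random sequencing error is an unlikely explanation.
    """
    bases_differ = call_a[0] != call_b[0]
    both_high_quality = min(call_a[1], call_b[1]) >= min_quality
    return bases_differ and both_high_quality


if __name__ == "__main__":
    # Hypothetical column: two low-quality reads call A, one high-quality read calls G.
    column = [("A", 8), ("A", 10), ("G", 45)]
    print("Majority rule:", majority_consensus(column))
    print("Highest quality:", highest_quality_consensus(column))
    print("Possible repeat difference:", likely_repeat_difference(("A", 40), ("G", 45)))
```

In the sample column, the majority rule follows the two low-quality reads, while the quality-based choice follows the single high-quality read. The discrepancy check shows why quality scores help with repeats: a mismatch between two high-quality base calls is treated as a hint that the reads may come from different copies of a repeat.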