Prediction of Coreceptor Usage for HIV-1
Build your own classifiers with this training data or your own.
The classifiers available below all predict coreceptor usage with close to 90% accuracy. In our experiments the SVM (support vector machine) was the most accurate; however, the rules generated by the other algorithms are much easier to interpret.
Instructions for obtaining predictions of coreceptor usage for your HIV-1 sequences:
- Extract the V3 region from your HIV sequences.
- Align the V3 sequences to the following consensus sequence, built from the training data obtained from Los Alamos:
- CTRPNNNT-RK*I I--GPG*AFY*-TG*I-IGDIRQAHC
- * indicates that any residue may occur at that position; - indicates a gap
- Make sure your aligned sequence comes out the same length as the consensus (40 positions, counting gaps)
- The better your alignment, the more reliable our prediction
- Pay particular attention to position 12 in the alignment
- Upload a FASTA file with your sequences in the following format:
> class ID1
CTRPNNNT-RKRISL--GPGRVFYT-TGEI-IGDIRKAHC
- (To obtain this format from a normal FASTA file, replace every ">" with "> class ")
- (Here is a sample file to emulate)
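The last two preparation steps above (adding the "> class " prefix to each header and checking that every aligned sequence is 40 positions long) can be sketched in Python. This is only an illustrative sketch, not part of the service: the function names add_class_prefix and check_lengths are hypothetical, the script assumes the sequences are already aligned, and it does not perform the alignment itself or check alignment quality.

```python
# Hypothetical helper script; names and structure are assumptions, not the site's own code.

EXPECTED_LENGTH = 40  # aligned length stated in the instructions above


def add_class_prefix(fasta_text):
    """Rewrite each header line from '>ID' to '> class ID'.

    Headers that already start with '> class ' are left alone
    (an added safeguard so the function can be run twice safely).
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">") and not line.startswith("> class "):
            line = "> class " + line[1:].lstrip()
        out.append(line)
    return "\n".join(out)


def check_lengths(fasta_text, expected=EXPECTED_LENGTH):
    """Return (header, length) pairs for records whose sequence length differs from expected."""
    problems = []
    header, chunks = None, []
    # The trailing ">" is a sentinel that flushes the final record.
    for line in fasta_text.splitlines() + [">"]:
        if line.startswith(">"):
            if header is not None:
                sequence = "".join(chunks)
                if len(sequence) != expected:
                    problems.append((header, len(sequence)))
            header, chunks = line[1:].strip(), []
        elif line.strip():
            chunks.append(line.strip())
    return problems
```

Calling add_class_prefix on the text of a normal FASTA file and then check_lengths on the result returns an empty list when every sequence has 40 positions; any record listed should be realigned before uploading.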
For more information on these coreceptor classifiers: C4.5, C4.5 with p8-p12, PART, SVM, Charge Rule
Contact Benjamin Good at goodb@interchange.ubc.ca