MulPSSM
NAR Molecular Biology Database Collection entry number 844
Mohanty S., Swapna L.S., Gowri V.S., Agarwal G., Srinivasan N., and Krishnadev O.
Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560 012, India
Contact ns@mbu.iisc.ernet.in
Database Description
Representing multiple sequence alignments of protein families as Position Specific Scoring Matrices (PSSMs) is a common approach to detecting remote homologues. A PSSM is generated with one of the sequences in the multiple sequence alignment as the reference. We have shown recently that using multiple PSSMs for a single alignment, each built with a different family member as the reference, dramatically improves the sensitivity of remote homology detection (1,2). MulPSSM contains PSSMs for a large number of sequence and structural families of protein domains, with multiple PSSMs for every family (3). A clustering algorithm identifies the most distinct sequences in each family, and a PSSM is generated with each of these distinct sequences as the reference.
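The idea of choosing several distinct reference sequences from one alignment can be illustrated with a minimal Python sketch. It is not the clustering procedure used to build MulPSSM, which this entry does not specify: the greedy selection, the 40% identity cutoff, the function names and the toy alignment are all assumptions made for illustration. The last function shows what "with respect to a reference" means in practice: columns where the reference has a gap are dropped, so each remaining column maps to one reference residue.

```python
from typing import List

def percent_identity(a: str, b: str) -> float:
    """Identity over alignment columns where both sequences have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def pick_references(aligned: List[str], max_identity: float = 40.0) -> List[int]:
    """Greedy selection (an assumed stand-in for the real clustering step):
    keep a sequence only if it is at most max_identity percent identical
    to every reference kept so far."""
    refs: List[int] = []
    for i, seq in enumerate(aligned):
        if all(percent_identity(seq, aligned[j]) <= max_identity for j in refs):
            refs.append(i)
    return refs

def project_onto_reference(aligned: List[str], ref: int) -> List[str]:
    """Drop columns where the reference has a gap, so every remaining
    column corresponds to exactly one residue of the reference."""
    keep = [k for k, c in enumerate(aligned[ref]) if c != "-"]
    return ["".join(s[k] for k in keep) for s in aligned]

# Toy alignment (invented): two subfamilies of two sequences each.
alignment = [
    "MKV-LAGT",
    "MKVALAGT",
    "QRW-YPHS",
    "QRWCYPHS",
]
refs = pick_references(alignment)
# One reference-specific view of the alignment per chosen reference;
# a PSSM would then be computed from each view.
views = {r: project_onto_reference(alignment, r) for r in refs}
```

In this toy case one sequence from each subfamily is selected, so the family would be represented by two PSSMs rather than one.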
Recent Developments
The current release of MulPSSM contains ~39000 and ~40000 PSSMs corresponding to 9318 Pfam sequence families and 2903 PALI (4) structural families, respectively. An RPS-BLAST interface allows a sequence to be searched against the PSSMs of sequence families, structural families, or both. Data are presented using dynamic HTML. An analysis interface allows display and convenient navigation of alignments and domain hits. MulPSSM can be accessed at http://crick.mbu.iisc.ernet.in/~mulpssm.
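Because every family is represented by several PSSMs, a single query can hit more than one PSSM of the same family. The Python sketch below shows one way such hits might be reduced to a best hit per family. It assumes the standard 12-column BLAST tabular layout (subject ID in column 2, E-value in column 11) and a hypothetical PSSM naming scheme of the form `family_refN`; the identifiers and scores are invented, and nothing here describes how the MulPSSM server itself reports results.

```python
from typing import Dict, Tuple

def best_hit_per_family(tabular: str) -> Dict[str, Tuple[str, float]]:
    """Return, for each family, the PSSM with the lowest E-value."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        pssm_id, evalue = fields[1], float(fields[10])
        # Hypothetical naming: strip the trailing "_refN" to get the family.
        family = pssm_id.rsplit("_", 1)[0]
        if family not in best or evalue < best[family][1]:
            best[family] = (pssm_id, evalue)
    return best

# Invented example rows in 12-column tabular format.
rows = [
    ["query1", "famA_ref1", "31.2", "120", "80", "2", "5", "124", "3", "118", "1e-5", "48.1"],
    ["query1", "famA_ref3", "35.0", "118", "75", "1", "6", "123", "2", "117", "3e-12", "70.4"],
    ["query1", "famB_ref2", "28.0", "110", "78", "1", "10", "119", "4", "112", "2e-4", "41.0"],
]
table = "\n".join("\t".join(r) for r in rows)
best = best_hit_per_family(table)
```

Keeping the lowest E-value per family is only one possible policy; a union of the regions covered by all PSSMs of a family would be another.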
Acknowledgements
OK, LSS and GA are supported by the Council of Scientific and Industrial Research, New Delhi. This work is supported by the Department of Biotechnology, New Delhi.
References
1. Anand B., Gowri V.S. and Srinivasan N. (2005) Use of multiple profiles corresponding to a sequence alignment enables effective detection of remote homologues. Bioinformatics 21, 2821-2826.
2. Gowri V.S., Tina K.G., Krishnadev O. and Srinivasan N. (2007) Strategies for the effective identification of remotely related sequences in multiple PSSM search approach. Proteins 67, 789-794.
3. Gowri V.S., Krishnadev O., Swamy C.S. and Srinivasan N. (2006) MulPSSM: a database of multiple position-specific scoring matrices of protein domain families. Nucleic Acids Res. 34, D243-246.
4. Balaji S., Sujatha S., Kumar S.S.C. and Srinivasan N. (2001) PALI: A database of Phylogeny and ALIgnment of homologous protein structures. Nucleic Acids Res. 29, 61-65.
Category: Protein sequence databases
Subcategory: Protein domain databases; protein classification
Go to the abstract in the NAR 2006 Database Issue.