PRDisData Contents

PRDisData User Guide

prodom

PRODOM

Dissimilarity dataset.

    S = PRODOM(TYPE)

Input
 TYPE String defining the type of S, either 'sim' (similarity, original data) or 'dis' (dissimilarity, default)

ProDom is a comprehensive set of protein domain families [Corpet]. A subset of 2604 protein domain sequences was selected from it by [Roth], based on a high similarity to at least one sequence contained in the first four folds of the SCOP database. Pairwise structural alignments were computed [Roth]. Each SCOP sequence belongs to a group, as labeled by experts [Murzin]. The same four classes are assigned here.

Note that for TYPE = 'sim', S is a similarity matrix with positive and negative numbers. Use D = DISSIMT(S,'sim2dis') to convert it to dissimilarities.
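
A typical session might look as follows. This is a minimal sketch that assumes PRTools and PRDisData are on the Matlab path and uses only the two calls named above; the variable names are illustrative.

```matlab
% Assumes PRTools and PRDisData are installed and on the Matlab path.
S  = prodom('sim');            % original similarities (positive and negative values)
D  = dissimt(S,'sim2dis');     % convert the similarities to dissimilarities
D0 = prodom;                   % default TYPE 'dis': dissimilarities directly
```

Whether D and D0 coincide exactly is not stated here, so compare them before relying on either.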

Reference(s)

V. Roth, J. Laub, J.M. Buhmann, and K.-R. Mueller, Going metric: Denoising pairwise data, Advances in Neural Information Processing Systems, 841-856, MIT Press, 2003.

F. Corpet, F. Servant, J. Gouzy and D. Kahn, ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons, Nucleic Acids Res., vol. 28, 267-269, 2000.

A.G. Murzin, S.E. Brenner, T. Hubbard and C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, Journal of Molecular Biology, vol. 247, 536-540, 1995.

See also

prtools, datasets, prdisdata, dissimt

This file has been automatically generated. If badly readable, use the help-command in Matlab.