GENDAT

Random sampling of datasets for training and testing

   [A,B,IA,IB] = GENDAT(X,N,SEED)
   [A,B,IA,IB] = X*GENDAT([],N,SEED)
   [A,B,IA,IB] = X*GENDAT(N,SEED)
   [A,B,IA,IB] = GENDAT(X,ALF,SEED)
   [A,B,IA,IB] = X*GENDAT([],ALF,SEED)
   [A,B,IA,IB] = X*GENDAT(ALF,SEED)

Input
 X      Dataset.
 N,ALF  Number (N) or fraction (ALF) of objects to be selected (default: bootstrapping). N may also be a row vector giving the number of objects for each class.
 SEED   State of the random number generator, as used by RANDRESET.

Output
 A,B    Datasets: A holds the selected objects, B the remaining objects.
 IA,IB  Indices in the dataset X of the objects stored in A and B.

Description

Selects N objects at random from dataset X and stores them in dataset A; the remaining objects are stored in dataset B. IA and IB are the indices in X of the objects placed in A and B. The random selection follows the class prior probabilities: if the prior probability of a class is PA, then in expectation PA*N objects are selected from that class. If N is large, or if one of the classes has too few objects in X, fewer than N objects may be generated.

If N is a vector of sizes, exactly N(i) objects are generated for class i. Classes are ordered as given by GETLABLIST(X).
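
The two ways of specifying N can be sketched as follows. This is a minimal illustration, assuming PRTools is installed and on the MATLAB path; GENDATB and CLASSSIZES are other PRTools routines, used here only to create example data and to inspect the result.

```matlab
% Example data: two classes with 50 objects each (GENDATB is only a
% convenient data generator here, any labeled dataset would do).
x = gendatb([50 50]);

% Scalar N: about 20 objects in total, divided over the classes
% according to the class priors. The rest of x goes to b1.
[a1,b1] = gendat(x,20);
disp(size(a1,1));        % number of objects selected, at most 20

% Vector N: one entry per class, in the order of getlablist(x).
[a2,b2] = gendat(x,[10 20]);
disp(classsizes(a2));    % per-class counts of the selected set
```

The scalar form is convenient when only the total size matters; the vector form gives direct control over the class balance of A.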

If the function is called without specifying N, the dataset X is bootstrapped and the result is stored in A. Objects that are not selected are stored in B.

ALF should be a scalar smaller than 1. For each class, a fraction ALF of its objects is selected for A; the objects that are not selected are stored in B.
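
A typical use of the fraction form is a split into a training set and a test set. The sketch below assumes PRTools is on the MATLAB path; the data generator GENDATB, the classifier LDC and the evaluation routine TESTC are other PRTools routines chosen only for illustration, and any classifier could take the place of LDC.

```matlab
% Example data: two classes with 50 objects each.
x = gendatb([50 50]);

% Select half of each class for training; the other half is for testing.
[a,b,ia,ib] = gendat(x,0.5);

% ia and ib refer to rows of x, so the split can be reproduced later.
disp(numel(ia));         % number of training objects
disp(numel(ib));         % number of test objects

% Train on a, estimate the classification error on b.
w = ldc(a);              % linear classifier, an arbitrary choice
e = b*w*testc;           % fraction of misclassified test objects
disp(e);
```

Because the fraction is applied per class, the class proportions of X carry over to both A and B.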

If X is a cell array of datasets, the command is executed for each dataset separately and the results are stored in cell arrays. The random seed is reset for each dataset, so the generated datasets are aligned if the sets in X were aligned.
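
The cell array form might be used as sketched below, assuming PRTools is on the MATLAB path. The second dataset is built here as a single-feature view of the first, purely as a hypothetical example of two aligned sets (same objects, different features).

```matlab
% Two aligned datasets: the same 100 objects, described by
% both features (x1) or by the first feature only (x2).
x1 = gendatb([50 50]);
x2 = x1(:,1);

% One call splits both sets; the outputs are cell arrays with
% one entry per input dataset.
[a,b] = gendat({x1,x2},0.5);

a1 = a{1};               % selected part of x1
a2 = a{2};               % selected part of x2
disp(size(a1));
disp(size(a2));
```

With aligned inputs, the description above implies that A{1} and A{2} contain the same objects, which is what makes this form useful for comparing feature sets on identical splits.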

Example(s)

prex_plotc
