gendatk

GENDATK

K-Nearest neighbor data generation

    B = GENDATK(A,N,K,S)
    B = A*GENDATK([],N,K,S)
    B = A*GENDATK(N,K,S)

Input
    A   Dataset
    N   Number of points (optional; default: 50)
    K   Number of nearest neighbors (optional; default: 1)
    S   Standard deviation (optional; default: 1)

Output
    B   Generated dataset

Description

GENDATK generates N points using the K nearest neighbors of objects in the dataset A. First, N points of A are chosen in random order. Then, for each chosen point and each feature (direction), a Gaussian-distributed offset is added. The offset has zero mean and a standard deviation equal to S times the mean signed difference between the chosen point and its K nearest neighbors in A.

As a result, the generated points follow the local density of the data around the point from which they originate.
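The procedure described above can be sketched in plain MATLAB for a single class. This is a minimal illustration, not the PRTools implementation: the function name `knn_generate_sketch` is hypothetical, it works on an ordinary numeric matrix rather than a dataset object, it assumes K is smaller than the number of rows, and it assumes source points are drawn with replacement, which the description above does not specify.

```matlab
function Y = knn_generate_sketch(X, n, k, s)
% KNN_GENERATE_SKETCH  Illustrative single-class sketch (hypothetical helper).
%   X : m-by-d numeric matrix, one object per row
%   n : number of points to generate
%   k : number of nearest neighbors (assumed k < m)
%   s : scale factor applied to the local spread
[m, d] = size(X);
Y = zeros(n, d);
src = randi(m, n, 1);                    % assumption: sources drawn with replacement
for j = 1:n
    x = X(src(j), :);
    diffs = bsxfun(@minus, X, x);        % signed differences from x to every object
    dist = sum(diffs.^2, 2);             % squared Euclidean distances
    dist(src(j)) = inf;                  % exclude the source point itself
    [~, order] = sort(dist);
    nn = order(1:k);                     % indices of the k nearest neighbors
    sigma = s * mean(diffs(nn, :), 1);   % per-feature mean signed difference, scaled by s
    Y(j, :) = x + randn(1, d) .* sigma;  % zero-mean Gaussian offset per feature
end
end
```

Saved as `knn_generate_sketch.m`, a call such as `Y = knn_generate_sketch(randn(20,2), 50, 3, 1)` would return a 50-by-2 matrix. The sign of `sigma` does not matter here, because the Gaussian offset is symmetric around zero.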

If A is a multi-class dataset, the above procedure is applied class by class, ignoring objects of other classes and any unlabeled objects.

If N is a vector of sizes, exactly N(I) objects are generated for class I. Default N is 100 objects per class.
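A short usage sketch of the per-class form, assuming PRTools is on the MATLAB path. The two-class source set comes from the PRTools generator `gendatb`, chosen here only for convenience, and the sizes 20 and 30 are arbitrary illustration values.

```matlab
% Assumes PRTools is installed; gendatb is used only as a convenient source set.
A = gendatb([50 50]);              % two-class dataset, 50 objects per class

% Request 20 objects for class 1 and 30 for class 2,
% using K = 3 neighbors and S = 1.
B = gendatk(A, [20 30], 3, 1);

% Same request written in the mapping form from the synopsis.
B2 = A*gendatk([], [20 30], 3, 1);

scatterd(B);                       % scatter plot of the generated objects
```

Because the generation is random, `B` and `B2` would generally contain different objects even though the arguments are identical.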
