Function Reference: @ClassificationKNN/ClassificationKNN

statistics: obj = ClassificationKNN (X, Y)
statistics: obj = ClassificationKNN (…, name, value)

Create a ClassificationKNN class object containing a k-Nearest Neighbor classification model.

obj = ClassificationKNN (X, Y) returns a ClassificationKNN object, with X as the predictor data and Y containing the class labels of observations in X.

  • X must be an N×P numeric matrix of input data, where rows correspond to observations and columns correspond to features or variables. X will be used to train the kNN model.
  • Y is an N×1 matrix or cell matrix containing the class labels of the corresponding predictor data in X. Y can contain any type of categorical data. Y must have the same number of rows as X.

obj = ClassificationKNN (…, name, value) returns a ClassificationKNN object with parameters specified by Name-Value pair arguments. Type help fitcknn for more info.
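As a minimal sketch of both calling forms (this assumes the statistics package is installed and that the bundled Fisher iris dataset is available via load fisheriris):

```octave
## Load the statistics package and an example dataset
pkg load statistics
load fisheriris            # provides meas (150x4 numeric) and species (150x1 cell)

## Train a kNN model via the recommended wrapper, with a Name-Value pair
obj = fitcknn (meas, species, "NumNeighbors", 5);

## Equivalently, call the class constructor directly
obj = ClassificationKNN (meas, species, "NumNeighbors", 5);
```

In ordinary use, fitcknn is preferred over calling the constructor directly, but both return the same kind of ClassificationKNN object.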

A ClassificationKNN object, obj, stores the labelled training data and various parameters for the k-Nearest Neighbor classification model, which can be accessed in the following fields:

obj.X: Unstandardized predictor data, specified as a numeric matrix. Each column of X represents one predictor (variable), and each row represents one observation.

obj.Y: Class labels, specified as a logical or numeric vector, or cell array of character vectors. Each value in Y is the observed class label for the corresponding row in X.

obj.NumObservations: Number of observations used in training the ClassificationKNN model, specified as a positive integer scalar. This number can be less than the number of rows in the training data, because rows containing NaN values are not part of the fit.

obj.RowsUsed: Rows of the original training data used in fitting the ClassificationKNN model, specified as a numerical vector. To use this vector for indexing the training data in X, convert it to a logical vector, i.e. X = obj.X (logical (obj.RowsUsed), :);

obj.Standardize: A boolean flag indicating whether the data in X have been standardized prior to training.

obj.Sigma: Predictor standard deviations, specified as a numeric vector of the same length as the number of columns in X. If the predictor variables have not been standardized, then "obj.Sigma" is empty.

obj.Mu: Predictor means, specified as a numeric vector of the same length as the number of columns in X. If the predictor variables have not been standardized, then "obj.Mu" is empty.

obj.NumPredictors: The number of predictors (variables) in X.

obj.PredictorNames: Predictor variable names, specified as a cell array of character vectors. The variable names are in the same order in which they appear in the training data X.

obj.ResponseName: Response variable name, specified as a character vector.

obj.ClassNames: Names of the classes in the training data Y with duplicates removed, specified as a cell array of character vectors.

obj.BreakTies: Tie-breaking algorithm used by predict when multiple classes have the same smallest cost, specified as one of the following character arrays: "smallest" (default), which favors the class with the smallest index among the tied groups, i.e. the one that appears first in the labelled training data; "nearest", which favors the class with the nearest neighbor among the tied groups, i.e. the class with the closest member point according to the distance metric used; or "random", which randomly picks one class among the tied groups.

obj.Prior: Prior probabilities for each class, specified as a numeric vector. The order of the elements in Prior corresponds to the order of the classes in ClassNames.

obj.Cost: Cost of the misclassification of a point, specified as a square matrix. Cost(i,j) is the cost of classifying a point into class j if its true class is i (that is, the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns in Cost corresponds to the order of the classes in ClassNames. The number of rows and columns in Cost is the number of unique classes in the response. By default, Cost(i,j) = 1 if i != j, and Cost(i,j) = 0 if i = j. In other words, the cost is 0 for correct classification and 1 for incorrect classification.

obj.NumNeighbors: Number of nearest neighbors in X used to classify each point during prediction, specified as a positive integer value.

obj.Distance: Distance metric, specified as a character vector. The allowable distance metric names depend on the choice of the neighbor-searcher method. See the available distance metrics in knnsearch for more info.

obj.DistanceWeight: Distance weighting function, specified as a function handle, which accepts a matrix of nonnegative distances and returns a matrix of the same size containing nonnegative distance weights.

obj.DistParameter: Parameter for the distance metric, specified either as a positive definite covariance matrix (when the distance metric is "mahalanobis"), a positive scalar as the Minkowski distance exponent (when the distance metric is "minkowski"), or a vector of positive scale values with length equal to the number of columns of X (when the distance metric is "seuclidean"). For any other distance metric, the value of DistParameter is empty.

obj.NSMethod: Nearest neighbor search method, specified as either "kdtree", which creates and uses a Kd-tree to find nearest neighbors, or "exhaustive", which uses the exhaustive search algorithm by computing the distance values from all points in X to find nearest neighbors.

obj.IncludeTies: A boolean flag indicating whether prediction includes all the neighbors whose distance values are equal to the k-th smallest distance. If IncludeTies is true, prediction includes all of these neighbors. Otherwise, prediction uses exactly k neighbors.

obj.BucketSize: Maximum number of data points in the leaf node of the Kd-tree, specified as a positive integer value. This argument is meaningful only when NSMethod is "kdtree".
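The fields above can be inspected directly on a trained object. A short sketch (assuming the statistics package is installed and the iris data loads as usual; the query point is an arbitrary illustrative value):

```octave
pkg load statistics
load fisheriris                      # meas (predictors), species (labels)
obj = fitcknn (meas, species, "NumNeighbors", 5);

## Inspect stored model parameters
disp (obj.NumNeighbors)              # number of neighbors used at prediction
disp (obj.ClassNames)                # unique class labels found in species
disp (obj.Prior)                     # prior probability per class

## Recover the training rows actually used in the fit,
## converting RowsUsed to a logical index as described above
Xused = obj.X (logical (obj.RowsUsed), :);

## Classify a new observation with predict
label = predict (obj, [5.1, 3.5, 1.4, 0.2]);
```

Note that predict applies the stored NumNeighbors, Distance, BreakTies, and related fields, so changing those properties changes subsequent predictions.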

See also: fitcknn, @ClassificationKNN/predict, knnsearch

Source Code: @ClassificationKNN/ClassificationKNN