COSIE.utils.nn_approx

nn_approx(ds1, ds2, knn=10, metric='euclidean', n_trees=10, include_distances=False)

Efficiently find approximate K-nearest neighbors using the Annoy library.

Parameters

ds1 : np.ndarray

Query data of shape (n_query, dim). Neighbors are found for each row.

ds2 : np.ndarray

Reference data of shape (n_ref, dim), searched for the neighbors of each query row.

knn : int, optional

Number of neighbors to retrieve per query. Default is 10.

metric : str, optional

Distance metric used in Annoy. Must be one of: {‘euclidean’, ‘manhattan’, ‘angular’, ‘hamming’, ‘dot’}. Default is ‘euclidean’.

n_trees : int, optional

Number of trees used to build the Annoy index. Higher values increase accuracy at the cost of indexing time. Default is 10.

include_distances : bool, optional

Whether to also return distances to the nearest neighbors. Default is False.

Returns

ind : np.ndarray

If include_distances is False, returns an array of shape (n_query, knn) holding, for each query row, the indices of its nearest neighbors in ds2.

(ind, dist) : tuple of (np.ndarray, np.ndarray)

If include_distances is True, returns a tuple:

  • ind : array of shape (n_query, knn) with indices of nearest neighbors.

  • dist : array of shape (n_query, knn) with corresponding distances.
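
Examples

The following is a minimal sketch of how a function with this signature could be built on Annoy. It is an assumption about the approach, not the COSIE source. The names nn_approx_sketch and nn_exact are hypothetical. nn_exact is a brute-force Euclidean baseline in NumPy that returns arrays of the same shape, which may help when checking approximate results on small inputs.

```python
import numpy as np


def nn_approx_sketch(ds1, ds2, knn=10, metric="euclidean", n_trees=10,
                     include_distances=False):
    """Hypothetical Annoy-based k-NN search; not the COSIE implementation."""
    # Lazy import so the exact baseline below runs without Annoy installed.
    from annoy import AnnoyIndex

    # Build the index over the reference rows.
    index = AnnoyIndex(ds2.shape[1], metric)
    for i, vec in enumerate(ds2):
        index.add_item(i, vec.tolist())
    index.build(n_trees)

    # Query once per row of ds1.
    ind, dist = [], []
    for vec in ds1:
        i, d = index.get_nns_by_vector(vec.tolist(), knn,
                                       include_distances=True)
        ind.append(i)
        dist.append(d)

    ind = np.asarray(ind)
    if include_distances:
        return ind, np.asarray(dist)
    return ind


def nn_exact(ds1, ds2, knn=10):
    """Exact Euclidean k-NN by brute force (assumed baseline for checks)."""
    # Pairwise differences via broadcasting: shape (n_query, n_ref, dim).
    diff = ds1[:, None, :] - ds2[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row by distance and keep the knn closest reference indices.
    ind = np.argsort(d, axis=1, kind="stable")[:, :knn]
    return ind, np.take_along_axis(d, ind, axis=1)


rng = np.random.default_rng(0)
query = rng.normal(size=(5, 8))
ref = rng.normal(size=(100, 8))
ind, dist = nn_exact(query, ref, knn=3)
print(ind.shape, dist.shape)  # (5, 3) (5, 3)
```

The brute-force baseline materializes an (n_query, n_ref, dim) array, so it is only practical for small inputs. On larger data, the fraction of exact neighbors that the approximate search recovers is one way to judge whether n_trees is high enough.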