FlatNav Index Module
This module provides interfaces to create and manipulate FlatNav index structures.
create
- flatnav.index.create(distance_type: str, dim: int, dataset_size: int, max_edges_per_node: int, index_data_type: flatnav._core.data_type.DataType = <DataType.float32: 9>, verbose: bool = False, collect_stats: bool = False) → object
Constructs an in-memory index with the given parameters. Args:
distance_type (str): The distance metric to use (‘l2’ for Euclidean, ‘angular’ for inner product).
dim (int): The number of dimensions in the dataset.
dataset_size (int): The number of vectors in the dataset.
max_edges_per_node (int): The maximum number of edges per node in the graph.
index_data_type (DataType, optional): The data type of the indexed vectors. Defaults to DataType.float32.
verbose (bool, optional): Enables verbose output. Defaults to False.
collect_stats (bool, optional): Collects performance statistics. Defaults to False.
- Returns:
Union[IndexL2Float, IndexIPFloat]: The constructed index.
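A minimal sketch of calling create with the parameters listed above. It assumes the module is importable as `flatnav.index`; the random dataset and the value 16 for max_edges_per_node are illustrative choices, not recommendations. The import is guarded so the snippet still runs where FlatNav is not installed.

```python
import numpy as np

# Synthetic float32 dataset: 1,000 vectors with 32 dimensions (illustrative sizes).
rng = np.random.default_rng(seed=0)
data = rng.standard_normal((1000, 32)).astype(np.float32)

try:
    from flatnav import index as flatnav_index
except ImportError:
    flatnav_index = None  # FlatNav is not installed; skip index construction.

index = None
if flatnav_index is not None:
    index = flatnav_index.create(
        distance_type="l2",            # Euclidean distance
        dim=data.shape[1],             # vector dimensionality
        dataset_size=data.shape[0],    # number of vectors to be indexed
        max_edges_per_node=16,         # illustrative value
        verbose=False,
        collect_stats=False,
    )
```

The returned object is empty at this point; vectors are inserted afterwards with the add method of the index class.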
Index Classes
Classes for managing FlatNav indices.
IndexL2Float
- class flatnav.index.IndexL2Float
Bases:
pybind11_object
- add(self: flatnav._core.index.IndexL2Float, data: numpy.ndarray, ef_construction: int, num_initializations: int = 100, labels: object = None) → None
Add vectors (data) to the index with the given ef_construction parameter and optional labels. ef_construction determines how many vertices are visited while inserting each vector into the underlying graph structure. Args:
data (np.ndarray): The data to add to the index.
ef_construction (int): The number of vertices to visit while inserting each vector into the graph.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
labels (Optional[np.ndarray], optional): The labels for the data. Defaults to None.
- Returns:
None
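The add call can be wrapped as in the sketch below. The helper name `add_with_labels` is hypothetical, and two assumptions are made that the text above does not confirm: that data should be a 2-D, C-contiguous float32 array of shape (n, dim), and that labels may be an integer array with one entry per row.

```python
import numpy as np

def add_with_labels(index, data, ef_construction=100):
    """Add `data` to `index`, labelling row i with the integer i.

    `index` is any object exposing the add() method documented above.
    """
    # Assumption: the index expects a contiguous 2-D float32 array.
    data = np.ascontiguousarray(data, dtype=np.float32)
    if data.ndim != 2:
        raise ValueError("data must be a 2-D array of shape (n, dim)")
    # Assumption: one integer label per row is an acceptable labels argument.
    labels = np.arange(data.shape[0])
    index.add(data=data, ef_construction=ef_construction, labels=labels)
    return labels
```

Returning the labels makes it easy to map search results back to rows of the original array.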
- allocate_nodes(self: flatnav._core.index.IndexL2Float, data: numpy.ndarray[numpy.float32]) → flatnav._core.index.IndexL2Float
Allocate nodes in the underlying graph structure for the given data. Unlike the add method, this method does not construct the edge connectivity. It only allocates memory for each node in the graph. When using this method, you should invoke build_graph_links explicitly.
`NOTE`: In most cases you should not need to use this method. Args:
data (np.ndarray): The data to add to the index.
- Returns:
None
- build_graph_links(self: flatnav._core.index.IndexL2Float, mtx_filename: str) → None
Construct the edge connectivity of the underlying graph. This method should be invoked after allocating nodes using the allocate_nodes method. Args:
mtx_filename (str): The filename of the matrix file.
- Returns:
None
- get_graph_outdegree_table(self: flatnav._core.index.IndexL2Float) → List[List[int]]
Returns the outdegree table (adjacency list) representation of the underlying graph. Returns:
List[List[int]]: The outdegree table.
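Because the outdegree table is a plain list of neighbor lists, it can be summarized with standard Python. The sketch below uses a toy adjacency list in place of a real call to get_graph_outdegree_table; the helper name `outdegree_histogram` is hypothetical.

```python
from collections import Counter

def outdegree_histogram(table):
    """Map each out-degree to the number of nodes that have it."""
    return Counter(len(neighbors) for neighbors in table)

# Toy adjacency list standing in for index.get_graph_outdegree_table().
toy_table = [[1, 2], [0], [0, 1], []]

histogram = outdegree_histogram(toy_table)
mean_degree = sum(len(neighbors) for neighbors in toy_table) / len(toy_table)
```

With a real index, replacing `toy_table` by the returned table gives a quick view of how close the nodes are to max_edges_per_node.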
- get_query_distance_computations(self: flatnav._core.index.IndexL2Float) → int
Returns the number of distance computations performed during the last search operation. This method also resets the distance computations counter. Returns:
int: The number of distance computations.
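Since reading the counter also resets it, one way to obtain per-query figures is to read it after every single-query search. The sketch below does that; the helper name is hypothetical, and it assumes the index was created with collect_stats=True, which may be required for the counter to be populated.

```python
import numpy as np

def mean_distance_computations(index, queries, k, ef_search):
    """Average number of distance computations per query.

    `index` is any object exposing search_single() and
    get_query_distance_computations() as documented for the index classes.
    """
    counts = []
    for query in queries:
        index.search_single(query=query, K=k, ef_search=ef_search)
        # Reading the counter resets it, so each read covers exactly one query.
        counts.append(index.get_query_distance_computations())
    return float(np.mean(counts))
```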
- static load_index(filename: str) → flatnav._core.index.IndexL2Float
Load a FlatNav index from a given file location. Args:
filename (str): The file location to load the index from.
- Returns:
IndexL2Float: The loaded index.
- property max_edges_per_node
- property num_threads
Returns the number of threads used for constructing the graph and/or performing KNN search. Returns:
int: The number of threads.
- reorder(self: flatnav._core.index.IndexL2Float, strategies: List[str]) → None
Perform graph re-ordering based on the given sequence of re-ordering strategies. Supported re-ordering strategies include gorder and rcm. Reference:
Graph Reordering for Cache-Efficient Near Neighbor Search: https://arxiv.org/pdf/2104.03221
- Args:
strategies (List[str]): The sequence of re-ordering strategies.
- Returns:
None
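A small guard around reorder can catch misspelled strategy names before the call reaches the index. This sketch treats gorder and rcm as the only valid names, which is an assumption: the text above says the supported strategies include these two, not that the list is exhaustive. The helper name `reorder_checked` is hypothetical.

```python
# Assumption: only the two strategies named in the documentation are accepted.
SUPPORTED_STRATEGIES = ("gorder", "rcm")

def reorder_checked(index, strategies):
    """Validate strategy names, then apply them to `index` in the given order."""
    unknown = [s for s in strategies if s not in SUPPORTED_STRATEGIES]
    if unknown:
        raise ValueError(f"unsupported re-ordering strategies: {unknown}")
    index.reorder(strategies=list(strategies))
```

For example, `reorder_checked(index, ["gorder", "rcm"])` applies gorder first and rcm second.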
- save(self: flatnav._core.index.IndexL2Float, filename: str) → None
Save a FlatNav index at the given file location. Args:
filename (str): The file location to save the index.
- Returns:
None
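save and load_index can be combined into a round trip, sketched below. The helper name and the file name `flatnav.index` are hypothetical; the sketch relies only on save being an instance method and load_index being a static method of the same class, as the signatures above indicate.

```python
import os
import tempfile

def save_and_reload(index, directory=None):
    """Save `index` to disk and load it back through its own class."""
    directory = directory or tempfile.mkdtemp()
    path = os.path.join(directory, "flatnav.index")  # hypothetical file name
    index.save(filename=path)
    # load_index is static, so it is called on the class rather than the instance.
    return type(index).load_index(filename=path)
```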
- search(self: flatnav._core.index.IndexL2Float, queries: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
This is a batched version of the search_single method. Returns the top K closest data points for every query in the provided queries. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for every query.
- Args:
queries (np.ndarray): The query vectors.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for every query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
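The label IDs returned by search are approximate nearest neighbors, so a common check is to compare them against exact neighbors on a small sample. The sketch below computes exact neighbors by brute force with NumPy and measures recall; both helper names are hypothetical, and the comparison assumes the labels correspond to row positions in the indexed array.

```python
import numpy as np

def exact_knn_l2(data, queries, k):
    """Brute-force top-k neighbor IDs under squared Euclidean distance."""
    # ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2, computed for all pairs at once.
    d2 = (
        (queries ** 2).sum(axis=1, keepdims=True)
        - 2.0 * queries @ data.T
        + (data ** 2).sum(axis=1)
    )
    return np.argsort(d2, axis=1)[:, :k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of exact neighbors that also appear in the approximate result."""
    hits = sum(
        len(set(a) & set(e))
        for a, e in zip(approx_ids.tolist(), exact_ids.tolist())
    )
    return hits / exact_ids.size
```

With a built index, `distances, ids = index.search(queries=queries, K=10, ef_search=100)` followed by `recall_at_k(ids, exact_knn_l2(data, queries, 10))` gives the recall for that ef_search setting; the values 10 and 100 are illustrative.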
- search_single(self: flatnav._core.index.IndexL2Float, query: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
Returns the top K closest data points for the given query. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for the query.
- Args:
query (np.ndarray): The query vector.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for the query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- set_data_type(self: flatnav._core.index.IndexL2Float, data_type: flatnav._core.data_type.DataType) → None
- set_num_threads(self: flatnav._core.index.IndexL2Float, num_threads: int) → None
Set the number of threads to use for constructing the graph and/or performing KNN search. Args:
num_threads (int): The number of threads to use.
- Returns:
None
IndexL2Uint8
- class flatnav.index.IndexL2Uint8
Bases:
pybind11_object
- add(self: flatnav._core.index.IndexL2Uint8, data: numpy.ndarray, ef_construction: int, num_initializations: int = 100, labels: object = None) → None
Add vectors (data) to the index with the given ef_construction parameter and optional labels. ef_construction determines how many vertices are visited while inserting each vector into the underlying graph structure. Args:
data (np.ndarray): The data to add to the index.
ef_construction (int): The number of vertices to visit while inserting each vector into the graph.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
labels (Optional[np.ndarray], optional): The labels for the data. Defaults to None.
- Returns:
None
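The Uint8 index classes operate on 8-bit data, so floating-point vectors need to be converted first. The sketch below shows one possible conversion, a global min-max rescaling to [0, 255]. It is not part of FlatNav, the helper name is hypothetical, and whether linear scaling preserves neighbor quality well enough depends on the dataset.

```python
import numpy as np

def quantize_to_uint8(data):
    """Linearly rescale a float array to the range [0, 255] and cast to uint8."""
    data = np.asarray(data, dtype=np.float32)
    lo, hi = float(data.min()), float(data.max())
    if hi == lo:
        # Constant input: every value maps to zero.
        return np.zeros(data.shape, dtype=np.uint8)
    scaled = (data - lo) / (hi - lo) * 255.0
    # Round to the nearest integer and clip defensively before casting.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
```

Queries must be quantized with the same minimum and maximum as the indexed data for distances to be comparable; this sketch recomputes them per call, so reuse the dataset's bounds in practice.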
- allocate_nodes(self: flatnav._core.index.IndexL2Uint8, data: numpy.ndarray[numpy.float32]) → flatnav._core.index.IndexL2Uint8
Allocate nodes in the underlying graph structure for the given data. Unlike the add method, this method does not construct the edge connectivity. It only allocates memory for each node in the graph. When using this method, you should invoke build_graph_links explicitly.
`NOTE`: In most cases you should not need to use this method. Args:
data (np.ndarray): The data to add to the index.
- Returns:
None
- build_graph_links(self: flatnav._core.index.IndexL2Uint8, mtx_filename: str) → None
Construct the edge connectivity of the underlying graph. This method should be invoked after allocating nodes using the allocate_nodes method. Args:
mtx_filename (str): The filename of the matrix file.
- Returns:
None
- get_graph_outdegree_table(self: flatnav._core.index.IndexL2Uint8) → List[List[int]]
Returns the outdegree table (adjacency list) representation of the underlying graph. Returns:
List[List[int]]: The outdegree table.
- get_query_distance_computations(self: flatnav._core.index.IndexL2Uint8) → int
Returns the number of distance computations performed during the last search operation. This method also resets the distance computations counter. Returns:
int: The number of distance computations.
- static load_index(filename: str) → flatnav._core.index.IndexL2Uint8
Load a FlatNav index from a given file location. Args:
filename (str): The file location to load the index from.
- Returns:
IndexL2Uint8: The loaded index.
- property max_edges_per_node
- property num_threads
Returns the number of threads used for constructing the graph and/or performing KNN search. Returns:
int: The number of threads.
- reorder(self: flatnav._core.index.IndexL2Uint8, strategies: List[str]) → None
Perform graph re-ordering based on the given sequence of re-ordering strategies. Supported re-ordering strategies include gorder and rcm. Reference:
Graph Reordering for Cache-Efficient Near Neighbor Search: https://arxiv.org/pdf/2104.03221
- Args:
strategies (List[str]): The sequence of re-ordering strategies.
- Returns:
None
- save(self: flatnav._core.index.IndexL2Uint8, filename: str) → None
Save a FlatNav index at the given file location. Args:
filename (str): The file location to save the index.
- Returns:
None
- search(self: flatnav._core.index.IndexL2Uint8, queries: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
This is a batched version of the search_single method. Returns the top K closest data points for every query in the provided queries. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for every query.
- Args:
queries (np.ndarray): The query vectors.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for every query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- search_single(self: flatnav._core.index.IndexL2Uint8, query: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
Returns the top K closest data points for the given query. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for the query.
- Args:
query (np.ndarray): The query vector.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for the query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- set_data_type(self: flatnav._core.index.IndexL2Uint8, data_type: flatnav._core.data_type.DataType) → None
- set_num_threads(self: flatnav._core.index.IndexL2Uint8, num_threads: int) → None
Set the number of threads to use for constructing the graph and/or performing KNN search. Args:
num_threads (int): The number of threads to use.
- Returns:
None
IndexIPFloat
- class flatnav.index.IndexIPFloat
Bases:
pybind11_object
- add(self: flatnav._core.index.IndexIPFloat, data: numpy.ndarray, ef_construction: int, num_initializations: int = 100, labels: object = None) → None
Add vectors (data) to the index with the given ef_construction parameter and optional labels. ef_construction determines how many vertices are visited while inserting each vector into the underlying graph structure. Args:
data (np.ndarray): The data to add to the index.
ef_construction (int): The number of vertices to visit while inserting each vector into the graph.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
labels (Optional[np.ndarray], optional): The labels for the data. Defaults to None.
- Returns:
None
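For inner-product indices, vectors are often scaled to unit length so that the inner product equals cosine similarity. Whether FlatNav normalizes vectors internally is not stated here, so the sketch below is a precaution under that assumption rather than a required step; the helper name `l2_normalize` is hypothetical.

```python
import numpy as np

def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit Euclidean norm; all-zero rows are left as zeros."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # eps avoids division by zero for all-zero rows.
    return vectors / np.maximum(norms, eps)
```

If normalization is used, apply it to both the indexed data and the queries.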
- allocate_nodes(self: flatnav._core.index.IndexIPFloat, data: numpy.ndarray[numpy.float32]) → flatnav._core.index.IndexIPFloat
Allocate nodes in the underlying graph structure for the given data. Unlike the add method, this method does not construct the edge connectivity. It only allocates memory for each node in the graph. When using this method, you should invoke build_graph_links explicitly.
`NOTE`: In most cases you should not need to use this method. Args:
data (np.ndarray): The data to add to the index.
- Returns:
None
- build_graph_links(self: flatnav._core.index.IndexIPFloat, mtx_filename: str) → None
Construct the edge connectivity of the underlying graph. This method should be invoked after allocating nodes using the allocate_nodes method. Args:
mtx_filename (str): The filename of the matrix file.
- Returns:
None
- get_graph_outdegree_table(self: flatnav._core.index.IndexIPFloat) → List[List[int]]
Returns the outdegree table (adjacency list) representation of the underlying graph. Returns:
List[List[int]]: The outdegree table.
- get_query_distance_computations(self: flatnav._core.index.IndexIPFloat) → int
Returns the number of distance computations performed during the last search operation. This method also resets the distance computations counter. Returns:
int: The number of distance computations.
- static load_index(filename: str) → flatnav._core.index.IndexIPFloat
Load a FlatNav index from a given file location. Args:
filename (str): The file location to load the index from.
- Returns:
IndexIPFloat: The loaded index.
- property max_edges_per_node
- property num_threads
Returns the number of threads used for constructing the graph and/or performing KNN search. Returns:
int: The number of threads.
- reorder(self: flatnav._core.index.IndexIPFloat, strategies: List[str]) → None
Perform graph re-ordering based on the given sequence of re-ordering strategies. Supported re-ordering strategies include gorder and rcm. Reference:
Graph Reordering for Cache-Efficient Near Neighbor Search: https://arxiv.org/pdf/2104.03221
- Args:
strategies (List[str]): The sequence of re-ordering strategies.
- Returns:
None
- save(self: flatnav._core.index.IndexIPFloat, filename: str) → None
Save a FlatNav index at the given file location. Args:
filename (str): The file location to save the index.
- Returns:
None
- search(self: flatnav._core.index.IndexIPFloat, queries: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
This is a batched version of the search_single method. Returns the top K closest data points for every query in the provided queries. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for every query.
- Args:
queries (np.ndarray): The query vectors.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for every query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- search_single(self: flatnav._core.index.IndexIPFloat, query: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
Returns the top K closest data points for the given query. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for the query.
- Args:
query (np.ndarray): The query vector.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for the query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- set_data_type(self: flatnav._core.index.IndexIPFloat, data_type: flatnav._core.data_type.DataType) → None
- set_num_threads(self: flatnav._core.index.IndexIPFloat, num_threads: int) → None
Set the number of threads to use for constructing the graph and/or performing KNN search. Args:
num_threads (int): The number of threads to use.
- Returns:
None
IndexIPUint8
- class flatnav.index.IndexIPUint8
Bases:
pybind11_object
- add(self: flatnav._core.index.IndexIPUint8, data: numpy.ndarray, ef_construction: int, num_initializations: int = 100, labels: object = None) → None
Add vectors (data) to the index with the given ef_construction parameter and optional labels. ef_construction determines how many vertices are visited while inserting each vector into the underlying graph structure. Args:
data (np.ndarray): The data to add to the index.
ef_construction (int): The number of vertices to visit while inserting each vector into the graph.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
labels (Optional[np.ndarray], optional): The labels for the data. Defaults to None.
- Returns:
None
- allocate_nodes(self: flatnav._core.index.IndexIPUint8, data: numpy.ndarray[numpy.float32]) → flatnav._core.index.IndexIPUint8
Allocate nodes in the underlying graph structure for the given data. Unlike the add method, this method does not construct the edge connectivity. It only allocates memory for each node in the graph. When using this method, you should invoke build_graph_links explicitly.
`NOTE`: In most cases you should not need to use this method. Args:
data (np.ndarray): The data to add to the index.
- Returns:
None
- build_graph_links(self: flatnav._core.index.IndexIPUint8, mtx_filename: str) → None
Construct the edge connectivity of the underlying graph. This method should be invoked after allocating nodes using the allocate_nodes method. Args:
mtx_filename (str): The filename of the matrix file.
- Returns:
None
- get_graph_outdegree_table(self: flatnav._core.index.IndexIPUint8) → List[List[int]]
Returns the outdegree table (adjacency list) representation of the underlying graph. Returns:
List[List[int]]: The outdegree table.
- get_query_distance_computations(self: flatnav._core.index.IndexIPUint8) → int
Returns the number of distance computations performed during the last search operation. This method also resets the distance computations counter. Returns:
int: The number of distance computations.
- static load_index(filename: str) → flatnav._core.index.IndexIPUint8
Load a FlatNav index from a given file location. Args:
filename (str): The file location to load the index from.
- Returns:
IndexIPUint8: The loaded index.
- property max_edges_per_node
- property num_threads
Returns the number of threads used for constructing the graph and/or performing KNN search. Returns:
int: The number of threads.
- reorder(self: flatnav._core.index.IndexIPUint8, strategies: List[str]) → None
Perform graph re-ordering based on the given sequence of re-ordering strategies. Supported re-ordering strategies include gorder and rcm. Reference:
Graph Reordering for Cache-Efficient Near Neighbor Search: https://arxiv.org/pdf/2104.03221
- Args:
strategies (List[str]): The sequence of re-ordering strategies.
- Returns:
None
- save(self: flatnav._core.index.IndexIPUint8, filename: str) → None
Save a FlatNav index at the given file location. Args:
filename (str): The file location to save the index.
- Returns:
None
- search(self: flatnav._core.index.IndexIPUint8, queries: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
This is a batched version of the search_single method. Returns the top K closest data points for every query in the provided queries. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for every query.
- Args:
queries (np.ndarray): The query vectors.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for every query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- search_single(self: flatnav._core.index.IndexIPUint8, query: numpy.ndarray, K: int, ef_search: int, num_initializations: int = 100) → Tuple[numpy.ndarray[numpy.float32], numpy.ndarray[numpy.int32]]
Returns the top K closest data points for the given query. The results are returned as a tuple of distances and label IDs. The ef_search parameter determines how many neighbors are visited while finding the closest neighbors for the query.
- Args:
query (np.ndarray): The query vector.
K (int): The number of neighbors to return.
ef_search (int): The number of neighbors to visit while finding the closest neighbors for the query.
num_initializations (int, optional): The number of initializations to perform. Defaults to 100.
- Returns:
Tuple[np.ndarray, np.ndarray]: The distances and label IDs of the closest neighbors.
- set_data_type(self: flatnav._core.index.IndexIPUint8, data_type: flatnav._core.data_type.DataType) → None
- set_num_threads(self: flatnav._core.index.IndexIPUint8, num_threads: int) → None
Set the number of threads to use for constructing the graph and/or performing KNN search. Args:
num_threads (int): The number of threads to use.
- Returns:
None