Vector Databases
A vector database is specifically designed to operate on embedding vectors. As the popularity of LLMs and generative AI has grown recently, so has the use of embeddings to encode unstructured data. Vector databases have emerged as an effective solution for enterprises to deliver and scale these use cases.
What is a Vector DB?
Vector databases are specialized databases that store data as high-dimensional vectors alongside the original content. They combine the capabilities of vector indexes with those of traditional databases, such as optimized storage, scalability, flexibility, and query language support. This lets users find and retrieve similar or relevant data based on its semantic or contextual meaning.
Vector databases can help RAG models quickly find the most similar documents or passages to a given query and use them as additional context for the LLM.
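As a rough illustration of the retrieval step described above, the sketch below keeps each vector next to its original text and returns the closest passages for a query. The `TinyVectorStore` class, the three-dimensional vectors, and the sample texts are all hypothetical; a real system would obtain embeddings from an embedding model and use a proper vector database rather than a Python list.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class TinyVectorStore:
    """Hypothetical in-memory store: each record is (vector, original text)."""

    def __init__(self):
        self.records = []

    def add(self, vector, text):
        self.records.append((vector, text))

    def query(self, query_vector, k=2):
        # Score every record (brute force), then keep the k most similar.
        scored = [(cosine_similarity(query_vector, vec), text)
                  for vec, text in self.records]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

store = TinyVectorStore()
# Hand-written toy embeddings (assumption: stand-ins for real model output).
store.add([0.9, 0.1, 0.0], "HNSW builds a layered proximity graph.")
store.add([0.8, 0.2, 0.1], "IVF partitions vectors into clusters.")
store.add([0.0, 0.1, 0.9], "The cafeteria opens at 8 am.")

query_vector = [1.0, 0.0, 0.0]  # toy embedding of the user's question
top = store.query(query_vector, k=2)

# The retrieved passages become extra context in the LLM prompt.
context = "\n".join(text for _, text in top)
prompt = f"Context:\n{context}\n\nQuestion: How do vector indexes work?"
```

Here the two index-related passages are retrieved and the unrelated one is left out of the prompt, which is the behavior a RAG pipeline relies on.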
Vector Index
A vector index is a data structure within a vector database that speeds up search and retrieval. It is particularly suited to the high-dimensional vector data encountered with LLMs.
With a vector index, the system can run fast similarity searches that identify the vectors closest to a given input vector.
How to convert embeddings to a vector index?
When creating a vector index for your embeddings, you have several choices to make: exact or approximate nearest neighbor search (HNSW and IVF are common approximate algorithms), the distance metric (e.g., cosine or Euclidean), and optional compression techniques (e.g., quantization or pruning).
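To make the idea of an approximate index more concrete, here is a minimal sketch of the IVF (inverted file) approach: vectors are grouped under their nearest centroid, and a query scans only the closest groups. The function names, the fixed `centroids`, and the `nprobe` parameter name are illustrative assumptions; production libraries typically learn the centroids with k-means and add compression on top.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf_index(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: euclidean(vec, centroids[c]))
        lists[nearest].append(vec_id)
    return lists

def ivf_search(query, vectors, centroids, lists, k=1, nprobe=1):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)),
                         key=lambda c: euclidean(query, centroids[c]))
    candidates = [vid for c in probe_order[:nprobe] for vid in lists[c]]
    candidates.sort(key=lambda vid: euclidean(query, vectors[vid]))
    return candidates[:k]

# Toy 2-D embeddings (assumption: real embeddings have far more dimensions).
vectors = [[0.1, 0.2], [0.0, 0.1], [0.9, 1.0], [1.0, 0.8], [0.2, 0.0]]
# Hypothetical fixed centroids; normally these are learned from the data.
centroids = [[0.0, 0.0], [1.0, 1.0]]

index = build_ivf_index(vectors, centroids)
result = ivf_search([0.9, 0.95], vectors, centroids, index, k=1, nprobe=1)
```

With `nprobe=1` the search inspects only two of the five vectors. That is the trade-off behind approximate methods: less work per query, at the risk of missing a true neighbor that sits in a list that was not probed. Raising `nprobe` moves the result closer to an exact search.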
Vector Search
A vector search is used to find the most relevant documents or passages to the query based on the similarity between the query vector and the document vectors in the index.
Similarity measures are mathematical methods that compare two vectors and compute a similarity or distance value between them. This value indicates how similar or dissimilar the two vectors are in terms of their semantic meaning.
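The two measures mentioned earlier, cosine and Euclidean, can be sketched in a few lines. The vectors below are hand-picked toy values, not real embeddings, and the variable names are purely illustrative.

```python
import math

def cosine_similarity(a, b):
    """Ranges from -1 to 1; higher means the vectors point in more similar directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance; lower means the vectors are closer together."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
doc_same_direction = [2.0, 0.0]  # same direction as the query, larger magnitude
doc_orthogonal = [0.0, 1.0]      # unrelated direction

cos_same = cosine_similarity(query, doc_same_direction)
cos_orth = cosine_similarity(query, doc_orthogonal)
dist_same = euclidean_distance(query, doc_same_direction)
dist_orth = euclidean_distance(query, doc_orthogonal)
```

Cosine similarity ignores vector length, so `doc_same_direction` scores a perfect 1.0 even though it is a distance of 1.0 away from the query in Euclidean terms. Which measure fits best depends on how the embeddings were trained; if all vectors are normalized to unit length, the two measures produce the same ranking.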