pgvector
by
pgvector

Description: Open-source vector similarity search for Postgres

View pgvector/pgvector on GitHub ↗

Summary Information

Updated 3 hours ago
Added to GitGenius on March 19th, 2026
Created on April 20th, 2021
Open Issues/Pull Requests: 14 (+0)
Number of forks: 1,143
Total Stargazers: 20,828 (+3)
Total Subscribers: 131 (+0)

Detailed Description

pgvector is an open-source extension that adds vector similarity search to PostgreSQL. It stores vector embeddings in ordinary tables and retrieves them with nearest neighbor and other similarity-based queries run directly in the database. For many workloads this removes the need for a separate vector database, and vector data gets the same ACID compliance and other features as the rest of the data in Postgres.

The core features of pgvector revolve around storing and querying vectors. It supports single-precision, half-precision, binary, and sparse vectors, so users can trade precision against storage size and speed. Vector columns are defined on regular tables, so vector data sits alongside other relational data. The extension supports L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance (the last two for binary vectors), so the similarity measure can be matched to the application.
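As a rough illustration of the vector types and distance measures listed above, the sketch below declares one column per type and evaluates each distance operator on literal values. It assumes the extension is already enabled in the current database and that the installed pgvector version is recent enough to include the half-precision and sparse types. The table and column names are hypothetical.

```sql
-- Hypothetical demo table with one column per supported vector type.
CREATE TABLE IF NOT EXISTS vector_type_demo (
    id bigserial PRIMARY KEY,
    full_embedding vector(3),       -- single-precision
    half_embedding halfvec(3),      -- half-precision
    bit_signature bit(3),           -- binary
    sparse_embedding sparsevec(5)   -- sparse, written as {index:value,...}/dimensions
);

INSERT INTO vector_type_demo
    (full_embedding, half_embedding, bit_signature, sparse_embedding)
VALUES
    ('[1,2,3]', '[1,2,3]', B'101', '{1:1,3:2,5:3}/5');

-- One distance operator per measure.
-- <#> returns the NEGATIVE inner product, so it is multiplied by -1 here.
SELECT
    '[1,2,3]'::vector <-> '[4,5,6]'::vector        AS l2_distance,
    ('[1,2,3]'::vector <#> '[4,5,6]'::vector) * -1 AS inner_product,
    '[1,2,3]'::vector <=> '[4,5,6]'::vector        AS cosine_distance,
    '[1,2,3]'::vector <+> '[4,5,6]'::vector        AS l1_distance,
    B'101' <~> B'110'                              AS hamming_distance,
    B'101' <%> B'110'                              AS jaccard_distance;
```

Cosine similarity, if needed, is one minus the cosine distance returned by `<=>`.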

A key aspect of pgvector is its support for both exact and approximate nearest neighbor search. By default, pgvector performs exact searches, which guarantees perfect recall. For large datasets, approximate search is often preferred because it trades some recall for speed. pgvector offers two approximate index types: HNSW (Hierarchical Navigable Small World) and IVFFlat (inverted file with flat, uncompressed vectors). HNSW generally gives a better speed-recall trade-off at query time, while IVFFlat builds faster and uses less memory. The choice depends on the size of the dataset, the desired query performance, and the acceptable level of recall.
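The two index types could be created along the following lines. This is a minimal sketch, assuming the extension is enabled. The `items` table and the index names are hypothetical, and the parameter values are illustrative starting points rather than recommendations.

```sql
-- Hypothetical table; without an index, queries on it use exact search.
CREATE TABLE IF NOT EXISTS items (
    id bigserial PRIMARY KEY,
    embedding vector(3)
);

-- HNSW index for L2 distance. m and ef_construction are written out
-- at their default values to show where they are set.
CREATE INDEX IF NOT EXISTS items_embedding_hnsw_idx
    ON items USING hnsw (embedding vector_l2_ops)
    WITH (m = 16, ef_construction = 64);

-- IVFFlat index for cosine distance. It is best created after the table
-- holds data, because the lists are derived from the existing rows.
CREATE INDEX IF NOT EXISTS items_embedding_ivfflat_idx
    ON items USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- Query-time knobs: higher values improve recall at the cost of speed.
SET hnsw.ef_search = 100;  -- default is 40
SET ivfflat.probes = 10;   -- default is 1
```

Each index serves only the distance operator that matches its operator class, so a query ordered by `<->` can use the first index and one ordered by `<=>` the second.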

Installation of pgvector is straightforward, with instructions provided for various operating systems, including Linux, macOS, and Windows. The extension can be installed through several methods, including compiling from source, using package managers like Homebrew, APT, and Yum, or utilizing Docker. Once installed, the extension is enabled within a database using a simple `CREATE EXTENSION vector;` command. Users can then create tables with vector columns and insert vector data.
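Once the extension files are installed on the server, the first steps in a database might look like this. It is a minimal sketch. The `items` table and its three-dimensional `embedding` column are hypothetical, and real embeddings usually have hundreds or thousands of dimensions.

```sql
-- Enable the extension once per database.
CREATE EXTENSION IF NOT EXISTS vector;

-- Hypothetical table with a 3-dimensional vector column.
CREATE TABLE IF NOT EXISTS items (
    id bigserial PRIMARY KEY,
    embedding vector(3)
);

-- Vectors are written as bracketed, comma-separated text literals.
INSERT INTO items (embedding)
VALUES ('[1,2,3]'), ('[4,5,6]');
```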

Queries are written in plain SQL. To find the nearest neighbors of a given vector, users order rows by a distance operator in the `ORDER BY` clause and cap the result with `LIMIT`. A `WHERE` clause can filter on other columns, so vector search combines with ordinary relational conditions. pgvector also supports indexing subvectors through expression indexes, which enables efficient search on a portion of each vector. It can also be used for hybrid search, which combines vector search with Postgres full-text search for more comprehensive results.
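The query patterns described above can be sketched as follows, assuming the extension is enabled. The `documents` table, its columns, the index name, and the query vectors are all hypothetical.

```sql
CREATE TABLE IF NOT EXISTS documents (
    id bigserial PRIMARY KEY,
    category_id integer,
    content text,
    embedding vector(3)
);

-- Nearest neighbors by L2 distance.
SELECT id, content
FROM documents
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;

-- Vector search combined with an ordinary relational filter.
SELECT id, content
FROM documents
WHERE category_id = 123
ORDER BY embedding <=> '[3,1,2]'
LIMIT 5;

-- Subvector search: an expression index over the first two dimensions,
-- and a query that orders by the same expression.
CREATE INDEX IF NOT EXISTS documents_subvector_idx
    ON documents USING hnsw ((subvector(embedding, 1, 2)::vector(2)) vector_cosine_ops);

SELECT id
FROM documents
ORDER BY subvector(embedding, 1, 2)::vector(2) <=> '[3,1]'
LIMIT 5;

-- Keyword half of a hybrid search, using Postgres full-text search.
-- Merging these rows with the vector results is left to the application.
SELECT id, content
FROM documents, plainto_tsquery('english', 'vector search') AS query
WHERE to_tsvector('english', content) @@ query
ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC
LIMIT 5;
```

The subvector query has to use the same expression as the index definition, otherwise the planner cannot use the expression index.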

The purpose of pgvector is to provide a convenient and efficient way to perform vector similarity search within the familiar environment of PostgreSQL. Integrating vector search into the database simplifies data management, reduces the complexity of application architecture, and lets users rely on the reliability and scalability of PostgreSQL. This makes it an attractive option for applications such as recommendation systems, semantic search, image and video similarity, and anomaly detection, without the overhead of managing a separate vector database. The project's documentation also offers performance tuning tips, including `EXPLAIN (ANALYZE, BUFFERS)` for debugging slow queries and `COPY` for bulk loading data, to help optimize query performance and index creation.
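The tuning tips mentioned above might be applied as in the sketch below, written for a `psql` session with the extension enabled. The `items` table is hypothetical, the inline `COPY` data is only there to keep the example self-contained, and the memory setting is an illustrative value.

```sql
CREATE TABLE IF NOT EXISTS items (
    id bigserial PRIMARY KEY,
    embedding vector(3)
);

-- Bulk load with COPY instead of many single-row INSERTs.
-- In psql, the rows follow the command and end with a backslash-period line.
COPY items (embedding) FROM STDIN;
[1,2,3]
[4,5,6]
\.

-- Giving index builds more memory can speed up index creation
-- (illustrative value; size it to the server).
SET maintenance_work_mem = '1GB';

-- Show the actual plan, timings, and buffer usage of a
-- nearest neighbor query, including whether an index was used.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM items
ORDER BY embedding <-> '[3,1,2]'
LIMIT 5;
```

Loading data before creating an approximate index is generally faster than building the index first and inserting afterwards.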
