chatgpt-retrieval-plugin by openai

Description: The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

Repository: openai/chatgpt-retrieval-plugin on GitHub

Summary Information

Updated 1 hour ago
Added to GitGenius on May 20th, 2023
Created on March 23rd, 2023
Open Issues/Pull Requests: 266 (+0)
Number of forks: 3,659
Total Stargazers: 21,218 (+0)
Total Subscribers: 315 (+0)

Detailed Description

The OpenAI repository `chatgpt-retrieval-plugin` adds a retrieval mechanism to ChatGPT. By default, ChatGPT answers from the knowledge captured in its training data. With this plugin, it can also draw on an external knowledge source, specifically a vector database, while generating a response. This improves the accuracy, relevance, and freshness of its answers, particularly for questions that need up-to-date information or specialized knowledge absent from the model’s training data, such as personal or work documents. The core idea is to augment ChatGPT’s reasoning with retrieved context.

The repository provides a modular, standalone service that ChatGPT calls as a plugin. It separates three concerns: an HTTP API that acts as the central point of interaction (with endpoints such as `/query` and `/upsert`), a datastore layer that handles interaction with the vector database, and query handling that orchestrates the retrieval process. The datastore layer stores and retrieves embeddings, which are numerical vector representations of text. These embeddings are created using OpenAI’s text embedding models, which allows semantic similarity search rather than exact keyword matching. As a result, relevant passages can be found even when the query does not use the same words as the source document.
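The semantic similarity search described above can be sketched in a few lines. This is a minimal illustration and not the plugin's code: the three-dimensional vectors are made-up stand-ins for real embeddings (which come from an embedding model and have far more dimensions), and the names `cosine_similarity`, `search`, and `DOCUMENTS` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings": hand-picked vectors, not model output.
DOCUMENTS = {
    "Quarterly revenue grew 12% year over year.": [0.9, 0.1, 0.0],
    "The office closes at 6 pm on Fridays.": [0.0, 0.2, 0.9],
    "Sales increased compared with last year.": [0.8, 0.3, 0.1],
}


def search(query_embedding, documents, top_k=2):
    """Return the top_k (score, text) pairs ranked by cosine similarity."""
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:top_k]


# A query vector that points in roughly the same direction as the
# two finance-related documents.
results = search([1.0, 0.2, 0.0], DOCUMENTS)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Note that the two finance sentences rank highest even though they share almost no words with each other; the ranking depends only on vector direction, which is what lets embedding search go beyond keyword matching.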

The repository demonstrates the plugin’s functionality through a simple example using a vector database populated with Wikipedia articles. The example shows how the plugin identifies passages relevant to the user’s question and makes them available to ChatGPT when it writes its response. The plugin does not rewrite ChatGPT’s output; it supplies the model with additional context to inform generation. ChatGPT can then synthesize information from both its internal knowledge and the retrieved external data.
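To make the idea of "additional context" concrete, here is a conceptual sketch of how retrieved passages might be placed in front of a question before a model answers it. It is an assumption for illustration only: in the plugin setting ChatGPT itself decides how to use the returned passages, and the function name `build_augmented_prompt` and the prompt wording are hypothetical.

```python
def build_augmented_prompt(question, passages):
    """Combine retrieved passages and a user question into one prompt string."""
    # Number the passages so an answer can point back to its sources.
    context = "\n\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "Answer the question using the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical retrieved passages for a made-up question.
prompt = build_augmented_prompt(
    "When does the office close on Fridays?",
    [
        "The office closes at 6 pm on Fridays.",
        "The office is closed on public holidays.",
    ],
)
print(prompt)
```

The model's output is left untouched in this scheme; only its input grows, which matches the description above.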

Beyond the basic example, the repository includes detailed documentation, code examples, and instructions for setting up and running the plugin. It highlights the importance of choosing a suitable vector database (e.g., Chroma, Pinecone, Weaviate) and a suitable embedding model. The documentation emphasizes tuning the retrieval parameters, such as the number of retrieved passages and the similarity threshold, to optimize performance for a specific use case. The plugin’s modular design makes it adaptable to various knowledge sources and integration scenarios. The project is actively maintained and is a step towards more reliable conversational AI systems that connect large language models with external knowledge.
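The effect of the two retrieval parameters mentioned above can be seen in a small sketch. The scores and the names `select_passages`, `top_k`, and `min_score` are hypothetical and chosen for illustration; they are not taken from the repository.

```python
def select_passages(scored_passages, top_k=3, min_score=0.0):
    """Keep at most top_k passages whose similarity score is >= min_score."""
    ranked = sorted(scored_passages, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked[:top_k] if score >= min_score]


# Hypothetical (similarity score, passage) pairs from a vector search.
scored = [
    (0.91, "passage A"),
    (0.42, "passage C"),
    (0.78, "passage B"),
    (0.15, "passage D"),
]

# A loose setting returns more context, including weaker matches.
loose = select_passages(scored, top_k=3, min_score=0.0)

# A stricter threshold drops weak matches even if top_k allows them.
strict = select_passages(scored, top_k=3, min_score=0.5)

print(loose)
print(strict)
```

A larger `top_k` or lower threshold gives the model more material at the cost of more noise and longer prompts; a stricter setting does the opposite, so the right values depend on the document collection and the kinds of questions asked.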
