pathwaycom/llm-app

Description: Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3,...


Summary Information

Updated 10 minutes ago

Added to GitGenius on May 25th, 2026

Created on July 19th, 2023

Open Issues & Pull Requests: 10 (+0)

Number of forks: 1,434

Total Stargazers: 59,055 (+0)

Total Subscribers: 88 (+0)

Issue Activity (beta)

Open issues: 4

New in 7 days: 0

Closed in 7 days: 0

Avg open age: 472 days

Stale 30+ days: 4

Stale 90+ days: 3

Recent activity

Opened in 7 days: 0

Closed in 7 days: 0

Comments in 7 days: 0

Events in 7 days: 0

Top labels

bug (4)
documentation (1)
enhancement (1)
question (1)

Most active issues this week

No issue events were indexed in the last 7 days.


Repository Insights (GitGenius)

Median issue/PR response: N/A

Mean response time: 125.4 days

90th percentile: 1001.8 days

Tracked items: 8

Most active contributors

dxtrous - 17 events, 3 issues
szymondudycz - 7 events, 3 issues
m0kr4n3 - 3 events, 1 issue
ucas010 - 3 events, 2 issues
KamilPiechowiak - 2 events, 1 issue

Related by overlapping contributors

Detailed Description

The llm-app repository provides production-ready cloud templates for building retrieval-augmented generation (RAG) systems, AI pipelines, and enterprise search applications. Written primarily in Jupyter Notebook format, the repository offers Docker-friendly, ready-to-deploy application templates that maintain live synchronization with multiple data sources including Sharepoint, Google Drive, S3, Kafka, PostgreSQL, and real-time data APIs. The core purpose is enabling developers to quickly build and deploy AI applications with up-to-date knowledge from their data sources without requiring separate infrastructure setup.

The repository contains multiple specialized application templates designed for different use cases and accuracy requirements. The Question-Answering RAG App provides a basic end-to-end pipeline for document-based question answering using GPT models. The Live Document Indexing template functions as a vector store service with real-time indexing capabilities that can integrate with LangChain or LlamaIndex applications. A Multimodal RAG pipeline uses GPT-4o to parse PDFs and extract information from charts and tables in financial documents. The Unstructured-to-SQL pipeline converts unstructured financial data into SQL format and enables natural language querying against PostgreSQL tables. An Adaptive RAG implementation reduces token costs by up to 4x while maintaining accuracy. The Private RAG App offers a fully local version using Mistral and Ollama for privacy-conscious deployments. Additional templates include a Slides AI Search App for PowerPoint and PDF retrieval and a Video RAG pipeline using TwelveLabs for video content indexing.
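The description does not say how the Adaptive RAG template achieves its token savings. One plausible mechanism, sketched here purely as an assumption, is to start with a small number of retrieved documents and enlarge the context geometrically only when the model cannot answer. Every name below (`adaptive_answer`, `retrieve`, `ask_llm`, `start_k`, `factor`, `max_rounds`) is hypothetical and not taken from the repository; the retriever and the model are replaced by toy stand-ins so the sketch runs on its own.

```python
from typing import Callable, List, Optional


def adaptive_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    ask_llm: Callable[[str, List[str]], Optional[str]],
    start_k: int = 2,
    factor: int = 2,
    max_rounds: int = 4,
) -> Optional[str]:
    """Ask with a small context first; grow it geometrically on failure.

    `ask_llm` is expected to return None when the supplied documents
    are not enough to answer (an assumption of this sketch).
    """
    k = start_k
    for _ in range(max_rounds):
        docs = retrieve(question, k)
        answer = ask_llm(question, docs)
        if answer is not None:
            return answer
        k *= factor  # cheap rounds first, larger contexts only if needed
    return None


# Toy stand-ins for a real retriever and a real model (hypothetical).
TOY_CORPUS = ["alpha", "beta", "gamma", "the answer is 42"]
requested_sizes: List[int] = []


def toy_retrieve(question: str, k: int) -> List[str]:
    requested_sizes.append(k)  # record how many documents each round asked for
    return TOY_CORPUS[:k]


def toy_llm(question: str, docs: List[str]) -> Optional[str]:
    for doc in docs:
        if "answer" in doc:
            return doc
    return None
```

Calling `adaptive_answer("what is the answer?", toy_retrieve, toy_llm)` on the toy corpus takes two rounds: the first requests 2 documents and fails, the second requests 4 and succeeds. Questions that succeed in the first round never pay for the larger context, which is where savings of this kind would come from.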

These applications run as Docker containers and expose HTTP APIs for frontend integration, with optional Streamlit UIs included in some templates for quick testing and demonstrations. The underlying architecture relies on the Pathway Live Data Framework, a Python library with a Rust engine that handles data source synchronization and API request serving. Rather than requiring separate integrations of vector databases, caching layers, and API frameworks, Pathway combines these components in a single application. The framework uses the usearch library for vector indexing and Tantivy for hybrid full-text search, and both work out of the box without external dependencies.
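As an illustration of the HTTP integration described above, the following sketch builds a JSON POST request that a frontend might send to a running template. The host and port (`http://localhost:8000`), the endpoint path (`/v2/answer`), and the `prompt` field are assumptions made for this example; the README of the specific template should be checked for the actual route and payload. The helper names `build_answer_request` and `send` are hypothetical.

```python
import json
import urllib.request
from typing import Any


def build_answer_request(
    base_url: str,
    prompt: str,
    path: str = "/v2/answer",  # assumed route; verify against the template README
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a question."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a container listening locally, `send(build_answer_request("http://localhost:8000", "What do my documents say about pricing?"))` would perform the call. Keeping request construction separate from sending makes the payload easy to inspect without a running service.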

GitGenius activity tracking shows little recent issue activity: no issues were opened, closed, or commented on in the last 7 days, and the 4 open issues have an average open age of 472 days. Bug is the most common label (4), followed by documentation, enhancement, and question (1 each). No median issue and pull request response time is available; the mean response time is 125.4 days and the 90th percentile is 1001.8 days. The most active contributors are dxtrous, szymondudycz, and m0kr4n3. The repository shares overlapping contributors with memorilabs/memori, deepspeedai/deepspeed, and automatic1111/stable-diffusion-webui, which suggests some overlap with the wider AI and machine learning development community.

The templates are designed to scale to millions of pages of documents and can be deployed on major cloud platforms including GCP, AWS, Azure, and Render, or on-premises infrastructure. Each template includes comprehensive README documentation with setup instructions. The repository emphasizes ease of modification, allowing developers to adjust pipeline steps such as adding new data sources or switching between vector and hybrid indexing with minimal code changes. The project actively encourages community contributions across documentation, features, bug fixes, and code reviews, with support provided through a dedicated Discord server for developers planning contributions.
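To make the difference between vector and hybrid retrieval concrete, the sketch below merges a vector-search ranking and a full-text ranking with reciprocal rank fusion, a common way to combine ranked lists. Whether the templates use exactly this fusion rule, and the constant `k = 60`, are assumptions of this example; the function name `reciprocal_rank_fusion` and the document IDs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]],
    k: int = 60,
) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in
    (rank starts at 1); scores are summed across lists.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector index and a full-text index.
vector_hits = ["a", "b", "c"]
text_hits = ["b", "d", "a"]

vector_only = reciprocal_rank_fusion([vector_hits])
hybrid = reciprocal_rank_fusion([vector_hits, text_hits])
```

Here `vector_only` keeps the original order `["a", "b", "c"]`, while `hybrid` is `["b", "a", "d", "c"]`: document `b` rises to the top because both indexes rank it highly, and `d` enters the results even though the vector index missed it.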
