pathway by pathwaycom

Description: Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.

Summary Information

Updated 45 minutes ago

Added to GitGenius on August 14th, 2025

Created on November 27th, 2022

Open Issues & Pull Requests: 36 (+0)

Number of forks: 1,668

Total Stargazers: 62,698 (+0)

Total Subscribers: 116 (+0)

Issue Activity (beta)

Open issues: 32

New in 7 days: 0

Closed in 7 days: 1

Avg open age: 259 days

Stale 30+ days: 30

Stale 90+ days: 19

Recent activity

Opened in 7 days: 0

Closed in 7 days: 1

Comments in 7 days: 0

Events in 7 days: 1

Top labels

enhancement (81)
question (40)
bug (38)
priority:low (4)
confirmed (2)
documentation (2)
good first issue (2)
priority:normal (2)

Most active issues this week

#255 Promote `TwelveLabsVideoParser` and `MarengoEmbedder` from the Video RAG template into `pathway.xpacks.llm` - 2 events / 0 comments

Repository Insights (GitGenius)

Median issue/PR response: 0.0 hours

Mean response time: 43.8 days

90th percentile: 36.6 days

Tracked items: 156

Most active contributors

zxqfd555 - 303 events, 103 issues
KamilPiechowiak - 35 events, 17 issues
dxtrous - 34 events, 22 issues
berkecanrizai - 31 events, 14 issues
voodoo11 - 23 events, 11 issues

Detailed Description

Pathway is a Python ETL framework, maintained under the pathwaycom GitHub organization, for building and deploying data pipelines and machine learning applications, with a particular focus on streaming data. It pairs a Python API with an engine written in Rust and based on Differential Dataflow, so pipelines are authored in Python while the heavy computation runs in the engine rather than in the Python interpreter. Python code is not compiled into statically typed code; instead, it describes a dataflow that the engine executes incrementally as data arrives.

The central data structure is the table (`pw.Table`), which represents data that may keep changing as new records arrive. A table is not a Python list or an in-memory dataframe; it describes one step of the pipeline. The framework provides a set of operations on tables, including filtering, mapping (`select`), aggregation (`groupby` followed by `reduce`), joins, and windowing. These operations are not executed immediately; they are chained together to form a computational graph, which runs only when the pipeline is started (for example with `pw.run()`).
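The deferred execution described above can be illustrated with a small sketch. This is plain Python using only the standard library and does not use Pathway's actual API: the `LazyPipeline` class and its `filter`, `map`, `describe`, and `run` methods are hypothetical names chosen for illustration. It only shows the general idea that chaining operations records steps, and that no work happens until the pipeline is explicitly run.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]
Step = Tuple[str, Callable[[Row], Any]]


class LazyPipeline:
    """Hypothetical deferred pipeline; NOT Pathway's API."""

    def __init__(self, steps: Tuple[Step, ...] = ()) -> None:
        # The recorded steps play the role of the computational graph.
        self._steps = steps

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyPipeline":
        # Record the step and return a new pipeline; nothing is evaluated.
        return LazyPipeline(self._steps + (("filter", predicate),))

    def map(self, fn: Callable[[Row], Row]) -> "LazyPipeline":
        # Record the step and return a new pipeline; nothing is evaluated.
        return LazyPipeline(self._steps + (("map", fn),))

    def describe(self) -> List[str]:
        # Inspect the recorded steps without running them.
        return [kind for kind, _ in self._steps]

    def run(self, rows: Iterable[Row]) -> Iterator[Row]:
        # Execution happens only here, one row at a time.
        for row in rows:
            keep = True
            for kind, fn in self._steps:
                if kind == "filter":
                    if not fn(row):
                        keep = False
                        break
                else:
                    row = fn(row)
            if keep:
                yield row


# Building the pipeline runs no user code.
pipeline = (
    LazyPipeline()
    .filter(lambda r: r["value"] > 1)
    .map(lambda r: {**r, "value": r["value"] * 10})
)

# The work is done only when run() is consumed.
results = list(pipeline.run([{"value": 1}, {"value": 2}, {"value": 3}]))
```

A real engine does much more than this (joins, windowing, incremental updates), but the split between describing a pipeline and running it is the same.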

The key design choice is the separation between building the graph and executing it. When a pipeline is defined, Pathway records the operations as a dataflow graph; no C++ code is generated. The graph is then executed by the Rust engine, which supports multithreading, multiprocessing, and distributed computation. Because the engine computes incrementally, new input updates only the affected parts of the result instead of triggering a full recomputation. This keeps per-row work out of the Python interpreter for built-in operations. Tables also carry column type information through schemas, which can let some type errors surface when the graph is built rather than while data is being processed.

Pathway is also designed to work alongside existing Python tools. Pipelines are ordinary Python programs, so other Python libraries can be called from within them. Tables can be converted to and from Pandas dataframes, which is mainly useful for testing and debugging. Custom logic can be added as user-defined functions written in Python. The repository includes examples of connectors for data sources such as Kafka, files, and databases.

The repository contains documentation, tutorials, and examples to help users get started. The `pathway` Python package is the primary interface to the framework; the Rust engine it drives lives in the same repository. The project is actively maintained and accepts community contributions of code, documentation, and bug reports. Pathway is well suited to applications that need low latency and high throughput, such as real-time analytics, fraud detection, and sensor data processing, as well as the LLM pipelines and RAG use cases named in the repository description.
