velox by IBM

Description: A composable and fully extensible C++ execution engine library for data management systems.

Repository: IBM/velox on GitHub

Summary Information

Updated 31 minutes ago

Added to GitGenius on March 8th, 2026

Created on May 1st, 2025

Open Issues & Pull Requests: 54 (+0)

Number of forks: 10

Total Stargazers: 16 (+0)

Total Subscribers: 1 (+0)

Issue Activity (beta)

Open issues: 10

New in 7 days: 34

Closed in 7 days: 34

Avg open age: 261 days

Stale 30+ days: 10

Stale 90+ days: 9

Recent activity

Opened in 7 days: 23

Closed in 7 days: 23

Comments in 7 days: 24

Events in 7 days: 72

Top labels

rebase-request (1,377)
enhancement (10)
iceberg (6)
P1 (3)
P3 (2)
OptimizedPartitioning (1)
P0 (1)
P2 (1)

Most active issues this week

#2082 Rebase branch staging-rebase (560b2ef) with staging-rebase-head (a5b4548) - 6 events / 2 comments
#2086 Rebase branch staging-rebase (6d755d0) with staging-rebase-head (a5b4548) - 6 events / 2 comments
#2095 Rebase branch staging-rebase (2e3a528) with staging-rebase-head (e28a54d) - 6 events / 2 comments
#2081 Rebase branch staging-rebase (560b2ef) with staging-rebase-head (d681478) - 3 events / 1 comment
#2083 Rebase branch staging-rebase (f3487aa) with staging-rebase-head (a5b4548) - 3 events / 1 comment

Detailed Description

Velox is an open-source C++ library that serves as a composable execution engine for data management systems. Initially developed by Meta and now supported by a collaborative community that includes IBM, Intel, and Microsoft, Velox provides a flexible, high-performance foundation for building data processing systems across analytical workloads such as batch processing, interactive queries, stream processing, and AI/ML.

Velox gives developers reusable, extensible components for building custom data processing engines. It is not meant to be used directly by end users: it has no built-in SQL parser, dataframe layer, or query optimizer. Instead, developers integrate Velox into their own compute engines and build on the components it supplies.

The library is organized around several key components. **Type** is a generic type system that supports scalar, complex, and nested data types. **Vector** provides an Arrow-compatible columnar memory layout with multiple encodings (such as Flat, Dictionary, and Constant), plus lazy materialization and out-of-order write support. **Expression Eval** is a fully vectorized expression evaluation engine that operates on Vector/Arrow-encoded data.
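To make the three encodings named above more concrete, here is a minimal, self-contained sketch of what Flat, Dictionary, and Constant layouts mean for a string column. The types `FlatColumn`, `DictionaryColumn`, `ConstantColumn`, and the helper `flatten` are hypothetical names invented for this illustration; they are not Velox classes and do not reflect Velox's actual memory layout or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration types; these are NOT Velox classes.

// Flat: one stored value per row.
struct FlatColumn {
  std::vector<std::string> values;
  std::size_t size() const { return values.size(); }
  const std::string& valueAt(std::size_t row) const { return values[row]; }
};

// Dictionary: per-row indices into a shared list of distinct values.
struct DictionaryColumn {
  std::vector<std::string> dictionary;
  std::vector<std::int32_t> indices;
  std::size_t size() const { return indices.size(); }
  const std::string& valueAt(std::size_t row) const {
    return dictionary[static_cast<std::size_t>(indices[row])];
  }
};

// Constant: a single value that logically repeats for every row.
struct ConstantColumn {
  std::string value;
  std::size_t length;
  std::size_t size() const { return length; }
  const std::string& valueAt(std::size_t /*row*/) const { return value; }
};

// Decodes any of the encodings above into a flat column by reading
// each logical row through the common size()/valueAt() interface.
template <typename Column>
FlatColumn flatten(const Column& column) {
  FlatColumn out;
  out.values.reserve(column.size());
  for (std::size_t row = 0; row < column.size(); ++row) {
    out.values.push_back(column.valueAt(row));
  }
  return out;
}
```

In this sketch, a dictionary column with two distinct values and a million rows stores each string once plus one small index per row, and a constant column stores a single value regardless of row count. That is the general motivation for keeping data encoded as long as possible instead of flattening it eagerly.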

Velox also includes a set of **Functions**: vectorized scalar, aggregate, and window functions that follow Presto and Spark semantics. **Operators** implement relational operations such as scans, writes, projections, filtering, grouping, ordering, and joins, which are the building blocks of query execution. The **I/O** component offers a connector interface for data sources and sinks, supporting several file formats (ORC/DWRF, Parquet, Nimble) and storage adapters (S3, HDFS, GCS, ABFS, local files). **Network Serializers** implement different wire protocols for data transfer, including PrestoPage and Spark's UnsafeRow. Finally, **Resource Management** provides primitives for handling computational resources, including memory arenas, buffer management, task and driver management, thread pools, and mechanisms for spilling and caching.
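The idea of operators as composable building blocks can be sketched in a few lines. The example below is a deliberately simplified model, assuming a single-column batch of integers and operators represented as plain callables. `Batch`, `Operator`, `makeFilter`, `makeProject`, and `runPipeline` are hypothetical names for this illustration and are not part of Velox's operator interface.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical illustration; NOT Velox's operator interface.
using Batch = std::vector<std::int64_t>;  // a single-column batch of rows

// In this sketch, an operator maps an input batch to an output batch.
using Operator = std::function<Batch(const Batch&)>;

// Filter: keeps only the rows for which the predicate holds.
Operator makeFilter(std::function<bool(std::int64_t)> predicate) {
  return [predicate](const Batch& input) {
    Batch output;
    for (std::int64_t value : input) {
      if (predicate(value)) {
        output.push_back(value);
      }
    }
    return output;
  };
}

// Projection: applies an expression to every row of the batch.
Operator makeProject(std::function<std::int64_t(std::int64_t)> expression) {
  return [expression](const Batch& input) {
    Batch output;
    output.reserve(input.size());
    for (std::int64_t value : input) {
      output.push_back(expression(value));
    }
    return output;
  };
}

// Runs the operators in order, feeding each one the previous output.
Batch runPipeline(const std::vector<Operator>& operators, Batch batch) {
  for (const Operator& op : operators) {
    batch = op(batch);
  }
  return batch;
}
```

Chaining a filter that keeps even values with a projection that multiplies by ten turns the batch `{1, 2, 3, 4}` into `{20, 40}`. A real engine processes multi-column batches and handles scheduling, memory, and spilling, none of which this sketch models.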

A significant advantage of Velox is its extensibility. Developers can customize the engine by defining their own specializations, including custom types, simple and vectorized functions, aggregate functions, window functions, operators, file formats, storage adapters, and network serializers. This modular design allows developers to tailor Velox to specific needs and optimize performance for particular workloads.
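One common way to support this kind of extension is a name-based registry that engine code consults at run time. The sketch below illustrates that general pattern only. `FunctionRegistry`, `registerFunction`, `call`, `Column`, and `VectorFunction` are hypothetical names chosen for this example, and the code makes no claim about how Velox actually registers functions.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration; NOT Velox's registration API.
using Column = std::vector<std::int64_t>;
using VectorFunction = std::function<Column(const Column&)>;

class FunctionRegistry {
 public:
  // Registers a function under a name.
  // Returns false (and keeps the existing entry) if the name is taken.
  bool registerFunction(const std::string& name, VectorFunction fn) {
    return functions_.emplace(name, std::move(fn)).second;
  }

  // Looks up a function by name and applies it to a whole column.
  // Throws std::out_of_range if no function has that name.
  Column call(const std::string& name, const Column& input) const {
    auto it = functions_.find(name);
    if (it == functions_.end()) {
      throw std::out_of_range("unknown function: " + name);
    }
    return it->second(input);
  }

 private:
  std::map<std::string, VectorFunction> functions_;
};
```

With a registry like this, an application could add a function such as `plus_one` without modifying the engine's core, and the engine would resolve it by name when evaluating an expression.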

The repository includes documentation, developer guides, and examples to help with integration and customization. Getting-started instructions cover setting up dependencies and building the library on Linux and macOS, as well as building with Docker Compose. The documentation also lists supported compilers and build metrics. The community communicates through Slack, GitHub Issues, and Discussions. The project is licensed under the Apache 2.0 License.
