Commit 1082e77

Add initial GPU RFC
1 parent 4791aec commit 1082e77

1 file changed: +68 -0 lines changed

RFC-0021-gpu-support-cudf.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
# **RFC-0021 GPU support using cuDF in C++ workers**

### Proposers

* Deepak Majeti, et al. (IBM)
* Zoltan Arnold Nagy, et al. (IBM Research Europe)
* Karthikeyan Natarajan, et al. (NVIDIA)

## Related Issues

* [PR #25094](https://github.com/prestodb/presto/pull/25094): Enable Velox cuDF
* [PR #26156](https://github.com/prestodb/presto/pull/26156): Add support for Velox cuDF options and CudfHiveConnector

## Summary

Enable Presto C++ workers to execute queries on GPUs using cuDF.

## Background

GPU hardware has proliferated, driven primarily by the demands of AI/ML use cases.
Over the years, GPU hardware has also gained advanced I/O capabilities.
New AI-adjacent data processing workflows are being developed as well.

GPUs provide high compute throughput and memory bandwidth, which can benefit operations such as
joins, aggregations, and string processing.

### Goals
* Allow Presto queries to run on a single GPU or on multiple GPUs.
* Run each query entirely on the CPU or entirely on GPUs. Hybrid execution is not a goal.
* Fall back to the CPU when the GPU lacks required functionality.
* Maximize utilization of available hardware, such as NVLink.
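
One way to read the second and third goals together is as an all-or-nothing placement decision per query. The sketch below illustrates that reading only; `Placement`, `choosePlacement`, and the operator names are hypothetical and are not part of Presto or Velox.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Where a whole query runs. There is deliberately no "mixed" value,
// mirroring the no-hybrid-execution goal.
enum class Placement { kCpu, kGpu };

// Hypothetical helper: picks the GPU only when every operator in the plan
// has a GPU implementation; otherwise the entire query stays on the CPU.
Placement choosePlacement(
    const std::vector<std::string>& planOperators,
    const std::unordered_set<std::string>& gpuSupported) {
  for (const auto& op : planOperators) {
    if (gpuSupported.count(op) == 0) {
      return Placement::kCpu;  // One unsupported operator forces CPU.
    }
  }
  return Placement::kGpu;
}
```

Under this assumption, a plan containing a single operator without a GPU implementation runs on the CPU as a whole, which keeps the fallback rule consistent with the no-hybrid goal.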

## Proposed Implementation

Part of this work is already implemented in [Velox](https://github.com/facebookincubator/velox/tree/main/velox/experimental/cudf).
The current implementation translates CPU operators to GPU operators via a DriverAdapter in Velox.
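
The adapter idea can be pictured as a rewrite pass over a pipeline that has already been built. The following is a simplified sketch, not the Velox DriverAdapter API: `Operator`, `Pipeline`, `adaptPipeline`, and the operator names are hypothetical stand-ins chosen for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an operator in a built pipeline.
struct Operator {
  std::string name;
  bool onGpu;
};

using Pipeline = std::vector<Operator>;

// Hypothetical adapter: runs once the pipeline is already built and swaps
// each CPU operator for its GPU counterpart when a mapping exists.
// Operators without a mapping are left untouched in this sketch.
// Returns the number of operators replaced.
int adaptPipeline(
    Pipeline& pipeline,
    const std::unordered_map<std::string, std::string>& cpuToGpu) {
  int replaced = 0;
  for (auto& op : pipeline) {
    auto it = cpuToGpu.find(op.name);
    if (!op.onGpu && it != cpuToGpu.end()) {
      op.name = it->second;
      op.onGpu = true;
      ++replaced;
    }
  }
  return replaced;
}
```

Because a pass like this only sees operators after the pipeline exists, it can substitute operators one by one but cannot change the shape of the plan itself.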

NVIDIA's [blog](https://developer.nvidia.com/blog/accelerating-large-scale-data-analytics-with-gpu-native-velox-and-nvidia-cudf/)
has more details on the design and some early results.

The [Extending Velox - GPU Acceleration with cuDF](https://velox-lib.io/blog/extending-velox-with-cudf) blog post also covers the current implementation.

On the Presto C++ side, the following registrations and configs have been added.

* The CMake build option `PRESTO_ENABLE_CUDF` must be set. See https://github.com/prestodb/presto/tree/master/presto-native-execution#nvidia-cudf-gpu-support
* The Parquet file format is supported, and the cudfHiveConnector is registered.
* S3 and local (Linux) filesystems are supported.
* cuDF [configs](https://facebookincubator.github.io/velox/configs.html#cudf-specific-configuration-experimental) can be
  specified in `config.properties` and in catalog `.properties` files.

The work so far shows that GPUs can provide good price-performance. However, to make this support user-friendly and to improve price-performance further, the following improvements are in progress.

## Work in Progress
* Add GPU plan nodes.
* The driver adapter currently runs after the drivers/pipelines are built, which limits the adaptation.
* Allow efficient fallback to CPU.
* GPU-GPU exchange using UCX (https://github.com/prestodb/presto/tree/ibm-research-preview).
* Topology and hardware detection.
* Run metadata queries on the CPU only.
* Add a session parameter to filter workers.
* Extend the optimizer cost model to support GPUs.
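
The RFC does not specify how the session parameter for filtering workers would behave. As one possible shape, assuming each worker advertises whether it has a GPU, the filter could look like the sketch below; `Worker`, `WorkerFilter`, and `filterWorkers` are hypothetical names, not existing Presto APIs.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a worker as seen by the coordinator.
struct Worker {
  std::string id;
  bool hasGpu;
};

// Hypothetical session setting: which class of workers a query may use.
enum class WorkerFilter { kAny, kGpuOnly, kCpuOnly };

// Returns the workers that match the requested filter, preserving order.
std::vector<Worker> filterWorkers(
    const std::vector<Worker>& workers, WorkerFilter filter) {
  std::vector<Worker> eligible;
  for (const auto& w : workers) {
    const bool keep = filter == WorkerFilter::kAny ||
        (filter == WorkerFilter::kGpuOnly && w.hasGpu) ||
        (filter == WorkerFilter::kCpuOnly && !w.hasGpu);
    if (keep) {
      eligible.push_back(w);
    }
  }
  return eligible;
}
```

In this reading, a CPU-only filter would also be a natural fit for the metadata-query item above, though the RFC leaves that connection open.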

## Releases
Presto C++ workers will be released with GPU support.

## Test Plan
Velox CI has a GPU runner sponsored by Meta. Presto needs a similar runner.
