Interoperability: CUDA Streams #547
Labels: Performance optimization, backend: cuda, backend: hip, backend: sycl, component: third party, enhancement, good first issue
CUDA streams, like FFT plans, BLAS handles, and MPI communicators, are a shared resource. We need to be able to expose them (easy), but also to be able to constrain AMReX to pre-initialized handles when needed. We can already do this pretty well for FFTs, MPI, and some BLAS functionality.
Functions like `ParallelFor` will call `Gpu::gpuStream` to get a stream for their kernel, but the `launch` function allows one to pass a stream. We could expose a function that lets the user set the stream before `ParallelFor`.

Currently, AMReX builds 4 streams by default. `MFIter` defines the strategy (round robin) and sets the value of `Gpu::gpuStream`.
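To make the behavior described above concrete, here is a minimal Python sketch of the round-robin strategy over a fixed pool of streams, plus the kind of user-facing override this issue asks for. This is not the actual pyAMReX or AMReX API — `StreamPool`, `set_stream`, and `gpu_stream` are hypothetical names, and plain Python objects stand in for real CUDA stream handles.

```python
class Stream:
    """Stand-in for a CUDA stream handle (hypothetical, for illustration)."""

    def __init__(self, stream_id):
        self.stream_id = stream_id

    def __repr__(self):
        return f"Stream({self.stream_id})"


class StreamPool:
    """Fixed pool of streams with MFIter-style round-robin selection."""

    def __init__(self, num_streams=4):  # AMReX builds 4 streams by default
        self.streams = [Stream(i) for i in range(num_streams)]
        self._override = None  # user-supplied, pre-initialized stream

    def set_stream(self, stream):
        """Constrain subsequent kernel launches to a pre-initialized stream."""
        self._override = stream

    def gpu_stream(self, iteration):
        """What a Gpu::gpuStream-like query would return for a grid iteration."""
        if self._override is not None:
            return self._override
        # Round robin over the default pool, as MFIter does.
        return self.streams[iteration % len(self.streams)]


pool = StreamPool()
# Iterations 0..5 cycle through the 4 default streams.
print([pool.gpu_stream(i).stream_id for i in range(6)])  # → [0, 1, 2, 3, 0, 1]

# Constrain launches to an externally created stream instead.
pool.set_stream(Stream(42))
print(pool.gpu_stream(0).stream_id)  # → 42
```

In a real implementation the override would hold an external `cudaStream_t` (e.g. one created by CuPy or PyTorch), so AMReX kernels could be ordered on the same stream as kernels from other libraries.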