Interoperability: CUDA Streams #547

@ax3l

CUDA streams, like FFT plans, BLAS handles, and MPI communicators, are a shared resource. We need to be able to expose them (easy), but also to be constrained to pre-initialized handles when required. We can already do this fairly well for FFTs, MPI, and some BLAS functionality.

Functions like ParallelFor call Gpu::gpuStream to get the stream for their kernels, but the launch function allows one to pass a stream explicitly.

We could expose a function that lets the user set the stream before calling ParallelFor.
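One possible shape for such a function is a scoped guard that installs a user-provided stream and restores the previous one afterwards. The sketch below is only an illustration under assumptions: `ExternalStreamGuard`, `currentStream`, and `parallelForLike` are hypothetical names, not AMReX API, and `Stream` is a plain integer standing in for `cudaStream_t` so the example compiles without the CUDA toolkit.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for cudaStream_t so the sketch builds without CUDA (assumption).
using Stream = int;

// Hypothetical per-thread "current stream"; it plays the role that
// Gpu::gpuStream plays in the text above. Not AMReX's implementation.
inline Stream& currentStream () {
    thread_local Stream s = 0;  // 0 stands for the library's own default stream
    return s;
}

// Hypothetical accessor that a ParallelFor-like function would query.
inline Stream gpuStream () { return currentStream(); }

// Hypothetical RAII guard: installs an externally created stream and
// restores the previously active one when the scope ends.
class ExternalStreamGuard {
public:
    explicit ExternalStreamGuard (Stream s) : m_prev(currentStream()) {
        currentStream() = s;
    }
    ~ExternalStreamGuard () { currentStream() = m_prev; }
    ExternalStreamGuard (ExternalStreamGuard const&) = delete;
    ExternalStreamGuard& operator= (ExternalStreamGuard const&) = delete;
private:
    Stream m_prev;
};

// Stand-in for a kernel launcher. It runs the body on the host and
// returns the stream it would have launched on.
template <typename F>
Stream parallelForLike (int n, F&& f) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) { f(i); }
    return gpuStream();
}

} // namespace sketch
```

With this design, user code would wrap the calls that must run on a pre-initialized stream in a block holding an `ExternalStreamGuard`; everything outside the block keeps using the library-selected stream.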

Currently, AMReX builds 4 streams by default. MFIter defines the strategy (round robin) and sets the values of Gpu::gpuStream.
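To make the round-robin strategy concrete, here is a minimal sketch of a fixed pool of 4 streams indexed by iteration number. It is an assumption about the general pattern, not a copy of what MFIter does internally: `StreamPool` and `streamFor` are hypothetical names, and the integer handles (100 to 103) are placeholders for real `cudaStream_t` values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

namespace rr_sketch {

// Stand-in for cudaStream_t so the sketch builds without CUDA (assumption).
using Stream = int;

// Default stream count mentioned in the text above.
constexpr std::size_t num_streams = 4;

// Hypothetical pool that hands out streams in round-robin order.
class StreamPool {
public:
    StreamPool () {
        // Placeholder handles; a real pool would create GPU streams here.
        for (std::size_t i = 0; i < num_streams; ++i) {
            m_streams[i] = static_cast<Stream>(100 + i);
        }
    }

    // Iteration k uses stream k mod num_streams.
    Stream streamFor (std::size_t iter) const {
        return m_streams[iter % num_streams];
    }

private:
    std::array<Stream, num_streams> m_streams{};
};

} // namespace rr_sketch
```

An interoperability hook could let the caller supply the handles stored in such a pool instead of having the library create them, which is the "constrained to pre-initialized handles" case described above.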
