Interruptible
#include <raft/core/interruptible.hpp>
namespace raft::core
- struct interrupted_exception : public raft::exception
#include <interruptible.hpp>
Exception thrown during an interruptible::synchronize call when it detects a request to cancel the work performed in this CPU thread.
- class interruptible
#include <interruptible.hpp>
Cooperative-style interruptible execution.
This class provides facilities for interrupting the execution of a C++ thread at designated points in the code from outside the thread. In particular, it provides an interruptible version of the blocking CUDA synchronization function, which makes it possible to stop waiting on long-running GPU work.
Important: Although CUDA synchronize calls serve as cancellation points, the interruptible machinery has nothing to do with CUDA streams or events. In other words, when you call cancel, it is the CPU waiting function that is interrupted, not the GPU stream work. This means that when the interrupted_exception is raised, any unfinished GPU stream work continues to run. It is then the developer’s responsibility to make sure the unfinished stream work does not affect the program in an undesirable way.
What can happen to a CUDA stream when synchronize is cancelled? If you catch the interrupted_exception immediately, you can safely wait on the stream again. Otherwise, some of the allocated resources may be released before the active kernel finishes using them, which results in writes to deallocated or reallocated memory and, more generally, undefined behavior. A deadlocked kernel may never finish (or may crash if you’re lucky). In practice, the outcome is usually acceptable for emergency program interruption (e.g., CTRL+C), but extra effort on the user side is required to allow safe interrupting and resuming of the GPU stream work.
Public Functions
- inline void cancel() noexcept
Cancel any current or next call to interruptible::synchronize performed on the CPU thread given by this interruptible token.
Note, this function does not involve thread synchronization/locks and does not throw any exceptions, so it’s safe to call from a signal handler.
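The signal-handler note can be illustrated with a small standard-C++ sketch. It does not use RAFT: g_cancel_requested, on_sigint, install_handler, poll_continue, and demo are hypothetical names, and a lock-free std::atomic<bool> is assumed to stand in for whatever state a token keeps internally. The only point is that the handler performs a single atomic store, with no locks and no exceptions.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical stand-in for the cancellation state of one worker thread.
// Assumption: std::atomic<bool> is lock-free on the target platform, which is
// what makes it usable from inside a signal handler.
static std::atomic<bool> g_cancel_requested{false};

// The handler does one atomic store: no locks, no allocation, no exceptions.
extern "C" void on_sigint(int) { g_cancel_requested.store(true); }

// Install the handler for SIGINT (CTRL+C); returns false if installation failed.
inline bool install_handler() { return std::signal(SIGINT, on_sigint) != SIG_ERR; }

// Toy cancellation point: returns true to continue, false if a cancel was
// requested since the last check. The flag is cleared by the check itself.
inline bool poll_continue() noexcept { return !g_cancel_requested.exchange(false); }

// Simulates CTRL+C by raising SIGINT in-process and reports whether the
// request was observed exactly once.
inline bool demo()
{
  assert(!g_cancel_requested.load() && "demo expects a clean state");
  if (!install_handler()) { return false; }
  std::raise(SIGINT);                   // the handler runs before raise() returns
  const bool first  = poll_continue();  // false: a cancel was requested
  const bool second = poll_continue();  // true: the first check reset the flag
  return !first && second;
}
```

With RAFT itself, the analogous handler would presumably call cancel() on a token obtained ahead of time, since acquiring the token is the part that may need a lock.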
Public Static Functions
- static inline void synchronize(rmm::cuda_stream_view stream)
Synchronize the CUDA stream, subject to being interrupted by interruptible::cancel called on this CPU thread.
- Parameters:
stream – [in] a CUDA stream.
- Throws:
raft::interrupted_exception – if interruptible::cancel() was called on the current CPU thread before the currently captured work has been finished.
raft::cuda_error – if another CUDA error happens.
- static inline void synchronize(cudaEvent_t event)
Synchronize the CUDA event, subject to being interrupted by interruptible::cancel called on this CPU thread.
- Parameters:
event – [in] a CUDA event.
- Throws:
raft::interrupted_exception – if interruptible::cancel() was called on the current CPU thread before the currently captured work has been finished.
raft::cuda_error – if another CUDA error happens.
- static inline void yield()
Check the thread state, whether the thread can continue execution or is interrupted by interruptible::cancel.
This is a cancellation point for an interruptible thread. It’s called in the internals of interruptible::synchronize in a loop. If two synchronize calls are far apart, it’s recommended to call interruptible::yield() in between to make sure the thread does not become unresponsive for too long.
Both yield and yield_no_throw reset the state to non-cancelled after execution.
- Throws:
raft::interrupted_exception – if interruptible::cancel() was called on the current CPU thread.
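The cancellation-point behavior described above, including the reset of the state after each check, can be modelled in a few lines of standard C++. This is a sketch for illustration, not RAFT’s implementation: toy_interrupted, toy_token, and run_steps are hypothetical names, and a single std::atomic<bool> is assumed to be enough to represent the thread state.

```cpp
#include <atomic>
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for raft::interrupted_exception.
struct toy_interrupted : std::runtime_error {
  toy_interrupted() : std::runtime_error("interrupted") {}
};

// Hypothetical stand-in for the per-thread cancellation state.
struct toy_token {
  std::atomic<bool> cancelled{false};

  // No mutex and no exceptions: just an atomic store.
  void cancel() noexcept { cancelled.store(true); }

  // Returns true to continue, false if cancelled; clears the flag either way.
  bool yield_no_throw() noexcept { return !cancelled.exchange(false); }

  // Throwing variant of the same check.
  void yield()
  {
    if (!yield_no_throw()) { throw toy_interrupted{}; }
  }
};

// Runs up to `steps` units of work with a cancellation point before each one.
// A cancel request is simulated just before step `cancel_at` (use -1 for none).
// Returns the number of steps completed before the interruption.
inline int run_steps(toy_token& token, int steps, int cancel_at)
{
  assert(steps >= 0);
  int done = 0;
  try {
    for (int i = 0; i < steps; ++i) {
      if (i == cancel_at) { token.cancel(); }  // simulated external request
      token.yield();                           // cancellation point
      ++done;                                  // the "work" for this step
    }
  } catch (const toy_interrupted&) {
    // Interrupted: stop early and report how far we got.
  }
  return done;
}
```

Because the check clears the flag, the same token can be reused for a later run without any explicit reset, which matches the documented behavior of yield and yield_no_throw.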
- static inline auto yield_no_throw() -> bool
Check the thread state, whether the thread can continue execution or is interrupted by interruptible::cancel.
Same as interruptible::yield, but does not throw an exception if the thread is cancelled.
Both yield and yield_no_throw reset the state to non-cancelled after execution.
- Returns:
whether the thread can continue, i.e. true means continue, false means cancelled.
- static inline auto get_token() -> std::shared_ptr<interruptible>
Get a cancellation token for this CPU thread.
- Returns:
an object that can be used to cancel the GPU work waited on this CPU thread.
- static inline auto get_token(std::thread::id thread_id) -> std::shared_ptr<interruptible>
Get a cancellation token for a CPU thread given by its id.
The returned token may live longer than the associated thread. In that case, using its cancel method has no effect.
- Parameters:
thread_id – [in] an id of a C++ CPU thread.
- Returns:
an object that can be used to cancel the GPU work waited on the given CPU thread.
- static inline void cancel(std::thread::id thread_id)
Cancel any current or next call to interruptible::synchronize performed on the CPU thread given by the thread_id.
Note, this function uses a mutex to safely get a cancellation token that may be shared among multiple threads. If you plan to use it from a signal handler, consider the non-static cancel() instead.
- Parameters:
thread_id – [in] a CPU thread, in which the work should be interrupted.