RMM  23.12
RAPIDS Memory Manager
rmm::device_uvector< T > Class Template Reference

An uninitialized vector of elements in device memory.

#include <device_uvector.hpp>

Public Types

using value_type = T
 T; stored value type.
 
using size_type = std::size_t
 The type used for the size of the vector.
 
using reference = value_type &
 value_type&; reference type returned by operator[](size_type)
 
using const_reference = value_type const &
 
using pointer = value_type *
 The type of the pointer returned by data()
 
using const_pointer = value_type const *
 The type of the pointer returned by data() const.
 
using iterator = pointer
 The type of the iterator returned by begin()
 
using const_iterator = const_pointer
 The type of the const iterator returned by cbegin()
 

Public Member Functions

RMM_EXEC_CHECK_DISABLE device_uvector (device_uvector &&) noexcept=default
 Default move constructor.
 
device_uvector & operator= (device_uvector &&) noexcept=default
 Default move assignment operator.
 
 device_uvector (device_uvector const &)=delete
 Copy ctor is deleted as it doesn't allow a stream argument.
 
device_uvector & operator= (device_uvector const &)=delete
 Copy assignment is deleted as it doesn't allow a stream argument.
 
 device_uvector ()=delete
 Default constructor is deleted as it doesn't allow a stream argument.
 
 device_uvector (std::size_t size, cuda_stream_view stream, async_resource_ref mr=rmm::mr::get_current_device_resource())
 Construct a new device_uvector with sufficient uninitialized storage for size elements.
 
 device_uvector (device_uvector const &other, cuda_stream_view stream, async_resource_ref mr=rmm::mr::get_current_device_resource())
 Construct a new device_uvector by deep copying the contents of another device_uvector.
 
pointer element_ptr (std::size_t element_index) noexcept
 Returns pointer to the specified element.
 
const_pointer element_ptr (std::size_t element_index) const noexcept
 Returns pointer to the specified element.
 
void set_element_async (std::size_t element_index, value_type const &value, cuda_stream_view stream)
 Performs an asynchronous copy of value to the specified element in device memory.
 
void set_element_async (std::size_t, value_type const &&, cuda_stream_view)=delete
 
void set_element_to_zero_async (std::size_t element_index, cuda_stream_view stream)
 Asynchronously sets the specified element to zero in device memory.
 
void set_element (std::size_t element_index, T const &value, cuda_stream_view stream)
 Performs a synchronous copy of value to the specified element in device memory.
 
value_type element (std::size_t element_index, cuda_stream_view stream) const
 Returns the specified element from device memory.
 
value_type front_element (cuda_stream_view stream) const
 Returns the first element.
 
value_type back_element (cuda_stream_view stream) const
 Returns the last element.
 
void reserve (std::size_t new_capacity, cuda_stream_view stream)
 Increases the capacity of the vector to new_capacity elements.
 
void resize (std::size_t new_size, cuda_stream_view stream)
 Resizes the vector to contain new_size elements.
 
void shrink_to_fit (cuda_stream_view stream)
 Forces deallocation of unused device memory.
 
device_buffer release () noexcept
 Release ownership of device memory storage.
 
std::size_t capacity () const noexcept
 Returns the number of elements that can be held in currently allocated storage.
 
pointer data () noexcept
 Returns pointer to underlying device storage.
 
const_pointer data () const noexcept
 Returns const pointer to underlying device storage.
 
iterator begin () noexcept
 Returns an iterator to the first element.
 
const_iterator cbegin () const noexcept
 Returns a const_iterator to the first element.
 
const_iterator begin () const noexcept
 Returns a const_iterator to the first element.
 
iterator end () noexcept
 Returns an iterator to the element following the last element of the vector.
 
const_iterator cend () const noexcept
 Returns a const_iterator to the element following the last element of the vector.
 
const_iterator end () const noexcept
 Returns a const_iterator to the element following the last element of the vector.
 
std::size_t size () const noexcept
 The number of elements in the vector.
 
std::int64_t ssize () const noexcept
 The signed number of elements in the vector.
 
bool is_empty () const noexcept
 true if the vector contains no elements, i.e., size() == 0.
 
async_resource_ref memory_resource () const noexcept
 The async_resource_ref used to allocate and deallocate the device storage.
 
cuda_stream_view stream () const noexcept
 Stream most recently specified for allocation/deallocation.
 
void set_stream (cuda_stream_view stream) noexcept
 Sets the stream to be used for deallocation.
 

Detailed Description

template<typename T>
class rmm::device_uvector< T >

An uninitialized vector of elements in device memory.

Similar to a thrust::device_vector, device_uvector is a random access container of elements stored contiguously in device memory. However, unlike thrust::device_vector, device_uvector does not default initialize the vector elements.

If initialization is desired, this must be done explicitly by the caller, e.g., with thrust::uninitialized_fill.

Example:

rmm::mr::device_memory_resource * mr = new my_custom_resource();
// Allocates *uninitialized* device memory on stream `s` sufficient for 100 ints using the
// supplied resource `mr`
rmm::device_uvector<int> uv(100, s, mr);
// Initializes all elements to 0 on stream `s`
thrust::uninitialized_fill(thrust::cuda::par.on(s), uv.begin(), uv.end(), 0);

Avoiding default initialization improves performance by eliminating the kernel launch required to default initialize the elements. This initialization is often unnecessary, e.g., when the vector is created to hold the output of an operation.

However, this restricts the element type T to only trivially copyable types. In short, trivially copyable types can be safely copied with memcpy. For more information, see https://en.cppreference.com/w/cpp/types/is_trivially_copyable.
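
The restriction can be checked at compile time with the standard type trait. The following is a minimal host-only sketch, not RMM code: the types `point` and `tracked` and the helper `bytewise_copy` are hypothetical names introduced purely for illustration.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical element type: plain data, so it is trivially copyable.
struct point {
  float x;
  float y;
};

// Hypothetical element type with a user-provided copy constructor,
// which makes it NOT trivially copyable.
struct tracked {
  int value;
  explicit tracked(int v) : value{v} {}
  tracked(tracked const& other) : value{other.value + 1} {}
};

static_assert(std::is_trivially_copyable<point>::value,
              "point can be copied safely with memcpy");
static_assert(!std::is_trivially_copyable<tracked>::value,
              "tracked has a non-trivial copy constructor");

// Copies an object byte by byte, the way a raw memcpy would.
// The static_assert rejects types for which this is unsafe.
template <typename T>
T bytewise_copy(T const& source)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable");
  T destination;
  std::memcpy(&destination, &source, sizeof(T));
  return destination;
}
```

A type such as `point` would be an acceptable element type under this rule, whereas `tracked` would not, because a bytewise copy would bypass its copy constructor.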

Another key difference from thrust::device_vector is that all operations that invoke allocation, kernels, or memcpys take a CUDA stream parameter to indicate on which stream the operation will be performed.

Template Parameters
T: Trivially copyable element type

Member Typedef Documentation

◆ const_reference

template<typename T >
using rmm::device_uvector< T >::const_reference = value_type const&

value_type const&; constant reference type returned by operator[](size_type) const

Constructor & Destructor Documentation

◆ device_uvector() [1/2]

template<typename T >
rmm::device_uvector< T >::device_uvector ( std::size_t  size,
cuda_stream_view  stream,
async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_uvector with sufficient uninitialized storage for size elements.

Elements are uninitialized. Reading an element before it is initialized results in undefined behavior.

Parameters
size: The number of elements to allocate storage for
stream: The stream on which to perform the allocation
mr: The resource used to allocate the device storage

◆ device_uvector() [2/2]

template<typename T >
rmm::device_uvector< T >::device_uvector ( device_uvector< T > const &  other,
cuda_stream_view  stream,
async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_uvector by deep copying the contents of another device_uvector.

Elements are copied as if by memcpy, i.e., T's copy constructor is not invoked.

Parameters
other: The vector to copy from
stream: The stream on which to perform the copy
mr: The resource used to allocate device memory for the new vector

Member Function Documentation

◆ back_element()

template<typename T >
value_type rmm::device_uvector< T >::back_element ( cuda_stream_view  stream) const
inline

Returns the last element.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream.
Exceptions
rmm::out_of_range exception if the vector is empty.
Parameters
stream: The stream on which to perform the copy
Returns
The value of the last element

◆ begin() [1/2]

template<typename T >
const_iterator rmm::device_uvector< T >::begin ( ) const
inline noexcept

Returns a const_iterator to the first element.

If the vector is empty, then begin() == end().

Returns
Immutable iterator to the first element.

◆ begin() [2/2]

template<typename T >
iterator rmm::device_uvector< T >::begin ( )
inline noexcept

Returns an iterator to the first element.

If the vector is empty, then begin() == end().

Returns
Iterator to the first element.

◆ capacity()

template<typename T >
std::size_t rmm::device_uvector< T >::capacity ( ) const
inline noexcept

Returns the number of elements that can be held in currently allocated storage.

Returns
std::size_t The number of elements that can be stored without requiring a new allocation.

◆ cbegin()

template<typename T >
const_iterator rmm::device_uvector< T >::cbegin ( ) const
inline noexcept

Returns a const_iterator to the first element.

If the vector is empty, then cbegin() == cend().

Returns
Immutable iterator to the first element.

◆ cend()

template<typename T >
const_iterator rmm::device_uvector< T >::cend ( ) const
inline noexcept

Returns a const_iterator to the element following the last element of the vector.

The element referenced by cend() is a placeholder and dereferencing it results in undefined behavior.

Returns
Immutable iterator to one past the last element.

◆ data() [1/2]

template<typename T >
const_pointer rmm::device_uvector< T >::data ( ) const
inline noexcept

Returns const pointer to underlying device storage.

Note
If size() == 0, it is undefined behavior to dereference the returned pointer. Furthermore, the returned pointer may or may not be equal to nullptr.
Returns
const_pointer Raw const pointer to element storage in device memory.

◆ data() [2/2]

template<typename T >
pointer rmm::device_uvector< T >::data ( )
inline noexcept

Returns pointer to underlying device storage.

Note
If size() == 0, it is undefined behavior to dereference the returned pointer. Furthermore, the returned pointer may or may not be equal to nullptr.
Returns
Raw pointer to element storage in device memory.

◆ element()

template<typename T >
value_type rmm::device_uvector< T >::element ( std::size_t  element_index,
cuda_stream_view  stream 
) const
inline

Returns the specified element from device memory.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream.
Exceptions
rmm::out_of_range exception if element_index >= size()
Parameters
element_index: Index of the desired element
stream: The stream on which to perform the copy
Returns
The value of the specified element

◆ element_ptr() [1/2]

template<typename T >
const_pointer rmm::device_uvector< T >::element_ptr ( std::size_t  element_index) const
inline noexcept

Returns pointer to the specified element.

Behavior is undefined if element_index >= size().

Parameters
element_index: Index of the specified element.
Returns
T const* Pointer to the desired element

◆ element_ptr() [2/2]

template<typename T >
pointer rmm::device_uvector< T >::element_ptr ( std::size_t  element_index)
inline noexcept

Returns pointer to the specified element.

Behavior is undefined if element_index >= size().

Parameters
element_index: Index of the specified element.
Returns
T* Pointer to the desired element

◆ end() [1/2]

template<typename T >
const_iterator rmm::device_uvector< T >::end ( ) const
inline noexcept

Returns a const_iterator to the element following the last element of the vector.

The element referenced by end() is a placeholder and dereferencing it results in undefined behavior.

Returns
Immutable iterator to one past the last element.

◆ end() [2/2]

template<typename T >
iterator rmm::device_uvector< T >::end ( )
inline noexcept

Returns an iterator to the element following the last element of the vector.

The element referenced by end() is a placeholder and dereferencing it results in undefined behavior.

Returns
Iterator to one past the last element.

◆ front_element()

template<typename T >
value_type rmm::device_uvector< T >::front_element ( cuda_stream_view  stream) const
inline

Returns the first element.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream.
Exceptions
rmm::out_of_range exception if the vector is empty.
Parameters
stream: The stream on which to perform the copy
Returns
The value of the first element

◆ is_empty()

template<typename T >
bool rmm::device_uvector< T >::is_empty ( ) const
inline noexcept

true if the vector contains no elements, i.e. size() == 0

Returns
true if the vector contains no elements, i.e. size() == 0

◆ memory_resource()

template<typename T >
async_resource_ref rmm::device_uvector< T >::memory_resource ( ) const
inline noexcept

The async_resource_ref used to allocate and deallocate the device storage.

Returns
The async_resource_ref used to allocate and deallocate the device storage

◆ operator=()

template<typename T >
device_uvector& rmm::device_uvector< T >::operator= ( device_uvector< T > &&  )
default noexcept

Default move assignment operator.

Returns
device_uvector& Reference to the assigned object

◆ release()

template<typename T >
device_buffer rmm::device_uvector< T >::release ( )
inline noexcept

Release ownership of device memory storage.

Returns
The device_buffer used to store the vector elements

◆ reserve()

template<typename T >
void rmm::device_uvector< T >::reserve ( std::size_t  new_capacity,
cuda_stream_view  stream 
)
inline

Increases the capacity of the vector to new_capacity elements.

If new_capacity <= capacity(), no action is taken.

If new_capacity > capacity(), a new allocation of size new_capacity is created, and the first size() elements from the current allocation are copied there as if by memcpy. Finally, the old allocation is freed and replaced by the new allocation.

Parameters
new_capacity: The desired capacity (number of elements)
stream: The stream on which to perform the allocation/copy (if any)

◆ resize()

template<typename T >
void rmm::device_uvector< T >::resize ( std::size_t  new_size,
cuda_stream_view  stream 
)
inline

Resizes the vector to contain new_size elements.

If new_size > size(), the additional elements are uninitialized.

If new_size <= capacity(), no action is taken other than updating the value of size(). No memory is allocated or copied. shrink_to_fit() may be used to force deallocation of unused memory.

If new_size > capacity(), elements are copied as if by memcpy to a new allocation.

The invariant size() <= capacity() holds.

Parameters
new_size: The desired number of elements
stream: The stream on which to perform the allocation/copy (if any)
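
The size and capacity rules described for reserve() and resize() can be traced with a small host-only bookkeeping model. This is a sketch under stated assumptions, not RMM's implementation: `capacity_model` and its `allocations` counter are hypothetical, no device memory or stream is involved, and the case new_size == capacity() is assumed to need no reallocation.

```cpp
#include <cstddef>

// Hypothetical host-only model of the size/capacity bookkeeping.
// It tracks numbers only; nothing is actually allocated.
struct capacity_model {
  std::size_t size{0};
  std::size_t capacity{0};
  std::size_t allocations{0};  // counts simulated reallocations

  // Grows capacity only when the request exceeds the current capacity.
  void reserve(std::size_t new_capacity)
  {
    if (new_capacity <= capacity) { return; }  // no action is taken
    capacity = new_capacity;                   // simulated new allocation
    ++allocations;
  }

  // Updates size; reallocates only when the new size exceeds capacity.
  void resize(std::size_t new_size)
  {
    if (new_size > capacity) { reserve(new_size); }
    size = new_size;  // the invariant size <= capacity holds
  }
};
```

In this model, shrinking with resize() never changes `capacity`, which mirrors why shrink_to_fit() exists as a separate operation.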

◆ set_element()

template<typename T >
void rmm::device_uvector< T >::set_element ( std::size_t  element_index,
T const &  value,
cuda_stream_view  stream 
)
inline

Performs a synchronous copy of value to the specified element in device memory.

Because this function synchronizes stream, it is safe to destroy or modify the object referenced by value after this function has returned.

Note
This function incurs a host-to-device memcpy and should be used sparingly.
This function synchronizes stream.

Example:

int v{42};
// Copies 42 to element 0 on `stream` and synchronizes the stream
vec.set_element(0, v, stream);
// It is safe to destroy or modify `v`
v = 13;
Exceptions
rmm::out_of_range exception if element_index >= size()
Parameters
element_index: Index of the target element
value: The value to copy to the specified element
stream: The stream on which to perform the copy

◆ set_element_async()

template<typename T >
void rmm::device_uvector< T >::set_element_async ( std::size_t  element_index,
value_type const &  value,
cuda_stream_view  stream 
)
inline

Performs an asynchronous copy of value to the specified element in device memory.

This specialization for fundamental types is optimized to use cudaMemsetAsync when value is zero.

This function does not synchronize stream before returning. Therefore, the object referenced by value should not be destroyed or modified until stream has been synchronized. Otherwise, behavior is undefined.

Note
This function incurs a host-to-device memcpy and should be used sparingly.
Calling this function with a literal or other r-value reference for value is disallowed to prevent the implementation from asynchronously copying from a literal or other implicit temporary after it is deleted or goes out of scope.

Example:

int v{42};
// Copies 42 to element 0 on `stream`. Does _not_ synchronize
vec.set_element_async(0, v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v = 13;
Exceptions
rmm::out_of_range exception if element_index >= size()
Parameters
element_index: Index of the target element
value: The value to copy to the specified element
stream: The stream on which to perform the copy
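
The r-value restriction noted above relies on a deleted overload taking `value_type const &&`. The following host-only sketch shows how such an overload rejects temporaries at compile time. The names `element_setter` and `can_set` are hypothetical and used only for illustration; the sketch performs an ordinary synchronous assignment and assumes C++17 for `std::void_t`.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a container with an l-value-only setter.
struct element_setter {
  int stored{0};

  // Accepts only l-values: the caller keeps the object alive.
  void set_async(std::size_t /*index*/, int const& value) { stored = value; }

  // Deleted overload: wins overload resolution for r-values, so passing a
  // literal or temporary is a compile-time error.
  void set_async(std::size_t, int const&&) = delete;
};

// Detects whether set_async(index, Arg) is a well-formed call.
template <typename Arg, typename = void>
struct can_set : std::false_type {};

template <typename Arg>
struct can_set<Arg,
               std::void_t<decltype(std::declval<element_setter&>().set_async(
                 std::size_t{0}, std::declval<Arg>()))>> : std::true_type {};

static_assert(can_set<int const&>::value, "const l-values are accepted");
static_assert(can_set<int&>::value, "non-const l-values are accepted");
static_assert(!can_set<int>::value, "temporaries are rejected");
static_assert(!can_set<int&&>::value, "r-value references are rejected");
```

With this pattern, a call such as `set_async(0, 42)` fails to compile, while passing a named variable succeeds.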

◆ set_element_to_zero_async()

template<typename T >
void rmm::device_uvector< T >::set_element_to_zero_async ( std::size_t  element_index,
cuda_stream_view  stream 
)
inline

Asynchronously sets the specified element to zero in device memory.

This function does not synchronize stream before returning.

Note
This function incurs a device memset and should be used sparingly.

Example:

// Sets element at index 42 to 0 on `stream`. Does _not_ synchronize
vec.set_element_to_zero_async(42, stream);
Exceptions
rmm::out_of_range exception if element_index >= size()
Parameters
element_index: Index of the target element
stream: The stream on which to perform the copy

◆ set_stream()

template<typename T >
void rmm::device_uvector< T >::set_stream ( cuda_stream_view  stream)
inline noexcept

Sets the stream to be used for deallocation.

If no other rmm::device_uvector method that allocates memory is called after this call with a different stream argument, then stream will be used for deallocation in the rmm::device_uvector destructor. However, if either of resize() or shrink_to_fit() is called after this, the later stream parameter will be stored and used in the destructor.

Parameters
stream: The stream to use for deallocation
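
The "most recent stream wins" rule described for set_stream() can be traced with a small host-only model. This is an illustrative sketch, not RMM code: `stream_tracking_model` is hypothetical, plain `int` identifiers stand in for CUDA streams, and only the methods named in the description above are modeled.

```cpp
#include <cstddef>

// Hypothetical host-only model: an int id stands in for a CUDA stream.
class stream_tracking_model {
 public:
  explicit stream_tracking_model(int stream_id) : stream_id_{stream_id} {}

  // Explicitly chooses the stream to be used for deallocation.
  void set_stream(int stream_id) noexcept { stream_id_ = stream_id; }

  // Methods that take a stream overwrite the stored stream.
  void resize(std::size_t /*new_size*/, int stream_id) { stream_id_ = stream_id; }
  void shrink_to_fit(int stream_id) { stream_id_ = stream_id; }

  // The stream the destructor would use for deallocation in this model.
  int stream() const noexcept { return stream_id_; }

 private:
  int stream_id_;
};
```

In this model, a `resize` issued after `set_stream` replaces the stored stream, so the order of calls determines which stream is used at destruction.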

◆ shrink_to_fit()

template<typename T >
void rmm::device_uvector< T >::shrink_to_fit ( cuda_stream_view  stream)
inline

Forces deallocation of unused device memory.

If capacity() > size(), reallocates and copies vector contents to eliminate unused memory.

Parameters
stream: Stream on which to perform allocation and copy

◆ size()

template<typename T >
std::size_t rmm::device_uvector< T >::size ( ) const
inline noexcept

The number of elements in the vector.

Returns
The number of elements in the vector

◆ ssize()

template<typename T >
std::int64_t rmm::device_uvector< T >::ssize ( ) const
inline noexcept

The signed number of elements in the vector.

Returns
The signed number of elements in the vector
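
The reference does not state why a signed size is offered. One common reason, assumed here, is that arithmetic on an unsigned size wraps around for an empty container. The host-only sketch below illustrates the difference; both helper functions are hypothetical and not part of RMM.

```cpp
#include <cstddef>
#include <cstdint>

// Index of the last element computed from an unsigned size.
// For n == 0 the subtraction wraps around to the largest std::size_t value.
inline std::size_t last_index_unsigned(std::size_t n) { return n - 1; }

// The same computation on a signed size yields -1 for an empty container,
// which a loop condition such as `i >= 0` handles correctly.
inline std::int64_t last_index_signed(std::int64_t n) { return n - 1; }
```

A reverse loop written as `for (auto i = last_index_signed(n); i >= 0; --i)` therefore terminates immediately when `n` is zero.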

◆ stream()

template<typename T >
cuda_stream_view rmm::device_uvector< T >::stream ( ) const
inline noexcept

Stream most recently specified for allocation/deallocation.

Returns
Stream most recently specified for allocation/deallocation
