libcudf  24.02.00
Finding

Files

file  find.hpp
 
file  find_multiple.hpp
 

Functions

std::unique_ptr< column >  cudf::strings::find (strings_column_view const &input, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found in each string of the provided column.
 
std::unique_ptr< column >  cudf::strings::rfind (strings_column_view const &input, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found searching from the end of each string.
 
std::unique_ptr< column >  cudf::strings::find (strings_column_view const &input, strings_column_view const &target, size_type start=0, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found in the corresponding string of the provided column.
 
std::unique_ptr< column >  cudf::strings::contains (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::contains (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::starts_with (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::starts_with (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::ends_with (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::ends_with (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.
 
std::unique_ptr< column >  cudf::strings::find_multiple (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a lists column with character position values where each of the target strings is found in each string.
 

Detailed Description

Function Documentation

◆ contains() [1/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.

If the target is not found for a string, false is returned for that entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
input   Strings instance for this operation
target  UTF-8 encoded string to search for in each string
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New BOOL8 column
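
The three rules above (not found, empty target, null row) can be illustrated with a small host-side model. This is only a sketch using the C++ standard library, not libcudf code: `Row`, `contains_model`, and `contains_model_example` are hypothetical names, `std::nullopt` stands in for a null entry, and nothing here runs on the device.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;

// Hypothetical helper mirroring the documented rules of contains() [1/2].
std::vector<std::optional<bool>> contains_model(std::vector<Row> const& input,
                                                std::string const& target)
{
  std::vector<std::optional<bool>> out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (!row) {
      out.push_back(std::nullopt);  // null in, null out
    } else {
      // std::string::find("") returns 0, so an empty target yields true
      out.push_back(row->find(target) != std::string::npos);
    }
  }
  return out;
}

// Usage: one found row, one null row, one row without the target.
void contains_model_example()
{
  std::vector<Row> input{"abc", std::nullopt, "xyz"};
  auto const r = contains_model(input, "b");
  assert(r[0] == true);
  assert(!r[1].has_value());
  assert(r[2] == false);
}
```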

◆ contains() [2/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.

output[i] = true if the string targets[i] is found inside input[i]; otherwise output[i] = false. If targets[i] is an empty string, true is returned for output[i]. If targets[i] is null, false is returned for output[i].

Any null string entries return corresponding null entries in the output column.

Exceptions
cudf::logic_error  if input.size() != targets.size().
Parameters
input    Strings instance for this operation
targets  Strings column of targets to check row-wise in input
stream   CUDA stream used for device memory operations and kernel launches
mr       Device memory resource used to allocate the returned column's device memory
Returns
New BOOL8 column
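
A host-side sketch of the row-wise rules may help when reasoning about mixed null and empty targets. It is a standard-library model, not libcudf code: `Row`, `contains_rowwise_model`, and `contains_rowwise_example` are hypothetical names, `std::invalid_argument` stands in for `cudf::logic_error`, and the choice that a null input row takes precedence over a null target is an assumption that the text above does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::optional<std::string>;  // std::nullopt models a null row

// Hypothetical host-side model of contains() [2/2].
std::vector<std::optional<bool>> contains_rowwise_model(std::vector<Row> const& input,
                                                        std::vector<Row> const& targets)
{
  // std::invalid_argument stands in for cudf::logic_error in this model
  if (input.size() != targets.size()) { throw std::invalid_argument("size mismatch"); }

  std::vector<std::optional<bool>> out(input.size());  // every entry starts as null
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (!input[i]) { continue; }  // null input row stays null (assumed precedence)
    if (!targets[i]) {
      out[i] = false;  // null target: false
      continue;
    }
    // an empty target is found at position 0, so the result is true
    out[i] = input[i]->find(*targets[i]) != std::string::npos;
  }
  return out;
}

// Usage: a match, a null input row, and a null target.
void contains_rowwise_example()
{
  std::vector<Row> input{"abcdef", std::nullopt, "xyz"};
  std::vector<Row> targets{"cd", "a", std::nullopt};
  auto const r = contains_rowwise_model(input, targets);
  assert(r[0] == true);
  assert(!r[1].has_value());
  assert(r[2] == false);
}
```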

◆ ends_with() [1/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.

If target is not found at the end of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
input   Strings instance for this operation
target  UTF-8 encoded string to search for in each string
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New BOOL8 column
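
The suffix test described above can be sketched on the host for a single row. The names `ends_with_model`, `ends_with_row`, and `ends_with_example` are hypothetical, and the code compares bytes with `std::string_view`, which is only a model of the documented behavior.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Suffix test on a single string; an empty target matches every string.
bool ends_with_model(std::string_view s, std::string_view target)
{
  if (target.size() > s.size()) { return false; }
  return s.substr(s.size() - target.size()) == target;
}

// Nullable wrapper: a null row (std::nullopt) yields a null result.
std::optional<bool> ends_with_row(std::optional<std::string> const& row,
                                  std::string_view target)
{
  if (!row) { return std::nullopt; }
  return ends_with_model(*row, target);
}

// Usage covering the documented cases.
void ends_with_example()
{
  assert(ends_with_model("report.csv", ".csv"));
  assert(!ends_with_model("csv", ".csv"));  // target longer than the string
  assert(ends_with_model("abc", ""));       // empty target
  assert(!ends_with_row(std::nullopt, ".csv").has_value());
}
```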

◆ ends_with() [2/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.

If targets[i] is not found at the end of input[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output column.

Exceptions
cudf::logic_error  if input.size() != targets.size().
Parameters
input    Strings instance for this operation
targets  Strings instance for this operation
stream   CUDA stream used for device memory operations and kernel launches
mr       Device memory resource used to allocate the returned column's device memory
Returns
New BOOL8 column

◆ find() [1/2]

std::unique_ptr<column> cudf::strings::find ( strings_column_view const &  input,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of character position values where the target string is first found in each string of the provided column.

If target is not found, -1 is returned for that row entry in the output column.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
cudf::logic_error  if start position is greater than stop position.
Parameters
input   Strings instance for this operation
target  UTF-8 encoded string to search for in each string
start   First character position to include in the search
stop    Last position (exclusive) to include in the search. Default of -1 will search to the end of the string.
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New integer column with character position values
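
The half-open range [start,stop) is easiest to see on a single string. The sketch below is a host-side model with hypothetical names (`find_model`), not the libcudf implementation. It assumes ASCII input, so byte offsets equal character positions, assumes start is non-negative, assumes the whole match must lie inside the range, and uses `std::invalid_argument` in place of `cudf::logic_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string_view>

// Hypothetical single-row model of find() [1/2] for ASCII strings.
// Returns the first position of target inside [start, stop), or -1.
int find_model(std::string_view s, std::string_view target, int start = 0, int stop = -1)
{
  assert(start >= 0);  // assumption of this sketch
  if (stop >= 0 && start > stop) { throw std::invalid_argument("start > stop"); }

  int const len = static_cast<int>(s.size());
  int const end = (stop < 0 || stop > len) ? len : stop;  // -1 means "to the end"
  if (start > end) { return -1; }                          // range begins past the string

  auto const window = s.substr(start, end - start);  // characters in [start, end)
  auto const pos    = window.find(target);
  return pos == std::string_view::npos ? -1 : start + static_cast<int>(pos);
}
```

In this model, `find_model("hello world", "o")` gives 4, while restricting the range with `find_model("hello world", "o", 0, 4)` gives -1 because position 4 is excluded.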

◆ find() [2/2]

std::unique_ptr<column> cudf::strings::find ( strings_column_view const &  input,
strings_column_view const &  target,
size_type  start = 0,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of character position values where the target string is first found in the corresponding string of the provided column.

The output of row i is the character position of the target string for row i within the input string of row i, searching from the character position start. If the target is not found within the input string, -1 is returned for that row entry in the output column.

Any null input or target entries return corresponding null output column entries.

Exceptions
cudf::logic_error  if input.size() != target.size()
Parameters
input   Strings to search against
target  Strings to search for in input
start   First character position to include in the search
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New integer column with character position values
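
As a rough illustration of the row-wise pairing and null handling, here is a host-side model. `Row`, `find_rowwise_model`, and `find_rowwise_example` are hypothetical names, the input is assumed to be ASCII so that byte offsets match character positions, and `std::invalid_argument` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::optional<std::string>;  // std::nullopt models a null row

// Hypothetical host-side model of find() [2/2] for ASCII strings.
std::vector<std::optional<int>> find_rowwise_model(std::vector<Row> const& input,
                                                   std::vector<Row> const& target,
                                                   std::size_t start = 0)
{
  if (input.size() != target.size()) { throw std::invalid_argument("size mismatch"); }

  std::vector<std::optional<int>> out(input.size());  // every entry starts as null
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (!input[i] || !target[i]) { continue; }  // null input or target: null output
    auto const pos = input[i]->find(*target[i], start);
    out[i] = (pos == std::string::npos) ? -1 : static_cast<int>(pos);
  }
  return out;
}

// Usage: search each row for its own target, starting at position 2.
void find_rowwise_example()
{
  std::vector<Row> input{"abcabc", std::nullopt};
  std::vector<Row> target{"bc", "a"};
  auto const r = find_rowwise_model(input, target, 2);
  assert(r[0] == 4);          // first "bc" at or after position 2
  assert(!r[1].has_value());  // null input row
}
```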

◆ find_multiple()

std::unique_ptr<column> cudf::strings::find_multiple ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a lists column with character position values where each of the target strings is found in each string.

The size of the output column is input.size(). Each row of the output column is of size targets.size().

output[i,j] contains the position of targets[j] in input[i]

Example:
s = ["abc", "def"]
t = ["a", "c", "e"]
r = find_multiple(s, t)
r is now {[ 0, 2,-1], // for "abc": "a" at pos 0, "c" at pos 2, "e" not found
          [-1,-1, 1]} // for "def": "a" and "c" not found, "e" at pos 1
Exceptions
cudf::logic_error  if targets is empty or contains nulls
Parameters
input    Strings instance for this operation
targets  Strings to search for in each string
stream   CUDA stream used for device memory operations and kernel launches
mr       Device memory resource used to allocate the returned column's device memory
Returns
Lists column with character position values
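
The output[i,j] layout from the example above can be reproduced with a nested loop on the host. This is a sketch only: `find_multiple_model` and `find_multiple_example` are hypothetical names, plain `std::string` rows are used so null handling is left out, positions are byte offsets (equal to character positions for ASCII), and `std::invalid_argument` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side model of find_multiple() for ASCII strings.
// out[i][j] is the position of targets[j] in input[i], or -1 if absent.
std::vector<std::vector<int>> find_multiple_model(std::vector<std::string> const& input,
                                                  std::vector<std::string> const& targets)
{
  if (targets.empty()) { throw std::invalid_argument("targets is empty"); }

  std::vector<std::vector<int>> out;
  out.reserve(input.size());
  for (auto const& s : input) {
    std::vector<int> row;
    row.reserve(targets.size());
    for (auto const& t : targets) {
      auto const pos = s.find(t);
      row.push_back(pos == std::string::npos ? -1 : static_cast<int>(pos));
    }
    out.push_back(std::move(row));  // one inner list per input string
  }
  return out;
}

// Usage: the example from the text above.
void find_multiple_example()
{
  std::vector<std::string> const s{"abc", "def"};
  std::vector<std::string> const t{"a", "c", "e"};
  auto const r = find_multiple_model(s, t);
  std::vector<std::vector<int>> const expected{{0, 2, -1}, {-1, -1, 1}};
  assert(r == expected);
}
```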

◆ rfind()

std::unique_ptr<column> cudf::strings::rfind ( strings_column_view const &  input,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of character position values where the target string is first found searching from the end of each string.

If target is not found, -1 is returned for that entry.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
cudf::logic_error  if start position is greater than stop position.
Parameters
input   Strings instance for this operation
target  UTF-8 encoded string to search for in each string
start   First position to include in the search
stop    Last position (exclusive) to include in the search. Default of -1 will search starting at the end of the string.
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New integer column with character position values
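
To contrast with a forward search, the following host-side sketch returns the last match inside [start,stop) for a single string. `rfind_model` is a hypothetical name and this is not the libcudf implementation. It assumes ASCII input, a non-negative start, and that the whole match must lie inside the range. `std::invalid_argument` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string_view>

// Hypothetical single-row model of rfind() for ASCII strings.
// Returns the last position of target inside [start, stop), or -1.
int rfind_model(std::string_view s, std::string_view target, int start = 0, int stop = -1)
{
  assert(start >= 0);  // assumption of this sketch
  if (stop >= 0 && start > stop) { throw std::invalid_argument("start > stop"); }

  int const len = static_cast<int>(s.size());
  int const end = (stop < 0 || stop > len) ? len : stop;  // -1 means "to the end"
  if (start > end) { return -1; }                          // range begins past the string

  auto const window = s.substr(start, end - start);  // characters in [start, end)
  auto const pos    = window.rfind(target);          // search backwards within the window
  return pos == std::string_view::npos ? -1 : start + static_cast<int>(pos);
}
```

With this model, `rfind_model("hello world", "o")` gives 7, and narrowing the range with `rfind_model("hello world", "o", 0, 7)` gives 4 because position 7 is excluded.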

◆ starts_with() [1/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.

If target is not found at the beginning of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
input   Strings instance for this operation
target  UTF-8 encoded string to search for in each string
stream  CUDA stream used for device memory operations and kernel launches
mr      Device memory resource used to allocate the returned column's device memory
Returns
New type_id::BOOL8 column.
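
A prefix check over a whole column can be modeled on the host as follows. The names `starts_with_model`, `starts_with_column`, and `starts_with_example` are hypothetical, `std::nullopt` models a null row, and the comparison is byte-wise, so this is a sketch of the documented behavior and not libcudf code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Prefix test on a single string; substr clamps, so a longer target never matches.
bool starts_with_model(std::string_view s, std::string_view target)
{
  return s.substr(0, target.size()) == target;
}

// Column-style wrapper: null rows (std::nullopt) stay null in the output.
std::vector<std::optional<bool>> starts_with_column(
  std::vector<std::optional<std::string>> const& input, std::string_view target)
{
  std::vector<std::optional<bool>> out(input.size());  // every entry starts as null
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (input[i]) { out[i] = starts_with_model(*input[i], target); }
  }
  return out;
}

// Usage: a matching row, a null row, and a non-matching row.
void starts_with_example()
{
  std::vector<std::optional<std::string>> input{"prefix_a", std::nullopt, "other"};
  auto const r = starts_with_column(input, "prefix");
  assert(r[0] == true);
  assert(!r[1].has_value());
  assert(r[2] == false);
}
```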

◆ starts_with() [2/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.

If targets[i] is not found at the beginning of input[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output column.

Exceptions
cudf::logic_error  if input.size() != targets.size().
Parameters
input    Strings instance for this operation
targets  Strings instance for this operation
stream   CUDA stream used for device memory operations and kernel launches
mr       Device memory resource used to allocate the returned column's device memory
Returns
New BOOL8 column