std::unique_ptr< cudf::column > normalize_characters(cudf::strings_column_view const &input, bool do_lower_case, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
Normalizes the characters in each string of the input column in preparation for tokenizing.
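As a rough illustration of what character normalization means here, the following host-side sketch handles ASCII input only. It assumes the behavior described in the libcudf documentation: whitespace characters become a plain space, control characters are removed, punctuation and symbols are padded with a space on each side, and letters are lower-cased when `do_lower_case` is true. The function name `normalize_characters_ascii_sketch` is hypothetical and not part of nvtext; the real function runs on the GPU over a strings column and also handles Unicode (accent stripping, CJK padding), which this sketch does not attempt.

```cpp
#include <cctype>
#include <string>

// Hypothetical host-side sketch, ASCII only. Not the nvtext implementation.
std::string normalize_characters_ascii_sketch(std::string const& input, bool do_lower_case)
{
  std::string output;
  for (unsigned char ch : input) {
    if (ch >= 0x80) {
      // Non-ASCII bytes are copied unchanged here; the real function
      // applies Unicode-aware rules that this sketch leaves out.
      output.push_back(static_cast<char>(ch));
    } else if (std::isspace(ch)) {
      // Tabs, newlines, etc. all become a single plain space character.
      output.push_back(' ');
    } else if (std::iscntrl(ch)) {
      // Remaining control characters are dropped.
    } else if (std::ispunct(ch)) {
      // Punctuation and symbols are padded with a space on both sides.
      output.push_back(' ');
      output.push_back(static_cast<char>(ch));
      output.push_back(' ');
    } else {
      // Letters and digits: optionally lower-case.
      output.push_back(static_cast<char>(do_lower_case ? std::tolower(ch) : ch));
    }
  }
  return output;
}
```

Under these assumptions, `normalize_characters_ascii_sketch("[a,bb]", true)` yields `" [ a , bb ] "`, so a downstream tokenizer can split on spaces and still see each bracket and comma as its own token.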
std::unique_ptr< cudf::column > normalize_spaces(cudf::strings_column_view const &input, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
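Returns a new strings column by normalizing the whitespace in each string in the input column. Normalizing replaces each run of whitespace characters with a single space and trims whitespace from the beginning and end of the string.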
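The whitespace rule can be sketched on the host for a single string. This is only an illustration of the semantics, assuming that whitespace means any character with a code point less than or equal to `' '`. The name `normalize_spaces_sketch` is hypothetical; the real function operates on a `cudf::strings_column_view` on the device.

```cpp
#include <string>

// Hypothetical host-side sketch of whitespace normalization for one string.
// Not the nvtext implementation.
std::string normalize_spaces_sketch(std::string const& input)
{
  std::string output;
  bool pending_space = false;  // true when a whitespace run follows emitted text
  for (unsigned char ch : input) {
    if (ch <= ' ') {
      // Inside a whitespace run: remember it only if text was already emitted,
      // which is what trims leading whitespace.
      pending_space = !output.empty();
    } else {
      // First character after a run: emit exactly one separating space.
      if (pending_space) { output.push_back(' '); }
      output.push_back(static_cast<char>(ch));
      pending_space = false;
    }
  }
  // A trailing run leaves pending_space set but is never emitted,
  // which trims trailing whitespace.
  return output;
}
```

For example, `normalize_spaces_sketch("  c  d\n")` yields `"c d"` and `normalize_spaces_sketch("e \t f ")` yields `"e f"`.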