cudf.core.column.string.StringMethods.token_count

StringMethods.token_count(delimiter: str = ' ') → SeriesOrIndex

Each string is split into tokens using the provided delimiter. The returned integer sequence is the number of tokens in each string.

Parameters

delimiter (str or list of strs, default is whitespace): The characters or strings used to locate the split points of each string.

Returns

Series or Index: The number of tokens in each string.

Examples

>>> import cudf
>>> ser = cudf.Series(["hello world","goodbye",""])
>>> ser.str.token_count()
0    2
1    1
2    0
dtype: int32
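
The counting rule can be illustrated without a GPU. The following is a minimal pure-Python sketch, not the cuDF implementation: `token_count_reference` is a hypothetical helper, it accepts only a single delimiter string, and it assumes that empty tokens are not counted, which is consistent with the empty string producing 0 in the example above.

```python
def token_count_reference(strings, delimiter=" "):
    """Hypothetical CPU reference for the counting rule (not part of cuDF)."""
    counts = []
    for s in strings:
        # Split on the delimiter, then keep only non-empty pieces.
        # Assumption: empty tokens do not contribute to the count.
        tokens = [tok for tok in s.split(delimiter) if tok]
        counts.append(len(tokens))
    return counts


# Same input as the cuDF example above.
print(token_count_reference(["hello world", "goodbye", ""]))  # [2, 1, 0]
```

A sketch like this can serve as a quick sanity check for expected counts on small inputs. How cuDF treats null entries and lists of delimiters is not covered here.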