cudf.crosstab

cudf.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=None, normalize=False)

Compute a simple cross tabulation of two (or more) factors. By default, this computes a frequency table of the factors; if an array of values and an aggregation function are passed, the values are aggregated instead.

Parameters
index : array-like, Series, or list of arrays/Series

Values to group by in the rows.

columns : array-like, Series, or list of arrays/Series

Values to group by in the columns.

values : array-like, optional

Array of values to aggregate according to the factors. Requires aggfunc be specified.

rownames : list of str, default None

If passed, must match number of row arrays passed.

colnames : list of str, default None

If passed, must match number of column arrays passed.

aggfunc : function, optional

If specified, requires values be specified as well.

margins : Not supported
margins_name : Not supported
dropna : Not supported
normalize : Not supported

Returns
DataFrame

Cross tabulation of the data.

Examples

>>> import cudf
>>> a = cudf.Series(["foo", "foo", "foo", "foo", "bar", "bar",
...               "bar", "bar", "foo", "foo", "foo"], dtype=object)
>>> b = cudf.Series(["one", "one", "one", "two", "one", "one",
...               "one", "two", "two", "two", "one"], dtype=object)
>>> c = cudf.Series(["dull", "dull", "shiny", "dull", "dull", "shiny",
...               "shiny", "dull", "shiny", "shiny", "shiny"],
...              dtype=object)
>>> cudf.crosstab(a, [b, c], rownames=['a'], colnames=['b', 'c'])
b   one        two
c   dull shiny dull shiny
a
bar    1     2    1     0
foo    2     2    1     2
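
To make the meaning of each cell concrete, the following is a minimal plain-Python sketch of what a cross tabulation computes. It does not use cudf and is not how cudf implements crosstab; crosstab_sketch is a hypothetical helper, the values list v is made-up illustration data, and representing empty cells as None under an aggregation is an assumption of this sketch. It reuses the factors from the example above, with the two column factors zipped into a single key.

```python
from collections import defaultdict


def crosstab_sketch(rows, cols, values=None, aggfunc=None):
    """Hypothetical helper (not part of cudf): tabulate rows against cols."""
    # Mirror the documented rule that values and aggfunc go together.
    if (values is None) != (aggfunc is None):
        raise ValueError("values and aggfunc must be passed together")

    # Collect the observations that fall into each (row key, column key) cell.
    groups = defaultdict(list)
    observations = rows if values is None else values
    for r, c, v in zip(rows, cols, observations):
        groups[(r, c)].append(v)

    # Without an aggfunc, a cell is simply the number of observations in it.
    reducer = len if aggfunc is None else aggfunc
    # Assumption of this sketch: empty cells are 0 for counts, None otherwise.
    empty = 0 if aggfunc is None else None

    row_keys = sorted(set(rows))
    col_keys = sorted(set(cols))
    table = {
        r: [reducer(groups[(r, c)]) if (r, c) in groups else empty
            for c in col_keys]
        for r in row_keys
    }
    return col_keys, table


a = ["foo", "foo", "foo", "foo", "bar", "bar",
     "bar", "bar", "foo", "foo", "foo"]
b = ["one", "one", "one", "two", "one", "one",
     "one", "two", "two", "two", "one"]
c = ["dull", "dull", "shiny", "dull", "dull", "shiny",
     "shiny", "dull", "shiny", "shiny", "shiny"]
v = list(range(1, 12))  # hypothetical values: 1, 2, ..., 11

col_keys, freq = crosstab_sketch(a, list(zip(b, c)))
print(col_keys)
# [('one', 'dull'), ('one', 'shiny'), ('two', 'dull'), ('two', 'shiny')]
print(freq)
# {'bar': [1, 2, 1, 0], 'foo': [2, 2, 1, 2]}

_, sums = crosstab_sketch(a, list(zip(b, c)), values=v, aggfunc=sum)
print(sums)
# {'bar': [5, 13, 8, None], 'foo': [3, 14, 4, 19]}
```

The frequency result matches the table printed by cudf.crosstab above: one row per distinct value of a, one column per observed (b, c) pair.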