libcudf  24.04.00
cudf::io::json_reader_options Class Reference

Input arguments to the read_json interface.

#include <json.hpp>

Public Member Functions

 json_reader_options ()=default
 Default constructor.

source_info const & get_source () const
 Returns source info.

std::variant< std::vector< data_type >, std::map< std::string, data_type >, std::map< std::string, schema_element > > const & get_dtypes () const
 Returns data types of the columns.

compression_type get_compression () const
 Returns compression format of the source.

size_t get_byte_range_offset () const
 Returns number of bytes to skip from source start.

size_t get_byte_range_size () const
 Returns number of bytes to read.

size_t get_byte_range_size_with_padding () const
 Returns number of bytes to read with padding.

size_t get_byte_range_padding () const
 Returns number of bytes to pad when reading.

bool is_enabled_lines () const
 Whether to read the file as a json object per line.

bool is_enabled_mixed_types_as_string () const
 Whether to parse mixed types as a string column.

bool is_enabled_dayfirst () const
 Whether to parse dates as DD/MM versus MM/DD.

bool is_enabled_legacy () const
 Whether the legacy reader should be used.

bool is_enabled_keep_quotes () const
 Whether the reader should keep quotes of string values.

bool is_enabled_normalize_single_quotes () const
 Whether the reader should normalize single quotes around strings.

bool is_enabled_normalize_whitespace () const
 Whether the reader should normalize unquoted whitespace characters.

json_recovery_mode_t recovery_mode () const
 Queries the JSON reader's behavior on invalid JSON lines.

void set_dtypes (std::vector< data_type > types)
 Set data types for columns to be read.

void set_dtypes (std::map< std::string, data_type > types)
 Set data types for columns to be read.

void set_dtypes (std::map< std::string, schema_element > types)
 Set data types for a potentially nested column hierarchy.

void set_compression (compression_type comp_type)
 Set the compression type.

void set_byte_range_offset (size_type offset)
 Set number of bytes to skip from source start.

void set_byte_range_size (size_type size)
 Set number of bytes to read.

void enable_lines (bool val)
 Set whether to read the file as a json object per line.

void enable_mixed_types_as_string (bool val)
 Set whether to parse mixed types as a string column. Also enables forcing to read a struct as string column using schema.

void enable_dayfirst (bool val)
 Set whether to parse dates as DD/MM versus MM/DD.

void enable_legacy (bool val)
 Set whether to use the legacy reader.

void enable_keep_quotes (bool val)
 Set whether the reader should keep quotes of string values.

void enable_normalize_single_quotes (bool val)
 Set whether the reader should enable normalization of single quotes around strings.

void enable_normalize_whitespace (bool val)
 Set whether the reader should enable normalization of unquoted whitespace.

void set_recovery_mode (json_recovery_mode_t val)
 Specifies the JSON reader's behavior on invalid JSON lines.


Static Public Member Functions

static json_reader_options_builder builder (source_info src)
 Creates a json_reader_options_builder, which builds json_reader_options.
 

Detailed Description

Input arguments to the read_json interface.

Available parameters are closely patterned after the pandas read_json API. Not all parameters are supported. If the matching pandas parameter has a default value of None, then a default value of -1 or 0 may be used as the equivalent.

Parameters in pandas that are unavailable or handled differently in cudf:

Name                Description
orient              currently fixed-format
typ                 data is always returned as a cudf::table
convert_axes        use column functions for axes operations instead
convert_dates       dates are detected automatically
keep_default_dates  dates are detected automatically
numpy               data is always returned as a cudf::table
precise_float       there is only one converter
date_unit           only millisecond units are supported
encoding            only ASCII-encoded data is supported
chunksize           use byte_range_xxx for chunking instead

Definition at line 88 of file io/json.hpp.
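
The chunksize entry above points to the byte range options as the chunking mechanism. As a minimal sketch of what that could look like on the caller's side, the following splits a source of known size into consecutive (offset, size) pairs. The byte_range struct and make_byte_ranges helper are hypothetical names introduced for illustration and are not part of libcudf; each pair would then be handed to set_byte_range_offset and set_byte_range_size on a separate options object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of libcudf): one chunk of the input, in bytes.
struct byte_range {
  std::size_t offset;  // candidate argument for set_byte_range_offset
  std::size_t size;    // candidate argument for set_byte_range_size
};

// Split a source of `total_bytes` into consecutive ranges of at most `chunk_bytes` each.
std::vector<byte_range> make_byte_ranges(std::size_t total_bytes, std::size_t chunk_bytes)
{
  assert(chunk_bytes > 0);  // a zero chunk size would never advance
  std::vector<byte_range> ranges;
  for (std::size_t offset = 0; offset < total_bytes; offset += chunk_bytes) {
    std::size_t const remaining = total_bytes - offset;
    // The last range is shortened so that it ends exactly at the end of the source.
    ranges.push_back({offset, remaining < chunk_bytes ? remaining : chunk_bytes});
  }
  return ranges;
}
```

Note that the setters documented on this page take a size_type argument, so the values produced by such a helper need to fit in that type.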

Constructor & Destructor Documentation

◆ json_reader_options()

cudf::io::json_reader_options::json_reader_options ( )
default

Default constructor.

This has been added because Cython requires a default constructor to create objects on the stack.

Member Function Documentation

◆ builder()

static json_reader_options_builder cudf::io::json_reader_options::builder ( source_info  src)
static

Creates a json_reader_options_builder, which builds json_reader_options.

Parameters
src  source information used to read json file
Returns
builder to build the options
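
To illustrate how a static builder() function relates to the enable_* setters, here is a simplified, self-contained model of the pattern. Everything prefixed with toy_ is a stand-in written for this sketch, the source is reduced to a plain string instead of source_info, and the chained lines() and build() method names are assumptions about the builder's shape rather than something this page documents.

```cpp
#include <string>
#include <utility>

class toy_reader_options_builder;

// Stand-in for an options class with one flag (not the real json_reader_options).
class toy_reader_options {
 public:
  toy_reader_options() = default;

  std::string const& get_source() const { return _source; }
  bool is_enabled_lines() const { return _lines; }
  void enable_lines(bool val) { _lines = val; }

  // Entry point of the pattern: returns a builder bound to the given source.
  static toy_reader_options_builder builder(std::string src);

 private:
  friend class toy_reader_options_builder;
  std::string _source;
  bool _lines = false;
};

// Stand-in builder: owns an options object and fills it in step by step.
class toy_reader_options_builder {
 public:
  explicit toy_reader_options_builder(std::string src) { _options._source = std::move(src); }

  // Each setter returns *this so that calls can be chained.
  toy_reader_options_builder& lines(bool val)
  {
    _options.enable_lines(val);
    return *this;
  }

  // Hands the finished options object to the caller.
  toy_reader_options build() { return std::move(_options); }

 private:
  toy_reader_options _options;
};

inline toy_reader_options_builder toy_reader_options::builder(std::string src)
{
  return toy_reader_options_builder{std::move(src)};
}
```

With this model, toy_reader_options::builder("data.jsonl").lines(true).build() yields an options object in a single expression, while the enable_* setters remain available for adjusting an existing object afterwards.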

◆ enable_dayfirst()

void cudf::io::json_reader_options::enable_dayfirst ( bool  val)
inline

Set whether to parse dates as DD/MM versus MM/DD.

Parameters
val  Boolean value to enable/disable day first parsing format

Definition at line 347 of file io/json.hpp.
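
The difference between the two date orders can be shown with a small standalone sketch. The day_and_month helper below is hypothetical and does not reflect how libcudf parses dates internally; it only demonstrates how the same text maps to a different day and month depending on the flag.

```cpp
#include <string>
#include <utility>

// Hypothetical helper (not part of libcudf): interprets "AA/BB" and returns {day, month}.
// Assumes the input contains exactly one '/' with a decimal number on each side.
std::pair<int, int> day_and_month(std::string const& text, bool dayfirst)
{
  auto const slash  = text.find('/');
  int const first   = std::stoi(text.substr(0, slash));
  int const second  = std::stoi(text.substr(slash + 1));
  // With dayfirst the leading field is the day (DD/MM); otherwise it is the month (MM/DD).
  return dayfirst ? std::make_pair(first, second) : std::make_pair(second, first);
}
```

For the input "03/04", the sketch returns day 3 and month 4 when dayfirst is true, and day 4 and month 3 when it is false.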

◆ enable_keep_quotes()

void cudf::io::json_reader_options::enable_keep_quotes ( bool  val)
inline

Set whether the reader should keep quotes of string values.

Parameters
val  Boolean value to indicate whether the reader should keep quotes of string values

Definition at line 362 of file io/json.hpp.

◆ enable_legacy()

void cudf::io::json_reader_options::enable_legacy ( bool  val)
inline

Set whether to use the legacy reader.

Parameters
val  Boolean value to enable/disable the legacy reader

Definition at line 354 of file io/json.hpp.

◆ enable_lines()

void cudf::io::json_reader_options::enable_lines ( bool  val)
inline

Set whether to read the file as a json object per line.

Parameters
val  Boolean value to enable/disable the option to read each line as a json object

Definition at line 332 of file io/json.hpp.
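
In the one-object-per-line layout, each line of the input holds a complete JSON object. The following sketch shows that layout by splitting a buffer into per-line records with the standard library only. The split_records helper is hypothetical, and skipping empty lines is a choice made for this sketch, not a statement about what the reader does.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of libcudf): returns one string per non-empty input line.
std::vector<std::string> split_records(std::string const& buffer)
{
  std::vector<std::string> records;
  std::istringstream stream(buffer);
  std::string line;
  while (std::getline(stream, line)) {
    // Each remaining line is expected to be a self-contained JSON object.
    if (!line.empty()) { records.push_back(line); }
  }
  return records;
}
```

Given the two-line buffer `{"a":1}` followed by `{"a":2}`, the helper returns two records, one per object.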

◆ enable_mixed_types_as_string()

void cudf::io::json_reader_options::enable_mixed_types_as_string ( bool  val)
inline

Set whether to parse mixed types as a string column. Also makes it possible to force a struct to be read as a string column by specifying it in the schema.

Parameters
val  Boolean value to enable/disable parsing mixed types as a string column

Definition at line 340 of file io/json.hpp.

◆ enable_normalize_single_quotes()

void cudf::io::json_reader_options::enable_normalize_single_quotes ( bool  val)
inline

Set whether the reader should enable normalization of single quotes around strings.

Parameters
val  Boolean value to indicate whether the reader should normalize single quotes around strings

Definition at line 370 of file io/json.hpp.

◆ enable_normalize_whitespace()

void cudf::io::json_reader_options::enable_normalize_whitespace ( bool  val)
inline

Set whether the reader should enable normalization of unquoted whitespace.

Parameters
val  Boolean value to indicate whether the reader should normalize unquoted whitespace characters, i.e. tabs and spaces

Definition at line 378 of file io/json.hpp.

◆ get_byte_range_offset()

size_t cudf::io::json_reader_options::get_byte_range_offset ( ) const
inline

Returns number of bytes to skip from source start.

Returns
Number of bytes to skip from source start

Definition at line 184 of file io/json.hpp.

◆ get_byte_range_padding()

size_t cudf::io::json_reader_options::get_byte_range_padding ( ) const
inline

Returns number of bytes to pad when reading.

Returns
Number of bytes to pad

Definition at line 212 of file io/json.hpp.

◆ get_byte_range_size()

size_t cudf::io::json_reader_options::get_byte_range_size ( ) const
inline

Returns number of bytes to read.

Returns
Number of bytes to read

Definition at line 191 of file io/json.hpp.

◆ get_byte_range_size_with_padding()

size_t cudf::io::json_reader_options::get_byte_range_size_with_padding ( ) const
inline

Returns number of bytes to read with padding.

Returns
Number of bytes to read with padding

Definition at line 198 of file io/json.hpp.

◆ get_compression()

compression_type cudf::io::json_reader_options::get_compression ( ) const
inline

Returns compression format of the source.

Returns
Compression format of the source

Definition at line 177 of file io/json.hpp.

◆ get_dtypes()

std::variant<std::vector<data_type>, std::map<std::string, data_type>, std::map<std::string, schema_element> > const& cudf::io::json_reader_options::get_dtypes ( ) const
inline

Returns data types of the columns.

Returns
Data types of the columns

Definition at line 167 of file io/json.hpp.
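
Because get_dtypes() returns a std::variant with three alternatives, callers need to dispatch on the active one. The sketch below shows two common ways to do that, std::visit and std::holds_alternative, on a variant of the same shape. The data_type and schema_element structs here are minimal stand-ins so the example compiles on its own, not the real cudf definitions, and both helper functions are hypothetical. It requires C++17.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Minimal stand-ins for cudf::data_type and cudf::io::schema_element (assumptions for this sketch).
struct data_type {
  int id;
};
struct schema_element {
  data_type type;
};

// Same shape as the return type of get_dtypes().
using dtypes_variant = std::variant<std::vector<data_type>,
                                    std::map<std::string, data_type>,
                                    std::map<std::string, schema_element>>;

// Number of columns with a user-specified type, whichever alternative is active.
std::size_t num_specified_columns(dtypes_variant const& dtypes)
{
  return std::visit([](auto const& container) { return container.size(); }, dtypes);
}

// True when types are keyed by column name rather than given by position.
bool is_keyed_by_name(dtypes_variant const& dtypes)
{
  return !std::holds_alternative<std::vector<data_type>>(dtypes);
}
```

The generic lambda works here because all three alternatives expose size(); code that needs the element types themselves would use std::get or std::get_if on the specific alternative instead.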

◆ get_source()

source_info const& cudf::io::json_reader_options::get_source ( ) const
inline

Returns source info.

Returns
Source info

Definition at line 157 of file io/json.hpp.

◆ is_enabled_dayfirst()

bool cudf::io::json_reader_options::is_enabled_dayfirst ( ) const
inline

Whether to parse dates as DD/MM versus MM/DD.

Returns
true if dates are parsed as DD/MM, false if MM/DD

Definition at line 248 of file io/json.hpp.

◆ is_enabled_keep_quotes()

bool cudf::io::json_reader_options::is_enabled_keep_quotes ( ) const
inline

Whether the reader should keep quotes of string values.

Returns
true if the reader should keep quotes, false otherwise

Definition at line 262 of file io/json.hpp.

◆ is_enabled_legacy()

bool cudf::io::json_reader_options::is_enabled_legacy ( ) const
inline

Whether the legacy reader should be used.

Returns
true if the legacy reader will be used, false otherwise

Definition at line 255 of file io/json.hpp.

◆ is_enabled_lines()

bool cudf::io::json_reader_options::is_enabled_lines ( ) const
inline

Whether to read the file as a json object per line.

Returns
true if reading the file as a json object per line

Definition at line 234 of file io/json.hpp.

◆ is_enabled_mixed_types_as_string()

bool cudf::io::json_reader_options::is_enabled_mixed_types_as_string ( ) const
inline

Whether to parse mixed types as a string column.

Returns
true if mixed types are parsed as a string column

Definition at line 241 of file io/json.hpp.

◆ is_enabled_normalize_single_quotes()

bool cudf::io::json_reader_options::is_enabled_normalize_single_quotes ( ) const
inline

Whether the reader should normalize single quotes around strings.

Returns
true if the reader should normalize single quotes, false otherwise

Definition at line 269 of file io/json.hpp.

◆ is_enabled_normalize_whitespace()

bool cudf::io::json_reader_options::is_enabled_normalize_whitespace ( ) const
inline

Whether the reader should normalize unquoted whitespace characters.

Returns
true if the reader should normalize whitespace, false otherwise

Definition at line 276 of file io/json.hpp.

◆ recovery_mode()

json_recovery_mode_t cudf::io::json_reader_options::recovery_mode ( ) const
inline

Queries the JSON reader's behavior on invalid JSON lines.

Returns
An enum that specifies the JSON reader's behavior on invalid JSON lines.

Definition at line 283 of file io/json.hpp.

◆ set_byte_range_offset()

void cudf::io::json_reader_options::set_byte_range_offset ( size_type  offset)
inline

Set number of bytes to skip from source start.

Parameters
offset  Number of bytes of offset

Definition at line 318 of file io/json.hpp.

◆ set_byte_range_size()

void cudf::io::json_reader_options::set_byte_range_size ( size_type  size)
inline

Set number of bytes to read.

Parameters
size  Number of bytes to read

Definition at line 325 of file io/json.hpp.

◆ set_compression()

void cudf::io::json_reader_options::set_compression ( compression_type  comp_type)
inline

Set the compression type.

Parameters
comp_type  The compression type used

Definition at line 311 of file io/json.hpp.

◆ set_dtypes() [1/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Map of column names to dtypes

Definition at line 297 of file io/json.hpp.

◆ set_dtypes() [2/3]

void cudf::io::json_reader_options::set_dtypes ( std::map< std::string, schema_element >  types)
inline

Set data types for a potentially nested column hierarchy.

Parameters
types  Map of column names to schema_element to support arbitrary nesting of data types

Definition at line 304 of file io/json.hpp.

◆ set_dtypes() [3/3]

void cudf::io::json_reader_options::set_dtypes ( std::vector< data_type >  types)
inline

Set data types for columns to be read.

Parameters
types  Vector of dtypes

Definition at line 290 of file io/json.hpp.

◆ set_recovery_mode()

void cudf::io::json_reader_options::set_recovery_mode ( json_recovery_mode_t  val)
inline

Specifies the JSON reader's behavior on invalid JSON lines.

Parameters
val  An enum value to indicate the JSON reader's behavior on invalid JSON lines.

Definition at line 385 of file io/json.hpp.


The documentation for this class was generated from the following file: