Get and reset polars options

Description

polars_options() returns a list of options for polars. Options can be set with options(). Note that option names must be prefixed with "polars.", e.g., to modify the option strictly_immutable you need to call options(polars.strictly_immutable = <value>). See below for a description of all options.

polars_options_reset() brings all polars options back to their default value.
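Because polars options are set through base R's options(), the usual base R idioms for scoping an option change should apply. The sketch below assumes nothing beyond base R; with_polars_option() is a hypothetical helper written for illustration and is not part of polars. It changes one option for the duration of a single expression and then restores the previous value.

```r
# Hypothetical helper (not part of polars): evaluate `code` with one polars
# option temporarily changed, then restore the previous value.
with_polars_option <- function(name, value, code) {
  opt <- stats::setNames(list(value), paste0("polars.", name))
  old <- options(opt)                # options() returns the previous values
  on.exit(options(old), add = TRUE)  # restore them when the function exits
  force(code)                        # `code` is evaluated lazily, i.e. here
}

options(polars.maintain_order = FALSE)

inside <- with_polars_option(
  "maintain_order", TRUE,
  getOption("polars.maintain_order")
)
inside                               # TRUE: the value seen inside the helper
getOption("polars.maintain_order")   # FALSE: the previous value is back
```

Unlike polars_options_reset(), which restores every polars option to its default, this only restores the one option that was changed, and to whatever value it had before.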

Usage

polars_options()

polars_options_reset()

Details

The following options are available (in alphabetical order, with the default value in parentheses):

  • debug_polars (FALSE): Print additional information to debug Polars.
  • do_not_repeat_call (FALSE): Do not print the call causing the error in error messages. By default, the call is shown.
  • int64_conversion ("double"): How should Int64 values be handled when converting a polars object to R?
    • "double" converts the integer values to double.
    • "bit64" uses bit64::as.integer64() to do the conversion (requires the package bit64 to be attached).
    • "string" converts Int64 values to character.
  • limit_max_threads (!polars_info()$features$disable_limit_max_threads): See ?pl_thread_pool_size for details. This option should be set before the package is loaded.
  • maintain_order (FALSE): Default for the maintain_order argument in <LazyFrame>$group_by() and <DataFrame>$group_by().
  • no_messages (FALSE): Hide messages.
  • rpool_cap: The maximum number of R sessions that can be used to process R code in the background. See the section "About pool options" below.
  • strictly_immutable (TRUE): Keep polars strictly immutable. Polars and Arrow generally favor immutable objects, and immutability is also the norm in R. To mimic the Python-polars API, set this to FALSE.
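One consideration when choosing a value for int64_conversion is a general property of floating-point numbers rather than anything specific to polars: a double represents every integer exactly only up to 2^53, while Int64 values can be larger. The base R snippet below illustrates that limit; it does not call polars.

```r
# 2^53 is the point beyond which a double can no longer hold every integer.
x <- 2^53
loses_precision <- (x + 1 == x)
loses_precision   # TRUE: adding 1 is lost in double arithmetic
```

If values of that magnitude matter, the "bit64" or "string" settings described above may be worth considering.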

Value

polars_options() returns a named list where the names are option names and values are option values.

polars_options_reset() doesn’t return anything.

About pool options

polars_options()$rpool_active indicates the number of R sessions already spawned in the pool. polars_options()$rpool_cap indicates the maximum number of new R sessions that can be spawned. Whenever a polars thread worker needs a background R session to run R code embedded in a query via $map_batches(…, in_background = TRUE) or $map_elements(…, in_background = TRUE), it takes any R session idling in the pool. If none is idle and rpool_cap has not been reached, it spawns a new R session (process) and adds it to the pool. If rpool_cap has already been reached, the thread worker sleeps until an R session becomes idle.
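The acquisition rule described above can be sketched as a toy model in base R. This is only an illustration of the decision logic, not the polars implementation: acquire_session() and the pool list (with fields idle, active and cap) are hypothetical names introduced here.

```r
# Toy model only, NOT the polars implementation.
# `pool` is a hypothetical list: idle sessions, spawned sessions, and the cap.
acquire_session <- function(pool) {
  if (pool$idle > 0) {
    pool$idle <- pool$idle - 1       # reuse a session idling in the pool
    action <- "reuse"
  } else if (pool$active < pool$cap) {
    pool$active <- pool$active + 1   # spawn a new session and add it to the pool
    action <- "spawn"
  } else {
    action <- "wait"                 # cap reached: wait until a session idles
  }
  list(pool = pool, action = action)
}

# Three workers ask for a session while none is ever released, with a cap of 2.
pool <- list(idle = 0, active = 0, cap = 2)
actions <- character(0)
for (i in 1:3) {
  res <- acquire_session(pool)
  pool <- res$pool
  actions <- c(actions, res$action)
}
actions   # "spawn" "spawn" "wait"
```

In this model the third request has to wait because both sessions are busy and the cap has been reached, which mirrors the sleeping thread worker in the description.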

Background R sessions communicate via polars arrow IPC (series/vectors) or R serialize + shared memory buffers via the rust crate ipc-channel. Multi-process communication has overhead because all data must be serialized/deserialized and sent via buffers. Using multiple R sessions will likely only give a speed-up in a low-I/O, high-CPU scenario. Native polars query syntax runs in threads and has no such overhead.

Examples

library(polars)

options(polars.maintain_order = TRUE, polars.strictly_immutable = FALSE)
polars_options()
#> Options:
#> ========                         
#> debug_polars        FALSE
#> df_knitr_print       auto
#> do_not_repeat_call  FALSE
#> int64_conversion   double
#> limit_max_threads   FALSE
#> maintain_order       TRUE
#> no_messages         FALSE
#> rpool_active            0
#> rpool_cap               4
#> strictly_immutable  FALSE
#> 
#> See `?polars_options` for the definition of all options.
# option checks are run when calling polars_options(), not when setting
# options
options(polars.maintain_order = 42, polars.int64_conversion = "foobar")
tryCatch(
  polars_options(),
  error = function(e) print(e)
)
#> <simpleError: Some polars options have an unexpected value:
#> - maintain_order: input must be TRUE or FALSE.
#> - int64_conversion: input must be one of "float", "string", "bit64".
#> 
#> More info at `?polars::polars_options`.>
# reset options to their default value
polars_options_reset()