Pivot data from long to wide

Description

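Pivot data from long to wide: the unique values found in columns become the headers of new columns, which are filled with values for each index key.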

Usage

<DataFrame>$pivot(
  values,
  index,
  columns,
  ...,
  aggregate_function = NULL,
  maintain_order = TRUE,
  sort_columns = FALSE,
  separator = "_"
)

Arguments

values Column values to aggregate. Can be multiple columns if the columns argument contains multiple columns as well.
index One or multiple keys to group by.
columns Name of the column(s) whose values will be used as the header of the output DataFrame.
... Not used.
aggregate_function One of:
  • a string naming the aggregation to apply, such as ‘first’, ‘sum’, ‘max’, ‘min’, ‘mean’, ‘median’, ‘last’, or ‘count’,
  • an Expr, e.g. pl$element()$sum()
maintain_order Sort the grouped keys so that the output order is predictable.
sort_columns Sort the transposed columns by name. Default is by order of discovery.
separator Used as separator/delimiter in generated column names.
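
The two accepted forms of aggregate_function can be compared side by side. The following is a minimal sketch, not taken from the package's own examples: the data frame and its column names (key, group, value) are made up for illustration, and it assumes the string ‘sum’ and the Expr pl$element()$sum() describe the same aggregation.

```r
library(polars)

# Hypothetical data: the ("a", "x") pair occurs twice, so that cell
# has to be reduced to a single value by an aggregation.
df = pl$DataFrame(
  key = c("a", "a", "a", "b"),
  group = c("x", "x", "y", "x"),
  value = c(1, 2, 3, 4)
)

# String form: names the aggregation.
by_name = df$pivot(
  values = "value", index = "key", columns = "group",
  aggregate_function = "sum"
)

# Expr form: the same aggregation written against pl$element().
by_expr = df$pivot(
  values = "value", index = "key", columns = "group",
  aggregate_function = pl$element()$sum()
)

# Compare the two results as base R data frames.
identical(as.data.frame(by_name), as.data.frame(by_expr))
```

The Expr form is the more general of the two, since it can chain further operations on each group's elements before reducing them.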

Value

DataFrame

Examples

library(polars)

df = pl$DataFrame(
  foo = c("one", "one", "one", "two", "two", "two"),
  bar = c("A", "B", "C", "A", "B", "C"),
  baz = c(1, 2, 3, 4, 5, 6)
)
df
#> shape: (6, 3)
#> ┌─────┬─────┬─────┐
#> │ foo ┆ bar ┆ baz │
#> │ --- ┆ --- ┆ --- │
#> │ str ┆ str ┆ f64 │
#> ╞═════╪═════╪═════╡
#> │ one ┆ A   ┆ 1.0 │
#> │ one ┆ B   ┆ 2.0 │
#> │ one ┆ C   ┆ 3.0 │
#> │ two ┆ A   ┆ 4.0 │
#> │ two ┆ B   ┆ 5.0 │
#> │ two ┆ C   ┆ 6.0 │
#> └─────┴─────┴─────┘
df$pivot(
  values = "baz", index = "foo", columns = "bar"
)
#> shape: (2, 4)
#> ┌─────┬─────┬─────┬─────┐
#> │ foo ┆ A   ┆ B   ┆ C   │
#> │ --- ┆ --- ┆ --- ┆ --- │
#> │ str ┆ f64 ┆ f64 ┆ f64 │
#> ╞═════╪═════╪═════╪═════╡
#> │ one ┆ 1.0 ┆ 2.0 ┆ 3.0 │
#> │ two ┆ 4.0 ┆ 5.0 ┆ 6.0 │
#> └─────┴─────┴─────┴─────┘
# Run an expression as aggregation function
df = pl$DataFrame(
  col1 = c("a", "a", "a", "b", "b", "b"),
  col2 = c("x", "x", "x", "x", "y", "y"),
  col3 = c(6, 7, 3, 2, 5, 7)
)
df
#> shape: (6, 3)
#> ┌──────┬──────┬──────┐
#> │ col1 ┆ col2 ┆ col3 │
#> │ ---  ┆ ---  ┆ ---  │
#> │ str  ┆ str  ┆ f64  │
#> ╞══════╪══════╪══════╡
#> │ a    ┆ x    ┆ 6.0  │
#> │ a    ┆ x    ┆ 7.0  │
#> │ a    ┆ x    ┆ 3.0  │
#> │ b    ┆ x    ┆ 2.0  │
#> │ b    ┆ y    ┆ 5.0  │
#> │ b    ┆ y    ┆ 7.0  │
#> └──────┴──────┴──────┘
df$pivot(
  index = "col1",
  columns = "col2",
  values = "col3",
  aggregate_function = pl$element()$tanh()$mean()
)
#> shape: (2, 3)
#> ┌──────┬──────────┬──────────┐
#> │ col1 ┆ x        ┆ y        │
#> │ ---  ┆ ---      ┆ ---      │
#> │ str  ┆ f64      ┆ f64      │
#> ╞══════╪══════════╪══════════╡
#> │ a    ┆ 0.998347 ┆ null     │
#> │ b    ┆ 0.964028 ┆ 0.999954 │
#> └──────┴──────────┴──────────┘
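
The effect of sort_columns is easiest to see when the values of the pivoted column do not first appear in alphabetical order. The sketch below uses a made-up, single-group data frame for illustration; the names discovered and sorted are hypothetical.

```r
library(polars)

# Hypothetical data: the values of `bar` first appear in the order C, A, B.
df = pl$DataFrame(
  foo = c("one", "one", "one"),
  bar = c("C", "A", "B"),
  baz = c(3, 1, 2)
)

# Default (sort_columns = FALSE): the new columns follow the order in
# which the values of `bar` are discovered.
discovered = df$pivot(values = "baz", index = "foo", columns = "bar")
discovered$columns

# sort_columns = TRUE: the new columns are ordered by name instead.
sorted = df$pivot(
  values = "baz", index = "foo", columns = "bar",
  sort_columns = TRUE
)
sorted$columns
```

Sorting by name can be useful when the output schema needs to be stable regardless of row order in the input.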