DuplicateError
The issue is that the output name of when/then comes from the first .then() branch.
In this case 1 is parsed as pl.lit(1) which has the default name of literal.
pl.when(cs.numeric() > 50).then(1)
You can think of it as there being an implicit alias() call with the name from the then branch.
pl.when(cs.numeric() > 50).then(pl.lit(1)).alias("literal")
Expression expansion then turns this into 3 separate expressions, one per numeric column, all with the output name "literal", so you get a DuplicateError.
pl.when(pl.col.j > 50).then(pl.lit(1)).alias("literal"),
pl.when(pl.col.k > 50).then(pl.lit(1)).alias("literal"),
pl.when(pl.col.l > 50).then(pl.lit(1)).alias("literal")
Keep name with a 'literal' then
.name.keep() can be added to use the column name as the output name instead.
df.select(
    pl.when(cs.numeric() < 50)
    .then(1)
    .otherwise(2)
    .name.keep()
)
shape: (10, 3)
┌─────┬─────┬─────┐
│ j   ┆ k   ┆ l   │
│ --- ┆ --- ┆ --- │
│ i32 ┆ i32 ┆ i32 │
╞═════╪═════╪═════╡
│ 2   ┆ 2   ┆ 2   │
│ 1   ┆ 1   ┆ 2   │
│ 1   ┆ 1   ┆ 2   │
│ 2   ┆ 2   ┆ 1   │
│ 2   ┆ 1   ┆ 2   │
│ 2   ┆ 1   ┆ 2   │
│ 2   ┆ 1   ┆ 2   │
│ 2   ┆ 2   ┆ 2   │
│ 2   ┆ 1   ┆ 1   │
│ 2   ┆ 2   ┆ 1   │
└─────┴─────┴─────┘
Keep name with a 'column' then
As noted in the comments, if you use a col() inside then() you will get an error, even with .name.keep().
df.select(
    pl.when(cs.numeric() < 50).then(pl.col.j).otherwise(2).name.keep()
)
# DuplicateError: projections contained duplicate output name 'j'.
What you can do is nest another when/then inside your then() branch.
df.select(
    pl.when(cs.numeric() < 50)
    .then(pl.when(False).then(cs.numeric() < 50).otherwise(pl.col.j))
    .otherwise(2)
)
shape: (10, 3)
┌─────┬─────┬─────┐
│ j   ┆ k   ┆ l   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 2   ┆ 2   ┆ 2   │
│ 26  ┆ 26  ┆ 2   │
│ 12  ┆ 12  ┆ 2   │
│ 2   ┆ 2   ┆ 92  │
│ 2   ┆ 95  ┆ 2   │
│ 2   ┆ 75  ┆ 2   │
│ 2   ┆ 61  ┆ 2   │
│ 2   ┆ 2   ┆ 2   │
│ 2   ┆ 73  ┆ 73  │
│ 2   ┆ 2   ┆ 66  │
└─────┴─────┴─────┘
You need the "column selector" inside the then() branch in order to retain the name.
pl.when(False) can be used to "broadcast" column j values into the other columns while keeping their name.
df.select(pl.when(False).then(cs.numeric() < 50).otherwise(pl.col.j))
shape: (10, 3)
┌─────┬─────┬─────┐
│ j   ┆ k   ┆ l   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 71  ┆ 71  ┆ 71  │
│ 26  ┆ 26  ┆ 26  │
│ 12  ┆ 12  ┆ 12  │
│ 92  ┆ 92  ┆ 92  │
│ 95  ┆ 95  ┆ 95  │
│ 75  ┆ 75  ┆ 75  │
│ 61  ┆ 61  ┆ 61  │
│ 74  ┆ 74  ┆ 74  │
│ 73  ┆ 73  ┆ 73  │
│ 66  ┆ 66  ┆ 66  │
└─────┴─────┴─────┘
Technically you only need .then(cs.numeric()) for the "inner" when, but I've just repeated the predicate.