DataFrame.Expr Module

Composable column expressions for DataFrame operations.

The DataFrame.Expr module provides a functional, pipe-friendly API for building column expressions that compile directly to Polars with SIMD optimization and parallel execution. Use expressions with DataFrame.applyExprs, DataFrame.filter, and DataFrame.agg.

Why Expressions?

  1. Performance: Expressions compile directly to Polars, with no slower fallback path
  2. Composability: Expressions are values that can be bound, passed, and composed
  3. Window functions: Impossible with closures, natural with expressions
  4. Aggregations: Sum, mean, count as composable operations

Common patterns

import DataFrame
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Add computed columns (the tuple key names each output column)
df |> DataFrame.applyExprs
    [ (@total, @price |> Expr.mul @qty)
    , (@with_tax, @price |> Expr.mul 1.1)
    ]

-- Filter with expressions
df |> DataFrame.filter (@status |> Expr.eq "active")

-- Conditional logic
Expr.cond
    [ (@age |> Expr.lt 18, "minor")
    , (@age |> Expr.lt 65, "adult")
    ] "senior"

-- Window functions
@sales |> Expr.sum |> Expr.over ["region"] |> Expr.named "region_total"

Aggregation with groupBy

df
    |> DataFrame.groupBy [@department]
    |> DataFrame.agg
        [ @salary |> Expr.mean |> Expr.named "avg_salary"
        , @id |> Expr.count |> Expr.named "employee_count"
        ]

Functions

Constructors

DataFrame.Expr.col

DataFrameColumn -> Expr

Reference a DataFrame column by name. Use the @name syntax to pass a column reference.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Reference a column (preferred: use @name directly)
@name

-- Explicit col wrapper (equivalent to @name)
col @name

-- Use in arithmetic
@price |> Expr.mul (@quantity)

-- Use in comparisons (scalar auto-coerced)
@age |> Expr.gte 18

Notes: Accepts a DataFrameColumn reference (@name). Column names are case-sensitive and must exactly match the DataFrame column names. In most cases you can use @name directly without wrapping in col.

See also: lit, named

DataFrame.Expr.lit

a -> Expr

Create a literal (constant) expression from a value. Supports Int, Float, String, Bool, and Unit (null).

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Integer literal
lit 42

-- Float literal
lit 3.14159

-- String literal
lit "active"

-- Boolean literal
lit True

-- Use in expressions (explicit lit is optional — scalars auto-coerce)
@price |> Expr.mul (lit 1.1)  -- explicit lit still works
@price |> Expr.mul 1.1        -- equivalent shorthand

Notes: Unit values become null (the equivalent of SQL NULL). Scalar values (Int, Float, String, Bool) auto-coerce in binary Expr operations, so explicit lit is optional in most cases.

See also: col

DataFrame.Expr.named

String -> Expr -> Expr

Assign a name (alias) to an expression's output column. Required in agg to define the result column name. Not needed for applyExprs — use the tuple key (@col, expr) instead.

Example:
import DataFrame.Expr as Expr

-- Name the output column of an expression
@price |> Expr.mul @qty |> Expr.named "total"

Notes: Column name cannot be empty. The alias only affects the output column name, not the expression itself.

See also: col, DataFrame.agg

Arithmetic

DataFrame.Expr.add

a -> Expr -> Expr

Add two expressions element-wise. Works with numeric columns (Int, Float). The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Add two columns
@a |> Expr.add (@b)

-- Add a constant (scalar auto-coerced)
@price |> Expr.add 10

-- Chain operations
@a |> Expr.add (@b) |> Expr.add 1

Notes: Follows pipe convention: lhs |> add rhs = lhs + rhs. Type coercion follows Polars rules (Int + Float = Float). Scalars (Int, Float, String, Bool) are auto-coerced to lit.

See also: sub, mul, div

DataFrame.Expr.sub

a -> Expr -> Expr

Subtract two expressions element-wise (lhs - rhs). The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Column difference
@revenue |> Expr.sub (@cost)

-- Subtract a constant (scalar auto-coerced)
@score |> Expr.sub 5

Notes: Follows pipe convention: lhs |> sub rhs = lhs - rhs.

See also: add, mul, div

DataFrame.Expr.mul

a -> Expr -> Expr

Multiply two expressions element-wise. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Calculate total
@price |> Expr.mul (@quantity)

-- Apply percentage (scalar auto-coerced)
@salary |> Expr.mul 1.05  -- 5% raise

-- Named result
@hours |> Expr.mul (@rate) |> Expr.named "pay"

Notes: Follows pipe convention: lhs |> mul rhs = lhs * rhs.

See also: add, sub, div, pow

DataFrame.Expr.div

a -> Expr -> Expr

Divide two expressions element-wise (lhs / rhs). Returns Float for integer division.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Calculate ratio
@completed |> Expr.div (@total)

-- Per-unit value
@total_cost |> Expr.div (@quantity)

-- Normalize (0-1 range)
@value |> Expr.div (@max_value)

Notes: Division by zero returns null (not an error). Integer division produces Float result.
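
As a rough illustration of the two rules in this note, the plain-Python sketch below mimics the behaviour row by row. It is an analogy only: div_or_null is a hypothetical helper, not part of DataFrame.Expr, and None stands in for null.

```python
from typing import Optional


def div_or_null(lhs: float, rhs: float) -> Optional[float]:
    """Divide lhs by rhs, yielding None (null) instead of raising on a zero divisor."""
    if rhs == 0:
        return None
    # True division: two integers still produce a float result.
    return lhs / rhs


completed = [3, 5, 0]
total = [4, 0, 2]

# Element-wise division over two "columns".
ratios = [div_or_null(c, t) for c, t in zip(completed, total)]
print(ratios)
```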

See also: mul, mod

DataFrame.Expr.mod

a -> Expr -> Expr

Modulo (remainder) of two expressions (lhs % rhs). The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Check if even (scalars auto-coerced)
@n |> Expr.mod 2 |> Expr.eq 0

-- Get last digit
@id |> Expr.mod 10

Notes: Result has the same sign as the dividend (lhs).
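
The sign rule is easy to get wrong when porting code from other languages. The plain-Python sketch below is only an analogy, not the module's implementation: math.fmod keeps the sign of the dividend, as the note describes, while Python's own % operator follows the divisor.

```python
import math

# Remainder that keeps the sign of the dividend (the rule stated in the note).
dividend_sign = math.fmod(-7, 3)

# Python's % operator follows the sign of the divisor instead.
divisor_sign = -7 % 3

print(dividend_sign, divisor_sign)
```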

See also: div

DataFrame.Expr.pow

a -> Expr -> Expr

Raise base to exponent power (base ^ exp). The exponent auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Square a column (scalar auto-coerced)
@x |> Expr.pow 2

-- Approximate cube root (0.333333 approximates 1/3)
@volume |> Expr.pow 0.333333

-- Compound interest
@principal |> Expr.mul (lit 1.05 |> Expr.pow (@years))

Notes: Follows pipe convention: base |> pow exp = base ^ exp.

See also: sqrt, mul

Comparison

DataFrame.Expr.eq

a -> Expr -> Expr

Test equality (==). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Filter by status (scalar auto-coerced)
@status |> Expr.eq "active"

-- Compare columns
@actual |> Expr.eq (@expected)

Notes: Null values: null == null returns null, not True. Use isNull for null checks.

See also: neq, gt, lt, isNull

DataFrame.Expr.neq

a -> Expr -> Expr

Test inequality (!=). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Exclude a status (scalar auto-coerced)
@status |> Expr.neq "deleted"

-- Combine with boolean ops
@type |> Expr.neq "test" |> Expr.and (@active |> Expr.eq True)

Notes: Null values: null != value returns null, not True.

See also: eq, isNotNull

DataFrame.Expr.gt

a -> Expr -> Expr

Greater than comparison (>). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Age filter (scalar auto-coerced)
@age |> Expr.gt 18

-- Compare columns
@revenue |> Expr.gt (@cost)

-- Chain with boolean ops
@score |> Expr.gt 90 |> Expr.and (@passed |> Expr.eq True)

Notes: Follows pipe convention: lhs |> gt rhs = lhs > rhs.

See also: gte, lt, lte

DataFrame.Expr.gte

a -> Expr -> Expr

Greater than or equal comparison (>=). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Minimum threshold (scalar auto-coerced)
@quantity |> Expr.gte 10

-- Year threshold (integer column)
@year |> Expr.gte 2020

Notes: Follows pipe convention: lhs |> gte rhs = lhs >= rhs.

See also: gt, lte

DataFrame.Expr.lt

a -> Expr -> Expr

Less than comparison (<). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Below threshold (scalar auto-coerced)
@temperature |> Expr.lt 0

-- Range check (combine with gt)
let inRange = @x |> Expr.gt 0 |> Expr.and (@x |> Expr.lt 100)

Notes: Follows pipe convention: lhs |> lt rhs = lhs < rhs.

See also: lte, gt, gte

DataFrame.Expr.lte

a -> Expr -> Expr

Less than or equal comparison (<=). Returns a boolean expression. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Maximum threshold (scalar auto-coerced)
@price |> Expr.lte 100

-- Cohort bucketing with cond (scalars auto-coerced)
Expr.cond
    [ (@age |> Expr.lte 17, "minor")
    , (@age |> Expr.lte 64, "adult")
    ]
    "senior"

Notes: Follows pipe convention: lhs |> lte rhs = lhs <= rhs.

See also: lt, gte

Boolean

DataFrame.Expr.and

a -> Expr -> Expr

Logical AND of two boolean expressions. Both must be true for result to be true. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Combine conditions (scalars auto-coerced)
let isActiveAdult =
    @age |> Expr.gte 18
    |> Expr.and (@status |> Expr.eq "active")

-- Multiple conditions
@a |> Expr.gt 0
    |> Expr.and (@b |> Expr.gt 0)
    |> Expr.and (@c |> Expr.gt 0)

Notes: Short-circuit evaluation is not guaranteed. Null AND True = Null, Null AND False = False.

See also: or, not

DataFrame.Expr.or

a -> Expr -> Expr

Logical OR of two boolean expressions. Either being true makes result true. The first argument auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Either condition (scalars auto-coerced)
let isSpecial =
    @status |> Expr.eq "vip"
    |> Expr.or (@status |> Expr.eq "admin")

-- Fallback check
@primary_email |> Expr.isNotNull
    |> Expr.or (@backup_email |> Expr.isNotNull)

Notes: Null OR True = True, Null OR False = Null.

See also: and, not

DataFrame.Expr.not

Expr -> Expr

Logical NOT (negation) of a boolean expression.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Negate a condition
Expr.not (@is_deleted)

-- Combine with and (scalars auto-coerced)
Expr.not (@status |> Expr.eq "spam")
    |> Expr.and (@score |> Expr.gt 0)

Notes: NOT Null = Null.
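
The null rules listed in the notes for and, or, and not form a three-valued logic. The plain-Python sketch below encodes those rules with None standing in for null; and3, or3, and not3 are hypothetical helper names used for illustration, not library functions.

```python
from typing import Optional

Tri = Optional[bool]  # None stands in for null


def and3(a: Tri, b: Tri) -> Tri:
    """AND: a definite False wins, otherwise any null makes the result null."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Tri, b: Tri) -> Tri:
    """OR: a definite True wins, otherwise any null makes the result null."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a: Tri) -> Tri:
    """NOT: null stays null."""
    return None if a is None else not a


print(and3(None, True), and3(None, False), or3(None, True), or3(None, False), not3(None))
```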

See also: and, or

Aggregation

DataFrame.Expr.sum

Expr -> Expr

Sum of all values in a column. Use with groupBy/agg for group-wise sums.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { category = "A", sales = 100 }
    , { category = "A", sales = 200 }
    , { category = "B", sales = 150 }
    ]
let totalExpr = @sales |> Expr.sum |> Expr.named "total_sales"

df |> DataFrame.groupBy [@category] |> DataFrame.agg [totalExpr]

Notes: Null values are ignored (not treated as 0). Returns null for empty groups.

See also: mean, count, over

DataFrame.Expr.mean

Expr -> Expr

Arithmetic mean (average) of values. Returns Float.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { department = "eng", salary = 90000 }
    , { department = "eng", salary = 110000 }
    , { department = "sales", salary = 75000 }
    ]
let avgExpr = @salary |> Expr.mean |> Expr.named "avg_salary"

df |> DataFrame.groupBy [@department] |> DataFrame.agg [avgExpr]

Notes: Null values are excluded from both numerator and count. Empty groups return null.
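
To make the null handling concrete, here is a plain-Python sketch of a mean that drops nulls from both the sum and the count. mean_ignoring_nulls is a hypothetical helper, and treating an all-null group the same way as an empty one is an assumption of this sketch rather than something the note states.

```python
from typing import Optional, Sequence


def mean_ignoring_nulls(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the non-null values; return None when nothing is left to average."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


# The null is excluded from the count, so the divisor is 2, not 3.
avg_salary = mean_ignoring_nulls([90000, None, 110000])
empty_group = mean_ignoring_nulls([])
print(avg_salary, empty_group)
```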

See also: sum, median, std

DataFrame.Expr.min

Expr -> Expr

Minimum value in a column. Works with numeric, string, and date types.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { product = "apple", price = 1 }
    , { product = "apple", price = 2 }
    , { product = "banana", price = 3 }
    ]
let minExpr = @price |> Expr.min |> Expr.named "lowest_price"

df |> DataFrame.groupBy [@product] |> DataFrame.agg [minExpr]

Notes: Null values are ignored. Returns null for empty groups.

See also: max, first

DataFrame.Expr.max

Expr -> Expr

Maximum value in a column. Works with numeric, string, and date types.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { region = "north", score = 80 }
    , { region = "north", score = 95 }
    , { region = "south", score = 70 }
    ]
let maxExpr = @score |> Expr.max |> Expr.named "top_score"

df |> DataFrame.groupBy [@region] |> DataFrame.agg [maxExpr]

Notes: Null values are ignored. Returns null for empty groups.

See also: min, last

DataFrame.Expr.count

Expr -> Expr

Count of non-null values in a column.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { status = "active", id = 1 }
    , { status = "active", id = 2 }
    , { status = "inactive", id = 3 }
    ]
let cntExpr = @id |> Expr.count |> Expr.named "n"

df |> DataFrame.groupBy [@status] |> DataFrame.agg [cntExpr]

Notes: Counts non-null values only. For total rows including nulls, count a non-nullable column like id.

See also: sum, first, last

DataFrame.Expr.first

Expr -> Expr

First value in a group. Order depends on the DataFrame's current row order.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { customer = "alice", order_id = 10 }
    , { customer = "alice", order_id = 20 }
    , { customer = "bob", order_id = 30 }
    ]
let firstExpr = @order_id |> Expr.first |> Expr.named "first_order"

df |> DataFrame.groupBy [@customer] |> DataFrame.agg [firstExpr]

Notes: Returns first non-null value. Sort the DataFrame first if you need a specific ordering.

See also: last, min

DataFrame.Expr.last

Expr -> Expr

Last value in a group. Order depends on the DataFrame's current row order.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { user = "alice", action = "login" }
    , { user = "alice", action = "purchase" }
    , { user = "bob", action = "login" }
    ]
let lastExpr = @action |> Expr.last |> Expr.named "last_action"

df |> DataFrame.groupBy [@user] |> DataFrame.agg [lastExpr]

Notes: Returns last non-null value. Sort the DataFrame first if you need a specific ordering.

See also: first, max

DataFrame.Expr.std

Expr -> Expr

Sample standard deviation (with Bessel's correction, ddof=1).

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { treatment = "A", response = 10 }
    , { treatment = "A", response = 20 }
    , { treatment = "B", response = 15 }
    ]
let stdExpr = @response |> Expr.std |> Expr.named "response_std"

df |> DataFrame.groupBy [@treatment] |> DataFrame.agg [stdExpr]

Notes: Uses ddof=1 (sample standard deviation). Requires at least 2 values.

See also: var, mean

DataFrame.Expr.var

Expr -> Expr

Sample variance (with Bessel's correction, ddof=1).

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Calculate variance
@measurement |> Expr.var |> Expr.named "measurement_var"

Notes: Uses ddof=1 (sample variance). Variance = std^2.
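
The ddof=1 convention means the squared deviations are divided by n - 1 rather than n. The sketch below checks that relationship with Python's standard statistics module, whose variance and stdev functions also use the sample (n - 1) form. The data values are made up for illustration.

```python
import statistics

responses = [10, 20, 15, 25]

# Sample variance and standard deviation (both divide by n - 1).
sample_var = statistics.variance(responses)
sample_std = statistics.stdev(responses)

# The same variance written out by hand.
n = len(responses)
mean = sum(responses) / n
manual_var = sum((x - mean) ** 2 for x in responses) / (n - 1)

print(sample_var, sample_std ** 2, manual_var)
```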

See also: std, mean

DataFrame.Expr.median

Expr -> Expr

Median (50th percentile) of values. More robust to outliers than mean.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { region = "north", price = 10 }
    , { region = "north", price = 20 }
    , { region = "south", price = 15 }
    ]
let meanExpr = @price |> Expr.mean |> Expr.named "mean_price"
let medExpr = @price |> Expr.median |> Expr.named "median_price"

df |> DataFrame.groupBy [@region] |> DataFrame.agg [meanExpr, medExpr]

Notes: For even-length groups, returns average of the two middle values.
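
A quick plain-Python check of the even-length rule, using the standard statistics module (which follows the same "average the two middle values" convention). The numbers are invented to show how a single outlier moves the mean far more than the median.

```python
import statistics

prices = [10, 20, 30, 1000]  # even length, one large outlier

# Median of an even-length group: average of the two middle values, (20 + 30) / 2.
median_price = statistics.median(prices)

# The mean is pulled toward the outlier.
mean_price = statistics.mean(prices)

print(median_price, mean_price)
```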

See also: mean, min, max

String

DataFrame.Expr.strLength

Expr -> Expr

Length of string in characters (not bytes). Returns Int.

Example:
import DataFrame
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { code = "AB123" }
    , { code = "XY" }
    , { code = "LMN45" }
    ]

df |> DataFrame.filter (@code |> Expr.strLength |> Expr.eq (lit 5))

Notes: Counts Unicode characters, not bytes. Null strings return null.
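
The difference between characters and bytes only shows up with non-ASCII text. As a plain-Python analogy (not the library's code), len on a str counts code points while the UTF-8 encoding of the same string can be longer:

```python
word = "na\u00efve"  # "naive" with a diaeresis on the i

# Character count: what the note describes for strLength.
char_count = len(word)

# Byte count: the accented letter takes two bytes in UTF-8.
byte_count = len(word.encode("utf-8"))

print(char_count, byte_count)
```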

See also: strUpper, strLower, strTrim

DataFrame.Expr.strUpper

Expr -> Expr

Convert string to uppercase.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Normalize to uppercase
@country_code |> Expr.strUpper |> Expr.named "country_code_upper"

-- Case-insensitive comparison
@status |> Expr.strUpper |> Expr.eq (lit "ACTIVE")

Notes: Uses Unicode case mapping rules.

See also: strLower

DataFrame.Expr.strLower

Expr -> Expr

Convert string to lowercase.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Normalize to lowercase
@email |> Expr.strLower |> Expr.named "email_normalized"

Notes: Uses Unicode case mapping rules.

See also: strUpper

DataFrame.Expr.strTrim

Expr -> Expr

Remove leading and trailing whitespace from string.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Clean up user input
@user_input |> Expr.strTrim |> Expr.named "cleaned_input"

Notes: Removes spaces, tabs, newlines, and other Unicode whitespace.

See also: strReplace

DataFrame.Expr.strContains

String -> Expr -> Expr

Check if string contains the given pattern. Returns boolean.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { email = "alice@company.com" }
    , { email = "bob@gmail.com" }
    , { email = "carol@company.com" }
    ]

df |> DataFrame.filter (@email |> Expr.strContains "@company.com")

Notes: Literal string matching (not regex). Case-sensitive.
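
Literal matching matters whenever the pattern contains regex metacharacters such as a dot. The plain-Python sketch below contrasts a literal substring test with a regex search on the same pattern; it is an analogy for the distinction between strContains and the regex-based functions, and the second address is a made-up value chosen to expose the difference.

```python
import re

emails = ["alice@company.com", "bob@companyXcom"]
pattern = "@company.com"

# Literal matching: the dot must be a real dot.
literal_hits = [pattern in e for e in emails]

# Regex matching: the unescaped dot matches any character, including "X".
regex_hits = [re.search(pattern, e) is not None for e in emails]

print(literal_hits, regex_hits)
```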

See also: strStartsWith, strEndsWith

DataFrame.Expr.strStartsWith

String -> Expr -> Expr

Check if string starts with the given prefix. Returns boolean.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { name = "Dr. Alice" }
    , { name = "Bob" }
    , { name = "Dr. Carol" }
    ]

df |> DataFrame.filter (@name |> Expr.strStartsWith "Dr.")

Notes: Case-sensitive comparison.

See also: strEndsWith, strContains

DataFrame.Expr.strEndsWith

String -> Expr -> Expr

Check if string ends with the given suffix. Returns boolean.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { url = "https://example.org" }
    , { url = "https://example.com" }
    , { url = "https://wiki.org" }
    ]

df |> DataFrame.filter (@url |> Expr.strEndsWith ".org")

Notes: Case-sensitive comparison.

See also: strStartsWith, strContains

DataFrame.Expr.strReplace

String -> String -> Expr -> Expr

Replace first occurrence of a pattern with replacement string.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Replace substring
@text |> Expr.strReplace "old" "new"

-- Remove prefix
@id |> Expr.strReplace "ID_" ""

Notes: Only replaces the first occurrence. Pattern is literal (not regex).

See also: strTrim

DataFrame.Expr.strSplit

String -> Expr -> Expr

Split string by delimiter into a list. Returns a List column.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Split on delimiter
@text |> Expr.strSplit "_" |> Expr.named "text_parts"

-- Split CSV field
@tags |> Expr.strSplit "," |> Expr.named "tag_list"

-- Split full name
@full_name |> Expr.strSplit " " |> Expr.named "name_parts"

Notes: Returns a List column where each element is a list of string parts.

See also: concatMany, strContains

DataFrame.Expr.concatMany

String -> [Expr] -> Expr

Concatenate multiple string expressions with a separator between them.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ city = "New York", state = "NY" }, { city = "Austin", state = "TX" }]
let cityExpr = @city
let stateExpr = @state
let addrExpr = [cityExpr, stateExpr] |> Expr.concatMany ", "

df |> DataFrame.applyExprs [(@location, addrExpr)]

Notes: Requires at least one expression. Separator is placed between expressions, not at start/end.

See also: strSplit

DataFrame.Expr.strMatches

String -> Expr -> Expr

Test whether each string in a column matches a regex pattern. Returns a Bool column.

Example:
import DataFrame
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ code = "abc123" }, { code = "xyz" }, { code = "007" }]

-- Filter rows where code contains digits
case (df |> DataFrame.filter (@code |> Expr.strMatches "[0-9]+")) of
    Ok filtered -> DataFrame.shape filtered
    Err _ -> (0, 0)

Notes: Uses regex matching (not literal). Non-matching rows return false. An invalid pattern fills the entire result column with null.

See also: strContains, strCapture

DataFrame.Expr.strCapture

String -> Int -> Expr -> Expr

Extract a captured group from the first regex match in each string. Group index 0 returns the entire match; 1+ return capture groups.

Example:
import DataFrame
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ s = "price: 42" }, { s = "count: 100" }, { s = "none" }]

-- Extract capture group 1 (the first run of digits)
case (DataFrame.applyExprs [(@n, @s |> Expr.strCapture "([0-9]+)" 1)] df) of
    Ok df2 -> (DataFrame.column @n df2)?
    Err _ -> []

Notes: Returns a nullable String column (Nothing when there is no match). Group index 0 is the entire match. An invalid pattern fills the entire column with null.
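
The difference between group 0 and group 1 can be sketched with Python's re module. This is an analogy under the assumption that the simple pattern used here behaves the same in both regex engines; capture is a hypothetical helper and None stands in for a missing match.

```python
import re
from typing import Optional


def capture(pattern: str, group: int, text: str) -> Optional[str]:
    """Return the requested group of the first match, or None when nothing matches."""
    match = re.search(pattern, text)
    return match.group(group) if match else None


rows = ["price: 42", "count: 100", "none"]
pattern = r": ([0-9]+)"

# Group 0 is the whole match, including the colon and space.
whole = [capture(pattern, 0, s) for s in rows]

# Group 1 is only the parenthesised digits.
digits = [capture(pattern, 1, s) for s in rows]

print(whole, digits)
```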

See also: strMatches, strCaptureAll

DataFrame.Expr.strCaptureAll

String -> Expr -> Expr

Extract all non-overlapping regex matches from each string. Returns a List column.

Example:
import DataFrame
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "xyz" }]

-- Extract all digit sequences per row
case (DataFrame.applyExprs [(@numbers, @text |> Expr.strCaptureAll "[0-9]+")] df) of
    Ok df2 -> (DataFrame.column @numbers df2)?
    Err _ -> []

Notes: Returns a List[String] column. Rows with no match return an empty list, not null. An invalid pattern fills the entire column with null.

See also: strCapture, strSplit

DataFrame.Expr.strReplaceAll

String -> String -> Expr -> Expr

Replace all regex matches in each string with a replacement string.

Example:
import DataFrame
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "hello" }]

-- Replace all digits with X
case (DataFrame.applyExprs [(@r, @text |> Expr.strReplaceAll "[0-9]" "X")] df) of
    Ok df2 -> (DataFrame.column @r df2)?
    Err _ -> []

Notes: Uses regex matching. An invalid pattern causes applyExprs to return Err. Unlike strReplace, replaces all occurrences, not just the first.
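
The contrast with strReplace (literal, first occurrence only) can be sketched in plain Python. This is an analogy, not the library's implementation: str.replace with a count of 1 plays the role of strReplace, and re.sub plays the role of strReplaceAll.

```python
import re

# Literal replacement of the first occurrence only (strReplace-like).
first_only = "ID_ID_7".replace("ID_", "", 1)

# Regex replacement of every match (strReplaceAll-like).
all_matches = re.sub(r"[0-9]", "X", "a1b2c3")

print(first_only, all_matches)
```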

See also: strReplace, strMatches

DataFrame.Expr.strCount

String -> Expr -> Expr

Count non-overlapping regex matches in each string. Returns an Int column.

Example:
import DataFrame
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "abc" }]

-- Count digit occurrences per row
case (DataFrame.applyExprs [(@n, @text |> Expr.strCount "[0-9]")] df) of
    Ok df2 -> (DataFrame.column @n df2)?
    Err _ -> []

Notes: Returns a UInt32 column mapped to Int. Non-matching rows return 0. An invalid pattern causes applyExprs to return Err.
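
"Non-overlapping" means that once a match is consumed, scanning resumes after it. A plain-Python sketch with re.findall shows the effect; it assumes these simple patterns behave identically in both regex engines and is not the library's code.

```python
import re

rows = ["a1b2c3", "abc"]

# One count per row; rows without a match give 0.
digit_counts = [len(re.findall(r"[0-9]", s)) for s in rows]

# "aaaa" contains "aa" at three positions, but only two non-overlapping matches.
non_overlapping = len(re.findall(r"aa", "aaaa"))

print(digit_counts, non_overlapping)
```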

See also: strMatches, strCaptureAll

Math

DataFrame.Expr.abs

Expr -> Expr

Absolute value. Works with Int and Float.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Absolute difference
@actual |> Expr.sub (@predicted) |> Expr.abs |> Expr.named "abs_error"

Notes: Returns same type as input.

See also: sqrt, round

DataFrame.Expr.sqrt

Expr -> Expr

Square root. Returns Float.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Calculate RMSE component
@squared_error |> Expr.sqrt

-- Distance calculation
@x |> Expr.pow (lit 2)
    |> Expr.add (@y |> Expr.pow (lit 2))
    |> Expr.sqrt
    |> Expr.named "distance"

Notes: Negative values return NaN, not an error.

See also: pow, abs

DataFrame.Expr.floor

Expr -> Expr

Round down to nearest integer (toward negative infinity).

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Round down to the nearest integer
@price |> Expr.floor |> Expr.named "price_floor"

-- floor(2.7) = 2, floor(-2.3) = -3

Notes: Returns Float type, not Int. Use for rounding, not type conversion.
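
Rounding toward negative infinity differs from truncation for negative inputs. The plain-Python sketch below shows the difference; the results are converted to float only to mirror the note that floor returns Float, since Python's math.floor itself returns an int.

```python
import math

values = [2.7, -2.3]

# Floor: toward negative infinity, kept as floats to mirror the documented return type.
floored = [float(math.floor(v)) for v in values]

# Truncation: toward zero, which differs for the negative value.
truncated = [float(math.trunc(v)) for v in values]

print(floored, truncated)
```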

See also: ceil, round

DataFrame.Expr.ceil

Expr -> Expr

Round up to nearest integer (toward positive infinity).

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Round up
@quantity |> Expr.ceil |> Expr.named "quantity_ceil"

-- ceil(2.1) = 3, ceil(-2.7) = -2

Notes: Returns Float type, not Int.

See also: floor, round

DataFrame.Expr.round

Int -> Expr -> Expr

Round to specified number of decimal places.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Round to 2 decimal places
@price |> Expr.round 2 |> Expr.named "price_rounded"

-- Round to whole number
@average |> Expr.round 0

Notes: Uses banker's rounding (round half to even). Negative decimals round to tens, hundreds, etc.
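
Python's built-in round follows the same round-half-to-even rule, so it can serve as a quick illustration of what the note describes. This is an analogy rather than the library's code, and the sample values are chosen to be exact ties.

```python
# Exact halves go to the nearest even integer.
halves = [0.5, 1.5, 2.5, 3.5]
rounded_halves = [round(v) for v in halves]

# Negative decimals round to the left of the decimal point (here: hundreds).
amounts = [1250, 1350, 1234]
rounded_hundreds = [round(v, -2) for v in amounts]

print(rounded_halves, rounded_hundreds)
```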

See also: floor, ceil

Null

DataFrame.Expr.fillNull

a -> Expr -> Expr

Replace null values with a default value. The default auto-coerces scalars to lit.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Fill with constant (scalar auto-coerced)
@score |> Expr.fillNull 0

-- Fill with another column
@nickname |> Expr.fillNull (@name)

-- Chain to handle multiple fallbacks
@preferred_email
    |> Expr.fillNull (@work_email)
    |> Expr.fillNull (@personal_email)

Notes: The default expression is only evaluated for null values.
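
Chained fillNull calls behave like a coalesce: each row takes the first non-null value in the chain. A plain-Python sketch, with None standing in for null, a hypothetical fill_null helper, and made-up addresses:

```python
from typing import Optional


def fill_null(default: Optional[str], value: Optional[str]) -> Optional[str]:
    """Return value unless it is None, in which case fall back to default."""
    return default if value is None else value


rows = [
    {"preferred": "a@example.org", "work": None, "personal": None},
    {"preferred": None, "work": "b@example.org", "personal": "b2@example.org"},
    {"preferred": None, "work": None, "personal": "c@example.org"},
    {"preferred": None, "work": None, "personal": None},
]

# preferred, then work, then personal; a row with no address stays null.
emails = [
    fill_null(r["personal"], fill_null(r["work"], r["preferred"]))
    for r in rows
]
print(emails)
```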

See also: isNull, isNotNull

DataFrame.Expr.isNull

Expr -> Expr

Check if value is null. Returns boolean.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Find missing values (Expr-only)
@email |> Expr.isNull

-- Count nulls
@score |> Expr.isNull |> Expr.sum |> Expr.named "missing_count"

Notes: Null represents missing data. Use isNull rather than comparing against a null literal with eq, because an eq comparison involving null yields null, not True.

See also: isNotNull, fillNull

DataFrame.Expr.isNotNull

Expr -> Expr

Check if value is not null. Returns boolean.

Example:
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

-- Find present values (Expr-only)
@email |> Expr.isNotNull

-- Combine with and
@verified_at |> Expr.isNotNull
    |> Expr.and (@active |> Expr.isNotNull)

Notes: Equivalent to isNull |> not, but more readable.

See also: isNull, fillNull

Conditional

DataFrame.Expr.cond

[(Expr, a)] -> a -> Expr

Multi-branch conditional expression (like SQL CASE WHEN). Takes a list of (condition, result) pairs and a default value. All branch values and the default must share the same type — the compiler rejects mismatches like String branches with an Int default. Scalars auto-coerce to lit in both tuple values and the default.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { name = "Alice", age = 15 }
    , { name = "Bob", age = 35 }
    , { name = "Carol", age = 70 }
    ]

-- Scalars auto-coerce in condition values and default
let ageGroup = Expr.cond [(@age |> Expr.lt 18, "minor"), (@age |> Expr.lt 65, "adult")] "senior"
df |> DataFrame.applyExprs [(@age_group, ageGroup)]

Notes: Conditions are evaluated in order; first match wins. The default is required and is used when no condition matches.
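
The "first match wins" rule explains why overlapping conditions are safe as long as they are ordered from most to least specific. The plain-Python sketch below evaluates branches in order for each row; cond here is a hypothetical stand-in written for illustration, not the library function.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]


def cond(branches: List[Tuple[Callable[[Row], bool], Any]], default: Any) -> Callable[[Row], Any]:
    """Build a row function that returns the value of the first branch whose test passes."""
    def evaluate(row: Row) -> Any:
        for test, value in branches:
            if test(row):
                return value
        return default
    return evaluate


age_group = cond(
    [
        (lambda r: r["age"] < 18, "minor"),
        (lambda r: r["age"] < 65, "adult"),  # also true for age 15, but never reached for it
    ],
    "senior",
)

labels = [age_group({"age": a}) for a in [15, 35, 70]]
print(labels)
```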

See also: and, or, eq

Window

DataFrame.Expr.over

[String] -> Expr -> Expr

Apply an expression as a window function over partition columns. Enables row-level access to aggregated values.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Total per group, broadcast to every row
@amount |> Expr.sum |> Expr.over ["customer_id"] |> Expr.named "customer_total"

-- Percentage of group (scalar auto-coerced)
@sales
    |> Expr.div (@sales |> Expr.sum |> Expr.over ["region"])
    |> Expr.mul 100
    |> Expr.named "pct_of_region"

-- Multiple partitions
@value |> Expr.mean |> Expr.over ["year", "category"] |> Expr.named "avg_by_year_cat"

-- Global window (no partitions)
@score |> Expr.mean |> Expr.over [] |> Expr.named "global_avg"

Notes: Empty partition list [] means global window (entire DataFrame). Results are broadcast back to each row.
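
Unlike groupBy/agg, a window keeps one output value per input row. The plain-Python sketch below imitates a partitioned sum and the "percentage of group" pattern from the example; the data is invented and the code is an analogy, not the library's implementation.

```python
from collections import defaultdict

rows = [
    {"region": "north", "sales": 100},
    {"region": "north", "sales": 300},
    {"region": "south", "sales": 200},
]

# Step 1: aggregate per partition.
totals = defaultdict(int)
for r in rows:
    totals[r["region"]] += r["sales"]

# Step 2: broadcast the aggregate back to every row of its partition.
region_total = [totals[r["region"]] for r in rows]

# Each row's share of its partition total, in percent.
pct_of_region = [r["sales"] / totals[r["region"]] * 100 for r in rows]

print(region_total, pct_of_region)
```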

See also: sum, mean, rowNumber, lag, lead

DataFrame.Expr.rowNumber

() -> Expr

Assign sequential row numbers within partitions (1-based). Use with over to partition.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { group = "A", value = 10 }
    , { group = "A", value = 20 }
    , { group = "B", value = 30 }
    ]
let rnExpr = Expr.rowNumber ()

df |> DataFrame.applyExprs [(@rn, rnExpr)]

Notes: Starts at 1. Order depends on current DataFrame sort order.

See also: rank, denseRank, over

DataFrame.Expr.rank

() -> Expr

Rank values with gaps for ties. Ties get the same rank; next rank skips accordingly.

Example:
import DataFrame.Expr as Expr

-- Rank with gaps: [1, 2, 2, 4] for values [10, 20, 20, 30]
Expr.rank () |> Expr.over ["department"] |> Expr.named "sales_rank"

Notes: Use with over for partitioned ranking. Sort the DataFrame first to control rank ordering.

See also: denseRank, rowNumber

DataFrame.Expr.denseRank

() -> Expr

Rank values without gaps for ties. Consecutive ranks even when ties exist.

Example:
import DataFrame.Expr as Expr

-- Dense rank: [1, 2, 2, 3] for values [10, 20, 20, 30]
Expr.denseRank () |> Expr.over ["category"] |> Expr.named "price_rank"

Notes: Unlike rank, denseRank doesn't skip numbers after ties.
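
The two tie-handling schemes can be reproduced in a few lines of plain Python. The helpers below are hypothetical and assume ascending order; they reproduce the sequences quoted in the rank and denseRank examples.

```python
from typing import List


def rank_with_gaps(values: List[int]) -> List[int]:
    """Rank = 1 + number of strictly smaller values, so ties share a rank and leave a gap."""
    return [1 + sum(1 for other in values if other < v) for v in values]


def dense_rank(values: List[int]) -> List[int]:
    """Rank = position among the distinct values, so no numbers are skipped."""
    position = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [position[v] for v in values]


values = [10, 20, 20, 30]
gapped = rank_with_gaps(values)
dense = dense_rank(values)
print(gapped, dense)
```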

See also: rank, rowNumber

DataFrame.Expr.lag

Int -> Expr -> Expr

Get value from n rows before the current row. Useful for comparing to previous values.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Previous row's value
@price |> Expr.lag 1 |> Expr.over ["stock"] |> Expr.named "prev_price"

-- Calculate change from previous
@value |> Expr.sub (@value |> Expr.lag 1 |> Expr.over [])
    |> Expr.named "change"

-- Look back 7 periods
@sales |> Expr.lag 7 |> Expr.over [] |> Expr.named "sales_last_week"

Notes: Returns null for rows without enough history (first n rows). Sort first for meaningful order.

See also: lead, over

DataFrame.Expr.lead

Int -> Expr -> Expr

Get value from n rows after the current row. Useful for comparing to future values.

Example:
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr

-- Next row's value
@price |> Expr.lead 1 |> Expr.over ["stock"] |> Expr.named "next_price"

-- Days until next event
@event_date |> Expr.lead 1 |> Expr.over ["user"]
    |> Expr.sub (@event_date)
    |> Expr.named "days_to_next"

Notes: Returns null for rows without enough future values (last n rows). Sort first for meaningful order.
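
lag and lead are mirror images: one shifts values forward and the other shifts them back, padding with null where no row exists. A plain-Python sketch over a single unpartitioned list, with None standing in for null and hypothetical helper names:

```python
from typing import List, Optional


def lag(values: List[int], n: int) -> List[Optional[int]]:
    """Value from n rows earlier; None for the first n rows."""
    return [values[i - n] if i - n >= 0 else None for i in range(len(values))]


def lead(values: List[int], n: int) -> List[Optional[int]]:
    """Value from n rows later; None for the last n rows."""
    return [values[i + n] if i + n < len(values) else None for i in range(len(values))]


prices = [10, 12, 11, 15]
prev_price = lag(prices, 1)
next_price = lead(prices, 1)

# Change from the previous row; stays null where there is no previous row.
change = [None if p is None else v - p for v, p in zip(prices, prev_price)]

print(prev_price, next_price, change)
```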

See also: lag, over

Other

DataFrame.Expr.in

[a] -> Expr -> Expr

Test membership — whether the column value is in the given list. Returns a boolean expression.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords [{ city = "Berlin" }, { city = "Munich" }, { city = "Hamburg" }]
let expr = @city |> Expr.in ["Berlin", "Munich"]
DataFrame.filter expr df

Notes: The list must contain values of a compatible type (Int, Float, String, Bool). Equivalent to SQL's IN operator.

See also: eq, neq, DataFrame.filter

DataFrame.Expr.quantile

Float -> Expr -> Expr

Compute quantile/percentile of values. Takes a quantile value between 0.0 and 1.0.

Example:
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr

let df = DataFrame.fromRecords
    [ { group = "all", value = 10 }
    , { group = "all", value = 20 }
    , { group = "all", value = 30 }
    , { group = "all", value = 40 }
    ]
let q25Expr = @value |> Expr.quantile 0.25 |> Expr.named "q25"
let q75Expr = @value |> Expr.quantile 0.75 |> Expr.named "q75"

df |> DataFrame.groupBy [@group] |> DataFrame.agg [q25Expr, q75Expr]

Notes: Quantile must be between 0.0 and 1.0. Uses linear interpolation method.
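
Linear interpolation can be sketched as follows. The exact position convention used by the library is not stated here, so this sketch assumes the common one where the quantile sits at index q * (n - 1) of the sorted values; quantile_linear is a hypothetical helper.

```python
import math
from typing import List


def quantile_linear(values: List[float], q: float) -> float:
    """Quantile with linear interpolation between the two nearest sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0.0 and 1.0")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)  # assumed convention, see lead-in
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


values = [10, 20, 30, 40]
q25 = quantile_linear(values, 0.25)
q50 = quantile_linear(values, 0.5)
q75 = quantile_linear(values, 0.75)
print(q25, q50, q75)
```

Under this convention the 0.5 quantile coincides with the median of the same values.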

See also: median, min, max