DataFrame.Expr Module
Composable column expressions for DataFrame operations.
The DataFrame.Expr module provides a functional, pipe-friendly API for building column expressions that compile directly to Polars with SIMD optimization and parallel execution. Use expressions with DataFrame.applyExprs, DataFrame.filter, and DataFrame.agg.
Why Expressions?
- Performance: Expressions compile directly to Polars — always fast, no fallback
- Composability: Expressions are values that can be bound, passed, and composed
- Window functions: Impossible with closures, natural with expressions
- Aggregations: Sum, mean, count as composable operations
Common patterns
import DataFrame
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Add computed columns
df |> DataFrame.applyExprs
    [ col "price" |> Expr.mul (col "qty") |> Expr.named "total"
    , col "price" |> Expr.mul 1.1 |> Expr.named "with_tax"
    ]
-- Filter with expressions
df |> DataFrame.filter (col "status" |> Expr.eq "active")
-- Conditional logic
Expr.cond
    [ (col "age" |> Expr.lt 18, "minor")
    , (col "age" |> Expr.lt 65, "adult")
    ] "senior"
-- Window functions
col "sales" |> Expr.sum |> Expr.over ["region"] |> Expr.named "region_total"
Aggregation with groupBy
df
    |> DataFrame.groupBy ["department"]
    |> DataFrame.agg
        [ col "salary" |> Expr.mean |> Expr.named "avg_salary"
        , col "id" |> Expr.count |> Expr.named "employee_count"
        ]
Functions
Constructors
DataFrame.Expr.col
DataFrameColumn -> Expr
Reference a DataFrame column by name. Use the @name syntax to pass a column reference.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Reference a column (preferred: use @name directly)
@name
-- Explicit col wrapper (equivalent to @name)
col @name
-- Use in arithmetic
@price |> Expr.mul (@quantity)
-- Use in comparisons (scalar auto-coerced)
@age |> Expr.gte 18
Notes: Accepts a DataFrameColumn reference (@name). Column names are case-sensitive and must exactly match the DataFrame column names. In most cases you can use @name directly without wrapping in col.
See also: lit, named
DataFrame.Expr.lit
a -> Expr
Create a literal (constant) expression from a value. Supports Int, Float, String, Bool, and Unit (null).
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Integer literal
lit 42
-- Float literal
lit 3.14159
-- String literal
lit "active"
-- Boolean literal
lit True
-- Use in expressions (explicit lit is optional — scalars auto-coerce)
@price |> Expr.mul (lit 1.1) -- explicit lit still works
@price |> Expr.mul 1.1 -- equivalent shorthand
Notes: Unit values become SQL NULL. Scalar values (Int, Float, String, Bool) auto-coerce in binary Expr operations, so explicit lit is optional in most cases.
See also: col
DataFrame.Expr.named
String -> Expr -> Expr
Assign a name (alias) to an expression's output column. Required in agg to define the result column name. Not needed for applyExprs — use the tuple key (@col, expr) instead.
import DataFrame.Expr as Expr
-- Name the output of a computed expression
@price |> Expr.mul @qty |> Expr.named "total"
Notes: Column name cannot be empty. The alias only affects the output column name, not the expression itself.
See also: col, DataFrame.agg
Arithmetic
DataFrame.Expr.add
a -> Expr -> Expr
Add two expressions element-wise. Works with numeric columns (Int, Float). The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Add two columns
@a |> Expr.add (@b)
-- Add a constant (scalar auto-coerced)
@price |> Expr.add 10
-- Chain operations
@a |> Expr.add (@b) |> Expr.add 1
Notes: Follows pipe convention: lhs |> add rhs = lhs + rhs. Type coercion follows Polars rules (Int + Float = Float). Scalars (Int, Float, String, Bool) are auto-coerced to lit.
See also: sub, mul, div
DataFrame.Expr.sub
a -> Expr -> Expr
Subtract two expressions element-wise (lhs - rhs). The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Column difference
@revenue |> Expr.sub (@cost)
-- Subtract a constant (scalar auto-coerced)
@score |> Expr.sub 5
Notes: Follows pipe convention: lhs |> sub rhs = lhs - rhs.
See also: add, mul, div
DataFrame.Expr.mul
a -> Expr -> Expr
Multiply two expressions element-wise. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Calculate total
@price |> Expr.mul (@quantity)
-- Apply percentage (scalar auto-coerced)
@salary |> Expr.mul 1.05 -- 5% raise
-- Named result
@hours |> Expr.mul (@rate) |> Expr.named "pay"
Notes: Follows pipe convention: lhs |> mul rhs = lhs * rhs.
See also: add, sub, div, pow
DataFrame.Expr.div
a -> Expr -> Expr
Divide two expressions element-wise (lhs / rhs). Returns Float for integer division.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Calculate ratio
@completed |> Expr.div (@total)
-- Per-unit value
@total_cost |> Expr.div (@quantity)
-- Normalize (0-1 range)
@value |> Expr.div (@max_value)
Notes: Division by zero returns null (not an error). Integer division produces Float result.
See also: mul, mod
DataFrame.Expr.mod
a -> Expr -> Expr
Modulo (remainder) of two expressions (lhs % rhs). The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Check if even (scalars auto-coerced)
@n |> Expr.mod 2 |> Expr.eq 0
-- Get last digit
@id |> Expr.mod 10
Notes: Result has the same sign as the dividend (lhs).
See also: div
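The sign rule noted for mod is the truncated-remainder convention. The following is a minimal Python sketch of that rule, not the library's implementation; `truncated_mod` is a hypothetical helper name. It contrasts the rule with Python's own `%` operator, which follows the sign of the divisor instead.

```python
import math


def truncated_mod(lhs: int, rhs: int) -> int:
    """Remainder whose sign follows the dividend (lhs), as the mod note describes."""
    # math.fmod computes the truncated (C-style) remainder on floats;
    # the inputs here are small integers, so converting back to int is exact.
    return int(math.fmod(lhs, rhs))


print(truncated_mod(7, 3))   # 1
print(truncated_mod(-7, 3))  # -1, sign of the dividend
print(-7 % 3)                # 2, Python's % follows the divisor
```

If you port logic between Python and expression code, negative operands are where the two conventions diverge.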
DataFrame.Expr.pow
a -> Expr -> Expr
Raise base to exponent power (base ^ exp). The exponent auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Square a column (scalar auto-coerced)
@x |> Expr.pow 2
-- Cube root (exponent 1/3)
@volume |> Expr.pow 0.333333
-- Compound interest
@principal |> Expr.mul (lit 1.05 |> Expr.pow (@years))
Notes: Follows pipe convention: base |> pow exp = base ^ exp.
See also: sqrt, mul
Comparison
DataFrame.Expr.eq
a -> Expr -> Expr
Test equality (==). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Filter by status (scalar auto-coerced)
@status |> Expr.eq "active"
-- Compare columns
@actual |> Expr.eq (@expected)
Notes: Null values: null == null returns null, not True. Use isNull for null checks.
See also: neq, gt, lt, isNull
DataFrame.Expr.neq
a -> Expr -> Expr
Test inequality (!=). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Exclude a status (scalar auto-coerced)
@status |> Expr.neq "deleted"
-- Combine with boolean ops
@type |> Expr.neq "test" |> Expr.and (@active |> Expr.eq True)
Notes: Null values: null != value returns null, not True.
See also: eq, isNotNull
DataFrame.Expr.gt
a -> Expr -> Expr
Greater than comparison (>). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Age filter (scalar auto-coerced)
@age |> Expr.gt 18
-- Compare columns
@revenue |> Expr.gt (@cost)
-- Chain with boolean ops
@score |> Expr.gt 90 |> Expr.and (@passed |> Expr.eq True)
Notes: Follows pipe convention: lhs |> gt rhs = lhs > rhs.
See also: gte, lt, lte
DataFrame.Expr.gte
a -> Expr -> Expr
Greater than or equal comparison (>=). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Minimum threshold (scalar auto-coerced)
@quantity |> Expr.gte 10
-- Date comparison
@year |> Expr.gte 2020
Notes: Follows pipe convention: lhs |> gte rhs = lhs >= rhs.
See also: gt, lte
DataFrame.Expr.lt
a -> Expr -> Expr
Less than comparison (<). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Below threshold (scalar auto-coerced)
@temperature |> Expr.lt 0
-- Range check (combine with gt)
let inRange = @x |> Expr.gt 0 |> Expr.and (@x |> Expr.lt 100)
Notes: Follows pipe convention: lhs |> lt rhs = lhs < rhs.
See also: lte, gt, gte
DataFrame.Expr.lte
a -> Expr -> Expr
Less than or equal comparison (<=). Returns a boolean expression. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Maximum threshold (scalar auto-coerced)
@price |> Expr.lte 100
-- Cohort bucketing with cond (scalars auto-coerced)
Expr.cond
    [ (@age |> Expr.lte 17, "minor")
    , (@age |> Expr.lte 64, "adult")
    ]
    "senior"
Notes: Follows pipe convention: lhs |> lte rhs = lhs <= rhs.
See also: lt, gte
Boolean
DataFrame.Expr.and
a -> Expr -> Expr
Logical AND of two boolean expressions. Both must be true for result to be true. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Combine conditions (scalars auto-coerced)
let isActiveAdult =
    @age |> Expr.gte 18
        |> Expr.and (@status |> Expr.eq "active")
-- Multiple conditions
@a |> Expr.gt 0
    |> Expr.and (@b |> Expr.gt 0)
    |> Expr.and (@c |> Expr.gt 0)
Notes: Short-circuit evaluation is not guaranteed. Null AND True = Null, Null AND False = False.
See also: or, not
DataFrame.Expr.or
a -> Expr -> Expr
Logical OR of two boolean expressions. Either being true makes result true. The first argument auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Either condition (scalars auto-coerced)
let isSpecial =
    @status |> Expr.eq "vip"
        |> Expr.or (@status |> Expr.eq "admin")
-- Fallback check
@primary_email |> Expr.isNotNull
    |> Expr.or (@backup_email |> Expr.isNotNull)
Notes: Null OR True = True, Null OR False = Null.
See also: and, not
DataFrame.Expr.not
Expr -> Expr
Logical NOT (negation) of a boolean expression.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Negate a condition
Expr.not (@is_deleted)
-- Combine with and (scalars auto-coerced)
Expr.not (@status |> Expr.eq "spam")
    |> Expr.and (@score |> Expr.gt 0)
Notes: NOT Null = Null.
See also: and, or
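The null rules listed in the notes for and, or, and not are those of three-valued (Kleene) logic. The sketch below reproduces those truth tables in plain Python, with `None` standing in for null. The helper names `and3`, `or3`, and `not3` are hypothetical and only illustrate the semantics; they are not part of DataFrame.Expr.

```python
from typing import Optional

Tri = Optional[bool]  # None stands in for a null value


def and3(a: Tri, b: Tri) -> Tri:
    # A single False decides the result, even if the other side is null.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Tri, b: Tri) -> Tri:
    # A single True decides the result, even if the other side is null.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a: Tri) -> Tri:
    return None if a is None else not a


print(and3(None, True))   # None
print(and3(None, False))  # False
print(or3(None, True))    # True
print(or3(None, False))   # None
print(not3(None))         # None
```

A filter built from these operators can therefore produce null for a row; use isNull or fillNull when you need a definite True or False.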
Aggregation
DataFrame.Expr.sum
Expr -> Expr
Sum of all values in a column. Use with groupBy/agg for group-wise sums.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { category = "A", sales = 100 }
    , { category = "A", sales = 200 }
    , { category = "B", sales = 150 }
    ]
let totalExpr = @sales |> Expr.sum |> Expr.named "total_sales"
df |> DataFrame.groupBy [@category] |> DataFrame.agg [totalExpr]
Notes: Null values are ignored (not treated as 0). Returns null for empty groups.
See also: mean, count, over
DataFrame.Expr.mean
Expr -> Expr
Arithmetic mean (average) of values. Returns Float.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { department = "eng", salary = 90000 }
    , { department = "eng", salary = 110000 }
    , { department = "sales", salary = 75000 }
    ]
let avgExpr = @salary |> Expr.mean |> Expr.named "avg_salary"
df |> DataFrame.groupBy [@department] |> DataFrame.agg [avgExpr]
Notes: Null values are excluded from both numerator and count. Empty groups return null.
See also: sum, median, std
DataFrame.Expr.min
Expr -> Expr
Minimum value in a column. Works with numeric, string, and date types.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { product = "apple", price = 1 }
    , { product = "apple", price = 2 }
    , { product = "banana", price = 3 }
    ]
let minExpr = @price |> Expr.min |> Expr.named "lowest_price"
df |> DataFrame.groupBy [@product] |> DataFrame.agg [minExpr]
Notes: Null values are ignored. Returns null for empty groups.
See also: max, first
DataFrame.Expr.max
Expr -> Expr
Maximum value in a column. Works with numeric, string, and date types.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { region = "north", score = 80 }
    , { region = "north", score = 95 }
    , { region = "south", score = 70 }
    ]
let maxExpr = @score |> Expr.max |> Expr.named "top_score"
df |> DataFrame.groupBy [@region] |> DataFrame.agg [maxExpr]
Notes: Null values are ignored. Returns null for empty groups.
See also: min, last
DataFrame.Expr.count
Expr -> Expr
Count of non-null values in a column.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { status = "active", id = 1 }
    , { status = "active", id = 2 }
    , { status = "inactive", id = 3 }
    ]
let cntExpr = @id |> Expr.count |> Expr.named "n"
df |> DataFrame.groupBy [@status] |> DataFrame.agg [cntExpr]
Notes: Counts non-null values only. For total rows including nulls, count a non-nullable column like id.
See also: sum, first, last
DataFrame.Expr.first
Expr -> Expr
First value in a group. Order depends on the DataFrame's current row order.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { customer = "alice", order_id = 10 }
    , { customer = "alice", order_id = 20 }
    , { customer = "bob", order_id = 30 }
    ]
let firstExpr = @order_id |> Expr.first |> Expr.named "first_order"
df |> DataFrame.groupBy [@customer] |> DataFrame.agg [firstExpr]
Notes: Returns first non-null value. Sort the DataFrame first if you need a specific ordering.
See also: last, min
DataFrame.Expr.last
Expr -> Expr
Last value in a group. Order depends on the DataFrame's current row order.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { user = "alice", action = "login" }
    , { user = "alice", action = "purchase" }
    , { user = "bob", action = "login" }
    ]
let lastExpr = @action |> Expr.last |> Expr.named "last_action"
df |> DataFrame.groupBy [@user] |> DataFrame.agg [lastExpr]
Notes: Returns last non-null value. Sort the DataFrame first if you need a specific ordering.
See also: first, max
DataFrame.Expr.std
Expr -> Expr
Sample standard deviation (with Bessel's correction, ddof=1).
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { treatment = "A", response = 10 }
    , { treatment = "A", response = 20 }
    , { treatment = "B", response = 15 }
    ]
let stdExpr = @response |> Expr.std |> Expr.named "response_std"
df |> DataFrame.groupBy [@treatment] |> DataFrame.agg [stdExpr]
Notes: Uses ddof=1 (sample standard deviation). Requires at least 2 values.
See also: var, mean
DataFrame.Expr.var
Expr -> Expr
Sample variance (with Bessel's correction, ddof=1).
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Calculate variance
@measurement |> Expr.var |> Expr.named "measurement_var"
Notes: Uses ddof=1 (sample variance). Variance = std^2.
See also: std, mean
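To make the ddof=1 convention of std and var concrete, the sketch below recomputes the two treatment "A" responses from the std example with Python's standard `statistics` module. It illustrates the arithmetic only and is not how the library computes the result.

```python
import statistics

responses = [10, 20]  # treatment "A" rows from the std example

# Sample statistics divide the squared deviations by n - 1 (ddof=1).
sample_var = statistics.variance(responses)
sample_std = statistics.stdev(responses)

# Population variance divides by n instead, shown only for contrast.
population_var = statistics.pvariance(responses)

print(sample_var)            # 50
print(round(sample_std, 4))  # 7.0711
print(population_var)        # 25
```

The squared deviations are 25 + 25 = 50; dividing by n - 1 = 1 gives 50, while dividing by n = 2 would give 25.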
DataFrame.Expr.median
Expr -> Expr
Median (50th percentile) of values. More robust to outliers than mean.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { region = "north", price = 10 }
    , { region = "north", price = 20 }
    , { region = "south", price = 15 }
    ]
let meanExpr = @price |> Expr.mean |> Expr.named "mean_price"
let medExpr = @price |> Expr.median |> Expr.named "median_price"
df |> DataFrame.groupBy [@region] |> DataFrame.agg [meanExpr, medExpr]
Notes: For even-length groups, returns average of the two middle values.
See also: mean, min, max
String
DataFrame.Expr.strLength
Expr -> Expr
Length of string in characters (not bytes). Returns Int.
import DataFrame
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { code = "AB123" }
    , { code = "XY" }
    , { code = "LMN45" }
    ]
df |> DataFrame.filter (@code |> Expr.strLength |> Expr.eq (lit 5))
Notes: Counts Unicode characters, not bytes. Null strings return null.
See also: strUpper, strLower, strTrim
DataFrame.Expr.strUpper
Expr -> Expr
Convert string to uppercase.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Normalize to uppercase
@country_code |> Expr.strUpper |> Expr.named "country_code_upper"
-- Case-insensitive comparison
@status |> Expr.strUpper |> Expr.eq (lit "ACTIVE")
Notes: Uses Unicode case mapping rules.
See also: strLower
DataFrame.Expr.strLower
Expr -> Expr
Convert string to lowercase.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Normalize to lowercase
@email |> Expr.strLower |> Expr.named "email_normalized"
Notes: Uses Unicode case mapping rules.
See also: strUpper
DataFrame.Expr.strTrim
Expr -> Expr
Remove leading and trailing whitespace from string.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Clean up user input
@user_input |> Expr.strTrim |> Expr.named "cleaned_input"
Notes: Removes spaces, tabs, newlines, and other Unicode whitespace.
See also: strReplace
DataFrame.Expr.strContains
String -> Expr -> Expr
Check if string contains the given pattern. Returns boolean.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { email = "alice@company.com" }
    , { email = "bob@gmail.com" }
    , { email = "carol@company.com" }
    ]
df |> DataFrame.filter (@email |> Expr.strContains "@company.com")
Notes: Literal string matching (not regex). Case-sensitive.
See also: strStartsWith, strEndsWith
DataFrame.Expr.strStartsWith
String -> Expr -> Expr
Check if string starts with the given prefix. Returns boolean.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { name = "Dr. Alice" }
    , { name = "Bob" }
    , { name = "Dr. Carol" }
    ]
df |> DataFrame.filter (@name |> Expr.strStartsWith "Dr.")
Notes: Case-sensitive comparison.
See also: strEndsWith, strContains
DataFrame.Expr.strEndsWith
String -> Expr -> Expr
Check if string ends with the given suffix. Returns boolean.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { url = "https://example.org" }
    , { url = "https://example.com" }
    , { url = "https://wiki.org" }
    ]
df |> DataFrame.filter (@url |> Expr.strEndsWith ".org")
Notes: Case-sensitive comparison.
See also: strStartsWith, strContains
DataFrame.Expr.strReplace
String -> String -> Expr -> Expr
Replace first occurrence of a pattern with replacement string.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Replace substring
@text |> Expr.strReplace "old" "new"
-- Remove prefix
@id |> Expr.strReplace "ID_" ""
Notes: Only replaces the first occurrence. Pattern is literal (not regex).
See also: strTrim
DataFrame.Expr.strSplit
String -> Expr -> Expr
Split string by delimiter into a list. Returns a List column.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Split on delimiter
@text |> Expr.strSplit "_" |> Expr.named "text_parts"
-- Split CSV field
@tags |> Expr.strSplit "," |> Expr.named "tag_list"
-- Split full name
@full_name |> Expr.strSplit " " |> Expr.named "name_parts"
Notes: Returns a List column where each element is a list of string parts.
See also: concatMany, strContains
DataFrame.Expr.concatMany
String -> [Expr] -> Expr
Concatenate multiple string expressions with a separator between them.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ city = "New York", state = "NY" }, { city = "Austin", state = "TX" }]
let cityExpr = @city
let stateExpr = @state
let addrExpr = [cityExpr, stateExpr] |> Expr.concatMany ", "
df |> DataFrame.applyExprs [(@location, addrExpr)]
Notes: Requires at least one expression. Separator is placed between expressions, not at start/end.
See also: strSplit
DataFrame.Expr.strMatches
String -> Expr -> Expr
Test whether each string in a column matches a regex pattern. Returns a Bool column.
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ code = "abc123" }, { code = "xyz" }, { code = "007" }]
-- Filter rows where code contains digits
case (df |> DataFrame.filter (@code |> Expr.strMatches "[0-9]+")) of
    Ok filtered -> DataFrame.shape filtered
    Err _ -> (0, 0)
Notes: Uses regex matching (not literal). Non-matching rows return false. An invalid pattern fills the entire result column with null.
See also: strContains, strCapture
DataFrame.Expr.strCapture
String -> Int -> Expr -> Expr
Extract a captured group from the first regex match in each string. Group index 0 returns the entire match; 1+ return capture groups.
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ s = "price: 42" }, { s = "count: 100" }, { s = "none" }]
-- Extract first group (digits after colon)
case (DataFrame.applyExprs [(@n, @s |> Expr.strCapture "([0-9]+)" 1)] df) of
    Ok df2 -> (DataFrame.column @n df2)?
    Err _ -> []
Notes: Returns a nullable String column (Nothing when there is no match). Group index 0 is the entire match. An invalid pattern fills the entire column with null.
See also: strMatches, strCaptureAll
DataFrame.Expr.strCaptureAll
String -> Expr -> Expr
Extract all non-overlapping regex matches from each string. Returns a List column.
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "xyz" }]
-- Extract all digit sequences per row
case (DataFrame.applyExprs [(@numbers, @text |> Expr.strCaptureAll "[0-9]+")] df) of
    Ok df2 -> (DataFrame.column @numbers df2)?
    Err _ -> []
Notes: Returns a List[String] column. Rows with no match return an empty list, not null. An invalid pattern fills the entire column with null.
See also: strCapture, strSplit
DataFrame.Expr.strReplaceAll
String -> String -> Expr -> Expr
Replace all regex matches in each string with a replacement string.
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "hello" }]
-- Replace all digits with X
case (DataFrame.applyExprs [(@r, @text |> Expr.strReplaceAll "[0-9]" "X")] df) of
    Ok df2 -> (DataFrame.column @r df2)?
    Err _ -> []
Notes: Uses regex matching. An invalid pattern causes applyExprs to return Err. Unlike strReplace, replaces all occurrences, not just the first.
See also: strReplace, strMatches
DataFrame.Expr.strCount
String -> Expr -> Expr
Count non-overlapping regex matches in each string. Returns an Int column.
import DataFrame
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ text = "a1b2c3" }, { text = "abc" }]
-- Count digit occurrences per row
case (DataFrame.applyExprs [(@n, @text |> Expr.strCount "[0-9]")] df) of
    Ok df2 -> (DataFrame.column @n df2)?
    Err _ -> []
Notes: Returns a UInt32 column mapped to Int. Non-matching rows return 0. An invalid pattern causes applyExprs to return Err.
See also: strMatches, strCaptureAll
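The per-row behavior of the regex functions (strMatches, strCapture, strCaptureAll, strReplaceAll, strCount) can be sketched with Python's `re` module on the same sample strings. This is an illustration of the described semantics only: the helper names are hypothetical, and Python's regex dialect may differ in detail from the engine the library uses, although the simple patterns here should behave the same in both.

```python
import re
from typing import List, Optional


def str_matches(pattern: str, s: str) -> bool:
    # True when the pattern occurs anywhere in the string.
    return re.search(pattern, s) is not None


def str_capture(pattern: str, group: int, s: str) -> Optional[str]:
    # Group 0 is the whole first match; None models a null result.
    m = re.search(pattern, s)
    return m.group(group) if m else None


def str_capture_all(pattern: str, s: str) -> List[str]:
    # All non-overlapping matches; an empty list when nothing matches.
    return [m.group(0) for m in re.finditer(pattern, s)]


def str_replace_all(pattern: str, replacement: str, s: str) -> str:
    return re.sub(pattern, replacement, s)


def str_count(pattern: str, s: str) -> int:
    return sum(1 for _ in re.finditer(pattern, s))


print([str_matches("[0-9]+", s) for s in ["abc123", "xyz", "007"]])
# [True, False, True]
print([str_capture("([0-9]+)", 1, s) for s in ["price: 42", "count: 100", "none"]])
# ['42', '100', None]
print(str_capture_all("[0-9]+", "a1b2c3"))      # ['1', '2', '3']
print(str_replace_all("[0-9]", "X", "a1b2c3"))  # aXbXcX
print(str_count("[0-9]", "abc"))                # 0
```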
Math
DataFrame.Expr.abs
Expr -> Expr
Absolute value. Works with Int and Float.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Absolute difference
@actual |> Expr.sub (@predicted) |> Expr.abs |> Expr.named "abs_error"
Notes: Returns same type as input.
See also: sqrt, round
DataFrame.Expr.sqrt
Expr -> Expr
Square root. Returns Float.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Calculate RMSE component
@squared_error |> Expr.sqrt
-- Distance calculation
@x |> Expr.pow (lit 2)
    |> Expr.add (@y |> Expr.pow (lit 2))
    |> Expr.sqrt
    |> Expr.named "distance"
Notes: Negative values return NaN, not an error.
See also: pow, abs
DataFrame.Expr.floor
Expr -> Expr
Round down to nearest integer (toward negative infinity).
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Round down to a whole number
@price |> Expr.floor |> Expr.named "price_floor"
-- floor(2.7) = 2, floor(-2.3) = -3
Notes: Returns Float type, not Int. Use for rounding, not type conversion.
See also: ceil, round
DataFrame.Expr.ceil
Expr -> Expr
Round up to nearest integer (toward positive infinity).
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Round up
@quantity |> Expr.ceil |> Expr.named "quantity_ceil"
-- ceil(2.1) = 3, ceil(-2.7) = -2
Notes: Returns Float type, not Int.
See also: floor, round
DataFrame.Expr.round
Int -> Expr -> Expr
Round to specified number of decimal places.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Round to 2 decimal places
@price |> Expr.round 2 |> Expr.named "price_rounded"
-- Round to whole number
@average |> Expr.round 0
Notes: Uses banker's rounding (round half to even). Negative decimals round to tens, hundreds, etc.
See also: floor, ceil
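Banker's rounding, as named in the round notes, sends exact ties to the neighbor whose last kept digit is even. The sketch below shows the rule with Python's `decimal` module. `round_half_even` is a hypothetical helper, and values are passed as strings so that binary floating-point representation does not blur the ties.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str, decimals: int) -> Decimal:
    """Round a decimal string to `decimals` places, ties going to the even digit."""
    # scaleb builds the quantum 10^(-decimals): 0.01 for 2, 1E+2 for -2.
    quantum = Decimal(1).scaleb(-decimals)
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)


print(round_half_even("2.5", 0))         # 2
print(round_half_even("3.5", 0))         # 4
print(round_half_even("2.665", 2))       # 2.66
print(round_half_even("2.675", 2))       # 2.68
print(int(round_half_even("1250", -2)))  # 1200, negative decimals round to hundreds
```

Only exact ties are affected; any value that is not exactly halfway rounds to its nearest neighbor as usual.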
Null
DataFrame.Expr.fillNull
a -> Expr -> Expr
Replace null values with a default value. The default auto-coerces scalars to lit.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Fill with constant (scalar auto-coerced)
@score |> Expr.fillNull 0
-- Fill with another column
@nickname |> Expr.fillNull (@name)
-- Chain to handle multiple fallbacks
@preferred_email
    |> Expr.fillNull (@work_email)
    |> Expr.fillNull (@personal_email)
Notes: The default expression is only evaluated for null values.
See also: isNull, isNotNull
DataFrame.Expr.isNull
Expr -> Expr
Check if value is null. Returns boolean.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Find missing values (Expr-only)
@email |> Expr.isNull
-- Count nulls
@score |> Expr.isNull |> Expr.sum |> Expr.named "missing_count"
Notes: Null represents missing data. Use isNull instead of eq(lit null).
See also: isNotNull, fillNull
DataFrame.Expr.isNotNull
Expr -> Expr
Check if value is not null. Returns boolean.
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
-- Find present values (Expr-only)
@email |> Expr.isNotNull
-- Combine with and
@verified_at |> Expr.isNotNull
    |> Expr.and (@active |> Expr.isNotNull)
Notes: Equivalent to isNull |> not, but more readable.
See also: isNull, fillNull
Conditional
DataFrame.Expr.cond
[(Expr, a)] -> a -> Expr
Multi-branch conditional expression (like SQL CASE WHEN). Takes a list of (condition, result) pairs and a default value. All branch values and the default must share the same type — the compiler rejects mismatches like String branches with an Int default. Scalars auto-coerce to lit in both tuple values and the default.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { name = "Alice", age = 15 }
    , { name = "Bob", age = 35 }
    , { name = "Carol", age = 70 }
    ]
-- Scalars auto-coerce in condition values and default
let ageGroup = Expr.cond [(@age < 18, "minor"), (@age < 65, "adult")] "senior"
df |> DataFrame.applyExprs [(@age_group, ageGroup)]
Notes: Conditions are evaluated in order; first match wins. The default is required and used when no conditions match. Branch values and the default must have the same type (e.g. all String or all Int); a type mismatch is a compile error. Scalars auto-coerce to lit in both tuple values and the default.
See also: and, or, eq
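The "first match wins" rule for cond matters when conditions overlap: a 15-year-old satisfies both age < 18 and age < 65, yet is labeled by the first branch. The Python sketch below models that evaluation order on the same three rows. The `cond` helper here is a hypothetical stand-in written for illustration, not the library function.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Branch = Tuple[Callable[[Row], bool], str]


def cond(branches: List[Branch], default: str) -> Callable[[Row], str]:
    """Build a row function that returns the result of the first true branch."""
    def evaluate(row: Row) -> str:
        for predicate, result in branches:
            if predicate(row):
                return result  # later branches are not consulted
        return default         # used only when no branch matched
    return evaluate


age_group = cond(
    [
        (lambda row: row["age"] < 18, "minor"),
        (lambda row: row["age"] < 65, "adult"),
    ],
    "senior",
)

rows = [
    {"name": "Alice", "age": 15},
    {"name": "Bob", "age": 35},
    {"name": "Carol", "age": 70},
]
print([age_group(row) for row in rows])  # ['minor', 'adult', 'senior']
```

Reversing the two branches would label Alice "adult", so order the conditions from most to least specific.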
Window
DataFrame.Expr.over
[String] -> Expr -> Expr
Apply an expression as a window function over partition columns. Enables row-level access to aggregated values.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Total per group, repeated on every row of that group
@amount |> Expr.sum |> Expr.over ["customer_id"] |> Expr.named "customer_total"
-- Percentage of group (scalar auto-coerced)
@sales
    |> Expr.div (@sales |> Expr.sum |> Expr.over ["region"])
    |> Expr.mul 100
    |> Expr.named "pct_of_region"
-- Multiple partitions
@value |> Expr.mean |> Expr.over ["year", "category"] |> Expr.named "avg_by_year_cat"
-- Global window (no partitions)
@score |> Expr.mean |> Expr.over [] |> Expr.named "global_avg"
Notes: Empty partition list [] means global window (entire DataFrame). Results are broadcast back to each row.
See also: sum, mean, rowNumber, lag, lead
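Unlike groupBy with agg, which returns one row per group, a window expression keeps every input row and repeats the group's result on each of them. The Python sketch below models sum with over on made-up sample rows; `sum_over` is a hypothetical helper that illustrates the broadcast, not the library's implementation.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def sum_over(rows: List[Row], value_key: str, partition_keys: List[str]) -> List[int]:
    """Sum value_key per partition and broadcast the total back to each row."""
    totals: Dict[tuple, int] = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in partition_keys)] += row[value_key]
    # One output value per input row, in the original row order.
    return [totals[tuple(row[k] for k in partition_keys)] for row in rows]


rows = [
    {"region": "north", "sales": 100},
    {"region": "north", "sales": 300},
    {"region": "south", "sales": 50},
]

region_total = sum_over(rows, "sales", ["region"])
print(region_total)  # [400, 400, 50]

pct_of_region = [100 * row["sales"] / total for row, total in zip(rows, region_total)]
print(pct_of_region)  # [25.0, 75.0, 100.0]

# An empty partition list puts every row in one global window.
print(sum_over(rows, "sales", []))  # [450, 450, 450]
```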
DataFrame.Expr.rowNumber
() -> Expr
Assign sequential row numbers within partitions (1-based). Use with over to partition.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { group = "A", value = 10 }
    , { group = "A", value = 20 }
    , { group = "B", value = 30 }
    ]
let rnExpr = Expr.rowNumber ()
df |> DataFrame.applyExprs [(@rn, rnExpr)]
Notes: Starts at 1. Order depends on current DataFrame sort order.
See also: rank, denseRank, over
DataFrame.Expr.rank
() -> Expr
Rank values with gaps for ties. Ties get the same rank; next rank skips accordingly.
import DataFrame.Expr as Expr
-- Rank with gaps: [1, 2, 2, 4] for values [10, 20, 20, 30]
Expr.rank () |> Expr.over ["department"] |> Expr.named "sales_rank"
Notes: Use with over for partitioned ranking. Sort the DataFrame first to control rank ordering.
See also: denseRank, rowNumber
DataFrame.Expr.denseRank
() -> Expr
Rank values without gaps for ties. Consecutive ranks even when ties exist.
import DataFrame.Expr as Expr
-- Dense rank: [1, 2, 2, 3] for values [10, 20, 20, 30]
Expr.denseRank () |> Expr.over ["category"] |> Expr.named "price_rank"
Notes: Unlike rank, denseRank doesn't skip numbers after ties.
See also: rank, rowNumber
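The difference between rank and denseRank shows up only after a tie. The Python sketch below reproduces the two sequences quoted in the examples for the values [10, 20, 20, 30]. It assumes ascending order by value and uses hypothetical helper names; in the library the ordering comes from the DataFrame's row order, as the notes say.

```python
from typing import List


def rank_with_gaps(values: List[int]) -> List[int]:
    # Rank = 1 + number of strictly smaller values, so ties share a rank
    # and the next distinct value skips ahead.
    return [1 + sum(other < v for other in values) for v in values]


def dense_rank(values: List[int]) -> List[int]:
    # Number the distinct values consecutively, so no rank is skipped.
    position = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [position[v] for v in values]


values = [10, 20, 20, 30]
print(rank_with_gaps(values))  # [1, 2, 2, 4]
print(dense_rank(values))      # [1, 2, 2, 3]
```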
DataFrame.Expr.lag
Int -> Expr -> Expr
Get value from n rows before the current row. Useful for comparing to previous values.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Previous row's value
@price |> Expr.lag 1 |> Expr.over ["stock"] |> Expr.named "prev_price"
-- Calculate change from previous
@value |> Expr.sub (@value |> Expr.lag 1 |> Expr.over [])
    |> Expr.named "change"
-- Look back 7 periods
@sales |> Expr.lag 7 |> Expr.over [] |> Expr.named "sales_last_week"
Notes: Returns null for rows without enough history (first n rows). Sort first for meaningful order.
See also: lead, over
DataFrame.Expr.lead
Int -> Expr -> Expr
Get value from n rows after the current row. Useful for comparing to future values.
import DataFrame.Expr exposing col, lit
import DataFrame.Expr as Expr
-- Next row's value
@price |> Expr.lead 1 |> Expr.over ["stock"] |> Expr.named "next_price"
-- Days until next event
@event_date |> Expr.lead 1 |> Expr.over ["user"]
    |> Expr.sub (@event_date)
    |> Expr.named "days_to_next"
Notes: Returns null for rows without enough future values (last n rows). Sort first for meaningful order.
See also: lag, over
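lag and lead shift a column down or up by n rows and fill the vacated positions with null. The Python sketch below models this on a single partition with made-up prices, using `None` for null. The helper names are hypothetical and n is assumed to be non-negative.

```python
from typing import List, Optional

Column = List[Optional[int]]


def lag(values: Column, n: int) -> Column:
    # The first n rows have no history, so they become None.
    pad = min(n, len(values))
    return [None] * pad + values[: len(values) - pad]


def lead(values: Column, n: int) -> Column:
    # The last n rows have no future value, so they become None.
    pad = min(n, len(values))
    return values[pad:] + [None] * pad


prices = [10, 12, 11, 15]
print(lag(prices, 1))   # [None, 10, 12, 11]
print(lead(prices, 1))  # [12, 11, 15, None]

# Change from the previous row; a null operand gives a null result.
change = [None if prev is None else cur - prev
          for cur, prev in zip(prices, lag(prices, 1))]
print(change)  # [None, 2, -1, 4]
```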
Other
DataFrame.Expr.in
[a] -> Expr -> Expr
Test membership — whether the column value is in the given list. Returns a boolean expression.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords [{ city = "Berlin" }, { city = "Munich" }, { city = "Hamburg" }]
let expr = @city |> Expr.in ["Berlin", "Munich"]
DataFrame.filter expr df
Notes: The list must contain values of a compatible type (Int, Float, String, Bool). Equivalent to SQL's IN operator.
See also: eq, neq, DataFrame.filter
DataFrame.Expr.quantile
Float -> Expr -> Expr
Compute quantile/percentile of values. Takes a quantile value between 0.0 and 1.0.
import DataFrame
import DataFrame.Expr exposing col
import DataFrame.Expr as Expr
let df = DataFrame.fromRecords
    [ { group = "all", value = 10 }
    , { group = "all", value = 20 }
    , { group = "all", value = 30 }
    , { group = "all", value = 40 }
    ]
let q25Expr = @value |> Expr.quantile 0.25 |> Expr.named "q25"
let q75Expr = @value |> Expr.quantile 0.75 |> Expr.named "q75"
df |> DataFrame.groupBy [@group] |> DataFrame.agg [q25Expr, q75Expr]
Notes: Quantile must be between 0.0 and 1.0. Uses linear interpolation method.
See also: median, min, max
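The notes for quantile say only that linear interpolation is used. The Python sketch below assumes the common convention in which the quantile sits at position q * (n - 1) in the sorted values and interpolates between the two neighboring values. That position formula is an assumption made for illustration, and `quantile_linear` is a hypothetical helper.

```python
import math
from typing import List


def quantile_linear(values: List[float], q: float) -> float:
    """Quantile by linear interpolation, assuming position q * (n - 1)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0.0 and 1.0")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    # Interpolate between the two sorted values surrounding the position.
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


values = [10, 20, 30, 40]  # same values as the quantile example
print(quantile_linear(values, 0.25))  # 17.5
print(quantile_linear(values, 0.5))   # 25.0
print(quantile_linear(values, 0.75))  # 32.5
```

Under this assumption the 0.5 quantile coincides with the median rule for even-length groups, the average of the two middle values.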