DataFrame Reshaping

Keel provides DataFrame.melt to reshape wide-format data into long format. This is useful when you have a dataset where repeated measurements are stored as separate columns and you want to stack them into rows.

Wide vs Long Format

Wide format keeps each variable or time point as its own column:

nr	var1_year1	var1_year2	var2_year1	var2_year2
1	10	20	5	8
2	30	40	7	9

Long format stacks those columns into rows, with a new index column for the suffix:

nr	year	var1	var2
1	1	10	5
1	2	20	8
2	1	30	7
2	2	40	9
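
To make the stacking concrete, here is a minimal sketch in plain Python, not Keel. It reproduces the two tables above using a hand-written mapping from each year to its wide columns. The names `columns_by_year` and `to_long` are illustrative only and are not part of Keel's API.

```python
# Wide rows from the table above, one dict per row.
wide = [
    {"nr": 1, "var1_year1": 10, "var1_year2": 20, "var2_year1": 5, "var2_year2": 8},
    {"nr": 2, "var1_year1": 30, "var1_year2": 40, "var2_year1": 7, "var2_year2": 9},
]

# Hand-written mapping: for each year, which wide column feeds which output column.
# (Illustrative assumption: a real melt derives this from the column names.)
columns_by_year = {
    1: {"var1": "var1_year1", "var2": "var2_year1"},
    2: {"var1": "var1_year2", "var2": "var2_year2"},
}


def to_long(rows):
    """Emit one long row per (wide row, year) pair."""
    result = []
    for row in rows:
        for year, cols in columns_by_year.items():
            long_row = {"nr": row["nr"], "year": year}
            for out_name, wide_name in cols.items():
                long_row[out_name] = row[wide_name]
            result.append(long_row)
    return result


long_rows = to_long(wide)
for r in long_rows:
    print(r)
```

Each wide row yields one long row per year, so two wide rows with two years each become four long rows.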

`DataFrame.melt`

DataFrame.melt takes four arguments in addition to the DataFrame being reshaped:

id columns — column names to keep as-is (e.g. ["nr"])
prefixes — one prefix per output value column (e.g. ["var1_", "var2_"])
separator — the character between the prefix stem and the suffix (e.g. "_")
index name — name for the new column that holds the parsed suffix (e.g. "year")

-- melt reshapes wide data into long form by parsing column name prefixes
import DataFrame

-- Wide table: each measurement variable has one column per time point
let wide = DataFrame.fromRecords
    [ { nr = 1, var1_year1 = 10, var1_year2 = 20, var2_year1 = 5, var2_year2 = 8 }
    , { nr = 2, var1_year1 = 30, var1_year2 = 40, var2_year1 = 7, var2_year2 = 9 }
    ]

-- Arguments to melt: id columns, prefix list, separator, index name
case (wide |> DataFrame.melt [@nr] ["var1_", "var2_"] "_" "year") of
    Ok df -> DataFrame.columns df
    Err _ -> []

[@nr] — keep the nr column as an identifier
["var1_", "var2_"] — two prefixes produce two value columns (var1, var2)
"_" — separator used to split prefix from suffix
"year" — name of the new index column

The suffix is the part of each column name that follows the prefix. It is parsed as Int only if every suffix is numeric (e.g. suffix "1" → Int 1); otherwise it stays as String. In the call above the suffixes are "year1" and "year2", so the year column is String.

Stem Names

The output column name for each prefix is its stem: the prefix with the trailing separator stripped. For "var1_" with separator "_", the stem is "var1", so the output column is named "var1".
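
The stem rule can be sketched in a few lines of plain Python, not Keel. The helper name `stem` is hypothetical. Leaving a prefix unchanged when it has no trailing separator is an assumption, since the text above only covers the case where the separator is present.

```python
def stem(prefix: str, separator: str) -> str:
    """Return the prefix with one trailing separator removed, if present."""
    # Assumption: a prefix without a trailing separator is returned unchanged.
    if separator and prefix.endswith(separator):
        return prefix[: -len(separator)]
    return prefix


# Stems for the two prefixes used in the example call.
stems = [stem(p, "_") for p in ["var1_", "var2_"]]
print(stems)
```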

Index Column Type

The index column type (year in the example) is inferred from the suffix values:

If every suffix across all prefixes parses as an integer → the index column is Int
Otherwise → the index column is String

The inference is made once per melt call, over the suffixes of all prefixes together. For example, if var1_ and var2_ both have the suffixes "year1" and "year2", none of the four suffixes parses as an integer, so the index column is String.
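
The inference rule can be sketched in plain Python, not Keel. The helpers `suffixes_for` and `infer_index_type` are hypothetical names. Treating "parses as an integer" as an optional minus sign followed by ASCII digits is an assumption; Keel's actual parsing rules may differ.

```python
import re


def suffixes_for(columns, prefixes):
    """Collect what remains of each column name after a matching prefix."""
    return [c[len(p):] for p in prefixes for c in columns if c.startswith(p)]


def infer_index_type(suffixes):
    """Return "Int" if every suffix looks like an integer, else "String"."""
    # Assumption: an integer is an optional minus sign followed by ASCII digits.
    def is_int(s):
        return re.fullmatch(r"-?[0-9]+", s) is not None

    return "Int" if all(is_int(s) for s in suffixes) else "String"


prefixes = ["var1_", "var2_"]

# Columns from the example: the suffixes are "year1" and "year2".
named = suffixes_for(
    ["nr", "var1_year1", "var1_year2", "var2_year1", "var2_year2"], prefixes
)
print(named, infer_index_type(named))

# Hypothetical columns whose suffixes are purely numeric.
numeric = suffixes_for(["nr", "var1_1", "var1_2", "var2_1", "var2_2"], prefixes)
print(numeric, infer_index_type(numeric))
```

Because the check runs over the combined list, a single non-numeric suffix under any prefix makes the whole index column String.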

Next Steps

DataFrame Type Safety — schema annotations and column type checking
DataFrame Expressions — composable column expressions
DataFrame stdlib reference — complete function list including melt and pivot