Returns a securer::securer_tool() that computes summary statistics for a data frame.

Usage

data_profile_tool(max_rows = 1e+05, max_calls = NULL)

Arguments

max_rows

Maximum number of rows to profile. Data frames with more rows are randomly sampled down to max_rows. Default 100000 (1e+05).

max_calls

Maximum invocations allowed. NULL (default) means unlimited.

Value

A securer_tool object.

Details

Computes per-column statistics including type, missing count, and unique count. For numeric and integer columns, also computes min, max, mean, median, and standard deviation. For character and factor columns, returns the top 5 most frequent values with counts.
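The statistics described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: profile_column() and its top_n argument are hypothetical names, and the choice to exclude NA from the unique count is an assumption.

```r
# Hypothetical helper (not part of securer): profile a single column.
profile_column <- function(x, top_n = 5) {
  out <- list(
    type    = class(x)[1],
    missing = sum(is.na(x)),
    # Assumption: NA is not counted as a distinct value.
    unique  = length(unique(x[!is.na(x)]))
  )
  if (is.numeric(x)) {
    # is.numeric() is TRUE for both integer and double columns.
    out$min    <- min(x, na.rm = TRUE)
    out$max    <- max(x, na.rm = TRUE)
    out$mean   <- mean(x, na.rm = TRUE)
    out$median <- median(x, na.rm = TRUE)
    out$sd     <- sd(x, na.rm = TRUE)
  } else if (is.character(x) || is.factor(x)) {
    # table() drops NA by default; keep the top_n most frequent values.
    counts <- sort(table(x), decreasing = TRUE)
    top    <- counts[seq_len(min(top_n, length(counts)))]
    out$top_values <- setNames(as.integer(top), names(top))
  }
  out
}

df <- data.frame(a = c(1, 2, NA, 4), b = c("x", "y", "x", NA))
profile <- lapply(df, profile_column)
```

Here profile is a named list with one entry per column, which mirrors the per-column shape described above without claiming to match the tool's actual output format.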

When the input data frame exceeds max_rows, a random sample of max_rows rows is profiled and the result indicates that sampling occurred.
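As an illustration of the sampling rule, the sketch below draws max_rows rows without replacement and records whether sampling took place. sample_rows() and the sampled field are hypothetical names chosen for this example; how the tool itself reports sampling is not specified here.

```r
# Hypothetical helper (not part of securer): cap a data frame at max_rows.
sample_rows <- function(df, max_rows = 1e5) {
  sampled <- nrow(df) > max_rows
  if (sampled) {
    # sample.int() draws row indices without replacement by default.
    df <- df[sample.int(nrow(df), max_rows), , drop = FALSE]
  }
  list(data = df, sampled = sampled)
}

big   <- sample_rows(data.frame(x = 1:10), max_rows = 3)
small <- sample_rows(data.frame(x = 1:10), max_rows = 100)
```

Because the sample is random, statistics computed on it can vary between calls unless the caller fixes the seed with set.seed().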

The data argument is declared as type "list" in the tool schema because the IPC serialization layer converts data frames to lists. The tool automatically coerces list input back to a data frame.
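The coercion step can be pictured as follows. This sketch assumes the serialized data arrives as a named list of equal-length column vectors; the real IPC payload may be shaped differently, and as_profile_input() is a hypothetical name.

```r
# Hypothetical helper (not part of securer): restore a data frame from a list.
as_profile_input <- function(data) {
  if (is.data.frame(data)) {
    return(data)
  }
  # Assumption: a column-oriented named list, one vector per column.
  as.data.frame(data, stringsAsFactors = FALSE)
}

payload  <- list(id = 1:3, label = c("a", "b", "c"))
restored <- as_profile_input(payload)
```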

Examples

# \donttest{
# Profile at most 50,000 rows per call and allow at most 10 invocations.
tool <- data_profile_tool(max_rows = 50000, max_calls = 10)
# }