
Overview

When you parse CSV or JSON files with a custom column mapping, an aggregation method reduces each mapped column of raw values to a single metric. Set the method with the aggregate field; if you omit it, avg is used.

Available Methods

Method      Description
avg         Average of all values (default)
sum         Sum of all values
min         Minimum value
max         Maximum value
count       Count of non-null values
p50         50th percentile (median)
p90         90th percentile
p95         95th percentile
p99         99th percentile
pass_rate   Percentage of “success”/“pass”/true/1 values
fail_rate   Percentage of “error”/“fail”/false/0 values
first       First value
last        Last value
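
To make the simple reducers concrete, here is a minimal Python sketch of what they compute. It is not the tool's implementation: the function name aggregate_basic is hypothetical, and it assumes that null cells (None here) are skipped by every method, which the table only states explicitly for count.

```python
from statistics import mean


def aggregate_basic(values, method="avg"):
    """Reduce one column to a single value; None stands in for a null cell."""
    # ASSUMPTION: nulls are dropped before every reducer, not only for count.
    present = [v for v in values if v is not None]
    reducers = {
        "avg": mean,
        "sum": sum,
        "min": min,
        "max": max,
        "count": len,
        "first": lambda xs: xs[0],
        "last": lambda xs: xs[-1],
    }
    return reducers[method](present)


column = [4, None, 2, 6]
print(aggregate_basic(column))           # 4  (avg is the default)
print(aggregate_basic(column, "count"))  # 3  (the null cell is not counted)
print(aggregate_basic(column, "last"))   # 6
```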

Usage Examples

Average (Default)

sources:
  csv:
    metrics:
      - column: accuracy
        aggregate: avg

Percentiles

Use percentiles for latency and performance metrics:
sources:
  csv:
    metrics:
      - column: latency_ms
        aggregate: p95
        as: latency_p95
      - column: response_time
        aggregate: p99
        as: response_time_p99
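
The page does not say how the pNN values are computed. The Python sketch below assumes the nearest-rank definition; many tools interpolate linearly between neighbouring values instead, which gives slightly different numbers on small samples. The function name percentile_nearest_rank and the sample latencies are illustrative only.

```python
import math


def percentile_nearest_rank(values, p):
    """Smallest value that has at least p percent of the data at or below it."""
    if not values:
        raise ValueError("cannot take a percentile of an empty column")
    ordered = sorted(values)
    # ASSUMPTION: nearest-rank method; rank is 1-based.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = [120, 95, 310, 101, 99, 150, 2400, 98, 105, 110]

print(sum(latencies_ms) / len(latencies_ms))      # 358.8
print(percentile_nearest_rank(latencies_ms, 50))  # 105
print(percentile_nearest_rank(latencies_ms, 95))  # 2400
```

In this sample a single 2400 ms request pulls the average to 358.8 ms, while p50 stays at 105 ms and p95 exposes the slow tail directly. That difference is the usual reason to prefer percentiles over avg for latency.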

Pass/Fail Rates

Use pass_rate and fail_rate for boolean or status columns:
sources:
  csv:
    metrics:
      - column: status
        aggregate: pass_rate
        as: success_rate
      - column: error_flag
        aggregate: fail_rate
        as: error_rate
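
One way to read the pass_rate and fail_rate definitions is sketched below in Python. Several details are assumptions rather than documented behaviour: matching is case-insensitive, empty cells are excluded from the denominator, and the result is on a 0 to 100 scale. The names PASS_TOKENS, FAIL_TOKENS, and rate are hypothetical.

```python
PASS_TOKENS = {"success", "pass", "true", "1"}
FAIL_TOKENS = {"error", "fail", "false", "0"}


def rate(values, tokens):
    """Percentage (0-100) of non-empty cells whose text is one of tokens."""
    # ASSUMPTION: cells are compared as lower-cased, trimmed strings.
    cells = [str(v).strip().lower() for v in values if v is not None]
    cells = [c for c in cells if c != ""]
    if not cells:
        return 0.0
    return 100.0 * sum(c in tokens for c in cells) / len(cells)


status = ["success", "pass", "error", "success", "timeout"]

print(rate(status, PASS_TOKENS))  # 60.0
print(rate(status, FAIL_TOKENS))  # 20.0
```

Under this reading, a value such as "timeout" counts toward neither rate, so pass_rate and fail_rate do not have to sum to 100.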

Min/Max

Find extreme values:
sources:
  csv:
    metrics:
      - column: cost_per_request
        aggregate: min
        as: min_cost
      - column: cost_per_request
        aggregate: max
        as: max_cost

Renaming Metrics

Use the "as" field to set the name under which an aggregated metric is reported:
sources:
  csv:
    metrics:
      - column: latency
        aggregate: p95
        as: latency_p95_ms  # Renamed metric
      - column: accuracy_score
        aggregate: avg
        as: mean_accuracy   # Renamed metric

Common Patterns

Performance Metrics

metrics:
  - column: latency_ms
    aggregate: p95
    as: latency_p95
  - column: throughput
    aggregate: avg
  - column: error_count
    aggregate: sum

Quality Metrics

metrics:
  - column: accuracy
    aggregate: avg
  - column: quality_score
    aggregate: min  # Lowest observed score (worst case)
  - column: passed
    aggregate: pass_rate
    as: pass_rate

Cost Metrics

metrics:
  - column: cost_per_request
    aggregate: avg
  - column: total_cost
    aggregate: sum
  - column: cost_per_request
    aggregate: max
    as: max_cost_per_request
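
The cost metrics list above can be traced end to end with a short Python sketch. This is a hedged illustration, not the tool's code: compute_metrics, METRICS, REDUCERS, and SAMPLE are hypothetical names, and the rule that a metric without "as" keeps its column name is an assumption the page does not confirm.

```python
import csv
import io

# The YAML metrics list above, written as plain dicts.
METRICS = [
    {"column": "cost_per_request", "aggregate": "avg"},
    {"column": "total_cost", "aggregate": "sum"},
    {"column": "cost_per_request", "aggregate": "max", "as": "max_cost_per_request"},
]

REDUCERS = {
    "avg": lambda xs: sum(xs) / len(xs),
    "sum": sum,
    "max": max,
}


def compute_metrics(csv_text, metrics):
    """Apply each metric spec to its column and return {metric_name: value}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    results = {}
    for spec in metrics:
        column = spec["column"]
        # Skip empty cells and convert the rest to numbers.
        values = [float(row[column]) for row in rows if row[column] != ""]
        # ASSUMPTION: without "as", the metric is reported under the column name.
        name = spec.get("as", column)
        # avg is the documented default when "aggregate" is omitted.
        results[name] = REDUCERS[spec.get("aggregate", "avg")](values)
    return results


SAMPLE = """cost_per_request,total_cost
0.5,10
0.25,5
0.75,15
"""

print(compute_metrics(SAMPLE, METRICS))
# {'cost_per_request': 0.5, 'total_cost': 30.0, 'max_cost_per_request': 0.75}
```

The same column appears twice in the list, so the "as" field is what keeps the max result from colliding with the avg result in the output.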