Skip to main content

Overview

GitHub Actions is the most popular CI/CD platform. Geval integrates seamlessly to enforce quality gates on pull requests.

Basic Setup

name: Eval Check

on: [pull_request]

jobs:
  eval-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
      
      - name: Run evals
        run: npm run evals  # Your eval command
      
      - name: Install Geval
        run: npm install -g @geval-labs/cli
      
      - name: Check eval results
        run: |
          geval check \
            --contract contract.yaml \
            --eval eval-results.csv

Advanced Workflow

With Baseline Comparison

name: Eval Check

on:
  pull_request:
    paths:
      - 'prompts/**'
      - 'agents/**'

jobs:
  eval-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install Geval
        run: npm install -g @geval-labs/cli

      - name: Run Evals
        run: |
          # Your eval command here (promptfoo, langsmith, etc.)
          npm run evals -- --output eval-results.json

      - name: Validate Contract
        run: geval validate contract.yaml

      - name: Check Eval Results
        run: |
          geval check \
            --contract contract.yaml \
            --eval eval-results.json \
            --json > decision.json

      - name: Comment on PR
        if: always()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const decision = JSON.parse(fs.readFileSync('decision.json', 'utf8'));
            
            const icon = decision.status === 'PASS' ? '✅' : '❌';
            const body = `## ${icon} Geval: ${decision.status}\n\n${decision.summary}`;
            
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });

Required Status Check

After setting up the workflow:
  1. Go to your repository SettingsBranches
  2. Click Add rule or edit existing rule
  3. Under Require status checks to pass before merging:
    • Check Require branches to be up to date before merging
    • Select eval-check from the list
  4. Click Save
Now PRs cannot be merged until the eval check passes!

Multiple Eval Files

- name: Check Multiple Evals
  run: |
    geval check \
      --contract contract.yaml \
      --eval results1.json \
      --eval results2.json \
      --eval results3.csv

With Baseline from Previous Run

- name: Download Baseline
  uses: actions/download-artifact@v3
  with:
    name: baseline-results
    path: ./baseline

- name: Check with Baseline
  run: |
    geval check \
      --contract contract.yaml \
      --eval current-results.json \
      --baseline baseline/results.json

Caching Dependencies

- name: Cache node modules
  uses: actions/cache@v3
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

Matrix Strategy

Run checks across multiple Node versions:
strategy:
  matrix:
    node-version: [18, 20, 22]
steps:
  - uses: actions/setup-node@v4
    with:
      node-version: ${{ matrix.node-version }}

Best Practices

  1. Run evals first - Generate eval results before checking
  2. Validate contracts - Use geval validate to catch errors early
  3. Use JSON output - For programmatic handling in workflows
  4. Comment on PRs - Provide feedback directly in pull requests
  5. Cache dependencies - Speed up workflow runs