feat: add CI plugin validation workflow (#3)

Uses claude-code-action with plugin-dev plugin to validate plugin structure, agent consistency, eval manifests, and skills on every PR. Includes @claude mention support for interactive fixes.
2026-03-27 05:44:44 -05:00 · 2026-03-27 05:44:44 -05:00 · 021d64a1fd
commit 021d64a1fd
parent 5a990f78a9
1 changed files with 202 additions and 0 deletions
--- a/.github/workflows/validate.yml
+++ b/.github/workflows/validate.yml
@ -0,0 +1,202 @@
+name: Plugin Validation
+
+on:
+  pull_request:
+    types: [opened, synchronize, ready_for_review, reopened]
+  issue_comment:
+    types: [created]
+  pull_request_review_comment:
+    types: [created]
+  pull_request_review:
+    types: [submitted]
+
+jobs:
+  validate:
+    concurrency:
+      group: validate-${{ github.head_ref || github.run_id }}
+      cancel-in-progress: true
+    if: |
+      (
+        github.event_name == 'pull_request' &&
+        github.event.sender.login != 'claude[bot]' &&
+        github.event.pull_request.head.repo.full_name == github.repository
+      )
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      pull-requests: write
+      issues: read
+      id-token: write
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          ref: ${{ github.event.pull_request.head.ref }}
+
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
+          aws-region: ${{ secrets.AWS_REGION }}
+
+      - name: Run Plugin Validation
+        uses: anthropics/claude-code-action@v1
+        with:
+          use_bedrock: "true"
+          use_sticky_comment: true
+          plugins: "plugin-dev@claude-code-plugins"
+          prompt: |
+            You are validating the codeflash-agent Claude Code plugin. This plugin has:
+            - 6 agents in `agents/` (router + setup + 4 domain agents)
+            - 2 skills in `skills/` (codeflash-optimize, memray-profiling)
+            - Eval templates in `evals/templates/`
+            - Plugin manifest at `.claude-plugin/plugin.json`
+            - No hooks directory
+
+            Execute each step in order. If a step finds no issues, state that and continue.
+
+            <step name="triage">
+            Assess what changed in this PR:
+            1. Run `gh pr diff ${{ github.event.pull_request.number }} --name-only` to get changed files.
+            2. Classify changes:
+               - AGENTS: files in `agents/`
+               - SKILLS: files in `skills/`
+               - EVALS: files in `evals/`
+               - PLUGIN_CONFIG: `.claude-plugin/plugin.json`, hooks, `.mcp.json`
+               - DOCS: `*.md` outside agents/skills, LICENSE
+               - OTHER: anything else
+            3. Record which categories have changes — later steps only run if relevant.
+            </step>
+
+            <step name="plugin_structure">
+            Use the Agent tool to launch the **plugin-dev:plugin-validator** agent with this prompt:
+            "Validate the plugin at the current directory. Check plugin.json, all agent frontmatter, all skill frontmatter, and hook configuration. Report any issues found."
+
+            This agent knows the full Claude Code plugin spec and will check:
+            - plugin.json schema and required fields
+            - Agent YAML frontmatter (name, description, model, tools, color)
+            - Skill YAML frontmatter (name, description, allowed-tools)
+            - File cross-references and directory structure
+            </step>
+
+            <step name="agent_consistency">
+            Only run if AGENTS changed.
+
+            The 4 domain agents (codeflash-cpu.md, codeflash-memory.md, codeflash-async.md, codeflash-structure.md)
+            must all have these steps in their experiment loops:
+            1. A "Review git history" step (step 1) with `git log --oneline -20` and `git diff HEAD~1`
+            2. A "Guard" step (if configured in conventions.md) with revert/rework/discard logic
+            3. A "Config audit" step (after KEEP) checking for dead/inconsistent config flags
+
+            Check each domain agent:
+            1. Read the experiment loop section of each file.
+            2. Verify all 3 steps are present.
+            3. Verify step numbering is sequential with no gaps.
+            4. Verify the Guard step includes "revert, rework (max 2 attempts), then discard".
+            5. Verify the Config audit step has domain-specific guidance (not generic).
+
+            Also check: router agent (codeflash.md) domain detection table matches the 4 domain agents that exist.
+            </step>
+
+            <step name="eval_manifests">
+            Only run if EVALS changed.
+
+            For each `evals/templates/*/manifest.json`:
+            1. Verify valid JSON.
+            2. Verify required fields: `name`, `eval_type`, `bugs` (array), `rubric` (object with `criteria`).
+            3. Verify each bug has: `id`, `file`, `description`, `domain`.
+            4. Verify `rubric.criteria` values are positive integers.
+            5. Verify `rubric.total` equals the sum of criteria values (if present).
+            6. Verify referenced files (`file` in bugs, `test_file`) actually exist in that template directory.
+            </step>
+
+            <step name="skill_review">
+            Only run if SKILLS changed.
+
+            Use the Agent tool to launch the **plugin-dev:skill-reviewer** agent with this prompt:
+            "Review all skills in the `skills/` directory. Check description quality, triggering accuracy,
+            allowed-tools restrictions, and whether the skill content follows best practices."
+            </step>
+
+            <step name="summary">
+            Post exactly one summary comment with all results:
+
+            ## Plugin Validation
+
+            ### Plugin Structure
+            (plugin-validator findings or "All checks passed")
+
+            ### Agent Consistency
+            (experiment loop check results or "Not applicable — no agent changes")
+
+            ### Eval Manifests
+            (manifest validation results or "Not applicable — no eval changes")
+
+            ### Skill Review
+            (skill-reviewer findings or "Not applicable — no skill changes")
+
+            ---
+            *Validated by plugin-dev + codeflash-agent checks*
+            </step>
+          claude_args: '--model us.anthropic.claude-sonnet-4-6 --allowedTools "Agent,Read,Glob,Grep,Bash(gh pr diff:*),Bash(gh pr view:*),Bash(git diff*),Bash(git log*),Bash(git status*)"'
+
+  claude-mention:
+    concurrency:
+      group: claude-mention-${{ github.event.issue.number || github.event.pull_request.number || github.run_id }}
+      cancel-in-progress: false
+    if: |
+      (
+        github.event_name == 'issue_comment' &&
+        contains(github.event.comment.body, '@claude') &&
+        (github.event.comment.author_association == 'OWNER' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'COLLABORATOR')
+      ) ||
+      (
+        github.event_name == 'pull_request_review_comment' &&
+        contains(github.event.comment.body, '@claude') &&
+        (github.event.comment.author_association == 'OWNER' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'COLLABORATOR') &&
+        github.event.pull_request.head.repo.full_name == github.repository
+      ) ||
+      (
+        github.event_name == 'pull_request_review' &&
+        contains(github.event.review.body, '@claude') &&
+        (github.event.review.author_association == 'OWNER' || github.event.review.author_association == 'MEMBER' || github.event.review.author_association == 'COLLABORATOR') &&
+        github.event.pull_request.head.repo.full_name == github.repository
+      )
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+      pull-requests: write
+      issues: read
+      id-token: write
+    steps:
+      - name: Get PR head ref
+        id: pr-ref
+        env:
+          GH_TOKEN: ${{ github.token }}
+        run: |
+          if [ "${{ github.event_name }}" = "issue_comment" ]; then
+            PR_REF=$(gh api repos/${{ github.repository }}/pulls/${{ github.event.issue.number }} --jq '.head.ref')
+            echo "ref=$PR_REF" >> $GITHUB_OUTPUT
+          else
+            echo "ref=${{ github.event.pull_request.head.ref || github.head_ref }}" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          ref: ${{ steps.pr-ref.outputs.ref }}
+
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
+          aws-region: ${{ secrets.AWS_REGION }}
+
+      - name: Run Claude Code
+        uses: anthropics/claude-code-action@v1
+        with:
+          use_bedrock: "true"
+          plugins: "plugin-dev@claude-code-plugins"
+          claude_args: '--model us.anthropic.claude-sonnet-4-6 --allowedTools "Agent,Read,Edit,Write,Glob,Grep,Bash(git status*),Bash(git diff*),Bash(git add *),Bash(git commit *),Bash(git push*),Bash(git log*),Bash(gh pr comment*),Bash(gh pr view*),Bash(gh pr diff*)"'