# Infinite Loop Variant 2: Rich Utility Commands Ecosystem

**Variant Focus:** Chain-of-Thought Reasoning in Utility Commands
This variant extends the base Infinite Agentic Loop pattern with a comprehensive ecosystem of utility commands that leverage chain-of-thought (CoT) prompting to make orchestration, validation, and quality assurance transparent, reliable, and actionable.
## Key Innovation: Chain-of-Thought Utility Commands
Traditional utility tools often provide simple outputs without showing their reasoning. This variant applies chain-of-thought prompting principles to every utility command, making each tool:
- **Explicit in reasoning** - Shows the step-by-step thinking process
- **Transparent in methodology** - Documents how conclusions are reached
- **Reproducible in analysis** - Uses clear criteria anyone can verify
- **Actionable in guidance** - Gives specific recommendations with rationale
- **Educational in nature** - Teaches users the reasoning process
## What is Chain-of-Thought Prompting?
Chain-of-thought (CoT) prompting is a technique that improves AI output quality by eliciting explicit step-by-step reasoning. Instead of jumping directly to conclusions, CoT prompts guide the model to:
- **Break down complex problems** into intermediate reasoning steps
- **Show logical progression** from input to output
- **Make decision criteria transparent** so they can be verified
- **Enable debugging** by exposing the reasoning chain
- **Improve accuracy** through systematic thinking
**Research Source:** Prompting Guide - Chain-of-Thought

**Key Techniques Applied:**

- **Problem decomposition** - Complex tasks broken into steps
- **Explicit thinking** - Reasoning made visible through "Let's think through this step by step"
- **Intermediate steps** - Each phase documented before moving to the next
- **Reasoning validation** - Evidence provided for conclusions
## Utility Commands Ecosystem

### 1. `/analyze` - Iteration Analysis Utility

**Purpose:** Examine existing iterations for quality patterns, theme diversity, and improvement opportunities.

**Chain-of-Thought Process:**

- **Step 1: Define Analysis Scope** - What are we analyzing and why?
- **Step 2: Data Collection** - Systematically gather file and content data
- **Step 3: Pattern Recognition** - Identify themes, variations, quality indicators
- **Step 4: Gap Identification** - Determine what's missing or could improve
- **Step 5: Insight Generation** - Synthesize findings into actionable insights
- **Step 6: Report Formatting** - Present clearly with evidence

**Example Usage:**

```bash
# Analyze entire output directory
/analyze outputs/

# Focus on specific dimension
/analyze outputs/ themes
/analyze outputs/ quality
/analyze outputs/ gaps
```

**Output:** Comprehensive analysis report with quantitative metrics, pattern findings, gap identification, and specific recommendations.

**CoT Benefit:** Users see exactly how patterns were identified and why recommendations were made, enabling them to learn pattern recognition themselves.
### 2. `/validate-spec` - Specification Validation Utility

**Purpose:** Ensure specification files are complete, consistent, and executable before generation begins.

**Chain-of-Thought Process:**

- **Step 1: Preliminary Checks** - File exists, readable, correct format?
- **Step 2: Structural Validation** - All required sections present and complete?
- **Step 3: Content Quality Validation** - Each section substantive and clear?
- **Step 4: Executability Validation** - Can sub-agents work with this?
- **Step 5: Integration Validation** - Compatible with utilities and orchestrator?
- **Step 6: Issue Categorization** - Critical, warnings, or suggestions?
- **Step 7: Report Generation** - Structured findings with remediation

**Example Usage:**

```bash
# Standard validation
/validate-spec specs/my_spec.md

# Strict mode (enforce all best practices)
/validate-spec specs/my_spec.md strict

# Lenient mode (only critical issues)
/validate-spec specs/my_spec.md lenient
```

**Output:** Validation report with pass/fail status, categorized issues, and specific remediation steps for each problem.

**CoT Benefit:** Spec authors understand not just WHAT is wrong, but WHY it matters and HOW to fix it through explicit validation reasoning.
### 3. `/test-output` - Output Testing Utility

**Purpose:** Validate generated outputs against specification requirements and quality standards.

**Chain-of-Thought Process:**

- **Step 1: Understand Testing Context** - What, why, scope?
- **Step 2: Load Specification Requirements** - Extract testable criteria
- **Step 3: Collect Output Files** - Discover and organize systematically
- **Step 4: Execute Structural Tests** - Naming, structure, accessibility
- **Step 5: Execute Content Tests** - Sections, completeness, correctness
- **Step 6: Execute Quality Tests** - Standards, uniqueness, integration
- **Step 7: Aggregate Results** - Compile per-iteration and overall findings
- **Step 8: Generate Test Report** - Structured results with recommendations

**Example Usage:**

```bash
# Test all outputs
/test-output outputs/ specs/example_spec.md

# Test specific dimension
/test-output outputs/ specs/example_spec.md structural
/test-output outputs/ specs/example_spec.md content
/test-output outputs/ specs/example_spec.md quality
```

**Output:** Detailed test report with per-iteration results, pass/fail status for each test type, quality scores, and remediation guidance.

**CoT Benefit:** Failed tests include reasoning chains showing exactly where outputs deviate from specs and why it matters, enabling targeted fixes.
### 4. `/debug` - Debugging Utility

**Purpose:** Diagnose and troubleshoot issues with orchestration, agent coordination, and generation processes.

**Chain-of-Thought Process:**

- **Step 1: Symptom Identification** - What's wrong, when, expected vs actual?
- **Step 2: Context Gathering** - Command details, environment state, history
- **Step 3: Hypothesis Formation** - What could cause this? (5 categories)
- **Step 4: Evidence Collection** - Gather data to test each hypothesis
- **Step 5: Root Cause Analysis** - Determine underlying cause with evidence
- **Step 6: Solution Development** - Immediate fix, verification, prevention
- **Step 7: Debug Report Generation** - Document findings and solutions

**Example Usage:**

```bash
# Debug with issue description
/debug "generation producing empty files"

# Debug with context
/debug "quality issues in outputs" outputs/

# Debug orchestration problem
/debug "infinite loop not launching next wave"
```

**Output:** Debug report with problem summary, investigation process, root cause analysis with causation chain, solution with verification plan, and prevention measures.

**CoT Benefit:** The complete reasoning chain from symptom to root cause enables users to understand WHY problems occurred and HOW to prevent them, building debugging skills.
### 5. `/status` - Status Monitoring Utility

**Purpose:** Provide real-time visibility into generation progress, quality trends, and system health.

**Chain-of-Thought Process:**

- **Step 1: Determine Status Scope** - Detail level, time frame, aspects
- **Step 2: Collect Current State** - Progress, quality, system health
- **Step 3: Calculate Metrics** - Completion %, quality scores, performance
- **Step 4: Analyze Trends** - Progress, quality, performance trajectories
- **Step 5: Identify Issues** - Critical, warnings, informational
- **Step 6: Predict Outcomes** - Completion time, quality, resources
- **Step 7: Format Status Report** - At-a-glance to detailed

**Example Usage:**

```bash
# Check current status
/status outputs/

# Quick summary
/status outputs/ summary

# Detailed with trends
/status outputs/ detailed

# Historical comparison
/status outputs/ historical
```

**Output:** Status report with progress overview, detailed metrics, performance analysis, system health indicators, trend analysis, predictions, and recommendations.

**CoT Benefit:** Transparent metric calculations and trend reasoning enable users to understand the current state and make informed decisions about continuing or adjusting generation.
### 6. `/init` - Interactive Setup Wizard

**Purpose:** Guide new users through complete setup with a step-by-step wizard.

**Chain-of-Thought Process:**

- **Step 1: Welcome and Context Gathering** - Understand the user's situation
- **Step 2: Directory Structure Setup** - Create necessary directories
- **Step 3: Specification Creation** - Interview the user, guide spec writing
- **Step 4: First Generation Test** - Run a small test, validate results
- **Step 5: Utility Introduction** - Demonstrate each command
- **Step 6: Workflow Guidance** - Design a customized workflow
- **Step 7: Best Practices Education** - Share success principles
- **Step 8: Summary and Next Steps** - Recap and confirm readiness

**Example Usage:**

```bash
# Start interactive setup
/init
```

**Output:** Complete setup including directory structure, validated specification, test generation, utility demonstrations, customized workflow, and readiness confirmation.

**CoT Benefit:** Interactive reasoning guides users through decisions (Why this directory structure? Why these spec sections?), enabling them to understand the setup logic and customize effectively.
### 7. `/report` - Report Generation Utility

**Purpose:** Generate comprehensive quality and progress reports with analysis and recommendations.

**Chain-of-Thought Process:**

- **Step 1: Define Report Scope** - Purpose, audience, time period
- **Step 2: Data Collection** - Iterations, specs, tests, analysis
- **Step 3: Quantitative Analysis** - Calculate all metrics systematically
- **Step 4: Qualitative Assessment** - Evaluate content and patterns
- **Step 5: Comparative Analysis** - Spec compliance, historical, benchmarks
- **Step 6: Issue Identification** - Categorize problems by severity
- **Step 7: Insight Generation** - Synthesize findings into insights
- **Step 8: Report Formatting** - Structure for clarity and action

**Example Usage:**

```bash
# Standard report
/report outputs/ specs/example_spec.md

# Executive summary
/report outputs/ specs/example_spec.md executive

# Detailed technical report
/report outputs/ specs/example_spec.md technical

# Quick summary
/report outputs/ specs/example_spec.md summary
```

**Output:** Comprehensive report with executive summary, quantitative/qualitative analysis, comparative benchmarks, issues categorized by severity, insights, and prioritized recommendations.

**CoT Benefit:** Every conclusion includes a reasoning chain showing how data led to insights, making findings verifiable and teaching users analytical thinking.
## How Chain-of-Thought Improves Utility Reliability

### 1. Transparent Decision-Making

**Without CoT:**

> "Quality is poor. Fix the outputs."

**With CoT:**

```
Let's assess quality step by step:

Step 1: Test Results - 12/20 iterations passed (60% pass rate)
Step 2: Pass rate target is 85% for acceptable quality
Step 3: Gap is 25 percentage points below target
Step 4: Failed iterations missing interactive controls (8 cases) and documentation (4 cases)
Step 5: Root cause: Spec section 4.3 doesn't explicitly require interactivity
Step 6: Recommendation: Add "Interactive Controls" to required sections in spec
Step 7: Expected impact: Pass rate should increase to 90%+ after spec fix

Conclusion: Quality is below acceptable standards (60% vs 85% target) due to vague
spec guidance on required interactivity. Fix spec section 4.3 and regenerate failed
iterations.
```

**Benefit:** Users see the complete reasoning chain and can verify each step, building trust and understanding.
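
The arithmetic behind Steps 1-3 is simple enough to verify by hand. Here is a minimal Python sketch of that pass-rate gap check, assuming an 85% target and a boolean-per-iteration result list (both illustrative, not the utility's actual internals):

```python
# Minimal sketch of the pass-rate gap check in Steps 1-3.
# The 85% target and result format are assumptions for illustration.
TARGET_PASS_RATE = 85.0

def assess_pass_rate(results: list[bool]) -> str:
    passed, total = sum(results), len(results)
    pass_rate = 100.0 * passed / total
    gap = TARGET_PASS_RATE - pass_rate
    verdict = "acceptable" if gap <= 0 else "below target"
    return (f"{passed}/{total} passed ({pass_rate:.0f}%), "
            f"gap {gap:.0f} points, {verdict}")

# 12 of 20 passing reproduces the 60% / 25-point gap above.
print(assess_pass_rate([True] * 12 + [False] * 8))
```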
### 2. Reproducible Analysis

**Without CoT:**

> "Theme diversity is low."

**With CoT:**

```
Analyzing theme diversity:

Step 1: Count unique themes - Found 8 distinct themes across 20 iterations
Step 2: Calculate distribution - Theme frequencies: [4, 4, 3, 2, 2, 2, 2, 1]
Step 3: Calculate evenness - Shannon diversity index: 0.82 (scale 0-1)
Step 4: Compare to benchmark - Target diversity index: 0.90 for "high diversity"
Step 5: Identify pattern - Two themes (bar charts, line graphs) used 4x each, consuming 40% of iterations
Step 6: Calculate gap - 0.08 below target indicates moderate, not high, diversity

Conclusion: Theme diversity is moderate (0.82) due to overuse of bar charts (4x) and
line graphs (4x). Recommend exploring underused techniques: scatter plots, heatmaps,
network graphs.
```

**Benefit:** Anyone can reproduce the analysis by following the same steps and criteria, ensuring consistency.
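
As one way to reproduce Step 3, here is a Python sketch that computes Shannon entropy normalized to the 0-1 scale by its maximum (Pielou's evenness). The exact normalization the utility uses is an assumption, so its number may differ from the 0.82 shown above:

```python
import math

def normalized_shannon(frequencies: list[int]) -> float:
    """Shannon entropy of a theme distribution, scaled to 0-1 by the
    maximum entropy (a perfectly even spread over all themes)."""
    total = sum(frequencies)
    proportions = [f / total for f in frequencies if f > 0]
    entropy = -sum(p * math.log(p) for p in proportions)
    return entropy / math.log(len(proportions))

# Theme frequencies from Step 2 of the example above.
print(f"diversity index: {normalized_shannon([4, 4, 3, 2, 2, 2, 2, 1]):.2f}")
```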
### 3. Debuggable Reasoning

**Without CoT:**

> "Generation failed."

**With CoT:**

```
Debugging generation failure:

Step 1: Symptom - No output files created in outputs/ directory
Step 2: Hypothesis A - Spec file unreadable → Test: Read spec file → Result: File reads successfully, hypothesis eliminated
Step 3: Hypothesis B - Output directory permissions → Test: Check write permissions → Result: Permission denied on outputs/
Step 4: Root cause confirmed - Directory created with incorrect permissions (read-only)
Step 5: Immediate fix - Run: chmod 755 outputs/
Step 6: Verification - Retry generation → Files created successfully
Step 7: Prevention - Update /init command to set correct permissions during setup

Conclusion: Generation failed due to read-only permissions on output directory.
Fixed by chmod 755. Updated /init to prevent recurrence.
```

**Benefit:** The complete reasoning chain allows users to understand the diagnostic process and learn troubleshooting skills.
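
The prevention measure in Step 7 amounts to a pre-flight writability check. This hypothetical sketch shows one way `/init` could perform it; the path and the choice of mode 755 follow the example above, while the helper itself is an assumption:

```python
import os
import stat

def ensure_writable(path: str) -> None:
    """Create the output directory if needed and make sure it is
    writable, applying the fix from Step 5 (chmod 755) when it isn't."""
    os.makedirs(path, exist_ok=True)
    if not os.access(path, os.W_OK):
        # rwxr-xr-x, the equivalent of `chmod 755 outputs/`
        os.chmod(path, stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP
                       | stat.S_IROTH | stat.S_IXOTH)

ensure_writable("outputs/")
```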
### 4. Actionable Recommendations

**Without CoT:**

> "Improve iteration quality."

**With CoT:**

```
Quality improvement analysis:

Step 1: Identify weakness - Technical correctness scores average 72/100
Step 2: Analyze failure pattern - 80% of errors are HTML syntax errors (unclosed tags)
Step 3: Trace to root cause - Spec lacks HTML validation requirement
Step 4: Design intervention - Add HTML validation to spec quality standards
Step 5: Estimate impact - Should reduce syntax errors by ~70%, raising scores to 85/100
Step 6: Define action - Update spec section 5.2 with "Must pass HTML validator"
Step 7: Verification plan - Run /test-output after regeneration to confirm improvement

Recommendation: Add HTML validation requirement to spec section 5.2. This addresses
the root cause (no validation requirement) of the most common error pattern (unclosed
tags, 80% of issues). Expected improvement: technical correctness 72→85.
```

**Benefit:** Recommendations include reasoning chains showing WHY the action will work and HOW much improvement to expect, enabling confident decision-making.
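
The unclosed-tag pattern from Step 2 can be detected with Python's standard `html.parser`. This is a minimal sketch of such a check, not the "HTML validator" the spec would mandate; a real validator catches far more error classes:

```python
from html.parser import HTMLParser

# Tags that never take a closing tag in HTML.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "area",
             "base", "col", "embed", "source", "track", "wbr"}

class UnclosedTagChecker(HTMLParser):
    """Flags unclosed tags, the error pattern identified in Step 2."""
    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags
        self.unclosed = []   # tags closed implicitly by an outer end tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Tags opened after `tag` but never closed before it.
            while (top := self.stack.pop()) != tag:
                self.unclosed.append(top)

def unclosed_tags(html: str) -> list[str]:
    checker = UnclosedTagChecker()
    checker.feed(html)
    return checker.unclosed + checker.stack  # implicit + still open at EOF

print(unclosed_tags("<div><p>hello<br></div>"))  # ['p'] - <p> never closed
```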
## Complete Workflow Examples

### Small Batch Workflow (5 iterations)

```bash
# 1. Validate specification before starting
/validate-spec specs/my_spec.md
# Review validation report, fix any critical issues

# 2. Generate iterations
/project:infinite specs/my_spec.md outputs 5

# 3. Test outputs against spec
/test-output outputs/ specs/my_spec.md
# Review test results, note any failures

# 4. Analyze patterns and quality
/analyze outputs/
# Review analysis, understand themes used

# 5. Generate final report
/report outputs/ specs/my_spec.md summary
```

**CoT Benefit:** Each utility shows its reasoning, so you understand not just what's wrong, but why and how to fix it.
### Medium Batch Workflow (20 iterations)

```bash
# 1. Strict spec validation
/validate-spec specs/my_spec.md strict
# Fix all warnings and suggestions, not just critical issues

# 2. Generate first wave (5 iterations)
/project:infinite specs/my_spec.md outputs 5

# 3. Test and analyze first wave
/test-output outputs/ specs/my_spec.md
/analyze outputs/

# 4. Refine spec based on learnings
# Edit spec file if needed

# 5. Continue generation
/project:infinite specs/my_spec.md outputs 20

# 6. Monitor status periodically
/status outputs/ detailed

# 7. Final comprehensive report
/report outputs/ specs/my_spec.md detailed
```

**CoT Benefit:** Early-wave testing with reasoning chains catches spec issues before generating the full batch, saving time and improving quality.
### Infinite Mode Workflow (continuous)

```bash
# 1. Validate thoroughly before starting
/validate-spec specs/my_spec.md strict

# 2. Start infinite generation
/project:infinite specs/my_spec.md outputs infinite

# 3. Monitor status during generation
/status outputs/ summary
# (Run periodically to check progress)

# 4. Analyze after each wave completes
/analyze outputs/
# (Check theme diversity isn't exhausted)

# 5. If issues detected, debug
/debug "quality declining in later waves" outputs/

# 6. Stop when satisfied or context limits reached
# (Manual stop)

# 7. Generate comprehensive final report
/report outputs/ specs/my_spec.md technical
```

**CoT Benefit:** The status and analyze commands show reasoning about trends, enabling early detection of quality degradation with clear explanations of WHY.
## Directory Structure

```
infinite_variant_2/
├── .claude/
│   ├── commands/
│   │   ├── infinite.md        # Main orchestrator with CoT
│   │   ├── analyze.md         # Analysis utility with CoT
│   │   ├── validate-spec.md   # Validation utility with CoT
│   │   ├── test-output.md     # Testing utility with CoT
│   │   ├── debug.md           # Debugging utility with CoT
│   │   ├── status.md          # Status utility with CoT
│   │   ├── init.md            # Setup wizard with CoT
│   │   └── report.md          # Reporting utility with CoT
│   └── settings.json          # Tool permissions
├── specs/
│   └── example_spec.md        # Example showing utility integration
├── utils/
│   └── quality_metrics.json   # Quality metric definitions with CoT
├── templates/
│   └── report_template.md     # Report template with CoT sections
├── README.md                  # This file
└── CLAUDE.md                  # Project instructions for Claude
```
## Key Benefits of This Variant

### 1. Transparency

Every utility command shows its reasoning process, making it clear HOW conclusions were reached and WHY recommendations are made.

### 2. Reliability

Chain-of-thought reasoning reduces errors by forcing systematic, step-by-step thinking instead of jumping to conclusions.

### 3. Debuggability

When something goes wrong, reasoning chains reveal exactly where in the process the issue occurred, enabling targeted fixes.

### 4. Educational Value

Users learn analytical and debugging skills by observing the reasoning process, building competency over time.

### 5. Reproducibility

Explicit criteria and methodologies enable anyone to reproduce analyses and verify conclusions independently.

### 6. Actionability

Recommendations include reasoning chains showing expected impact, enabling confident decision-making.
## Getting Started

### For New Users

```bash
# Start with the interactive setup wizard
/init
```

The wizard will:

- Ask about your generation goals
- Create the directory structure
- Guide you through writing a specification
- Run a test generation
- Demonstrate all utilities
- Design a customized workflow
- Teach best practices
### For Experienced Users

```bash
# 1. Create your specification in specs/

# 2. Validate it
/validate-spec specs/your_spec.md

# 3. Generate iterations
/project:infinite specs/your_spec.md outputs [count]

# 4. Use utilities as needed
/test-output outputs/ specs/your_spec.md
/analyze outputs/
/report outputs/ specs/your_spec.md
```
## Chain-of-Thought in Practice

### Example: Why `/analyze` Uses CoT

Traditional analysis might say:

> "You've used bar charts 5 times. Try something else."

With chain-of-thought reasoning:

```
Step 1: Count visualizations - Found 20 iterations using 8 unique types
Step 2: Calculate distribution - Bar charts: 5, Line graphs: 4, Pie charts: 3, Others: 8
Step 3: Identify pattern - Bar charts (25%) and line graphs (20%) account for 45% of iterations
Step 4: Compare to even distribution - Even distribution would be 12.5% per type (100% / 8)
Step 5: Calculate overuse - Bar charts 2x expected, line graphs 1.6x expected
Step 6: Identify underused - Scatter plots (0), heatmaps (1), network graphs (1)
Step 7: Recommend - Next 5 iterations should focus on underused types to balance distribution
Step 8: Predict impact - Would reduce overuse from 2x to 1.5x, increase diversity index from 0.78 to 0.88
```

**Result:** The user understands not just WHAT to do, but WHY it matters (distribution balance) and WHAT impact to expect (diversity improvement), enabling informed decisions.
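
The distribution math in Steps 2-5 reduces to comparing each type's observed share against an even split. Here is a small Python sketch of that calculation; the two unnamed "other" types and their counts are assumptions chosen so the totals match the example:

```python
def overuse_ratios(counts: dict[str, int]) -> dict[str, float]:
    """Each type's observed share divided by its even-distribution share."""
    total = sum(counts.values())
    expected_share = 1 / len(counts)  # 12.5% for 8 types
    return {t: (c / total) / expected_share for t, c in counts.items()}

# Counts from Step 2; the last two types are assumed placeholders
# so that the 8 types sum to the 20 iterations in the example.
counts = {"bar chart": 5, "line graph": 4, "pie chart": 3,
          "scatter plot": 0, "heatmap": 1, "network graph": 1,
          "treemap": 3, "sankey diagram": 3}
for viz, ratio in overuse_ratios(counts).items():
    print(f"{viz}: {ratio:.1f}x expected")  # bar chart: 2.0x, line graph: 1.6x, ...
```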
## Quality Metrics with CoT Reasoning

See `utils/quality_metrics.json` for complete metric definitions. Each metric includes:

- **Clear definition** - What is being measured
- **Explicit calculation** - How the score is computed
- **Transparent thresholds** - What constitutes excellent/good/acceptable/poor
- **Reasoning application** - How this metric fits into overall quality assessment
Example from the metrics file:

```json
{
  "completeness": {
    "description": "Measures whether all required components are present",
    "calculation": "present_components / required_components * 100",
    "thresholds": {
      "excellent": 100,
      "good": 90,
      "acceptable": 75
    },
    "reasoning": "Completeness is weighted at 25% because partial outputs have limited utility. A component missing critical sections fails to serve its purpose, regardless of other quality dimensions. This metric answers: 'Is everything required actually present?'"
  }
}
```
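
As a sketch of how a utility might consume such a definition, the following Python applies the `calculation` and `thresholds` fields from the entry above. The helper names and the example counts (9 of 10 components present) are assumptions, not the actual implementation:

```python
import json

def completeness_score(present: int, required: int) -> float:
    """The 'calculation' field above: present / required * 100."""
    return present / required * 100

def label(score: float, thresholds: dict) -> str:
    """Map a score to the highest threshold it meets, else 'poor'."""
    for name in ("excellent", "good", "acceptable"):
        if score >= thresholds[name]:
            return name
    return "poor"

with open("utils/quality_metrics.json") as f:
    metrics = json.load(f)

score = completeness_score(present=9, required=10)
print(score, label(score, metrics["completeness"]["thresholds"]))  # 90.0 good
```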
## Contributing and Extending

### Adding New Utility Commands

When creating new utilities, apply CoT principles:

- Start with "Let's think through this step by step"
- Break complex tasks into numbered steps
- Make decision criteria explicit
- Show intermediate reasoning
- Provide evidence for conclusions
- Make recommendations actionable
### Template for New Utility

```markdown
# New Utility - [Purpose]

## Chain-of-Thought Process

Let's think through [task] step by step:

### Step 1: [First Phase]
[Questions to answer]
[Reasoning approach]

### Step 2: [Second Phase]
[Questions to answer]
[Reasoning approach]

[Continue for all steps...]

## Execution Protocol

Now, execute the [task]:

1. [Step 1 action]
2. [Step 2 action]
...

Begin [task] with the provided arguments.
```
## Research and Learning

### Chain-of-Thought Resources

- **Primary Source:** Prompting Guide - Chain-of-Thought Techniques
- **Key Paper:** Wei et al. (2022) - "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models"
- **Application Guide:** This README's workflow examples
### Learning from the Utilities

Each utility command serves as both a functional tool AND a teaching resource:

- Read the commands in `.claude/commands/` to see the CoT structure
- Run the utilities and observe the reasoning process
- Compare outputs with traditional tools to see the transparency benefits
- Adapt the patterns to your own prompt engineering
## Troubleshooting

### "I don't understand the reasoning chain"

**Solution:** Break down the chain step by step. Each step should:

- State what question it's answering
- Show what data it's using
- Explain how it reaches its conclusion
- Connect to the next step

If a step doesn't meet these criteria, run `/debug` to identify the gap.
"Too much detail, just give me the answer"
Solution: Use summary modes:
/analyze outputs/ summary/status outputs/ summary/report outputs/ specs/my_spec.md executive
Summary modes provide conclusions upfront, with reasoning available if needed.
"Reasoning seems wrong"
Solution: The beauty of CoT is debuggability. If you disagree with a conclusion:
- Identify which step in the reasoning chain is wrong
- Check the data or criteria used in that step
- Run
/debugwith description of the issue - The debug utility will analyze its own reasoning process
## License and Attribution

- **Created as:** Infinite Loop Variant 2 - Part of the Infinite Agents project
- **Technique Source:** Chain-of-Thought prompting from Prompting Guide
- **Generated:** 2025-10-10
- **Generator:** Claude Code (claude-sonnet-4-5)
## Next Steps

1. **Try the setup wizard:** `/init` - Best for first-time users
2. **Validate a spec:** `/validate-spec specs/example_spec.md` - See CoT validation in action
3. **Generate a test batch:** `/project:infinite specs/example_spec.md test_outputs 3` - Quick test
4. **Analyze the results:** `/analyze test_outputs/` - Observe reasoning about patterns
5. **Generate a report:** `/report test_outputs/ specs/example_spec.md` - See comprehensive CoT analysis

**Remember:** The goal isn't just to generate iterations, but to understand the process through transparent, step-by-step reasoning. Every utility command is both a tool and a teacher.