# CLAUDE.md - Infinite Loop Variant 2: Rich Utility Commands Ecosystem
This file provides guidance to Claude Code when working with this variant of the Infinite Agentic Loop pattern.
## Project Overview
**Variant Name:** Infinite Loop Variant 2 - Rich Utility Commands Ecosystem
**Primary Innovation:** Chain-of-thought (CoT) prompting applied throughout a comprehensive ecosystem of utility commands that support the infinite loop orchestration pattern.
**Key Differentiator:** Every utility command uses explicit step-by-step reasoning, making orchestration, validation, testing, debugging, and reporting transparent, reproducible, and educational.
**Research Integration:** Implements chain-of-thought prompting techniques from [Prompting Guide - CoT](https://www.promptingguide.ai/techniques/cot), specifically:
- Problem decomposition into intermediate steps
- Explicit thinking through "Let's think step by step" pattern
- Transparent reasoning chains from inputs to conclusions
- Evidence-based decision making
## Architecture
### Command System (`.claude/commands/`)
**Core Orchestrator:**
- `infinite.md` - Main orchestration command with integrated CoT reasoning for agent deployment
**Utility Commands (7):**
1. **`analyze.md`** - Pattern and quality analysis with 6-step CoT process
2. **`validate-spec.md`** - Specification validation with 7-step CoT process
3. **`test-output.md`** - Output testing with 8-step CoT process
4. **`debug.md`** - Issue debugging with 7-step CoT process
5. **`status.md`** - Progress monitoring with 7-step CoT process
6. **`init.md`** - Setup wizard with 8-step CoT process
7. **`report.md`** - Report generation with 8-step CoT process
### Key Design Principles
**1. Explicit Reasoning Chains**
Every command includes a "Chain-of-Thought Process" section that:
- Lists numbered steps
- Defines what each step accomplishes
- Shows how steps connect logically
- Makes decision criteria transparent
**2. Systematic Execution**
Commands follow a consistent pattern (see the sketch after this list):
```
1. Understand context and scope
2. Collect relevant data systematically
3. Apply analysis or validation logic
4. Synthesize findings
5. Generate structured output
6. Provide actionable recommendations
```
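One way to picture this pattern is as a reasoning-chain record that each step appends to. The sketch below is illustrative only: the commands themselves are markdown prompts, not code, and names such as `ReasoningStep` are hypothetical.
```python
from dataclasses import dataclass, field

@dataclass
class ReasoningStep:
    """One step in a chain-of-thought process."""
    name: str        # e.g. "Collect relevant data systematically"
    question: str    # the question this step answers
    findings: str    # evidence gathered or conclusion reached

@dataclass
class ReasoningChain:
    """Ordered steps plus the synthesized recommendation."""
    command: str
    steps: list[ReasoningStep] = field(default_factory=list)
    recommendation: str = ""

    def add(self, name: str, question: str, findings: str) -> None:
        self.steps.append(ReasoningStep(name, question, findings))

# Example: a /status-style chain
chain = ReasoningChain(command="/status outputs/")
chain.add("Understand context", "What directory and target count are in play?", "outputs/, 20 planned iterations")
chain.add("Collect data", "How many iterations exist and pass tests?", "14 created, 12 passing")
chain.recommendation = "Continue generation; monitor the quality trend."
```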
**3. Evidence-Based Conclusions**
Every conclusion includes:
- The data it's based on
- The reasoning process
- Supporting evidence
- Expected impact of recommendations
**4. Reproducibility**
Anyone can verify conclusions by:
- Following the same steps
- Applying the same criteria
- Checking the same data sources
- Reproducing the calculation/analysis
## Command Usage Patterns
### Pre-Generation Phase
**Specification Creation and Validation:**
```bash
# For new users - interactive wizard
/init
# For spec validation before generation
/validate-spec specs/my_spec.md
# Strict validation (recommended for important generations)
/validate-spec specs/my_spec.md strict
```
**Why CoT Helps:** Validation shows exactly which spec requirements are vague, incomplete, or contradictory, with reasoning about WHY each matters for successful generation.
### Generation Phase
**Main Orchestration:**
```bash
# Single iteration
/project:infinite specs/my_spec.md outputs 1
# Small batch
/project:infinite specs/my_spec.md outputs 5
# Large batch
/project:infinite specs/my_spec.md outputs 20
# Infinite mode
/project:infinite specs/my_spec.md outputs infinite
```
**Why CoT Helps:** The orchestrator shows its reasoning for agent assignments, wave planning, and creative direction distribution.
**Monitoring During Generation:**
```bash
# Check status during long runs
/status outputs/
# Detailed status with trends
/status outputs/ detailed
```
**Why CoT Helps:** The status command shows the reasoning behind progress predictions, quality trends, and recommendations to continue or adjust.
### Post-Generation Phase
**Testing and Validation:**
```bash
# Test all outputs
/test-output outputs/ specs/my_spec.md
# Test specific dimension
/test-output outputs/ specs/my_spec.md quality
```
**Why CoT Helps:** Test failures include reasoning chains showing exactly where outputs deviate from the spec and why that deviation impacts quality.
**Analysis and Reporting:**
```bash
# Analyze patterns and quality
/analyze outputs/
# Generate comprehensive report
/report outputs/ specs/my_spec.md detailed
# Executive summary only
/report outputs/ specs/my_spec.md executive
```
**Why CoT Helps:** Analysis and reports show complete reasoning from data to insights, making all conclusions verifiable.
### Troubleshooting Phase
**When Issues Occur:**
```bash
# Debug specific problem
/debug "generation produced empty files" outputs/
# Debug quality issues
/debug "low uniqueness scores" outputs/
```
**Why CoT Helps:** Debug utility traces from symptom → hypothesis → evidence → root cause → solution, teaching users debugging methodology.
## Utility Integration Points
### How Utilities Support Each Other
**1. Init → Validate-Spec → Infinite**
```
/init creates spec → /validate-spec checks it → /infinite uses it
```
CoT flow: Setup reasoning → Validation reasoning → Orchestration reasoning
**2. Infinite → Status → Analyze**
```
/infinite generates → /status monitors → /analyze evaluates
```
CoT flow: Deployment reasoning → Progress reasoning → Pattern reasoning
**3. Test-Output → Debug → Report**
```
/test-output finds issues → /debug diagnoses → /report summarizes
```
CoT flow: Testing reasoning → Diagnostic reasoning → Synthesis reasoning
### Chain-of-Thought Consistency
All utilities follow consistent CoT patterns:
**Step Structure:**
- Each command breaks work into 5-8 major steps
- Each step has a clear purpose (question it answers)
- Steps flow logically (each builds on previous)
- Final step synthesizes into actionable output
**Reasoning Template:**
```markdown
### Step N: [Step Name]
[What question does this step answer?]
[Reasoning approach:]
1. [Sub-task 1]
2. [Sub-task 2]
3. [Sub-task 3]
[How this connects to next step]
```
**Output Structure:**
- Executive summary (for decision-makers)
- Detailed findings (for verification)
- Reasoning chains (for understanding)
- Actionable recommendations (for next steps)
## File Organization
### Specifications (`specs/`)
**Example Specification:** `example_spec.md`
- Demonstrates complete spec structure
- Shows how to integrate utility commands
- Includes a section explaining how the utility commands help
- Uses CoT principles in requirement definitions
**Spec Quality Standards:**
A validated spec should have the following (see the presence-check sketch after this list):
1. Clear purpose and success criteria
2. Explicit output structure requirements
3. Unambiguous naming conventions
4. Measurable quality standards
5. Well-defined uniqueness constraints
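As an illustration of checking these standards, here is a minimal presence-check sketch; the section names are assumptions, and the authoritative checks belong to `/validate-spec`.
```python
from pathlib import Path

# Hypothetical section names; the real required sections are defined by /validate-spec.
REQUIRED_SECTIONS = [
    "Purpose",
    "Output Structure",
    "Naming Convention",
    "Quality Standards",
    "Uniqueness Constraints",
]

def missing_sections(spec_path: str) -> list[str]:
    """Return the required section names that do not appear in the spec text."""
    text = Path(spec_path).read_text(encoding="utf-8").lower()
    return [name for name in REQUIRED_SECTIONS if name.lower() not in text]

# missing_sections("specs/my_spec.md") == [] means every required section appears at least by name.
```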
### Utilities (`utils/`)
**Quality Metrics:** `quality_metrics.json`
- Defines all quality dimensions
- Provides explicit calculation methods
- Sets clear thresholds (excellent/good/acceptable)
- Explains reasoning for weights and criteria
- Includes CoT application examples
**Key Metrics** (combined into one score in the sketch after this list):
- Completeness (25% weight) - All components present
- Technical Correctness (25% weight) - No syntax/logic errors
- Spec Compliance (25% weight) - Meets requirements
- Uniqueness (15% weight) - Differs from other iterations
- Innovation (10% weight) - Creative/novel approach
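The weights above combine into a single 0-100 score. A minimal sketch of that combination, assuming each dimension is already scored 0-100; the authoritative weights and thresholds live in `utils/quality_metrics.json`.
```python
# Weights mirror the list above; the example scores are illustrative.
WEIGHTS = {
    "completeness": 0.25,
    "technical_correctness": 0.25,
    "spec_compliance": 0.25,
    "uniqueness": 0.15,
    "innovation": 0.10,
}

def overall_quality(scores: dict[str, float]) -> float:
    """Weighted average of per-dimension scores (each 0-100)."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

example = {
    "completeness": 90,
    "technical_correctness": 85,
    "spec_compliance": 88,
    "uniqueness": 75,
    "innovation": 70,
}
print(overall_quality(example))  # 84.0
```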
### Templates (`templates/`)
**Report Template:** `report_template.md`
- Standard structure for all reports
- Reasoning templates for each section
- CoT principles explained and demonstrated
- Examples of good vs poor reasoning chains
## Chain-of-Thought Benefits in Practice
### 1. Transparency Benefits
**Traditional Approach:**
```
"Quality is poor."
```
**CoT Approach:**
```
Quality assessment reasoning:
Step 1: Test results show 60% pass rate (12/20 passed)
Step 2: Target pass rate is 85% for acceptable quality
Step 3: Gap is 25 percentage points
Step 4: Failure pattern: 8 iterations missing interactive controls
Step 5: Root cause: Spec doesn't require interactivity explicitly
Step 6: Fix: Add requirement to spec section 4.3
Step 7: Expected impact: Pass rate → 90%+
Conclusion: Quality below standard due to spec gap. Fix spec and regenerate.
```
**Benefit:** Users understand not just the conclusion but the complete reasoning path.
### 2. Reproducibility Benefits
**Traditional Approach:**
```
"Themes are repetitive."
```
**CoT Approach:**
```
Theme diversity analysis:
Step 1: Count unique themes = 8 themes across 20 iterations
Step 2: Calculate distribution = [4, 4, 3, 2, 2, 2, 2, 1]
Step 3: Compute evenness = Shannon index 0.82
Step 4: Compare to target = 0.90 for "high diversity"
Step 5: Identify overuse = Bar charts (4x) and line graphs (4x)
Step 6: Gap analysis = 0.08 below target
Step 7: Recommend = Prioritize scatter plots, heatmaps, network graphs
Conclusion: Moderate diversity (0.82) due to bar chart/line graph overuse.
```
**Benefit:** Anyone can reproduce the analysis by following the same steps and criteria, as in the sketch below.
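For the diversity calculation above, here is a minimal sketch assuming theme counts have already been extracted; whether the utilities use a raw Shannon index or a normalized evenness, and the 0.90 target, are defined in `utils/quality_metrics.json`, so treat this as illustrative.
```python
import math

def shannon_evenness(counts: list[int]) -> float:
    """Shannon entropy of the theme distribution, normalized to [0, 1] by the log of theme count."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0
    return entropy / max_entropy

theme_counts = [4, 4, 3, 2, 2, 2, 2, 1]  # distribution from the example above
print(round(shannon_evenness(theme_counts), 2))
```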
### 3. Debuggability Benefits
**Traditional Approach:**
```
"Something went wrong."
```
**CoT Approach:**
```
Debugging process:
Step 1: Symptom = No output files created
Step 2: Hypothesis A = Spec unreadable → Test: Spec reads OK → Eliminated
Step 3: Hypothesis B = Permissions issue → Test: chmod 755 → Confirmed
Step 4: Root cause = Directory created read-only
Step 5: Immediate fix = chmod 755 outputs/
Step 6: Verify = Regeneration succeeds
Step 7: Prevent = Update /init to set permissions
Conclusion: Permission issue fixed. Prevention added to setup wizard.
```
**Benefit:** Users see diagnostic reasoning and learn troubleshooting methodology.
### 4. Educational Benefits
**Traditional Approach:**
```
"Use scatter plots next."
```
**CoT Approach:**
```
Next iteration recommendation reasoning:
Step 1: Review used techniques = 5 bar, 4 line, 3 pie, 3 area, 2 donut, 2 gauge, 1 heatmap
Step 2: Identify unused = Scatter (0), bubble (0), network (0), treemap (0), sankey (0)
Step 3: Consider data fit = Current data works well with scatter plots
Step 4: Check learning curve = Scatter plots good intermediate difficulty
Step 5: Verify uniqueness = Would be first scatter plot (100% unique)
Step 6: Estimate quality = High (common technique, good documentation)
Recommendation: Next iteration should use scatter plot because: (1) unused technique (uniqueness), (2) fits current data, (3) appropriate difficulty, (4) well-documented (quality). Expected quality score: 85-90/100.
```
**Benefit:** Users learn the selection reasoning and can apply the same logic independently; a sketch of the unused-technique check follows.
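Steps 1-2 of that recommendation reduce to a simple unused-technique check; steps 3-5 (data fit, difficulty, quality estimate) remain judgment calls made in the reasoning chain. A minimal sketch, with a hypothetical catalog:
```python
# Hypothetical technique catalog; the real catalog comes from the spec.
CATALOG = ["bar", "line", "pie", "area", "donut", "gauge", "heatmap",
           "scatter", "bubble", "network", "treemap", "sankey"]

def unused_techniques(used_counts: dict[str, int]) -> list[str]:
    """Return catalog techniques that no iteration has used yet."""
    return [t for t in CATALOG if used_counts.get(t, 0) == 0]

used = {"bar": 5, "line": 4, "pie": 3, "area": 3, "donut": 2, "gauge": 2, "heatmap": 1}
print(unused_techniques(used))  # ['scatter', 'bubble', 'network', 'treemap', 'sankey']
```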
## Best Practices for Using This Variant
### 1. Trust but Verify
**Do:**
- Follow the reasoning chains provided by utilities
- Verify conclusions by checking the evidence cited
- Reproduce calculations to confirm accuracy
- Challenge conclusions that don't seem right
**Why:** CoT makes verification possible. Use it.
### 2. Learn from the Reasoning
**Do:**
- Read the step-by-step processes in utility outputs
- Understand WHY each step is necessary
- Note what criteria are used for decisions
- Apply the same reasoning to similar problems
**Why:** Utilities teach methodology, not just provide answers.
### 3. Start with Validation
**Do:**
- Always run `/validate-spec` before generation
- Use strict mode for important generations
- Fix warnings, not just critical issues
- Validate again after spec changes
**Why:** CoT validation catches problems early when they're easy to fix.
### 4. Use Utilities Proactively
**Do:**
- Run `/status` during long generations
- Run `/analyze` after each wave in infinite mode
- Run `/test-output` immediately after generation
- Run `/report` at the end for documentation
**Why:** CoT reasoning helps you adjust course before problems compound.
### 5. Debug Systematically
**Do:**
- Run `/debug` when issues occur
- Follow the hypothesis-testing approach shown
- Document root causes and solutions
- Update specs to prevent recurrence
**Why:** CoT debugging teaches you how to fish rather than just handing you a fish.
## Quality Assurance
### Specification Quality
**Minimum Requirements:**
- All 5 required sections present and complete
- Naming pattern unambiguous with examples
- Quality standards measurable and specific
- Uniqueness constraints clearly defined
**Validation:**
```bash
/validate-spec specs/my_spec.md strict
```
**Pass Criteria:**
- No critical issues
- No warnings (in strict mode)
- All sections rated "Complete" or "Excellent"
- Executability assessment: "Can execute"
### Output Quality
**Minimum Requirements** (applied as a gate in the sketch after this list):
- Pass rate ≥ 85% (at least 17/20 for a batch of 20)
- Average quality score ≥ 80/100
- Uniqueness score ≥ 70 per iteration
- No critical issues in any iteration
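A minimal sketch of applying these requirements as a gate to a batch summary; the thresholds come from the list above, while the summary fields are assumptions.
```python
def meets_output_quality(passed: int, total: int, avg_quality: float,
                         min_uniqueness: float, critical_issues: int) -> bool:
    """Apply the minimum requirements above to a batch summary."""
    return (
        total > 0
        and passed / total >= 0.85    # pass rate >= 85%
        and avg_quality >= 80         # average quality score >= 80/100
        and min_uniqueness >= 70      # lowest per-iteration uniqueness >= 70
        and critical_issues == 0      # no critical issues in any iteration
    )

# Example: 18/20 passed, average quality 86, lowest uniqueness 72, no critical issues.
print(meets_output_quality(18, 20, 86, 72, 0))  # True
```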
**Testing:**
```bash
/test-output outputs/ specs/my_spec.md
```
**Pass Criteria:**
- Structural tests: 100% pass
- Content tests: ≥ 90% pass
- Quality tests: ≥ 85% pass
- No critical failures
### Process Quality
**Indicators of Good Process:**
- Spec validated before generation
- First wave tested before continuing
- Status monitored during long runs
- Issues debugged and documented
- Final report generated and reviewed
**Red Flags:**
- Skipping validation step
- Generating full batch without testing
- Ignoring warnings or quality signals
- Not debugging failures
- No post-generation analysis
## Extending This Variant
### Adding New Utility Commands
**Process:**
1. Identify utility purpose (what problem does it solve?)
2. Design CoT process (5-8 major steps)
3. Define reasoning approach for each step
4. Create output structure with reasoning sections
5. Add usage examples showing benefits
6. Document integration with existing utilities
**Template:**
See "Contributing and Extending" section in README.md
**Quality Criteria:**
- Clear CoT process with 5-8 steps
- Each step has defined purpose and reasoning
- Output includes executive summary + detailed reasoning
- Examples demonstrate CoT benefits
- Integrates with existing utilities
### Customizing for Different Domains
**To adapt to different content types:**
1. Update `example_spec.md` with domain-specific requirements
2. Update `quality_metrics.json` with domain-specific metrics
3. Update `report_template.md` with domain-specific analysis sections
4. Keep CoT reasoning structure intact (transparency remains valuable)
**Example domains:**
- Code generation (components, functions, modules)
- Documentation (guides, tutorials, API docs)
- Data visualizations (charts, dashboards, infographics)
- UI components (React, Vue, web components)
- Scientific content (analyses, visualizations, reports)
## Common Workflows
### First-Time User Workflow
```bash
# 1. Interactive setup
/init
# Follow wizard prompts:
# - Answer questions about generation goals
# - Review generated spec
# - Observe test generation
# - Learn utility commands
# - Get customized workflow
# 2. Generate first real batch
/project:infinite specs/user_spec.md outputs 5
# 3. Review with utilities
/test-output outputs/ specs/user_spec.md
/analyze outputs/
# 4. Generate report for documentation
/report outputs/ specs/user_spec.md summary
```
### Experienced User Workflow
```bash
# 1. Create and validate spec
# (edit specs/my_spec.md)
/validate-spec specs/my_spec.md strict
# 2. Generate with monitoring
/project:infinite specs/my_spec.md outputs 20
/status outputs/ detailed # Check periodically
# 3. Test and analyze
/test-output outputs/ specs/my_spec.md
/analyze outputs/
# 4. Debug if needed
/debug "description of issue" outputs/
# 5. Generate final report
/report outputs/ specs/my_spec.md detailed
```
### Production Workflow
```bash
# 1. Strict validation
/validate-spec specs/production_spec.md strict
# Fix ALL issues, not just the critical ones
# 2. Test run first
/project:infinite specs/production_spec.md test_outputs 5
/test-output test_outputs/ specs/production_spec.md
# Verify 100% pass rate
# 3. Full generation with checkpoints
/project:infinite specs/production_spec.md prod_outputs 20
/status prod_outputs/ detailed # After wave 1
/analyze prod_outputs/ # After wave 2
/test-output prod_outputs/ specs/production_spec.md # After wave 4
# 4. Comprehensive review
/report prod_outputs/ specs/production_spec.md technical
# Review technical report thoroughly
# 5. Archive and document
# Move to permanent location
# Keep report for documentation
```
## Troubleshooting Guide
### Issue: "Too much reasoning, hard to find the answer"
**Solution:** Use summary modes
```bash
/status outputs/ summary
/report outputs/ specs/my_spec.md executive
```
### Issue: "Reasoning chain seems wrong"
**Solution:** Debug the reasoning
```bash
/debug "validation said spec is complete but section 4 is missing" specs/my_spec.md
```
### Issue: "Can't reproduce the analysis results"
**Solution:** Check for data changes
```bash
# Re-run the analysis to check whether results are consistent
/analyze outputs/
# Check if files changed since last analysis
ls -lt outputs/
```
### Issue: "Utilities give conflicting recommendations"
**Solution:** Use debug to understand why
```bash
/debug "analyze recommends X but test-output recommends Y" outputs/
```
## Performance Considerations
### Large Batches (50+ iterations)
**Recommendations:**
- Use `/status` rather than `/analyze` to monitor progress (`/status` is lighter weight)
- Run `/analyze` only after each wave completes, not after each iteration
- Use `/test-output` on samples (first 10, last 10) rather than all iterations
- Generate `/report` once at end, not during generation
### Infinite Mode
**Recommendations:**
- Set up periodic `/status` checks (every 5-10 iterations)
- Run `/analyze` after each wave to detect theme exhaustion
- Monitor quality trends to detect degradation
- Plan stopping criteria in advance (iteration count, quality threshold, time limit); a stopping-check sketch follows this list
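A minimal sketch of a pre-planned stopping check built from the three criteria above; the field names and limits are illustrative.
```python
import time

def should_stop(iterations_done: int, recent_quality: list[float], started_at: float, *,
                max_iterations: int = 100, min_quality: float = 80.0, max_hours: float = 4.0) -> bool:
    """Stop on iteration count, quality degradation, or elapsed time."""
    if iterations_done >= max_iterations:
        return True
    if recent_quality and sum(recent_quality) / len(recent_quality) < min_quality:
        return True
    if (time.time() - started_at) / 3600 >= max_hours:
        return True
    return False

# Example: 42 iterations done, last wave averaged ~84.5, run started 2.5 hours ago.
print(should_stop(42, [86, 84, 83.5], time.time() - 2.5 * 3600))  # False
```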
### Resource Optimization
**Disk Space:**
- Monitor with `/status outputs/ detailed`
- Archive old iterations before starting new batches
- Use summary modes to reduce log file sizes
**Context Usage:**
- CoT increases token usage (more detailed outputs)
- Balance detail level with context limits
- Use summary modes for routine checks
- Use detailed modes for important decisions
## Key Differentiators from Other Variants
### vs. Base Infinite Loop Pattern
**Base:** Orchestration without a utility ecosystem
**This Variant:** Rich utilities with CoT reasoning at every step
**Benefit:** Complete transparency and support throughout the entire lifecycle
### vs. Web-Enhanced Variant
**Web-Enhanced:** Progressive learning from web resources
**This Variant:** Progressive learning from reasoning chains
**Benefit:** Self-contained knowledge that builds user competency
### vs. Future Variants
**This variant excels when:**
- Transparency and explainability are critical
- Users need to verify and trust conclusions
- Teaching/learning is an important goal
- Debugging and troubleshooting are frequent
- Reproducibility and auditability matter
**Other variants may excel when:**
- Raw generation speed is priority
- Output volume matters more than process understanding
- Users are experts who don't need reasoning shown
- Context limits require minimal token usage
## Success Metrics
### How to Know This Variant is Working Well
**Process Indicators:**
- Users running `/validate-spec` before generation (good practice adoption)
- Users citing reasoning chains when discussing results (understanding)
- Users reproducing analyses independently (learning transfer)
- Users debugging issues systematically (skill development)
**Quality Indicators:**
- Spec validation pass rate ≥ 90% (specs improving)
- First-wave test pass rate ≥ 85% (fewer iterations wasted)
- Issue resolution time decreasing (debugging skills improving)
- Repeat issues decreasing (prevention working)
**Outcome Indicators:**
- Generated iteration quality ≥ 85/100 average
- User satisfaction with utility transparency
- Reduced need for manual intervention
- Increased user competency over time
## Contact and Support
**For issues with this variant:**
- Check README.md for usage examples
- Run `/debug` with description of issue
- Review CoT reasoning chains to understand behavior
- Verify the spec with `/validate-spec` in strict mode
**For general infinite loop questions:**
- See parent project CLAUDE.md
- Review base pattern documentation
- Compare with other variants
---
**Variant Version:** 1.0
**Last Updated:** 2025-10-10
**Chain-of-Thought Research:** [Prompting Guide](https://www.promptingguide.ai/techniques/cot)
**Generated By:** Claude Code (claude-sonnet-4-5)