# Self-Test - System Validation and Testing

**Purpose:** Validate system capabilities, test recent improvements, and ensure everything works as expected.

## Usage

```bash
/self-test [scope] [depth]
```

### Parameters

- `scope`: What to test - "commands", "generation", "improvement", "integration", "all" (default: "all")
- `depth`: Test thoroughness - "smoke", "standard", "comprehensive" (default: "standard")

### Examples

```bash
# Quick smoke test of all systems
/self-test all smoke

# Standard test of generation capabilities
/self-test generation standard

# Comprehensive test of the improvement system
/self-test improvement comprehensive

# Test command integration
/self-test integration standard
```

## Command Implementation

You are the **Self-Testing System**. Your role is to validate that all components work correctly and that improvements haven't broken existing functionality.

### Phase 1: Test Planning

1. **Determine Test Scope**

   Based on `{{scope}}` and `{{depth}}`:

   **Smoke testing (quick validation):**
   - Basic functionality checks
   - Critical-path verification
   - 5-10 minute execution
   - Pass/fail only

   **Standard testing (thorough validation):**
   - Full functionality coverage
   - Quality metrics measurement
   - 15-30 minute execution
   - Detailed results

   **Comprehensive testing (complete validation):**
   - All edge cases
   - Performance benchmarking
   - Integration testing
   - Regression detection
   - 45-90 minute execution
   - Full diagnostics

2. **Load Test Definitions**

   Read test specifications from `specs/test_definitions.md` (auto-generated if missing).
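The scope/depth resolution in Phase 1 can be sketched as a small dispatch table. This is a minimal sketch: the test names and time budgets below are illustrative assumptions, not a fixed interface.

```python
# Illustrative mapping from scope to test names (assumed, not canonical).
TESTS_BY_SCOPE = {
    "commands": ["infinite-meta", "improve-self", "generate-spec",
                 "evolve-strategy", "self-test", "self-document"],
    "generation": ["quality-baseline", "progressive-improvement"],
    "improvement": ["self-improvement-loop", "meta-prompting"],
    "integration": ["multi-command-workflow", "regression-detection"],
}

# Upper end of each depth's execution window, in minutes.
DEPTH_BUDGET_MINUTES = {"smoke": 10, "standard": 30, "comprehensive": 90}

def plan_tests(scope="all", depth="standard"):
    """Resolve a scope/depth pair into a concrete test plan."""
    if scope == "all":
        names = [name for tests in TESTS_BY_SCOPE.values() for name in tests]
    else:
        names = list(TESTS_BY_SCOPE[scope])
    return {"tests": names, "budget_minutes": DEPTH_BUDGET_MINUTES[depth]}
```

With this shape, `/self-test generation smoke` would run two tests inside a 10-minute budget, while the default `/self-test all standard` runs the full suite.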
### Phase 2: Command Testing

3. **Test /infinite-meta Command**

   ```markdown
   ## Test: infinite-meta Basic Functionality

   **Setup:**
   - Use test spec: specs/example_spec.md
   - Generate 3 iterations
   - Output to: test_output/meta_test_{{timestamp}}/

   **Expected Behavior:**
   - Reads spec correctly
   - Generates 3 unique outputs
   - Creates improvement logs
   - No errors

   **Validation:**
   - [ ] All 3 files created
   - [ ] Files follow spec requirements
   - [ ] Improvement log exists
   - [ ] Files are unique (similarity < 70%)
   - [ ] Quality score ≥ 7.0

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

4. **Test /improve-self Command**

   ```markdown
   ## Test: improve-self Analysis

   **Setup:**
   - Run: /improve-self all quick

   **Expected Behavior:**
   - Analyzes current system state
   - Generates improvement proposals (3-10)
   - Creates actionable recommendations
   - Applies meta-prompting principles

   **Validation:**
   - [ ] Analysis completes without errors
   - [ ] Proposals are concrete and actionable
   - [ ] Risk assessments included
   - [ ] Meta-prompting principles evident
   - [ ] Output format is correct

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

5. **Test /generate-spec Command**

   ```markdown
   ## Test: generate-spec Creation

   **Setup:**
   - Run: /generate-spec patterns novel test_spec

   **Expected Behavior:**
   - Analyzes patterns successfully
   - Generates valid specification
   - Creates companion files
   - Spec is actually usable

   **Validation:**
   - [ ] Spec file created with correct structure
   - [ ] Guide file generated
   - [ ] Examples file generated
   - [ ] Spec follows meta-prompting principles
   - [ ] Validation test passes

   **Bonus Validation:**
   - [ ] Use generated spec with /infinite-meta
   - [ ] Verify it produces valid output

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```
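The uniqueness criterion used in these command tests (pairwise similarity below 70%) can be sketched with a character-level comparison. `difflib` is one possible measure here, not necessarily what the system itself uses:

```python
import difflib
from itertools import combinations

def max_pairwise_similarity(texts):
    """Highest similarity ratio (0.0-1.0) between any two generated outputs."""
    return max((difflib.SequenceMatcher(None, a, b).ratio()
                for a, b in combinations(texts, 2)), default=0.0)

def outputs_are_unique(texts, threshold=0.70):
    """Uniqueness check passes when no pair reaches the similarity threshold."""
    return max_pairwise_similarity(texts) < threshold
```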
6. **Test /evolve-strategy Command**

   ```markdown
   ## Test: evolve-strategy Evolution

   **Setup:**
   - Run: /evolve-strategy quality incremental

   **Expected Behavior:**
   - Analyzes current strategy
   - Generates evolution proposals
   - Creates implementation plan
   - Includes rollback procedures

   **Validation:**
   - [ ] Strategy analysis is thorough
   - [ ] Evolution proposals are concrete
   - [ ] Risk assessment included
   - [ ] Validation plan exists
   - [ ] Meta-level insights present

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

7. **Test /self-test Command** (meta!)

   ```markdown
   ## Test: self-test Self-Testing

   **Setup:**
   - This test is self-referential!

   **Expected Behavior:**
   - Can test itself
   - Detects if the testing framework is broken
   - Meta-awareness demonstrated

   **Validation:**
   - [ ] This test runs successfully
   - [ ] Can detect test failures
   - [ ] Generates valid test reports
   - [ ] Shows meta-level capability

   **Result:** [PASS/FAIL]
   **Notes:** [This is meta-testing - testing the tester]
   ```

8. **Test /self-document Command**

   ```markdown
   ## Test: self-document Documentation

   **Setup:**
   - Run: /self-document all standard

   **Expected Behavior:**
   - Analyzes system state
   - Generates/updates documentation
   - Reflects current capabilities
   - Clear and accurate

   **Validation:**
   - [ ] README.md updated/created
   - [ ] Documentation is accurate
   - [ ] Examples are current
   - [ ] Reflects latest improvements
   - [ ] Meta-capabilities documented

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```
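The "testing the tester" idea in step 7 can be made concrete with a minimal sketch: a runner that records PASS/FAIL, and a meta-test that verifies the runner flags a deliberately broken check. The function names are hypothetical:

```python
def run_checks(checks):
    """Run named check callables; a raised exception counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "PASS" if check() else "FAIL"
        except Exception:
            results[name] = "FAIL"
    return results

def meta_test():
    """Testing the tester: the runner must flag a deliberately broken check."""
    probe = run_checks({"should_pass": lambda: True,
                        "should_fail": lambda: False})
    return probe == {"should_pass": "PASS", "should_fail": "FAIL"}
```

If `meta_test()` itself fails, the testing framework is broken and none of the other PASS/FAIL results can be trusted.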
### Phase 3: Generation Testing

9. **Test Content Generation Quality**

   ```markdown
   ## Test: Generation Quality Baseline

   **Setup:**
   - Generate 5 iterations with specs/example_spec.md
   - Measure quality metrics

   **Expected Metrics:**
   - Quality avg: ≥ 7.5
   - Uniqueness: ≥ 80%
   - Spec compliance: 100%
   - Meta-awareness: ≥ 6.0

   **Validation:**
   - [ ] All files generated successfully
   - [ ] Quality meets threshold
   - [ ] No duplicates
   - [ ] Spec requirements met
   - [ ] Meta-reflection present

   **Result:** [PASS/FAIL]
   **Metrics:** [actual values]
   ```

10. **Test Progressive Improvement**

    ```markdown
    ## Test: Wave-Based Improvement

    **Setup:**
    - Run 3 waves of 5 iterations each
    - Track quality across waves
    - Enable improvement_mode = "evolve"

    **Expected Behavior:**
    - Quality improves across waves
    - Diversity maintained
    - Strategy evolves
    - Meta-insights accumulated

    **Validation:**
    - [ ] Wave 2 quality > Wave 1 quality
    - [ ] Wave 3 quality > Wave 2 quality
    - [ ] Diversity maintained (≥ 80%)
    - [ ] Strategy evolution documented
    - [ ] Improvement logs created

    **Result:** [PASS/FAIL]
    **Metrics:** [quality progression]
    ```

### Phase 4: Improvement System Testing

11. **Test Self-Improvement Loop**

    ```markdown
    ## Test: Complete Self-Improvement Cycle

    **Setup:**
    1. Run baseline generation (5 iterations)
    2. Run /improve-self to analyze
    3. Run /evolve-strategy to evolve
    4. Run improved generation (5 iterations)
    5. Compare results

    **Expected Behavior:**
    - Improvements detected
    - Strategy evolved
    - Second generation better than first
    - Changes are traceable

    **Validation:**
    - [ ] Improvement analysis complete
    - [ ] Strategy evolution applied
    - [ ] Measurable improvement in metrics
    - [ ] All changes logged
    - [ ] Rollback possible

    **Result:** [PASS/FAIL]
    **Improvement:** [% increase in quality]
    ```
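The wave-over-wave comparison in steps 10 and 11 reduces to comparing mean quality scores between runs. A minimal sketch, assuming scores on the 0-10 scale used above:

```python
from statistics import mean

def wave_quality_improves(waves):
    """True when mean quality strictly increases from each wave to the next."""
    averages = [mean(wave) for wave in waves]
    return all(later > earlier
               for earlier, later in zip(averages, averages[1:]))

def improvement_pct(before, after):
    """Percent change in mean quality between two generation runs."""
    return (mean(after) - mean(before)) / mean(before) * 100
```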
12. **Test Meta-Prompting Application**

    ```markdown
    ## Test: Meta-Prompting Principles

    **Setup:**
    - Review all generated content
    - Analyze for meta-prompting principles

    **Expected Behavior:**
    - Structure-oriented approaches evident
    - Abstract frameworks used
    - Minimal example dependency
    - Efficient reasoning patterns

    **Validation:**
    - [ ] Commands use structural patterns
    - [ ] Specs define frameworks, not examples
    - [ ] Reasoning is principle-based
    - [ ] Meta-awareness present throughout
    - [ ] Self-improvement capability demonstrated

    **Result:** [PASS/FAIL]
    **Notes:** [specific examples]
    ```

### Phase 5: Integration Testing

13. **Test Command Integration**

    ```markdown
    ## Test: Multi-Command Workflow

    **Setup:**
    1. /generate-spec patterns novel workflow_test
    2. /infinite-meta workflow_test.md output/ 5 evolve
    3. /improve-self all standard
    4. /evolve-strategy quality incremental
    5. /infinite-meta workflow_test.md output/ 5 evolve
    6. /self-document all standard

    **Expected Behavior:**
    - All commands work together
    - Data flows between commands
    - Improvements compound
    - System remains stable

    **Validation:**
    - [ ] All steps complete successfully
    - [ ] Generated spec is usable
    - [ ] Improvements detected and applied
    - [ ] Second generation better than first
    - [ ] Documentation reflects changes
    - [ ] No data corruption or errors

    **Result:** [PASS/FAIL]
    **Notes:** [integration observations]
    ```
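The regression test that follows compares current metrics against a stored baseline, tolerating at most a 10% drop per metric. A minimal sketch of that comparison, with illustrative metric names:

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value fell more than `tolerance`
    (as a fraction) below the stored baseline value."""
    return {name: current[name] for name in baseline
            if current[name] < baseline[name] * (1 - tolerance)}
```

An empty result means no regressions; any returned entries should appear under **Regressions** in the test report.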
14. **Test Regression Detection**

    ```markdown
    ## Test: No Regressions from Improvements

    **Setup:**
    - Load baseline metrics from improvement_log/baseline.json
    - Run current system with same test cases
    - Compare results

    **Expected Behavior:**
    - No metric should regress >10%
    - Most metrics should improve
    - Quality maintains or increases

    **Validation:**
    - [ ] Quality: No regression
    - [ ] Efficiency: No regression
    - [ ] Diversity: No regression
    - [ ] Meta-awareness: No regression
    - [ ] Overall: Net improvement

    **Result:** [PASS/FAIL]
    **Regressions:** [list any]
    **Improvements:** [list all]
    ```

### Phase 6: Reporting

15. **Generate Test Report**

    Create `improvement_log/test_report_{{timestamp}}.md`:

    ```markdown
    # Self-Test Report - {{timestamp}}

    **Scope:** {{scope}}
    **Depth:** {{depth}}
    **Duration:** {{duration}} minutes

    ## Executive Summary

    - Total Tests: {{total}}
    - Passed: {{passed}} ({{pass_rate}}%)
    - Failed: {{failed}}
    - Warnings: {{warnings}}

    ## Overall Status: [PASS/FAIL/WARNING]

    ## Test Results by Category

    ### Command Tests ({{n}} tests)

    | Command | Status | Notes |
    |---------|--------|-------|
    | /infinite-meta | PASS/FAIL | ... |
    | /improve-self | PASS/FAIL | ... |
    | /generate-spec | PASS/FAIL | ... |
    | /evolve-strategy | PASS/FAIL | ... |
    | /self-test | PASS/FAIL | ... |
    | /self-document | PASS/FAIL | ... |

    ### Generation Tests ({{n}} tests)

    | Test | Status | Metrics |
    |------|--------|---------|
    | Quality Baseline | PASS/FAIL | 7.8 avg |
    | Progressive Improvement | PASS/FAIL | +12% |

    ### Improvement Tests ({{n}} tests)

    | Test | Status | Impact |
    |------|--------|--------|
    | Self-Improvement Loop | PASS/FAIL | +15% quality |
    | Meta-Prompting | PASS/FAIL | Evident |

    ### Integration Tests ({{n}} tests)

    | Test | Status | Notes |
    |------|--------|-------|
    | Multi-Command Workflow | PASS/FAIL | ... |
    | Regression Detection | PASS/FAIL | ... |

    ## Failed Tests (if any)

    [Detailed analysis of failures]

    ## Performance Metrics

    - Avg generation time: {{time}}s
    - Context usage: {{context}}%
    - Quality score: {{quality}}
    - Improvement rate: {{rate}}%

    ## Recommendations

    [What should be fixed or improved based on test results]

    ## Meta-Insights

    [What testing reveals about system capabilities and improvement areas]

    ## Next Steps

    1. [Action item 1]
    2. [Action item 2]
    ```

16. **Update Health Dashboard**

    Create/update `improvement_log/system_health.json`:

    ```json
    {
      "last_test": "{{timestamp}}",
      "overall_status": "healthy|degraded|critical",
      "test_pass_rate": 0.95,
      "quality_trend": "improving|stable|declining",
      "metrics": {
        "quality_avg": 8.2,
        "efficiency": 0.82,
        "diversity": 0.88,
        "meta_awareness": 0.75
      },
      "issues": [
        "Minor: Edge case in /generate-spec with empty patterns"
      ],
      "recent_improvements": [
        "Quality +12% since last test",
        "Efficiency +8% since last test"
      ]
    }
    ```

## Meta-Testing Pattern

This command tests itself using meta-prompting:

```
STRUCTURE: Self-testing system

ABSTRACTION: Testing validates structure and patterns, not just functionality

REASONING:
1. Define test structure (framework)
2. Execute tests (pattern validation)
3. Analyze results (meta-evaluation)
4. Report findings (actionable insights)

SELF_REFLECTION:
- Do my tests validate structure or just surface functionality?
- Are test patterns generalizable to new capabilities?
- Can I detect meta-level issues?
- Does testing improve the system?

META_CAPABILITY:
Testing should improve testability. Each test run should enhance
test coverage. Failures should generate better tests.
```

## Success Criteria

A successful self-test:

1. Validates all critical functionality
2. Detects regressions if present
3. Measures key metrics accurately
4. Provides actionable recommendations
5. Shows meta-awareness (can test itself)
6. Generates clear, useful reports
7. Enables continuous improvement

---

*This command validates system capabilities using meta-prompting principles. It tests structure and patterns, not just functionality, enabling generalizable quality assurance and continuous improvement.*