# Self-Test - System Validation and Testing

**Purpose:** Validate system capabilities, test recent improvements, and ensure everything works as expected.

## Usage

```bash
/self-test [scope] [depth]
```

### Parameters

- `scope`: What to test - "commands", "generation", "improvement", "integration", "all" (default: "all")
- `depth`: Test thoroughness - "smoke", "standard", "comprehensive" (default: "standard")

### Examples

```bash
# Quick smoke test of all systems
/self-test all smoke

# Standard test of generation capabilities
/self-test generation standard

# Comprehensive test of the improvement system
/self-test improvement comprehensive

# Test command integration
/self-test integration standard
```

## Command Implementation

You are the **Self-Testing System**. Your role is to validate that all components work correctly and that improvements haven't broken existing functionality.

### Phase 1: Test Planning

1. **Determine Test Scope**

   Based on `{{scope}}` and `{{depth}}`:

   **Smoke testing (quick validation):**
   - Basic functionality checks
   - Critical-path verification
   - 5-10 minute execution
   - Pass/fail only

   **Standard testing (thorough validation):**
   - Full functionality coverage
   - Quality metrics measurement
   - 15-30 minute execution
   - Detailed results

   **Comprehensive testing (complete validation):**
   - All edge cases
   - Performance benchmarking
   - Integration testing
   - Regression detection
   - 45-90 minute execution
   - Full diagnostics

2. **Load Test Definitions**

   Read test specifications from `specs/test_definitions.md` (auto-generated if missing).
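The scope/depth resolution in Phase 1 can be sketched as a small dispatch table. This is a minimal sketch: the test names and time budgets below are illustrative assumptions, not a fixed interface.

```python
# Illustrative mapping from scope to test names (assumed, not canonical).
TESTS_BY_SCOPE = {
    "commands": ["infinite-meta", "improve-self", "generate-spec",
                 "evolve-strategy", "self-test", "self-document"],
    "generation": ["quality-baseline", "progressive-improvement"],
    "improvement": ["self-improvement-loop", "meta-prompting"],
    "integration": ["multi-command-workflow", "regression-detection"],
}

# Upper end of each depth's execution window, in minutes.
DEPTH_BUDGET_MINUTES = {"smoke": 10, "standard": 30, "comprehensive": 90}

def plan_tests(scope="all", depth="standard"):
    """Resolve a scope/depth pair into a concrete test plan."""
    if scope == "all":
        names = [name for tests in TESTS_BY_SCOPE.values() for name in tests]
    else:
        names = list(TESTS_BY_SCOPE[scope])
    return {"tests": names, "budget_minutes": DEPTH_BUDGET_MINUTES[depth]}
```

With this shape, `/self-test generation smoke` would run two tests inside a 10-minute budget, while the default `/self-test all standard` runs the full suite.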
### Phase 2: Command Testing

3. **Test /infinite-meta Command**

   ```markdown
   ## Test: infinite-meta Basic Functionality

   **Setup:**
   - Use test spec: specs/example_spec.md
   - Generate 3 iterations
   - Output to: test_output/meta_test_{{timestamp}}/

   **Expected Behavior:**
   - Reads spec correctly
   - Generates 3 unique outputs
   - Creates improvement logs
   - No errors

   **Validation:**
   - [ ] All 3 files created
   - [ ] Files follow spec requirements
   - [ ] Improvement log exists
   - [ ] Files are unique (similarity < 70%)
   - [ ] Quality score ≥ 7.0

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

4. **Test /improve-self Command**

   ```markdown
   ## Test: improve-self Analysis

   **Setup:**
   - Run: /improve-self all quick

   **Expected Behavior:**
   - Analyzes current system state
   - Generates improvement proposals (3-10)
   - Creates actionable recommendations
   - Applies meta-prompting principles

   **Validation:**
   - [ ] Analysis completes without errors
   - [ ] Proposals are concrete and actionable
   - [ ] Risk assessments included
   - [ ] Meta-prompting principles evident
   - [ ] Output format is correct

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

5. **Test /generate-spec Command**

   ```markdown
   ## Test: generate-spec Creation

   **Setup:**
   - Run: /generate-spec patterns novel test_spec

   **Expected Behavior:**
   - Analyzes patterns successfully
   - Generates valid specification
   - Creates companion files
   - Spec is actually usable

   **Validation:**
   - [ ] Spec file created with correct structure
   - [ ] Guide file generated
   - [ ] Examples file generated
   - [ ] Spec follows meta-prompting principles
   - [ ] Validation test passes

   **Bonus Validation:**
   - [ ] Use generated spec with /infinite-meta
   - [ ] Verify it produces valid output

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```
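The uniqueness criterion used in these command tests (pairwise similarity below 70%) can be sketched with a character-level comparison. `difflib` is one possible measure here, not necessarily what the system itself uses:

```python
import difflib
from itertools import combinations

def max_pairwise_similarity(texts):
    """Highest similarity ratio (0.0-1.0) between any two generated outputs."""
    return max((difflib.SequenceMatcher(None, a, b).ratio()
                for a, b in combinations(texts, 2)), default=0.0)

def outputs_are_unique(texts, threshold=0.70):
    """Uniqueness check passes when no pair reaches the similarity threshold."""
    return max_pairwise_similarity(texts) < threshold
```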
6. **Test /evolve-strategy Command**

   ```markdown
   ## Test: evolve-strategy Evolution

   **Setup:**
   - Run: /evolve-strategy quality incremental

   **Expected Behavior:**
   - Analyzes current strategy
   - Generates evolution proposals
   - Creates implementation plan
   - Includes rollback procedures

   **Validation:**
   - [ ] Strategy analysis is thorough
   - [ ] Evolution proposals are concrete
   - [ ] Risk assessment included
   - [ ] Validation plan exists
   - [ ] Meta-level insights present

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```

7. **Test /self-test Command** (meta!)

   ```markdown
   ## Test: self-test Self-Testing

   **Setup:**
   - This test is self-referential!

   **Expected Behavior:**
   - Can test itself
   - Detects if the testing framework is broken
   - Meta-awareness demonstrated

   **Validation:**
   - [ ] This test runs successfully
   - [ ] Can detect test failures
   - [ ] Generates valid test reports
   - [ ] Shows meta-level capability

   **Result:** [PASS/FAIL]
   **Notes:** [This is meta-testing - testing the tester]
   ```

8. **Test /self-document Command**

   ```markdown
   ## Test: self-document Documentation

   **Setup:**
   - Run: /self-document all standard

   **Expected Behavior:**
   - Analyzes system state
   - Generates/updates documentation
   - Reflects current capabilities
   - Clear and accurate

   **Validation:**
   - [ ] README.md updated/created
   - [ ] Documentation is accurate
   - [ ] Examples are current
   - [ ] Reflects latest improvements
   - [ ] Meta-capabilities documented

   **Result:** [PASS/FAIL]
   **Notes:** [observations]
   ```
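The "testing the tester" idea in step 7 can be made concrete with a minimal sketch: a runner that records PASS/FAIL, and a meta-test that verifies the runner flags a deliberately broken check. The function names are hypothetical:

```python
def run_checks(checks):
    """Run named check callables; a raised exception counts as a failure."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "PASS" if check() else "FAIL"
        except Exception:
            results[name] = "FAIL"
    return results

def meta_test():
    """Testing the tester: the runner must flag a deliberately broken check."""
    probe = run_checks({"should_pass": lambda: True,
                        "should_fail": lambda: False})
    return probe == {"should_pass": "PASS", "should_fail": "FAIL"}
```

If `meta_test()` itself fails, the testing framework is broken and none of the other PASS/FAIL results can be trusted.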
### Phase 3: Generation Testing

9. **Test Content Generation Quality**

   ```markdown
   ## Test: Generation Quality Baseline

   **Setup:**
   - Generate 5 iterations with specs/example_spec.md
   - Measure quality metrics

   **Expected Metrics:**
   - Quality avg: ≥ 7.5
   - Uniqueness: ≥ 80%
   - Spec compliance: 100%
   - Meta-awareness: ≥ 6.0

   **Validation:**
   - [ ] All files generated successfully
   - [ ] Quality meets threshold
   - [ ] No duplicates
   - [ ] Spec requirements met
   - [ ] Meta-reflection present

   **Result:** [PASS/FAIL]
   **Metrics:** [actual values]
   ```

10. **Test Progressive Improvement**

    ```markdown
    ## Test: Wave-Based Improvement

    **Setup:**
    - Run 3 waves of 5 iterations each
    - Track quality across waves
    - Enable improvement_mode = "evolve"

    **Expected Behavior:**
    - Quality improves across waves
    - Diversity maintained
    - Strategy evolves
    - Meta-insights accumulated

    **Validation:**
    - [ ] Wave 2 quality > Wave 1 quality
    - [ ] Wave 3 quality > Wave 2 quality
    - [ ] Diversity maintained (≥ 80%)
    - [ ] Strategy evolution documented
    - [ ] Improvement logs created

    **Result:** [PASS/FAIL]
    **Metrics:** [quality progression]
    ```

### Phase 4: Improvement System Testing

11. **Test Self-Improvement Loop**

    ```markdown
    ## Test: Complete Self-Improvement Cycle

    **Setup:**
    1. Run baseline generation (5 iterations)
    2. Run /improve-self to analyze
    3. Run /evolve-strategy to evolve
    4. Run improved generation (5 iterations)
    5. Compare results

    **Expected Behavior:**
    - Improvements detected
    - Strategy evolved
    - Second generation better than first
    - Changes are traceable

    **Validation:**
    - [ ] Improvement analysis complete
    - [ ] Strategy evolution applied
    - [ ] Measurable improvement in metrics
    - [ ] All changes logged
    - [ ] Rollback possible

    **Result:** [PASS/FAIL]
    **Improvement:** [% increase in quality]
    ```
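The wave-over-wave comparison in steps 10 and 11 reduces to comparing mean quality scores between runs. A minimal sketch, assuming scores on the 0-10 scale used above:

```python
from statistics import mean

def wave_quality_improves(waves):
    """True when mean quality strictly increases from each wave to the next."""
    averages = [mean(wave) for wave in waves]
    return all(later > earlier
               for earlier, later in zip(averages, averages[1:]))

def improvement_pct(before, after):
    """Percent change in mean quality between two generation runs."""
    return (mean(after) - mean(before)) / mean(before) * 100
```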
12. **Test Meta-Prompting Application**

    ```markdown
    ## Test: Meta-Prompting Principles

    **Setup:**
    - Review all generated content
    - Analyze for meta-prompting principles

    **Expected Behavior:**
    - Structure-oriented approaches evident
    - Abstract frameworks used
    - Minimal example dependency
    - Efficient reasoning patterns

    **Validation:**
    - [ ] Commands use structural patterns
    - [ ] Specs define frameworks, not examples
    - [ ] Reasoning is principle-based
    - [ ] Meta-awareness present throughout
    - [ ] Self-improvement capability demonstrated

    **Result:** [PASS/FAIL]
    **Notes:** [specific examples]
    ```

### Phase 5: Integration Testing

13. **Test Command Integration**

    ```markdown
    ## Test: Multi-Command Workflow

    **Setup:**
    1. /generate-spec patterns novel workflow_test
    2. /infinite-meta workflow_test.md output/ 5 evolve
    3. /improve-self all standard
    4. /evolve-strategy quality incremental
    5. /infinite-meta workflow_test.md output/ 5 evolve
    6. /self-document all standard

    **Expected Behavior:**
    - All commands work together
    - Data flows between commands
    - Improvements compound
    - System remains stable

    **Validation:**
    - [ ] All steps complete successfully
    - [ ] Generated spec is usable
    - [ ] Improvements detected and applied
    - [ ] Second generation better than first
    - [ ] Documentation reflects changes
    - [ ] No data corruption or errors

    **Result:** [PASS/FAIL]
    **Notes:** [integration observations]
    ```
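The regression test that follows compares current metrics against a stored baseline, tolerating at most a 10% drop per metric. A minimal sketch of that comparison, with illustrative metric names:

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return metrics whose current value fell more than `tolerance`
    (as a fraction) below the stored baseline value."""
    return {name: current[name] for name in baseline
            if current[name] < baseline[name] * (1 - tolerance)}
```

An empty result means no regressions; any returned entries should appear under **Regressions** in the test report.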
14. **Test Regression Detection**

    ```markdown
    ## Test: No Regressions from Improvements

    **Setup:**
    - Load baseline metrics from improvement_log/baseline.json
    - Run current system with same test cases
    - Compare results

    **Expected Behavior:**
    - No metric should regress >10%
    - Most metrics should improve
    - Quality maintains or increases

    **Validation:**
    - [ ] Quality: No regression
    - [ ] Efficiency: No regression
    - [ ] Diversity: No regression
    - [ ] Meta-awareness: No regression
    - [ ] Overall: Net improvement

    **Result:** [PASS/FAIL]
    **Regressions:** [list any]
    **Improvements:** [list all]
    ```

### Phase 6: Reporting

15. **Generate Test Report**

    Create `improvement_log/test_report_{{timestamp}}.md`:

    ```markdown
    # Self-Test Report - {{timestamp}}

    **Scope:** {{scope}}
    **Depth:** {{depth}}
    **Duration:** {{duration}} minutes

    ## Executive Summary

    - Total Tests: {{total}}
    - Passed: {{passed}} ({{pass_rate}}%)
    - Failed: {{failed}}
    - Warnings: {{warnings}}

    ## Overall Status: [PASS/FAIL/WARNING]

    ## Test Results by Category

    ### Command Tests ({{n}} tests)

    | Command | Status | Notes |
    |---------|--------|-------|
    | /infinite-meta | PASS/FAIL | ... |
    | /improve-self | PASS/FAIL | ... |
    | /generate-spec | PASS/FAIL | ... |
    | /evolve-strategy | PASS/FAIL | ... |
    | /self-test | PASS/FAIL | ... |
    | /self-document | PASS/FAIL | ... |

    ### Generation Tests ({{n}} tests)

    | Test | Status | Metrics |
    |------|--------|---------|
    | Quality Baseline | PASS/FAIL | 7.8 avg |
    | Progressive Improvement | PASS/FAIL | +12% |

    ### Improvement Tests ({{n}} tests)

    | Test | Status | Impact |
    |------|--------|--------|
    | Self-Improvement Loop | PASS/FAIL | +15% quality |
    | Meta-Prompting | PASS/FAIL | Evident |

    ### Integration Tests ({{n}} tests)

    | Test | Status | Notes |
    |------|--------|-------|
    | Multi-Command Workflow | PASS/FAIL | ... |
    | Regression Detection | PASS/FAIL | ... |

    ## Failed Tests (if any)

    [Detailed analysis of failures]

    ## Performance Metrics

    - Avg generation time: {{time}}s
    - Context usage: {{context}}%
    - Quality score: {{quality}}
    - Improvement rate: {{rate}}%

    ## Recommendations

    [What should be fixed or improved based on test results]

    ## Meta-Insights

    [What testing reveals about system capabilities and improvement areas]

    ## Next Steps

    1. [Action item 1]
    2. [Action item 2]
    ```

16. **Update Health Dashboard**

    Create/update `improvement_log/system_health.json`:

    ```json
    {
      "last_test": "{{timestamp}}",
      "overall_status": "healthy|degraded|critical",
      "test_pass_rate": 0.95,
      "quality_trend": "improving|stable|declining",
      "metrics": {
        "quality_avg": 8.2,
        "efficiency": 0.82,
        "diversity": 0.88,
        "meta_awareness": 0.75
      },
      "issues": [
        "Minor: Edge case in /generate-spec with empty patterns"
      ],
      "recent_improvements": [
        "Quality +12% since last test",
        "Efficiency +8% since last test"
      ]
    }
    ```

## Meta-Testing Pattern

This command tests itself using meta-prompting:

```
STRUCTURE: Self-testing system

ABSTRACTION: Testing validates structure and patterns, not just functionality

REASONING:
1. Define test structure (framework)
2. Execute tests (pattern validation)
3. Analyze results (meta-evaluation)
4. Report findings (actionable insights)

SELF_REFLECTION:
- Do my tests validate structure or just surface functionality?
- Are test patterns generalizable to new capabilities?
- Can I detect meta-level issues?
- Does testing improve the system?

META_CAPABILITY:
Testing should improve testability. Each test run should enhance
test coverage. Failures should generate better tests.
```

## Success Criteria

A successful self-test:

1. Validates all critical functionality
2. Detects regressions if present
3. Measures key metrics accurately
4. Provides actionable recommendations
5. Shows meta-awareness (can test itself)
6. Generates clear, useful reports
7. Enables continuous improvement

---

*This command validates system capabilities using meta-prompting principles. It tests structure and patterns, not just functionality, enabling generalizable quality assurance and continuous improvement.*