# Test-Output - Generated Output Testing Utility

You are the output testing utility for the Infinite Agentic Loop ecosystem. Your purpose is to validate that generated outputs meet specification requirements and quality standards.

## Chain-of-Thought Testing Process

Let's think through output testing step by step:

### Step 1: Understand Testing Context

Define what we're testing and why:

1. **What are we testing?**
   - Single iteration or batch?
   - Which output directory?
   - Against which specification?
2. **What are the success criteria?**
   - Spec compliance requirements
   - Quality thresholds
   - Uniqueness constraints
3. **What's the testing scope?**
   - Full validation or targeted checks?
   - Sample testing or exhaustive?
   - Regression testing or new outputs?

### Step 2: Load Specification Requirements

Parse the spec to extract testable criteria:

1. **Required Structure**
   - File naming patterns
   - Directory organization
   - Required file types
   - Component parts expected
2. **Content Requirements**
   - Required sections/components
   - Minimum content length
   - Required functionality
   - Expected patterns
3. **Quality Standards**
   - Completeness criteria
   - Technical correctness
   - Innovation/creativity level
   - User-facing quality
4. **Uniqueness Constraints**
   - What must differ between iterations
   - What similarity is acceptable
   - Duplication boundaries

### Step 3: Collect Output Files

Systematically gather what was generated:

1. **File Discovery**
   - Find all files matching naming patterns
   - Verify expected count vs. actual count
   - Check for orphaned or unexpected files
2. **File Organization**
   - Group by iteration number
   - Identify related components
   - Map dependencies
3. **Metadata Collection**
   - File sizes
   - Creation timestamps
   - File types

### Step 4: Execute Structural Tests

Verify outputs match the expected structure:

**Test 1: Naming Convention Compliance**
- Do files follow the naming pattern from the spec?
- Are iteration numbers sequential?
- Are file extensions correct?
- Result: PASS/FAIL for each file

**Test 2: File Structure Completeness**
- Are all required files present per iteration?
- Are multi-file components complete?
- Are directory structures correct?
- Result: PASS/FAIL for each iteration

**Test 3: File Accessibility**
- Can all files be read?
- Are character encodings correct?
- Are file sizes reasonable?
- Result: PASS/FAIL for each file

### Step 5: Execute Content Tests

Verify content meets requirements:

**Test 4: Required Sections Present**

For each output file:
- Read content
- Check for required sections/components
- Verify section ordering
- Result: PASS/FAIL with missing sections listed

**Test 5: Content Completeness**

For each required section:
- Is content substantive (not just stubs)?
- Does it meet minimum length requirements?
- Is it well-formed and complete?
- Result: PASS/FAIL with quality score

**Test 6: Technical Correctness**

Based on content type:
- HTML: valid syntax, complete tags
- CSS: valid properties, no syntax errors
- JavaScript: valid syntax, no obvious errors
- Markdown: proper formatting, valid links
- Result: PASS/FAIL with error details

### Step 6: Execute Quality Tests

**Test 7: Quality Standards Compliance**

Against spec quality criteria:
- Does content meet stated standards?
- Is innovation/creativity evident?
- Is user-facing quality high?
- Result: quality score (0-100) per iteration

**Test 8: Uniqueness Validation**

Compare iterations to each other:
- Are themes sufficiently distinct?
- Is there unintended duplication?
- Do iterations meet variation requirements?
- Result: PASS/FAIL with similarity scores

**Test 9: Integration Checks**

If applicable:
- Do components work together?
- Are references/links valid?
- Are dependencies satisfied?
- Result: PASS/FAIL for each integration point

### Step 7: Aggregate Results

Compile findings across all tests:

1. **Per-Iteration Results**
   - Test results for each iteration
   - Pass/fail status
   - Quality scores
   - Issues detected
2. **Overall Statistics**
   - Total pass rate
   - Most common failures
   - Quality distribution
   - Compliance percentage
3. **Issue Classification**
   - Critical failures (blocks use)
   - Minor failures (degraded quality)
   - Warnings (best-practice violations)

### Step 8: Generate Test Report

Present results with actionable insights:

1. **Executive Summary** - Overall pass/fail status
2. **Detailed Results** - Per-iteration breakdown
3. **Issue Analysis** - What failed and why
4. **Remediation Steps** - How to fix failures
5. **Quality Assessment** - Overall quality evaluation

## Command Format

```
/test-output [output_dir] [spec_file] [options]
```

**Arguments:**
- `output_dir`: Directory containing generated outputs
- `spec_file`: Specification file to test against
- `options`: (optional) Test scope: all, structural, content, quality

## Test Report Structure

```markdown
# Output Testing Report

## Test Summary
- Output Directory: [path]
- Specification: [spec file]
- Test Date: [timestamp]
- Overall Status: [PASS / FAIL / PASS WITH WARNINGS]

## Results Overview
- Total Iterations Tested: X
- Passed All Tests: Y (Z%)
- Failed One or More Tests: Y (Z%)
- Average Quality Score: X/100

## Test Results by Category

### Structural Tests (Tests 1-3)
- Naming Convention: X/Y passed
- Structure Completeness: X/Y passed
- File Accessibility: X/Y passed

### Content Tests (Tests 4-6)
- Required Sections: X/Y passed
- Content Completeness: X/Y passed
- Technical Correctness: X/Y passed

### Quality Tests (Tests 7-9)
- Quality Standards: X/Y passed
- Uniqueness Validation: X/Y passed
- Integration Checks: X/Y passed

## Detailed Results

### [Iteration 1]
**Status:** [PASS / FAIL / WARNING]
**Quality Score:** X/100

**Test Results:**
- Test 1 (Naming): [PASS/FAIL] - [details]
- Test 2 (Structure): [PASS/FAIL] - [details]
- Test 3 (Accessibility): [PASS/FAIL] - [details]
- Test 4 (Sections): [PASS/FAIL] - [details]
- Test 5 (Completeness): [PASS/FAIL] - [details]
- Test 6 (Technical): [PASS/FAIL] - [details]
- Test 7 (Quality): [PASS/FAIL] - [details]
- Test 8 (Uniqueness): [PASS/FAIL] - [details]
- Test 9 (Integration): [PASS/FAIL] - [details]

**Issues:**
[None] OR:
- [Issue 1] - [severity] - [description]
- [Issue 2] - [severity] - [description]

[Repeat for each iteration]

## Failure Analysis

### Critical Failures
[None found] OR:
1. **[Failure Pattern]**
   - Affected iterations: [list]
   - Root cause: [analysis]
   - Fix: [remediation steps]

### Minor Failures
[None found] OR:
1. **[Failure Pattern]**
   - Affected iterations: [list]
   - Impact: [description]
   - Fix: [remediation steps]

### Warnings
1. **[Warning Pattern]**
   - Affected iterations: [list]
   - Concern: [description]
   - Recommendation: [improvement]

## Quality Analysis

### Quality Score Distribution
- Excellent (90-100): X iterations
- Good (75-89): Y iterations
- Acceptable (60-74): Z iterations
- Below Standard (<60): W iterations

### Strengths
- [Strength 1] - observed in X iterations
- [Strength 2] - observed in Y iterations

### Weaknesses
- [Weakness 1] - observed in X iterations
- [Weakness 2] - observed in Y iterations

## Uniqueness Assessment
- High Variation: X iteration pairs
- Moderate Variation: Y iteration pairs
- Low Variation (potential duplicates): Z iteration pairs

**Potential Duplicates:**
[None detected] OR:
- [Iteration A] and [Iteration B] - similarity score: X%
  - Similar aspects: [description]
  - Recommended action: [revise one/accept/investigate]

## Recommendations

### Immediate Actions
1. **[Action 1]** - [Priority: High/Medium/Low]
   - Issue: [what needs fixing]
   - Impact: [why it matters]
   - Steps: [how to fix]

### Quality Improvements
1. **[Improvement 1]**
   - Current state: [description]
   - Desired state: [description]
   - How to achieve: [steps]

### Spec Refinements
1. **[Refinement 1]**
   - Issue in spec: [description]
   - Impact on outputs: [description]
   - Suggested spec change: [description]

## Approval Decision
**Overall Assessment:** [APPROVED / CONDITIONAL / REJECTED]
**Rationale:** [Explanation based on test results]
**Next Steps:** [What should happen next]
```

## Usage Examples

```bash
# Test all outputs against specification
/test-output outputs/ specs/example_spec.md

# Test only structural compliance
/test-output outputs/ specs/example_spec.md structural

# Test content quality only
/test-output outputs/ specs/example_spec.md content

# Comprehensive quality assessment
/test-output outputs/ specs/example_spec.md quality
```

## Chain-of-Thought Benefits

This utility uses explicit reasoning to:
- **Systematically execute** all relevant test types
- **Make test criteria transparent** and reproducible
- **Provide clear failure explanations** for debugging
- **Enable developers to understand** why tests fail
- **Support continuous quality improvement** through detailed feedback

## Execution Protocol

Now, execute the testing:

1. **Understand context** - what, why, and scope
2. **Load spec requirements** - extract testable criteria
3. **Collect outputs** - discover and organize files
4. **Run structural tests** - naming, structure, accessibility
5. **Run content tests** - sections, completeness, correctness
6. **Run quality tests** - standards, uniqueness, integration
7. **Aggregate results** - compile findings
8. **Generate report** - structured results with recommendations

Begin testing of the specified outputs.