infinite-agents-public/infinite_variants/infinite_variant_4/templates/quality_report.md


Quality Evaluation Report Template

This template provides the structure for comprehensive quality reports generated by the /quality-report command.

Report Header

# Quality Evaluation Report - Wave {wave_number}

**Generated**: {timestamp}
**Directory**: {output_dir}
**Specification**: {spec_path}
**Total Iterations**: {iteration_count}
**Evaluation System**: ReAct Pattern (Reasoning + Acting + Observing)

---

Executive Summary Section

## Executive Summary

### Overall Quality Assessment

{1-3 paragraph narrative summary of quality state}

**Quality Level**: {Exceptional/Excellent/Good/Adequate/Needs Improvement}
**Trend**: {Improving/Stable/Declining} {if multiple waves}

### Top 3 Insights

1. **{Insight 1 Title}**: {Brief description}
2. **{Insight 2 Title}**: {Brief description}
3. **{Insight 3 Title}**: {Brief description}

### Priority Recommendation

**Action**: {Single most important action for next wave}
**Rationale**: {Why this matters most}
**Expected Impact**: {Quality improvement anticipated}

---

Metrics Overview Section

## Quality Metrics Overview

### Composite Scores

| Metric | Value | Target | Status |
|--------|-------|--------|--------|
| Mean Score | {mean}/100 | {target} | {✓/⚠/✗} |
| Median Score | {median}/100 | - | - |
| Std Deviation | {std} | - | - |
| Range | {min} - {max} | - | - |
| Top Score | {max}/100 | - | - |
| Bottom Score | {min}/100 | - | - |
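As a sketch of how the table's summary statistics can be derived, the following uses Python's standard `statistics` module; the `scores` list and the function name are illustrative, not part of the evaluation system.

```python
import statistics

def composite_stats(scores):
    """Summary statistics for a wave's composite scores (0-100 scale)."""
    return {
        "mean": round(statistics.mean(scores), 1),
        "median": statistics.median(scores),
        "std": round(statistics.stdev(scores), 1),
        "min": min(scores),
        "max": max(scores),
    }

# Illustrative composite scores for an eight-iteration wave
scores = [88, 74, 91, 67, 82, 79, 85, 70]
stats = composite_stats(scores)
```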

### Distribution

{score_distribution_histogram}


### Dimensional Breakdown

| Dimension | Mean | Median | Std Dev | Min | Max | Top Iteration |
|-----------|------|--------|---------|-----|-----|---------------|
| Technical | {tech_mean} | {tech_median} | {tech_std} | {tech_min} | {tech_max} | iteration_{X} |
| Creativity | {creative_mean} | {creative_median} | {creative_std} | {creative_min} | {creative_max} | iteration_{Y} |
| Compliance | {compliance_mean} | {compliance_median} | {compliance_std} | {compliance_min} | {compliance_max} | iteration_{Z} |

### Quality Progression {if sequence available}

{score_timeline_chart}


---

Rankings Section

## Rankings & Performance Segments

### Top Performers (Top 20%)

**Exemplary Quality** - {count} iterations, average score: {avg}

1. **iteration_{X}** - Score: {score}/100
   - Technical: {tech} | Creativity: {creative} | Compliance: {compliance}
   - Profile: {quality_profile}
   - Strengths: {top_strengths}
   - Notable: {distinctive_characteristic}

2. **iteration_{Y}** - Score: {score}/100
   {repeat structure}

{continue for all top 20% iterations}

### Proficient Performers (50th-80th Percentile)

**Above Average Quality** - {count} iterations, average score: {avg}

{list with less detail}

### Adequate Performers (20th-50th Percentile)

**Meets Expectations** - {count} iterations, average score: {avg}

{list with minimal detail}

### Developing Iterations (Bottom 20%)

**Improvement Opportunities** - {count} iterations, average score: {avg}

{list with focus on growth areas}
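The four segments above follow directly from the ranked list; a minimal sketch (function and variable names are hypothetical) that splits a best-first ranking at the 80th, 50th, and 20th percentiles:

```python
def segment(ranked_ids):
    """Split iteration ids, sorted best-first, into the four performance segments."""
    n = len(ranked_ids)
    p80 = round(n * 0.2)      # boundary below the best 20%
    p50 = round(n * 0.5)      # boundary at the median
    p20 = n - round(n * 0.2)  # boundary above the bottom 20%
    return {
        "top": ranked_ids[:p80],
        "proficient": ranked_ids[p80:p50],
        "adequate": ranked_ids[p50:p20],
        "developing": ranked_ids[p20:],
    }

ids = [f"iteration_{i}" for i in range(1, 11)]  # already ranked best-first
groups = segment(ids)
```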

---

Visual Analysis Section

## Visual Quality Analysis

### Score Distribution Histogram

Composite Score Distribution

    90-100  ████████          ({count})  {percent}%
    80-89   ████████████████  ({count})  {percent}%
    70-79   ████████████      ({count})  {percent}%
    60-69   ████████          ({count})  {percent}%
    50-59   ████              ({count})  {percent}%
    <50                       ({count})  {percent}%

Pattern: {description of distribution shape}
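A text histogram like the one above can be generated mechanically; this sketch assumes scores arrive as a flat list (the band edges mirror the template, everything else is illustrative):

```python
def text_histogram(scores):
    """Render composite scores as the report's banded text histogram."""
    bands = [(90, 100), (80, 89), (70, 79), (60, 69), (50, 59), (0, 49)]
    total = len(scores)
    lines = []
    for lo, hi in bands:
        count = sum(1 for s in scores if lo <= s <= hi)
        label = f"{lo}-{hi}" if lo > 0 else f"<{hi + 1}"
        bar = "█" * (count * 2)  # two blocks per iteration, purely cosmetic
        lines.append(f"{label:>6} {bar} ({count}) {count / total:.0%}")
    return "\n".join(lines)

print(text_histogram([92, 88, 85, 81, 76, 74, 63, 55]))
```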


### Quality Quadrant Map

Technical vs Creativity Positioning

                   High Creativity (>75)
                          │
      Q2: Innovators      │      Q1: Triple Threats
      {count} iters       │      {count} iters
                          │
     ─────────────────────┼─────────────────────
      Low Tech (<75)      │      High Tech (>75)
                          │
      Q3: Developing      │      Q4: Engineers
      {count} iters       │      {count} iters
                          │
                   Low Creativity (<75)

Insight: {quadrant_analysis}
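Quadrant assignment is a simple threshold test on two dimension scores; a sketch using the map's >75 cut-off (the function name is illustrative):

```python
def quadrant(technical, creativity, threshold=75):
    """Place an iteration on the technical-vs-creativity quadrant map."""
    if creativity > threshold:
        return "Q1: Triple Threat" if technical > threshold else "Q2: Innovator"
    return "Q4: Engineer" if technical > threshold else "Q3: Developing"
```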


### Dimensional Radar

           Technical ({mean})
                /\
               /  \
              /    \
    Compliance ────── Creativity
     ({mean})          ({mean})

Pattern: {shape_interpretation}
Balance: {balance_assessment}


---

Deep Analysis Section

## Deep Quality Analysis

### Pattern 1: {Pattern Name}

**Observation**: {What we see in the data}

**Affected Iterations**: {list}

**Analysis**: {Why this pattern exists}

**Impact on Quality**: {How it affects scores}

**Strategic Insight**: {What this means for future}

### Pattern 2: {Pattern Name}

{repeat structure}

{continue for all significant patterns}

---

## Quality Trade-offs

### Trade-off 1: {Dimension A} vs {Dimension B}

**Correlation**: {positive/negative/none} ({coefficient if calculated})

**Pattern**: {description of trade-off}

**Example Iterations**:
- High {A}, Low {B}: iteration_{X} ({A_score}/{B_score})
- High {B}, Low {A}: iteration_{Y} ({A_score}/{B_score})
- Balanced: iteration_{Z} ({A_score}/{B_score})

**Implication**: {what this means strategically}

**Recommendation**: {how to handle this trade-off}
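The coefficient referenced above is a plain Pearson correlation over paired dimension scores; one way to compute it without third-party libraries (names are illustrative):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two score dimensions (-1 to 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0
```

A coefficient near -1 signals a genuine trade-off; one near 0 suggests the two dimensions vary independently.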

### Trade-off 2: {Dimension A} vs {Dimension B}

{repeat structure}

---

## Success Factor Analysis

### What Makes Iterations Succeed

**Factor 1: {Success Factor}**
- Evidence: Iterations {list} all exhibit {characteristic}
- Impact: Average {dimension} score {X} points higher
- Recommendation: {how to amplify this factor}

**Factor 2: {Success Factor}**
{repeat}

{continue for all identified success factors}

### What Causes Lower Scores

**Factor 1: {Failure Factor}**
- Evidence: Iterations {list} all share {problem}
- Impact: Average {dimension} score {X} points lower
- Recommendation: {how to avoid this factor}

**Factor 2: {Failure Factor}**
{repeat}

{continue for all identified failure factors}

---

Strategic Insights Section

## Strategic Insights & Implications

### Insight 1: {Insight Title}

**Observation**: {Data-driven observation}

**Analysis**: {Reasoning about why this matters}

**Implication**: {What this means for strategy}

**Confidence**: {High/Medium/Low}

**Action Items**:
1. {Specific action}
2. {Specific action}
3. {Specific action}

### Insight 2: {Insight Title}

{repeat structure}

{continue for all major insights}

---

Recommendations Section

## Recommendations for Next Wave

### Priority 1: {Recommendation Title}

**Rationale**: {Why this is priority #1}

**Current State**: {What we see now}

**Desired State**: {What we want to achieve}

**Action Steps**:
1. {Specific step}
2. {Specific step}
3. {Specific step}

**Expected Impact**:
- {Dimension}: +{points} improvement
- {Dimension}: +{points} improvement
- Composite: +{points} improvement

**Difficulty**: {Low/Medium/High}
**Priority**: {High/Medium/Low}

### Priority 2: {Recommendation Title}

{repeat structure}

{continue for top 5 priorities}

---

## Creative Direction Recommendations

Based on analysis of successful iterations, explore these creative directions:

1. **{Direction 1}**: {Description}
   - Inspiration: iteration_{X} demonstrated {characteristic}
   - Target dimensions: {which quality dimensions benefit}
   - Risk level: {Low/Medium/High}

2. **{Direction 2}**: {Description}
   {repeat}

{continue for 5-10 recommended directions}

---

## Quality Targets for Next Wave

| Dimension | Current Mean | Target Mean | Stretch Goal |
|-----------|--------------|-------------|--------------|
| Technical | {current} | {target} | {stretch} |
| Creativity | {current} | {target} | {stretch} |
| Compliance | {current} | {target} | {stretch} |
| Composite | {current} | {target} | {stretch} |

**Rationale**: {why these targets}

**Strategy**: {how to achieve targets}

---

System Performance Section

## Quality System Performance Assessment

### Evaluation System Effectiveness

**Score Differentiation**: {High/Medium/Low}
- Explanation: {how well scores separate quality levels}
- Evidence: {standard deviation, range, distribution}

**Scoring Consistency**: {High/Medium/Low}
- Explanation: {how reliably criteria are applied}
- Evidence: {examples of consistent scoring}

**Criterion Fairness**: {High/Medium/Low}
- Explanation: {whether scoring feels balanced}
- Evidence: {analysis of dimension weights}

**Actionability of Results**: {High/Medium/Low}
- Explanation: {whether results guide improvement}
- Evidence: {specific actionable insights generated}

### System Recommendations

**Recommended Adjustments**:
1. {Adjustment to evaluation system}
2. {Adjustment to scoring weights}
3. {Adjustment to quality criteria}

**Rationale**: {why these adjustments}

---

Appendix Section

## Appendix: Detailed Data

### Complete Rankings Table

| Rank | Iteration | Composite | Technical | Creativity | Compliance | Profile |
|------|-----------|-----------|-----------|------------|------------|---------|
| 1 | iteration_{X} | {score} | {score} | {score} | {score} | {profile} |
| 2 | iteration_{Y} | {score} | {score} | {score} | {score} | {profile} |
{continue for all iterations}

### Individual Evaluation Summaries

**iteration_{X}** - Rank {rank}, Score {score}/100

Technical ({score}/100):
- Code Quality: {score}/25
- Architecture: {score}/25
- Performance: {score}/25
- Robustness: {score}/25

Creativity ({score}/100):
- Originality: {score}/25
- Innovation: {score}/25
- Uniqueness: {score}/25
- Aesthetic: {score}/25

Compliance ({score}/100):
- Requirements: {score}/40
- Naming: {score}/20
- Structure: {score}/20
- Standards: {score}/20

Key Strengths: {list}
Growth Areas: {list}

{repeat for all iterations or top/bottom performers}
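The sub-score rubrics above each sum to 100 per dimension. How the three dimension scores combine into the composite is not specified in this template, so the equal weighting below is an explicit assumption, not the system's actual formula:

```python
# ASSUMPTION: equal dimension weights; the real evaluation system may differ.
WEIGHTS = {"technical": 1 / 3, "creativity": 1 / 3, "compliance": 1 / 3}

def dimension_score(subscores):
    """Each dimension score is the sum of its rubric sub-scores (max 100)."""
    return sum(subscores.values())

def composite(technical, creativity, compliance):
    dims = {
        "technical": dimension_score(technical),
        "creativity": dimension_score(creativity),
        "compliance": dimension_score(compliance),
    }
    return round(sum(WEIGHTS[k] * v for k, v in dims.items()), 1)

# Illustrative sub-scores following the appendix rubric
tech = {"code_quality": 22, "architecture": 20, "performance": 18, "robustness": 21}
creative = {"originality": 19, "innovation": 17, "uniqueness": 20, "aesthetic": 18}
comp = {"requirements": 35, "naming": 17, "structure": 18, "standards": 16}
score = composite(tech, creative, comp)
```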

---

Meta-Reflection Section

## Meta-Reflection: Quality of This Report

### Self-Assessment

**Actionability**: {High/Medium/Low}
- {reasoning about whether recommendations can be implemented}

**Comprehensiveness**: {High/Medium/Low}
- {reasoning about coverage of quality dimensions}

**Honesty**: {High/Medium/Low}
- {reasoning about acknowledging weaknesses}

**Usefulness**: {High/Medium/Low}
- {reasoning about value for improvement}

### Report Limitations

1. {Limitation 1}
2. {Limitation 2}
3. {Limitation 3}

### Confidence Assessment

**Overall Confidence in Findings**: {High/Medium/Low}

**Reasoning**: {why this confidence level}

**Caveats**: {what might invalidate findings}

---

## Conclusion

{1-2 paragraph summary of report}

**Next Steps**: {immediate actions to take}

**Success Metrics**: {how to measure improvement in next wave}

---

*This report generated using ReAct pattern: Reasoning → Action → Observation*

*All insights derived from evidence-based analysis of {iteration_count} iteration evaluations*

*Report Version: 1.0 | Generated: {timestamp}*

Usage Notes

This template should be populated with:

  • Actual data from evaluations
  • Calculated statistics
  • Identified patterns
  • Strategic insights
  • Specific recommendations

Sections can be:

  • Expanded with additional analysis
  • Condensed if less detail needed
  • Reordered based on priorities
  • Customized for specific contexts

The template emphasizes:

  • Evidence-based insights
  • Actionable recommendations
  • Clear visualizations (text-based)
  • Strategic thinking
  • Honest assessment
  • ReAct reasoning throughout

Remember: A quality report is only valuable if it drives improvement. Fill this template with meaningful insights, specific recommendations, and clear reasoning.