canvas-website/SANITIZATION_EXPLANATION.md

Sanitization Explanation

Why Sanitization Exists

Sanitization is necessary because TLDraw validates every shape against a strict schema, and shapes that fail validation do not render correctly. Without sanitization, older or malformed records produce validation errors and broken shapes.

Critical Fixes (MUST KEEP)

These fixes are required for TLDraw to work:

  1. Move w/h/geo from top-level to props for geo shapes

    • TLDraw schema requires w, h, and geo to be in props, not at the top level
    • Without this, TLDraw throws validation errors
  2. Remove w/h from group shapes

    • Group shapes don't have w/h properties
    • Having them causes validation errors
  3. Remove w/h from line shapes

    • Line shapes use points, not w/h
    • Having them causes validation errors
  4. Fix richText structure

    • TLDraw requires richText to be { content: [...], type: 'doc' }
    • Old data might have it as an array or missing structure
    • We preserve all content, just fix the structure
  5. Fix crop structure for image/video

    • TLDraw requires crop to be { topLeft: {x,y}, bottomRight: {x,y} } or null
    • Old data might have { x, y, w, h } format
    • We convert the format, preserving the crop area
  6. Remove h/geo from text shapes

    • Text shapes don't have h or geo properties
    • Having them causes validation errors
  7. Ensure required properties exist

    • Some shapes require certain properties (e.g., points for line shapes)
    • We only add defaults if truly missing
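
As an illustration of fixes 1, 4, and 5, the following is a minimal sketch, not the project's actual code. The `RawShape` type and the function names `moveGeoFieldsToProps`, `normalizeRichText`, and `convertCrop` are hypothetical. It also assumes that a value already in props takes precedence over a stray top-level one, that the old `{ x, y, w, h }` crop uses the same units as the corner format, and that an unrecognized crop falls back to null.

```typescript
// Hypothetical, simplified record type; real TLDraw shape records carry more fields.
type Dict = Record<string, unknown>;
interface RawShape extends Dict {
  type: string;
  props?: Dict;
}

const isDict = (v: unknown): v is Dict =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Fix 1: move w/h/geo from the top level into props for geo shapes.
function moveGeoFieldsToProps(shape: RawShape): RawShape {
  if (shape.type !== "geo") return shape;
  const rest: RawShape = { ...shape };
  const props: Dict = { ...(shape.props ?? {}) };
  for (const key of ["w", "h", "geo"]) {
    if (!(key in rest)) continue;
    // Assumption: a value already present in props wins over the top-level one.
    if (!(key in props)) props[key] = rest[key];
    delete rest[key];
  }
  return { ...rest, props };
}

// Fix 4: wrap richText as { content: [...], type: 'doc' } without dropping content.
function normalizeRichText(value: unknown): Dict {
  if (Array.isArray(value)) return { type: "doc", content: value };
  if (isDict(value)) {
    const existing = value["content"];
    return { ...value, type: "doc", content: Array.isArray(existing) ? existing : [] };
  }
  // Assumption: any other value (e.g. undefined) means there was no rich text.
  return { type: "doc", content: [] };
}

interface Point { x: number; y: number }
interface Crop { topLeft: Point; bottomRight: Point }

// Fix 5: convert an old { x, y, w, h } crop into the two-corner format.
function convertCrop(value: unknown): Crop | null {
  if (!isDict(value)) return null;
  if (isDict(value["topLeft"]) && isDict(value["bottomRight"])) {
    return value as unknown as Crop; // already in the expected format
  }
  const x = value["x"], y = value["y"], w = value["w"], h = value["h"];
  if (typeof x === "number" && typeof y === "number" &&
      typeof w === "number" && typeof h === "number") {
    return { topLeft: { x, y }, bottomRight: { x: x + w, y: y + h } };
  }
  return null; // assumption: an unrecognized crop is treated as "no crop"
}
```

Each function returns a new object rather than mutating its input, so the same helpers could run on both the save path and the load path.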

What We Preserve

We preserve all user data:

  • richText content (we only fix structure, never delete content)
  • text property on arrows
  • All metadata (meta object)
  • All valid shape properties
  • Custom shape properties

What We Remove (Only When Necessary)

We only remove properties that:

  1. Cause validation errors (e.g., w/h on groups/lines)
  2. Are invalid for the shape type (e.g., geo on text shapes)

We never remove:

  • User-created content (text, richText)
  • Valid metadata
  • Properties that don't cause errors
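
The removal rules can be kept in one small table so that nothing outside the documented cases is ever deleted. This is a minimal sketch under stated assumptions: `INVALID_KEYS`, `omit`, `stripInvalidKeys`, and the `RawShape` type are hypothetical names, and the sketch strips the listed keys both at the top level and inside props, because this document does not say where the stray keys appear.

```typescript
// Hypothetical, simplified record type; real TLDraw shape records carry more fields.
type Dict = Record<string, unknown>;
interface RawShape extends Dict {
  type: string;
  props?: Dict;
}

// Keys that are invalid for a given shape type, per the rules in this document.
const INVALID_KEYS: Record<string, readonly string[]> = {
  group: ["w", "h"],  // group shapes have no w/h
  line: ["w", "h"],   // line shapes use points instead
  text: ["h", "geo"], // text shapes have no h or geo
};

// Copy `source` without the listed keys; every other key is preserved.
function omit(source: Dict, keys: readonly string[]): Dict {
  const result: Dict = {};
  for (const key of Object.keys(source)) {
    if (keys.indexOf(key) === -1) result[key] = source[key];
  }
  return result;
}

function stripInvalidKeys(shape: RawShape): RawShape {
  // Shape types without an entry pass through untouched.
  if (!Object.prototype.hasOwnProperty.call(INVALID_KEYS, shape.type)) return shape;
  const keys = INVALID_KEYS[shape.type] ?? [];
  const cleaned: RawShape = { ...omit(shape, keys), type: shape.type };
  if (shape.props !== undefined) cleaned.props = omit(shape.props, keys);
  return cleaned;
}
```

Because the table is the only place that names deletable keys, user content such as `richText` and `meta` cannot be removed by accident.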

Current Sanitization Locations

  1. TLStoreToAutomerge.ts - When saving from TLDraw to Automerge

    • Minimal fixes only
    • Preserves all user data
  2. AutomergeToTLStore.ts - When loading from Automerge to TLDraw

    • Minimal fixes only
    • Preserves all user data
  3. useAutomergeStoreV2.ts - Initial load processing

    • More extensive (handles migration from old formats)
    • Still preserves all user data

Can We Simplify?

Yes, but carefully:

  1. We can stop deleting properties that do not cause validation errors
  2. We can consolidate duplicate logic
  3. We cannot remove schema fixes (w/h/geo movement, richText structure)
  4. We cannot stop deleting properties that do cause validation errors

Recommendation

Keep sanitization but:

  1. Only delete properties that actually cause validation errors
  2. Preserve all user data (text, richText, metadata)
  3. Consolidate duplicate logic between files
  4. Add comments explaining why each fix is necessary
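
One way to act on recommendations 3 and 4 is a single shared list of fixes, each carrying its own stated reason, that every sanitization location calls. The sketch below is an assumption about how that could look, not the existing implementation: `Fix`, `FIXES`, `dropTopLevel`, and `sanitizeShape` are hypothetical names, and only two of the fixes are shown.

```typescript
// Hypothetical, simplified record type; real TLDraw shape records carry more fields.
type Dict = Record<string, unknown>;
interface RawShape extends Dict {
  type: string;
}

interface Fix {
  name: string;
  reason: string; // why the fix exists, kept next to the code that applies it
  apply: (shape: RawShape) => RawShape;
}

// Return the shape unchanged when none of the keys are present,
// so callers can tell whether a fix actually did anything.
function dropTopLevel(shape: RawShape, keys: readonly string[]): RawShape {
  let found = false;
  for (const key of keys) {
    if (key in shape) found = true;
  }
  if (!found) return shape;
  const copy: RawShape = { ...shape };
  for (const key of keys) delete copy[key];
  return copy;
}

const FIXES: readonly Fix[] = [
  {
    name: "group-no-size",
    reason: "Group shapes don't have w/h properties",
    apply: (s) => (s.type === "group" ? dropTopLevel(s, ["w", "h"]) : s),
  },
  {
    name: "line-no-size",
    reason: "Line shapes use points, not w/h",
    apply: (s) => (s.type === "line" ? dropTopLevel(s, ["w", "h"]) : s),
  },
];

// Single entry point that the save path and the load path could both import.
function sanitizeShape(
  shape: RawShape,
  log: (message: string) => void = () => {},
): RawShape {
  let current = shape;
  for (const fix of FIXES) {
    const next = fix.apply(current);
    if (next !== current) log(`${fix.name}: ${fix.reason}`);
    current = next;
  }
  return current;
}
```

With this layout, TLStoreToAutomerge.ts, AutomergeToTLStore.ts, and useAutomergeStoreV2.ts would share one definition of each fix, and removing a fix means deleting one documented entry.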