# Sanitization Explanation
## Why Sanitization Exists
Sanitization is **necessary** because TLDraw has strict schema requirements that must be met for shapes to render correctly. Without sanitization, we get validation errors and broken shapes.
## Critical Fixes (MUST KEEP)
These fixes are **required** for TLDraw to work:
1. **Move w/h/geo from top-level to props for geo shapes**
   - TLDraw schema requires `w`, `h`, and `geo` to be in `props`, not at the top level
   - Without this, TLDraw throws validation errors
2. **Remove w/h from group shapes**
   - Group shapes don't have `w`/`h` properties
   - Having them causes validation errors
3. **Remove w/h from line shapes**
   - Line shapes use `points`, not `w`/`h`
   - Having them causes validation errors
4. **Fix richText structure**
   - TLDraw requires `richText` to be `{ content: [...], type: 'doc' }`
   - Old data might have it as an array or missing structure
   - We preserve all content, just fix the structure
5. **Fix crop structure for image/video**
   - TLDraw requires `crop` to be `{ topLeft: {x,y}, bottomRight: {x,y} }` or `null`
   - Old data might have `{ x, y, w, h }` format
   - We convert the format, preserving the crop area
6. **Remove h/geo from text shapes**
   - Text shapes don't have `h` or `geo` properties
   - Having them causes validation errors
7. **Ensure required properties exist**
   - Some shapes require certain properties (e.g., `points` for line shapes)
   - We only add defaults if truly missing
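
Fixes 1 and 5 above can be sketched roughly as follows. This is an illustration, not the code from the files listed later: `LooseShape`, `moveGeoFieldsIntoProps`, `isLegacyCrop`, and `convertLegacyCrop` are hypothetical names, and the crop conversion assumes the old `{ x, y, w, h }` values use the same coordinate space as `topLeft`/`bottomRight`.

```typescript
// Hypothetical, simplified record type. Real TLDraw shape records carry more fields.
type LooseShape = {
  id: string;
  type: string;
  props?: Record<string, unknown>;
  meta?: Record<string, unknown>;
  [key: string]: unknown;
};

type LegacyCrop = { x: number; y: number; w: number; h: number };

// Fix 1: geo shapes keep w, h, and geo inside props, not at the top level.
function moveGeoFieldsIntoProps(shape: LooseShape): LooseShape {
  if (shape.type !== "geo") return shape;
  const { w, h, geo, ...rest } = shape;
  const props: Record<string, unknown> = { ...(shape.props ?? {}) };
  // Assumption: a value already in props wins over a stray top-level one.
  if (w !== undefined && props.w === undefined) props.w = w;
  if (h !== undefined && props.h === undefined) props.h = h;
  if (geo !== undefined && props.geo === undefined) props.geo = geo;
  return { ...rest, props };
}

function isLegacyCrop(value: unknown): value is LegacyCrop {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["x", "y", "w", "h"].every((key) => typeof record[key] === "number");
}

// Fix 5: convert { x, y, w, h } into { topLeft, bottomRight }, keeping the same area.
function convertLegacyCrop(shape: LooseShape): LooseShape {
  if (shape.type !== "image" && shape.type !== "video") return shape;
  const crop = shape.props?.crop;
  // Already { topLeft, bottomRight }, null, or absent: leave the shape untouched.
  if (!isLegacyCrop(crop)) return shape;
  return {
    ...shape,
    props: {
      ...shape.props,
      crop: {
        topLeft: { x: crop.x, y: crop.y },
        bottomRight: { x: crop.x + crop.w, y: crop.y + crop.h },
      },
    },
  };
}
```

Both functions return a new object and leave `meta` and every other property in place, which matches the goal of fixing structure without discarding data.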
## What We Preserve
We **preserve all user data**:
- ✅ `richText` content (we only fix structure, never delete content)
- ✅ `text` property on arrows
- ✅ All metadata (`meta` object)
- ✅ All valid shape properties
- ✅ Custom shape properties
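
To illustrate "fix structure, never delete content" for `richText`, here is a minimal sketch. `normalizeRichText` and `RichTextDoc` are hypothetical names, and the sketch assumes that a legacy array holds exactly the nodes that belong in `content`.

```typescript
// Assumed target structure: { type: 'doc', content: [...] }.
type RichTextDoc = { type: "doc"; content: unknown[] };

function normalizeRichText(value: unknown): RichTextDoc {
  // Legacy form: a bare array of nodes. Wrap it, keeping every node.
  if (Array.isArray(value)) return { type: "doc", content: value };

  if (typeof value === "object" && value !== null) {
    const content = (value as Record<string, unknown>).content;
    // Object with a content array but a missing or wrong type: keep the content.
    if (Array.isArray(content)) return { type: "doc", content };
  }

  // Nothing recoverable (missing or malformed value): fall back to an empty document.
  return { type: "doc", content: [] };
}
```

Only the wrapper changes. The nodes inside `content` are passed through as they are, so user text is never rewritten or dropped.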
## What We Remove (Only When Necessary)
We only remove properties that:
1. **Cause validation errors** (e.g., `w`/`h` on groups/lines)
2. **Are invalid for the shape type** (e.g., `geo` on text shapes)

We **never** remove:
- User-created content (text, richText)
- Valid metadata
- Properties that don't cause errors
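
One way to keep removals this narrow is to drive them from an explicit table, so nothing is deleted unless it is listed. The sketch below uses hypothetical names (`INVALID_KEYS`, `stripInvalidKeys`) and assumes the offending keys sit at the top level of the record; the same idea would apply to keys inside `props`.

```typescript
type ShapeLike = { type: string; [key: string]: unknown };

// Keys known to fail validation, per shape type. Anything not listed is kept.
const INVALID_KEYS: Record<string, readonly string[]> = {
  group: ["w", "h"], // group shapes have no w/h
  line: ["w", "h"], // line shapes use points instead
  text: ["h", "geo"], // text shapes have no h or geo
};

function stripInvalidKeys(shape: ShapeLike): ShapeLike {
  const invalid = INVALID_KEYS[shape.type] ?? [];
  const cleaned: ShapeLike = { type: shape.type };
  for (const [key, value] of Object.entries(shape)) {
    // Copy everything except the keys listed for this shape type.
    if (!invalid.includes(key)) cleaned[key] = value;
  }
  return cleaned;
}
```

A shape type that is absent from the table passes through unchanged, which makes "properties that don't cause errors" safe by default.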
## Current Sanitization Locations
1. **TLStoreToAutomerge.ts** - When saving from TLDraw to Automerge
   - Minimal fixes only
   - Preserves all data
2. **AutomergeToTLStore.ts** - When loading from Automerge to TLDraw
   - Minimal fixes only
   - Preserves all data
3. **useAutomergeStoreV2.ts** - Initial load processing
   - More extensive (handles migration from old formats)
   - Still preserves all user data
## Can We Simplify?
**Yes, but carefully:**
1. ✅ We can stop deleting properties that don't cause validation errors
2. ✅ We can consolidate duplicate logic
3. ❌ We **cannot** remove schema fixes (w/h/geo movement, richText structure)
4. ❌ We **cannot** stop deleting properties that do cause validation errors
## Recommendation
Keep sanitization but:
1. Only delete properties that **actually cause validation errors**
2. Preserve all user data (text, richText, metadata)
3. Consolidate duplicate logic between files
4. Add comments explaining why each fix is necessary
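
Points 3 and 4 could be combined by keeping the fixes in one shared list, where each entry records its own justification. The following is only a sketch of that idea: `Fix`, `FIXES`, `omitKeys`, and `sanitizeShape` are hypothetical names, and only two fixes are shown.

```typescript
type Shape = { type: string; [key: string]: unknown };

interface Fix {
  name: string;
  reason: string; // why the fix is needed, kept next to the code that applies it
  apply: (shape: Shape) => Shape;
}

// Return a copy of the shape without the given keys (type is always kept).
function omitKeys(shape: Shape, keys: readonly string[]): Shape {
  const out: Shape = { type: shape.type };
  for (const [key, value] of Object.entries(shape)) {
    if (!keys.includes(key)) out[key] = value;
  }
  return out;
}

// Single list that every call site could import instead of duplicating logic.
const FIXES: Fix[] = [
  {
    name: "group-has-no-size",
    reason: "Group shapes don't have w/h; leaving them causes validation errors.",
    apply: (s) => (s.type === "group" ? omitKeys(s, ["w", "h"]) : s),
  },
  {
    name: "text-has-no-h-or-geo",
    reason: "Text shapes don't have h or geo; leaving them causes validation errors.",
    apply: (s) => (s.type === "text" ? omitKeys(s, ["h", "geo"]) : s),
  },
];

// Apply every fix in order; each fix returns the shape unchanged when it does not apply.
function sanitizeShape(shape: Shape, fixes: readonly Fix[] = FIXES): Shape {
  return fixes.reduce((current, fix) => fix.apply(current), shape);
}
```

With this layout, the save path, the load path, and the initial-load migration could pass different subsets of `FIXES` to `sanitizeShape` while sharing the same implementations.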