# Data Conversion Summary
## Overview
This document summarizes how stored data is converted from the old tldraw sync format to the new Automerge sync format.
## Conversion Paths
The system handles three data formats automatically:
### 1. Automerge Array Format
- **Format**: `[{ state: { id: "...", ... } }, ...]`
- **Conversion**: `convertAutomergeToStore()`
- **Handles**: Raw Automerge document format
### 2. Store Format (Already Converted)
- **Format**: `{ store: { "recordId": {...}, ... }, schema: {...} }`
- **Conversion**: None needed - already in correct format
- **Handles**: Previously converted documents
### 3. Old Documents Format (Legacy)
- **Format**: `{ documents: [{ state: {...} }, ...] }`
- **Conversion**: `migrateDocumentsToStore()`
- **Handles**: Old tldraw sync format
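
The routing between the three formats can be sketched as a small detection function. This is only an illustration: `detectFormat`, `isPlainObject`, and the order of the checks are assumptions, not the actual logic in `getDocument()`.

```typescript
// Sketch only: these names are hypothetical, not the actual worker code.
type DetectedFormat = "automerge-array" | "store" | "documents" | "unknown";

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function detectFormat(raw: unknown): DetectedFormat {
  // 1. Automerge array format: [{ state: {...} }, ...]
  if (Array.isArray(raw)) return "automerge-array";
  if (isPlainObject(raw)) {
    // 2. Store format: { store: {...}, schema: {...} } (no conversion needed)
    if (isPlainObject(raw.store)) return "store";
    // 3. Legacy documents format: { documents: [{ state: {...} }, ...] }
    if (Array.isArray(raw.documents)) return "documents";
  }
  // Anything else is left for the caller to handle (for example, the fallback).
  return "unknown";
}
```

A caller would then dispatch on the result, for example passing `"automerge-array"` input to `convertAutomergeToStore()` and `"documents"` input to `migrateDocumentsToStore()`.
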
## Validation & Error Handling
### Record Validation
- ✅ Validates `state` property exists
- ✅ Validates `state.id` exists and is a string
- ✅ Validates `state.typeName` exists (for documents format)
- ✅ Skips invalid records with detailed logging
- ✅ Preserves valid records
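
The checks above might look roughly like the following. `validateRecord`, its `requireTypeName` flag, and the reason strings are hypothetical names chosen for this sketch; the real validation lives inside the conversion functions.

```typescript
// Sketch only: hypothetical helper mirroring the validation rules listed above.
type ValidationResult =
  | { ok: true; id: string }
  | { ok: false; reason: string };

function validateRecord(record: unknown, requireTypeName: boolean): ValidationResult {
  if (typeof record !== "object" || record === null) {
    return { ok: false, reason: "record is not an object" };
  }
  const state = (record as { state?: unknown }).state;
  if (typeof state !== "object" || state === null) {
    return { ok: false, reason: "missing state" };
  }
  const { id, typeName } = state as { id?: unknown; typeName?: unknown };
  if (id === undefined || id === null) {
    return { ok: false, reason: "missing state.id" };
  }
  if (typeof id !== "string") {
    return { ok: false, reason: "state.id is not a string" };
  }
  // Assumed to apply only to the legacy documents format.
  if (requireTypeName && typeof typeName !== "string") {
    return { ok: false, reason: "missing state.typeName" };
  }
  return { ok: true, id };
}
```

Returning a reason instead of throwing lets the caller skip the record, log why, and continue with the remaining valid records.
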
### Shape Migration
- ✅ Ensures required properties (x, y, rotation, opacity, isLocked, meta, index)
- ✅ Moves `w`/`h` from top-level to `props` for geo shapes
- ✅ Fixes richText structure
- ✅ Preserves custom shape properties (ObsNote, Holon, etc.)
- ✅ Tracks and verifies custom shapes
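
As a rough illustration of the first two steps, the sketch below fills in missing required properties and moves `w`/`h` into `props` for geo shapes. The function name `migrateShapeSketch` and the default values are assumptions (the defaults used by `migrateShapeProperties()` may differ), and the richText fix is omitted because its details are not described here.

```typescript
// Sketch only: not the actual migrateShapeProperties() implementation.
type ShapeRecord = {
  id: string;
  typeName: "shape";
  type: string;
  props?: Record<string, unknown>;
  [key: string]: unknown;
};

// Assumed defaults; a fresh object is built per call so `meta` is never shared.
const makeShapeDefaults = (): Record<string, unknown> => ({
  x: 0, y: 0, rotation: 0, opacity: 1, isLocked: false, meta: {}, index: "a1",
});

function migrateShapeSketch(shape: ShapeRecord): ShapeRecord {
  // Copy so the input record is left untouched.
  const props: Record<string, unknown> = { ...(shape.props ?? {}) };
  const migrated: ShapeRecord = { ...shape, props };

  // Add defaults only where a required property is missing.
  for (const [key, value] of Object.entries(makeShapeDefaults())) {
    if (migrated[key] === undefined) migrated[key] = value;
  }

  // Geo shapes keep their size in props, not at the top level.
  if (migrated.type === "geo") {
    for (const dim of ["w", "h"]) {
      if (migrated[dim] !== undefined) {
        if (props[dim] === undefined) props[dim] = migrated[dim];
        delete migrated[dim];
      }
    }
  }
  return migrated;
}
```

Because the function only adds or relocates properties, anything a custom shape such as ObsNote or Holon carries is passed through unchanged.
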
### Custom Records
- ✅ Preserves `obsidian_vault:` records
- ✅ Tracks custom record count
- ✅ Logs custom record IDs for verification
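
One simple way to track and verify custom records is to collect their IDs before conversion and compare them with the converted store. The helper names below are hypothetical; only the `obsidian_vault:` prefix comes from this document.

```typescript
// Sketch only: hypothetical helpers for tracking custom records by ID prefix.
const CUSTOM_RECORD_PREFIX = "obsidian_vault:";

function findCustomRecordIds(ids: string[]): string[] {
  return ids.filter((id) => id.startsWith(CUSTOM_RECORD_PREFIX));
}

// Returns the custom record IDs that are absent from the converted store.
function findMissingCustomRecords(
  expectedIds: string[],
  store: Record<string, unknown>,
): string[] {
  return expectedIds.filter((id) => !(id in store));
}
```

An empty result from `findMissingCustomRecords` indicates that every custom record seen in the input is present in the output.
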
## Logging & Statistics
All conversion functions now provide comprehensive statistics:
### Conversion Statistics Include:
- Total records processed
- Successfully converted count
- Skipped records (with reasons)
- Errors encountered
- Custom records preserved
- Shape types distribution
- Custom shapes preserved
### Log Levels:
- **Info**: Conversion statistics, successful conversions
- **Warn**: Skipped records, warnings (first 10 shown)
- **Error**: Conversion errors with details
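
The statistics and log levels above could be modeled as follows. The `ConversionStats` shape, the field names, and the message wording are assumptions made for this sketch; only the cap of 10 shown warnings is taken from this document.

```typescript
// Sketch only: a hypothetical shape for the statistics listed above.
interface ConversionStats {
  total: number;
  converted: number;
  skipped: string[]; // one reason per skipped record
  errors: string[];
  customRecordIds: string[];
  shapeTypes: Record<string, number>;
}

const MAX_LOGGED_WARNINGS = 10;

function buildLogLines(stats: ConversionStats): { info: string[]; warn: string[]; error: string[] } {
  const info = [
    `converted ${stats.converted}/${stats.total} records`,
    `custom records preserved: ${stats.customRecordIds.length}`,
    `shape types: ${JSON.stringify(stats.shapeTypes)}`,
  ];
  // Show only the first few warnings so a badly damaged room cannot flood the log.
  const warn = stats.skipped.slice(0, MAX_LOGGED_WARNINGS);
  const hidden = stats.skipped.length - warn.length;
  if (hidden > 0) warn.push(`... and ${hidden} more skipped records`);
  return { info, warn, error: [...stats.errors] };
}

function logStats(stats: ConversionStats): void {
  const lines = buildLogLines(stats);
  lines.info.forEach((line) => console.info(line));
  lines.warn.forEach((line) => console.warn(line));
  lines.error.forEach((line) => console.error(line));
}
```

Keeping the line-building step separate from the `console` calls makes the output easy to check in a unit test.
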
## Data Preservation Guarantees
### What is Preserved:
- ✅ All valid shape data
- ✅ All custom shape properties (ObsNote, Holon, etc.)
- ✅ All custom records (obsidian_vault)
- ✅ All metadata
- ✅ All text content
- ✅ All richText content (structure fixed, content preserved)
### What is Fixed:
- 🔧 Missing required properties (defaults added)
- 🔧 Invalid property locations (w/h moved to props)
- 🔧 Malformed richText structure
- 🔧 Missing typeName (inferred where possible)
### What is Skipped:
- ⚠️ Records with missing `state` property
- ⚠️ Records with missing `state.id`
- ⚠️ Records with invalid `state.id` type
- ⚠️ Records with missing `state.typeName` (for documents format)
## Testing
### Unit Tests
- `test-data-conversion.ts`: Tests edge cases with malformed data
- Covers: missing fields, null records, invalid types, custom records
### Integration Testing
- Test with real R2 data (see `test-r2-conversion.md`)
- Verify data integrity after conversion
- Check logs for warnings/errors
## Migration Safety
### Safety Features:
1. **Non-destructive**: Original R2 data is not modified until first save
2. **Error handling**: Invalid records are skipped and logged during conversion; they remain in the original R2 data until it is overwritten
3. **Comprehensive logging**: All actions are logged for debugging
4. **Fallback**: Creates empty document if conversion fails completely
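
The fallback behavior can be pictured as a thin wrapper around whichever conversion function applies. `convertWithFallback` and the `StoreDoc` type are hypothetical; the sketch assumes that a failed conversion surfaces as a thrown exception.

```typescript
// Sketch only: hypothetical wrapper illustrating the fallback described above.
type StoreDoc = { store: Record<string, unknown>; schema?: unknown };

function convertWithFallback(
  raw: unknown,
  convert: (input: unknown) => StoreDoc,
): StoreDoc {
  try {
    return convert(raw);
  } catch (err) {
    // Nothing is written back to storage here, so the original data stays as it was.
    console.error("Conversion failed, falling back to an empty document:", err);
    return { store: {} };
  }
}
```

The wrapper only reads `raw`, which matches the non-destructive guarantee: the stored data changes only when the room is next saved.
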
### Rollback:
- Original data remains in R2 until overwritten
- Data can be restored from a backup if needed
- Conversion errors don't corrupt existing data
## Performance
- Conversion happens once per room (cached)
- Statistics logging is bounded (only the first 10 warnings or errors are printed)
- Shape migration only processes shapes (not all records)
- Custom record tracking is lightweight
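
A minimal sketch of the "once per room" behavior, assuming an in-memory cache keyed by room ID. `ConvertedDocCache` is a hypothetical class; the actual worker may simply hold the converted document on the Durable Object instance.

```typescript
// Sketch only: hypothetical cache so each room's data is converted at most once.
class ConvertedDocCache<T> {
  private readonly byRoom = new Map<string, T>();

  getOrConvert(roomId: string, convert: () => T): T {
    if (this.byRoom.has(roomId)) {
      return this.byRoom.get(roomId) as T;
    }
    // First access for this room: run the conversion and remember the result.
    const doc = convert();
    this.byRoom.set(roomId, doc);
    return doc;
  }
}
```

Later calls for the same room return the cached document and never invoke `convert` again.
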
## Next Steps
1. ✅ Conversion logic implemented and validated
2. ✅ Comprehensive logging added
3. ✅ Custom records/shapes preservation verified
4. ✅ Edge case handling implemented
5. ⏳ Test with real R2 data (manual process)
6. ⏳ Monitor production conversions
## Files Modified
- `worker/AutomergeDurableObject.ts`: Main conversion logic
  - `getDocument()`: Format detection and routing
  - `convertAutomergeToStore()`: Automerge array conversion
  - `migrateDocumentsToStore()`: Old documents format conversion
  - `migrateShapeProperties()`: Shape property migration
## Key Improvements
1. **Validation**: All records are validated before conversion
2. **Logging**: Comprehensive statistics for debugging
3. **Error Handling**: Graceful handling of malformed data
4. **Preservation**: Custom records and shapes are tracked and verified
5. **Safety**: Non-destructive conversion with fallbacks