id: task-027
title: Implement proper Automerge CRDT sync for offline-first support
status: In Progress
assignee:
created_date: 2025-12-04 21:06
updated_date: 2025-12-25 23:59
labels: offline-sync, crdt, automerge, architecture
dependencies:
priority: high

Description

Replace the current "last-write-wins" full document replacement with the Automerge CRDT sync protocol. This ensures that deletions are preserved across offline/reconnect scenarios and that concurrent edits merge correctly.

Current problem: The server executes currentDoc.store = { ...newDoc.store }, which replaces the entire store instead of merging it. When an offline client reconnects with a stale copy, shapes that were deleted in the meantime reappear ("ghost resurrection").

Solution: Use Automerge's native binary sync protocol with proper CRDT merge semantics.
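The difference between full replacement and a merge that remembers deletions can be sketched without Automerge. This is a simplified illustration, not Automerge's algorithm or the project's code: Store, replaceStore, and mergeWithTombstones are hypothetical names, and a plain set of deleted ids stands in for the deletion history that a CRDT tracks.

```typescript
type Store = Record<string, { type: string }>;

// Full replacement, as in `currentDoc.store = { ...newDoc.store }`:
// whatever the reconnecting client sends becomes the server state.
function replaceStore(_server: Store, incoming: Store): Store {
  return { ...incoming };
}

// Simplified merge (assumption: deletions are tracked as a set of ids).
// Records deleted on the server stay deleted even if a stale client still has them.
function mergeWithTombstones(server: Store, incoming: Store, deleted: Set<string>): Store {
  const merged: Store = { ...server };
  for (const id of Object.keys(incoming)) {
    if (!deleted.has(id)) merged[id] = incoming[id];
  }
  return merged;
}

// The server deleted shape:b while the client was offline.
const serverStore: Store = { "shape:a": { type: "geo" } };
const deletedIds = new Set<string>(["shape:b"]);

// The stale client still holds shape:b and added shape:c while offline.
const staleClient: Store = {
  "shape:a": { type: "geo" },
  "shape:b": { type: "geo" },
  "shape:c": { type: "draw" },
};

const replaced = replaceStore(serverStore, staleClient);
const merged = mergeWithTombstones(serverStore, staleClient, deletedIds);

console.log("shape:b" in replaced); // true: the deleted shape is resurrected
console.log("shape:b" in merged);   // false: the deletion is preserved
console.log("shape:c" in merged);   // true: the offline addition is kept
```

Automerge records deletions as operations in the document history rather than in a side set, so the real sync path does not need a separate tombstone list.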

Acceptance Criteria

  • #1 Server stores Automerge binary documents in R2 (not JSON)
  • #2 Client-server communication uses Automerge sync protocol (binary messages)
  • #3 Deletions persist correctly when offline client reconnects
  • #4 Concurrent edits merge deterministically without data loss
  • #5 Existing JSON rooms are migrated to Automerge format
  • #6 All existing functionality continues to work

Implementation Notes

Progress Update (2025-12-04)

Implemented:

  1. automerge-init.ts - WASM initialization for Cloudflare Workers using slim variant
  2. automerge-sync-manager.ts - Core CRDT sync manager with proper merge semantics
  3. automerge-r2-storage.ts - Binary R2 storage for Automerge documents
  4. wasm.d.ts - TypeScript declarations for WASM imports

Integration Fixes:

  • getDocument() now returns CRDT document when sync manager is active
  • handleBinaryMessage() syncs currentDoc with CRDT state after updates
  • schedulePersistToR2() delegates to sync manager when CRDT mode is enabled
  • Fixed CloudflareAdapter TypeScript errors (peer-candidate peerMetadata)

Current State:

  • useCrdtSync = true flag is enabled
  • Worker compiles and runs successfully
  • JSON sync fallback works for backward compatibility
  • Binary sync infrastructure is in place
  • Needs production testing with multi-client sync and delete operations

Merged to dev branch (2025-12-05):

  • All Automerge CRDT infrastructure merged
  • WASM initialization, sync manager, R2 storage
  • Integration fixes for getDocument(), handleBinaryMessage(), schedulePersistToR2()
  • Ready for production testing

2025-12-05: Data Safety Mitigations Added

Added safety mitigations for Automerge format conversion (commit f8092d8 on feature/google-export):

Pre-conversion backups:

  • Before any format migration, raw document backed up to R2
  • Location: pre-conversion-backups/{roomId}/{timestamp}_{formatType}.json
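A minimal sketch of how a key following the location pattern above might be built. The helper name backupKey is hypothetical, and the timestamp format (epoch milliseconds) is an assumption; the actual format used by the worker is not stated here.

```typescript
// Hypothetical helper: builds an R2 object key of the form
// pre-conversion-backups/{roomId}/{timestamp}_{formatType}.json
// Assumption: the timestamp is epoch milliseconds.
function backupKey(roomId: string, formatType: string, now: Date = new Date()): string {
  const timestamp = now.getTime();
  return `pre-conversion-backups/${roomId}/${timestamp}_${formatType}.json`;
}

// Example with a fixed date so the result is reproducible.
const exampleKey = backupKey("room-1", "legacy", new Date(0));
console.log(exampleKey); // pre-conversion-backups/room-1/0_legacy.json
```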

Conversion threshold guards:

  • 10% loss threshold: Conversion aborts if record loss would exceed the threshold
  • 5% shape loss warning: Emits a warning if shapes are lost
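The two guards could be expressed roughly as follows. This is a sketch under assumptions, not the committed implementation: the names checkConversion and lossRatio are hypothetical, and both thresholds are treated here as strict "greater than" comparisons on the fraction of items lost.

```typescript
interface ConversionCheck {
  abort: boolean;      // true when the conversion should not be applied
  warnings: string[];  // non-fatal findings to log
}

const MAX_RECORD_LOSS = 0.10;    // assumption: abort above 10% record loss
const SHAPE_LOSS_WARNING = 0.05; // assumption: warn above 5% shape loss

// Fraction of items lost between the source and the converted document.
function lossRatio(before: number, after: number): number {
  if (before <= 0) return 0;
  return Math.max(0, before - after) / before;
}

function checkConversion(
  recordsBefore: number,
  recordsAfter: number,
  shapesBefore: number,
  shapesAfter: number,
): ConversionCheck {
  const warnings: string[] = [];
  const recordLoss = lossRatio(recordsBefore, recordsAfter);
  const shapeLoss = lossRatio(shapesBefore, shapesAfter);
  if (shapeLoss > SHAPE_LOSS_WARNING) {
    warnings.push(`shape loss ${(shapeLoss * 100).toFixed(1)}% exceeds warning level`);
  }
  return { abort: recordLoss > MAX_RECORD_LOSS, warnings };
}

// 15 of 100 records lost: abort. 3 of 40 shapes lost: warn only.
const heavyLoss = checkConversion(100, 85, 50, 50);
const shapeWarning = checkConversion(100, 95, 40, 37);
console.log(heavyLoss.abort, shapeWarning.abort, shapeWarning.warnings.length);
```

Keeping the check as a pure function over counts makes it easy to run before the converted document is written, so an aborted conversion leaves the original untouched.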

Unknown format handling:

  • Unknown formats backed up before creating empty document
  • Raw document keys logged for investigation

Also fixed:

  • Keyboard shortcuts dialog error (tldraw i18n objects)
  • Google Workspace integration now first in Settings > Integrations

Fixed persistence issue: handlePeerDisconnect now flushes pending saves. The client-side merge strategy in useAutomergeSyncRepo.ts now bootstraps from the server when the local document is empty, while preserving offline changes.
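The flush-on-disconnect idea can be sketched as a debounced persister with an explicit flush. The class name DebouncedPersister and the synchronous save callback are assumptions made for illustration; the worker's actual schedulePersistToR2() and handlePeerDisconnect code is not shown here and presumably writes to R2 asynchronously.

```typescript
class DebouncedPersister {
  private timer: ReturnType<typeof setTimeout> | null = null;
  private save: () => void;
  private delayMs: number;

  constructor(save: () => void, delayMs: number) {
    this.save = save;
    this.delayMs = delayMs;
  }

  // Schedule a save, replacing any save that is already pending.
  schedule(): void {
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = null;
      this.save();
    }, this.delayMs);
  }

  // Run a pending save immediately; does nothing when no save is pending.
  flush(): void {
    if (this.timer === null) return;
    clearTimeout(this.timer);
    this.timer = null;
    this.save();
  }
}

let saveCount = 0;
const persister = new DebouncedPersister(() => { saveCount += 1; }, 2000);

persister.schedule(); // an edit arrives, save is pending
persister.flush();    // peer disconnects: write now instead of waiting
persister.flush();    // nothing pending, no extra write
console.log(saveCount); // 1
```

Without the flush, a room whose last peer disconnects inside the debounce window could lose its most recent edits if the worker is evicted before the timer fires.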

Fixed TypeScript errors in the networking module: corrected the useSession -> useAuth import, added myConnections to the NetworkGraph type, and fixed the GraphEdge type alignment between client and worker.

Investigation Summary (2025-12-25)

Current Architecture:

  • Worker: CRDT sync enabled with SyncManager
  • Client: CloudflareNetworkAdapter with binary message support
  • Storage: IndexedDB for offline persistence

Issue: Automerge Repo is not generating sync messages when handle.change() is called. The JSON sync workaround is in use.

Suspected Root Cause: The Automerge Repo requires proper peer discovery. The adapter emits peer-candidate for the server, but the Repo may not be establishing a proper sync relationship with it.

Remaining ACs:

  • #2 Client-server binary protocol (partially working - needs Repo to generate messages)
  • #3 Deletions persist (needs testing once binary sync works)
  • #4 Concurrent edits merge (needs testing)
  • #6 All functionality works (JSON workaround is functional)

Next Steps:

  1. Add debug logging to adapter.send() to verify Repo calls
  2. Check sync states between local peer and server
  3. May need to manually trigger sync or fix Repo configuration

2025-12-25: Added debug logging and a peer-candidate re-emission fix to CloudflareAdapter.ts

Key fix: Re-emit peer-candidate after documentId is set to trigger Repo sync (timing issue)
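The timing issue and the re-emission can be illustrated with a small stand-in. This sketch does not use the automerge-repo API: SketchAdapter, connect, and setDocumentId are hypothetical names, and the listener is a simplified stand-in for the Repo that only starts syncing when a document id is already known at the moment the candidate arrives.

```typescript
type PeerListener = (peerId: string) => void;

class SketchAdapter {
  private listeners: PeerListener[] = [];
  private serverPeerId: string | null = null;
  documentId: string | null = null;

  onPeerCandidate(listener: PeerListener): void {
    this.listeners.push(listener);
  }

  // Socket connected: announce the server as a peer.
  // This can happen before the document id is known.
  connect(serverPeerId: string): void {
    this.serverPeerId = serverPeerId;
    this.emitPeerCandidate();
  }

  // Document id becomes known: announce the server again so the
  // listener gets a second chance to start syncing.
  setDocumentId(documentId: string): void {
    this.documentId = documentId;
    this.emitPeerCandidate();
  }

  private emitPeerCandidate(): void {
    const peerId = this.serverPeerId;
    if (peerId === null) return;
    for (const listener of this.listeners) listener(peerId);
  }
}

const adapter = new SketchAdapter();
const syncStarted: string[] = [];

// Stand-in for the Repo: ignores candidates that arrive before a document exists.
adapter.onPeerCandidate((peerId) => {
  if (adapter.documentId !== null) syncStarted.push(`${peerId}:${adapter.documentId}`);
});

adapter.connect("server");
const startedBeforeDocument = syncStarted.length; // 0: first candidate was too early
adapter.setDocumentId("room-1");
console.log(startedBeforeDocument, syncStarted);
```

Whether the real Repo behaves like this stand-in is exactly what the added debug logging is meant to confirm.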

Committed and pushed to the dev branch. Needs testing to verify that binary sync now works.