
Convex + PowerSync: Design Notes from the Experimental Release

Adding experimental Convex support: the design decisions we made, the parts of Convex that fit well, the rough edges, and the open questions we're still working through.

By Kobie Botha

We've added experimental Convex support to PowerSync, after a steady stream of requests for it from the community. This post is the engineering side of that work: the design decisions we made, the parts of Convex that fit PowerSync well, the rough edges, and the questions we're still working through.

PowerSync in thirty seconds

If you haven't used PowerSync before, here's a brief overview. PowerSync keeps a backend database in sync with a local SQLite database embedded in your app, so the app reads and writes locally, stays responsive, and keeps working whether or not it's online.

There are two main pieces: the PowerSync Service, which runs on the server and connects to your source database, and a client SDK, which manages the local SQLite database on each device. On the read path, the Service sends each client only the rows it should see, based on Sync Streams you define. That's partial sync: every device holds its own subset of the data, not a copy of the whole database. On the write path, your app writes to local SQLite, the SDK queues the change, and that queue uploads through a path you control. For Convex, that path is your existing Convex mutations.

Convex support is just another module

PowerSync is designed to be stack-agnostic: each supported source database (currently Postgres, MongoDB, MySQL, SQL Server, and now Convex) is its own module on top of a shared core. The core handles what's common to every backend: the sync protocol, bucket storage (how replicated data is partitioned for efficient syncing), and checkpointing (tracking how far replication has progressed). Each module handles the database-specific replication. So adding Convex support came down to writing the Convex-specific replication code, and that's mostly what the rest of this post covers.

How the Convex module replicates data

PowerSync reads from Convex through its Streaming Export API, using three endpoints: json_schemas lists the tables, list_snapshot reads a full copy of a table, and document_deltas returns changes to data in the order they happened.

When PowerSync first connects, it pins a single snapshot timestamp and reads every selected table at that timestamp. Because the timestamp stays fixed, the copy is consistent even though list_snapshot reads the tables one at a time.

That initial copy is resumable. PowerSync records how far it got in each table, so a restart continues instead of starting over. This matters for large datasets, where the first copy can take a while.
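To make the pinned timestamp and the resume behaviour concrete, here is a minimal sketch of the idea in TypeScript. It is not our replicator code and it does not use the real list_snapshot request or response shape: ListSnapshot, CopyProgress, copyTables and the maxPages parameter are all hypothetical names invented for illustration.

```typescript
// Toy sketch of a resumable, fixed-timestamp table copy.
// Every name here is hypothetical; this is not the real list_snapshot shape.
type Doc = { _id: string; _ts: number; [field: string]: unknown };

interface SnapshotPage {
  docs: Doc[];
  nextCursor: string | null; // null once the table is exhausted
}

// Stand-in for a call that pages through one table as of a fixed timestamp.
type ListSnapshot = (table: string, snapshotTs: number, cursor: string | null) => SnapshotPage;

// Progress that would be persisted between runs.
interface CopyProgress {
  snapshotTs: number; // pinned once, reused for every table and every restart
  tables: Record<string, { cursor: string | null; done: boolean }>;
}

function copyTables(
  tables: string[],
  progress: CopyProgress,
  listSnapshot: ListSnapshot,
  store: (table: string, docs: Doc[]) => void,
  maxPages: number = Number.POSITIVE_INFINITY, // lets a caller simulate an interruption
): number {
  let pagesRead = 0;
  for (const table of tables) {
    let state = progress.tables[table];
    if (state === undefined) {
      state = { cursor: null, done: false };
      progress.tables[table] = state;
    }
    while (!state.done) {
      if (pagesRead >= maxPages) return pagesRead; // "crash" before the next page
      const page = listSnapshot(table, progress.snapshotTs, state.cursor);
      store(table, page.docs);
      // Advance the saved position only after the page has been stored.
      state.cursor = page.nextCursor;
      state.done = page.nextCursor === null;
      pagesRead += 1;
    }
  }
  return pagesRead;
}
```

Calling copyTables a second time with the same progress object picks up at the saved cursor for each table and skips tables already marked done, which is the property that matters for a large first copy.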

Once the copy is done, PowerSync switches to streaming. It polls document_deltas for new changes, starting from the same timestamp the snapshot was taken at, so no rows are missed and nothing is copied twice. For each change, it uses your Sync Streams to decide which rows to keep, how to transform them, and which buckets they belong to. Buckets are the partitions PowerSync stores replicated data in, so that later, when a client connects, the Service can hand it just the buckets it needs.
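As a rough illustration of the streaming phase, the sketch below polls a stand-in for document_deltas and files each change into a bucket. It is a simplified model under stated assumptions: the document fields _id, _ts, _table and _deleted are assumed, while DeltaPage, BucketFor and pollOnce are hypothetical names, and a real Sync Stream is a SQL-like definition rather than a TypeScript callback.

```typescript
// Toy sketch of the streaming phase. The field names _id, _ts, _table and
// _deleted are assumed; the page shape and helper names are hypothetical.
type Change = {
  _id: string;
  _ts: number;
  _table: string;
  _deleted?: boolean;
  [field: string]: unknown;
};

interface DeltaPage {
  changes: Change[];
  cursor: number; // position to pass to the next poll
}

type DocumentDeltas = (cursor: number) => DeltaPage;

// Stand-in for a Sync Stream: returns the bucket for a row, or null to skip it.
type BucketFor = (change: Change) => string | null;

type Buckets = Map<string, Map<string, Change>>;

function pollOnce(
  cursor: number,
  documentDeltas: DocumentDeltas,
  bucketFor: BucketFor,
  buckets: Buckets,
): number {
  const page = documentDeltas(cursor);
  for (const change of page.changes) {
    if (change._deleted === true) {
      // A delete removes the row from whichever bucket currently holds it.
      buckets.forEach((rows) => rows.delete(change._id));
      continue;
    }
    const bucket = bucketFor(change);
    if (bucket === null) continue; // no stream selects this row
    const rows = buckets.get(bucket) ?? new Map<string, Change>();
    rows.set(change._id, change); // full document, so a plain overwrite is enough
    buckets.set(bucket, rows);
  }
  return page.cursor;
}
```

Starting the first poll at the snapshot timestamp is what ties the two phases together: anything at or before that timestamp came from the copy, and anything after it comes from the stream.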

What fit well

A few things about Convex lined up well with how PowerSync already works.

A Convex mutation is atomic, and every write in it shares one commit timestamp (_ts). Convex never splits those writes across two document_deltas pages, so we receive them together and commit them as a single transaction. A client never sees half a mutation.
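In code, that guarantee means a page of changes can be cut into transactions just by looking at _ts. The helper below is a minimal sketch of that grouping, assuming the input is already in commit order; groupByCommit is a hypothetical name, not a function in the module.

```typescript
// Group an ordered list of changes into per-commit batches.
// Assumes the list is already ordered by _ts, as described for document_deltas.
function groupByCommit<T extends { _ts: number }>(changes: T[]): T[][] {
  const groups: T[][] = [];
  for (const change of changes) {
    const current = groups[groups.length - 1];
    const first = current === undefined ? undefined : current[0];
    if (current !== undefined && first !== undefined && first._ts === change._ts) {
      current.push(change); // same commit timestamp: same transaction
    } else {
      groups.push([change]); // new commit timestamp: start a new transaction
    }
  }
  return groups;
}

// Example: one mutation that touched two documents, followed by another mutation.
const batches = groupByCommit([
  { _id: "order1", _ts: 10 },
  { _id: "stock7", _ts: 10 },
  { _id: "order2", _ts: 11 },
]);
console.log(batches.map((batch) => batch.map((change) => change._id)));
```

Because a commit is never split across pages, this grouping can run one page at a time without holding back the last group to wait for more rows.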

document_deltas also returns the complete document after each change, not just the fields that changed. That means we don't have to keep our own copy of the previous row to work out a diff, so we store less per row than we do for some other databases.

Another big win was having a stable ID. Convex gives every document an _id that never changes. Our Postgres, MySQL and SQL Server replicators all have to track replica identity, because the columns that identify a row are configured per table and a row's identity can change underneath you. With Convex, _id is always the identity, so we skipped that whole class of bookkeeping.

The checkpoint table

PowerSync needs to know when a client's queued writes have been acknowledged by Convex. We get that signal from the Convex replication cursor: once replication reaches the cursor recorded for a write checkpoint, we can tell the client its write is durable. The catch is that the document_deltas cursor only advances when something writes to Convex. On an idle deployment, polling document_deltas returns the same cursor over and over, so a write checkpoint can sit in storage, correct but never delivered, because no later change ever moves replication up to it.

Our fix is to write to Convex ourselves. After we record a write checkpoint, we call a createCheckpoint mutation that upserts a row in a small powersync_checkpoints table. That write produces a new entry in document_deltas, which advances the cursor. When the replicator sees the marker row, it doesn't replicate it as user data, but the new cursor position is enough to release the write checkpoint to the client. You deploy the mutation; we call it.
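The toy model below shows the shape of the problem and of the fix. It is deliberately simplified and is not how the Service is implemented: only the names powersync_checkpoints and createCheckpoint come from this post, the classes are hypothetical, and the rule that pending checkpoints are released whenever a batch is committed is a simplifying assumption.

```typescript
// Toy model of the idle-cursor problem and the marker-row fix.
// Only powersync_checkpoints and createCheckpoint are names from the post;
// every class and method here is hypothetical.
const CHECKPOINT_TABLE = "powersync_checkpoints";

interface Delta {
  table: string;
  ts: number;
}

class ToyDeployment {
  private ts = 0;
  private log: Delta[] = [];

  // Any write, including the marker upsert, appends to the change log.
  write(table: string): number {
    this.ts += 1;
    this.log.push({ table, ts: this.ts });
    return this.ts;
  }

  // The cursor only moves when there is something new to return.
  deltas(cursor: number): { changes: Delta[]; cursor: number } {
    const changes = this.log.filter((d) => d.ts > cursor);
    return { changes, cursor: changes.length > 0 ? this.ts : cursor };
  }
}

class ToyReplicator {
  cursor = 0;
  userChanges: Delta[] = [];
  pendingCheckpoints: string[] = [];
  releasedCheckpoints: string[] = [];

  poll(deployment: ToyDeployment): void {
    const page = deployment.deltas(this.cursor);
    if (page.cursor === this.cursor) return; // idle: nothing committed, nothing released
    for (const change of page.changes) {
      // Marker rows advance the cursor but are not replicated as user data.
      if (change.table !== CHECKPOINT_TABLE) this.userChanges.push(change);
    }
    this.cursor = page.cursor;
    // Simplification: pending checkpoints are released when a batch is committed.
    this.releasedCheckpoints.push(...this.pendingCheckpoints);
    this.pendingCheckpoints = [];
  }
}

// Stand-in for the deployed createCheckpoint mutation.
function createCheckpoint(deployment: ToyDeployment): number {
  return deployment.write(CHECKPOINT_TABLE);
}
```

In this model, polling an idle ToyDeployment any number of times leaves a pending checkpoint untouched, and a single createCheckpoint call followed by one poll releases it without adding anything to userChanges.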

We're not doing anything unusual here, though on other backends this step is generally plumbing that the developer never has to touch. For example, in our Postgres replicator we emit a logical replication message, which you never see, and with MongoDB we manage a _powersync_checkpoints collection for you. Ideally we'd like to remove this step for Convex too. That would take either a way to emit an ordered event into document_deltas without running a mutation, or a change to how we publish write checkpoints so that we don't need the cursor to advance past an already-committed position. We wrote up our full analysis in convex-write-checkpoints.md.

ID mapping with client-generated UUIDs

Convex generates document IDs server-side, so a client can't know a row's _id until after it's inserted. PowerSync needs the opposite: a stable local ID the moment a row is created, since the row lives in local SQLite and the app reads and writes it before the upload reaches Convex.

To bridge that, each replicated document carries a client-generated UUID, synced back to the client as uuid AS id. Convex keeps its own _id, the client uses the UUID, and your mutations map between the two on upload.
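Here is a minimal sketch of what that mapping involves, written against an in-memory stand-in rather than a real Convex table. ToyTable, upsertTask and toClientRow are hypothetical names for illustration; in an actual deployment the lookup would be a Convex mutation querying by the uuid field.

```typescript
// Toy sketch of the UUID-to-_id mapping. ToyTable stands in for a Convex table
// with an index on uuid; none of these names are Convex or PowerSync APIs.
interface TaskDoc {
  _id: string; // generated by the "server"
  uuid: string; // generated by the client
  title: string;
}

class ToyTable {
  private docs = new Map<string, TaskDoc>();
  private idByUuid = new Map<string, string>(); // plays the role of an index on uuid
  private nextId = 1;

  findByUuid(uuid: string): TaskDoc | undefined {
    const id = this.idByUuid.get(uuid);
    return id === undefined ? undefined : this.docs.get(id);
  }

  insert(fields: { uuid: string; title: string }): TaskDoc {
    const doc: TaskDoc = { _id: `doc_${this.nextId}`, ...fields };
    this.nextId += 1;
    this.docs.set(doc._id, doc);
    this.idByUuid.set(doc.uuid, doc._id);
    return doc;
  }

  patch(id: string, fields: { title: string }): void {
    const doc = this.docs.get(id);
    if (doc === undefined) throw new Error(`no document ${id}`);
    doc.title = fields.title;
  }

  count(): number {
    return this.docs.size;
  }
}

// What an upload mutation has to do: the client only knows its UUID (sent as id).
function upsertTask(table: ToyTable, args: { id: string; title: string }): string {
  const existing = table.findByUuid(args.id);
  if (existing !== undefined) {
    table.patch(existing._id, { title: args.title });
    return existing._id;
  }
  return table.insert({ uuid: args.id, title: args.title })._id;
}

// What the client sees after sync: uuid exposed as id, _id left out.
function toClientRow(doc: TaskDoc): { id: string; title: string } {
  return { id: doc.uuid, title: doc.title };
}
```

Looking up by UUID before inserting also makes the upload idempotent in this sketch: replaying the same queued write updates the existing document instead of creating a second one.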

This is the most visible DX cost of the integration today since every mutation has to handle the mapping. We looked into whether Convex could take a client-supplied ID instead, but from our analysis it would need significant changes on the Convex side. For now, it seems the UUID pattern is here to stay.

Schema changes

PowerSync reads each row's value straight from the JSON document in list_snapshot and document_deltas, not from Convex's json_schemas metadata. That's deliberate: json_schemas can omit a field until a document populates it, so reading types from it would tie a column's representation to whether Convex had reported that field yet.

A nice side effect is that schema changes mostly handle themselves. Adding a field, removing one, or changing a type flows through document_deltas as a normal document change, with no re-snapshot. The exception is dropped tables. Deleting a table emits no per-document deletes, so rows already on clients can linger. The workaround is to clear the table first, or delete its documents through a mutation. We've documented this limitation and plan to handle it properly in a later release.

Sync Streams instead of Convex server functions

The Convex team has been candid that they're not fully comfortable with this part of the design, and we think it's worth discussing openly.

In a Convex app, reads are TypeScript query functions deployed to Convex. Authorization lives in those functions, and Convex's reactivity pushes updates to subscribed clients. With PowerSync, the read path for synced data moves out of those functions. Which rows sync to each client is defined in Sync Streams (SQL-like queries the Service evaluates), and client reads run as SQL against local SQLite. The authorization you have in query functions has to be expressed again in Sync Streams for the synced data.

Sync Streams exist because of partial sync. A query function answers a query over the full database; a Sync Stream defines which subset reaches each device. Something has to declare that subset for offline-first, and server functions don't have a primitive for it today.

Writes are also affected. Because the client works against SQLite, values that Convex stores as structured data sync down as their SQLite equivalents. A Convex object field, for example, arrives as JSON text, so a mutation that writes it back has to convert it. The demo app has a basic example of this.
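As a rough sketch of that round trip (separate from the demo app's code), the helpers below flatten a document into SQLite-friendly values and parse an object column back on the way up. The exact mapping the module applies isn't spelled out in this post, so treat the rules here as assumptions, the boolean-to-integer rule in particular; toSqliteValue, toSqliteRow and parseObjectColumn are hypothetical names.

```typescript
// Toy sketch of value conversion between a document and a SQLite row.
// The mapping rules are assumptions for illustration, especially booleans.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type SqliteValue = null | number | string;

function toSqliteValue(value: Json): SqliteValue {
  if (value === null) return null;
  if (typeof value === "string" || typeof value === "number") return value;
  if (typeof value === "boolean") return value ? 1 : 0;
  return JSON.stringify(value); // objects and arrays become JSON text
}

function toSqliteRow(doc: { [key: string]: Json }): { [column: string]: SqliteValue } {
  const row: { [column: string]: SqliteValue } = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    row[key] = value === undefined ? null : toSqliteValue(value);
  }
  return row;
}

// On upload, a mutation has to turn the JSON text back into structured data.
function parseObjectColumn(text: string | null): Json {
  return text === null ? null : (JSON.parse(text) as Json);
}
```

The conversion down is lossy in one direction only: the client sees text, so the mutation is the place that knows which columns should be parsed back into objects before they are written.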

So the tradeoff is this: you get SQL on the client (including joins), offline reads, and devices that only hold data they're allowed to see. You pay by reworking the read path of an existing app, which is the biggest piece of migration work, plus some conversion work on writes. New apps have less of this to do.

Convex Components: the bigger open question

Components are Convex's mechanism for packaging reusable backend modules into a Convex app. What if the entire PowerSync Service ran as a Convex component? This is not impossible, but it would be a lot of work for us to implement, and we don't yet know whether the consistency guarantees PowerSync relies on would hold up. Still, if the integration gets traction, we're open to exploring it further.

Try it out

We shipped this early version to find out whether the current approach is good enough and to learn where the biggest gaps are. We want your feedback: it will directly influence whether – and how – this integration evolves.

Get started with our Setup Guide. Share your feedback by opening an issue on GitHub or join us in Discord.