Blog — Ramin Sharifi

A few weeks after I wrote about the architecture of TimeLogger, the obvious follow-up is the part every architecture post glosses over: what actually broke.

Shipping one app across iPhone, Mac, Watch, and widget extensions means you don't get to write one bug class — you get to write one per platform. Timestamps lie differently on each, sync transports each fail in their own dialect, and SwiftUI has at least three state systems that look identical until they don't.

These are the bugs from the last month of work that I think are worth writing down. Not the typos, not the "oh right I forgot a guard statement" kind — the bugs where the code was doing exactly what I'd asked it to and I had asked for the wrong thing.

1. Seconds-precision timestamps and the strict-greater-than trap

The daemon's sync endpoint decides whether to apply an incoming deletion with a check that reads like the safest thing in the world:

rust

if deleted_at > existing.last_modified {
    apply_deletion();
}

The daemon's timestamps are whole seconds — Local::now().timestamp(). So was the iPhone's when it built the PendingDeletion. Stop a timer on the phone, the deletion goes out stamped with the current second, the timer's last_modified is… the current second. Strict > fails. Deletion dropped.

What the user saw: tapping Stop on iPhone did nothing visible on the Mac. Stop again on the Mac and now there were two entries — a duplicate, because the daemon doesn't dedupe entry inserts on local_id.

A second bug, same shape: the sync watermark on iOS was also stored as a whole-second timestamp, and the server-side query used strict > against it. Any timer inserted on the Mac during the same wall-clock second as a completing iPhone sync became permanently invisible to the phone on every later poll. A one-second race condition that silently hid data forever.

Each bug got its own fix. For the deletion, force the client-side timestamp strictly past the server's record. For the watermark, back the stored value off by one second so the boundary second is always included.

Lesson: if you're comparing two timestamps and both sides round to the same unit, strict inequality is wrong. Use >=, or add an epsilon, or bump the loser by one tick. Don't trust your wall clock to be unique within a second — especially across devices with slightly skewed clocks.
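A minimal sketch of the two comparisons, written in Swift for illustration only. The daemon itself is Rust, and the names here (EpochSeconds, shouldApplyDeletion, watermarkToStore) are hypothetical, not the project's real API.

```swift
import Foundation

/// Whole-second Unix timestamps, the precision both sides were using.
typealias EpochSeconds = Int64

/// Inclusive comparison: a deletion stamped in the same second as the
/// record's last modification still applies.
func shouldApplyDeletion(deletedAt: EpochSeconds, lastModified: EpochSeconds) -> Bool {
    return deletedAt >= lastModified
}

/// Back the stored watermark off by one second so a later
/// `modified > watermark` query still includes the boundary second.
func watermarkToStore(syncCompletedAt: EpochSeconds) -> EpochSeconds {
    return syncCompletedAt - 1
}
```

The back-off trades a missed row for a possibly repeated one: rows from the boundary second can come back on the next poll, so the receiving side has to treat an already-seen row as a no-op.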

2. BLE dropped chunks silently, because `updateValue:` returns a boolean nobody reads

CoreBluetooth peripherals send data via updateValue:forCharacteristic:onSubscribedCentrals:. The method returns a BOOL telling you whether the update was sent; it returns NO when the underlying transmit queue is full. Under back-pressure, which happens on basically any multi-chunk response, NO is exactly what you get.

The original code did this:

objc

// Send chunk; if it fails, sleep 10ms and retry once.
BOOL ok = [peripheral updateValue:chunk ...];
if (!ok) {
    usleep(10 * 1000);
    [peripheral updateValue:chunk ...]; // retry return value ignored
}

Notice what the retry does not do: check whether the retry succeeded. So on any response large enough to need multiple chunks, a single dropped chunk meant the iPhone never saw CHUNK_LAST and waited out its 10-second request timeout. Every sync. Visibly fine on small payloads, broken on everything else.

The CoreBluetooth-sanctioned pattern is to hand chunks to a pending queue on the BLE dispatch queue and drain them from peripheralManagerIsReadyToUpdateSubscribers: — a delegate callback Apple fires when the transmit queue has room again. The previous code had a stub for that delegate method that did nothing.

Switched to the drain pattern. Sync went from flaky to reliable. Also bumped the iOS request timeout from 10s to 30s — first-sync payloads can be several KB, .withResponse chunked writes run around 100ms each, and 10s was tight enough that clean syncs were timing out on cold starts.

Lesson: when an API returns a status code, read it. When the docs describe a delegate method's back-pressure role, implement it. "Sleep and retry once" is not a substitute for flow control.
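The drain pattern can be sketched without CoreBluetooth at all. This is a simplified Swift model, not the app's code: ChunkDrainer is a hypothetical name, and the trySend closure stands in for the real updateValue call, returning false when the transmit queue is full.

```swift
import Foundation

/// Queues chunks and sends them only while the transport accepts them.
/// `trySend` is a hypothetical stand-in for the peripheral's updateValue
/// call: it returns false when the transmit queue is full.
final class ChunkDrainer {
    private var pending: [Data] = []
    private let trySend: (Data) -> Bool

    init(trySend: @escaping (Data) -> Bool) {
        self.trySend = trySend
    }

    /// Number of chunks still waiting to be sent.
    var pendingCount: Int { return pending.count }

    /// Add chunks to the back of the queue and attempt to send immediately.
    func enqueue(_ chunks: [Data]) {
        pending.append(contentsOf: chunks)
        drain()
    }

    /// Call again from the "ready to update subscribers" delegate callback.
    /// Stops at the first rejected chunk and leaves it at the head of the queue.
    func drain() {
        while let next = pending.first {
            guard trySend(next) else { return }
            pending.removeFirst()
        }
    }
}
```

The class is not thread-safe by itself. It assumes enqueue and drain are both called on the same serial queue, which is the role the BLE dispatch queue plays above.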

3. The BLE sync failure silently grew the payload every cycle

This is a consequence of bug #2 and it took me longer than I'd like to admit to see it.

Before the chunk drain fix, every sync above ~1 KB timed out and the iPhone left needsSync=true on those rows. The next poll re-sent the same entries — but because the sync had failed, they still had no server_id. The daemon, seeing a brand-new-looking entry, inserted a fresh copy. Those fresh copies then got pulled back into SwiftData via the iCloud path. Next poll, the outgoing payload was bigger. Next one, bigger still.

On launch after the fix shipped, some users' local stores had dozens of duplicates of rows that should have synced weeks earlier. Even with flow control working, the payload was now too large to get through BLE in any reasonable window.

The unblock was a launch-time fold: pair rows by natural key (start time + category + duration, basically), keep the one with a server_id, and enqueue PendingDeletion records for the rest so the daemon drops its redundant copies too. A one-shot dedupe that both shrinks the next payload enough to fit and self-heals the server-side bloat.

Lesson: a broken sync doesn't just leave you with missing data, it can leave you with growing data. If you're building anything with retry-on-failure and the identity model isn't airtight (no server_id + no local_id dedup = duplicates), you will eventually need a reconciliation pass.
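A rough sketch of the launch-time fold. The Entry shape and the exact natural key are assumptions made for illustration (the real model is not shown in this post), and foldDuplicates is a hypothetical name.

```swift
import Foundation

/// Hypothetical row shape; the real model's fields are not shown in this post.
struct Entry {
    let localID: UUID
    let serverID: Int?
    let start: Date
    let category: String
    let duration: TimeInterval
}

/// Natural key used to recognise copies of the same logical entry.
struct NaturalKey: Hashable {
    let start: Date
    let category: String
    let duration: TimeInterval
}

/// Groups rows by natural key. For each group, keeps one row (preferring one
/// that already has a server id) and returns the others as duplicates.
func foldDuplicates(_ entries: [Entry]) -> (keep: [Entry], duplicates: [Entry]) {
    let groups = Dictionary(grouping: entries) {
        NaturalKey(start: $0.start, category: $0.category, duration: $0.duration)
    }
    var keep: [Entry] = []
    var duplicates: [Entry] = []
    for rows in groups.values {
        // Groups are never empty, so index 0 is a safe fallback.
        let winnerIndex = rows.firstIndex(where: { $0.serverID != nil }) ?? 0
        for (index, row) in rows.enumerated() {
            if index == winnerIndex {
                keep.append(row)
            } else {
                duplicates.append(row)
            }
        }
    }
    return (keep, duplicates)
}
```

In the flow described above, each row in duplicates would become a PendingDeletion so the daemon drops its copy as well.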

4. `SWIFT TASK CONTINUATION MISUSE: sendRequest leaked its continuation without resuming it`

Two sync paths in the iOS app could land on BLEManager.sendRequest concurrently:

The top-level 2-second poll in ContentView
A 500ms-debounced post-mutation sync from scheduleSyncAfterMutation

sendRequest stored a single pendingCompletion continuation to resolve when the response came back. Whoever arrived second overwrote the first. The first continuation was now unreachable — neither resumed nor thrown away — and Swift's concurrency runtime prints a loud warning when a continuation is deallocated without being resumed.

In practice the first caller hung for a few seconds until its timeout, then the sync engine retried on the next tick. Visible symptom: intermittent sync stalls under heavy activity.

Fix was one line of intent: the second caller should not overwrite, it should bail with BLEError.busy. SyncEngine already retries on the next tick.

swift

guard pendingCompletion == nil else {
    throw BLEError.busy
}
pendingCompletion = continuation

Lesson: any Swift async bridge that stores a single continuation in shared mutable state is a concurrency bug waiting for the schedule to cooperate. Either serialize the callers with an actor, or be explicit about what happens when one arrives while another is in flight.

5. CloudKit has two failure modes that look like success

Four CloudKit landmines inside a month. The first two were diagnosed after users saw "sync stopped working for no reason":

The stale change token. Incremental fetches send a CKServerChangeToken saved from the last fetch. If enough time passes, or Apple rotates the server's state, the token expires. The error is CKError.changeTokenExpired with the charming message "client knowledge differs from server knowledge." The naive handler logs the error and returns — which surfaces it as "sync failure" to the user, but nothing gets fixed. You have to drop the token and refetch with nil.

The missing zone. Same shape: .zoneNotFound / .userDeletedZone. If the user wipes CloudKit data from Settings, the custom zone disappears. Every subsequent sync errors until the client recreates the zone. Also needs to be caught and self-healed on the next sync, not surfaced as a permanent error.

The redundant auto-export. I was running a hand-written CloudKitManager that owned the CloudKit push loop. Separately, adding the iCloud entitlement flipped SwiftData's ModelConfiguration into cloudKitDatabase: .automatic, which quietly registered CoreData+CloudKit export background tasks against the same container. Symptom: BGSystemTaskSchedulerErrorDomain code=3 noise from com.apple.coredata.cloudkit.activity.export.* flooding the console, and occasional double-writes under sync pressure. Fix was to explicitly set cloudKitDatabase: .none on the ModelConfiguration — you can have SwiftData's auto-sync or you can drive CloudKit yourself, but not both on the same container.

The missing background mode. "CloudKit push notifications require the remote-notification background mode" is a runtime warning that is never shown to the user, but without that mode silent CloudKit pushes never wake the app, and iCloud-side changes only land on the next foreground sync. Added remote-notification to UIBackgroundModes in Info.plist, done.

Lesson: CloudKit integration is a set of five or six opt-in pieces that all have to be configured correctly, and most of them fail in ways that look like "sync is just slow today." Treat every CloudKit error as a state machine input, not a log line.
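"State machine input" can be made concrete with a small mapping from error to next step. This sketch deliberately avoids importing CloudKit: SyncFailure is a local stand-in for the CKError codes named above, and RecoveryAction and recoveryAction(for:) are hypothetical names.

```swift
/// Local stand-ins for the CloudKit error codes discussed above.
/// In a real app these would be read from the CKError's code.
enum SyncFailure {
    case changeTokenExpired
    case zoneNotFound
    case userDeletedZone
    case other(String)
}

/// What the sync engine should do next; the names are hypothetical.
enum RecoveryAction: Equatable {
    case dropTokenAndRefetch
    case recreateZoneAndResync
    case surfaceToUser(String)
}

/// Maps each failure to a recovery step instead of only logging it.
func recoveryAction(for failure: SyncFailure) -> RecoveryAction {
    switch failure {
    case .changeTokenExpired:
        // The saved token is useless: forget it and fetch from scratch.
        return .dropTokenAndRefetch
    case .zoneNotFound, .userDeletedZone:
        // The custom zone is gone: recreate it, then sync again.
        return .recreateZoneAndResync
    case .other(let message):
        // Anything unrecognised is still shown, not swallowed.
        return .surfaceToUser(message)
    }
}
```

Keeping the mapping as a pure function makes the self-healing cases testable without a CloudKit container.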

6. Live Activities outlive your process

Live Activities are the running-timer bubbles that appear in the Dynamic Island and on the Lock Screen. You start one with Activity.request(…) and end it with .end(…). Straightforward, except:

The currentActivityId I was remembering lived in a static variable. Statics reset when the app process dies. Live Activities do not. They're owned by the system and persist across app kills, reboots, and the heat death of your battery.

So: start a timer, background the app, force-quit it, relaunch. The Live Activity from earlier is still sitting there. The app thinks currentActivityId == nil and its publish(runningTimer: nil) call quietly does nothing because there's no id to end. Meanwhile Apple's UI is still showing a timer that's been dead for an hour.

Worse: tap Start again. Now you have two Live Activities, and only one of them will actually update.

The fix is to stop trusting your remembered id and start iterating Apple's source of truth:

swift

for activity in Activity<TimeLoggerAttributes>.activities {
    await activity.end(dismissalPolicy: .immediate)
}

On stop, end every stray. On update, end duplicates and keep one. On relaunch with a timer running, adopt the existing activity instead of spawning another. This is more code than the static-id version, but it's the only version that actually matches the lifetime of the thing you're managing.

Lesson: when the system owns the lifetime of a resource, your in-process bookkeeping is a cache, not a source of truth. Reconcile against the system, not against your memory of what you did last time.
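The stop/update/relaunch rules reduce to one decision over whatever the system currently reports. Here is a minimal model of that decision, with plain strings standing in for activity ids. ActivityPlan and reconcile are hypothetical names, and the real code would read the ids from ActivityKit.

```swift
/// Outcome of reconciling against the system's list of activities.
struct ActivityPlan: Equatable {
    var adopt: String?   // id of the activity to keep updating, if any
    var end: [String]    // ids of activities to end
    var needsNew: Bool   // true when a timer runs but nothing exists to adopt
}

/// `systemActivityIDs` stands in for the ids the system reports right now.
func reconcile(systemActivityIDs: [String], timerRunning: Bool) -> ActivityPlan {
    // No running timer: every activity still on screen is a stray.
    guard timerRunning else {
        return ActivityPlan(adopt: nil, end: systemActivityIDs, needsNew: false)
    }
    // Timer running but nothing to adopt: a new activity must be requested.
    guard let first = systemActivityIDs.first else {
        return ActivityPlan(adopt: nil, end: [], needsNew: true)
    }
    // Timer running: adopt one activity and end the duplicates.
    return ActivityPlan(adopt: first,
                        end: Array(systemActivityIDs.dropFirst()),
                        needsNew: false)
}
```

Nothing in the plan depends on a remembered id, so it gives the same answer before and after a force-quit.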

7. JSONSerialization wrote 1970, JSONDecoder read 2001

This one was almost funny once I saw it.

The iOS widget was showing wrong data after midnight. Sometimes stale, sometimes duration bars that extended off the side of the screen, sometimes dates that said "in 31 years."

The main app wrote its widget snapshot using JSONSerialization — which for dates just dumps whatever you give it. I was giving it Date().timeIntervalSince1970, so: seconds since 1970.

The widget decoded the snapshot using JSONDecoder(). JSONDecoder's default dateDecodingStrategy is .deferredToDate, which assumes seconds since 2001 — Apple's reference date. So every timestamp was being read as if it were 31 years later than it actually was.

Both sides now pin .secondsSince1970 explicitly, and — equally important — share the same Codable model between the app and the widget extension instead of each side doing ad-hoc serialization.

There was a second layer to the midnight bug: a running timer that spans midnight was being shown with its full duration on "today's" bar, so at 00:01 you'd see a giant eight-hour bar from the previous day's overnight work. The horizon view now clamps the running bar's contribution to the current day and adds a post-midnight timeline entry so the horizon resets visibly even if iOS is slow to fire the scheduled widget reload.

Lesson: JSONSerialization and JSONEncoder/JSONDecoder do not serialize dates the same way. If you're writing with one and reading with the other, pin the strategy explicitly on both sides. And for anything cross-process, share the Codable model; don't roll it twice.
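The mismatch is easy to reproduce in a few lines of Foundation. Snapshot here is a one-field stand-in for the real widget model, and the function names are made up for the example.

```swift
import Foundation

/// One-field stand-in for the widget snapshot model.
struct Snapshot: Codable {
    let startedAt: Date
}

/// Writes a Unix timestamp the way the app did, via JSONSerialization.
func writeSnapshot(unixSeconds: TimeInterval) throws -> Data {
    return try JSONSerialization.data(withJSONObject: ["startedAt": unixSeconds])
}

/// Decodes with the default strategy, which reads the number as
/// seconds since the 2001 reference date.
func decodeWithDefault(_ data: Data) throws -> Snapshot {
    return try JSONDecoder().decode(Snapshot.self, from: data)
}

/// Decodes with the strategy pinned to match what was written.
func decodePinned(_ data: Data) throws -> Snapshot {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .secondsSince1970
    return try decoder.decode(Snapshot.self, from: data)
}
```

Decoding the same bytes both ways gives two dates exactly 978,307,200 seconds apart, the gap between the 1970 and 2001 epochs. That gap is the "in 31 years" symptom.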

8. SwiftUI affordances that quietly don't work

A small taxonomy:

.swipeActions is a no-op inside ScrollView + LazyVStack. It only works inside a List. The Entries screen was using the scroll/stack pair for styling reasons, so the swipe-to-delete the user kept trying on the Log tab simply didn't exist — the modifier was there in the code, it just did nothing at runtime, no warning, no log. I ended up writing a custom SwipeToDelete wrapper that supports short-swipe reveal and full-swipe commit. Fine in the end, but the silent no-op cost me an afternoon of "why is this not responding."

Quick-start category grid jittered on every render. The code was roughly categories.prefix(4).sorted() on a Set. Set has no guaranteed iteration order, so .prefix(4) took four arbitrary elements and only then sorted them, meaning the four visible tiles could change whenever the set was rebuilt. Every sync tick changed allEntries, the memo recomputed, and the visible four tiles shuffled. The fix is every first-year CS student's answer: sort first, truncate second.

try? modelContext.save() eats every error. Scattered across the codebase, it looks innocuous — "I don't care about the result." But SwiftData's save() can fail for a lot of reasons that matter to the user: disk full, schema migration needed, constraint violation. The user sees the UI revert their change with no explanation. Every try? in a mutation path got replaced with a persist(_:) helper that surfaces failures via an inline banner and rolls back the optimistic update on save error.

The fake macOS traffic lights. This one wasn't a bug in the Swift-strict sense, just a UX landmine. My sidebar had three hand-painted close/minimize/zoom circles drawn at the top of a header view. The real AppKit window chrome rendered above them. Users — surprise — kept clicking the decorative ones, not the real ones. The decorative ones had no action wiring, so nothing happened. Deleted them. Real chrome is enough.

Lesson: SwiftUI is a great-looking framework with a lot of affordances that fail by doing nothing. Any modifier that expects a specific parent view is a landmine if you put it somewhere else. Audit your tap targets by actually tapping them, not by looking at the code.
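For the quick-start grid specifically, "sort first, truncate second" is a one-liner. A small sketch, assuming the categories are plain strings (quickStartTiles is a hypothetical name):

```swift
/// Sort first, truncate second: the same set contents always give the same tiles.
func quickStartTiles(from categories: Set<String>, limit: Int = 4) -> [String] {
    return Array(categories.sorted().prefix(limit))
}
```

Because the whole set is sorted before the cut, the result depends only on the set's contents, not on the order in which it happens to iterate.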

9. "The button isn't working" when the button was fine

A long tail of bug reports reduced to the same root cause: the view's data was stale. Specifically, on macOS, views like TodosView and EntriesView each kept their own @State copy of the list and only reloaded on .task or right after their own mutation. When another device added a todo, the Mac's list sat stale for up to the poll interval — which read to the user as "todo add doesn't work."

The fix was to hoist the shared state onto APIClient as @Published streams, fetch them on every 1s polling tick (with async let so one tick is three concurrent fetches), and expose a pokeNow() that races against the sleep so post-mutation refreshes are immediate.

Also: ⌘1–⌘6 sidebar navigation was advertised in the sidebar legend but never bound, because the Commands block lived inside the MenuBarExtra popup and only fired when that popup was open. Same for ⌘N for new timer. Wired them at the Scene level so they work from the main window.

Lesson: "the button isn't doing anything" is almost always one of three things — the button never fires, the mutation never happens, or the view never re-reads. Before you touch the button, prove which layer is actually broken. For cross-device state, default to centralized observable stores, not per-view @State.

10. The App Store Connect upload gauntlet

Two validation errors on the first upload:

90474 — UISupportedInterfaceOrientations was absent from the main bundle Info.plist. iPad multitasking wants it declared, even for "iPhone" apps that also run on iPad.
90717 — The marketing icon-1024.png had an alpha channel (transparent rounded corners from my icon generator). App Store requires the marketing icon to be opaque — the system renders the rounded corners at runtime. Flattened against the icon's own background color, re-uploaded, done.

Not interesting in themselves, but worth writing down because App Store Connect's error codes are otherwise opaque and the fixes are specific.

Meta lessons

Stepping back, the bugs above cluster into a few themes I want to internalize:

1. Every "==" across process/device boundaries is actually an "≈". Timestamps are the obvious one, but it extends to ids (server vs. local), strings (normalization), and even floats (FP serialization). Design for approximate equality from the start.

2. Silent failure paths are the expensive ones. try?, ignored return values, no-op modifiers, static ids that reset — every bug in the list above had a version of "the code ran without error and nothing happened." Visible errors get fixed in hours. Silent ones get fixed in weeks, after a user notices.

3. The sync protocol is the product. In a multi-device app, your offline-first layer, your deduplication, and your reconciliation pass are the user experience, not a background detail. Budget for them proportionally.

4. When Apple's system owns a lifetime, reconcile instead of remember. Live Activities, CloudKit zones, BLE subscriptions, background tasks — all have lifetimes that outlive your process. The only safe model is to query the system for current state, not to cache your last action.

5. The cheapest fix happens before first ship. Roughly half of these bugs would have been caught by actually using the app for a full day across two devices before shipping. The other half genuinely required real-world network conditions to reproduce. Beta-testing yourself is underrated.

None of these were insurmountable. All of them were embarrassing in hindsight. That's the shape of honest debugging work — the category of bug where, once you see it, you can't unsee it. Writing them down is the only way to not relearn the same lesson six months from now.

If you've shipped an app across Apple's platforms and have your own favorite footgun, I'd love to hear it. Nothing makes these stories less lonely than finding out everyone else has a scar in the same spot.

The Bugs That Cost Me Time: Shipping TimeLogger Across Four Apple Platforms