[Guide] Build a Photo Annotator (Keyboard Support Later) - implementation notes
The Photo Annotator guide has a pragmatic take: ship a solid pointer-driven annotator first, then layer keyboard support later using key-bridging primitives and focus scheduling so you can add shortcuts without rewriting the canvas interaction model. The route-local annotation docs idea also seems useful for debuggability ("what did we think the annotation was?").
How did you structure the key-bridge so it stayed predictable (no global shortcut soup)? Where did you store annotation truth: route-local doc, vault, or component state tied to the canvas? What focus scheduling rules worked so keyboard support didn't feel janky or inaccessible? How did you render evidence so bug reports include the active tool/selection/anchor info?
Comments (20)
Key bridging only worked once we treated it like a contract surface with a tiny vocabulary.
If every tool invents its own shortcuts, the bridge becomes chaos.
We implemented key bridging as a single dispatcher with a strict "reason" string and a focus schedule. The log format mattered because it made shortcut behavior auditable:
txt
[tips] keyBridge key=Escape handled=true action=selection:clear reason=user:key focusTarget=canvas
[tips] focusSchedule intent=canvas:focus at=now reason=selection:clear
If someone says "Escape didn't work", you can inspect whether it was handled and where focus was supposed to go.
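For anyone who wants to try the single-dispatcher idea, here is a minimal sketch that produces a log line in the same shape as the one quoted above. The names `KEY_TABLE`, `dispatchKey`, and `formatKeyBridgeLog`, the `Delete` mapping, and the `unmapped` reason are assumptions made for illustration, not something taken from the guide.

```typescript
// Hypothetical sketch: all names here are illustrative, not from the guide.
type FocusTarget = "canvas" | "sidePanel" | "labelInput";

interface KeyBridgeResult {
  key: string;
  handled: boolean;
  action?: string;          // e.g. "selection:clear"
  reason: string;           // small fixed vocabulary, e.g. "user:key"
  focusTarget?: FocusTarget;
}

// One table holds the whole shortcut vocabulary, so tools cannot invent their own.
const KEY_TABLE = new Map<string, { action: string; focusTarget: FocusTarget }>([
  ["Escape", { action: "selection:clear", focusTarget: "canvas" }],
  ["Delete", { action: "selection:delete", focusTarget: "canvas" }], // assumed mapping
]);

function dispatchKey(key: string): KeyBridgeResult {
  const entry = KEY_TABLE.get(key);
  if (entry === undefined) {
    // "unmapped" is an assumed reason string for keys outside the vocabulary.
    return { key, handled: false, reason: "unmapped" };
  }
  return {
    key,
    handled: true,
    action: entry.action,
    reason: "user:key",
    focusTarget: entry.focusTarget,
  };
}

function formatKeyBridgeLog(r: KeyBridgeResult): string {
  const parts = [`key=${r.key}`, `handled=${r.handled}`];
  if (r.action !== undefined) parts.push(`action=${r.action}`);
  parts.push(`reason=${r.reason}`);
  if (r.focusTarget !== undefined) parts.push(`focusTarget=${r.focusTarget}`);
  return `[tips] keyBridge ${parts.join(" ")}`;
}
```

Because the dispatcher returns a plain result object, the same value can feed both the log and whatever schedules focus afterwards.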
Counterpoint: deferring keyboard support can paint you into a corner if the pointer model isn't accessible.
We still designed the tool model as if keyboard existed (tool, selection, anchor) even before wiring shortcuts.
Yes, "later" can't mean "never planned".
Key bridging works because the tool model is already explicit and can be driven from multiple inputs.
Route-local annotation docs were surprisingly helpful.
We store a doc per photo with activeTool, selectionId, and draftAnchor, and we render those as evidence in debug mode.
We kept annotation truth as a route-local doc and treated the canvas as a renderer, not as the source of truth.
That made undo/redo and cross-panel UI (side list of annotations) consistent.
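A rough sketch of that arrangement, assuming the doc fields mentioned earlier in the thread (`activeTool`, `selectionId`, `draftAnchor`). The `DocHistory` class and the two render functions are hypothetical; the point is only that undo/redo operates on the doc and both views read from it.

```typescript
// Hypothetical sketch: the doc is the source of truth, views only read it.
interface AnnotationDoc {
  activeTool: string;
  selectionId: string | null;
  draftAnchor: { x: number; y: number } | null;
}

class DocHistory {
  private past: AnnotationDoc[] = [];
  private future: AnnotationDoc[] = [];
  private current: AnnotationDoc;

  constructor(initial: AnnotationDoc) {
    this.current = initial;
  }

  get doc(): AnnotationDoc {
    return this.current;
  }

  // Every change creates a new doc value and clears the redo stack.
  update(patch: Partial<AnnotationDoc>): void {
    this.past.push(this.current);
    this.current = { ...this.current, ...patch };
    this.future = [];
  }

  undo(): void {
    const prev = this.past.pop();
    if (prev === undefined) return;
    this.future.push(this.current);
    this.current = prev;
  }

  redo(): void {
    const next = this.future.pop();
    if (next === undefined) return;
    this.past.push(this.current);
    this.current = next;
  }
}

// Both views derive their output from the same doc, so they cannot disagree.
function renderCanvasLabel(doc: AnnotationDoc): string {
  return `canvas tool=${doc.activeTool} selection=${doc.selectionId ?? "none"}`;
}

function renderPanelLabel(doc: AnnotationDoc): string {
  return `panel selection=${doc.selectionId ?? "none"}`;
}
```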
Focus scheduling was the hardest part: when do you move focus to the canvas vs the side panel vs a label input? We made it deterministic by encoding it as a schedule, not an immediate imperative call:
txt
[tips] focusSchedule intent=labelInput:focus reason=tool:rename selection=ann_42
[tips] focusSchedule intent=canvas:focus reason=tool:draw
We also rendered data-tool, data-selection, data-anchor, data-key-bridge-scope on the shell.
It's not pretty, but it makes screenshots useful for diagnosing weird selection bugs.
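One way to keep those shell attributes in sync is to derive them from state in a single function. This is a sketch under assumptions: the `ShellEvidence` shape, the `none` placeholder, and the `x,y` anchor format are made up for the example.

```typescript
// Hypothetical sketch: derive the evidence attributes from state in one place.
interface ShellEvidence {
  tool: string;
  selectionId: string | null;
  anchor: { x: number; y: number } | null;
  keyBridgeScope: string;
}

function evidenceAttributes(e: ShellEvidence): Record<string, string> {
  return {
    "data-tool": e.tool,
    "data-selection": e.selectionId ?? "none",                    // assumed placeholder
    "data-anchor": e.anchor ? `${e.anchor.x},${e.anchor.y}` : "none", // assumed format
    "data-key-bridge-scope": e.keyBridgeScope,
  };
}
```

In a browser, the result could be applied to the shell element by looping over `Object.entries(...)` and calling `setAttribute` for each pair.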
Counterpoint: logging key bridges is great until it becomes a performance problem (lots of key events).
We logged only handled keys + a small set of navigation keys. Everything else stays silent.
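The filter itself can be tiny. A sketch, assuming a hypothetical `NAV_KEYS` set (the thread doesn't say which navigation keys were included):

```typescript
// Hypothetical sketch: the exact navigation-key set is an assumption.
const NAV_KEYS = new Set<string>([
  "ArrowUp", "ArrowDown", "ArrowLeft", "ArrowRight", "Tab", "Escape",
]);

// Log handled keys always; log unhandled keys only if they are navigation keys.
function shouldLogKey(key: string, handled: boolean): boolean {
  return handled || NAV_KEYS.has(key);
}
```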
If the key bridge is global, it will eventually conflict with text inputs. We scoped the key bridge by focus target (canvas vs input) and logged the scope decision:
txt
[tips] keyBridge scope=input handled=false key=Backspace reason=typing
[tips] keyBridge scope=canvas handled=true key=Backspace action=selection:delete reason=user:key
Docs tie-in: the focus scheduling idea maps to the accessibility guidance pretty well.
Related: Accessibility (Best Practices) and Tools and Ecosystem.
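For the canvas-versus-input scoping discussed above, a minimal sketch that reproduces the two Backspace log lines. Deciding scope from the focused element's tag name is an assumption (it does not cover contenteditable, for example), and `resolveScope`, `decideKey`, `CANVAS_ACTIONS`, and the `unmapped` reason are hypothetical names.

```typescript
// Hypothetical sketch: scope is decided from the focused element, then the key is routed.
type Scope = "canvas" | "input";

// Assumption: text-entry elements are identified by tag name only.
function resolveScope(focusedTagName: string | null): Scope {
  const tag = (focusedTagName ?? "").toUpperCase();
  return tag === "INPUT" || tag === "TEXTAREA" ? "input" : "canvas";
}

const CANVAS_ACTIONS = new Map<string, string>([
  ["Backspace", "selection:delete"],
  ["Escape", "selection:clear"],
]);

// Returns the log line describing the scope decision for this key.
function decideKey(scope: Scope, key: string): string {
  if (scope === "input") {
    // While typing, the bridge never handles the key.
    return `[tips] keyBridge scope=input handled=false key=${key} reason=typing`;
  }
  const action = CANVAS_ACTIONS.get(key);
  if (action === undefined) {
    return `[tips] keyBridge scope=canvas handled=false key=${key} reason=unmapped`;
  }
  return `[tips] keyBridge scope=canvas handled=true key=${key} action=${action} reason=user:key`;
}
```

In a browser, the tag name would typically come from `document.activeElement?.tagName`.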
Long-form take: key bridging is less about shortcuts and more about *input normalization*.
Once you normalize inputs into a common intent (draw, select, nudge, delete), the app stops caring whether it came from pointer, keyboard, or a toolbar.
The key is keeping intent strings stable and observable so new shortcuts don't change semantics by accident.
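To make the normalization idea concrete, here is a small sketch. The `RawInput` shapes and the specific key and button mappings are assumptions for illustration; only the four intent names (draw, select, nudge, delete) come from the comment above.

```typescript
// Hypothetical sketch: every input channel is reduced to the same Intent type.
type IntentKind = "draw" | "select" | "nudge" | "delete";
type Source = "pointer" | "keyboard" | "toolbar";

interface Intent {
  kind: IntentKind;
  source: Source; // kept for logging only; handlers should not branch on it
}

type RawInput =
  | { source: "keyboard"; key: string }
  | { source: "toolbar"; buttonId: string }
  | { source: "pointer"; gesture: "drag" | "click" };

// Assumed mappings; a real app would define its own.
const KEY_INTENTS = new Map<string, IntentKind>([
  ["d", "draw"], ["v", "select"], ["ArrowLeft", "nudge"], ["Delete", "delete"],
]);
const TOOLBAR_INTENTS = new Map<string, IntentKind>([
  ["draw", "draw"], ["select", "select"], ["delete", "delete"],
]);

function normalize(input: RawInput): Intent | null {
  switch (input.source) {
    case "keyboard": {
      const kind = KEY_INTENTS.get(input.key);
      return kind ? { kind, source: input.source } : null;
    }
    case "toolbar": {
      const kind = TOOLBAR_INTENTS.get(input.buttonId);
      return kind ? { kind, source: input.source } : null;
    }
    case "pointer":
      return { kind: input.gesture === "drag" ? "draw" : "select", source: input.source };
  }
}

// The stable, observable string: it depends on the intent, not on the channel.
function intentLabel(intent: Intent): string {
  return `intent=${intent.kind}`;
}
```

Adding a new shortcut then means adding a row to a table, which leaves the intent strings untouched.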
We also stored a derived "selection summary" (type + count) so the side panel and canvas always agreed.
Without it, you end up with the canvas showing one thing and the panel showing another, which is trust-destroying.
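A sketch of such a summary, computed once and read by both views. The annotation types and the `none` / `mixed` labels are assumptions; the comment above only specifies "type + count".

```typescript
// Hypothetical sketch: one derived summary that both the canvas and the panel read.
interface Annotation {
  id: string;
  type: string; // e.g. "box", "arrow" (assumed type names)
}

interface SelectionSummary {
  type: string; // an annotation type, or the assumed labels "none" / "mixed"
  count: number;
}

function summarizeSelection(annotations: Annotation[], selectedIds: string[]): SelectionSummary {
  const selected = annotations.filter((a) => selectedIds.includes(a.id));
  let type: string | null = null;
  for (const a of selected) {
    if (type === null) type = a.type;           // first selected item sets the type
    else if (type !== a.type) type = "mixed";   // any disagreement collapses to "mixed"
  }
  return { type: type ?? "none", count: selected.length };
}
```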
We used a key bridge integration layer that can be turned off per route.
That prevented shortcut leakage into other tools and made the bridge feel deliberate.
Counterpoint: route-local docs can fragment if you have multi-photo workflows (compare, bulk annotate).
We introduced a vault that holds cross-route selection and made the route docs reference it by id.
That split makes sense: route doc for local posture, vault for cross-route identity.
The important thing is that both are explicit and render evidence so you can see the split.
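A minimal sketch of the route-doc / vault split, assuming hypothetical names (`SelectionVault`, `RouteDoc`, `selectionRef`) and an invented evidence string format. Route docs hold only a reference; the vault owns the cross-route selection.

```typescript
// Hypothetical sketch: route docs reference a vault entry by id instead of copying it.
interface VaultSelection {
  id: string;
  annotationIds: string[];
}

interface RouteDoc {
  photoId: string;
  activeTool: string;          // local posture
  selectionRef: string | null; // cross-route identity, resolved through the vault
}

class SelectionVault {
  private entries = new Map<string, VaultSelection>();

  put(sel: VaultSelection): void {
    this.entries.set(sel.id, sel);
  }

  get(id: string): VaultSelection | undefined {
    return this.entries.get(id);
  }
}

function resolveSelection(doc: RouteDoc, vault: SelectionVault): string[] {
  if (doc.selectionRef === null) return [];
  return vault.get(doc.selectionRef)?.annotationIds ?? [];
}

// Evidence string that makes the split visible (format is an assumption).
function renderSplitEvidence(doc: RouteDoc, vault: SelectionVault): string {
  const ids = resolveSelection(doc, vault);
  return `route photo=${doc.photoId} tool=${doc.activeTool} selectionRef=${doc.selectionRef ?? "none"} resolved=${ids.length}`;
}
```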
We found it helpful to log tool transitions too, because a lot of "bug" reports were actually "wrong tool" reports:
txt
[tips] tool=select -> tool=draw reason=toolbar:click
[tips] tool=draw -> tool=select reason=keyBridge:Escape
If you delay keyboard support, at least make sure your pointer interactions are consistent and discoverable.
The app should feel like it has a coherent model before you add more input channels.
Long-form counterpoint: focus scheduling can be used as a band-aid for a UI that's constantly stealing focus.
If you find yourself scheduling focus on every render tick, that's a smell: you're not respecting user control.
We ended up with a stricter contract: focus can move only on explicit intent boundaries (tool change, selection change, route change).
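That contract can be enforced in the scheduler rather than by convention. A sketch under assumptions: `FocusScheduler`, the `boundary` field, and the `rejected` log wording are hypothetical; the three boundary kinds mirror the ones listed above.

```typescript
// Hypothetical sketch: focus requests are data, and only boundary-tagged ones are accepted.
type FocusTarget = "canvas" | "sidePanel" | "labelInput";
type Boundary = "tool:change" | "selection:change" | "route:change";

interface FocusRequest {
  intent: FocusTarget;
  reason: string;
  boundary: Boundary | null; // null means "not on an explicit intent boundary"
}

class FocusScheduler {
  private pending: FocusRequest | null = null;
  readonly log: string[] = [];

  // Returns false (and records why) when the request is not tied to a boundary.
  schedule(req: FocusRequest): boolean {
    if (req.boundary === null) {
      this.log.push(`[tips] focusSchedule rejected intent=${req.intent}:focus reason=${req.reason}`);
      return false;
    }
    this.pending = req; // the latest accepted request wins
    this.log.push(`[tips] focusSchedule intent=${req.intent}:focus reason=${req.reason}`);
    return true;
  }

  // Called by the shell at a well-defined point; returns the target to focus, if any.
  flush(): FocusTarget | null {
    const req = this.pending;
    this.pending = null;
    return req ? req.intent : null;
  }
}
```

In a browser, the shell would map the flushed target to an element and call `focus()` on it, for example once per animation frame. A render tick that tries to schedule focus without a boundary shows up in the log as a rejection instead of silently stealing focus.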
We also added a debug-only overlay that prints the route doc state in plain text.
It made it obvious when the canvas and the doc were drifting, which was the root cause of several bugs.
The nice thing about key bridging is that it becomes testable.
Tests can assert on a log line / evidence key instead of trying to simulate pixel-perfect pointer movement.
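To assert on fields rather than on exact strings, a test helper can parse the log lines first. This sketch assumes the `[tips] <channel> key=value ...` shape seen in the keyBridge and focusSchedule lines in this thread; `parseTipsLine` is a hypothetical name, and the tool-transition lines (`tool=a -> tool=b`) use a different shape and would need their own parser.

```typescript
// Hypothetical sketch: parse "[tips] <channel> key=value ..." into fields for assertions.
interface TipsLine {
  channel: string;
  fields: Map<string, string>;
}

function parseTipsLine(line: string): TipsLine | null {
  const m = /^\[tips\] (\S+)\s*(.*)$/.exec(line);
  if (m === null) return null;
  const [, channel = "", rest = ""] = m;
  const fields = new Map<string, string>();
  for (const pair of rest.split(/\s+/)) {
    const eq = pair.indexOf("=");
    // Skip tokens without a key (e.g. empty strings or a bare "->").
    if (eq > 0) fields.set(pair.slice(0, eq), pair.slice(eq + 1));
  }
  return { channel, fields };
}
```

A test can then check `fields.get("handled")` or `fields.get("action")` without caring about field order.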
If you're implementing this, start by deciding what evidence a screenshot must contain (tool, selection, anchor, focus target).
Once that exists, the rest of the architecture decisions get a lot easier to validate.