Kino AI | Concept, Fall 2026
AI-First Metadata & Multicam Browsing Modes for Filmmakers
The Brief
What's the experience of clicking a video and seeing that there are shots of the same scene from different angles?
Videos in Kino should have “exotic” metadata fields: ones that enable richer, more interactive UI than simply displaying a string. Specifically, what does a Multicam View look like, and how do we access the full-page version of this view?
Keeping these constraints in mind:
1. There would be at most 10 shots corresponding to one scene.
2. Some version of this view must exist in the Kino Inspector (RHS panel) as well as on its own page.
3. A file’s name and its path can be very long.
Context
Today, filmmakers don’t start editing until Kino helps them make sense of their footage
Kino AI, backed by Y Combinator and AI Grant, is an AI video assistant that helps editors search footage precisely and log clips with natural-language descriptions. The platform already excels at precise search across transcripts, descriptions, and people detection, but faces a critical challenge when dealing with multicam shoots. This one-week exploratory design was my attempt at tackling it.
The status quo - Kino's current metadata viewer in the inspector on the RHS, as well as the job row itself.
Solutions
A novel way to navigate through camera footage
Nodes: Relationship-focused navigation
Understand how footage relates: story structure, alternate angles, B-roll connections, reaction shots (a rough data sketch follows these three modes).
Grid: Comparison-focused navigation
Direct visual comparison and simultaneous playback.
Spatial angle map view
Visualizes the physical setup and spatial relationships of the cameras.
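One way to make the Nodes mode concrete is to treat it as a small graph over clips. The sketch below is an assumption about how those relationships could be modeled; the `RelationKind` values, `confidence` field, and type names are illustrative, not Kino's actual data model.

```typescript
// Hypothetical relationship model behind the Nodes view.
// Edge kinds mirror the relationships named above; names are illustrative, not Kino's API.

type RelationKind = "alternate-angle" | "b-roll" | "reaction-shot" | "story-sequence";

interface ClipNode {
  clipId: string;
  label: string; // short display name; the full file name/path can be very long
}

interface ClipEdge {
  from: string;        // clipId of the source clip
  to: string;          // clipId of the related clip
  kind: RelationKind;
  confidence: number;  // 0–1, how sure the AI is about this relationship
}

// The Nodes view renders this graph; the Grid view could flatten
// "alternate-angle" edges into side-by-side comparisons.
interface FootageGraph {
  nodes: ClipNode[];
  edges: ClipEdge[];
}
```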
Explorations
Researching interaction patterns: what works, what doesn’t
I studied how other video/photo editing products handle metadata - collecting standout interaction patterns, common pitfalls to avoid, and “delightful but useful” ideas worth borrowing. I broke down why each pattern works, where it breaks at scale, and how to adapt the best ideas to constraints and user goals.
Creating a system that works
Editors need to think about multicam footage in distinctly different ways depending on their current task.
When building stories…
They think in relationships: "What B-roll supports this interview?" "Where's the reaction shot?"
When selecting the best take…
They think in comparisons: "Which angle has better framing?" "Is camera 2 or 3 better here?"
With this understanding, I used ChatPRD and GPT to rapidly prototype my ideas across all touchpoints of the interface.
Where I landed
AI as an editing companion
The RHS panel turns AI into a practical editing assistant, auto-scoring each angle (overall, audio, stabilization) so you can triage footage fast and pick the strongest shots before editing.
Search by semantics: AI labels scene details and subjects across angles, so you can jump to the right moment, verify context, and build selects without scrubbing.
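To make the shape of this metadata concrete, here is a minimal sketch of how a multicam scene and its per-angle scores and labels could be represented. The names (`SceneGroup`, `AngleMetadata`) and the 0–100 score range are assumptions for illustration, not Kino's real schema.

```typescript
// Hypothetical data model for AI-generated multicam metadata.
// Names and fields are illustrative assumptions, not Kino's real schema.

interface AngleScores {
  overall: number;        // 0–100, composite quality rating
  audio: number;          // 0–100, clarity / noise level
  stabilization: number;  // 0–100, how steady the shot is
}

interface AngleMetadata {
  clipId: string;
  fileName: string;       // can be very long, so the UI truncates with tooltips
  filePath: string;       // same caveat as fileName
  cameraLabel: string;    // e.g. "Cam 2, 50mm"
  scores: AngleScores;
  subjects: string[];     // people/objects detected in frame
  sceneDetails: string[]; // semantic labels, e.g. "reaction shot", "wide master"
}

interface SceneGroup {
  sceneId: string;
  description: string;    // natural-language summary used for semantic search
  angles: AngleMetadata[]; // constraint: at most 10 angles per scene
}
```

A record like this would be enough to drive both the RHS inspector (score badges, truncated names with tooltips) and the full-page Grid and Nodes views.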
Understanding and comparing media
The grid view mirrors how professional multicam monitors work in live production, enhanced with AI metadata overlays that would be impossible in hardware.
Building narratives
The node view makes the invisible visible: instead of editors mentally tracking relationships across dozens of clips, the system visualizes them.
What I would want to explore next
✦ Automated Angle Selection: Given the rich AI metadata (composition, emotion, audio quality), could the system suggest the "best angle" for each moment based on editorial intent (a rough sketch follows this list)? For example: "Select the most emotionally expressive angles for this scene."
✦ Cross-Project Learning: If the system learns which metadata correlates with user selection patterns (e.g., "this editor always picks the widest stable shot"), could it surface those preferences proactively in new projects?
✦ Collaborative Validation: In team environments, how might multiple editors validate metadata together? Could high-confidence validations from one trusted user propagate to similar clips?
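As a thought experiment on the first idea, angle suggestion could start as a simple re-weighting of the per-angle scores by editorial intent, building on the hypothetical `SceneGroup` and `AngleMetadata` types sketched earlier. The intents and weights below are made-up defaults; an "emotionally expressive" intent would additionally need a per-angle emotion score, which the sketch omits.

```typescript
// Hypothetical ranking for "Automated Angle Selection".
// Builds on the illustrative SceneGroup/AngleMetadata sketch above;
// intent names and weights are made-up defaults, not tuned values.

type EditorialIntent = "strongest-overall" | "most-stable" | "cleanest-audio";

const INTENT_WEIGHTS: Record<EditorialIntent, AngleScores> = {
  "strongest-overall": { overall: 0.6, audio: 0.2, stabilization: 0.2 },
  "most-stable":       { overall: 0.3, audio: 0.2, stabilization: 0.5 },
  "cleanest-audio":    { overall: 0.3, audio: 0.5, stabilization: 0.2 },
};

// Return the scene's angles sorted best-first for the given intent.
function suggestAngles(scene: SceneGroup, intent: EditorialIntent): AngleMetadata[] {
  const w = INTENT_WEIGHTS[intent];
  const score = (m: AngleMetadata) =>
    m.scores.overall * w.overall +
    m.scores.audio * w.audio +
    m.scores.stabilization * w.stabilization;
  return [...scene.angles].sort((a, b) => score(b) - score(a)); // descending
}
```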








