A mirror, not a dashboard. The dormant digital archive, made readable again.

Most modern phones carry millions of pictures and screenshots their owners will never reopen. Wisdom Portal is a proof of concept for what happens if you stop letting that archive go silent — if you read the trail of what someone has been paying attention to for a decade as the autobiography it accidentally already is. Built around one user's twelve-year archive. Designed for the pattern that lives on every phone.

It is not a dashboard. It is a mirror. The system surfaces patterns; it does not pronounce on the person.
Status: POC complete. Paused at the curation phase, pre-dedup. The pattern is what we're publishing; the corpus is private to the family it belongs to.
Editorial illustration: a loose pile of phone-screen-shaped rectangles on the left transitions across the composition into a tidy bound book on the right, page open to a chapter heading — the dormant archive becoming readable.
From a pile of phone screenshots to a single bound chapter.

The shoebox, but digital, but bigger

Pull up the camera roll on any phone owned for more than a couple of years. Scroll. Scroll some more. The archive accumulates organically — recipes, fashion notes, career tips, spiritual quotes, parenting advice, fitness routines, articles saved for later, family events, a child growing up. Nobody curates as they go. Nobody has time. The collection grows so large that scrolling stops being a useful action against it.

The wisdom inside that archive is right there. It is just not readable any more. The volume defeats the reader. The shoebox of paper photos a previous generation kept in a closet had a hundred items in it; the modern equivalent has tens of thousands.

This isn't pain in the commercial sense. But there is a quieter loss: the silent erasure that happens when someone collects without curating, and what they collected becomes inaccessible to themselves. The system's job is to make the volume readable again.

The browser-based portal showing the curated archive as a readable surface, with the optional family ebook output to the side.
The portal — a mirror, not a dashboard.

Four ways to search the same archive

The pipeline: a pile of photos and screenshots flows through dedup, OCR, face clustering, and classification pipelines into a single curation surface; the surface drives both the portal and the optional ebook output.
From the pile to the portal — one curation surface.

Photos and screenshots are both inputs to the same pipeline. It ingests both: collapses near-duplicates, extracts time and location, runs face detection so the same person is recognised across years and ages, OCRs the text out of every screenshot, classifies each entry, and stores everything in a single curation surface.
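The per-item flow can be sketched as a single pass with pluggable stages. Everything here is illustrative, not the project's actual schema: the `Entry` fields, the stand-in classifier, and the stub `ocr` and `detect_faces` callables are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical record shape for one ingested item; field names are
# illustrative, not the project's real schema.
@dataclass
class Entry:
    path: str
    kind: str = "unknown"        # "photo" or "screenshot"
    text: str = ""               # OCR output, if any
    faces: list = field(default_factory=list)
    dimension: str = "unclassified"

def classify(entry: Entry) -> Entry:
    # Stand-in classifier: items with extracted text are treated as
    # screenshots; everything else is a photo. The real classifier
    # is a model, not a rule.
    entry.kind = "screenshot" if entry.text else "photo"
    return entry

def ingest(paths, ocr, detect_faces):
    # One pass per item: OCR, face detection, then classification.
    # Dedup would run before this loop in the real pipeline.
    entries = []
    for p in paths:
        e = Entry(path=p, text=ocr(p), faces=detect_faces(p))
        entries.append(classify(e))
    return entries
```

The stages are passed in as callables so the same loop runs whether the OCR and face models are local or mocked out in tests.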

From that surface, the user can search four different ways — each answering a different kind of question.

Keyword search.

Find every entry containing this exact phrase. The fastest path; best for re-finding something you remember saving.

Semantic search.

Find entries that mean this, even if the words don't match. Vector-embedding-based; finds the recipe-with-no-name when you search "that thing with paneer and spinach."

Visual search.

Find images that look like a query image — same place, same scene, same mood. The path most useful for photos rather than screenshots.

Smart-expand.

Query expansion that broadens a search across related terms the user didn't think to type. Bridges the user's vocabulary today and the user's vocabulary from a decade ago.
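The shape of the semantic path can be shown in a few lines. This is a toy sketch: the hashed character-trigram `embed` below is a stand-in for the real sentence-embedding model, chosen only so the example is self-contained, and the function names are assumptions, not the project's API.

```python
import math

def embed(text: str, dim: int = 512) -> list[float]:
    # Stand-in for a real embedding model: hashed character trigrams.
    # Enough to show the shape of the search, not its quality.
    v = [0.0] * dim
    t = text.lower()
    for i in range(len(t) - 2):
        v[hash(t[i:i + 3]) % dim] += 1.0
    return v

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Rank stored entries by embedding similarity to the query,
    # so "paneer and spinach" can surface a recipe saved under
    # a completely different name.
    qv = embed(query)
    return sorted(corpus, key=lambda doc: cosine(qv, embed(doc)), reverse=True)[:k]
```

In the real system the corpus vectors would be precomputed and stored alongside each entry rather than re-embedded per query.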

And one more thing the closed archive cannot do: outward search. From any saved entry, the user can pivot to fresh content on the same topic on Reddit, Google, Twitter, or YouTube. A saved-and-forgotten note about morning routines becomes a starting point, not a dead end. The archive is no longer a closed set.
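The outward pivot is mechanically simple: a saved entry's topic becomes a query against each platform's public search URL. The URL templates below are the sites' public search endpoints as of writing and may change; treat them as illustrative.

```python
from urllib.parse import quote_plus

# Public search endpoints for the outward pivot; illustrative, may change.
OUTWARD = {
    "google":  "https://www.google.com/search?q={q}",
    "reddit":  "https://www.reddit.com/search/?q={q}",
    "twitter": "https://twitter.com/search?q={q}",
    "youtube": "https://www.youtube.com/results?search_query={q}",
}

def outward_links(entry_topic: str) -> dict[str, str]:
    # Turn a saved entry's topic into fresh-search links on each platform.
    q = quote_plus(entry_topic)
    return {site: tmpl.format(q=q) for site, tmpl in OUTWARD.items()}
```

So the saved-and-forgotten note about morning routines yields four live starting points with one call.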

Ten dimensions of a life, in a Vedantic frame

The wisdom corpus is mapped across ten life dimensions drawn from Indian philosophical tradition, expanding the Vedantic Panchakosha's five sheaths into reader-facing chapters: Body, Family, Livelihood, Insight, Time, Service, Home, Inner Peace, Play, and Self-Study. The framework is what keeps the system from being a fancy folder organiser.

It is also what the project will not violate. There is no scoring of the user. No "you are 64% spiritual." No prediction of what the user will save next. No gamification — no streaks, no badges, no completion percentage. No comparison to anyone else. A recipe and a spiritual realisation are not aggregated to the same metric. The framework's anti-patterns list is binding, not aspirational.

The hardest AI problem in personal-archive work isn't recognition. It's restraint. When the system knows the user saved fifty entries about coping with anxiety in a particular year, the right behaviour is to show a quiet density change in the relevant dimension and leave it there — not to render it as a diagnosis. Gaps are sacred. Whatever was happening, it is the user's to know or not know.
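The restraint principle has a concrete data-model consequence: the only aggregate the mirror computes is save density per dimension per year. A minimal sketch of that aggregate, assuming a hypothetical `(year, dimension)` pair per entry:

```python
from collections import Counter

DIMENSIONS = [
    "Body", "Family", "Livelihood", "Insight", "Time",
    "Service", "Home", "Inner Peace", "Play", "Self-Study",
]

def density_by_year(entries):
    # entries: iterable of (year, dimension) pairs. The output is the
    # only aggregate the mirror shows: how much was saved, where, when.
    # Deliberately absent: scores, labels, predictions, comparisons.
    counts = Counter(entries)
    return {(y, d): n for (y, d), n in counts.items() if d in DIMENSIONS}
```

A spike in one dimension in one year renders as a denser region of the chapter, nothing more.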

Privacy by architecture

This is the part that decided the build. A decade of one person's photos and saved fragments is the most intimate kind of data a system can touch. Sending it through a third-party model vendor is the wrong answer on its face — not because anyone is going to look, but because the architecture itself shouldn't let them.

The whole pipeline runs on the studio's local GPU rig. OCR, classification, semantic embeddings, face clustering, visual search — every model that sees the corpus is open-weight and lives on hardware we own. The questions the user asks of the archive never leave the household.

The local-first decision was forced, in fact, by an early experiment that ran the screenshot-classification step on a cloud frontier-model API and triggered a multi-day token-quota lockout. That incident codified a hard rule for the studio — no frontier API for batch processing on personal-scale corpora, ever — and pushed the project onto the rig.

The output: a family ebook

The archive doesn't stop at search. The end artefact the system is designed to produce is a printable book — the user's life journey, distilled into chapters that emerged from the archive itself, illustrated with photos pulled from the same corpus, narrated optionally as audio. The kind of object that sits on a shelf and gets handed to children.

This isn't for everyone. It's for the people who have been leaving digital breadcrumbs of their own growth without realising it, and who would like to read those breadcrumbs as a single thread. The pattern generalises. Anyone with a phone full of unsorted memories and the desire to see them as something other than a pile is the candidate audience.

Where the project is today

A face-clustering step: the same person at different ages produces distinct embedding clusters that need to be linked carefully without overreaching.
The hard part: linking the same person across decades, without overreaching.

The proof of concept is complete. The portal runs, the four search modalities work, the curation surface is usable, the wisdom corpus has been triaged and classified. What's open is a deduplication pass — the photo recognition step produces some duplicate face clusters across decades of growth (the same person at five and at forty does not embed to the same vector, and off-the-shelf face-recognition libraries are not tuned for the problem). Resuming the project means picking up the dedup work where it stopped.
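The linking step's bias toward abstention can be sketched with centroid similarity and a deliberately high threshold. The threshold value and function names are assumptions for illustration; the real embeddings come from a face-recognition model, not the toy vectors shown here.

```python
import math

def centroid(vectors: list[list[float]]) -> list[float]:
    # Mean vector of a face cluster's embeddings.
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def maybe_link(cluster_a, cluster_b, threshold: float = 0.9) -> bool:
    # Conservative merge: link two face clusters only when their
    # centroids are very similar. Below the threshold, abstain —
    # a wrong merge is worse than a duplicate cluster.
    return cosine(centroid(cluster_a), centroid(cluster_b)) >= threshold
```

The asymmetry is the point: a missed link leaves two clusters for the same person, which a human can merge later; a false link quietly corrupts the narrative the book is built from.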

The first user is one specific family member. She is not named here, and she will not be named anywhere this case study leads. The pattern is what we are publishing — the architecture, the search modalities, the framework, the privacy contract. The corpus stays where it belongs.


For the broader thinking on why personal-data AI belongs on hardware you control, the local-first piece is the one to read. Don't trust the cloud.