All work · 8 initiatives

Eight systems we've built or are building.

Some are flagship engagements. Some are personal experiments that became something more. All run on infrastructure we (or our clients) own — no per-token bills, no vendor lock-in.

8 initiatives
2 in production
5 in build
On-prem inference fleet
Tax-law corpus indexed

CBIC RAG — Indian Tax Law retrieval

A quote-grounded retrieval system over the Central Board of Indirect Taxes & Customs corpus: the actual statutes, notifications, circulars, and instructions that govern Indian indirect tax. The output isn't a search-result list — it's a paragraph that answers the question, with every claim citing the source paragraph and page.

This is our flagship engagement and the most demanding system we've built. Tax law has structure (sections, sub-sections, provisos) that naive RAG breaks completely. We built a two-pass chunker where a structure-classifier LLM reads each document's layout first, then a deterministic splitter respects those boundaries.

  • Section-aware chunking. Order-of-magnitude lift in retrieval recall over the naive baseline across multiple validation sets.
  • On-prem embedding fan-out. Multi-card embedding pool with a reserved slot for hybrid reranking. No per-call API bills.
  • Quote-grounded answers. Every sentence in the answer cites the source chunk with byte offsets. No hallucinated citations.
  • Local-first deployment. Runs entirely on the client's hardware. No documents leave the network. No API bills.
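The two-pass idea can be sketched in miniature. This is an illustration only, with hypothetical names: pass 1 is normally the structure-classifier LLM, stubbed here with a regex that spots statutory headings; pass 2 is the deterministic splitter, which never cuts across a boundary pass 1 found.

```python
import re
from dataclasses import dataclass

@dataclass
class Chunk:
    section: str
    text: str

# Pass 1 stand-in: in the real system an LLM classifies the document's
# layout; here a regex finds headings like "Section 17(5)" or "Proviso".
HEADING = re.compile(r"^(Section \d+[A-Z]?(\(\d+\))?|Proviso|Explanation)\b")

def split_on_structure(document: str) -> list[Chunk]:
    """Pass 2: deterministic splitter that respects pass-1 boundaries."""
    chunks, current, label = [], [], "preamble"
    for line in document.splitlines():
        m = HEADING.match(line.strip())
        if m:
            if current:
                chunks.append(Chunk(label, "\n".join(current)))
            current, label = [], m.group(1)
        current.append(line)
    if current:
        chunks.append(Chunk(label, "\n".join(current)))
    return chunks
```

The point of the split is that the expensive, fuzzy judgment (what counts as a boundary) runs once per document, while the splitting itself stays reproducible.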
Read the full case study

GPU Rig Operations — many cards, one shell

The studio's on-prem inference fleet: a multi-card rig of commodity GPUs running a custom shell that handles model placement, fleet-wide power orchestration, deep-idle sleep states, persistent per-card defaults, and recovery from the inevitable driver hiccups.

It exists because cloud inference bills got silly fast. It runs because we built the orchestration layer ourselves — nothing off-the-shelf handles a heterogeneous local fleet with mining-style PCIe topology and the mix of small embedders + medium LLM that real RAG workloads need.

  • Custom shell for real-time fleet orchestration: power, model placement, health, status.
  • Power orchestration — deep-idle when nothing's running, configurable per card.
  • Multi-level model verification suite — we know when a card is degrading before workloads notice.
  • Born of necessity. Built so we'd stop renting inference we could do in our basement.
Read the field note

Pharma CI Job Agent

A daily job-search agent for a competitive intelligence professional in pharma. Scrapes career pages at scale, scores roles against a deterministic relevance model (no LLM drift), deduplicates with fuzzy + hash matching, and delivers a structured HTML report to email each morning.

Safety-first: never auto-applies, never stores credentials, never claims to be human. The agent finds and ranks; the human applies. Boring choice on purpose.

  • Deterministic scoring — same job + same model = same score, every run.
  • Fuzzy + hash deduplication across employers that re-post identical roles.
  • Web dashboard for reviewing, starring, and dismissing jobs before applying.
  • SMTP delivery so the morning report shows up in regular email, not in yet another app.
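The fuzzy + hash deduplication can be sketched as follows. This is a minimal illustration, not the production code: an exact hash over normalised title + company catches straight re-posts, and a `difflib` ratio catches re-posts with cosmetic title edits.

```python
import hashlib
from difflib import SequenceMatcher

def job_key(title: str, company: str) -> str:
    """Exact-duplicate key: normalised title + company, hashed."""
    norm = f"{title.strip().lower()}|{company.strip().lower()}"
    return hashlib.sha256(norm.encode()).hexdigest()

def is_fuzzy_dup(title_a: str, title_b: str, threshold: float = 0.9) -> bool:
    """Catch re-posts whose titles differ only cosmetically."""
    ratio = SequenceMatcher(None, title_a.lower(), title_b.lower()).ratio()
    return ratio >= threshold

def dedupe(jobs: list[dict]) -> list[dict]:
    seen_keys, kept = set(), []
    for job in jobs:
        key = job_key(job["title"], job["company"])
        if key in seen_keys:
            continue  # exact re-post
        if any(job["company"] == k["company"]
               and is_fuzzy_dup(job["title"], k["title"]) for k in kept):
            continue  # cosmetic re-post at the same employer
        seen_keys.add(key)
        kept.append(job)
    return kept
```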
Read the case study

EarnLearn

An earn-to-learn platform built for the student who isn't self-motivated yet: progress unlocks small, real rewards instead of leaderboards and badges. Six structured modules, parent-student account separation with progress tracking, and no internet required: it runs on the local network, reachable from any device on the same Wi-Fi.

The bigger pattern this exemplifies: family-scale software. Built for one user with real needs, not for a million hypothetical ones. We think this is going to matter more in the next five years than another SaaS for everyone.

  • Parent-student role separation with progress visibility.
  • Offline-capable — works on a school Chromebook with no internet.
  • Structured curriculum, not endless YouTube rabbit holes.
  • Built on Express + Vite for live-reload dev experience.
Read the case study

RescueViral Creator OS

An end-to-end production system for short-form dramatic rescue and wildlife videos targeting TikTok / YouTube Shorts. Script generation → voice narration → auto-editing → render → publish, with a local LLM running the writing and a local TTS doing the voicing.

Currently in concept: the spec is written, the stack is chosen, and the build is paused while CBIC RAG and other client work take priority. It will resume when a creator signs on as design partner.

  • End-to-end pipeline from idea to render with FFmpeg 8 doing the heavy lifting.
  • Local LLM inference via llama-server Vulkan — no per-video API cost.
  • Targets high-engagement documentary shorts rather than generic content.
Read the case study

Wisdom Portal — a family ebook from a lifetime of photos

A photo-AI pipeline that ingests a lifetime of family photos from a personal cloud library, deduplicates them, clusters by face and event, builds an interest timeline, and produces a navigable web portal — eventually a printed family ebook — that captures one person's story.

The user here is one specific person, and the output is for her family. The fact that it's powered by AI is invisible to anyone using it. That's the goal.

  • Face clustering + dedup across a lifetime of overlapping snapshots.
  • Interest timeline built from photo metadata, locations, and clusters.
  • Local web portal the family can browse, no cloud required.
  • Printed ebook output — the eventual artifact is something tangible, not just a website.
Read the case study

Auto Easy — a booking app every auto-body shop runs as their own

A per-shop services-and-booking app for auto-body shops. Every shop gets its own instance with its own customer base — the loyalty isn't a feature, it's the architecture. Customers onboard by scanning the shop's unique QR code, which binds them to that shop from the first tap.

One codebase, four roles: shop owner, car-owning client, mechanic on the floor, and an in-shop kiosk. Subscription-with-tiers SaaS sold per shop, with custom consulting for shops that want bespoke integrations. Currently in active build with a real auto-body shop as our pilot partner — the tool is being shaped by the people who'll actually use it, not by a roadmap meeting.

  • QR-bound customer onboarding — one scan, one shop, no shared marketplace.
  • Four roles, one codebase — owner, client, mechanic, kiosk all share the same backend.
  • Tiered subscription per shop + custom consulting for bespoke fits.
  • Live pilot with a working auto-body shop — not a demo, a deployment.
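One way the QR binding can work, sketched under stated assumptions: the payload format, `autoeasy://` scheme, and HMAC signing below are hypothetical illustrations, not Auto Easy's actual protocol. The idea is that the shop id baked into the code is signed, so a forged code can't bind customers to another shop.

```python
import base64
import hashlib
import hmac

SECRET = b"per-deployment secret"  # hypothetical; loaded from config in practice

def qr_payload(shop_id: str) -> str:
    """String encoded into a shop's QR code: shop id plus a signature."""
    sig = hmac.new(SECRET, shop_id.encode(), hashlib.sha256).hexdigest()[:16]
    return f"autoeasy://onboard?shop={shop_id}&sig={sig}"

def bind_customer(payload: str) -> str:
    """Server side: verify the signature, return the shop to bind to."""
    query = payload.split("?", 1)[1]
    params = dict(p.split("=", 1) for p in query.split("&"))
    expected = hmac.new(SECRET, params["shop"].encode(),
                        hashlib.sha256).hexdigest()[:16]
    if not hmac.compare_digest(params["sig"], expected):
        raise ValueError("invalid QR signature")
    return params["shop"]
```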
Read the case study · Join the pilot list

Agentic QA — regression testing for AI-driven applications

A regression-testing harness for AI-driven applications. Tests are written as agent skills, not imperative scripts. Evidence is verified on disk. The LLM never marks its own work — a deterministic verifier emits the verdict, and an integrity loop demotes any "passed" run whose evidence has gone missing.

First customer is the studio's own Auto Easy. Productised separately as its own SaaS for any team building AI-driven apps that need serious regression coverage they can actually trust.

  • Trust = Determinism = Scripts. AI is a tool the script picks up, never the orchestrator on top.
  • Disk-verified evidence. Every screenshot is checked on disk; an integrity loop demotes any run whose evidence has been deleted.
  • Dual-executor anti-hallucination. Same skill run by two different LLMs — disagreements are flagged, not averaged.
  • Auto Easy is customer #1. Designed to apply to any AI-driven application that needs auditable test results.
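The integrity loop's core move can be sketched like this: a minimal illustration with hypothetical record and field names, not the harness's real schema. A run that claims "passed" only keeps that verdict if every piece of evidence it cites still exists on disk with the recorded hash.

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def integrity_sweep(run_record: dict, evidence_dir: Path) -> dict:
    """Re-verify every evidence file a 'passed' run claims.

    Missing or altered evidence demotes the verdict; the LLM's own
    'pass' is never trusted directly."""
    if run_record["verdict"] != "passed":
        return run_record
    for item in run_record["evidence"]:
        f = evidence_dir / item["file"]
        if not f.exists() or sha256_file(f) != item["sha256"]:
            run_record["verdict"] = "demoted"
            run_record["reason"] = f"evidence failed re-verification: {item['file']}"
            break
    return run_record
```

Because the verdict is a pure function of what's on disk, re-running the sweep at any later time gives an auditable answer, not a remembered one.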
Read the case study

Have a system in mind we should be building together?

Book a 30-min call