Selected Works: Production systems and open-source work.
Stack: Python, Django, React, TypeScript, PostgreSQL, RAG, Qdrant, vLLM, Ollama, Docker, Kubernetes, Terraform
Blue Omics — Full-stack research data platform
A Django, React, and PostgreSQL platform that grew from zero to 5M+ live records and became the primary system for an entire research lab.
Problem: A research lab ran its data on a sprawl of spreadsheets and manual workflows. Submitting, searching, and cross-referencing records was slow, error-prone, and impossible to scale across 30+ researchers and 5 labs.
Approach:
- Designed and built Blue Omics from scratch: a React frontend on a Django REST backend with PostgreSQL, structured across 32 data models and 58 API endpoints.
- Built 7 ingestion pipelines for heterogeneous formats (PDF, Excel, CSV, Word, PowerPoint), cutting manual data prep from hours to minutes.
- Tuned PostgreSQL with 35 explicit indexes and caching to hold low-millisecond latency under concurrent access by 30+ users.
- Deployed on GCP with Kubernetes and Terraform, using Docker multi-stage builds and CI/CD. Cut frontend load time from 8 s to 3 s.
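The multi-format ingestion step above can be pictured as a small dispatch layer that routes each uploaded file to a parser by extension. The sketch below is illustrative only and is not taken from the Blue Omics codebase: the `PARSERS` registry, the `register` decorator, and the `ingest` function are hypothetical names, and only a CSV parser is shown (PDF, Excel, Word, and PowerPoint parsers would be registered the same way, using whichever libraries the project chose).

```python
import csv
import io
from pathlib import Path
from typing import Callable, Dict, List

Record = Dict[str, str]
Parser = Callable[[bytes], List[Record]]

# Hypothetical registry mapping a lowercase file extension to its parser.
PARSERS: Dict[str, Parser] = {}


def register(*extensions: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser for one or more extensions."""
    def decorator(func: Parser) -> Parser:
        for ext in extensions:
            PARSERS[ext.lower()] = func
        return func
    return decorator


@register(".csv")
def parse_csv(payload: bytes) -> List[Record]:
    """Parse CSV bytes into one dict per row, keyed by the header row."""
    text = payload.decode("utf-8-sig")  # tolerate a leading byte-order mark
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def ingest(filename: str, payload: bytes) -> List[Record]:
    """Route a file to the parser registered for its extension."""
    ext = Path(filename).suffix.lower()
    try:
        parser = PARSERS[ext]
    except KeyError:
        raise ValueError(f"no parser registered for {ext!r}") from None
    return parser(payload)
```

Keeping every parser behind the same `bytes -> list of records` signature means the validation and database-write steps downstream do not need to know which format a record came from.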
Stack: Django REST, React, TypeScript, PostgreSQL, GCP + Kubernetes, Terraform, Docker
Impact: Live records: 0 → 5M+; Trait-lookup latency: manual spreadsheet search → ~25 ms; Frontend load time: 8 s → 3 s; Daily active users: baseline → +40%
Trade-offs: Chose a well-indexed PostgreSQL core over premature service-splitting to keep one clear backup and monitoring story. The platform fully replaced the manual workflows and became the lab's system of record, which led to my promotion from Software Engineer to Lead.
TurboQuant on Apple Silicon — CPU-only LLM quantization study
Independent evaluation of TurboQuant (arXiv 2504.19874) ported to run on Apple Silicon. Open source and reproducible.
Problem: TurboQuant is a near-optimal vector quantization method, applied here to compressing the LLM KV cache, but the reference path assumed dedicated GPU hardware. The open question: can it run, and hold long-context accuracy, on consumer Apple Silicon with no discrete GPU?
Approach:
- Worked from a CPU-only fork on an M1 Pro (16GB) and fixed five implementation bugs that were blocking correct inference.
- Ran a two-round study: an MLX path and a separate llama.cpp Metal path, each benchmarked on long-context needle-in-a-haystack retrieval.
- Published the full evaluation, the bug fixes, and reproducible results as an open-source repository, with writeups on LinkedIn and X.
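For readers unfamiliar with needle-in-a-haystack retrieval, the benchmark idea can be sketched in a few lines: hide one fact at a chosen depth inside filler text, ask the model for it, and count how often the answer comes back. This is a minimal sketch and not the harness from the repository. The `generate` callable stands in for whichever backend is under test (MLX or llama.cpp), the function names are hypothetical, and context length is approximated in words rather than tokens.

```python
from typing import Callable, Sequence


def build_prompt(filler: str, needle: str, n_words: int, depth: float) -> str:
    """Insert `needle` at fractional `depth` (0.0 to 1.0) of an n_words haystack."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    words = filler.split()
    if not words:
        raise ValueError("filler must contain at least one word")
    # Repeat the filler until it covers the requested length, then trim.
    repeats = n_words // len(words) + 1
    haystack = (words * repeats)[:n_words]
    haystack.insert(int(len(haystack) * depth), needle)
    return " ".join(haystack)


def retrieval_rate(
    generate: Callable[[str], str],  # hypothetical: prompt in, completion out
    filler: str,
    needle: str,
    answer: str,
    question: str,
    n_words: int,
    depths: Sequence[float],
) -> float:
    """Fraction of depths at which the completion contains `answer`."""
    if not depths:
        return 0.0
    hits = 0
    for depth in depths:
        prompt = build_prompt(filler, needle, n_words, depth) + "\n\n" + question
        if answer.lower() in generate(prompt).lower():
            hits += 1
    return hits / len(depths)
```

Sweeping `depths` across the whole context (for example 0.0, 0.25, 0.5, 0.75, 1.0) matters because a quantized cache can, in principle, degrade retrieval unevenly by position.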
Stack: MLX, llama.cpp (Metal), Apple Silicon (M1 Pro), Python
Impact: Needle retrieval @ 16K: 0% → 100%; KV cache memory: baseline → significantly reduced; Bugs fixed in fork: 5 blocking → 0
Trade-offs: A CPU-only target trades raw throughput for accessibility: the point was proving that strong quantization and long-context accuracy are reachable on hardware anyone has on their desk, not winning a latency benchmark. This reflects how I approach AI infrastructure: take a research-grade method, get it actually running on accessible hardware, measure it honestly, and share it.
https://github.com/devYRPauli/turboquant-m1pro-evaluation
ApplyScore — AI resume gap-analysis extension
A published Chrome extension that scores how well a resume matches any job posting on the web, with evidence-linked gaps and no fluff.
Problem: Most AI resume tools hallucinate skills and rewrite bullets with confident fluff that recruiters see through instantly. The honest question went unanswered: how well does this resume actually match this job?
Approach:
- Built a universal scraper that reads job postings across LinkedIn, Greenhouse, Ashby, Lever, Workday and more, piercing Shadow DOM to work on virtually any board.
- Runs a strict, evidence-based gap analysis: a confidence-weighted 0-100 fit score, requirement-by-requirement matches linked to the exact resume bullets that prove them, and a prioritized list of what is missing.
- Privacy-first by design: the resume is cached locally and the user brings their own API key (OpenAI, Anthropic, or Google), so data and model choice stay fully in their control.
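One way to picture an evidence-linked, confidence-weighted score is shown below. This is a sketch under assumptions and not ApplyScore's actual formula: the `Requirement` fields, the rule that a match with no linked resume bullet earns nothing, and the 0.5 gap threshold are all hypothetical choices made for illustration. It is written in Python for brevity, although the extension itself is JavaScript.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Requirement:
    text: str
    weight: float       # hypothetical importance of the requirement
    confidence: float   # match confidence in [0.0, 1.0]
    evidence: List[int] = field(default_factory=list)  # indexes of resume bullets


def fit_score(requirements: List[Requirement]) -> int:
    """Weighted 0-100 score; matches without linked evidence earn nothing."""
    total = sum(r.weight for r in requirements)
    if total <= 0:
        return 0
    earned = sum(
        r.weight * min(max(r.confidence, 0.0), 1.0)
        for r in requirements
        if r.evidence
    )
    return round(100 * earned / total)


def gaps(requirements: List[Requirement], threshold: float = 0.5) -> List[str]:
    """Requirements with no evidence or low confidence, most important first."""
    missing = [
        r for r in requirements
        if not r.evidence or r.confidence < threshold
    ]
    return [r.text for r in sorted(missing, key=lambda r: r.weight, reverse=True)]
```

Requiring evidence before any credit is given is what keeps a score like this from rewarding claims the resume cannot back up.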
Stack: JavaScript, Chrome Extension APIs, Shadow DOM scraping, LLM APIs (BYO-key)
Trade-offs: Deliberately a gap analyzer, not a rewriter. Suggesting only 1-2 targeted, non-hallucinated bullets keeps it honest; the BYO-key model trades one-click convenience for the user keeping full control of their data and cost.
https://chromewebstore.google.com/detail/applyscore/ibecekikdjelajpnjnmapejhahgcplim
Builder Tools — Free, client-side. Your data never leaves the browser.
Token Counter — Cost across frontier models, side-by-side (runs entirely in your browser, no signup)
Prompt Formatter — Restructure raw prompts into blocks (runs entirely in your browser, no signup)
JSON to Schema — Generate Pydantic / Zod / TypeScript (runs entirely in your browser, no signup)
Regex Playground — Test, explain, match in real-time (runs entirely in your browser, no signup)
cURL Converter — cURL to fetch / Python requests / httpx (runs entirely in your browser, no signup)
Contrast Checker — WCAG AA/AAA with live preview (runs entirely in your browser, no signup)