Harim Choi.
Production ML engineer with 6+ years of end-to-end work across computer vision, NLP, and predictive modeling. Self-taught, non-traditional path. Ship cycles: MCP server in 2 days, Monogram in 2 weeks, bidNLP and the LangGraph travel agent in 1 month each, R2CCP in 2 months, WSSS in 3 months (SOTA at release). Every project except research work has run in production for 1+ year.
Production ML + OSS + research.
wsss-refined-pseudolabels
56.2% mIoU · SOTA at release · 3 mo
Weakly-supervised semantic segmentation with refined pseudo-labels. Frozen CLIP (ViT-B/16) + DINOv2 backbone, Multi-Signal Reliability Estimation, Phase-Adaptive Refinement, Boundary-Aware loss. 56.2% mIoU on COCO-Val: +4.3 pp over the WeCLIP+ 80K baseline and +9.1 pp over WeCLIP (CVPR 2024), SOTA at release. Delivered in 3 months as externally contracted, independently secured research.
nlp-analysis-agent
F1 96.4% · 50 ms CPU · 1+ yr prod
Korean public-procurement notice classification (bidNLP). RoBERTa-large + LoRA teacher-student training with UltimateTrainer (FocalLoss + R-Drop + FGM). Hybrid weak labels: hard-rule routing plus an SBERT (0.9) / fine-tuned RoBERTa (0.1) max-similarity ensemble. Static INT8 ONNX (LoRA merge → AVX512-VNNI per-tensor quantization, 200-sample MinMax calibration): 1.3 GB → 330 MB, 150 ms → 50 ms, <1% F1 loss. FastAPI service, 1+ year in production, processing 70,000 notices/week: 40 hr manual → 2 min automated. F1 96.4% vs. GPT-4o 35.1% (2.75×).
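The hybrid weak-labeling step can be pictured roughly as follows. This is a minimal sketch, not the bidNLP code: it assumes the 0.9 / 0.1 figures are ensemble weights, that each encoder scores a label by its maximum cosine similarity to a small bank of labeled exemplars, and that hard rules short-circuit the ensemble. The names `weak_label`, `max_sim`, the rule, and the toy 2-d vectors are all hypothetical stand-ins for real SBERT / RoBERTa embeddings.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

Vector = Sequence[float]
Rule = Callable[[str], Optional[str]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def max_sim(query: Vector, exemplars: List[Vector]) -> float:
    """Best similarity between the query and any exemplar of one label."""
    return max(cosine(query, e) for e in exemplars)


def weak_label(
    text: str,
    rules: List[Rule],
    sbert_query: Vector,
    roberta_query: Vector,
    sbert_bank: Dict[str, List[Vector]],
    roberta_bank: Dict[str, List[Vector]],
    w_sbert: float = 0.9,    # assumed meaning of "(0.9)"
    w_roberta: float = 0.1,  # assumed meaning of "(0.1)"
) -> str:
    # 1) Hard-rule routing: the first rule that fires decides the label.
    for rule in rules:
        label = rule(text)
        if label is not None:
            return label
    # 2) Otherwise, weighted max-similarity ensemble over both encoders.
    scores = {
        label: w_sbert * max_sim(sbert_query, sbert_bank[label])
        + w_roberta * max_sim(roberta_query, roberta_bank[label])
        for label in sbert_bank
    }
    return max(scores, key=scores.get)


# Toy data: the two encoders disagree, and the heavier SBERT weight wins.
RULES: List[Rule] = [lambda t: "construction" if "road works" in t else None]
SBERT_BANK = {"construction": [[1.0, 0.0]], "service": [[0.0, 1.0]]}
ROBERTA_BANK = {"construction": [[1.0, 0.0]], "service": [[0.0, 1.0]]}
```

With these toy banks, a notice containing "road works" is routed by the rule alone, while any other notice falls through to the weighted similarity scores.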
monogram
PyPI · mono-gram
Drop into Telegram. Auto-save as wiki. Wake up to a project dashboard. 5-stage LLM pipeline, atomic Git Tree commits, MCP server (13 tools).
google-surf-mcp
209 stars · 27 forks
Vendor-agnostic Google search MCP server. Drop-in for Claude Desktop, Cursor, or any MCP-compatible client. SSRF-hardened, 11 test cases, npm-published.
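For readers unfamiliar with SSRF hardening, the core check usually looks something like the sketch below. It is illustrative only and is not taken from google-surf-mcp (which is an npm package); the function name `is_safe_url` and the exact policy (http/https only, every resolved address must be globally routable) are assumptions.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host maps to public IPs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    try:
        # Host is already an IP literal (IPv4 or IPv6).
        candidates = [ipaddress.ip_address(host)]
    except ValueError:
        # Host is a name: resolve it and check every address it maps to.
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False
        # Drop any IPv6 zone id ("fe80::1%eth0") before parsing.
        candidates = [
            ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos
        ]
    # Reject loopback, private, link-local, and other non-global ranges.
    return bool(candidates) and all(ip.is_global for ip in candidates)
```

A check like this only covers the first hop. A hardened fetcher would also re-validate redirects and connect to the address it validated, so that a DNS answer cannot change between the check and the request.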
ensemble-bid-prediction
+25-40% win rate
R2CCP for tender bid rate prediction. Identified interval collapse in the public implementation: cumulative-mass intervals merge the two peaks of a bimodal distribution into one wide interval. Fixed with a per-bin threshold plus entropy regularization, which preserves both modes. +25 to 40% bid win rate, 1+ year deployed.
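The interval-collapse issue is easiest to see on a toy histogram. The sketch below is an illustration under stated assumptions, not the deployed fix: the bin probabilities, bin edges, 90% coverage, and 0.10 threshold are invented, both function names are hypothetical, and the entropy regularization term (a training-time change) is not shown.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def cumulative_mass_interval(
    probs: Sequence[float], edges: Sequence[float], coverage: float = 0.9
) -> Interval:
    """Single central interval: cut equal tail mass from both ends."""
    tail = (1.0 - coverage) / 2.0
    cum = 0.0
    lo_bin, hi_bin = None, len(probs) - 1
    for i, p in enumerate(probs):
        cum += p
        if lo_bin is None and cum > tail:
            lo_bin = i  # first bin that crosses the lower tail
        if cum >= 1.0 - tail:
            hi_bin = i  # first bin that crosses the upper tail
            break
    return edges[lo_bin], edges[hi_bin + 1]


def per_bin_threshold_intervals(
    probs: Sequence[float], edges: Sequence[float], threshold: float
) -> List[Interval]:
    """Keep bins with enough mass; merge only adjacent kept bins."""
    intervals: List[Interval] = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
        elif start is not None:
            intervals.append((edges[start], edges[i]))
            start = None
    if start is not None:
        intervals.append((edges[start], edges[len(probs)]))
    return intervals


# Bimodal toy distribution over 8 bins with a low-probability valley.
PROBS = [0.02, 0.30, 0.15, 0.03, 0.03, 0.15, 0.30, 0.02]
EDGES = [float(i) for i in range(9)]

collapsed = cumulative_mass_interval(PROBS, EDGES, coverage=0.9)
separated = per_bin_threshold_intervals(PROBS, EDGES, threshold=0.10)
```

Here `collapsed` is the single interval `(1.0, 7.0)`, which spans the valley between the peaks, while `separated` is `[(1.0, 3.0), (5.0, 7.0)]`, one interval per mode. In a conformal setting the threshold would be calibrated on held-out data rather than fixed by hand.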
langgraph-travel-agent
7 modules · 4 APIs
Multi-agent travel booking with 4-API parallel orchestration and human-in-the-loop checkpointing. Refactored 1707-line monolith into 7 clean modules.
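The parallel orchestration pattern can be sketched with plain `asyncio`. This is an assumption-laden illustration rather than the project's code: the four provider names and the `make_stub` and `fan_out` helpers are hypothetical, the stubs stand in for real API clients, and the LangGraph checkpointing layer is not shown.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Provider = Callable[[str], Awaitable[Dict[str, Any]]]


def make_stub(name: str, fail: bool = False) -> Provider:
    """Build a fake API client; a real one would make an HTTP call."""
    async def call(query: str) -> Dict[str, Any]:
        await asyncio.sleep(0.01)  # simulated network latency
        if fail:
            raise RuntimeError(f"{name} unavailable")
        return {"source": name, "query": query}
    return call


async def fan_out(query: str, providers: Dict[str, Provider]) -> Dict[str, Any]:
    """Call every provider concurrently and keep per-provider results."""
    names = list(providers)
    results = await asyncio.gather(
        *(providers[n](query) for n in names),
        return_exceptions=True,  # one failing API does not cancel the rest
    )
    return dict(zip(names, results))


# Hypothetical provider set; the fourth one fails on purpose.
PROVIDERS: Dict[str, Provider] = {
    "flights": make_stub("flights"),
    "hotels": make_stub("hotels"),
    "activities": make_stub("activities"),
    "weather": make_stub("weather", fail=True),
}
```

Calling `asyncio.run(fan_out("ICN to NRT", PROVIDERS))` returns one entry per provider, with the failed call represented by its exception object so a downstream step (or a human reviewer) can decide how to proceed.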
claude-setup
dotfiles
Personal Claude Code configuration. Hooks, slash commands, statusline, MCP servers, project-level CLAUDE.md priors. Reproducible across machines.
Active across three tracks in parallel.
- production
  - bidNLP: F1 96.4%, 50 ms CPU, 1+ year in production. R2CCP custom impl: +25-40% win rate, 1+ year deployed. NGBoost × XGBoost bid price: win rate 3× baseline.
- research
  - DSSP: 12-branch decision-science taxonomy for LLM agents, 14 audits, arXiv preprint coming. E-AT: entropy-based adversarial calibration, LAPC loss family v1-v7.
- agent infra
  - monogram: 5-stage LLM pipeline + 13-tool MCP server, atomic Git Tree commits. google-surf-mcp: vendor-agnostic Google search MCP, 209 stars (141 in first 5 days), npm-published.