
Tech

VCN #47: Fast Local

When
Wednesday, August 5 · 7:00 PM – 10:00 PM
Listed by
Lu.ma — Frontier Tower
Heads up: this is a hands-on build night and bringing a laptop is mandatory.

A local model that's too slow is dead weight. Tonight we make it fast.

Back at #41 Bare Metal you ran a coding agent against a local open-weights model with zero API bill. Great, except it crawled. A coding agent that takes 40 seconds to think is a coding agent you stop using. VCN #47: Fast Local is the night we make that local rig actually quick enough to live on.

Format:

The walkthrough. Where the time goes. GGUF quantization tradeoffs, llama.cpp vs vLLM, tensor parallelism, speculative decoding, KV-cache tuning, and how to actually measure tokens/sec instead of guessing.

The benchmark bar. We run Claude Code powered by z.ai first and clock it. That hosted speed is the number your local endpoint is trying to close on. You can't tune what you don't measure.

Make it fast. Hands-on hour. Quantize a model, stand up a fast serving endpoint on Nebius Token Factory credits, tune the knobs, and watch your tokens/sec climb.

Demos. Point your coding agent at your own fast endpoint and feel the difference live.

By 10pm you leave with a fast local inference endpoint your coding agent can actually live on.

Builders only. Bring the local rig from Bare Metal, or just a model you want served fast.

Doors 7pm. Walkthrough 7:30. Frontier Tower Floor 10.

Hosted by Vibe Coding Nights: Rayyan Zahid (Immersive Commons), Michalis Vasileiadis (Hacker Bob), Eric Mockler (AI Geneticist), Devinder Sodhi (Learning Layer Labs). Facilitator: Rayyan Zahid. Guest speaker TBD (open call).

RSVP if your local model is private but painfully slow and you want it quick.

Frontier Tower members: your ticket is on us. Reach out to the team directly and we'll get you a free RSVP.
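Want a head start on "measure tokens/sec instead of guessing"? Here is a minimal Python sketch of one way to do it. It is not the script used on the night. It assumes you can iterate over streamed output from your endpoint and that each streamed item counts as roughly one token, which is only an approximation for most servers. The names `measure_stream` and `ThroughputReport` are made up for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class ThroughputReport:
    tokens: int                   # number of streamed items seen
    time_to_first_token_s: float  # request start -> first item
    total_s: float                # request start -> last item
    decode_tokens_per_s: float    # items after the first, per second


def measure_stream(
    token_stream: Iterable[object],
    clock: Callable[[], float] = time.perf_counter,
) -> ThroughputReport:
    """Time a stream of tokens (hypothetical helper, one item ~= one token)."""
    start = clock()
    first = last = None
    count = 0
    for _ in token_stream:
        now = clock()
        if first is None:
            first = now  # prompt processing ends roughly here
        last = now
        count += 1
    if count == 0:
        raise ValueError("stream produced no tokens")

    # Decode speed excludes the wait for the first token, so a long
    # prompt does not hide a fast (or slow) generation loop.
    decode_window = last - first
    decode_tps = (count - 1) / decode_window if decode_window > 0 else 0.0
    return ThroughputReport(
        tokens=count,
        time_to_first_token_s=first - start,
        total_s=last - start,
        decode_tokens_per_s=decode_tps,
    )
```

To use it, pass whatever iterator your client library gives you for a streaming response, run the same prompt against the hosted baseline and your local endpoint, and compare both time to first token and decode tokens/sec. Run it a few times, since a single sample is noisy.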
