AI Tools

Wingman: The AI Interview Copilot for Mac That Actually Sits Beside You

A practical guide to Wingman, Codersera's invisible AI copilot for Mac. What it is, who it's for, how it fits a real interview workflow, and the day-in-the-life of an engineer using it under pressure.

Published 09 May 2026 • Updated 10 May 2026 • 8 min read

The technical interview is a strange ritual. Two engineers who, in any normal context, would solve the problem together — pull up a doc, sketch on a whiteboard, ask a colleague — instead sit across a Zoom call where one performs and the other judges. The performer has 45 minutes, no notes, no colleague, no IDE that is helpful in the way their day job IDE is. Then they go back to the day job and immediately reach for all of those things to actually ship anything.

Wingman is the small, quiet acknowledgement that this format does not represent how you actually work. It is a free macOS app, built by Codersera, that sits as an invisible overlay on your screen during a live technical interview, transcribes what the interviewer is asking, and streams a structured answer beside it — fast enough that you can read it, internalise it, and speak it in your own words.

This is the engineer's guide to what Wingman is, who it is for, how to set it up in three minutes, what a real workflow with it looks like, and how to think about it honestly.

What Wingman actually is

Wingman is a single Mac binary. You download a .dmg, drag it into /Applications, paste an LLM API key into Settings, and it lives in your menu bar. When you launch it before an interview, it does three things at once:

Listens. System audio capture feeds a transcription pipeline. The interviewer's question shows up as text inside the overlay within roughly a second of them finishing the sentence.
Thinks. The transcribed question is sent to whatever model your API key supports — DeepSeek, OpenAI, Anthropic via OpenRouter, any OpenAI-compatible endpoint. The model is given a system prompt tuned for the kind of answer engineers actually want: structured, complexity-aware, with edge cases called out.
Stays out of the way. The overlay window uses macOS display-capture exclusion (the same primitives ScreenCaptureKit uses for kids-mode and DRM). It renders only on your physical display. Zoom, Meet, screen-share, most screen recorders — they all see your desktop without the panel. You see the panel.

That is the whole product. It is not a course, it is not a content library, it is not a coach. It is a transparent helper window that knows what is being asked and shows you a good answer.

Who Wingman is for

Wingman is sharpest for three groups of engineers:

1. Senior engineers in active interview season

You are running three to five loops a week. Your day is interview, code, sleep, repeat. By round 30, your brain is mush and your reaction time has dropped twenty percent. The question that feels routine in week one — "design a URL shortener," "reverse a linked list in place" — feels strange in week six because you have answered it eleven times and you cannot remember which interviewer you told which version to. Wingman is the safety net for the question you blank on at minute eighteen, the one that would have cost you the offer.

2. Career switchers

You are a backend engineer moving into ML, or a frontend engineer moving into platform, or a generalist moving into systems. New domain, new vocabulary, the same brutal pacing. You know enough to do the job but you cannot yet speak the dialect at interview tempo. Wingman fills the gap between what you know cold and what you can credibly discuss in real time, while you rebuild fluency over the first months in the new role.

3. Coaches, mentors, and managers prepping a candidate

You are running mock loops with a junior engineer or a peer. You watch Wingman surface the canonical structured answer and you debrief on the gap between your candidate's reply and the model's. It is a faster feedback loop than any whiteboard session — you compare two versions of the answer side by side and you can be specific about where the candidate's reasoning thinned out.

The three-minute setup

Wingman setup is deliberately uncomplicated. There is no account, no signup, no telemetry call, no "verify your email."

Download the latest .dmg from codersera.com/tools/wingman. Drag Wingman.app into /Applications.
Launch it. macOS will ask for screen recording and accessibility permissions — both are needed to render the overlay and listen to system audio. Grant them.
Open Settings → API Key. Paste your LLM API key. If you do not have one, the cheapest credible option in 2026 is DeepSeek — fractions of a cent per answer. OpenAI and Anthropic via OpenRouter both work too. Pick a model and go.

That is it. Toggle the overlay with your global shortcut (default is Cmd+Shift+Space). Adjust the opacity slider until the panel is just barely visible against the brightest backgrounds you might face. Test it once with a fake interview prompt to confirm the latency feels good on your network and machine.

A day in the life

It is Tuesday morning. You have a mid-stage system design interview at 11am with a Series-C infrastructure company. The recruiter said the focus is on "distributed systems fundamentals" and the team is "strong on storage." You spent yesterday evening reviewing your notes on consistent hashing, quorum reads, and why the CAP theorem is more of a triangle of regret than a clean three-way trade-off.

10:55am. You launch Wingman from the menu bar. The overlay glides into the top-right corner of your second monitor. You drag it slightly out of your eyeline so you have to glance, not stare. You open Zoom and join the call.

11:02am. The interviewer is a staff engineer named Priya. She is friendly, direct, says she has been at the company four years. She gives you the prompt: "Design a metadata service for an object store. A hundred million objects, a thousand reads per second peak, a hundred writes per second peak, multi-region. What are you reaching for?"

You have the rough shape of an answer in your head. You start: "I'd reach for a sharded relational store, probably Postgres or one of the NewSQL options like CockroachDB, sharded by object ID prefix. The thousand reads per second isn't extreme, but multi-region pushes me toward something with built-in raft replication so I'm not hand-rolling consensus —"

While you talk, Wingman is already showing you the structured version of the answer in the panel. "Components: API gateway, metadata service, sharded storage layer, replication coordinator, read-through cache. Trade-offs: NewSQL (CockroachDB / Spanner) vs sharded Postgres + custom replication vs document store (DynamoDB Global Tables). Failure modes: stale reads on cross-region failover, hot partitions on prefix sharding, write amplification under multi-region quorum..."

You glance and you keep talking. "...and the failure mode I am most worried about is hot partitions on prefix sharding — if a single customer dumps a million objects with a common prefix, that shard gets hammered. So I'd hash the object ID before sharding, not prefix —"

Priya nods. "Good. What about the cache layer?"

You blank for half a second. Wingman has it: "Read-through cache (Redis cluster, region-local). Cache invalidation: pub/sub on metadata writes. TTL fallback at sixty seconds. Negative caching for missing keys. Watch out: cache stampede on hot key expiry — use single-flight or stale-while-revalidate."

You speak the version of that answer that sounds like you. "Region-local Redis cluster, read-through. Invalidation via pub/sub on writes. I'd add a sixty-second TTL as a safety net and negative-cache the misses to keep us from hammering the metadata service for keys that don't exist." You have used this pattern in production. You know it. You just needed the prompt.

Twelve minutes later the interview moves to the coding round. You close the system design panel and Wingman switches into LeetCode mode — the question shows up as text on the left, your IDE is on the right, and the panel shows brute force, optimal, complexity, edge cases. You type your own solution, glancing only when stuck.

The interview ends at 11:48am. Priya says "this was a strong loop." You close Wingman, get a glass of water, and go for a walk. You did the actual reasoning. Wingman just made sure your brain did not stall under pressure.

How Wingman fits with the rest of Codersera's tools

Wingman is one piece of a small set of tools Codersera builds for engineers — the same engineers we vet, place, and work with every day on the hire side of the business.

InterviewLab is the practice tool. Voice-driven mock interviews with an AI interviewer that adapts to your role and resume, plus a structured feedback report covering technical depth, system design, communication, and role fit. Most engineers prep with InterviewLab and then bring Wingman to the real loop.
Clipy is the take-home companion. Free, browser-based, unlimited screen recording with instant shareable links. When the take-home asks for a video walkthrough of your solution, this is the path of least resistance.
Note-Taker is the lightweight notepad you keep open during prep. Scratch problems, post-mortems, the question patterns you keep seeing across loops.

The pattern is the same across all four tools: take a thing engineers already do, remove the friction, and do not charge for it.

Honest comparison vs alternatives

Wingman is not the only tool in this category. A short, fair comparison:

Vs other interview-helper apps. Most paid alternatives in this space charge $30–$80/month and lock the model behind their backend, which means they take a margin on every answer. Wingman is free and you bring your own key. The trade-off: you set up the API key yourself.
Vs running ChatGPT in a separate window. Two problems. First, ChatGPT is visible in screen-share, so you have to tab-switch and hope the interviewer does not see the URL bar. Second, you are typing the question yourself, which costs you ten to fifteen seconds per round. Wingman transcribes automatically and renders out of the screen-share. The seconds add up.
Vs a second monitor with notes. Notes do not adapt. Wingman does. The first time the interviewer asks a follow-up you did not anticipate, your notes are useless and Wingman is exactly as useful as it was on the first question.

The honest part

Wingman is in the same ethical category as a notepad of prepared answers, an open IDE during the call, a second monitor with documentation, or a colleague slacking you hints. Some interviewers consider all of those off-limits. Some consider all of them fine. Most have never thought about it because the format predates remote work and predates good language models.

The judgement call is yours. We built Wingman because we believe the live technical interview, as currently practised, mostly tests how well you perform in an artificial format that does not look like the job. We would rather see engineers get hired into roles they will be excellent in, with the help of a quiet copilot, than miss the offer because they froze on minute eighteen of round four of week six.

What we will not do is make Wingman feel sleazy. There are no upsells, no "premium tier," no telemetry pushing you toward a paid plan. The app is free. Your audio stays local. Your API key stays in your machine's keychain. We make money on the hire side of the business, not on you.

FAQ

Is Wingman really free?

Yes. The app is free. You only pay for the LLM API calls your key makes — typically fractions of a cent per answer. We do not have a paid tier and we do not plan to add one.

What model should I use?

For cost-per-answer in 2026, DeepSeek is the cheapest credible option. For best quality on system design, Claude or GPT-class via OpenRouter. The honest answer: they are all good enough for interview answers. Pick the one your wallet prefers.

Can I use Wingman on Windows or Linux?

No. Wingman is Mac-only on Apple Silicon (M1 / M2 / M3 / M4). Building the equivalent screen-capture-exclusion on Windows is a different engineering project and is not on our roadmap.

Does Wingman do diagrams for system design?

Wingman returns structured prose with components, trade-offs, and follow-up talking points. You sketch the diagram while you talk — Wingman feeds you the structure to riff on. We considered diagram generation and decided the structured prose was more useful at interview tempo.

What happens if my API key runs out of credit mid-interview?

Wingman shows the API error in the panel. The interview keeps going — you fall back to your own brain. We recommend topping up your key the night before any interview week so this never bites you.

How do I report a bug or request a feature?

Email support@codersera.com or open an issue against the public release notes page at codersera.com/tools/wingman/releases. We read everything and ship fixes in the next auto-update.

Get Wingman

If you have an interview this week, Wingman takes about three minutes to set up. Download Wingman free for Mac. Add your API key, run one practice question to get a feel for the latency, and you are ready.

If you are still in the prep phase and not ready for live loops, start with InterviewLab instead — voice-driven mock interviews with structured feedback. Bring Wingman in when you are interviewing for real.