ML & DL Concepts

Concept questions check that your understanding is real — not memorized. The winning answer is concise, names the trade-off, and gives a concrete example. Below are common questions with the shape of a strong answer.

How to answer concept questions

A reliable structure: define it plainly → say why it matters → give an example or trade-off. Avoid reciting jargon. If you don’t know, reason aloud from fundamentals — interviewers value that far more than a confident wrong answer.

Foundational ML

What is overfitting, and how do you handle it? A model that memorizes training data, including its noise — low training error, high error on new data. The cause is too much capacity relative to the data. Fixes: more data, regularization (dropout, weight decay), a simpler model, early stopping. Detect it by watching training and validation loss diverge. See How Models Learn.
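
The "watch the losses diverge" check and early stopping can be sketched in a few lines. This is a minimal illustration, not a framework API: the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the best validation epoch.

    Scanning stops once `patience` consecutive epochs fail to improve
    on the best validation loss seen so far. Returns -1 for empty input.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch


# Made-up curves: training loss keeps falling while validation loss turns up.
train_losses = [0.90, 0.60, 0.40, 0.30, 0.20, 0.10]
val_losses = [0.95, 0.70, 0.55, 0.60, 0.68, 0.80]

print(early_stopping_epoch(val_losses))  # 2
```

In an interview, the point to make is that the checkpoint you keep is the one from the best validation epoch, not the last one trained.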

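Why not just use accuracy? With imbalanced classes it’s misleading: if 95% of cases belong to one class, a model that always predicts that class scores 95% accuracy while catching none of the minority cases. Use precision and recall, and pick which to favor by the cost of each error type: recall for cancer screening (a missed case is costly), precision for spam filtering (a false alarm is costly). See Model Evaluation.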
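
A small worked example makes the point concrete. The sketch below uses made-up labels (5 positives in 100 cases) and a hypothetical model that always predicts the majority class. The helper names are illustrative, and returning 0.0 when a ratio is undefined is a convention chosen here, not a universal rule.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Undefined when nothing is predicted positive; 0.0 by convention here.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    # Undefined when there are no actual positives; 0.0 by convention here.
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


# 5 positives out of 100; the "model" always predicts the negative class.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

The model looks strong on accuracy and finds none of the cases that matter, which is exactly what recall exposes.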

Explain the bias-variance trade-off. High bias = too simple, underfits (misses the real pattern). High variance = too complex, overfits (chases noise). Lowering one tends to raise the other; the goal is the balance that minimizes total error on unseen data.
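
If the interviewer wants the formal version, the standard decomposition for squared-error loss is worth having ready. It assumes the data follow y = f(x) + ε with zero-mean noise of variance σ², and that the expectation is taken over training sets (and the noise); the symbols are notation introduced here, not from the text above.

```latex
\[
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
\]
```

The noise term is a floor no model can get under; the trade-off is about how the first two terms move as capacity changes.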

Supervised vs. unsupervised — when each? Supervised needs labeled data and predicts a known target. Unsupervised finds structure in unlabeled data. Use unsupervised when labels don’t exist or the goal is exploration. See Learning Paradigms.

What is data leakage? Information not available at prediction time leaking into training — a feature that encodes the answer, or test statistics used during preprocessing. It produces great offline scores and production collapse. Mention you’d be suspicious of unusually high accuracy.
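
The preprocessing case is easy to show. The sketch below standardizes a feature two ways, on made-up numbers: once with statistics computed over train and test together (leaky), once with statistics from the training split only. The helpers `fit_scaler` and `transform` are hypothetical names for illustration.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Compute the mean and (population) standard deviation of a feature."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # avoid dividing by zero for a constant feature
    return mu, sigma


def transform(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


train = [1.0, 2.0, 3.0, 4.0]
test = [10.0, 12.0]

# Leaky: the scaler has seen the test values before evaluation.
leaky_mu, leaky_sigma = fit_scaler(train + test)

# Correct: fit on the training split only, then apply to both splits.
mu, sigma = fit_scaler(train)
train_scaled = transform(train, mu, sigma)
test_scaled = transform(test, mu, sigma)

print(mu, leaky_mu)
```

The two means differ because the leaky version has absorbed information about the test set, which is precisely what will not be available at prediction time.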

Deep learning

Why did transformers replace RNNs for text? RNNs process tokens one at a time, so training cannot be parallelized across the sequence, and they tend to lose long-range context. Transformers use self-attention: all positions are processed at once (parallelizable, GPU-friendly) and any token can reach any other directly. The cost is compute and memory that scale quadratically with sequence length. See Key Architectures.

What is attention, in one minute? For each token, attention computes how much to draw from every other token, then builds a context-weighted representation. It lets a word pull meaning from relevant distant words regardless of distance.
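
For a whiteboard follow-up, scaled dot-product attention fits in a few lines. This is a deliberately simplified sketch in plain Python: single head, no learned projections, no masking, and tiny made-up vectors. Real implementations use batched matrix operations.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    For each query: score every key by dot product, scale by sqrt(d_k),
    softmax the scores into weights, then take the weighted sum of values.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# One query that matches the first key more closely than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0], [20.0]]

outputs, weights = attention(queries, keys, values)
print(weights[0], outputs[0])
```

The output lands closer to the first value because the query is more similar to the first key. Every query scores every key, which is where the quadratic cost in sequence length comes from.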

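Why are GPUs needed? Neural network math is mostly large matrix multiplications, which are massively parallel. GPUs have thousands of cores for exactly that. In practice the binding constraint is often GPU memory (which has to hold the weights, activations, and, during training, gradients and optimizer state) rather than raw compute speed.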
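
A quick back-of-envelope calculation shows why memory is so often the limit. The sketch below estimates the memory needed for model weights alone; it ignores activations, optimizer state, and any inference cache, so treat it as a lower bound. The function name and the 7-billion-parameter example are illustrative assumptions.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="fp16"):
    """Estimate memory for weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical 7-billion-parameter model.
print(weight_memory_gb(7e9, "fp16"))  # 14.0
print(weight_memory_gb(7e9, "fp32"))  # 28.0
```

Halving the precision halves the footprint, which is the intuition behind running models in 16-bit or quantized formats.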

LLM-specific

What is a token? A chunk of text from a fixed vocabulary, averaging roughly ¾ of an English word (the ratio varies by tokenizer and language). It is the unit of billing, context limits, and latency.

Why do LLMs hallucinate? Can you eliminate it? An LLM predicts plausible next tokens; a plausible falsehood scores as well as the truth. It’s intrinsic — you can’t eliminate it, only mitigate: ground with RAG, verify outputs, constrain the task, keep humans in the loop. See How LLMs Work.

Temperature — what does it do? Controls randomness in decoding. 0 is near-deterministic (extraction, classification); higher adds variety (creative work).
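
Mechanically, temperature divides the model's logits before the softmax. The sketch below shows the effect on a made-up set of three logits. The function name is illustrative, and because dividing by zero is undefined, a temperature of 0 is commonly handled as a special case that just picks the top token; here it simply raises an error.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as greedy argmax")
    scaled = [logit / temperature for logit in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens

cold = softmax_with_temperature(logits, 0.1)  # sharply peaked on the top token
hot = softmax_with_temperature(logits, 2.0)   # flatter, more variety when sampling

print(cold)
print(hot)
```

Low temperature concentrates almost all probability on the highest-scoring token; high temperature spreads it out, so sampling produces more varied output.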

Fine-tuning vs. RAG vs. prompting — when each? Prompting for instruction gaps; RAG for knowledge gaps and changing data; fine-tuning for behavior, format, and style — not for facts. Try them in that order of cost. See Adapting LLMs.

Pretraining vs. fine-tuning? Pretraining: self-supervised next-token prediction on a vast corpus, building general knowledge — hugely expensive. Fine-tuning: continuing training on a small task-specific dataset to adapt the model — cheap, especially with LoRA.

Questions you should ask back

Interviews are two-way, and good questions signal seniority: How do you evaluate model quality? What does your AI observability look like? How do you manage inference cost? Showing you think about operating AI — not just building demos — sets you apart.

Key takeaways

Answer concept questions in three beats: plain definition, why it matters, example or trade-off. Know the foundational ML ideas (overfitting, precision/recall, bias-variance, leakage), the deep learning core (why transformers won, attention, GPUs), and the LLM specifics (tokens, hallucination, temperature, adaptation methods). Always frame answers around trade-offs, and reason honestly from fundamentals when unsure.