Interview practice / Technical interview / Okko / 2025-08-25

Okko: Technical interview

Work from top to bottom: attempt each step yourself first, then open the explanation. If a step involves code, write your solution right here and run the checks on the page.

Steps: 4
Questions: 4
Tasks: 0

1. Question (10 min)

Question

What is a probability space? What is a set of measure zero and why can a finite or countable set have probability zero in a continuous distribution?

Answer without hints

First say your answer out loud or as bullet points.

Write a draft

Formulas, solution plan, risks and examples.

Compare with the explanation

Open the explanation only after your own attempt.

Short answer

A probability space is a triple (Omega, F, P): a set of outcomes, a sigma-algebra of measurable events and a probability measure. A measure-zero set is an event A with P(A)=0; under a continuous distribution, every individual point, and therefore every finite or countable set of points, has probability zero.

Detailed explanation

A probability space consists of three parts. Omega is the sample space of possible outcomes. F is a sigma-algebra of events we are allowed to assign probabilities to. P is a probability measure on F with P(Omega)=1, P(empty)=0 and countable additivity.

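The sigma-algebra matters because on an uncountable sample space such as R it is generally impossible to assign a consistent probability to every subset; some subsets are non-measurable. F contains Omega and is closed under complements and countable unions (and therefore countable intersections), so the usual operations on events always produce events that still have a probability.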
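As a small illustration of the triple, here is a sketch of a finite probability space in Python. The fair-die example and the helper names `power_set` and `prob` are assumptions made for illustration. On a finite sample space the power set is a valid sigma-algebra, so the closure properties can be checked by brute force.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))  # sample space Omega


def power_set(s):
    """All subsets of s; on a finite sample space this is a valid sigma-algebra."""
    items = sorted(s)
    return {
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    }


events = power_set(omega)  # F


def prob(event):
    """P gives each outcome mass 1/6, so P(A) = |A| / 6."""
    return Fraction(len(event), len(omega))


# Closure under complement and under union (finite unions are enough here,
# because F itself is finite).
closed_under_complement = all(omega - a in events for a in events)
closed_under_union = all(a | b in events for a in events for b in events)

# Additivity on two disjoint events.
even, low_odd = frozenset({2, 4, 6}), frozenset({1, 3})
additive = prob(even | low_odd) == prob(even) + prob(low_odd)

print(len(events), prob(omega), prob(frozenset()))
print(closed_under_complement, closed_under_union, additive)
```

The finite case hides the real difficulty: the power set stops being usable only on uncountable spaces, which is where a smaller sigma-algebra such as the Borel sets is needed.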

A set of measure zero is an event with probability zero. In a continuous distribution over R, a single exact point has probability zero, and any countable set of points also has probability zero by countable additivity. This does not mean the event is logically impossible; it means it carries no probability mass under that measure.
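The argument can be written out in two steps. This is a sketch that assumes the distribution has a continuous CDF F, and that A = {x_1, x_2, ...} is any countable set of points; the symbols F, epsilon and x_i are introduced here for illustration.

```latex
\begin{aligned}
P(\{x\}) &\le P\bigl((x-\varepsilon,\, x+\varepsilon]\bigr)
          = F(x+\varepsilon) - F(x-\varepsilon)
          \xrightarrow[\varepsilon \to 0]{} 0
          \quad\Longrightarrow\quad P(\{x\}) = 0, \\[4pt]
P(A) &= P\Bigl(\bigcup_{i=1}^{\infty} \{x_i\}\Bigr)
      = \sum_{i=1}^{\infty} P(\{x_i\}) = 0 .
\end{aligned}
```

The second line is where countable additivity is used. The same step fails for an uncountable set such as [0, 1], which is why an interval can have positive probability even though each of its points has probability zero.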

Common mistakes

  • Saying that probability zero means impossible.
  • Forgetting the sigma-algebra component.
  • Using finite additivity when countable additivity is the key property.

How to say it in the interview

  • State the triple (Omega, F, P) first.
  • Use the continuous uniform distribution on [0,1] as the simplest example.

2. Question (8 min)

Vector space, span and basis

Short answer

A vector space supports vector addition and scalar multiplication with the usual axioms. A span is all linear combinations of a set of vectors; a basis is a linearly independent set whose span is the whole space.

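Detailed explanation

A vector space is a set of objects (vectors) over a field of scalars in which you can add two vectors and multiply a vector by a scalar. The set is closed under both operations, and they satisfy the usual axioms: associativity and commutativity of addition, a zero vector, additive inverses, distributivity, and compatibility of scalar multiplication with the scalar identity 1.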

The linear span of vectors v1...vk is the set of all linear combinations a1 v1 + ... + ak vk. It can be the whole space or a subspace.

A basis is a spanning set with one extra requirement: its vectors are linearly independent. Linear independence means that no vector in the set can be written as a linear combination of the others. Spanning guarantees that every vector in the space has a coordinate representation in the basis, and independence guarantees that this representation is unique.
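For R2 these definitions can be checked with a 2x2 determinant. The sketch below assumes integer vector components and uses hypothetical helper names (`det2`, `is_basis_of_r2`, `coordinates`); the unique coordinates are found with Cramer's rule.

```python
from fractions import Fraction


def det2(u, v):
    """Determinant of the 2x2 matrix whose columns are u and v."""
    return u[0] * v[1] - u[1] * v[0]


def is_basis_of_r2(vectors):
    """Exactly two vectors with a nonzero determinant are independent and span R^2."""
    return len(vectors) == 2 and det2(*vectors) != 0


def coordinates(u, v, w):
    """Solve a*u + b*v = w by Cramer's rule (integer components assumed)."""
    d = det2(u, v)
    if d == 0:
        raise ValueError("u and v are linearly dependent, so they are not a basis")
    return Fraction(det2(w, v), d), Fraction(det2(u, w), d)


u, v = (1, 1), (1, -1)
print(is_basis_of_r2([u, v]))                    # two non-collinear vectors
print(is_basis_of_r2([(1, 2), (2, 4)]))          # collinear: span is only a line
print(is_basis_of_r2([(1, 0), (0, 1), (1, 1)]))  # spans R^2 but is dependent
print(coordinates(u, v, (3, 1)))                 # unique (a, b) with a*u + b*v = (3, 1)
```

The three-vector case shows why a spanning set is not automatically a basis: the set still reaches every vector of R2, but coordinates in it are no longer unique.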

Common mistakes

  • Saying that any spanning set is a basis.
  • Forgetting linear independence.
  • Defining a basis only as coordinates, without the spanning property.

How to say it in the interview

  • Use R2: two non-collinear vectors form a basis; adding any third vector still spans R2 but makes the set linearly dependent.

3. Question (10 min)

Question

What does the Central Limit Theorem say and why is it important in statistics and A/B testing?

Short answer

Under common regularity conditions, the normalized sample mean tends to a normal distribution as sample size grows. This justifies normal approximations for confidence intervals, tests and experiment readouts.

Detailed explanation

The Central Limit Theorem says that for independent identically distributed variables with finite variance, the distribution of the normalized sample mean approaches a normal distribution as sample size grows:

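(sample_mean - mu) / (sigma / sqrt(n)) -> N(0, 1) in distribution, where mu and sigma are the mean and standard deviation of a single observation.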

This is important because many metrics are averages or proportions. Even if individual observations are not normally distributed, the sampling distribution of the mean can be approximately normal for large n. That gives practical formulas for standard errors, confidence intervals and hypothesis tests.
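A quick simulation makes the claim concrete. This is a sketch under assumed settings: observations come from an Exponential(1) distribution (clearly non-normal, with mean 1 and standard deviation 1), and the sample size, number of replications, seed and the helper name `standardized_means` are arbitrary choices for illustration.

```python
import math
import random
import statistics


def standardized_means(n, replications, rng):
    """Draw `replications` samples of size n from Exponential(1)
    and standardize each sample mean as (mean - mu) / (sigma / sqrt(n))."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    zs = []
    for _ in range(replications):
        sample_mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((sample_mean - mu) / (sigma / math.sqrt(n)))
    return zs


rng = random.Random(0)  # fixed seed so the run is repeatable
z = standardized_means(n=200, replications=2000, rng=rng)

z_mean = statistics.fmean(z)   # should be close to 0
z_std = statistics.stdev(z)    # should be close to 1
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)  # should be close to 0.95

print(round(z_mean, 3), round(z_std, 3), round(coverage, 3))
```

The individual observations stay exponential throughout; only the distribution of the standardized sample mean looks normal.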

In A/B testing, CLT reasoning underlies z-tests and t-test approximations for many user-level metrics. The assumptions still matter: independence, finite variance, enough sample size and a metric definition that does not create severe dependence or heavy-tail instability.
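As an example of a CLT-based readout, here is a minimal sketch of a pooled two-proportion z-test. The function name `two_proportion_z_test` and the conversion counts are hypothetical, and the test assumes independent users and samples large enough for the normal approximation.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference under the null hypothesis p_a == p_b.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up numbers: 5.0% vs 5.6% conversion with 10,000 users per group.
z_stat, p_value = two_proportion_z_test(500, 10_000, 560, 10_000)
print(round(z_stat, 2), round(p_value, 3))
```

If the same user contributes many correlated events, the independence assumption behind `se` breaks down, and the p-value becomes too optimistic.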

Common mistakes

  • Saying that the original data becomes normally distributed.
  • Ignoring the independence and finite-variance assumptions.
  • Using the CLT blindly for heavy-tailed or dependent user events.

How to say it in the interview

  • Say "sample mean" early.
  • Connect it to the standard error: sigma divided by the square root of n.

4. Question (12 min)

Production ML question

Compare REST and gRPC at a high level. Then explain what a database index does and what simple data structures can back an index.

Short answer

REST is a resource-oriented style, usually implemented over HTTP with JSON or other text payloads; gRPC uses protobuf contracts over HTTP/2 and supports efficient binary RPC and streaming. Database indexes speed up lookups by maintaining auxiliary structures such as B-trees for range queries or hash indexes for equality lookups.

Detailed explanation

REST is an architectural style commonly implemented over HTTP with resource URLs, methods such as GET/POST and JSON payloads. It is simple, widely interoperable and human-readable, but may be less efficient for high-throughput service-to-service RPC.

gRPC defines service methods and message schemas with protobuf, usually runs over HTTP/2, uses compact binary serialization and supports streaming. It is useful for internal microservice communication where typed contracts, lower payload overhead and long-lived connections matter.
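The payload-overhead point can be illustrated without any gRPC tooling. The sketch below is only a stand-in: it compares JSON text with a fixed binary layout built with Python's `struct` module, which is not the protobuf wire format. The message fields and values are made up for illustration.

```python
import json
import struct

# Hypothetical message; field names and values are invented for this example.
message = {"user_id": 12345, "score": 0.87, "is_active": True}

# Text encoding, as a typical JSON-over-HTTP API would send it.
json_bytes = json.dumps(message).encode("utf-8")

# Fixed binary layout agreed on by both sides in advance (a stand-in for a
# schema, NOT protobuf): little-endian uint32, float64, bool, no padding.
LAYOUT = struct.Struct("<Id?")
binary_bytes = LAYOUT.pack(
    message["user_id"], message["score"], message["is_active"]
)

# Decoding only works because the receiver knows the same layout.
decoded = LAYOUT.unpack(binary_bytes)

print(len(json_bytes), len(binary_bytes), decoded)
```

The binary form is smaller because field names live in the shared contract rather than in every message. The cost is that the bytes are not human-readable and both sides must keep the contract in sync.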

A database index is an auxiliary data structure that lets the database find rows without scanning the whole table. For equality lookup, a hash table-like structure can work. For range queries, sorting and ordered traversal matter, so B-tree/B+tree-style indexes are common. The trade-off is write overhead and extra storage because the index must be maintained when data changes.
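The equality versus range distinction can be sketched with two in-memory structures. This is a toy model, not how any particular database implements indexes: a dict stands in for a hash index and a sorted list with binary search stands in for the ordered leaf level of a B-tree. The table contents and function names are hypothetical.

```python
import bisect
from collections import defaultdict

# Hypothetical table: (row_id, age) pairs stored in arbitrary order.
rows = [(0, 31), (1, 25), (2, 42), (3, 25), (4, 37)]

# Hash-style index: key -> list of row ids. Supports equality lookups only.
hash_index = defaultdict(list)
for row_id, age in rows:
    hash_index[age].append(row_id)

# Ordered index: (key, row_id) pairs kept sorted by key.
sorted_index = sorted((age, row_id) for row_id, age in rows)
sorted_keys = [age for age, _ in sorted_index]


def find_equal(age):
    """Equality lookup through the hash-style index."""
    return hash_index.get(age, [])


def find_range(low, high):
    """Row ids with low <= age <= high: binary search, then an ordered scan."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [row_id for _, row_id in sorted_index[start:end]]


print(find_equal(25))      # rows whose age is exactly 25
print(find_range(30, 40))  # rows whose age falls in [30, 40]
```

Every insert or update to `rows` would also have to update both `hash_index` and `sorted_index`, which is the write and storage overhead mentioned above.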

Common mistakes

  • Calling gRPC lower-level than HTTP without noting that it commonly runs over HTTP/2.
  • Saying that an index always makes every query faster.
  • Expecting a hash index to serve range queries.

How to say it in the interview

  • Give one use case for REST and one for gRPC.
  • For indexes, immediately distinguish equality from range lookup.