Question · Hard · data-quality · Production ML question for a technical interview · Chinor Chinor

Noisy ASR annotations and transcript aggregation

Short answer

Use strict annotation guidelines, normalize numbers and addresses, double-label hard samples, align transcripts, resolve disagreements by confidence or review, and measure annotator quality.

Full breakdown

Start with guidelines. Define how to write times, dates, numbers, branch names, hesitations and unclear speech. Provide examples and a normalization dictionary for branches and common address variants.

For aggregation, align transcripts at word or character level, using edit distance or sequence alignment. Tokens agreed by most annotators can be accepted automatically. Disagreements around key entities such as time, date and address should be escalated to expert review or resolved using the operator booking table when it is trustworthy.

Track annotator quality with overlap samples, disagreement rate, entity-specific error rate and adjudication outcomes.

The goal is not a pretty transcript; it is a transcript and entity labels that improve ASR and downstream booking extraction.
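The alignment-then-vote step can be sketched roughly as follows. This is a minimal illustration, not a production aggregator: it assumes whitespace tokenization, uses the first transcript as a pivot, and aligns the others to it with Python's standard `difflib.SequenceMatcher`. The function names `align_to_pivot` and `aggregate` and the `min_share` threshold are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def align_to_pivot(pivot, other):
    """For each pivot word position, return the word `other` has there, or None."""
    aligned = [None] * len(pivot)
    matcher = SequenceMatcher(a=pivot, b=other, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only one-to-one spans give a clear vote for a pivot position.
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            for k in range(i2 - i1):
                aligned[i1 + k] = other[j1 + k]
        # "insert", "delete" and uneven "replace" spans cast no vote here.
    return aligned


def aggregate(transcripts, min_share=0.5):
    """Return (token, accepted) per pivot position.

    `accepted` is True when more than `min_share` of all annotators agree
    on the token; other positions should go to adjudication.
    """
    words = [t.split() for t in transcripts]
    pivot = words[0]
    columns = [align_to_pivot(pivot, w) for w in words]
    result = []
    for pos in range(len(pivot)):
        votes = Counter(col[pos] for col in columns if col[pos] is not None)
        token, count = votes.most_common(1)[0]
        result.append((token, count / len(words) > min_share))
    return result


for token, accepted in aggregate(["come at 5 pm", "come at 9 pm", "come at 7 pm"]):
    print(token, "accept" if accepted else "escalate")
```

Words that appear only in non-pivot transcripts are dropped by this sketch; a fuller pipeline would align all transcripts jointly and would consult the booking table before falling back to expert review.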

Theory

Speech labels are sequence labels, so majority vote needs alignment and entity-aware adjudication rather than ordinary class voting.
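One way to picture entity-aware adjudication is a routing rule applied to each aligned position. The sketch below is an assumption-heavy illustration: `ENTITY_PATTERN` is a hypothetical regex that only catches digit-based tokens such as times and dates, and the policy of sending any entity disagreement to review, even when a majority exists, is one possible design choice rather than a prescribed rule.

```python
import re
from collections import Counter

# Hypothetical pattern for digit-based entities such as "5", "17:30", "12/03".
ENTITY_PATTERN = re.compile(r"^\d+([:./-]\d+)*$")


def route_position(tokens):
    """Decide what to do with one aligned position, given each annotator's token."""
    votes = Counter(tokens)
    if len(votes) == 1:
        return "accept"  # everyone agrees
    if any(ENTITY_PATTERN.match(tok) for tok in votes):
        return "expert_review"  # disagreement touches a time/date/number
    top_count = votes.most_common(1)[0][1]
    # Non-entity disagreement: a strict majority is enough.
    return "accept_majority" if top_count > len(tokens) / 2 else "expert_review"


print(route_position(["5", "9", "5"]))     # entity disagreement
print(route_position(["uh", "um", "uh"]))  # filler-word disagreement
```

Spelled-out numbers and address words would not match this pattern, so in practice the check would run on normalized tokens or on entity tags.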

Common mistakes

Using raw annotator text without normalization.
Aggregating transcripts with naive string equality.
Ignoring entity disagreements because WER looks acceptable.
Failing to measure annotator-level quality.
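To make the first and last mistakes concrete, here is a small sketch of normalizing before comparing and of a per-annotator disagreement rate on overlap samples. The `NORMALIZATION` entries, the function names and the data layout are all hypothetical; real entries would come from the annotation guidelines, and the utterance-level rate shown here is deliberately coarse.

```python
from collections import defaultdict

# Hypothetical normalization dictionary; real entries come from the guidelines.
NORMALIZATION = {"st.": "street", "st": "street", "five": "5", "p.m.": "pm"}


def normalize(text):
    """Lowercase, drop commas, and map known variants to a canonical form."""
    words = text.lower().replace(",", " ").split()
    return [NORMALIZATION.get(w, w) for w in words]


def disagreement_rates(labels, adjudicated):
    """Share of utterances where an annotator differs from the adjudicated text.

    labels: {utterance_id: {annotator: text}}
    adjudicated: {utterance_id: text}
    """
    total = defaultdict(int)
    wrong = defaultdict(int)
    for utt_id, by_annotator in labels.items():
        reference = normalize(adjudicated[utt_id])
        for annotator, text in by_annotator.items():
            total[annotator] += 1
            if normalize(text) != reference:
                wrong[annotator] += 1
    return {a: wrong[a] / total[a] for a in total}


# The raw strings differ, but the normalized token lists are identical.
print("Five p.m., Lenina St." == "5 pm lenina street")
print(normalize("Five p.m., Lenina St.") == normalize("5 pm lenina street"))
```

The same loop could be restricted to entity tokens to obtain an entity-specific error rate per annotator.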

How to answer in an interview

Mention sequence alignment before majority decisions.
Focus review on dates, times and addresses.