Required

Multimodal Data Governance

Video/audio/image/text governance: consent, PII, likeness abuse, synthetic media provenance and safety filters.

Study time: 26 min

Governance and safety controls for image-text, audio and video corpora: alignment, provenance, PII, consent, safety filters and deletion workflows.

What the candidate should be able to do

  • Explain how multimodal datasets add alignment, safety, storage and rights-management issues beyond text-only corpora.
  • Treat CLIP-style filtering/alignment as curation signal, not proof of safety or legality.
  • Specify dataset-card fields: sources, collection process, filters, intended use, biases and unsafe-content risks.
  • Propose governance workflows for exclusions, audits and downstream user warnings.
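The dataset-card fields listed above can be captured in a small structure with a completeness check. This is a minimal sketch, not a standard schema: the class name `DatasetCard`, the field names and the helper `missing_fields` are all hypothetical and chosen only to mirror the list.

```python
from dataclasses import asdict, dataclass, field

# Hypothetical field names that mirror the bullet list; not an official schema.
REQUIRED_FIELDS = (
    "sources",
    "collection_process",
    "filters",
    "intended_use",
    "known_biases",
    "unsafe_content_risks",
)


@dataclass
class DatasetCard:
    name: str
    sources: list[str] = field(default_factory=list)
    collection_process: str = ""
    filters: list[str] = field(default_factory=list)
    intended_use: str = ""
    known_biases: list[str] = field(default_factory=list)
    unsafe_content_risks: list[str] = field(default_factory=list)


def missing_fields(card: DatasetCard) -> list[str]:
    """Return the required fields that are still empty, in declaration order."""
    data = asdict(card)
    return [name for name in REQUIRED_FIELDS if not data[name]]


card = DatasetCard(
    name="example-image-text",
    sources=["web crawl"],
    filters=["CLIP-style similarity threshold"],
)
print(missing_fields(card))
```

A check like this only confirms that a field was filled in. Whether the stated biases and risks are accurate still needs human review.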

What interviewers ask

  • How would you curate a web-scale image-text dataset responsibly?
  • What metadata does video need that text does not?
  • How would you handle removal requests after model training?
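For the removal-request question, one possible answer combines an exclusion list with a record of which model versions saw each sample. The sketch below assumes samples are keyed by a SHA-256 hash of their bytes. `RemovalRegistry` and its methods are hypothetical names, not part of any library. Adding a key to the exclusion list does not change weights that were already trained, so the sketch only reports which versions are affected and leaves the follow-up (retraining, warnings to downstream users) to the governance process.

```python
import hashlib
from dataclasses import dataclass, field


def content_key(raw_bytes: bytes) -> str:
    """Stable identifier for a sample, derived from its content."""
    return hashlib.sha256(raw_bytes).hexdigest()


@dataclass
class RemovalRegistry:
    # Keys that must not appear in any future training manifest.
    excluded: set[str] = field(default_factory=set)
    # Hypothetical lineage record: model version -> keys it was trained on.
    lineage: dict[str, set[str]] = field(default_factory=dict)

    def record_training(self, model_version: str, keys: list[str]) -> None:
        self.lineage[model_version] = set(keys)

    def request_removal(self, key: str) -> list[str]:
        """Exclude the key and return the model versions already trained on it."""
        self.excluded.add(key)
        return sorted(v for v, keys in self.lineage.items() if key in keys)

    def filter_manifest(self, keys: list[str]) -> list[str]:
        """Drop excluded keys from a manifest before the next training run."""
        return [k for k in keys if k not in self.excluded]


registry = RemovalRegistry()
key_a, key_b = content_key(b"sample a"), content_key(b"sample b")
registry.record_training("v1", [key_a, key_b])
registry.record_training("v2", [key_b])
affected = registry.request_removal(key_a)   # versions that need follow-up
next_manifest = registry.filter_manifest([key_a, key_b])
```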

Practical task

Create a governance checklist for an image-text or video dataset. It should cover a sample schema, safety filters, audit sampling, a dataset card and an escalation process.
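As a starting point for the schema and audit-sampling parts of the task, the sketch below defines a per-sample record and draws a reproducible audit sample stratified by modality and consent status. All field names, the allowed values in the comments and the function `audit_sample` are assumptions made for illustration; a real schema would follow the project's own legal and safety requirements.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    # Hypothetical schema; field names are illustrative only.
    sample_id: str
    modality: str          # e.g. "image-text" or "video"
    source_url: str
    license_name: str
    consent_status: str    # e.g. "granted", "unknown", "withdrawn"
    contains_faces: bool
    pii_flag: bool
    is_synthetic: bool
    safety_score: float    # filter output in [0, 1]; higher means more likely unsafe


def audit_sample(records, per_stratum, seed=0):
    """Pick up to `per_stratum` records from each (modality, consent) group."""
    rng = random.Random(seed)  # fixed seed so the audit can be reproduced
    strata = defaultdict(list)
    for record in records:
        strata[(record.modality, record.consent_status)].append(record)
    picked = []
    for key in sorted(strata):
        group = strata[key]
        picked.extend(rng.sample(group, min(per_stratum, len(group))))
    return picked


def make_record(i, modality, consent):
    return SampleRecord(
        sample_id=f"s{i}", modality=modality, source_url="https://example.com",
        license_name="unknown", consent_status=consent, contains_faces=False,
        pii_flag=False, is_synthetic=False, safety_score=0.0,
    )


records = [make_record(i, "image-text", "granted") for i in range(10)]
records += [make_record(i, "video", "unknown") for i in range(10, 13)]
to_review = audit_sample(records, per_stratum=2)
```

Stratifying keeps small groups, such as video with unknown consent, from being crowded out of the audit by the largest source.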

Source-grounded rule

Safety scores and filters are imperfect signals; publish limitations and review process rather than claiming guaranteed safety.
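One way to act on this rule is to treat a filter score as a routing signal with an explicit uncertain band, then measure how often audited "keep" items turn out to be unsafe so that the figure can be published as a limitation. The thresholds `drop_at` and `keep_below` and both function names are hypothetical, and the score is assumed to lie in [0, 1] with higher meaning more likely unsafe.

```python
def triage(score, drop_at=0.9, keep_below=0.2):
    """Route a sample by safety score; the thresholds are illustrative only."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= drop_at:
        return "drop"
    if score < keep_below:
        return "keep"
    return "review"  # uncertain band goes to human review


def observed_miss_rate(audited_kept_labels):
    """Fraction of audited 'keep' items that reviewers labeled unsafe."""
    if not audited_kept_labels:
        raise ValueError("need at least one audited item")
    return sum(audited_kept_labels) / len(audited_kept_labels)


decisions = [triage(s) for s in (0.95, 0.1, 0.5)]
miss_rate = observed_miss_rate([False, False, True, False])
```

The miss rate from a small audit is only an estimate. Reporting it together with the audit size is more honest than stating that the filtered data is safe.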