Deduplication by email

Medium · Python
Tags: Pandas, drop_duplicates, data-cleaning, deduplication

You are given a table of users as a dictionary of columns: email, name, date.

For each email, keep only the most recent record by date.

Signature

def dedup_by_email(data: dict) -> list[dict]:

The result is a list of records sorted by email.
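
The rule above (latest record per email, result sorted by email) can be sketched with pandas, which the task tags suggest. This is one possible approach, not a reference solution. It assumes every date is an ISO "YYYY-MM-DD" string, so sorting the strings as text gives chronological order; the task does not say how to break ties on equal dates, and this sketch keeps the row that appears later in the input.

```python
import pandas as pd


def dedup_by_email(data: dict) -> list[dict]:
    df = pd.DataFrame(data)
    latest = (
        # Assumption: ISO "YYYY-MM-DD" strings sort chronologically as text.
        # A stable sort keeps input order among rows with equal dates.
        df.sort_values("date", kind="stable")
        # After sorting, the last row for each email is its most recent one.
        .drop_duplicates(subset="email", keep="last")
        .sort_values("email")
    )
    # One dict per remaining row, with the original column names as keys.
    return latest.to_dict("records")
```

If the dates came in another format, converting them with `pd.to_datetime` into a helper column before sorting would be the safer choice.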

Examples

data = {"email": ["alice@mail.ru","bob@mail.ru","alice@mail.ru","bob@mail.ru"],
        "name": ["Alice","Bob","Alice A","Robert"],
        "date": ["2024-01-01","2024-01-05","2024-03-15","2024-02-10"]}
dedup_by_email(data) → [
    {"email": "alice@mail.ru", "name": "Alice A", "date": "2024-03-15"},
    {"email": "bob@mail.ru", "name": "Robert", "date": "2024-02-10"}
]

Example 1

Input:
data = {"email":["alice@mail.ru","bob@mail.ru","alice@mail.ru","bob@mail.ru"],"name":["Alice","Bob","Alice A","Robert"],"date":["2024-01-01","2024-01-05","2024-03-15","2024-02-10"]}
Output: [{"email":"alice@mail.ru","name":"Alice A","date":"2024-03-15"},{"email":"bob@mail.ru","name":"Robert","date":"2024-02-10"}]

Example 2

Input:
data = {"email":["x@a.com","y@a.com","x@a.com","x@a.com"],"name":["X1","Y","X2","X3"],"date":["2024-01-01","2024-06-01","2024-05-01","2024-03-01"]}
Output: [{"email":"x@a.com","name":"X2","date":"2024-05-01"},{"email":"y@a.com","name":"Y","date":"2024-06-01"}]

Example 3

Input:
data = {"email":["solo@test.com"],"name":["Solo"],"date":["2024-07-07"]}
Output: [{"email":"solo@test.com","name":"Solo","date":"2024-07-07"}]
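
For comparison, the same rule can be sketched without pandas by keeping one "best so far" record per email in a plain dictionary. The name `dedup_by_email_plain` is hypothetical and used only for this illustration. As before, the sketch assumes ISO "YYYY-MM-DD" date strings, which compare chronologically as text, and it lets the later row win when two dates are equal.

```python
def dedup_by_email_plain(data: dict) -> list[dict]:
    # Hypothetical helper name; not part of the task statement.
    columns = list(data)
    latest: dict[str, dict] = {}
    # zip(*columns) walks the column-oriented table row by row.
    for values in zip(*data.values()):
        row = dict(zip(columns, values))
        kept = latest.get(row["email"])
        # Assumption: ISO date strings, so text comparison is chronological.
        # ">=" lets a later row replace an earlier one with the same date.
        if kept is None or row["date"] >= kept["date"]:
            latest[row["email"]] = row
    # Emit the surviving records in email order.
    return [latest[email] for email in sorted(latest)]
```

This makes a single pass over the rows and sorts only the distinct emails, which can be handy when pandas is not available.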