Some datasets/models/benchmarks may cite the same paper under different citation ids, which leads to duplicate entries in the paper.

_Originally posted by @KennethEnevoldsen in https://github.com/embeddings-benchmark/mteb/pull/4012#pullrequestreview-3722397200_
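
One possible way to surface these cases is to group citations by a normalized title and flag any title that appears under more than one id. The sketch below is only an illustration, not existing mteb code: `find_duplicate_citations`, `normalize_title`, and the input mapping (owner name to raw BibTeX string) are hypothetical, and the regex-based parsing assumes one field per line with the title on a single line.

```python
import re
from collections import defaultdict

# Assumption: each BibTeX string holds one entry, with one field per line.
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")
TITLE = re.compile(
    r"\btitle\s*=\s*[{\"](.+?)[}\"]\s*,?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def normalize_title(title: str) -> str:
    """Lowercase, drop braces/punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def find_duplicate_citations(citations: dict[str, str]) -> dict[str, dict[str, list[str]]]:
    """Return {normalized title: {citation id: [owners]}} for titles
    that are cited under more than one distinct id.

    `citations` maps a (hypothetical) dataset/model/benchmark name
    to its raw BibTeX string.
    """
    by_title: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for owner, bibtex in citations.items():
        key_match = ENTRY_KEY.search(bibtex)
        title_match = TITLE.search(bibtex)
        if not key_match or not title_match:
            continue  # skip entries this simple parser cannot read
        title = normalize_title(title_match.group(1))
        by_title[title][key_match.group(1)].append(owner)
    # Keep only titles that appear under more than one citation id.
    return {t: dict(ids) for t, ids in by_title.items() if len(ids) > 1}
```

Matching on titles alone can produce false positives (for example, a preprint and a proceedings version with the same title), so flagged groups would still need a manual check before merging ids.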