[PULP-1125] Pulp does not correctly handle 2 NVRA (different epoch) packages in one repository #4239

@dralley

Version

any

Describe the bug

This is a specific subvariant of #2678 (comment)

Package filenames are generated from the NVRA, while repository uniqueness is constrained by the NEVRA. Two packages with the same NVRA but different epochs can therefore co-exist in a repository. Both will declare that they live at the same filename, but only one can "win". Currently that probably means the package with the later build time "wins", and DNF downgrade would likely be broken in this case, as there is no way to actually fetch the older package.
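
To make the collision concrete, here is a minimal sketch. The `Pkg` type and the `nevra` / `filename` helpers are hypothetical stand-ins written for this illustration, not pulp_rpm's own code, and the string formats are assumed to follow the usual RPM conventions.

```python
from typing import NamedTuple


class Pkg(NamedTuple):
    # Hypothetical stand-in for a package record; not a pulp_rpm model.
    name: str
    epoch: str
    version: str
    release: str
    arch: str


def nevra(p: Pkg) -> str:
    # Identity assumed for repository uniqueness: includes the epoch.
    return f"{p.name}-{p.epoch}:{p.version}-{p.release}.{p.arch}"


def filename(p: Pkg) -> str:
    # Conventional RPM filename: built from the NVRA only, the epoch is dropped.
    return f"{p.name}-{p.version}-{p.release}.{p.arch}.rpm"


old = Pkg("foo", "0", "1.0", "1", "x86_64")
new = Pkg("foo", "1", "1.0", "1", "x86_64")

assert nevra(old) != nevra(new)        # distinct NEVRA: both may be in the repository
assert filename(old) == filename(new)  # same NVRA: both claim the same published path
```

The two packages are distinct as far as uniqueness is concerned, yet they map to a single path, so only one file can be served from it.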

The solution also needs to supplant the previous, incomplete fix for a similar issue:

# TODO: this is meant to be a !! *temporary* !! fix for
# https://github.com/pulp/pulp_rpm/issues/2407
pkg_pks_to_ignore = set()
latest_build_time_by_nevra = defaultdict(list)
packages = Package.objects.filter(pk__in=content)
for pkg in packages.values(
    "pk",
    "name",
    "epoch",
    "version",
    "release",
    "arch",
    "time_build",
).iterator():
    nevra = format_nevra(pkg["name"], pkg["epoch"], pkg["version"], pkg["release"], pkg["arch"])
    latest_build_time_by_nevra[nevra].append((pkg["time_build"], pkg["pk"]))
for nevra, pkg_data in latest_build_time_by_nevra.items():
    # sort the packages by when they were built
    if len(pkg_data) > 1:
        pkg_data.sort(key=lambda p: p[0], reverse=True)
        pkg_pks_to_ignore |= set(entry[1] for entry in pkg_data[1:])
        log.warning(
            "Duplicate packages found competing for NEVRA {nevra}, selected the one with "
            "the most recent build time, excluding {others} others.".format(
                nevra=nevra, others=len(pkg_data[1:])
            )
        )

To Reproduce
Put two packages with the same NVRA but different epochs into one repository. They should co-exist, yet when the repository is published only one package will be accessible.
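
A rough way to check a set of packages for this condition is to group them by the filename they would publish under and look for groups that hold more than one NEVRA. The sketch below only detects the conflict and does not propose a resolution. `find_filename_conflicts` and the dict layout are assumptions made for illustration, loosely modelled on the `.values()` fields in the snippet above.

```python
from collections import defaultdict


def find_filename_conflicts(packages):
    """Return {filename: [nevra, ...]} for filenames claimed by more than one NEVRA.

    `packages` is an iterable of dicts with the keys name, epoch, version,
    release and arch (a hypothetical layout, not a pulp_rpm API).
    """
    nevras_by_filename = defaultdict(set)
    for pkg in packages:
        # The filename is derived from the NVRA only; the epoch is not part of it.
        fname = "{name}-{version}-{release}.{arch}.rpm".format(**pkg)
        nevra = "{name}-{epoch}:{version}-{release}.{arch}".format(**pkg)
        nevras_by_filename[fname].add(nevra)
    # Keep only the filenames that more than one distinct NEVRA maps to.
    return {f: sorted(n) for f, n in nevras_by_filename.items() if len(n) > 1}


packages = [
    {"name": "foo", "epoch": "0", "version": "1.0", "release": "1", "arch": "x86_64"},
    {"name": "foo", "epoch": "1", "version": "1.0", "release": "1", "arch": "x86_64"},
    {"name": "bar", "epoch": "0", "version": "2.0", "release": "3", "arch": "noarch"},
]
conflicts = find_filename_conflicts(packages)
```

Here `conflicts` contains a single entry for `foo-1.0-1.x86_64.rpm` listing both epochs, while `bar` is left out because nothing else competes for its filename.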

Expected behavior

We need a better method of filename conflict resolution than simply letting the package with the latest build time "win" when publishing. Discussion needs to happen on exactly what that method should be.
