@tobiasschweizer
Collaborator

@tobiasschweizer tobiasschweizer commented Feb 11, 2026

This PR adds the dates for the latest successful harvests (if any) to the endpoints config.

There are still some things to clarify:

  • until_date: I think this is not calculated explicitly but set when the row is written, via the column default (see the definition: until_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP).
  • precision (date and time of day): I think the crawler only cares about the day, not the time of day
  • make sure this can never fail in Python (endpoint with no previous harvest, with only an initial harvest, or with existing incremental harvests)
  • so far, this only shows the latest successful harvest. Is this sufficient?
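
The three cases in the list above (no previous harvest, only an initial harvest, existing incremental harvests) could be handled on the Python side roughly as follows. This is a minimal sketch, not the crawler's actual code: the helper names `parse_timestamp` and `next_from_day` are hypothetical, and it assumes that the next incremental harvest would start from the day of the previous `until_date`, which is one of the open questions above.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp like "2025-11-10T10:16:59.766606Z"; return None if absent or malformed."""
    if not value:
        return None
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        # Assumption: an unreadable date is treated like a missing one.
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are meant as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def next_from_day(endpoint: dict) -> Optional[str]:
    """Return the day (YYYY-MM-DD) to harvest from, or None for a full harvest.

    Covers an endpoint without any date fields, one with only an initial
    harvest (from_date is null), and one with incremental harvests.
    """
    until = parse_timestamp(endpoint.get("until_date"))
    if until is None:
        return None
    # Day precision only: the time of day is dropped.
    return until.date().isoformat()
```

With this sketch, an endpoint without dates yields `None` (full harvest), while the "Social Sciences Data Station" entry from the sample output would yield "2025-11-10".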

Here is a sample output for my local setup:

{
  "endpoints_configs": [
    {
      "name": "Social Sciences Data Station",
      "harvest_url": "https://ssh.datastations.nl/oai",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": null,
        "additional_metadata_params": {
          "format": "dataverse_json",
          "endpoint": "https://ssh.datastations.nl/api/datasets/export",
          "protocol": "REST_API"
        }
      },
      "code": "DANS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-10T10:16:59.766606Z",
      "started_at": "2025-11-10T10:16:59.792841Z",
      "completed_at": "2025-11-10T11:08:40.935497Z"
    },
    {
      "name": "Generalist",
      "harvest_url": "https://dataverse.nl/oai",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": null,
        "additional_metadata_params": {
          "format": "dataverse_json",
          "endpoint": "https://dataverse.nl/api/datasets/export",
          "protocol": "REST_API"
        }
      },
      "code": "DANS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-10T15:18:49.727519Z",
      "started_at": "2025-11-10T15:18:49.742455Z",
      "completed_at": "2025-11-10T16:57:53.944338Z"
    },
    {
      "name": "Life Sciences",
      "harvest_url": "https://lifesciences.datastations.nl/oai",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": null,
        "additional_metadata_params": {
          "format": "dataverse_json",
          "endpoint": "https://lifesciences.datastations.nl/api/datasets/export",
          "protocol": "REST_API"
        }
      },
      "code": "DANS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-10T17:01:55.991154Z",
      "started_at": "2025-11-10T17:01:56.007802Z",
      "completed_at": "2025-11-10T17:12:24.449104Z"
    },
    {
      "name": "Physical and Technical Sciences",
      "harvest_url": "https://phys-techsciences.datastations.nl/oai",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": null,
        "additional_metadata_params": {
          "format": "dataverse_json",
          "endpoint": "https://phys-techsciences.datastations.nl/api/datasets/export",
          "protocol": "REST_API"
        }
      },
      "code": "DANS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-10T17:19:04.432023Z",
      "started_at": "2025-11-10T17:19:04.449715Z",
      "completed_at": "2025-11-10T17:28:06.530949Z"
    },
    {
      "name": "SwissUbase",
      "harvest_url": "https://www.swissubase.ch/oai-pmh/v1/oai",
      "harvest_params": {
        "metadata_prefix": "oai_ddi25",
        "set": null,
        "additional_metadata_params": null
      },
      "code": "SWISS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-12T09:14:23.958454Z",
      "started_at": "2025-11-12T09:14:24.003355Z",
      "completed_at": "2025-11-12T09:19:06.827129Z"
    },
    {
      "name": "Archaeology Data Station",
      "harvest_url": "https://archaeology.datastations.nl/oai",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": null,
        "additional_metadata_params": {
          "format": "dataverse_json",
          "endpoint": "https://archaeology.datastations.nl/api/datasets/export",
          "protocol": "REST_API"
        }
      },
      "code": "DANS",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-12T10:21:30.215685Z",
      "started_at": "2025-11-12T10:21:30.243403Z",
      "completed_at": "2025-11-12T16:18:10.405918Z"
    },
    {
      "name": "DABAR",
      "harvest_url": "https://dabar.srce.hr/oai/",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": [
          "openaire"
        ],
        "additional_metadata_params": {
          "format": "mods",
          "endpoint": "https://dabar.srce.hr/oai/",
          "protocol": "OAI-PMH"
        }
      },
      "code": "DABAR",
      "protocol": "OAI-PMH",
      "from_date": "2025-11-17T12:19:37.249864Z",
      "until_date": "2025-12-11T09:35:30.248703Z",
      "started_at": "2025-12-11T09:35:30Z",
      "completed_at": "2025-12-11T09:35:34Z"
    },
    {
      "name": "Onedata",
      "harvest_url": "https://demo.onedata.org/oai_pmh",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": [
          "a842ea97ec1855a54bf77a90e915cac7cha3ab"
        ],
        "additional_metadata_params": null
      },
      "code": "ONE",
      "protocol": "OAI-PMH",
      "from_date": null,
      "until_date": "2025-11-17T12:24:20.022003Z",
      "started_at": "2025-11-17T12:24:20Z",
      "completed_at": "2025-11-17T12:24:20Z"
    },
    {
      "name": "DABAR",
      "harvest_url": "https://dabar.srce.hr/oai/",
      "harvest_params": {
        "metadata_prefix": "oai_datacite",
        "set": [
          "openaire"
        ],
        "additional_metadata_params": {
          "format": "mods",
          "endpoint": "https://dabar.srce.hr/oai/",
          "protocol": "OAI-PMH"
        }
      },
      "code": "DABAR",
      "protocol": "OAI-PMH",
      "from_date": "2025-11-17T12:19:37.249864Z",
      "until_date": "2025-12-11T09:35:30.248703Z",
      "started_at": "2025-12-11T09:35:30Z",
      "completed_at": "2025-12-11T09:35:34Z"
    }
  ]
}
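
The "DABAR" endpoint appears twice in the sample output with identical values. Whether that is intended is not stated here; a consumer could detect such repeats with a small check like the one below. This is a sketch only: the function name `duplicate_endpoints` and the choice of (code, name, harvest_url) as the identifying key are assumptions.

```python
from collections import Counter


def duplicate_endpoints(payload: dict) -> list:
    """Return the (code, name, harvest_url) keys that occur more than once."""
    keys = [
        (entry.get("code"), entry.get("name"), entry.get("harvest_url"))
        for entry in payload.get("endpoints_configs", [])
    ]
    return [key for key, count in Counter(keys).items() if count > 1]


# Reduced version of the sample output: only the fields used as key.
sample = {
    "endpoints_configs": [
        {"code": "SWISS", "name": "SwissUbase",
         "harvest_url": "https://www.swissubase.ch/oai-pmh/v1/oai"},
        {"code": "DABAR", "name": "DABAR",
         "harvest_url": "https://dabar.srce.hr/oai/"},
        {"code": "ONE", "name": "Onedata",
         "harvest_url": "https://demo.onedata.org/oai_pmh"},
        {"code": "DABAR", "name": "DABAR",
         "harvest_url": "https://dabar.srce.hr/oai/"},
    ]
}

print(duplicate_endpoints(sample))
```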

@tobiasschweizer tobiasschweizer self-assigned this Feb 11, 2026
@tobiasschweizer tobiasschweizer added the enhancement New feature or request label Feb 11, 2026
@tobiasschweizer
Collaborator Author

@Michal-Kolomanski Would this PR provide what you need?
