Open
Description
TODOs for PoC:
- complete endpoint configuration information for crawler config:
  - unique endpoint shortname (`repository_suffix`)
  - define `metadata_prefix` (in `harvest_params`?)
- make FastAPI route for
  - reading OAI-PMH endpoint config from tables `endpoints` and `repositories`. This would generate the config files that are currently under version control in https://github.com/EOSC-Data-Commons/metadata-harvester/tree/master/repos_config
  - pushing OAI-PMH harvesting results and additional metadata to table `harvest_events`: columns `raw_metadata` and `additional_metadata`. Ideally, this would accept batches of data as the crawler gets them.
- clarify whether `src/utils/normalize_datacite_json.py` could be moved to https://github.com/EOSC-Data-Commons/metadata-harvester: for development, it is convenient to map the file into the transform container using a bind mount (automatic reload without container rebuild). Maybe `src/utils/normalize_datacite_json.py` could be part of a library which could be used in the celery task.
- use a different Python client for PostgreSQL like https://github.com/psycopg/psycopg2 and possibly a declarative class mapping tool such as SQLAlchemy, see #19 (Use SQLAlchemy)
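To make the first two TODOs a bit more concrete, here is a minimal, standard-library-only sketch of what the crawler-side data could look like. It is not the actual schema: the `EndpointConfig` class, its `url` field, and the helpers `validate_endpoints` and `in_batches` are hypothetical names, and it assumes `metadata_prefix` ends up inside `harvest_params`, which is still an open question above. It is deliberately independent of FastAPI and of the database client.

```python
from dataclasses import dataclass, field
from itertools import islice
from typing import Any, Iterable, Iterator


@dataclass(frozen=True)
class EndpointConfig:
    """One OAI-PMH endpoint as the crawler might receive it (hypothetical shape)."""

    repository_suffix: str  # unique endpoint shortname
    url: str  # OAI-PMH base URL (assumed field name)
    harvest_params: dict[str, Any] = field(default_factory=dict)

    @property
    def metadata_prefix(self) -> str:
        # Assumption: metadata_prefix is stored inside harvest_params.
        return self.harvest_params["metadata_prefix"]


def validate_endpoints(endpoints: Iterable[EndpointConfig]) -> list[EndpointConfig]:
    """Reject duplicate shortnames and endpoints without a metadata_prefix."""
    seen: set[str] = set()
    checked: list[EndpointConfig] = []
    for ep in endpoints:
        if ep.repository_suffix in seen:
            raise ValueError(f"duplicate repository_suffix: {ep.repository_suffix}")
        if not ep.harvest_params.get("metadata_prefix"):
            raise ValueError(f"missing metadata_prefix for {ep.repository_suffix}")
        seen.add(ep.repository_suffix)
        checked.append(ep)
    return checked


def in_batches(records: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield lists of at most `size` records so results can be pushed as they arrive."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch
```

Each batch would then be one request body for the push route, with every record carrying `raw_metadata` and `additional_metadata`. Whether the route accepts a list per request or one record at a time is still to be decided.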