To run the containers:

- `POSTGRES_ADDRESS` (default "postgres") and `POSTGRES_PORT` (default 5432)
- `OPENSEARCH_ADDRESS` (default "opensearch") and `OPENSEARCH_PORT` (default 9200)
- `FASTAPI_ADDRESS` (default "127.0.0.1") and `FASTAPI_PORT` (default 8080)
- API keys for the search API server:
  ```sh
  cp keys.env.template keys.env
  ```
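
For reference, a minimal `.env` built from the defaults listed above might look like the sketch below; only the variables documented above are shown, and the credential variable names referenced in the pgAdmin section are not documented in this excerpt:

```sh
# Sketch of a minimal .env using the documented defaults.
POSTGRES_ADDRESS=postgres
POSTGRES_PORT=5432
OPENSEARCH_ADDRESS=opensearch
OPENSEARCH_PORT=9200
FASTAPI_ADDRESS=127.0.0.1
FASTAPI_PORT=8080
# Postgres credentials are also defined in .env (see the pgAdmin section);
# their exact variable names are not shown in this excerpt.
```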
- create OpenSearch index, see below.
- run transformation process, see below.

## pgAdmin

- when using pgAdmin, register a new server with `Host name` "postgres" (container name in docker network) and port "5432".
- provide credentials as defined in `.env`.
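
As a quick connectivity check without pgAdmin (an assumption, not part of the repo's documented tooling), you can run `psql` inside the running container; the user placeholder below is hypothetical and must match the credentials in `.env`:

```sh
# Hypothetical check: list tables via psql inside the "postgres" container.
# Replace $POSTGRES_USER with the user configured in .env.
docker exec -it postgres psql -U "$POSTGRES_USER" -c '\dt'
```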

# Basic Setup

- ```shell
  cd scripts
  ```
- Install [uv](https://docs.astral.sh/uv/) and run
  ```sh
  uv sync
  ```

## Create Postgres DB and Load and Transform Data

- ```sh
  cd scripts/postgres_data
  ```

- create table structure and repo config as defined in `scripts/postgres_data/create_sql`
  (to start from scratch, you have to remove the tables first with [DROP](https://www.postgresql.org/docs/current/sql-droptable.html)):
  ```sh
  uv run create_db.py
  ```
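
A sketch of such a reset, assuming the same in-container `psql` approach as above; `harvest_events` is the only table name documented in this excerpt, so further tables defined in `scripts/postgres_data/create_sql` may need dropping as well:

```sh
# Hypothetical reset: drop the documented table before re-running create_db.py.
# Other project tables (see scripts/postgres_data/create_sql) may exist too.
docker exec -it postgres psql -U "$POSTGRES_USER" -c 'DROP TABLE IF EXISTS harvest_events;'
```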

- load XML data from `scripts/postgres_data/data` (populates table `harvest_events`):
  ```sh
  uv run import_data.py
  ```
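
To confirm the load worked, one option (again an assumption, using direct `psql` access rather than the documented workflow) is to count the imported rows:

```sh
# Count the OAI-PMH records registered as harvest events.
docker exec -it postgres psql -U "$POSTGRES_USER" -c 'SELECT count(*) FROM harvest_events;'
```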

- transform data from `scripts/postgres_data/data` to a local dir
  (to test transformation, as an alternative to using the Celery process):
  ```sh
  uv run transform.py -i harvests_{repo_suffix} -o {repo_suffix}_json -s JSON_schema_file [-n]
  ```
  If the `-n` flag is provided, the JSON data will also be normalized and validated against the JSON schema file `utils/schema.json`.
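
As a concrete example, a run for a single repository might look like the sketch below; the repo suffix `dabar` is an assumption derived from the endpoint used later in this README (https://dabar.srce.hr/oai), and passing `utils/schema.json` as the schema file is likewise an assumption based on the note above:

```sh
# Transform the harvested records for one repository, normalizing and
# validating the output against the JSON schema.
uv run transform.py -i harvests_dabar -o dabar_json -s utils/schema.json -n
```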

## Create OpenSearch Index

- ```sh
  cd scripts/opensearch_data
  ```

- create `test_datacite` index (deletes existing `test_datacite` index):

  ```sh
  uv run create_index.py
  ```
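
To verify the index was created, one option (an assumption, not part of the repo's scripts) is to query OpenSearch directly; this assumes the default `OPENSEARCH_PORT` 9200 is published to the host:

```sh
# Show settings and mappings of the freshly created index.
curl http://127.0.0.1:9200/test_datacite
```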

The transformer container provides an [API](http://127.0.0.1:8080/docs) to start the transformation and indexing process.

A transformation requires a `harvest_run_id`.
When running the script `import_data.py` (scripts/postgres_data/data),
a harvest run is created for each endpoint, the individual OAI-PMH records are registered as harvest events,
and the harvest run is then closed. Note that a transformation can only be performed for a closed harvest run.

- To obtain a harvest run id and status for a given endpoint (https://dabar.srce.hr/oai):
  ```sh
  http://127.0.0.1:8080/harvest_run?harvest_url=https%3A%2F%2Fdabar.srce.hr%2Foai
  ```

- start transformation process:
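
The request itself is cut off in this excerpt. As a purely hypothetical sketch, assuming an endpoint that takes the `harvest_run_id` obtained above (the route name and HTTP method are assumptions; the actual route is listed in the [API docs](http://127.0.0.1:8080/docs)):

```sh
# Hypothetical request; consult http://127.0.0.1:8080/docs for the real route.
# Replace <id> with the harvest_run_id returned by the query above.
curl -X POST "http://127.0.0.1:8080/transform?harvest_run_id=<id>"
```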