Sendit Base
- Create sendit base image to save building time ready for testing.
We want to be able to quickly build the main application image, or entirely different tools that share the same google APIs, on top of this base. The Dockerfile for sendit then only has to update the (regularly changing) libraries:
FROM pydicom/sendit-base
# update deid
WORKDIR /opt
RUN git clone -b development https://github.com/pydicom/deid
WORKDIR /opt/deid
RUN python setup.py install
# som
WORKDIR /opt
RUN git clone https://github.com/vsoch/som
WORKDIR /opt/som
RUN python setup.py install
WORKDIR /code
ADD . /code/
CMD /code/run_uwsgi.sh
EXPOSE 3031
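For completeness, the base image itself might look something like the sketch below. This is illustrative only: the actual sendit-base Dockerfile may pin different system packages and python versions.

```dockerfile
# Hypothetical sendit-base: heavy, rarely-changing dependencies baked in once
FROM python:3.6

# System dependencies (package names here are illustrative)
RUN apt-get update && apt-get install -y build-essential git \
    && rm -rf /var/lib/apt/lists/*

# Google API clients shared by sendit and any sibling tools built on this base
RUN pip install google-cloud-storage google-cloud-bigquery uwsgi

RUN mkdir -p /code /opt
```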
No targuzz
- Images should be represented on the level of dicoms
This makes a lot of sense to me in terms of metadata - we want to represent metadata about images, not about zipped-up things that need to be unzipped first. We can also very easily view a dicom in a browser from a url, and this isn't the case with a .tar.gz (unless it's another format like nifti).
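Concretely, the per-dicom representation could be as simple as one metadata record per .dcm file rather than one record per archive. A minimal sketch (the field names are illustrative, not the real sendit schema):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DicomRecord:
    """One metadata record per dicom file (illustrative fields)."""
    path: str        # e.g. a storage url that renders directly in the browser
    entity_id: str   # the (anonymized) entity the image belongs to

def records_for_batch(entity_id: str, paths: List[str]) -> List[DicomRecord]:
    """Represent images at the level of individual dicoms, not one archive."""
    return [DicomRecord(path=p, entity_id=entity_id)
            for p in paths if p.endswith(".dcm")]
```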
User Friendly Config File
If we can see some day deploying a sendit instance for a researcher, the configuration needs to be easy and stupid. The harder part is generation of the deid recipe; for the rest, it should come down to reading a file that gets integrated into their custom build and then drives the application. It might also make sense to represent the config in the database as a model. That way one instance can have several configs (with the input folders for each defined at creation time), and changes can be made without stopping and restarting the application.
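As a sketch of what "easy and stupid" could look like: a flat json file merged over safe defaults at startup. The keys here are invented for illustration and are not the actual sendit settings:

```python
import json
from pathlib import Path

# Illustrative defaults - not the real sendit configuration keys
DEFAULTS = {
    "input_folder": "/data",
    "deid_recipe": "deid.dicom",
    "send_to_storage": True,
}

def load_config(path):
    """Merge a user's json config over defaults; unknown keys are ignored."""
    config = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        user = json.loads(p.read_text())
        config.update({k: v for k, v in user.items() if k in DEFAULTS})
    return config
```

Ignoring unknown keys keeps a researcher's typo from silently injecting settings the application never reads.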
Som BigQuery Client
- Implement a bigquery client in som-tools, use it for metadata
We would want to use BigQuery instead of Datastore. This is ready to go and needs testing.
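A hedged sketch of the shape this could take. The pure row-building part is shown runnable; the actual insert (via google-cloud-bigquery) is left as a comment since it needs credentials, and the field names are illustrative rather than the real som schema:

```python
def image_rows(entity_id, images):
    """Flatten per-image metadata dicts into rows for a bigquery table.

    entity_id: the (anonymized) entity the images belong to
    images: list of dicts of per-image metadata
    """
    rows = []
    for image in images:
        row = {"entity_id": entity_id}
        # bigquery wants json-serializable scalars; stringify for simplicity
        row.update({k: str(v) for k, v in image.items()})
        rows.append(row)
    return rows

# With credentials in place, insertion would look roughly like:
# from google.cloud import bigquery
# client = bigquery.Client()
# errors = client.insert_rows_json("project.dataset.images",
#                                  image_rows(entity_id, images))
```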
Testing
I want to do the following tests to generally get a "move images" and "move metadata" strategy. It comes down to testing batched uploads (images and metadata in sync) vs. batched uploads (images separated from metadata) vs. rsync (more risky but a lot faster, according to others).
- Test speed with bigquery + metadata + storage
- Test speed with caching metadata + storage
- If timing is still slow, investigate rsync
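Whatever the strategies end up being, the comparison only needs a small harness; something like the following, where the strategy functions themselves are placeholders to be filled in:

```python
import time

def time_strategy(label, fn, *args, **kwargs):
    """Run one upload strategy and report wall-clock seconds."""
    start = time.time()
    result = fn(*args, **kwargs)
    elapsed = time.time() - start
    print("%s: %.2fs" % (label, elapsed))
    return result, elapsed

# Usage (each strategy would be a real upload function):
# time_strategy("bigquery + storage", upload_bigquery_storage, batch)
# time_strategy("cached metadata + storage", upload_cached, batch)
```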
Changes for Dasher
- changes to dasher endpoint (session?)
I'll leave this to Susan to ping me when we absolutely need changes.
Note - this is still a Stanford-hosted server, with no PHI on the cloud