
Updates for sendit development version 2.0 #15

@vsoch

Description

Sendit Base

We want to be able to quickly deploy the main application Dockerfile, or entirely different tools that use the same Google APIs, off of this base. The Dockerfile for sendit then only needs to update the (regularly changing) libraries:

FROM pydicom/sendit-base

# update deid
WORKDIR /opt
RUN git clone -b development https://github.com/pydicom/deid
WORKDIR /opt/deid
RUN python setup.py install

# som
WORKDIR /opt
RUN git clone https://github.com/vsoch/som
WORKDIR /opt/som
RUN python setup.py install

# add the sendit application code
WORKDIR /code
ADD . /code/
EXPOSE 3031

CMD /code/run_uwsgi.sh

No tar.gz

  • Images should be represented at the level of individual dicoms
    This makes a lot of sense to me in terms of metadata - we want to represent metadata about images, not about zipped-up archives that need to be unzipped first. We can also very easily view a dicom in a browser from a URL, and this isn't the case with .tar.gz (unless it's another format like nifti). See the sketch below.
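
To illustrate why per-dicom representation is convenient: header metadata can be read straight from a single file, with no unpacking step. A minimal sketch with pydicom; the file name is a placeholder:

from pydicom import dcmread

# read only the header, skipping the (large) pixel data
ds = dcmread("image.dcm", stop_before_pixels=True)
print(ds.PatientID, ds.StudyInstanceUID)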

User Friendly Config File

If we can imagine one day deploying a sendit instance for a researcher, the configuration needs to be easy and stupid. The harder part is generating the deid recipe, but the rest should come down to reading a file that gets integrated into their custom build and then drives the application. It might also make sense to represent the config in the database as a model, so that one instance can have several (with the input folders for each defined at creation) and changes can be made without stopping or restarting the application; a rough sketch follows.
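
As a rough sketch of the model idea (the class and field names here are hypothetical, not an actual sendit schema), each row would describe one configuration, so several can coexist and be toggled without a restart:

from django.db import models

class SenditConfig(models.Model):
    """One deployment configuration; hypothetical fields for illustration."""
    name = models.CharField(max_length=250, unique=True)
    input_folder = models.CharField(max_length=500)  # folder watched for new dicoms
    recipe = models.TextField()                      # the deid recipe content
    active = models.BooleanField(default=True)       # flip without a restart
    created_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.name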

Som BigQuery Client

  • Implement a BigQuery client in som-tools, use it for metadata
    We would want to use BigQuery instead of Datastore. This is ready to go and needs testing (a streaming-insert sketch is below).
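
For reference, a minimal streaming-insert sketch with the google-cloud-bigquery library; the project, dataset, table, and row fields are placeholders, not the actual som schema:

from google.cloud import bigquery

client = bigquery.Client()

# one row of (placeholder) image metadata
rows = [{"entity_id": "12345", "item_id": "abc", "StudyDate": "20170601"}]

# stream rows into a (hypothetical) table; returns a list of per-row errors
errors = client.insert_rows_json("my-project.dicom.metadata", rows)
if errors:
    print("Insert failed: %s" % errors)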

Testing

I want to do the following tests to generally settle on a "move images" and "move metadata" strategy. It comes down to testing batched uploads (in sync) and batched uploads (images separate from metadata) vs. rsync (riskier, but a lot faster according to others); a small timing harness is sketched after this list.

  • Test speed with bigquery + metadata + storage
  • Test speed with caching metadata + storage
  • If transfers are still slow, investigate rsync
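
A rough timing harness for the comparison; upload_batched and upload_rsync_style are hypothetical stand-ins for the strategies above, not existing sendit code:

import time

def timed(label, fn, *args):
    # run one upload strategy and report wall-clock time
    start = time.time()
    fn(*args)
    print("%s: %.1f seconds" % (label, time.time() - start))

# timed("bigquery + metadata + storage", upload_batched, batch)
# timed("rsync-style transfer", upload_rsync_style, batch)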

Changes for Dasher

  • changes to the dasher endpoint (session?)
    I'll leave it to Susan to ping me when we absolutely need changes.

Note - this is still a Stanford-hosted server, with no PHI in the cloud.
