add general overview of data organization on AnVIL? #16

@KatherineCox

After looking over the intro, I have a more general question about the book: what do you think about having a chapter up front about how data is organized on AnVIL?

  • Explain where data "lives" (Workspace buckets) and who pays for data storage
  • Explain how you can link to data files without copying them (saving on storage fees), and how cloning a Workspace works - what happens to the data?
  • Discuss data tables - why you would want them, link out to info about how to create/manage them
  • Explain a bit about egress and requester-pays - give people an idea of when/why they would accrue costs for using data
  • Possibly discuss the idea of having "Data" Workspaces (that are just used for holding and sharing data, very few people have write access) and "Analysis" Workspaces (clone the Data Workspace and do whatever you want with the data). This is more advice than "how-to", but could be a useful idea for people to have in their heads.
  • Briefly discuss Workspace sharing and authorization domains (and link out for details)

I don't think this would need to go into a whole lot of detail - we can link out to Terra Docs or to other chapters in the book. But it could be helpful to provide a big-picture overview of where data lives and how it gets moved around and organized - general concepts that would provide a useful baseline across the subsequent chapters.
