This project demonstrates an approach that combines S3 multipart upload and Step Functions (Distributed Map) to concurrently download a large file, up to the S3 object size limit of 5 TB (at most 10,000 parts of up to 5 GB each), from any given URL (the server must support HTTP Range requests) and upload it to an S3 bucket.
This project also demonstrates a way to host the source code in CodeCommit and deploy it via CDK Pipelines.
- Partitioner: A Python Lambda that takes `URL` and `SingleTaskSize` as input and fetches the total size of the file at the given URL. Based on the given single task size, it splits the upload into smaller tasks and passes them to the next state.
- Uploader: A Python Lambda, triggered by Step Functions, that uses HTTP Range requests to download a portion of the file and upload it to S3 using multipart upload.
- Step Functions: A state machine that handles task validation, fan-out, retry and error handling, and also creates, completes and aborts the S3 multipart upload.
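As a rough illustration of how the two Lambdas fit together, the sketch below shows a partition step that derives byte-range tasks from the file's `Content-Length` and an upload step that streams one range into a multipart upload part. The function names, the use of `urllib`/`boto3`, and the task shape are assumptions for illustration, not the project's actual handler code.

```python
# Sketch only: hypothetical helpers approximating the Partitioner and Uploader logic.
import urllib.request
import boto3

s3 = boto3.client("s3")


def partition(url: str, single_task_size: int) -> list[dict]:
    """Split the remote file into byte-range tasks (Partitioner-style logic)."""
    # HEAD request to learn the total file size; the server must support Range requests.
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        total_size = int(resp.headers["Content-Length"])

    tasks = []
    part_number = 1
    for start in range(0, total_size, single_task_size):
        end = min(start + single_task_size, total_size) - 1  # Range header is inclusive
        tasks.append({"URL": url, "Start": start, "End": end, "PartNumber": part_number})
        part_number += 1
    return tasks


def upload_range(task: dict, bucket: str, key: str, upload_id: str) -> dict:
    """Download one byte range and upload it as a multipart part (Uploader-style logic)."""
    req = urllib.request.Request(
        task["URL"], headers={"Range": f"bytes={task['Start']}-{task['End']}"}
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()

    part = s3.upload_part(
        Bucket=bucket,
        Key=key,
        PartNumber=task["PartNumber"],
        UploadId=upload_id,
        Body=body,
    )
    return {"PartNumber": task["PartNumber"], "ETag": part["ETag"]}
```

In the actual project, the state machine creates the multipart upload before fanning out the tasks and completes (or aborts) it afterwards, as described above.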
Simply run `make test` to run lint and the unit tests for the Partitioner and Uploader.
- An AWS IAM user with sufficient permissions to deploy:
  - CodeCommit
  - CodeBuild
  - CodePipeline
  - Step Functions
  - Lambda
  - S3
- Set up `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION` and `CDK_DEFAULT_ACCOUNT` in a `.env` file.
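A minimal `.env` might look like the following; the values are placeholders, not real credentials or account IDs.

```
AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
AWS_DEFAULT_REGION=us-east-1
CDK_DEFAULT_ACCOUNT=123456789012
```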
This project uses AWS CodeCommit to host the source code and CDK Pipelines to deploy. Simply run `make ci-deploy` to run
lint and build, create a new repository in CodeCommit, push the source code, and deploy the project's CDK Pipeline.
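For context, a self-mutating CDK Pipeline sourced from CodeCommit typically looks something like the sketch below. The stack name, repository name, branch, and synth commands are illustrative assumptions, not this project's actual pipeline definition.

```python
# Sketch only: a hypothetical CDK Pipelines stack sourced from CodeCommit (CDK v2, Python).
from aws_cdk import Stack, pipelines
from aws_cdk import aws_codecommit as codecommit
from constructs import Construct


class PipelineStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Assumes the repository created by `make ci-deploy`; the name is a placeholder.
        repo = codecommit.Repository.from_repository_name(
            self, "Repo", "s3-multipart-uploader"
        )

        pipelines.CodePipeline(
            self,
            "Pipeline",
            synth=pipelines.ShellStep(
                "Synth",
                input=pipelines.CodePipelineSource.code_commit(repo, "main"),
                commands=[
                    "npm install -g aws-cdk",
                    "pip install -r requirements.txt",
                    "cdk synth",
                ],
            ),
        )
```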
Below is an example Step Functions payload that uploads the AWS CLI installer file to S3.
```json
{
  "URL": "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip",
  "SingleTaskSize": 6000000
}
```
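One way to start an execution with this payload is via boto3, as in the sketch below; the state machine ARN is a placeholder and should be replaced with the one deployed by this project.

```python
# Sketch only: start a state machine execution with the example payload.
import json
import boto3

sfn = boto3.client("stepfunctions")

response = sfn.start_execution(
    # Placeholder ARN; use the state machine deployed by this project.
    stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:MultipartDownloadUploader",
    input=json.dumps(
        {
            "URL": "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip",
            "SingleTaskSize": 6000000,
        }
    ),
)
print(response["executionArn"])
```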