AWS Lambda function that automatically backs up Google Drive files to Amazon S3.
This system backs up files from a specific Google Drive folder to S3 on a daily schedule. Two implementations are available: a simple folder sync (currently deployed) and an advanced incremental sync (available for future enhancement).
File: lambda_function.py
- Folder Sync: Backs up the "Mac Backup" folder from Google Drive
- Scheduled Execution: Runs daily at 6 AM Eastern via CloudWatch Events
- Duplicate Prevention: Checks if files exist in S3 before uploading
- Folder Structure: Maintains original Drive folder hierarchy in S3
- Secure Storage: Credentials stored in AWS Secrets Manager
- Binary Files Only: Skips Google Workspace files (Docs, Sheets, Slides)
The current system follows this flow:
- CloudWatch Events triggers Lambda daily at 6 AM Eastern
- Lambda authenticates using credentials from Secrets Manager
- Lambda scans "Mac Backup" folder in Google Drive
- Lambda downloads files that don't already exist in S3
- Lambda uploads files to S3 maintaining folder structure
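The skip-and-deduplicate part of this flow can be sketched roughly as follows. This is an illustrative outline, not the code in lambda_function.py: the helper names (`is_binary_file`, `s3_key_for`, `object_exists`, `plan_uploads`) and the shape of the `files` dictionaries are assumptions. The S3 client is passed in so the logic can be exercised without AWS access.

```python
from typing import Iterable, List, Tuple

# Google Workspace items (Docs, Sheets, Slides, and also folders) share this MIME prefix.
GOOGLE_WORKSPACE_PREFIX = "application/vnd.google-apps."


def is_binary_file(mime_type: str) -> bool:
    """True for ordinary uploaded files, False for Google Workspace items."""
    return not mime_type.startswith(GOOGLE_WORKSPACE_PREFIX)


def s3_key_for(prefix: str, folder_parts: List[str], name: str) -> str:
    """Build an S3 key that mirrors the Drive folder hierarchy."""
    parts = [prefix.strip("/"), *folder_parts, name]
    return "/".join(p for p in parts if p)


def object_exists(s3, bucket: str, key: str) -> bool:
    """Check for an existing object with a HEAD request.

    boto3 raises botocore's ClientError for a missing key; the error code is
    read from the exception's `response` attribute so that this sketch does
    not need to import botocore.
    """
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except Exception as exc:
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise


def plan_uploads(s3, bucket: str, prefix: str,
                 files: Iterable[dict]) -> Tuple[List[str], int]:
    """Return (keys to upload, number of files skipped).

    Each item in `files` is assumed to look like
    {"name": ..., "mimeType": ..., "path": [folder names]}.
    """
    to_upload, skipped = [], 0
    for f in files:
        if not is_binary_file(f["mimeType"]):
            skipped += 1  # Google Workspace file: not backed up by the simple sync
            continue
        key = s3_key_for(prefix, f["path"], f["name"])
        if object_exists(s3, bucket, key):
            skipped += 1  # already in S3
            continue
        to_upload.append(key)
    return to_upload, skipped
```

Inside Lambda the client would typically be `boto3.client("s3")`. Note that a HEAD request is authorized by `s3:GetObject`, and S3 reports a missing key as 403 rather than 404 when the role lacks `s3:ListBucket` on the bucket.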
File: drive_to_s3_backup.py
- Incremental Sync: Uses Google Drive Changes API to only process modified files
- Format Conversion: Converts Google Docs/Sheets/Slides to .docx/.xlsx/.pptx
- State Management: Tracks sync progress via SSM Parameter Store
- Stable S3 Keys: Prevents file duplication with consistent naming
- Full Drive Sync: Can sync entire Drive, not just specific folders
- CloudWatch Events triggers Lambda on schedule
- Lambda authenticates using credentials from Secrets Manager
- Lambda checks SSM Parameter Store for last sync state
- Lambda queries Google Drive Changes API for modified files
- Lambda downloads and converts files, then uploads to S3
- Lambda updates sync state in SSM Parameter Store
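As a rough illustration of the incremental flow above, the loop below reads the saved page token from SSM, pages through the Drive Changes API, and stores the new start token once the last page is reached. It is a sketch under assumptions, not the code in drive_to_s3_backup.py: `sync_changes` and `handle_change` are hypothetical names, `drive` is expected to behave like a Drive v3 service object from google-api-python-client, and `ssm` like a boto3 SSM client.

```python
from typing import Callable


def sync_changes(drive, ssm, param_name: str,
                 handle_change: Callable[[dict], None]) -> int:
    """Process all changes since the saved page token; return how many were handled."""
    # Last sync state, e.g. stored under "/drivesync/startPageToken".
    token = ssm.get_parameter(Name=param_name)["Parameter"]["Value"]
    processed = 0

    while token:
        resp = drive.changes().list(
            pageToken=token,
            fields="nextPageToken,newStartPageToken,"
                   "changes(fileId,removed,file(name,mimeType))",
        ).execute()

        for change in resp.get("changes", []):
            if change.get("removed"):
                continue  # file deleted or access lost; nothing to download
            handle_change(change)  # download, convert, upload (hypothetical callback)
            processed += 1

        if "newStartPageToken" in resp:
            # Last page: persist the token so the next run starts from here.
            ssm.put_parameter(
                Name=param_name,
                Value=resp["newStartPageToken"],
                Type="String",
                Overwrite=True,
            )
            break
        token = resp.get("nextPageToken")

    return processed
```

Saving the token only after the final page means a run that fails partway reprocesses the same changes next time instead of losing them. The very first token would come from `drive.changes().getStartPageToken().execute()`.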
- Install dependencies:
  pip install -r requirements.txt
- Set up AWS resources:
  - Create S3 bucket: google-drivesync-backup
  - Store Google OAuth credentials in Secrets Manager
  - Create SSM parameter for state tracking
- Deploy to Lambda:
  zip -r function.zip drive_to_s3_backup.py
  # Upload to AWS Lambda
SECRET_ID = "drivesync/google-oauth" # Secrets Manager secret ID
SSM_PARAM = "/drivesync/startPageToken" # SSM parameter name
S3_BUCKET = "google-drivesync-backup" # S3 bucket name
S3_PREFIX = "drivesync" # S3 key prefix
Store credentials in AWS Secrets Manager as JSON:
{
  "token": {
    "refresh_token": "your_refresh_token",
    "client_id": "your_client_id",
    "client_secret": "your_client_secret",
    "token_uri": "https://oauth2.googleapis.com/token",
    "scopes": ["https://www.googleapis.com/auth/drive.readonly"]
  }
}
- Google Docs → .docx (Word)
- Google Sheets → .xlsx (Excel)
- Google Slides → .pptx (PowerPoint)
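The conversions listed above correspond to Drive export MIME types. A minimal lookup might look like the following. The MIME type strings are the standard Google Workspace and Office Open XML identifiers, while `EXPORT_FORMATS` and `export_target` are hypothetical names used only for illustration.

```python
from typing import Optional, Tuple

# Google Workspace MIME type -> (export MIME type, file extension)
EXPORT_FORMATS = {
    "application/vnd.google-apps.document": (
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        ".docx",
    ),
    "application/vnd.google-apps.spreadsheet": (
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        ".xlsx",
    ),
    "application/vnd.google-apps.presentation": (
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
        ".pptx",
    ),
}


def export_target(name: str, mime_type: str) -> Optional[Tuple[str, str]]:
    """Return (export MIME type, output filename), or None if no conversion applies."""
    fmt = EXPORT_FORMATS.get(mime_type)
    if fmt is None:
        return None
    export_mime, ext = fmt
    # Avoid doubling the extension if the Drive title already ends with it.
    out_name = name if name.lower().endswith(ext) else name + ext
    return export_mime, out_name
```

With google-api-python-client, the returned export MIME type would be passed to `drive.files().export_media(fileId=..., mimeType=export_mime)`. Files for which `export_target` returns `None` are downloaded as-is.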
s3://bucket/drivesync/folder/filename__file_id
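A key in this format could be assembled as shown below. This is a sketch that assumes the layout is exactly `prefix/folder/filename__file_id`; `stable_s3_key` is a hypothetical helper, and the real script may treat extensions or special characters differently.

```python
def stable_s3_key(prefix: str, folder_path: str, filename: str, file_id: str) -> str:
    """Build an S3 key of the form prefix/folder/filename__file_id."""
    parts = [
        prefix.strip("/"),
        folder_path.strip("/"),
        f"{filename}__{file_id}",  # Drive file ID suffix keeps same-named files apart
    ]
    # Drop empty segments so files at the Drive root do not produce "//".
    return "/".join(p for p in parts if p)
```

For example, `stable_s3_key("drivesync", "folder", "filename", "file_id")` yields `drivesync/folder/filename__file_id`. Because Drive allows several files with the same name in one folder, the ID suffix prevents them from overwriting each other in S3.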
Lambda execution role needs:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "secretsmanager:GetSecretValue",
        "ssm:GetParameter",
        "ssm:PutParameter"
      ],
      "Resource": "*"
    }
  ]
}
Set up CloudWatch Events:
- Hourly: rate(1 hour)
- Every 6 hours: cron(0 */6 * * ? *)
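One way to apply a schedule expression programmatically is the CloudWatch Events `put_rule` call. The sketch below is illustrative only: `schedule_backup` and the rule name in the usage note are hypothetical, and the client is passed in rather than created here.

```python
# Schedule expressions for CloudWatch Events rules are evaluated in UTC.
HOURLY = "rate(1 hour)"
EVERY_6_HOURS = "cron(0 */6 * * ? *)"
# 10:00 UTC corresponds to 6 AM Eastern while daylight saving time is in effect.
DAILY_10_UTC = "cron(0 10 * * ? *)"


def schedule_backup(events, rule_name: str, expression: str) -> str:
    """Create or update a scheduled rule and return its ARN."""
    resp = events.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )
    return resp["RuleArn"]
```

A call might look like `schedule_backup(boto3.client("events"), "drivesync-schedule", EVERY_6_HOURS)`. The rule still needs the Lambda function attached as a target, and the function needs permission to be invoked by the rule.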
{
  "status": "ok",
  "uploaded": 5,
  "skipped": 12
}
- File processing details
- Error messages
- Performance metrics
- google-auth: Google API authentication
- google-api-python-client: Drive API client
- boto3: AWS SDK
Part of a complete file management system:
- mac-to-google-drive - Mac → Google Drive sync
- synagogue-file-search - Web search interface
MIT License