Omar-p/media-feed

Media Feed Project

  • A media sharing platform with React frontend, NestJS backend, and AWS infrastructure.

🏗️ Architecture Overview

Media Flow Diagram

  • media-upload-view-flow.svg

Architecture Diagram

  • infrastructure

Backend (NestJS + TypeScript)

  • Authentication: Simple email/password authentication for user identification
  • Media Management: Presigned URL generation for direct S3 uploads
  • Like System: User interactions with media (like/unlike functionality)
  • Media Processing: Automated media type detection via Lambda functions

Frontend (React + Vite + Tailwind CSS)

  • Authentication: Sign up/Sign in forms
  • Media Upload: Direct-to-S3 uploads using presigned URLs
  • Media Gallery: Paginated media browsing with like/unlike functionality

Infrastructure & Media Processing Flow

  1. User uploads media → Frontend gets presigned URL from backend
  2. File uploaded to S3 → S3 PUT event triggers Lambda function
  3. Lambda processes file → Detects media type (IMAGE/VIDEO/AUDIO) and metadata
  4. Lambda sends to SQS → Queue message with processed metadata
  5. Backend consumes SQS → Persists media data to PostgreSQL database
  6. Media served via CloudFront → Cached delivery with optimized policies per media type
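
Step 5 can be sketched as a small consumer that persists each message at most once. This is only an illustration, assuming a standard SQS queue (which delivers at least once, so duplicates are possible). The message shape, `MediaStore`, and `handleMessage` are hypothetical names, and the in-memory map stands in for the PostgreSQL table.

```typescript
// Hypothetical shape of the message the Lambda puts on the queue.
type MediaType = "IMAGE" | "VIDEO" | "AUDIO";

interface MediaProcessedMessage {
  mediaId: string;
  userId: string;
  mediaType: MediaType;
  key: string; // S3 object key
}

// In-memory stand-in for the PostgreSQL table, keyed by media ID.
class MediaStore {
  private readonly rows = new Map<string, MediaProcessedMessage>();

  // Returns true if a new row was inserted, false if it already existed.
  insertIfAbsent(row: MediaProcessedMessage): boolean {
    if (this.rows.has(row.mediaId)) return false;
    this.rows.set(row.mediaId, row);
    return true;
  }

  count(): number {
    return this.rows.size;
  }
}

// Type guard that rejects bodies missing any expected field.
function isMediaMessage(value: unknown): value is MediaProcessedMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["mediaId"] === "string" &&
    typeof v["userId"] === "string" &&
    typeof v["key"] === "string" &&
    (v["mediaType"] === "IMAGE" || v["mediaType"] === "VIDEO" || v["mediaType"] === "AUDIO")
  );
}

// Parses one raw SQS message body and persists it at most once.
function handleMessage(store: MediaStore, body: string): "inserted" | "duplicate" | "invalid" {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return "invalid";
  }
  if (!isMediaMessage(parsed)) return "invalid";
  return store.insertIfAbsent(parsed) ? "inserted" : "duplicate";
}
```

In a real database the same effect would usually come from a unique constraint on the media ID rather than a lookup in application code.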

Production Infrastructure (AWS)

  • ECS on EC2: Backend deployment behind Nginx
  • RDS PostgreSQL: Managed database with automated backups
  • S3 + CloudFront: Media storage and global CDN distribution
  • Lambda + SQS: Serverless media processing pipeline
  • CloudFormation: Infrastructure as Code with comprehensive templates

Media Processing Architecture

Why Direct S3 Upload?

We offload media uploads directly to Amazon S3 instead of routing them through our backend servers for several critical reasons:

Performance & Scalability

  • Reduced Server Load: Large media files don't consume backend server resources or bandwidth
  • Parallel Processing: Multiple users can upload simultaneously without impacting server performance
  • Global Reach: Uploads go straight to S3 endpoints instead of through a single backend instance; S3 Transfer Acceleration can be enabled if users far from the bucket's region need faster uploads

Reliability & Cost Efficiency

  • No Timeout Issues: Backend servers won't timeout on large file uploads
  • Lower Infrastructure Costs: Eliminates need for high-bandwidth backend instances
  • Built-in Durability: S3 provides 99.999999999% (11 9's) durability out of the box

Security

  • Presigned URLs: Time-limited, secure upload URLs with specific permissions
  • Direct Transfer: Files never touch our servers, reducing attack surface
  • Controlled Access: Backend generates presigned URLs with specific bucket/path restrictions
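
As a rough sketch of the path restriction and time limit described above, the backend could derive the object key and expiry itself rather than accepting them from the client. The key layout `uploads/<userId>/<mediaId>`, the size limit, and the 5-minute TTL are assumptions for illustration only; none of them are stated in this README.

```typescript
// Hypothetical limits; the real values are not stated in this README.
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024;
const URL_TTL_SECONDS = 300;

interface UploadRequest {
  userId: string;
  mediaId: string;
  sizeBytes: number;
}

interface PresignParams {
  key: string;
  expiresInSeconds: number;
}

// Only allow characters that cannot escape the user's key prefix.
const SAFE_ID = /^[A-Za-z0-9_-]+$/;

// Validates the request and returns the key and expiry to presign.
function buildPresignParams(req: UploadRequest): PresignParams {
  if (!SAFE_ID.test(req.userId) || !SAFE_ID.test(req.mediaId)) {
    throw new Error("ids may only contain letters, digits, '_' and '-'");
  }
  if (!Number.isInteger(req.sizeBytes) || req.sizeBytes <= 0 || req.sizeBytes > MAX_UPLOAD_BYTES) {
    throw new Error("file size out of range");
  }
  return {
    key: `uploads/${req.userId}/${req.mediaId}`,
    expiresInSeconds: URL_TTL_SECONDS,
  };
}
```

In a real service these values would be handed to the S3 presigner, for example `getSignedUrl` from `@aws-sdk/s3-request-presigner` with a `PutObjectCommand` and the `expiresIn` option.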

Why Lambda for Media Type Detection?

We validate media types in AWS Lambda rather than trusting client-supplied file extensions or MIME types, for both security and reliability:

Security-First Approach

⚠️  NEVER trust file extensions or client-provided MIME types
  • File Extension Spoofing: Users can easily rename malicious.exe to image.jpg
  • MIME Type Manipulation: Client applications can send false MIME types
  • Malicious Uploads: Attackers often disguise harmful files as legitimate media

Robust Detection Method

Our Lambda function performs deep file inspection:

  1. File Signature Analysis: Reads the first few bytes (magic numbers) to identify true file type

    JPEG: FF D8 FF
    PNG:  89 50 4E 47
    MP4:  66 74 79 70 ("ftyp", found at byte offset 4 rather than at the start of the file)
    
  2. Metadata Extraction: Analyzes file headers and internal structure

  3. Content Validation: Ensures file content matches the detected type

  4. Comprehensive Coverage: Supports IMAGE/VIDEO/AUDIO type detection
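
The signature check in step 1 can be illustrated with a short function. This is a minimal sketch, not the project's Lambda code: `detectMediaType` and `hasBytesAt` are hypothetical names, and the MP3 signature (`ID3`) is added here only so that the AUDIO branch has an example.

```typescript
type DetectedType = "IMAGE" | "VIDEO" | "AUDIO" | "UNKNOWN";

// True when `bytes` contains `signature` starting at `offset`.
function hasBytesAt(bytes: Uint8Array, offset: number, signature: number[]): boolean {
  if (bytes.length < offset + signature.length) return false;
  return signature.every((b, i) => bytes[offset + i] === b);
}

// Classifies a file from its leading bytes only.
function detectMediaType(bytes: Uint8Array): DetectedType {
  // JPEG: FF D8 FF at offset 0
  if (hasBytesAt(bytes, 0, [0xff, 0xd8, 0xff])) return "IMAGE";
  // PNG: 89 50 4E 47 at offset 0
  if (hasBytesAt(bytes, 0, [0x89, 0x50, 0x4e, 0x47])) return "IMAGE";
  // MP4: "ftyp" (66 74 79 70) at offset 4, after the 4-byte box size
  if (hasBytesAt(bytes, 4, [0x66, 0x74, 0x79, 0x70])) return "VIDEO";
  // MP3 with an ID3v2 tag: "ID3" (49 44 33) at offset 0
  if (hasBytesAt(bytes, 0, [0x49, 0x44, 0x33])) return "AUDIO";
  return "UNKNOWN";
}
```

Treating every `ftyp` file as VIDEO is a simplification, since the same container is also used for some audio and image formats; a fuller check would inspect the brand that follows `ftyp`.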

Why Not Backend Validation?

  • Resource Intensive: File analysis requires CPU and memory resources
  • Blocking Operations: Would slow down API responses
  • Scalability: Lambda auto-scales based on upload volume
  • Isolation: Failed processing doesn't impact main application

📋 Prerequisites

🚀 Quick Start Guide

Step 1: Clone and Setup

git clone <repository-url>
cd media-feed

Step 2: Make the LocalStack Initialization Script Executable

chmod +x localstack/init/01-setup-services.sh

Local video demo: 1000017251.mp4

Step 3: Start the Application (Critical Order)

# 1. Clean any previous setup
make clean

# 2. Start all services
make dev

# 3. Wait for all containers to start (60-90 seconds)
# Check status: make status

# 4. Setup AWS infrastructure in LocalStack
make localstack-setup

# 5. CRITICAL: Restart backend to connect to SQS queue
docker compose restart backend

# 6. Verify everything is healthy
make status

Step 4: Inspect LocalStack During Development

make lambda-logs        # View Lambda processing logs
make sqs-messages      # Check processed media metadata
make localstack-status # Verify all AWS resources

🌐 Application URLs

⚙️ Configuration & Environment Variables

Development Environment (.env.development)

The .env.development file contains non-sensitive configuration values only:

  • Database connection details (for local development)
  • LocalStack endpoints and configuration
  • Development server ports and settings

Important: No API keys, secrets, or production credentials are stored in this file. All sensitive values are:

  • Injected via GitHub Actions secrets for CI/CD
  • Managed through AWS Parameter Store/Secrets Manager in production
  • Set as environment variables in production containers
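
One way to keep this split honest is to read every setting from the environment and fail fast when something is missing. The sketch below assumes hypothetical variable names (`DATABASE_URL`, `AWS_ENDPOINT`, `PORT`); the names actually used in `.env.development` are not shown in this README.

```typescript
interface AppConfig {
  databaseUrl: string;
  awsEndpoint: string;
  port: number;
}

type Env = Record<string, string | undefined>;

// Returns the variable's value or throws if it is missing or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Builds a typed config object from an environment map.
function loadConfig(env: Env): AppConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return {
    databaseUrl: requireVar(env, "DATABASE_URL"),
    awsEndpoint: requireVar(env, "AWS_ENDPOINT"),
    port,
  };
}
```

At startup the application would call `loadConfig(process.env)`, so the same code works whether the values come from `.env.development`, CI secrets, or the production container environment.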

API Endpoints

Authentication (Simple email/password)

# Sign Up
POST /authentication/sign-up
{
  "email": "user@example.com",
  "password": "password123",
  "confirmPassword": "password123"
}

# Sign In
POST /authentication/sign-in
{
  "email": "user@example.com",
  "password": "password123"
}
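
The sign-up body above implies at least one server-side check: `password` and `confirmPassword` must match. The function below is a hedged sketch of such validation in plain TypeScript. The 8-character minimum and the loose email pattern are assumptions, and a NestJS backend would more likely express these rules as DTO validation.

```typescript
interface SignUpBody {
  email: string;
  password: string;
  confirmPassword: string;
}

// Returns a list of problems; an empty list means the body is acceptable.
function validateSignUp(body: SignUpBody): string[] {
  const errors: string[] = [];
  // Deliberately loose check: one "@" with text on both sides and a dot in the domain.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(body.email)) {
    errors.push("email is not valid");
  }
  // Assumed minimum length; the real rule is not documented here.
  if (body.password.length < 8) {
    errors.push("password must be at least 8 characters");
  }
  if (body.password !== body.confirmPassword) {
    errors.push("passwords do not match");
  }
  return errors;
}
```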

Media Management

# Get presigned URL for upload (requires authentication)
POST /media/presigned-url
Authorization: Bearer <jwt-token>

# Get media feed (public)
GET /media?page=1&size=5

# Like media (requires authentication)
POST /media/:mediaId/likes
Authorization: Bearer <jwt-token>
Response: { "liked": true, "likesCount": 42 }

# Unlike media (requires authentication)
DELETE /media/:mediaId/likes
Authorization: Bearer <jwt-token>
Response: { "liked": false, "likesCount": 41 }
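
The feed and like endpoints can be modelled with a small in-memory sketch. It assumes that `page` is 1-based and that liking the same media twice does not double count; both are plausible readings of the examples above, not confirmed behaviour. `LikeStore` and `paginate` are hypothetical names.

```typescript
interface LikeResponse {
  liked: boolean;
  likesCount: number;
}

class LikeStore {
  // mediaId -> set of userIds who liked it
  private readonly likes = new Map<string, Set<string>>();

  like(mediaId: string, userId: string): LikeResponse {
    const users = this.likes.get(mediaId) ?? new Set<string>();
    users.add(userId); // adding twice is a no-op, so repeated likes do not double count
    this.likes.set(mediaId, users);
    return { liked: true, likesCount: users.size };
  }

  unlike(mediaId: string, userId: string): LikeResponse {
    const users = this.likes.get(mediaId);
    users?.delete(userId);
    return { liked: false, likesCount: users?.size ?? 0 };
  }
}

// Returns one page of items, using 1-based page numbers as in GET /media?page=1&size=5.
function paginate<T>(items: T[], page: number, size: number): T[] {
  const start = (page - 1) * size;
  return items.slice(start, start + size);
}
```

Keeping likes as a set per media item mirrors what a unique `(media_id, user_id)` constraint would enforce in the database.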

🔄 Media Processing Pipeline

  1. Upload Request: Frontend requests presigned URL from backend
  2. Direct Upload: File uploaded directly to S3 using presigned URL
  3. S3 Event: PUT event triggers Lambda function automatically
  4. Lambda Processing:
     • Analyzes file metadata and content-type
     • Detects media type (IMAGE/VIDEO/AUDIO)
     • Extracts user ID and media ID from S3 key path
     • Note: failure handling in the Lambda is still minimal and needs improvement, although the normal path works
  5. SQS Message: Lambda sends processed metadata to SQS queue
  6. Backend Processing: Backend consumes SQS messages and persists to database
  7. Media Available: Media appears in public feed with metadata
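
The Lambda step that extracts the user ID and media ID from the S3 key path might look roughly like the sketch below. The key layout `uploads/<userId>/<mediaId>` and the queue message fields are assumptions; only the event fields `s3.bucket.name`, `s3.object.key`, and `s3.object.size` come from the standard S3 event notification format.

```typescript
// Minimal subset of the S3 event notification record used here.
interface S3EventRecord {
  s3: { bucket: { name: string }; object: { key: string; size: number } };
}

interface ParsedKey {
  userId: string;
  mediaId: string;
  key: string; // decoded object key
}

// Assumed key layout: uploads/<userId>/<mediaId>
function parseKey(rawKey: string): ParsedKey | null {
  let key: string;
  try {
    // S3 event keys are URL-encoded, with spaces sent as "+".
    key = decodeURIComponent(rawKey.replace(/\+/g, " "));
  } catch {
    return null; // malformed percent-encoding
  }
  const parts = key.split("/");
  const [prefix, userId, mediaId] = parts;
  if (parts.length !== 3 || prefix !== "uploads" || !userId || !mediaId) return null;
  return { userId, mediaId, key };
}

// Builds the JSON body to send to SQS, or null if the key is not recognised.
function toQueueMessage(record: S3EventRecord, mediaType: string): string | null {
  const parsed = parseKey(record.s3.object.key);
  if (parsed === null) return null;
  return JSON.stringify({
    userId: parsed.userId,
    mediaId: parsed.mediaId,
    key: parsed.key,
    mediaType,
    bucket: record.s3.bucket.name,
    sizeBytes: record.s3.object.size,
  });
}
```

Returning `null` for unrecognised keys gives the caller one place to decide how to handle failures, for example by logging the record or routing it to a dead-letter queue.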

🛠️ Development Commands

Essential Commands

make dev              # Start all services
make dev-build        # Build and start all services  
make down             # Stop all services
make status           # Check service health
make logs             # View all logs
make clean            # Clean up everything

Service-Specific Logs

make logs-backend     # Backend application logs
make logs-frontend    # Frontend development server
make logs-localstack  # LocalStack AWS simulation
make logs-db          # PostgreSQL database logs

Media Processing Testing

make test-s3-upload           # Test S3 upload → Lambda trigger (TODO: needs fixing)
make test-different-media-types # Test IMAGE/VIDEO/AUDIO detection
make lambda-logs              # View Lambda execution logs
make sqs-messages            # Check processed metadata
make test-complete-flow      # End-to-end pipeline test

Database Operations

make db-backup               # Backup database
make db-restore              # Restore from backup
make db-reset               # Reset database (⚠️ deletes data)
make shell-db               # Access database shell

Container Access

make shell-backend          # Access backend container
make shell-frontend         # Access frontend container

📁 Project Structure

media-feed/
├── backend/                    # NestJS backend application
│   ├── src/
│   │   ├── authentication/     # Auth module (sign-up/sign-in)
│   │   ├── media/             # Media management & SQS consumer
│   │   ├── users/             # User management
│   │   └── common/            # Shared utilities
│   ├── docker-entrypoint.sh   # Database migration script
│   ├── Dockerfile             # Multi-stage Docker build
│   └── package.json
├── frontend/                   # React + Vite frontend
│   ├── src/
│   │   ├── components/        # Reusable UI components
│   │   ├── pages/            # Page components
│   │   ├── hooks/            # Custom React hooks
│   │   └── lib/              # Utilities and API client
│   ├── Dockerfile            # Frontend containerization
│   └── package.json
├── localstack/                # AWS simulation setup
│   └── init/                 # LocalStack initialization scripts
├── infrastructure/           # AWS CloudFormation templates
│   ├── vpc.yaml             # Network infrastructure
│   ├── ecs.yaml             # Container orchestration
│   ├── media-processing.yaml # S3, Lambda, SQS, CloudFront
│   └── ssl-us-east-1.yaml   # SSL certificate management
├── docker-compose.yaml      # Development orchestration
├── .env.development         # Development configuration (non-sensitive)
├── Makefile                 # Development commands
└── README.md

☁️ CloudFormation Infrastructure

The production infrastructure is fully defined using Infrastructure as Code with AWS CloudFormation templates:

Template Structure

  • vpc.yaml: Complete network setup with public/private subnets, NAT gateways, security groups
  • ecs.yaml: ECS cluster, services, task definitions, auto-scaling, and load balancers
  • media-processing.yaml: S3 buckets, Lambda functions, SQS queues, CloudFront distributions
  • ssl-us-east-1.yaml: SSL certificate management (ACM certificates for CloudFront)

Deployment

🔧 Manual Deployment Steps

While most of the deployment is automated, there are two manual steps required in production:

1. Database Migrations

Issue: I was unable to get the migrations to run automatically as part of the backend image.

Script Available: backend/docker-entrypoint.sh contains the migration logic but must be triggered manually.

2. SSL Certificate & Nginx Configuration

Issue: As a cost optimization, the backend runs on EC2 + Nginx instead of ALB + Fargate + ACM.

Current Solution: Manual Certbot setup

# On EC2 instance
sudo certbot --nginx -d api.yourdomain.com
sudo systemctl reload nginx

Future Automation Options:

  • Ansible playbooks for automated server configuration (migrations, SSL)

📝 Architecture Decisions & Trade-offs

Development Context

Note: This is my first NestJS and React application, and first web frontend project in general. While I focused on achieving functional requirements, some best practices may have been missed. The architecture prioritizes learning and functionality over perfection.

Authentication Strategy

  • Current: Simple email/password for MVP and user identification
  • Production Plan: Migrate to managed solution (AWS Cognito, Auth0, or Firebase Auth)
  • Rationale: Focus on core functionality first, authentication as a service later

Media Storage & Delivery

  • Storage: Direct S3 uploads via presigned URLs (reduces backend load)
  • CDN: CloudFront with media-type-specific caching policies
  • URLs: Currently storing full URLs in database (trade-off for simplicity)
  • Future: Store S3 keys only, construct URLs dynamically
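
The planned change to store only S3 keys could be paired with a small helper that builds the public URL at read time. This is a sketch under assumptions: `buildMediaUrl` is a hypothetical name and `cdn.example.com` stands in for the CloudFront domain.

```typescript
// Builds a public URL from a CDN domain and an S3 object key.
function buildMediaUrl(cdnDomain: string, key: string): string {
  // Accept the domain with or without a scheme or trailing slash.
  const host = cdnDomain.replace(/^https?:\/\//, "").replace(/\/+$/, "");
  // Encode each path segment separately so "/" separators are preserved.
  const path = key
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return `https://${host}/${path}`;
}
```

With this approach, changing the CDN domain only requires a configuration change rather than a database migration over stored URLs.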

Infrastructure Choices

  • ECS on EC2: Cost optimization over ECS Fargate (~60% savings)
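  • SSL Strategy: Certbot + Nginx (ACM certificates are normally attached to integrated services such as an ALB or CloudFront rather than installed directly on an EC2 instance)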
  • Database: RDS PostgreSQL with automated backups
  • Secrets: Parameter Store (cost consideration over Secrets Manager)
  • CloudFormation: Complete Infrastructure as Code approach

Development Experience

  • LocalStack: Complete AWS simulation for development
  • Docker: Consistent development environment
  • Hot Reload: Development containers with volume mounts
  • Make: Simplified command interface for complex Docker operations

Security Considerations

  • Secrets Management: GitHub Actions secrets for CI/CD, AWS Parameter Store for production
  • Database: No sensitive data in version control
  • JWT: Secure token-based authentication with proper expiration

Known Limitations & Future Improvements

  • Automated database migrations in deployment pipeline (current manual step)
  • Automated SSL certificate renewal via Ansible or SSM
  • TypeScript strict mode compliance throughout codebase
  • Comprehensive test coverage (unit + integration + e2e)
  • Enhanced error handling and input validation
  • Content moderation and safety checks for uploaded media
  • Rate limiting and API security hardening
  • Performance optimization for large media files
  • Monitoring and alerting with CloudWatch

Cost Optimization Decisions

The architecture balances functionality against staying at zero cost within the AWS Free Tier:

  • I used SSM Parameter Store instead of Secrets Manager to avoid cost; I know this choice is not suited to production use.
