Use this checklist on migration day to ensure nothing is missed.
- Collected new Azure Tenant ID:
_______________________
- Collected new Azure Subscription ID:
_______________________
- Verified Contributor/Owner access to new subscription
- MongoDB connection string ready:
mongodb://...
- AI provider API keys ready (OpenAI/Anthropic/Azure OpenAI)
- OneSignal credentials ready (if using push notifications)
- Exported current app settings to file:
az webapp config appsettings list \
  --name emailgeniebackend \
  --resource-group huddleup \
  --output json > old-app-settings.json
- Documented custom domains (if any):
_______________________
- Noted current SKU/tier:
_______________________
- Recorded average daily traffic/metrics
- Listed any third-party integrations
- Full MongoDB backup created:
mongodump --uri="OLD_URI" --out=/backup/paletai-$(date +%Y%m%d)
- Backup verified (can restore to test instance)
- Tested MongoDB connectivity from new Azure region
- Decided on migration strategy:
  - Keep same MongoDB (update connection only)
  - Migrate to new MongoDB (backup/restore)
  - Use continuous sync (zero downtime)
- Test user accounts identified for post-migration testing
- Test game prompts prepared
- API test scripts ready
- Health check URLs bookmarked
- Maintenance window scheduled (if needed):
_______________________
- Users notified of potential downtime
- Status page prepared (if applicable)
- Team members briefed on migration plan
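The exported `old-app-settings.json` can be checked before migration day so that no required setting is discovered missing mid-deployment. The sketch below assumes the file holds the JSON array of `name`/`value` objects that `az webapp config appsettings list` prints; the key names in `REQUIRED_KEYS` are hypothetical placeholders, not the backend's actual setting names.

```python
import json

# Hypothetical setting names; replace with the keys your backend actually reads.
REQUIRED_KEYS = ("MONGODB_URI", "OPENAI_API_KEY")


def settings_to_dict(entries):
    """Convert the exported list of {"name": ..., "value": ...} objects to a dict."""
    return {entry["name"]: entry.get("value") for entry in entries}


def missing_settings(entries, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the export, sorted."""
    settings = settings_to_dict(entries)
    return sorted(key for key in required if not settings.get(key))


def load_export(path):
    """Read the JSON file written by the export command."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `missing_settings(load_export("old-app-settings.json"))` returns an empty list when every required key has a value.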
Estimated Time: 10-15 minutes
- Navigated to deployment directory:
cd /mnt/d/Dev2/clients/HuddleUp/deployment
- Reviewed deployment script parameters
- Executed deployment:
.\Deploy-PaletAI.ps1 `
  -TenantId "..." `
  -SubscriptionId "..." `
  -ResourceGroupName "rg-paletai-prod" `
  -Location "westus3" `
  -MongoDbConnectionString "..." `
  -OpenAiApiKey "..." `
  -AiProvider "openai"
- Deployment completed successfully
- Saved deployment outputs file
- Recorded new App Service URL:
https://_______.azurewebsites.net
- Recorded Storage Account name:
_______________________
- Resource group created in Azure Portal
- App Service Plan created (B1 SKU)
- App Service created
- Storage Account created
- Blob container "game-images" exists with public access
- Application Insights created
Estimated Time: 5-10 minutes
Choose one method:
- Retrieved publish profile from Azure Portal or CLI
- Added AZURE_WEBAPP_PUBLISH_PROFILE to GitHub secrets
- Triggered GitHub Actions workflow
- Workflow completed successfully
- Verified deployment in Azure Portal
- Cloned repository:
git clone https://github.com/HuddleUp-AI/paletaibackend.git
- Created ZIP package (excluding .git, __pycache__, etc.)
- Deployed via Azure CLI:
az webapp deployment source config-zip \
  --resource-group rg-paletai-prod \
  --name APP_NAME \
  --src deploy.zip
- Deployment completed successfully
- App Service shows "Running" status in Azure Portal
- Accessed URL in browser (should show API welcome message or redirect to /docs)
- No deployment errors in Azure Portal → Deployment Center
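For the manual ZIP route, the package has to leave out `.git`, `__pycache__`, and similar directories. One way to build `deploy.zip` is sketched below; the exclusion list beyond those two names and the `build_package` helper are assumptions for illustration, not part of the project's tooling.

```python
import os
import zipfile

# Assumed exclusion list; extend it to match your repository.
EXCLUDED_DIRS = {".git", "__pycache__", ".venv", "node_modules"}


def build_package(source_dir, zip_path, excluded=EXCLUDED_DIRS):
    """Zip source_dir into zip_path, skipping excluded directories and .pyc files."""
    added = []
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(source_dir):
            # Prune in place so os.walk never descends into excluded directories.
            dirs[:] = [d for d in dirs if d not in excluded]
            for name in sorted(files):
                full_path = os.path.join(root, name)
                # Skip compiled files and the archive itself if it lives in source_dir.
                if name.endswith(".pyc") or os.path.abspath(full_path) == zip_abs:
                    continue
                arcname = os.path.relpath(full_path, source_dir)
                archive.write(full_path, arcname)
                added.append(arcname.replace(os.sep, "/"))
    return sorted(added)
```

Running `build_package(".", "deploy.zip")` from the repository root returns the list of packaged paths, which is worth skimming before upload to confirm that no secrets or local environment files slipped in.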
Choose your strategy:
- Verified connection string in new app service
- Connection string is correct
- No migration needed - SKIP to Phase 4
- Stopped old app service (optional, for consistency):
az webapp stop --name emailgeniebackend --resource-group huddleup
- Created final backup from old database
- Restored to new MongoDB instance:
mongorestore --uri="NEW_URI" --dir=/backup/paletai-final
- Verified data in new database:
mongosh "NEW_URI" --eval "db.users.countDocuments({})"
mongosh "NEW_URI" --eval "db.games.countDocuments({})"
- Updated app service connection string (if different)
- Restarted app service
- Set up MongoDB replication/sync
- Verified sync is working
- Monitored replication lag
- Ready for cutover to new database
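After a backup/restore migration, the document counts reported by the `countDocuments` checks can be compared between the old and new databases. The helper below is a minimal sketch that takes the counts as plain dictionaries (gathered however you prefer); `compare_counts` is a hypothetical name, and matching counts are a sanity check only, not proof that every document restored intact.

```python
def compare_counts(old_counts, new_counts):
    """Return {collection: (old, new)} for every collection whose counts differ."""
    mismatches = {}
    for name in sorted(set(old_counts) | set(new_counts)):
        old = old_counts.get(name, 0)
        new = new_counts.get(name, 0)
        if old != new:
            mismatches[name] = (old, new)
    return mismatches
```

An empty result means the counts agree. If the old app service was still accepting writes during the backup, small differences are expected and point to stopping the old service before the final backup.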
Estimated Time: 10 minutes
- Accessed:
https://NEW_APP_URL/health
- Response status:
200 OK
- Response body shows:
"status": "healthy"
"database": "healthy"
"api": "healthy"
- Accessed:
https://NEW_APP_URL/docs
- Swagger UI loads correctly
- All endpoints visible
- Registered new test user via /auth/register
  - Email: test@example.com
  - Response: 200 OK, user created
- Logged in via /auth/login
  - Response: 200 OK, JWT token received
  - Token stored:
    _______________________
- Created game via /games with test token
  - Prompt: "Create a simple snake game"
  - Response: Task ID received
- Polled task status until complete
- Game created successfully
- Game ID:
_______________________
- Verified game has image URL
- Image URL accessible:
https://STORAGE_ACCOUNT.blob.core.windows.net/game-images/...
- Image loads in browser
- Listed blobs in Azure Portal → Storage Account → game-images
- At least one blob exists
- Accessed /games/feed
- Response includes created game
- All game fields populated correctly
- Subscribed to notifications via /notifications/subscribe
- Subscription created successfully
- OneSignal dashboard shows subscription (if applicable)
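The "polled task status until complete" step is easy to get wrong by hand, so a small polling loop with a timeout can help. The sketch below is transport-agnostic: `fetch_status` is any callable you supply that returns the task's JSON as a dict. The `"status"` key and the `"completed"` and `"failed"` values are assumptions, since the checklist does not specify the task response format.

```python
import time


def poll_until_complete(fetch_status, timeout_s=300.0, interval_s=2.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports a terminal state or the timeout passes.

    fetch_status must return a dict with a "status" key; the status names
    checked here are assumed, not taken from the backend's documentation.
    """
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"task failed: {result}")
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist so the loop can be exercised in tests without real waiting; in normal use, pass only `fetch_status` and, if needed, a longer `timeout_s` for slow game generation.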
Estimated Time: 5 minutes + DNS propagation (5-60 minutes)
- Lowered DNS TTL to 300 seconds (done 24+ hours ago)
- Old TTL has expired
- Custom domain added to new App Service:
az webapp config hostname add \
  --webapp-name NEW_APP_NAME \
  --resource-group rg-paletai-prod \
  --hostname api.paletai.com
- Updated DNS CNAME record:
  - Type: CNAME
  - Name: api (or @)
  - Value: NEW_APP_NAME.azurewebsites.net
  - TTL: 300
- Saved DNS changes
- Timestamp of DNS update:
_______________________
- Tested with nslookup api.paletai.com
- Tested with curl https://api.paletai.com/health
- Response from new server confirmed
- SSL certificate working (if configured)
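Once DNS points at the new service, the health check can be repeated against the custom domain and validated automatically instead of by eye. The sketch below assumes the health endpoint returns a flat JSON object containing the `status`, `database`, and `api` fields listed in this checklist; if the real body nests them, `unhealthy_fields` would need adjusting.

```python
import json
import urllib.request

EXPECTED_FIELDS = ("status", "database", "api")


def unhealthy_fields(body, expected=EXPECTED_FIELDS):
    """Return the expected fields whose value is not the string "healthy"."""
    return [name for name in expected if body.get(name) != "healthy"]


def check_health(url, timeout_s=10):
    """Fetch url and return (http_status, unhealthy_fields) for its JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a normal
    return already implies a successful status code.
    """
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        body = json.loads(response.read().decode("utf-8"))
        return response.status, unhealthy_fields(body)
```

For example, `check_health("https://api.paletai.com/health")` should return `(200, [])` when the new environment is serving traffic and all three components report healthy.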
Estimated Time: 30 minutes
- Opened Application Insights in Azure Portal
- Live Metrics shows active requests
- No errors or exceptions appearing
- Response times acceptable (<2 seconds)
- Started log stream:
az webapp log tail --name NEW_APP_NAME --resource-group rg-paletai-prod
- Logs show normal activity
- No error messages
- Database connections successful
- Created 5-10 test games
- Response times acceptable
- No timeouts or errors
- All games have images
- Feed loads quickly
- Logged in with production user account
- Created game
- Viewed feed
- Liked game
- All features working
- User feedback:
_______________________
- Confirmed new environment is stable
- No critical issues reported
- Metrics look normal
- Users are happy
- Stopped old app service:
az webapp stop --name emailgeniebackend --resource-group huddleup
- Monitored for any issues (users should be on new service)
- No complaints received
- Confirmed everything is working perfectly
- Exported any remaining logs/data from old environment
- Deleted old app service:
az webapp delete --name emailgeniebackend --resource-group huddleup
- Final backup of old environment (if not already deleted)
- Deleted old resource group:
az group delete --name huddleup --yes
- Updated internal documentation with new URLs
- Updated API documentation (if separate)
- Updated README with new deployment info
- Documented lessons learned
- Reviewed Application Insights metrics
- Identified any performance issues
- Adjusted SKU if needed (scale up/down)
- Configured auto-scaling rules (if S1+)
- Set up cost alerts in Azure
- Configured budget: $50/month (or as needed)
- Scheduled monthly cost review
- Configured Application Insights alerts:
  - HTTP 5xx errors > 5 in 5 minutes
  - Response time > 5 seconds
  - Availability < 99%
- Configured Azure Monitor alerts:
  - CPU > 80% for 10 minutes
  - Memory > 80% for 10 minutes
  - Disk space > 85%
- Tested alert delivery (email/SMS)
- Reviewed IP restrictions (if needed)
- Enabled managed identities (if applicable)
- Rotated API keys (if required)
- Reviewed CORS settings
- Ensured HTTPS only is enforced
- Set up automated MongoDB backups
- Documented restore procedure
- Tested backup restoration
- Configured retention policy
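A retention policy for file-based backups can be as simple as selecting dump directories older than a cutoff. The sketch below relies on the `paletai-YYYYMMDD` naming produced by the `mongodump` command earlier in this checklist; the 30-day default and the `expired_backups` helper are assumptions, and the function only lists candidates rather than deleting anything.

```python
import re
from datetime import date, timedelta

BACKUP_NAME = re.compile(r"^paletai-(\d{4})(\d{2})(\d{2})$")


def expired_backups(names, today, retention_days=30):
    """Return backup directory names dated more than retention_days before today."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in names:
        match = BACKUP_NAME.match(name)
        if not match:
            continue  # Leave anything that does not follow the naming scheme alone.
        year, month, day = (int(part) for part in match.groups())
        try:
            taken = date(year, month, day)
        except ValueError:
            continue  # Eight digits that do not form a real calendar date.
        if taken < cutoff:
            expired.append(name)
    return sorted(expired)
```

Passing `os.listdir("/backup")` and `date.today()` gives the directories eligible for removal; directories such as `paletai-final` are never matched, so a deliberately kept final backup stays untouched.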
If critical issues are discovered:
- Reverted DNS CNAME to old app service
- Waited for DNS propagation (5-10 minutes)
- Started old app service
- Verified old service working
- Notified users of issue
- Restored from backup to old database
- Verified data integrity
- Updated old app service connection string
- Restarted old app service
- Investigated root cause
- Fixed issues in new environment
- Scheduled new migration date
- Documented what went wrong
Migration is successful when:
- ✅ Health endpoint returns 200 OK
- ✅ All API endpoints functional
- ✅ Database connectivity confirmed
- ✅ Game creation working
- ✅ Image upload to Blob Storage working
- ✅ No errors in Application Insights
- ✅ Response times < 2 seconds average
- ✅ Users can log in and use the app
- ✅ No critical bugs reported
- ✅ Cost is within expected range ($15-20/month for B1)
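Two of these criteria, the average response time and the absence of server errors, can be checked from a handful of timed test requests. The sketch below assumes you have already collected `(http_status, seconds)` pairs, for instance from the API test scripts; `evaluate_samples` is a hypothetical helper, and the 2-second threshold comes from the criteria above.

```python
from statistics import mean


def evaluate_samples(samples, max_average_s=2.0):
    """Summarise (http_status, seconds) samples against the response-time criterion."""
    if not samples:
        raise ValueError("at least one sample is required")
    average_s = mean(seconds for _, seconds in samples)
    server_errors = sum(1 for status, _ in samples if status >= 500)
    return {
        "average_s": average_s,
        "server_errors": server_errors,
        "passed": average_s < max_average_s and server_errors == 0,
    }
```

A small sample only approximates what Application Insights reports over real traffic, so treat a passing result as a quick smoke test rather than a replacement for the monitoring phase.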
Date/Time: _______________
Issue: _______________________________________________________________
Resolution: ___________________________________________________________
Date/Time: _______________
Issue: _______________________________________________________________
Resolution: ___________________________________________________________
Date/Time: _______________
Issue: _______________________________________________________________
Resolution: ___________________________________________________________
- Technical lead sign-off:
_______________________ Date: _______
- Migration completed successfully
- All systems operational
- Documentation updated
- Old environment scheduled for decommission
Migration Duration: Start: _______ End: _______ Total: _______ hours
Total Downtime: _______ minutes (goal: < 15 minutes)