Deployment
Infrastructure setup and deployment guide
Deployment Guide
This guide covers deploying the SynapseAI B2B Wholesale Platform to AWS using Terraform.
Infrastructure Overview
The platform is deployed on AWS using:
- ECS Fargate: Serverless container orchestration
- Application Load Balancer: Traffic distribution and SSL termination
- ElastiCache Redis: Session and cache storage
- ECR: Container image registry
- VPC: Isolated network environment
- Secrets Manager: Secure credential storage
- CloudWatch: Logging and monitoring
Prerequisites
Required Tools
- Terraform 1.5+
- AWS CLI configured with appropriate credentials
- Docker for building container images
- AWS Account with appropriate permissions
AWS Permissions
Your AWS user/role needs permissions for:
- VPC and networking resources
- ECS (Elastic Container Service)
- ECR (Elastic Container Registry)
- ElastiCache
- Application Load Balancer
- IAM roles and policies
- Secrets Manager
- CloudWatch Logs
- ACM (Certificate Manager)
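Before running Terraform, it can help to confirm which identity the CLI will use and spot-check a few permissions. This is a preflight sketch, not part of the deployment itself; the user ARN and action names below are illustrative examples.

```bash
# Confirm the identity Terraform will authenticate as
aws sts get-caller-identity

# Optionally simulate a few critical actions before a long apply
# (placeholder ARN; action names are examples, not an exhaustive list)
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::YOUR_ACCOUNT_ID:user/your-user \
  --action-names ecs:CreateService ecr:CreateRepository elasticache:CreateServerlessCache
```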
Infrastructure Architecture
Networking
VPC Configuration:
- CIDR Block: 10.0.0.0/16
- Three public subnets across availability zones:
  - eu-north-1a: 10.0.1.0/24
  - eu-north-1b: 10.0.2.0/24
  - eu-north-1c: 10.0.3.0/24
- Internet Gateway for public access
- Route tables for traffic routing
Load Balancer
Application Load Balancer (ALB):
- Internet-facing
- Operates across three public subnets
- Listeners:
- Port 443 (HTTPS): Primary secure traffic
- Port 80 (HTTP): Redirects to HTTPS
SSL Certificates:
- Domains: api.mach-x-b2b.app, demo-api.mach-x-b2b.app
- Managed via AWS Certificate Manager (ACM)
- Auto-renewal enabled
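If the certificate does not exist yet, it can be requested through ACM with DNS validation. This is a sketch using the domains listed above; note the certificate must live in the same region as the ALB.

```bash
aws acm request-certificate \
  --domain-name api.mach-x-b2b.app \
  --subject-alternative-names demo-api.mach-x-b2b.app \
  --validation-method DNS \
  --region eu-north-1
```

ACM then issues DNS validation records that must be added to the domain's hosted zone before the certificate becomes usable.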
Target Groups:
- machbot_target_group: Production backend
- demo_machbot_target_group: Demo backend
Container Registry
ECR Repositories:
- mach-ai-b2b-agent: Production images
- demo-mach-ai-b2b-agent: Demo images
ECS Cluster
Cluster: server_cluster
- Capacity Provider: Fargate
- Service Discovery: Enabled for service-to-service communication
Services:
- ct_mcp_server_service: Commercetools MCP server (port 3001)
- demo_ct_mcp_server_service: Demo Commercetools MCP server
- voucherify_mcp_server_service: Voucherify MCP server (port 10000)
- machbot_service: Production backend (port 8001)
- demo_machbot_service: Demo backend
Database and Cache
ElastiCache Redis:
- Serverless configuration
- Daily snapshots at 09:00 UTC
- 1-day snapshot retention
- Automatic failover enabled
Deployment Process
Step 1: Set Up AWS Credentials
Configure AWS CLI:

```bash
aws configure
```

Or set environment variables:

```bash
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=eu-north-1
```

Step 2: Configure Terraform
Navigate to the terraform directory:

```bash
cd terraform
```

Initialize Terraform:

```bash
terraform init
```

Step 3: Set Up Secrets
Store secrets in AWS Secrets Manager:

```bash
# Commercetools secrets
aws secretsmanager create-secret \
  --name ct_mcp_server_secrets \
  --secret-string '{
    "CT_CLIENT_ID": "your-client-id",
    "CT_CLIENT_SECRET": "your-client-secret",
    "CT_PROJECT_KEY": "your-project-key"
  }'

# Voucherify secrets
aws secretsmanager create-secret \
  --name voucherify_secrets \
  --secret-string '{
    "VOUCHERIFY_APP_ID": "your-app-id",
    "VOUCHERIFY_APP_TOKEN": "your-app-token"
  }'

# MachBot secrets
aws secretsmanager create-secret \
  --name machbot_secrets \
  --secret-string '{
    "OPENAI_API_KEY": "sk-...",
    "LITELLM_MODEL": "cerebras/llama-3.3-70b"
  }'
```

Step 4: Create Terraform Workspace
Use workspaces to manage multiple environments:
```bash
# Create and switch to dev workspace
terraform workspace new dev

# Or select existing workspace
terraform workspace select dev
```

Available workspaces:
- dev: Development environment
- staging: Staging environment (create as needed)
- prod: Production environment (create as needed)
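One convenient pattern, an assumption about workflow rather than something the repo requires, is deriving the tfvars path from the active workspace so the two cannot drift apart:

```shell
# In practice, derive from the active workspace:
#   ws=$(terraform workspace show)
ws=dev
tfvars="env/${ws}.tfvars"
echo "$tfvars"
# Then: terraform plan -var-file="$tfvars"
```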
Step 5: Configure Variables
Create a variables file for your environment:

```hcl
# terraform/env/dev.tfvars
AWS_REGION        = "eu-north-1"
DOCKER_IMAGE      = "your-account-id.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest"
DEMO_DOCKER_IMAGE = "your-account-id.dkr.ecr.eu-north-1.amazonaws.com/demo-mach-ai-b2b-agent:latest"
```

Step 6: Build and Push Docker Images
Build Backend Image:

```bash
cd backend

# Build image
docker build -t mach-ai-b2b-agent:latest .

# Tag for ECR
docker tag mach-ai-b2b-agent:latest \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest

# Login to ECR
aws ecr get-login-password --region eu-north-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com

# Push to ECR
docker push YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest
```

Step 7: Plan Infrastructure Changes
Review the changes Terraform will make:

```bash
terraform plan -var-file=env/dev.tfvars
```

Review the output carefully to ensure:
- Correct resources are being created
- No unexpected deletions
- Security groups are properly configured
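The "no unexpected deletions" check can be made mechanical by saving the plan as JSON and filtering it. This is a sketch: the `plan.json` excerpt below is fabricated for illustration, and the resource addresses in it are made-up examples.

```shell
# In practice:
#   terraform plan -var-file=env/dev.tfvars -out=tfplan
#   terraform show -json tfplan > plan.json
# Fabricated excerpt of a plan, for illustration only:
cat > plan.json <<'EOF'
{"resource_changes":[
  {"address":"aws_ecs_service.machbot","change":{"actions":["update"]}},
  {"address":"aws_elasticache_serverless_cache.redis","change":{"actions":["delete","create"]}}
]}
EOF

# Any resource whose planned actions include "delete" deserves a second look
jq -r '.resource_changes[] | select(.change.actions | index("delete")) | .address' plan.json
```

Here the filter would flag the Redis cache as being replaced, exactly the kind of change worth catching before `apply`.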
Step 8: Apply Infrastructure
Deploy the infrastructure:

```bash
terraform apply -var-file=env/dev.tfvars
```

Type yes when prompted to confirm. This process takes approximately 10-15 minutes.
Step 9: Verify Deployment
Check ECS services are running:

```bash
aws ecs list-services --cluster server_cluster
aws ecs describe-services --cluster server_cluster --services machbot_service
```

Check ALB health:

```bash
aws elbv2 describe-target-health \
  --target-group-arn $(terraform output -raw machbot_target_group_arn)
```

Test the API:

```bash
curl https://api.mach-x-b2b.app/health
```

Frontend Deployment
Deploy to Vercel
The frontend is configured for Vercel deployment:
Option 1: Automatic Deployment
- Connect GitHub repository to Vercel
- Vercel automatically deploys on push to main
Option 2: Manual Deployment
```bash
cd frontend

# Install Vercel CLI
npm i -g vercel

# Deploy
vercel --prod
```

Environment Variables in Vercel:
Set these in the Vercel dashboard:

```bash
VITE_API_BASE_URL=https://api.mach-x-b2b.app
VITE_ENABLE_VOICE_CHAT=true
```

Alternative: Deploy to AWS S3 + CloudFront
```bash
cd frontend

# Build
pnpm build

# Sync to S3
aws s3 sync dist/ s3://your-bucket-name/ --delete

# Invalidate CloudFront cache
aws cloudfront create-invalidation \
  --distribution-id YOUR_DIST_ID \
  --paths "/*"
```

Security Configuration
Security Groups
ALB Security Group (alb_sg):
- Inbound: Ports 80, 443 from 0.0.0.0/0
- Outbound: All traffic
MachBot Security Group (machbot_sg):
- Inbound: Port 8001 from ALB security group
- Outbound: All traffic
Commercetools MCP Security Group (ct_mcp_server_sg):
- Inbound: Port 3001 from VPC CIDR
- Outbound: All traffic
Voucherify MCP Security Group (voucherify_mcp_server_sg):
- Inbound: Port 10000 from VPC CIDR
- Outbound: All traffic
Redis Security Group (redis_sg):
- Inbound: Port 6379 from VPC CIDR
- Outbound: All traffic
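The rules above can be spot-checked after deployment. This sketch assumes the Terraform resource names are also used as the EC2 `group-name` values, which may not hold if the configuration sets different names.

```bash
# List the inbound ports actually open on a few key groups
aws ec2 describe-security-groups \
  --filters Name=group-name,Values=alb_sg,machbot_sg,redis_sg \
  --query 'SecurityGroups[].[GroupName,IpPermissions[].FromPort]'
```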
IAM Roles
ECS Task Role (ecs_task_role):
Policies attached:
- AmazonECSTaskExecutionRolePolicy: ECS task execution
- ecs_secrets_policy: Read secrets from Secrets Manager
- AmazonElastiCacheFullAccess: Access to Redis
Monitoring
CloudWatch Logs
View logs for each service:

```bash
# Backend logs
aws logs tail /ecs/machbot-service --follow

# MCP server logs
aws logs tail /ecs/ct-mcp-server --follow
```

CloudWatch Metrics
Monitor key metrics:
- ECS: CPU, memory utilization
- ALB: Request count, target response time, error rate
- ElastiCache: Cache hits, CPU, memory
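These metrics can also be pulled from the CLI. A sketch for backend CPU, assuming the standard AWS/ECS namespace and the cluster/service names used above; the timestamps are placeholders to replace with a real window.

```bash
# Average CPU for the backend service, in 5-minute periods
aws cloudwatch get-metric-statistics \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=server_cluster Name=ServiceName,Value=machbot_service \
  --statistics Average \
  --period 300 \
  --start-time 2025-01-01T00:00:00Z \
  --end-time 2025-01-01T01:00:00Z
```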
Alarms
Set up CloudWatch alarms for:
- High CPU utilization (> 80%)
- High memory utilization (> 80%)
- High ALB error rate (> 5%)
- Redis connection failures
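As a sketch, the CPU alarm could be created from the CLI like this; the SNS topic ARN is a placeholder you would supply, and the two-period evaluation is an arbitrary choice to avoid alarming on brief spikes.

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name machbot-high-cpu \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=server_cluster Name=ServiceName,Value=machbot_service \
  --statistic Average \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:eu-north-1:YOUR_ACCOUNT_ID:alerts-topic
```

The same pattern applies to the memory and ALB error-rate alarms, swapping the namespace, metric name, and threshold.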
Scaling Configuration
Auto-Scaling
Configure ECS service auto-scaling:

```hcl
resource "aws_appautoscaling_target" "machbot" {
  max_capacity       = 10
  min_capacity       = 2
  resource_id        = "service/server_cluster/machbot_service"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "machbot_cpu" {
  name               = "machbot-cpu-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.machbot.resource_id
  scalable_dimension = aws_appautoscaling_target.machbot.scalable_dimension
  service_namespace  = aws_appautoscaling_target.machbot.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 70.0
  }
}
```

Maintenance
Update Application
Update Backend:
```bash
# Build new image
cd backend
docker build -t mach-ai-b2b-agent:v2 .

# Push to ECR
docker tag mach-ai-b2b-agent:v2 \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:v2
docker push YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:v2

# Update the task definition to reference the new tag,
# then force a new deployment
aws ecs update-service \
  --cluster server_cluster \
  --service machbot_service \
  --force-new-deployment
```

Database Maintenance
Redis Backup:
- Automatic daily snapshots at 09:00 UTC
- Manual snapshot: Via AWS Console or CLI
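A manual snapshot from the CLI might look like the following sketch. It assumes the serverless cache is named b2b-ai-redis (as in the troubleshooting section below) and uses a placeholder snapshot name.

```bash
aws elasticache create-serverless-cache-snapshot \
  --serverless-cache-name b2b-ai-redis \
  --serverless-cache-snapshot-name pre-migration-backup
```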
Restore from Snapshot:

```bash
aws elasticache create-serverless-cache \
  --serverless-cache-name new-cache \
  --engine redis \
  --snapshot-arns arn:aws:elasticache:region:account:snapshot/snapshot-name
```

Rotate Secrets
Update secrets in Secrets Manager:

```bash
# Note: update-secret replaces the entire secret value,
# so include every key the service still needs
aws secretsmanager update-secret \
  --secret-id machbot_secrets \
  --secret-string '{"OPENAI_API_KEY": "new-key", "LITELLM_MODEL": "cerebras/llama-3.3-70b"}'

# Force ECS service to restart with new secrets
aws ecs update-service \
  --cluster server_cluster \
  --service machbot_service \
  --force-new-deployment
```

Troubleshooting
Issue: Service tasks keep stopping
Check:
- CloudWatch Logs for error messages
- Task definition has correct resource limits
- Secrets are accessible
- Security groups allow necessary traffic
Solution:

```bash
# View task stopped reason
aws ecs describe-tasks \
  --cluster server_cluster \
  --tasks TASK_ID
```

Issue: Health checks failing
Check:
- Application is listening on correct port
- Health check endpoint returns 200
- Security groups allow ALB to reach targets
Solution:

```bash
# Test from within VPC (requires ECS Exec enabled on the service)
aws ecs execute-command \
  --cluster server_cluster \
  --task TASK_ID \
  --container machbot \
  --command "curl localhost:8001/health" \
  --interactive
```

Issue: Cannot connect to Redis
Check:
- Redis cluster is running
- Security group allows traffic
- Connection string is correct
Solution:

```bash
# Get Redis endpoint
aws elasticache describe-serverless-caches \
  --serverless-cache-name b2b-ai-redis

# Test connection (serverless caches require TLS in transit)
redis-cli --tls -h YOUR_REDIS_ENDPOINT ping
```

Destroying Infrastructure
To remove all infrastructure:
```bash
# Review what will be destroyed
terraform plan -destroy -var-file=env/dev.tfvars

# Destroy
terraform destroy -var-file=env/dev.tfvars
```

⚠️ Warning: This permanently deletes all resources, including data in Redis.
Cost Optimization
Recommendations
- Use Fargate Spot: For non-critical workloads
- Right-size Tasks: Monitor and adjust CPU/memory
- ElastiCache Scaling: Use serverless for variable load
- CloudWatch Logs Retention: Set appropriate retention periods
- ALB Idle Timeout: Configure for your use case
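For example, CloudWatch log retention can be capped per log group; without a policy, logs are kept forever and billed indefinitely. The 30-day value here is an arbitrary choice to adjust per environment.

```bash
aws logs put-retention-policy \
  --log-group-name /ecs/machbot-service \
  --retention-in-days 30
```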
Estimated Monthly Costs
Development Environment (~$200-300/month):
- ECS Fargate: ~$100
- ElastiCache Redis: ~$50
- ALB: ~$30
- Data Transfer: ~$20
- CloudWatch: ~$20
Production Environment (~$800-1200/month):
- ECS Fargate (scaled): ~$400-600
- ElastiCache Redis (larger): ~$200-300
- ALB: ~$50
- Data Transfer: ~$100
- CloudWatch: ~$50
Next Steps
- Backend Setup - Understand the backend
- Frontend Setup - Understand the frontend
- Architecture - System architecture
- Scripts - Utility scripts