Deployment Guide

This guide covers deploying the SynapseAI B2B Wholesale Platform to AWS using Terraform.

Infrastructure Overview

The platform is deployed on AWS using:

  • ECS Fargate: Serverless container orchestration
  • Application Load Balancer: Traffic distribution and SSL termination
  • ElastiCache Redis: Session and cache storage
  • ECR: Container image registry
  • VPC: Isolated network environment
  • Secrets Manager: Secure credential storage
  • CloudWatch: Logging and monitoring

Prerequisites

Required Tools

  • Terraform 1.5+
  • AWS CLI configured with appropriate credentials
  • Docker for building container images
  • AWS Account with appropriate permissions

AWS Permissions

Your AWS user/role needs permissions for:

  • VPC and networking resources
  • ECS (Elastic Container Service)
  • ECR (Elastic Container Registry)
  • ElastiCache
  • Application Load Balancer
  • IAM roles and policies
  • Secrets Manager
  • CloudWatch Logs
  • ACM (Certificate Manager)

Infrastructure Architecture

Networking

VPC Configuration:

  • CIDR Block: 10.0.0.0/16
  • Three public subnets across availability zones:
    • eu-north-1a: 10.0.1.0/24
    • eu-north-1b: 10.0.2.0/24
    • eu-north-1c: 10.0.3.0/24
  • Internet Gateway for public access
  • Route tables for traffic routing

Load Balancer

Application Load Balancer (ALB):

  • Internet-facing
  • Operates across three public subnets
  • Listeners:
    • Port 443 (HTTPS): Primary secure traffic
    • Port 80 (HTTP): Redirects to HTTPS

SSL Certificates:

  • Domains: api.mach-x-b2b.app, demo-api.mach-x-b2b.app
  • Managed via AWS Certificate Manager (ACM)
  • Auto-renewal enabled

Target Groups:

  • machbot_target_group: Production backend
  • demo_machbot_target_group: Demo backend

Container Registry

ECR Repositories:

  • mach-ai-b2b-agent: Production images
  • demo-mach-ai-b2b-agent: Demo images

ECS Cluster

Cluster: server_cluster

  • Capacity Provider: Fargate
  • Service Discovery: Enabled for service-to-service communication

Services:

  1. ct_mcp_server_service: Commercetools MCP server (port 3001)
  2. demo_ct_mcp_server_service: Demo Commercetools MCP server
  3. voucherify_mcp_server_service: Voucherify MCP server (port 10000)
  4. machbot_service: Production backend (port 8001)
  5. demo_machbot_service: Demo backend

Database and Cache

ElastiCache Redis:

  • Serverless configuration
  • Daily snapshots at 09:00 UTC
  • 1-day snapshot retention
  • Automatic failover enabled

Deployment Process

Step 1: Set Up AWS Credentials

Configure AWS CLI:

aws configure

Or set environment variables:

export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=eu-north-1

Step 2: Configure Terraform

Navigate to the terraform directory:

cd terraform

Initialize Terraform:

terraform init

Step 3: Set Up Secrets

Store secrets in AWS Secrets Manager:

# Commercetools secrets
aws secretsmanager create-secret \
  --name ct_mcp_server_secrets \
  --secret-string '{
    "CT_CLIENT_ID": "your-client-id",
    "CT_CLIENT_SECRET": "your-client-secret",
    "CT_PROJECT_KEY": "your-project-key"
  }'

# Voucherify secrets
aws secretsmanager create-secret \
  --name voucherify_secrets \
  --secret-string '{
    "VOUCHERIFY_APP_ID": "your-app-id",
    "VOUCHERIFY_APP_TOKEN": "your-app-token"
  }'

# MachBot secrets
aws secretsmanager create-secret \
  --name machbot_secrets \
  --secret-string '{
    "OPENAI_API_KEY": "sk-...",
    "LITELLM_MODEL": "cerebras/llama-3.3-70b"
  }'
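
Inline `--secret-string` values end up in shell history. As a minimal sketch, the secret can instead be loaded from a local JSON file through the AWS CLI's `file://` prefix. The function name `create_secret_from_file` and the file name in the usage comment are illustrative assumptions, not part of the project:

```shell
# Create a secret from a local JSON file instead of an inline string.
# Usage: create_secret_from_file SECRET_NAME JSON_FILE
# (hypothetical helper; the JSON file holds the same keys shown above)
create_secret_from_file() {
  if [ ! -f "$2" ]; then
    echo "secret file not found: $2" >&2
    return 1
  fi
  # The file:// prefix tells the AWS CLI to read the value from disk
  aws secretsmanager create-secret \
    --name "$1" \
    --secret-string "file://$2"
}

# Example:
#   create_secret_from_file machbot_secrets machbot_secrets.json
```

Keep such files out of version control and delete them once the secret is stored.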

Step 4: Create Terraform Workspace

Use workspaces to manage multiple environments:

# Create and switch to dev workspace
terraform workspace new dev

# Or select existing workspace
terraform workspace select dev

Available workspaces:

  • dev: Development environment
  • staging: Staging environment (create as needed)
  • prod: Production environment (create as needed)
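
Because the workspace and the variables file are selected separately, it is easy to apply one environment's variables to another environment's state. The sketch below derives the variables file from the active workspace. It assumes every workspace has a matching `env/<workspace>.tfvars` file (only `env/dev.tfvars` appears in this guide), and `tfvars_for_workspace` is a hypothetical helper name:

```shell
# Print the variables file that matches the active Terraform workspace.
# Assumes the naming convention env/<workspace>.tfvars.
tfvars_for_workspace() {
  printf 'env/%s.tfvars\n' "$(terraform workspace show)"
}

# Example:
#   terraform plan -var-file="$(tfvars_for_workspace)"
```

If no workspace has been selected, `terraform workspace show` prints `default`, so the helper would point at `env/default.tfvars`.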

Step 5: Configure Variables

Create a variables file for your environment:

# terraform/env/dev.tfvars
AWS_REGION = "eu-north-1"
DOCKER_IMAGE = "YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest"
DEMO_DOCKER_IMAGE = "YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/demo-mach-ai-b2b-agent:latest"

Step 6: Build and Push Docker Images

Build Backend Image:

cd backend

# Build image
docker build -t mach-ai-b2b-agent:latest .

# Tag for ECR
docker tag mach-ai-b2b-agent:latest \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest

# Login to ECR
aws ecr get-login-password --region eu-north-1 | \
  docker login --username AWS --password-stdin \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com

# Push to ECR (the mach-ai-b2b-agent repository must already exist)
docker push YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:latest
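
The same image URI is typed three times in the commands above, which invites typos. One possible approach is to build it once with a small helper. `ecr_image_uri` is a hypothetical name, and the account ID in the example is a placeholder:

```shell
# Compose a full ECR image URI from its parts.
# Usage: ecr_image_uri ACCOUNT_ID REGION REPOSITORY TAG
ecr_image_uri() {
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Example (123456789012 is a placeholder account ID):
#   IMAGE=$(ecr_image_uri 123456789012 eu-north-1 mach-ai-b2b-agent latest)
#   docker tag mach-ai-b2b-agent:latest "$IMAGE"
#   docker push "$IMAGE"
```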

Step 7: Plan Infrastructure Changes

Review the changes Terraform will make:

terraform plan -var-file=env/dev.tfvars

Review the output carefully and confirm that:

  • The correct resources are being created
  • There are no unexpected deletions
  • Security groups are properly configured
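
To make sure the plan that was reviewed is the one that gets applied, the plan can be written to a file and that file passed to `terraform apply`. This is a sketch, not the project's documented workflow. The helper name `plan_to_file` and the `<env>.tfplan` file name are assumptions:

```shell
# Save the plan for one environment to a file for later review and apply.
# Usage: plan_to_file ENVIRONMENT   (for example: plan_to_file dev)
plan_to_file() {
  terraform plan -var-file="env/$1.tfvars" -out="$1.tfplan"
}

# After reviewing the plan output:
#   terraform apply dev.tfplan
```

Applying a saved plan does not prompt for confirmation. Plan files can contain sensitive values, so they should not be committed.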

Step 8: Apply Infrastructure

Deploy the infrastructure:

terraform apply -var-file=env/dev.tfvars

Type yes when prompted to confirm.

This process takes approximately 10-15 minutes.

Step 9: Verify Deployment

Check that the ECS services are running:

aws ecs list-services --cluster server_cluster
aws ecs describe-services --cluster server_cluster --services machbot_service

Check ALB health:

aws elbv2 describe-target-health \
  --target-group-arn $(terraform output -raw machbot_target_group_arn)

Test the API:

curl https://api.mach-x-b2b.app/health
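
Right after an apply, the health endpoint may not respond until the tasks have started and registered with the load balancer. A minimal polling sketch is shown below. `wait_for_health` is a hypothetical helper, and the default of 30 attempts with a 10-second delay is an arbitrary assumption:

```shell
# Poll a URL until curl succeeds (no connection error, no 4xx/5xx response).
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url=$1 attempts=${2:-30} delay=${3:-10} i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses
    if curl -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Example:
#   wait_for_health https://api.mach-x-b2b.app/health
```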

Frontend Deployment

Deploy to Vercel

The frontend is configured for Vercel deployment:

Option 1: Automatic Deployment

  1. Connect GitHub repository to Vercel
  2. Vercel automatically deploys on push to main

Option 2: Manual Deployment

cd frontend

# Install Vercel CLI
npm i -g vercel

# Deploy
vercel --prod

Environment Variables in Vercel:

Set these in the Vercel dashboard:

VITE_API_BASE_URL=https://api.mach-x-b2b.app
VITE_ENABLE_VOICE_CHAT=true

Alternative: Deploy to AWS S3 + CloudFront

cd frontend

# Build
pnpm build

# Sync to S3
aws s3 sync dist/ s3://your-bucket-name/ --delete

# Invalidate CloudFront cache
aws cloudfront create-invalidation \
  --distribution-id YOUR_DIST_ID \
  --paths "/*"

Security Configuration

Security Groups

ALB Security Group (alb_sg):

  • Inbound: Ports 80, 443 from 0.0.0.0/0
  • Outbound: All traffic

MachBot Security Group (machbot_sg):

  • Inbound: Port 8001 from ALB security group
  • Outbound: All traffic

Commercetools MCP Security Group (ct_mcp_server_sg):

  • Inbound: Port 3001 from VPC CIDR
  • Outbound: All traffic

Voucherify MCP Security Group (voucherify_mcp_server_sg):

  • Inbound: Port 10000 from VPC CIDR
  • Outbound: All traffic

Redis Security Group (redis_sg):

  • Inbound: Port 6379 from VPC CIDR
  • Outbound: All traffic

IAM Roles

ECS Task Role (ecs_task_role):

Policies attached:

  • AmazonECSTaskExecutionRolePolicy: ECS task execution
  • ecs_secrets_policy: Read secrets from Secrets Manager
  • AmazonElastiCacheFullAccess: Access to Redis

Monitoring

CloudWatch Logs

View logs for each service:

# Backend logs
aws logs tail /ecs/machbot-service --follow

# MCP server logs
aws logs tail /ecs/ct-mcp-server --follow

CloudWatch Metrics

Monitor key metrics:

  • ECS: CPU, memory utilization
  • ALB: Request count, target response time, error rate
  • ElastiCache: Cache hits, CPU, memory

Alarms

Set up CloudWatch alarms for:

  • High CPU utilization (> 80%)
  • High memory utilization (> 80%)
  • High ALB error rate (> 5%)
  • Redis connection failures
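
As an illustration of the first item, the sketch below creates a CPU alarm for one ECS service with the AWS CLI. The alarm name, the 5-minute period, and the two evaluation periods are assumptions chosen for the example. `create_cpu_alarm` is a hypothetical helper:

```shell
# Create a high-CPU alarm for one ECS service.
# Usage: create_cpu_alarm CLUSTER SERVICE [THRESHOLD_PERCENT]
create_cpu_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "$2-high-cpu" \
    --namespace AWS/ECS \
    --metric-name CPUUtilization \
    --dimensions "Name=ClusterName,Value=$1" "Name=ServiceName,Value=$2" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold "${3:-80}" \
    --comparison-operator GreaterThanThreshold
}

# Example:
#   create_cpu_alarm server_cluster machbot_service
```

As written, the alarm only changes state. To be notified, add `--alarm-actions` with the ARN of an SNS topic.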

Scaling Configuration

Auto-Scaling

Configure ECS service auto-scaling:

resource "aws_appautoscaling_target" "machbot" {
  max_capacity       = 10
  min_capacity       = 2
  resource_id        = "service/server_cluster/machbot_service"
  scalable_dimension = "ecs:service:DesiredCount"
  service_namespace  = "ecs"
}

resource "aws_appautoscaling_policy" "machbot_cpu" {
  name               = "machbot-cpu-scaling"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.machbot.resource_id
  scalable_dimension = aws_appautoscaling_target.machbot.scalable_dimension
  service_namespace  = aws_appautoscaling_target.machbot.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 70.0
  }
}

Maintenance

Update Application

Update Backend:

# Build new image
cd backend
docker build -t mach-ai-b2b-agent:v2 .

# Push to ECR
docker tag mach-ai-b2b-agent:v2 \
  YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:v2
docker push YOUR_ACCOUNT_ID.dkr.ecr.eu-north-1.amazonaws.com/mach-ai-b2b-agent:v2

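# Update the task definition to reference the new tag, for example by
# setting DOCKER_IMAGE to the :v2 image in env/dev.tfvars and running
# terraform apply. Then force a new deployment: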
aws ecs update-service \
  --cluster server_cluster \
  --service machbot_service \
  --force-new-deployment
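
`update-service` returns as soon as the deployment is requested, not when it finishes. A possible way to block until the rollout settles is the `aws ecs wait services-stable` waiter, sketched here with a hypothetical helper name, `redeploy_and_wait`:

```shell
# Force a new deployment and wait until the service is stable again.
# Usage: redeploy_and_wait CLUSTER SERVICE
redeploy_and_wait() {
  aws ecs update-service \
    --cluster "$1" --service "$2" \
    --force-new-deployment >/dev/null || return 1
  # Polls the service and exits non-zero if it does not stabilize in time
  aws ecs wait services-stable --cluster "$1" --services "$2"
}

# Example:
#   redeploy_and_wait server_cluster machbot_service && echo "rollout complete"
```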

Database Maintenance

Redis Backup:

  • Automatic daily snapshots at 09:00 UTC
  • Manual snapshot: Via AWS Console or CLI

Restore from Snapshot:

aws elasticache create-serverless-cache \
  --serverless-cache-name new-cache \
  --engine redis \
  --snapshot-arns-to-restore arn:aws:elasticache:region:account:snapshot/snapshot-name

Rotate Secrets

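Update secrets in Secrets Manager. Note that update-secret replaces the entire secret value, so include every key the service needs, not only the one being rotated:

aws secretsmanager update-secret \
  --secret-id machbot_secrets \
  --secret-string '{
    "OPENAI_API_KEY": "new-key",
    "LITELLM_MODEL": "cerebras/llama-3.3-70b"
  }'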

# Force ECS service to restart with new secrets
aws ecs update-service \
  --cluster server_cluster \
  --service machbot_service \
  --force-new-deployment

Troubleshooting

Issue: Service tasks keep stopping

Check:

  1. CloudWatch Logs for error messages
  2. Task definition has correct resource limits
  3. Secrets are accessible
  4. Security groups allow necessary traffic

Solution:

# View task stopped reason
aws ecs describe-tasks \
  --cluster server_cluster \
  --tasks TASK_ID
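
When the task ID is not known, the stopped tasks of a service can be listed first and their stop reasons printed in one step. The following is a sketch with a hypothetical helper name, `stopped_task_reasons`. ECS keeps stopped tasks visible only for a limited time, so it should be run soon after the failure:

```shell
# Print the ARN and stop reason of recently stopped tasks for a service.
# Usage: stopped_task_reasons CLUSTER SERVICE
stopped_task_reasons() {
  local tasks
  tasks=$(aws ecs list-tasks \
    --cluster "$1" --service-name "$2" \
    --desired-status STOPPED \
    --query 'taskArns' --output text)
  if [ -z "$tasks" ] || [ "$tasks" = "None" ]; then
    echo "no stopped tasks found"
    return 0
  fi
  # $tasks is intentionally unquoted so each ARN becomes its own argument
  aws ecs describe-tasks --cluster "$1" --tasks $tasks \
    --query 'tasks[].[taskArn, stoppedReason]' --output text
}

# Example:
#   stopped_task_reasons server_cluster machbot_service
```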

Issue: Health checks failing

Check:

  1. Application is listening on correct port
  2. Health check endpoint returns 200
  3. Security groups allow ALB to reach targets

Solution:

# Test from inside the running container
# (requires ECS Exec to be enabled on the service and curl in the image)
aws ecs execute-command \
  --cluster server_cluster \
  --task TASK_ID \
  --container machbot \
  --command "curl localhost:8001/health" \
  --interactive

Issue: Cannot connect to Redis

Check:

  1. The Redis cache is in the available state
  2. Security group allows traffic
  3. Connection string is correct

Solution:

# Get Redis endpoint
aws elasticache describe-serverless-caches \
  --serverless-cache-name b2b-ai-redis

# Test connection from a host inside the VPC
# (ElastiCache Serverless requires TLS)
redis-cli -h YOUR_REDIS_ENDPOINT -p 6379 --tls ping
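
To avoid copying the endpoint by hand, the address can be extracted with a JMESPath `--query`. This is a sketch under the assumption that the cache is named `b2b-ai-redis` as in the command above. `redis_endpoint` is a hypothetical helper:

```shell
# Print the endpoint address of a serverless cache.
# Usage: redis_endpoint CACHE_NAME
redis_endpoint() {
  aws elasticache describe-serverless-caches \
    --serverless-cache-name "$1" \
    --query 'ServerlessCaches[0].Endpoint.Address' \
    --output text
}

# Example (run from a host inside the VPC):
#   redis-cli -h "$(redis_endpoint b2b-ai-redis)" -p 6379 --tls ping
```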

Destroying Infrastructure

To remove all infrastructure:

# Review what will be destroyed
terraform plan -destroy -var-file=env/dev.tfvars

# Destroy
terraform destroy -var-file=env/dev.tfvars

⚠️ Warning: This permanently deletes all resources including data in Redis.

Cost Optimization

Recommendations

  1. Use Fargate Spot: For non-critical workloads
  2. Right-size Tasks: Monitor and adjust CPU/memory
  3. ElastiCache Scaling: Use serverless for variable load
  4. CloudWatch Logs Retention: Set appropriate retention periods
  5. ALB Idle Timeout: Configure for your use case

Estimated Monthly Costs

Development Environment (~$200-300/month):

  • ECS Fargate: ~$100
  • ElastiCache Redis: ~$50
  • ALB: ~$30
  • Data Transfer: ~$20
  • CloudWatch: ~$20

Production Environment (~$800-1100/month):

  • ECS Fargate (scaled): ~$400-600
  • ElastiCache Redis (larger): ~$200-300
  • ALB: ~$50
  • Data Transfer: ~$100
  • CloudWatch: ~$50

Next Steps

  • Backend Setup - Understand the backend
  • Frontend Setup - Understand the frontend
  • Architecture - System architecture
  • Scripts - Utility scripts
