How do you put up a maintenance page in AWS when you want to deploy new versions of your application behind an ELB? We want to have the ELB route traffic to the maintenance instance while the new auto-scaled instances are coming up, and only "flip over" to the new instances once they're fully up. We use auto-scaling to bring existing instances down and new instances, which have the new code, up.


The scenario we're trying to avoid is having the ELB send traffic to the new EC2 instances while also serving the maintenance page. Since we don't have sticky sessions enabled, we want to prevent users from being flipped back and forth between the maintenance page and the application running on an EC2 instance. We also can't just scale up (say, from 2 to 4 instances and then back to 2) to introduce the new instances, because the code changes might involve database changes that would break the old code.


4 Answers



The simplest way on AWS is to use Route 53, their DNS service.


You can use its Weighted Round Robin (WRR) feature.


"You can use WRR to bring servers into production, perform A/B testing, or balance your traffic across regions or data centers of varying sizes."


More information is available in the AWS documentation for this feature.
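As a rough illustration of the WRR idea, here is a minimal sketch that sets the weight of one record in a weighted set through the AWS CLI. The function names, the record names, and the 60-second default TTL are assumptions made for this example, not something taken from the answer, and the sketch assumes weighted CNAME records on a non-apex name.

```shell
#!/bin/sh
# Sketch only: shift DNS traffic between an application endpoint and a
# maintenance endpoint with Route 53 weighted records.
# Function names and arguments are hypothetical.

# Print a change batch that UPSERTs one weighted CNAME record.
# Args: record name, set identifier, target DNS name, weight (0-255), TTL.
weighted_change_batch() {
  local name="$1" set_id="$2" target="$3" weight="$4" ttl="${5:-60}"
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "$name",
        "Type": "CNAME",
        "SetIdentifier": "$set_id",
        "Weight": $weight,
        "TTL": $ttl,
        "ResourceRecords": [{ "Value": "$target" }]
      }
    }
  ]
}
EOF
}

# Apply the weight to one record in the given hosted zone.
# Args: hosted zone id, then the same arguments as weighted_change_batch.
set_weight() {
  local zone_id="$1"
  shift
  aws route53 change-resource-record-sets \
    --hosted-zone-id "$zone_id" \
    --change-batch "$(weighted_change_batch "$@")"
}
```

Entering maintenance would then be two calls, for example `set_weight "$zone" www.example.com. app "$appElbDns" 0` followed by `set_weight "$zone" www.example.com. maintenance "$maintenanceDns" 100`. Resolvers that cached the old answer keep using it until the TTL expires, so the switch is not instantaneous.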


EDIT: Route 53 recently added a new feature that allows DNS Failover to S3. Check their documentation for more details: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html




Route53 is not a good solution for this problem. It takes a significant amount of time for DNS entries to expire before the maintenance page shows up (and then it takes that same amount of time before they update after maintenance is complete). I realize that Lambda and CodeDeploy triggers did not exist at the time this question was asked, but I wanted to let others know that Lambda can be used to create a relatively clean solution for this, which I have detailed in a blog post: https://blog.ajhodges.com/2016/04/aws-lambda-setting-temporary.html


The gist of the solution is to subscribe a Lambda function to CodeDeploy events. During deployments, the function replaces your ASG in the load balancer with a micro instance serving a static page.
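The blog post's Lambda code is not reproduced here. As a hedged sketch of the load balancer swap such a function might perform, the same operations can be expressed as AWS CLI calls. This assumes a Classic ELB and an already running maintenance instance; the function names and arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: swap an Auto Scaling group out of a Classic ELB and a
# maintenance instance in, then reverse it after the deployment.
# Args for both functions: ELB name, ASG name, maintenance instance id.

enter_maintenance() {
  local elb="$1" asg="$2" maint_instance="$3"
  # Put the maintenance instance behind the ELB and wait until it is healthy.
  aws elb register-instances-with-load-balancer \
    --load-balancer-name "$elb" --instances "$maint_instance" || return
  aws elb wait instance-in-service \
    --load-balancer-name "$elb" --instances "$maint_instance" || return
  # Detach the ASG so its instances stop receiving traffic.
  aws autoscaling detach-load-balancers \
    --auto-scaling-group-name "$asg" --load-balancer-names "$elb"
}

exit_maintenance() {
  local elb="$1" asg="$2" maint_instance="$3"
  # Re-attach the ASG, then take the maintenance instance out again.
  aws autoscaling attach-load-balancers \
    --auto-scaling-group-name "$asg" --load-balancer-names "$elb" || return
  aws elb deregister-instances-from-load-balancer \
    --load-balancer-name "$elb" --instances "$maint_instance"
}
```

The ordering is a design choice. Registering the maintenance instance first leaves a short window in which both the old application and the maintenance page can be served; detaching the ASG first would instead produce a short window of 503 responses.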




Came up with another solution that's working great for us. Here are the steps:


  1. Clone your EB environment to create another one; call it something like app-environment-maintenance.
  2. Change the autoscaling configuration of the clone and set both the min and max server counts to zero. This won't cost you any EC2 servers, and the environment will turn grey and sit in your list.
  3. This approach requires CloudFront, which we already use (as many people do, for HTTPS, etc.). CloudFront supports custom error pages.
  4. Create a new S3 website hosting bucket with your error pages. Consider creating separate files for each response code (503, etc.). See #6 for directory requirements and routes.
  5. Add the S3 bucket to your CloudFront distribution as an origin.
  6. Add a new behavior to your CloudFront distribution for a route like /error/*.
  7. Set up an error page in CloudFront to handle 503 response codes and point it to your S3 bucket route, like /error/503-error.html
  8. Finally, you can use the AWS CLI to swap the environment CNAMEs and take your main environment into maintenance mode. For instance:

    aws elasticbeanstalk swap-environment-cnames \
        --profile "$awsProfile" \
        --region "$awsRegion" \
        --output text \
        --source-environment-name api-prod \
        --destination-environment-name api-prod-maintenance

This swaps your api-prod environment into maintenance mode. The ELB returns a 503 because there are no running EC2 instances, and CloudFront catches the 503 and returns your 503 error page.
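Two pieces of the one-time setup can also be sketched on the command line: pinning the maintenance environment to zero instances, and the custom error response that maps a 503 to the S3-hosted page. The function names are hypothetical, and the `ErrorCachingMinTTL` of 0 is an assumption chosen so that the error page is not cached after maintenance ends.

```shell
#!/bin/sh
# Sketch only: setup helpers for the maintenance environment.

# Set the environment's Auto Scaling min and max size to zero.
# Arg: Elastic Beanstalk environment name.
scale_env_to_zero() {
  local env_name="$1"
  aws elasticbeanstalk update-environment \
    --environment-name "$env_name" \
    --option-settings \
      Namespace=aws:autoscaling:asg,OptionName=MinSize,Value=0 \
      Namespace=aws:autoscaling:asg,OptionName=MaxSize,Value=0
}

# Print the CustomErrorResponses fragment of a CloudFront distribution
# config that serves the given page for 503 responses from the origin.
# Arg: path of the error page (defaults to the path used in the answer).
custom_error_responses() {
  local page="${1:-/error/503-error.html}"
  cat <<EOF
{
  "Quantity": 1,
  "Items": [
    {
      "ErrorCode": 503,
      "ResponsePagePath": "$page",
      "ResponseCode": "503",
      "ErrorCachingMinTTL": 0
    }
  ]
}
EOF
}
```

The fragment would be merged into the full distribution config before updating the distribution; the same settings can be entered in the CloudFront console's error pages tab.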


And that's it. I know there are quite a few steps, and I tried a lot of the suggested options out there, including Route 53, but all of them have issues with how they work with ELBs and CloudFront.


Note that after you swap the hostnames for the environments, it takes about a minute or so to propagate.
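Because of that delay, a deployment script may want to confirm that the maintenance page is actually being served before it continues. A minimal polling sketch follows; the function names, retry count, and delay are assumptions, and it presumes the error page is returned with status 503.

```shell
#!/bin/sh
# Sketch only: poll a URL until it returns the expected HTTP status.

# Print the HTTP status code for a URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Args: URL, expected status, max attempts (default 30), delay in seconds (default 5).
# Returns 0 once the status matches, 1 if it never does.
wait_for_status() {
  local url="$1" expected="$2" tries="${3:-30}" delay="${4:-5}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$(http_status "$url")" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_status "https://api.example.com/" 503` after the swap, and `wait_for_status "https://api.example.com/" 200` after swapping back (the URL is a placeholder).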




Our deployment process first runs a CloudFormation stack to spin up an EC2 micro instance (the maintenance instance), which copies a predefined static page from S3 onto the instance. The stack is given the ELBs to which the micro instance is attached. Then a script (PowerShell or CLI) removes the web instances (EC2) from the ELBs, leaving only the maintenance instance.


This way we switch to the maintenance instance during the deployment process.


In our case, we have two ELBs, one external and one internal. The internal ELB is not updated during this process, and it is how we run our post-production-deployment smoke tests. Once testing is done, we run another script to attach the web instances back to the ELBs and delete the maintenance stack.
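The flow described above might look roughly like this with the AWS CLI. It is a sketch under assumptions, not the actual scripts: it assumes a Classic ELB, a template that takes a hypothetical `LoadBalancerName` parameter and attaches the maintenance instance itself, and hypothetical function names.

```shell
#!/bin/sh
# Sketch only: maintenance stack up, web instances out; then the reverse.

# Args: stack name, template URL, external ELB name, web instance ids...
start_maintenance() {
  local stack="$1" template_url="$2" elb="$3"
  shift 3
  # Create the maintenance stack and wait until it is fully up.
  aws cloudformation create-stack \
    --stack-name "$stack" \
    --template-url "$template_url" \
    --parameters "ParameterKey=LoadBalancerName,ParameterValue=$elb" || return
  aws cloudformation wait stack-create-complete --stack-name "$stack" || return
  # Remove the web instances, leaving only the maintenance instance.
  aws elb deregister-instances-from-load-balancer \
    --load-balancer-name "$elb" --instances "$@"
}

# Args: stack name, external ELB name, web instance ids...
end_maintenance() {
  local stack="$1" elb="$2"
  shift 2
  # Re-attach the web instances and wait until they pass health checks.
  aws elb register-instances-with-load-balancer \
    --load-balancer-name "$elb" --instances "$@" || return
  aws elb wait instance-in-service \
    --load-balancer-name "$elb" --instances "$@" || return
  # Deleting the stack removes the maintenance instance.
  aws cloudformation delete-stack --stack-name "$stack"
}
```

Waiting for the web instances to be in service before deleting the stack is a choice made here to avoid a gap with no healthy instance behind the ELB.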
