Terraform Throttling Route53

terraform parallelism
terraform aws provider

Has anyone experienced Terraform being throttled, and becoming VERY slow, when managing AWS Route53 records?

I enabled DEBUG mode and I am getting this:

2018-11-30T14:35:08.467Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: 2018/11/30 14:35:08 [DEBUG] [aws-sdk-go] <?xml  version="1.0"?>
2018-11-30T14:35:08.467Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: <ErrorResponse xmlns="https://route53.amazonaws.com/doc/2013-04-01/"><Error><Type>Sender</Type><Code>Throttling</Code><Message>Rate exceeded</Message></Error><RequestId>REQUEST_ID</RequestId></ErrorResponse>
2018-11-30T14:35:08.518Z [DEBUG] plugin.terraform-provider-aws_v1.36.0_x4: 2018/11/30 14:35:08 [DEBUG] [aws-sdk-go] DEBUG: Validate Response route53/ListResourceRecordSets failed, will retry, error Throttling: Rate exceeded

Terraform takes more than an hour just to run a simple plan, something which normally takes less than 5 minutes.

My infrastructure is organized like this:

alb.tf:

module "ALB" {
  source = "modules/alb"
}

modules/alb/alb.tf:

resource "aws_alb" "ALB" {
  name    = "alb"
  subnets = var.subnets
  ...
}

modules/alb/dns.tf:

resource "aws_route53_record" "r53" {
  count     = "${length(var.cnames_generic_dns)}"
  zone_id   = "HOSTED_ZONE_ID"
  name      = "${element(var.cnames_generic_dns, count.index)}.${var.environment}.${var.domain}"
  type      = "A"

  alias {
    name    = "dualstack.${aws_alb.ALB.dns_name}"
    zone_id = "${aws_alb.ALB.zone_id}"
    evaluate_target_health = false
  }
}

modules/alb/variables.tf:

variable "cnames_generic_dns" {
  type = "list"
  default = [
    "hostname1",
    "hostname2",
    "hostname3",
    "hostname4",
    "hostname5",
    "hostname6",
    "hostname7",
    ...
    "hostname25"
  ]
}

So I am using modules to configure Terraform, and inside the modules there are resources (ALB, DNS, ...).

However, it looks like Terraform is describing every single DNS record in the hosted zone (CNAME and A records, of which I have ~1000), which is causing it to be throttled?

Terraform v0.10.7
Terraform AWS provider version = "~> 1.36.0"

That's a lot of DNS records! That is partly the reason why the AWS API is throttling you.

First, I'd recommend upgrading your AWS provider. v1.36 is fairly old and there have been more than a few bug fixes since.

(Next, but not absolutely necessary, is to use TF v0.11.x if possible.)

In your AWS Provider block, increase max_retries to at least 10 and experiment with higher values.

Then, use Terraform's -parallelism flag to limit TF's concurrency. Try setting it to 5 for starters.
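
As a rough sketch of the two settings above (10 and 5 are only the suggested starting points, not tuned values, and the provider block is assumed to be merged into whatever AWS provider configuration you already have):

```hcl
# Hypothetical provider block; add max_retries to your existing one.
provider "aws" {
  # Maximum number of times a failed (e.g. throttled) AWS API call is retried.
  max_retries = 10
}

# Concurrency is limited on the command line, not in configuration, e.g.:
#   terraform plan -parallelism=5
```

Lower parallelism means fewer simultaneous Route53 calls, at the cost of a longer run when nothing is being throttled.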

Last, enable Terraform's debug mode to see if it gives you any more useful info.

Hope this helps!

It looks like the Route53 throttling is completely resolved after upgrading to a newer AWS provider. We updated the Terraform AWS provider to 1.54.0 like this in our init.tf:

version = "~> 1.54.0"
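
For context, a sketch of where that line might sit, assuming the Terraform 0.10/0.11-style provider block used in the question (anything else in the block is omitted here):

```hcl
provider "aws" {
  # Pessimistic constraint: allows >= 1.54.0 and < 1.55.0
  version = "~> 1.54.0"
}
```

After changing the constraint, re-run terraform init (with -upgrade if an older plugin is already installed) so the newer provider is downloaded.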

Here are more details about the issue and suggestions from HashiCorp engineers:

https://github.com/terraform-providers/terraform-provider-aws/issues/7056

Comments
  • Thanks KJH, I tried max_retries and parallelism but it didn't help. What helped was to leave it running as is, then delete and recreate the records again. I will reply with more details.