AWS ELB is extremely slow and patchy

I have set up an internet-facing ELB in front of the Apache Airflow webserver, which listens on port 8080 of the instance.

Configuration

  1. Single-AZ ELB
  2. Autoscaling group with single m4.large instance

Below is the Terraform resource for the ELB:

resource "aws_elb" "airflow_elb" {
  name = "${var.domain_name}-elb"
  subnets = [
    "${aws_subnet.private.id}"]

  security_groups = [
    "${aws_security_group.public.id}"]

  listener {
    instance_port = 8080
    instance_protocol = "http"
    lb_port = 80
    lb_protocol = "http"
  }

  health_check {
    healthy_threshold = "${var.elb_healthy_threshold}"
    interval = "${var.elb_interval}"
    target = "HTTP:8080/admin/"
    timeout = "${var.elb_timeout}"
    unhealthy_threshold = "${var.elb_unhealthy_threshold}"
  }

  access_logs {
    bucket = "${aws_s3_bucket.bucket.bucket}"
    bucket_prefix = "elb-logs"
    interval = 60
  }

  cross_zone_load_balancing = false
  idle_timeout = 400
  connection_draining = true
  connection_draining_timeout = 400



  tags {
    Name = "airflow-elb"
  }

}

I can SSH-tunnel to the instance's private IP via a bastion host, and the portal works without any issue. But when I access it via the DNS name of the ELB, one of two things happens: either it is extremely slow (I can see the webserver respond to the request almost instantaneously, yet the page takes forever to load), or the ELB returns HTTP 503.

Please help!!

EDIT1: Backend processing time is very high, but only for requests that come through the ELB. Over the tunneled connection it behaves normally.
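One way to see where the time goes is to compare the three timing fields in the ELB access logs. The following is a minimal sketch, assuming the documented Classic Load Balancer access log layout. The sample line is an illustrative entry, not one from this setup, and the function names are made up for this example.

```python
import shlex

# Field order of one Classic Load Balancer access log entry.
FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]
TIMING_FIELDS = FIELDS[4:7]

# Illustrative sample line (not from the setup in this question).
SAMPLE = (
    '2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 '
    '10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -'
)


def parse_elb_log_line(line):
    """Split one access log line into a dict keyed by field name."""
    # shlex keeps the double-quoted request and user agent as single tokens.
    entry = dict(zip(FIELDS, shlex.split(line)))
    for key in TIMING_FIELDS:
        entry[key] = float(entry[key])
    return entry


def slowest_phase(entry):
    """Return the name of the timing field with the largest value.

    A value of -1 means the load balancer could not dispatch the request
    to a registered instance, so such entries need separate handling.
    """
    return max(TIMING_FIELDS, key=lambda name: entry[name])


if __name__ == "__main__":
    parsed = parse_elb_log_line(SAMPLE)
    print(slowest_phase(parsed), parsed["backend_processing_time"])
```

If `backend_processing_time` dominates only for requests that come through the ELB, the time is being spent between the load balancer and the instance rather than on the client side.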

Assuming you are using a Classic ELB, the AWS documentation states three possible causes for the HTTP 503:

Cause 1: Insufficient capacity in the load balancer to handle the request.

Cause 2: No registered instances.

Cause 3: No healthy instances.

Log in to the console and check whether the instances are registered with the ELB and, if they are, whether they are in a healthy state.

Also, I am curious: why have you used only one AZ?
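The registration and health check can also be scripted instead of done in the console. This is a rough sketch, assuming boto3 is installed and credentials are configured. `describe_instance_health` is the boto3 call for Classic Load Balancers; the helper names and the load balancer name in the usage comment are hypothetical.

```python
def out_of_service(response):
    """List (instance_id, reason_code, description) for instances not InService.

    `response` is the dict returned by the Classic ELB
    describe_instance_health API call.
    """
    return [
        (state["InstanceId"], state.get("ReasonCode", ""), state.get("Description", ""))
        for state in response.get("InstanceStates", [])
        if state.get("State") != "InService"
    ]


def fetch_instance_health(load_balancer_name):
    """Query the live health state; needs boto3 and AWS credentials."""
    import boto3  # third-party AWS SDK, imported lazily so the helper above works without it

    client = boto3.client("elb")  # "elb" is the Classic Load Balancer API
    return client.describe_instance_health(LoadBalancerName=load_balancer_name)


# Usage (hypothetical load balancer name):
#   problems = out_of_service(fetch_instance_health("my-airflow-elb"))
#   An empty list means every registered instance is InService.
#   Note that an ELB with no registered instances also yields an empty list,
#   so check the length of response["InstanceStates"] as well.
```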

This can be a problem with Amazon's ELB itself. The ELB scales the number of its own instances with the number of requests, so you should see a peak of requests at those times. Amazon adds instances to fit the load. The instances are reachable during the launch process, so your clients get those timeouts. It is completely random, so you should:

The issue was actually caused by using the sync worker with Python 3 and by how the ELB reuses HTTP connections. The issue disappeared after changing from the sync worker to gevent. However, gevent does not support Python 3 as of yet, so we are stuck with Python 2.7 for now.
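To confirm which worker type a webserver is configured with, the Airflow configuration file can be read directly. This is a small sketch under the assumption that the setting lives in `airflow.cfg` as `worker_class` under `[webserver]` and defaults to `sync`. The config text below is a made-up fragment, and the helper name is hypothetical.

```python
import configparser

# Made-up fragment of an airflow.cfg, for illustration only.
EXAMPLE_CFG = """
[webserver]
web_server_port = 8080
worker_class = gevent
"""


def webserver_worker_class(cfg_text):
    """Return the configured gunicorn worker class, or 'sync' if unset."""
    # Interpolation is disabled so literal '%' characters in other
    # options do not break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("webserver", "worker_class", fallback="sync")


if __name__ == "__main__":
    print(webserver_worker_class(EXAMPLE_CFG))
```

If this returns `sync` on an instance behind the ELB, the behaviour described in this answer is a plausible cause.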

You can try this answer: https://stackoverflow.com/a/42300647/2727462

Solution: If your DNS is configured to point directly at the ELB, you should reduce the TTL of the (IP, DNS) association. The ELB's IP can change at any time, so a stale record can seriously damage your traffic.

Clients keep some of the ELB's IPs in cache, so you can run into this kind of trouble.

Scaling Elastic Load Balancers

Once you create an elastic load balancer, you must configure it to accept incoming traffic and route requests to your EC2 instances. These configuration parameters are stored by the controller, and the controller ensures that all of the load balancers are operating with the correct configuration. The controller will also monitor the load balancers and manage the capacity that is used to handle the client requests. It increases capacity by utilizing either larger resources (resources with higher performance characteristics) or more individual resources. The Elastic Load Balancing service will update the Domain Name System (DNS) record of the load balancer when it scales so that the new resources have their respective IP addresses registered in DNS. The DNS record that is created includes a Time-to-Live (TTL) setting of 60 seconds, with the expectation that clients will re-lookup the DNS at least every 60 seconds. By default, Elastic Load Balancing will return multiple IP addresses when clients perform a DNS resolution, with the records being randomly ordered on each DNS resolution request. As the traffic profile changes, the controller service will scale the load balancers to handle more requests, scaling equally in all Availability Zones.

In my case the problem was the TTL. The issue can be tracked with a command like wget https://your-url. The command output shows the IP address it tries to connect to, so when the connection hangs you can spot a wrong, outdated IP address. If that happens, check your DNS settings and update the TTL.
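The same check can be scripted: resolve the ELB's DNS name and compare the result with the addresses a client is still using. This is a minimal sketch using only the standard library. The function names and the hostname in the usage comment are placeholders, not values from this setup.

```python
import socket


def resolve_ips(hostname, port=80):
    """Return the set of IP addresses the resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def stale_ips(cached, current):
    """Return addresses a client still holds that DNS no longer returns."""
    return set(cached) - set(current)


# Usage (placeholder hostname): call resolve_ips once, keep the result,
# then call it again later and compare.
#   before = resolve_ips("my-elb-1234.us-east-1.elb.amazonaws.com")
#   ... wait longer than the record's TTL ...
#   after = resolve_ips("my-elb-1234.us-east-1.elb.amazonaws.com")
#   print(stale_ips(before, after))
```

A non-empty result from `stale_ips` means a client that cached the earlier answer may still be connecting to an address the ELB no longer uses.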

Comments
  • Have you tried bypassing the ELB and hitting your instance directly, like http://{your_instance_ip}:8080 ?
  • Yes, as mentioned above, it works well if done that way
  • "${aws_subnet.private.id}" An Internet-facing ELB should not be in a private subnet. But also, if this is indeed a private subnet, the ELB should not be working at all. ELBs should not usually be in the same subnets as the instances they are balancing, but many people seem to assume the opposite.
  • @sqlbot thanks will try it out
  • Everything checks out: the instance is registered and healthy
  • Are the ports mapped properly? Also, can you share the logs from S3? I can see you have enabled logging
  • As I said, the portal works and the connection is not down, but it is patchy: it works at times and doesn't at others, and when it works it is very, very slow
  • The problem is that the backend seems fine. I was able to tunnel port 8080 and the portal works really well from my machine's browser, and the ELB access logs show very low backend latency as well. The ELB is breaking things and I have no clue why