DNS problem on AWS EKS when running in private subnets

I have an EKS cluster setup in a VPC. The worker nodes are launched in private subnets. I can successfully deploy pods and services.

However, I'm not able to perform DNS resolution from within the pods. (It works fine on the worker nodes, outside the container.)

Troubleshooting using https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ results in the following from nslookup (timeout after a minute or so):

Server:    172.20.0.10
Address 1: 172.20.0.10

nslookup: can't resolve 'kubernetes.default'
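
For reference, I ran the lookup from a throwaway busybox pod along these lines (pinning busybox:1.28, since nslookup in newer busybox images is known to misbehave):

kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup kubernetes.default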

When I launch the cluster in an all-public VPC, I don't have this problem. Am I missing any necessary steps for DNS resolution from within a private subnet?

Many thanks, Daniel


I feel like I have to give this a proper answer, because coming upon this question was the answer to 10 straight hours of debugging for me. As @Daniel said in his comment, the issue I found was my network ACL blocking outbound traffic on UDP port 53, which Kubernetes uses to resolve DNS records.

The process was especially confusing for me because one of my pods actually worked the entire time, since (I think?) it happened to be in the same availability zone as the kubernetes DNS resolver.


To elaborate on the comment from @Daniel, you need:

  1. an ingress rule for UDP port 53
  2. an ingress rule for UDP on ephemeral ports (e.g. 1025–65535)

I hadn't added (2) and was seeing CoreDNS receiving requests and trying to respond, but the response wasn't getting back to the requester.
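
If you manage the network ACL with the AWS CLI, the two rules above look roughly like this. The ACL ID, rule numbers, and CIDR block are placeholders; substitute your own values:

# (1) allow DNS queries in over UDP 53
aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 \
    --ingress --rule-number 100 --protocol udp --port-range From=53,To=53 \
    --cidr-block 10.0.0.0/16 --rule-action allow

# (2) allow DNS responses back in on ephemeral ports
aws ec2 create-network-acl-entry --network-acl-id acl-0123456789abcdef0 \
    --ingress --rule-number 110 --protocol udp --port-range From=1025,To=65535 \
    --cidr-block 10.0.0.0/16 --rule-action allow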

Some tips for others dealing with these kinds of issues: turn on CoreDNS logging by adding the log plugin to the configmap, which I was able to do with kubectl edit configmap -n kube-system coredns. See the CoreDNS docs on this: https://github.com/coredns/coredns/blob/master/README.md#examples. This can help you figure out whether the issue is CoreDNS receiving the queries or sending the responses back.
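
As a sketch, the edit is just adding log inside the server block of the Corefile key in that configmap. The surrounding plugins below are abbreviated from a typical EKS default; yours may differ:

.:53 {
    log    # log every query and response CoreDNS handles
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30
    reload
}

The query logs then show up in kubectl logs -n kube-system -l k8s-app=kube-dns.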


Re: AWS EKS Kube Cluster and internal/private Route53 queries from pods

Just wanted to post a note on what we needed to do to resolve our issues. Noting that YMMV and everyone has different environments and resolutions, etc.

Disclaimer: We're using the community Terraform EKS module to deploy/manage the VPCs and the EKS clusters. We didn't need to modify any security groups. We are working with multiple clusters, regions, and VPCs.

ref: Terraform EKS module

CoreDNS Changes: We have a DNS relay for private internal zones, so we needed to modify the coredns configmap and add in the DNS relay IP address ...

ec2.internal:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.dev.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}
foo.stage.com:53 {
    errors
    cache 30
    forward . 10.1.1.245
}

...
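
CoreDNS's reload plugin (enabled in the default EKS Corefile) should pick up the configmap edit on its own after a minute or so; failing that, bouncing the deployment works too (kubectl 1.15+):

kubectl -n kube-system rollout restart deployment coredns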

VPC DHCP option sets: Update with the IP of the above relay server if applicable. This requires creating a new option set, since existing ones cannot be modified.

Our DHCP options set looks like this:

["AmazonProvidedDNS", "10.1.1.245", "169.254.169.253"]

ref: AWS DHCP Option Sets
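
For anyone doing this with the AWS CLI instead of Terraform, the flow is roughly as follows; the option-set and VPC IDs are placeholders:

# Create a new option set (existing ones are immutable)
aws ec2 create-dhcp-options --dhcp-configurations \
    "Key=domain-name-servers,Values=AmazonProvidedDNS,10.1.1.245,169.254.169.253"

# Point the VPC at it, using the DhcpOptionsId returned above
aws ec2 associate-dhcp-options --dhcp-options-id dopt-0123456789abcdef0 \
    --vpc-id vpc-0123456789abcdef0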

Route-53 Updates: Associate every Route53 private zone with the VPC ID that needs to resolve it (the VPC where our kube cluster resides and from which the pods will make queries).

There is also a Terraform resource for that: https://www.terraform.io/docs/providers/aws/r/route53_zone_association.html
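
The AWS CLI equivalent looks something like this; the zone ID, region, and VPC ID are placeholders:

aws route53 associate-vpc-with-hosted-zone \
    --hosted-zone-id Z0123456789EXAMPLE \
    --vpc VPCRegion=us-east-1,VPCId=vpc-0123456789abcdef0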


So I had been struggling with this issue as well, for a couple of hours I think; I lost track of time.

Since I am using the default VPC, but with the worker nodes inside the private subnet, it wasn't working.

I went through the amazon-vpc-cni-k8s repo and found the solution.

We have to set the environment variable AWS_VPC_K8S_CNI_EXTERNALSNAT=true on the aws-node daemonset.

You can either get the new YAML and apply it, or just fix it through the dashboard. However, for it to work you have to restart the worker node instances so the IP route tables are refreshed.
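
A one-liner that should make the same edit, assuming the standard aws-node daemonset in kube-system:

kubectl -n kube-system set env daemonset aws-node AWS_VPC_K8S_CNI_EXTERNALSNAT=true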

The issue link is here

Thanks!


We ran into a similar issue where DNS resolution times out on some of the pods, but re-creating the pod a couple of times resolves the problem. Also, it's not every pod on a given node showing issues, only some pods.

It turned out to be due to a bug in version 1.5.4 of the Amazon VPC CNI; more details here: https://github.com/aws/amazon-vpc-cni-k8s/issues/641.

The quick solution is to revert to the recommended version 1.5.3: https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
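
To check which CNI version your nodes are actually running, and to roll back, something like the following should work. The manifest URL follows the layout of the amazon-vpc-cni-k8s repo at the time and may have moved since:

# Show the CNI image version currently deployed
kubectl -n kube-system describe daemonset aws-node | grep Image

# Apply the v1.5.3 manifest
kubectl apply -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.5.3/config/v1.5/aws-k8s-cni.yaml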
