Issue
I'm using nodelocaldns with CoreDNS as the upstream.
The weird part: a few minutes after all nodes reboot, DNS queries over TCP to nodelocaldns start timing out, while queries over UDP keep working.
netshoot-66bc59cdd7-wrj9t:~# while true; do date; dig kubernetes.default.svc.cluster.local +short +retries=0 +tcp && sleep 5 || break; done;
...
...
Wed Jan 7 10:19:22 UTC 2026
10.233.0.1
Wed Jan 7 10:19:27 UTC 2026
10.233.0.1
Wed Jan 7 10:19:32 UTC 2026
10.233.0.1
Wed Jan 7 10:19:37 UTC 2026
10.233.0.1
Wed Jan 7 10:19:42 UTC 2026
10.233.0.1
Wed Jan 7 10:19:47 UTC 2026
10.233.0.1
Wed Jan 7 10:19:52 UTC 2026
;; communications error to 169.254.25.10#53: timed out
; <<>> DiG 9.20.10 <<>> kubernetes.default.svc.cluster.local +short +retries=0 +tcp
;; global options: +cmd
;; no servers could be reached
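The dig loop above only exercises TCP. As a rough way to compare both transports against the same server in one run, here is a minimal Python sketch using only the standard library. The function names (`build_query`, `query_udp`, `query_tcp`), the fixed transaction ID, and the 5 second default timeout are illustrative assumptions, not part of nodelocaldns or dig.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Encode a minimal DNS query (RFC 1035 wire format); qtype 1 is an A record."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)  # QCLASS=IN


def query_udp(server: str, name: str, timeout: float = 5.0) -> bool:
    """Return True if the server sends any UDP reply within `timeout` seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            sock.recvfrom(4096)
            return True
        except OSError:  # includes socket timeouts
            return False


def query_tcp(server: str, name: str, timeout: float = 5.0) -> bool:
    """Return True if the server starts a TCP reply within `timeout` seconds."""
    try:
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            msg = build_query(name)
            # DNS over TCP prefixes every message with a 2-byte length
            sock.sendall(struct.pack("!H", len(msg)) + msg)
            return len(sock.recv(2)) == 2
    except OSError:  # connect failure, reset, or timeout
        return False
```

Calling `query_udp("169.254.25.10", "kubernetes.default.svc.cluster.local")` and `query_tcp(...)` with the same arguments from inside a pod should, if the behavior matches the dig output above, return `True` for UDP and `False` for TCP once the problem appears.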
10.233.0.3 is the CoreDNS Service IP.
169.254.25.10 is the IP of the nodelocaldns network interface.
The nodelocaldns ConfigMap:
apiVersion: v1
data:
  Corefile: |
    cluster.local:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.25.10
        forward . 10.233.0.3 {
            force_tcp
        }
        prometheus :9253
        health 169.254.25.10:9254
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.233.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.233.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 223.5.5.5
        prometheus :9253
    }
kind: ConfigMap
If I query CoreDNS directly, it works.
If I delete the nodelocaldns pods, it works.
If I add the pprof plugin, it works.
I debugged with a netshoot container and found that the TCP handshake completes, but nodelocaldns never sends any data back to the client.
Can somebody help me?
Thanks in advance.
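To make the "handshake completes, but no data comes back" observation reproducible without a packet capture, the exchange can be split into stages and timed. The following is a minimal sketch under stated assumptions: `probe_tcp_dns`, the stage labels, and the hard-coded `QUERY` bytes (an A query for kubernetes.default.svc.cluster.local) are hypothetical names chosen for illustration, not anything provided by nodelocaldns or CoreDNS.

```python
import socket
import struct
import time

# Assumed example payload: wire-format A query for
# kubernetes.default.svc.cluster.local (ID 0x1234, RD flag set).
QUERY = (
    b"\x12\x34\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
    b"\x0akubernetes\x07default\x03svc\x07cluster\x05local\x00"
    b"\x00\x01\x00\x01"
)


def probe_tcp_dns(server: str, payload: bytes, port: int = 53,
                  timeout: float = 5.0) -> dict:
    """Send one length-prefixed DNS message and report how far the exchange got.

    The returned "stage" is the last stage reached:
    connect, send, wait_response, closed_by_peer, or answered.
    """
    result = {"stage": "connect", "connect_s": None, "response_s": None}
    start = time.monotonic()
    try:
        sock = socket.create_connection((server, port), timeout=timeout)
    except OSError:
        return result  # the TCP handshake itself failed or timed out
    with sock:
        result["connect_s"] = time.monotonic() - start
        result["stage"] = "send"
        try:
            sock.sendall(struct.pack("!H", len(payload)) + payload)
            result["stage"] = "wait_response"
            first = sock.recv(2)  # 2-byte length prefix of the reply
        except OSError:
            return result  # timeout or reset while sending or waiting
        if not first:
            result["stage"] = "closed_by_peer"
            return result
        result["response_s"] = time.monotonic() - start
        result["stage"] = "answered"
    return result
```

For example, `probe_tcp_dns("169.254.25.10", QUERY)` would be expected to stop at `wait_response` with a populated `connect_s` if the server accepts the connection but never replies, and to reach `answered` when pointed at 10.233.0.3 directly.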
k8s-dns-node-cache:1.21.1
coredns:v1.10.1
kubernetes: v1.24.6