Bug 1414068
| Field | Value |
|---|---|
| Summary | 4.9.3-200 kernel causes kubernetes dns to not work |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | 25 |
| Status | CLOSED ERRATA |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Reporter | Dusty Mabe <dustymabe> |
| Assignee | Kernel Maintainer List <kernel-maint> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | cz172638, eparis, gansalmon, ichavero, itamar, jbrooks, jonathan, jpazdziora, kernel-maint, labbott, madhu.chinakonda, mchehab |
| Fixed In Version | kernel-4.9.5-200.fc25 kernel-4.9.5-100.fc24 |
| Last Closed | 2017-01-24 03:20:34 UTC |
| Bug Blocks | 1414468 |

Description
Dusty Mabe
2017-01-17 16:30:04 UTC
Created attachment 1241888 [details]
journal-from-4.8.16.txt.gz
Created attachment 1241889 [details]
journal-from-4.9.3.txt.gz
Note that this system has selinux disabled because of https://bugzilla.redhat.com/show_bug.cgi?id=1414096. Jason Brooks has confirmed that the same behavior happens on a system with newer kubernetes (1.5 with a fix for the selinux issue) and selinux enforcing.

I tested on f25 w/ the 4.10.0-0.rc4.git0.1.fc26.x86_64 kernel and kube 1.5.1, and kube-dns works as expected, w/ selinux enforcing.

So this was fun and definitely a kernel regression of some sort. I'll attach the iptables rules of the node. They are exactly (modulo the generated chain names) the same between a working 4.8 and a broken 4.9 kernel.

In the iptables rules I'm about to attach we have 1 container with ip addr 172.16.35.3. I try to run 'dig' from that container. We have another container 172.16.35.2. It is running DNS. We have a completely virtual ip address/udp port 10.254.0.10:53. Any traffic to the virtual ip/port should get dnat'd to 172.16.35.2:53.

On a 4.9 kernel, listening on the host with `tcpdump -i any`, I see:

    17:22:24.273178 IP 172.16.35.2.49994 > 10.254.0.10.domain: 46023+ [1au] A? www.google.com. (43)

Basically I see traffic from the 'dig' to the virtual ip/port. Nothing else.

On a 4.8 kernel with the exact same setup and iptables rules, again with `tcpdump -i any`, I see:

    18:21:25.949497 IP 172.16.35.2.42645 > 10.254.0.10.domain: 54717+ [1au] A? www.google.com. (43)
    18:21:25.949565 IP 172.16.35.2.42645 > 172.16.35.3.domain: 54717+ [1au] A? www.google.com. (43)
    18:21:25.954133 IP 172.16.35.3.domain > 172.16.35.2.42645: 54717 1/0/1 A 216.58.219.68 (59)
    18:21:25.954147 IP 10.254.0.10.domain > 172.16.35.2.42645: 54717 1/0/1 A 216.58.219.68 (59)

Which is what we'd expect. I see the client->vip. Then I see a second packet that has been DNAT'd to the real destination. I see the return from the real destination and the reversal of the DNAT.

An interesting thing I noticed when playing with tcpdump is that the host sees the first packet coming from somewhere different in 4.8 vs 4.9. In 4.8 I can do `tcpdump -i docker0` and I see all 4 (expected) packets. In 4.9, listening only on docker0 shows NO traffic at all. Instead, in 4.9 I can only see the single packet using `tcpdump -i vethca4159a`.

docker0 is a linux bridge:

    # brctl show
    bridge name     bridge id               STP enabled     interfaces
    docker0         8000.0242cb34b484       no              veth8ebe5b8
                                                            vethca4159a

It is as if on 4.9 the frames are not coming off of the bridge and instead are coming directly off the veth, and the packets are not going through iptables.

I can relatively easily set up a reproducer for you or give you root access to a VM that reproduces the issue.

Created attachment 1241980 [details]
full iptables ruleset that are not matching
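
The comparison between the two traces above can also be done mechanically. What follows is a minimal Python sketch, not part of the original report: it parses `tcpdump` lines and lists which legs of the expected DNAT round trip are missing. The function names (`parse_flows`, `missing_dnat_steps`) and the step labels are hypothetical, and the client and backend addresses are taken from the traces as captured (172.16.35.2 as the source, 172.16.35.3 as the DNS backend).

```python
import re

# Matches the "IP src.port > dst.port:" part of a tcpdump line such as
# "18:21:25.949497 IP 172.16.35.2.42645 > 10.254.0.10.domain: ..."
PACKET_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.\S+ > (\d+\.\d+\.\d+\.\d+)\.[^\s:]+:"
)

# Traces copied from the report.
TRACE_4_9 = """\
17:22:24.273178 IP 172.16.35.2.49994 > 10.254.0.10.domain: 46023+ [1au] A? www.google.com. (43)
"""

TRACE_4_8 = """\
18:21:25.949497 IP 172.16.35.2.42645 > 10.254.0.10.domain: 54717+ [1au] A? www.google.com. (43)
18:21:25.949565 IP 172.16.35.2.42645 > 172.16.35.3.domain: 54717+ [1au] A? www.google.com. (43)
18:21:25.954133 IP 172.16.35.3.domain > 172.16.35.2.42645: 54717 1/0/1 A 216.58.219.68 (59)
18:21:25.954147 IP 10.254.0.10.domain > 172.16.35.2.42645: 54717 1/0/1 A 216.58.219.68 (59)
"""

# Addresses as they appear in the traces above.
CLIENT, VIP, BACKEND = "172.16.35.2", "10.254.0.10", "172.16.35.3"


def parse_flows(trace):
    """Return (src_ip, dst_ip) pairs for every IPv4 line in a tcpdump trace."""
    return [m.groups() for m in PACKET_RE.finditer(trace)]


def missing_dnat_steps(trace, client, vip, backend):
    """List the legs of the expected DNAT round trip absent from the trace."""
    expected = [
        ("request to virtual IP", (client, vip)),
        ("request after DNAT", (client, backend)),
        ("reply from backend", (backend, client)),
        ("reply after reverse DNAT", (vip, client)),
    ]
    seen = set(parse_flows(trace))
    return [name for name, flow in expected if flow not in seen]


if __name__ == "__main__":
    print("4.8 missing:", missing_dnat_steps(TRACE_4_8, CLIENT, VIP, BACKEND))
    print("4.9 missing:", missing_dnat_steps(TRACE_4_9, CLIENT, VIP, BACKEND))
```

On the 4.8 trace nothing is missing; on the 4.9 trace only the first leg (client to virtual IP) is present, which matches the observation that the packet never gets DNAT'd. The sketch only looks at address pairs, so it would not distinguish retransmissions or unrelated flows between the same hosts.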
I confirm that 4.10.0-0.rc4.git0.1.fc26.x86_64 is working for me. Packets are showing up on docker0 and are having iptables rules applied...

I'm also having success w/ this 4.9.4-202.rhbz1414068.fc25.x86_64 kernel: https://koji.fedoraproject.org/koji/taskinfo?taskID=17316113

I'll commit the fix to the repository. This should show up in the 4.9.5 kernel or another 4.9.4 build if that happens for some reason.

kernel-4.9.5-200.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-e6012e74b6

kernel-4.9.5-100.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2017-18ce368ba3

kernel-4.9.5-100.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-18ce368ba3

kernel-4.9.5-200.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-e6012e74b6

kernel-4.9.5-200.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.9.5-100.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
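
For anyone checking a node against the fixed builds listed in this report, here is a rough Python sketch that compares a kernel release string (as printed by `uname -r`) with the fixed build for the same Fedora release. The helper names (`parse_kernel`, `at_or_after_fixed_build`) are hypothetical, and the comparison is a simplified numeric one, not RPM's own version ordering. It only says whether a kernel is at or past the fixed build: it does not say whether an older kernel is affected (4.8 kernels are not), and a test build such as 4.9.4-202.rhbz1414068.fc25 carries the fix while still comparing lower.

```python
import re

# Fixed builds per Fedora release, copied from "Fixed In Version" in this report.
FIXED_BUILDS = {
    "fc25": ((4, 9, 5), 200),
    "fc24": ((4, 9, 5), 100),
}

# Accepts e.g. "4.9.3-200.fc25.x86_64" or "kernel-4.9.5-100.fc24".
RELEASE_RE = re.compile(r"^(?:kernel-)?(\d+(?:\.\d+)*)-(\d+)\.(?:.*\.)?(fc\d+)")


def parse_kernel(release):
    """Split '4.9.3-200.fc25.x86_64' into ((4, 9, 3), 200, 'fc25')."""
    m = RELEASE_RE.match(release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = tuple(int(part) for part in m.group(1).split("."))
    return version, int(m.group(2)), m.group(3)


def at_or_after_fixed_build(release):
    """Compare against the fixed build for the same Fedora release.

    Returns True or False, or None when this report lists no fixed
    build for that Fedora release.
    """
    version, build, dist = parse_kernel(release)
    fixed = FIXED_BUILDS.get(dist)
    if fixed is None:
        return None
    return (version, build) >= fixed


if __name__ == "__main__":
    for release in ("4.9.3-200.fc25.x86_64", "4.9.5-200.fc25.x86_64"):
        print(release, at_or_after_fixed_build(release))
```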