Bug 1093803
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | dhclient fails to renew IP address lease after system time changes | | |
| Product | Red Hat Enterprise Linux 7 | Reporter | Jiri Jaburek <jjaburek> |
| Component | dhcp | Assignee | Pavel Zhukov <pzhukov> |
| Status | CLOSED ERRATA | QA Contact | Ondrej Mejzlik <omejzlik> |
| Severity | medium | Docs Contact | |
| Priority | high | | |
| Version | 7.0 | CC | awilliam, devurandom, egasiorowski, freaky, hartsjc, ipilcher, jburke, jeharris, jpopelka, jstancek, linuxgcc, matthew.dowdell, mbliss, michaelv, omejzlik, osabart, pemensik, psppsn96, ptalbert, pzhukov, rhsu5, sbrivio, sferguso, sukulkar, thaller, thozza, xingli, zguo |
| Target Milestone | rc | Keywords | Reproducer, TestCaseApproved |
| Target Release | --- | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | | | |
| Fixed In Version | dhcp-4.2.5-78.el7, bind-9.11.4-11.P2.el7 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2020-03-31 19:57:27 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | | | |
| Bug Blocks | 1095800, 1380362, 1393869, 1534569, 1709724, 1716960 | | |
| Attachments | | | |
Description
Jiri Jaburek
2014-05-02 17:26:54 UTC
(In reply to Jiri Jaburek from comment #0)
> Created attachment 891977 [details]
> reproducer, extract from the audit-test suite, utils.plib
>
> Actual results:
> dhclient is affected by system time changes
>
> Expected results:
> dhclient can operate independently of system time
> (or at least survive large jumps)
>
> Additional info:
> This issue was originally found and described in bug 1034737 as a possible
> NetworkManager problem. Since then, NM was altered to use CLOCK_BOOTTIME,
> but the issue remained, so I tried it without NM and was still able to
> reproduce it, using the steps described above.
>
> The steps are a simplification of several test cases done by the audit-test
> suite, which we use for Common Criteria Certification testing. An example
> reproducer, extracted from the suite, is attached.

FTR: Regarding NetworkManager, ... NM expects dhclient to report back in time with a lease update. In bug 1034737 it can be seen that dhclient does indeed not report back to extend the address lifetime, so the address got removed by the kernel. NM itself does not watchdog dhclient regarding the timeout. I think that is correct, because dhclient not extending the address lifetime is not an error from NM's point of view.

Thanks, yes, I know about this issue (bug #916116, comment #2). I never got into fixing it as it seemed like an invasive change to me (the affected code is common for dhclient and dhcpd). We should fix it, sure, but I'm not sure about RHEL-7 as we don't have enough sources (dhcp/dhclient is for historical reasons "sanity"-tested by RTT team) to make sure it doesn't break anything else.

(In reply to Jiri Popelka from comment #3)
> Thanks, yes, I know about this issue (bug #916116, comment #2).
> I never got into fixing it as it seemed like an invasive change to me (the
> affected code is common for dhclient and dhcpd). We should fix it, sure, but
> I'm not sure about RHEL-7 as we don't have enough sources (dhcp/dhclient is
> for historical reasons "sanity"-tested by RTT team) to make sure it doesn't
> break anything else.

Regarding my use case, is there some (preferably easily scriptable) way to tell dhclient to stop waiting and send a new lease request?

The dhclient manpage mentions "OMAPI" and an omshell(1) binary, which is unfortunately available within the dhcp (server) package, not dhclient, and which seems to be somewhat unfinished / hardly scriptable without expect (tcl).

The manpage also mentions "THE CONTROL OBJECT", without going into *what* it actually is (file? socket? where?). It could perhaps be used as well, by pausing and resuming dhclient right away, though that would break some connections.

In an ideal world, dhclient would respond to SIGHUP / SIGUSR1 / SIGUSR2 by restarting the negotiation sequence (requesting a new lease).

Thanks,
Jiri

(In reply to Jiri Jaburek from comment #4)
> is there some (preferably easily scriptable) way to
> tell dhclient to stop waiting and send a new lease request?

I'm not aware of any.

> The dhclient manpage mentions "OMAPI" and an omshell(1) binary

I've never tried to use omshell on dhclient, only dhcpd.

*** Bug 1148159 has been marked as a duplicate of this bug. ***

I have bad news. Using monotonic time (CLOCK_MONOTONIC_RAW or CLOCK_BOOTTIME) instead of gettimeofday() in dhclient/dhcpd wouldn't be a problem. However, the message dispatching code uses [1] the timer mechanism from bind's libisc library, and that uses [2] gettimeofday() too. So far I have no idea how to fix it without rewriting bind's internals. Actually there's a possibility to use our own timer instead of the one from libisc, but that'd probably mean reverting [3], which is a step back, and I have no idea what might break with that.
[1] https://source.isc.org/cgi-bin/gitweb.cgi?p=dhcp.git;a=blob;f=common/dispatch.c;hb=HEAD#l354
[2] https://source.isc.org/cgi-bin/gitweb.cgi?p=bind9.git;a=blob;f=lib/isc/unix/stdtime.c#l76
[3] https://source.isc.org/cgi-bin/gitweb.cgi?p=dhcp.git;a=commitdiff;h=98bf16077d22f28e288a18e184a9d1f97cb5f4f7

I suspect we've (Mike Ruckman and myself) just rediscovered this in Fedora 23 validation testing. The scenario we found is with installs to a VM using libvirt networking. libvirt issues fairly short leases - I think they're valid for an hour. So we've seen this happen on first boot after install:

1. system clock is wrong at first - it's a little over an hour fast. dhclient runs, obviously, while the system clock is still wrong: it's showing 21:58 when the real time is 20:46.
2. dhclient plans to renew the lease as it usually does, just before it's half-expired: the logs show a message "renewal in 1478 seconds". That would be approx 22:22.
3. chrony kicks in right after the network comes up, and adjusts the system clock back to the correct time - 20:46.

So now if you run the numbers we have a lease that will expire at 21:46, but dhclient isn't planning to try and renew it until 22:22. And indeed at 21:46 the system's network connection disappears. It seems that when dhclient *does* kick in and try to renew the lease, it fails; I don't know if that's a separate bug, or just a symptom of trying to renew a lease too late.

So the upshot is that on the first boot after install, if the system clock is fast and the router issues fairly short leases (of course, how short the leases have to be depends on how inaccurate the system clock is), you'll wind up with a network connection that drops for good some time after you boot, until you reboot or manually reset the connection somehow.

*** Bug 1323971 has been marked as a duplicate of this bug. ***

*** Bug 1361934 has been marked as a duplicate of this bug. ***

Where is the fixed link?

*** Bug 1446115 has been marked as a duplicate of this bug. ***

Removing Networking RPL.. should possibly be added to core services RPL

*** Bug 1485047 has been marked as a duplicate of this bug. ***

Please use the new AMI Aptio firmware release and retest
Slimpro: 3.06.25
Aptio: Label-026

Joining a case that my customer has to the bug here. Their circumstance is close enough to the core issue here that it makes sense to include it.

1. A VMWare image is migrated to AWS and started.
2. Predictably, after around 1 hour, the instance loses network connectivity, and is subsequently restarted by a cloudwatch job. (AWS lease time for dhcp is 3600sec, or 1 hour)
3. Upon review, the RTC is set to localtime, and not UTC as it should be. When they set it to UTC, the issue vanishes. Upon reverting the RTC back to localtime, the issue reproduces predictably.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1087
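
For readers who want to check the numbers in the Fedora 23 scenario reported earlier in this thread, here is a minimal Python sketch of the arithmetic. It is only a model of the behaviour described in the comments, not dhclient code. The function names are hypothetical, the calendar date is arbitrary, the one-hour lease length is the commenter's own estimate, and the clock is assumed to be stepped back immediately after the renewal timer is armed.

```python
from datetime import datetime, timedelta

# Figures quoted in the Fedora 23 comment. The calendar date is arbitrary
# and the one-hour lease length is the commenter's estimate (assumption).
REAL_AT_LEASE = datetime(2015, 1, 1, 20, 46)  # actual time when the lease was obtained
WALL_AT_LEASE = datetime(2015, 1, 1, 21, 58)  # what the fast system clock showed
RENEW_IN = timedelta(seconds=1478)            # "renewal in 1478 seconds"
LEASE_LENGTH = timedelta(hours=1)


def lease_expiry():
    """Real time at which the lease runs out, counted from when it was granted."""
    return REAL_AT_LEASE + LEASE_LENGTH


def renewal_with_wall_clock_timer():
    """Real time at which a timer stored as an absolute wall-clock deadline fires.

    Assumes the clock is stepped back to the correct time right after the
    timer is armed, so the deadline is reached only when real time catches
    up with the stored wall-clock value.
    """
    return WALL_AT_LEASE + RENEW_IN


def renewal_with_monotonic_timer():
    """Real time at which a timer stored as an interval on a step-free clock fires."""
    return REAL_AT_LEASE + RENEW_IN


for label, when in [
    ("lease expires", lease_expiry()),
    ("wall-clock timer renews", renewal_with_wall_clock_timer()),
    ("monotonic timer renews", renewal_with_monotonic_timer()),
]:
    print(f"{label:>24}: {when:%H:%M:%S}")
```

Under these assumptions the wall-clock deadline is reached roughly 36 minutes after the lease has already expired, which matches the reported loss of connectivity at 21:46. A timer kept as an interval on a clock that ignores clock steps would fire roughly 35 minutes before expiry.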