Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 815091
Summary: | wpa_supplicant fails to dbus activate within 25 seconds | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Heiko Adams <bugzilla> | ||||||||
Component: | wpa_supplicant | Assignee: | Dan Williams <dcbw> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||
Severity: | urgent | Docs Contact: | |||||||||
Priority: | unspecified | ||||||||||
Version: | 17 | CC: | cpanceac, cpuobsessed, dalt74, danw, dcbw, gokcen.eraslan, gtmkramer, jdulaney, jirinek, jmontleo, johannbg, karmstrong, korsnick, mads, masami256, mathieu-acct, maxim, metherid, mishu, mrunge, notting, paul, pgueckel, plautrba, psimerda, rdieter, robatino, scampa.giovanni, skr, systemd-maint, vascom2 | ||||||||
Target Milestone: | --- | Keywords: | Regression | ||||||||
Target Release: | --- | ||||||||||
Hardware: | All | ||||||||||
OS: | Linux | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | wpa_supplicant-1.0-0.4.fc17 | Doc Type: | Bug Fix | ||||||||
Doc Text: | Story Points: | --- | |||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2012-05-02 04:53:32 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | |||||||||||
Bug Blocks: | 752650, 817857 | ||||||||||
Attachments: |
|
Description
Heiko Adams
2012-04-22 12:40:59 UTC
I won't reproduce with: NetworkManager-0.9.4.0-6.git20120403.fc17.x86_64 kernel-3.3.2-8.fc17.x86_64 Created attachment 579365 [details]
Screenshot of MN popup menu before restarting the service
This is a screenshot of MN popup menu after booting my system and before restarting the service
Created attachment 579366 [details]
Screenshot of service manager before restarting the service
This is a screenshot of the service-manager after booting my system and before restarting the NM service
I'm seeing similar symptoms, no AP's found initially, at least until I manually restart the NM service. Forgot to mention in comment #4, but NetworkManager-0.9.4.0-1.git20120328.fc17 didn't have this issue. Created attachment 579532 [details]
NetworkManager log
I attatched a NetworkManager's log[NetworkManager log] to here. In the log, I can see this error log. 9528:Apr 23 21:58:47 rune NetworkManager[781]: <error> [1335185927.573217] [nm-s upplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply time out expired, or the network connection was broken. 9529:Apr 23 21:58:47 rune NetworkManager[781]: dbus_g_proxy_cancel_call: asserti on `pending != NULL' failed I am seeing the same thing. I am able to work around it by right clicking the NM icon and unchecking "Enable Wireless" and then checking it off again. (In reply to comment #7) > Did not receive a reply. Possible causes include: the remote application did > not > send a reply, the message bus security policy blocked the reply, the reply > time > out expired, or the network connection was broken. are there any other messages nearby in the log? (like, dbus indicating that it blocked a reply) For example, one user had these messages: Apr 23 21:22:21 eregion NetworkManager[578]: <error> [1335212541.505187] [nm-supplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. Apr 23 21:22:21 eregion NetworkManager[578]: dbus_g_proxy_cancel_call: assertion `pending != NULL' failed Apr 23 21:22:21 eregion NetworkManager[578]: <info> (wlan0): supplicant interface state: starting -> down Apr 23 21:22:21 eregion NetworkManager[578]: <warn> Trying to remove a non-existant call id. Apr 23 21:22:21 eregion dbus-daemon[655]: dbus[655]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Apr 23 21:22:21 eregion dbus[655]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Apr 23 21:22:27 eregion systemd[1]: NetworkManager-wait-online.service: main process exited, code=exited, status=1 Apr 23 21:22:27 eregion systemd[1]: Unit NetworkManager-wait-online.service entered failed state. Apr 23 21:22:27 eregion NetworkManager[578]: <info> wpa_supplicant started Apr 23 21:22:27 eregion systemd[1]: iscsi.service: control process exited, code=exited status=3 which indicate that something is going wrong with dbus/systemd service activation and wpa_supplicant. The code in NM which talks to wpa_supplicant hasn't change in quite a while. I have following log that seems same as #10's one. Apr 24 07:56:01 rune dbus[617]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd' Apr 24 07:56:01 rune dbus-daemon[617]: dbus[617]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd' Apr 24 07:56:20 rune NetworkManager[711]: <error> [1335221780.242066] [nm-supplicant-interface.c:804] interface_add_cb(): (wlan0): error adding interface: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken. Apr 24 07:56:20 rune NetworkManager[711]: dbus_g_proxy_cancel_call: assertion `pending != NULL' failed Apr 24 07:56:20 rune NetworkManager[711]: <info> (wlan0): supplicant interface state: starting -> down Apr 24 07:56:20 rune NetworkManager[711]: <warn> Trying to remove a non-existant call id. Apr 24 07:56:20 rune dbus-daemon[617]: dbus[617]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Apr 24 07:56:20 rune dbus[617]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Apr 24 07:56:26 rune systemd[1]: NetworkManager-wait-online.service: main process exited, code=exited, status=1 Apr 24 07:56:27 rune systemd[1]: Unit NetworkManager-wait-online.service entered failed state. Apr 24 07:56:28 rune NetworkManager[711]: <info> wpa_supplicant started Apr 24 07:56:28 rune kernel: [ 52.479265] RPC: Registered named UNIX socket transport module. *** Bug 815856 has been marked as a duplicate of this bug. *** Seeing the same here. Booting seems to take an awful lot of time due to this, too. Systemd spends a significant amount of time waiting for the NetworkManager-wait-online.service before it finally fails. From what I understand from 'systemd-analyze chart' something in the order of 30 seconds. Not fixed in NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64 Same exact problem here. I am also using NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64. However, NetworkManager shut off my wireless card when I restarted the service, and it won't come on again until next reboot. Bootup blocking is a different bug; but in that case your system probably has the 'remote-fs-pre' systemd target enabled and active, which is causing systemd to wait for NetworkManager to bring up a network connection (via NM-wait-online) before continuing, since remote filesystems obviously require network connectivity. This bug is (at least for now) about the problem activating the supplicant and thus getting wireless devices working correctly. Note that NM-wait-online was linked to remote-fs-pre.target as a fix for bug 787314 Can you attach your fstab? *** Bug 815888 has been marked as a duplicate of this bug. *** So for the moment I'm going to toss this to systemd. Here's why... dbus-daemon calls systemd to activate the supplicant, and apparently that's timing out. But I have no idea where the systemd logs are about that process; they're clearly not getting dumped to /var/log/messages. Apr 26 10:41:47 golem dbus-daemon[660]: dbus[660]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service' Apr 26 10:41:47 golem dbus[660]: [system] Activating via systemd: service name='fi.w1.wpa_supplicant1' unit='wpa_supplicant.service' Apr 26 10:42:12 golem dbus[660]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Apr 26 10:42:12 golem dbus-daemon[660]: dbus[660]: [system] Failed to activate service 'fi.w1.wpa_supplicant1': timed out Here dbus is clearly using systemd for the service activation, but we need more log messages from systemd about why the service failed to activate within 25 seconds. I assume systemd folks will know where to get those. I'm fairly certain that nothing in wpa_supplicant itself has changed within the past few weeks. Exactly the same here, I "repair" it with # systemctl restart NetworkManager.service till next reboot nominating release blocker : networking fail out of the box (could've sworn I'd marked it previously, but perhaps I'd only dreamed it). marking regression per comment #5, last (nm) build that didn't exhibit this was NetworkManager-0.9.4.0-1.git20120328.fc17 I have 3 bare metal installs of f17 (Gnome, KDE, and Xfce) and only two, KDE & Xfce, exhibit this behavior. The Gnome version starts wifi and connects just like it always has. All three are running the same version of NM: NetworkManager-0.9.4.0-7.git20120403.fc17.x86_64 I couldn't get it to happen with an F17 GNOME Shell install either... :( I suppose I'll try installing on a slightly faster machine and see if that helps reproduce it. FWIW: I saw symptoms that could seem similar when I experienced Bug 815331 (ipw2200 regression in reporting of properties cause problems when nm-applet started checking them in 0.9.4) - perhaps some of the comments here are influenced by that. (In reply to comment #23) > I couldn't get it to happen with an F17 GNOME Shell install either... :( I > suppose I'll try installing on a slightly faster machine and see if that helps > reproduce it. I'm running all 3 installs on a i7-2820QM, which is pretty darn fast, so I'm not sure if machine speed is related. Gnome version works, Xfce and KDE versions exhibit the bug. I have this problem on Thinkpad X40 with 1.2GB of RAM, PCMCIA wireless card, gnome in fallback mode. CPU scales between 600 and 1200 MHz. Investigation notes: /usr/share/dbus-1/system-services/fi.w1.wpa_supplicant1.service: [D-BUS Service] Name=fi.w1.wpa_supplicant1 Exec=@SBINDIR@/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf -B -u -f /var/log/wpa_supplicant.log -P /var/run/wpa_supplicant.pid User=root SystemdService=wpa_supplicant.service /lib/systemd/system/wpa_supplicant.service: [Unit] Description=WPA Supplicant daemon After=syslog.target network.target [Service] Type=forking EnvironmentFile=-/etc/sysconfig/wpa_supplicant ExecStart=/usr/sbin/wpa_supplicant -c /etc/wpa_supplicant/wpa_supplicant.conf $INTERFACES $DRIVERS -B -u $OTHER_ARGS [Install] WantedBy=multi-user.target 1) Unescaped @SBINDIR@ in the dbus service looks weird. 2) The ExecStart lines are slightly different, although via /etc/sysconfig/wpa_supplicant they end up being the same 3) /lib/systemd/system/wpa_supplicant.service has: After=network.target NM has: Before=network.target This seems potentially incompatible. 4) Changing wpa_supplicant.service such that (psuedo-diff): -Type=forking +Type=dbus +BusName=fi.w1.wpa_supplicant1 may or may not help. Fwiw, bill's workarounds didn't help me (as a matter of fact, after 4, NM wouldn't work at all until I put it back the way it was). I tried booting into both gnome and kde, and they both still fail equally for me until I manually do: systemctl restart NetworkManager.service I'm going to guess that everyone who's seeing this has nfs-lock.service enabled. (Check with systemctl is-enabled nfs-lock.service.) What we have: nfs-lock.service: Wants=remote-fs-pre.target Before=remote-fs-pre.target remote-fs-pre.target.wants has: NetworkManager-wait-online.service: After=NetworkManager.service Before=network.target wpa_supplicant.service has: After=syslog.target network.target nfs-lock.service enables the remote-fs-pre.target, which brings in NM-wait-online. NM-wait-online needs done before network.target is reached, so it waits 30 seconds for NM to connect before network.target is considered done. wpa_supplicant is After: network.target, so systemd waits for NM-wait-online to finish before it starts it. If NM-wait-online waits more than NM's specified timeout to wait for wpa_supplicant... it won't connect to it. I see two bugs here: - nfs-lock.service should not Want remote-fs-pre.target - that is automatically pulled in by network filesystems where necessary. - wpa_supplicant.service should not be After network.target. If anything, it should be Before... Fixing either of these would fix the issue where NM & wpa_supplicant can't talk to each other, but I think we should fix both. Leaving assigned to systemd for a short bit for confirmation that I've debugged this right. *** Bug 817635 has been marked as a duplicate of this bug. *** *** Bug 817605 has been marked as a duplicate of this bug. *** I can confirm that nfs-lock is enabled on my machine i confirm that nfs-lock.service is enabled on my affected system. Unfortunately, disabling nfs-lock.service didn't help wireless work, after reboot. I think it's NetworkManager and not systemd. I first downgraded Networkmanager and rebootet, but the error was still present. I then downgraded NetworkManager to 0.9.4.0-1.git20120328.fc17. On reboot, the problem did not occur again but wifi devices were present right from the start. I then upgraded systemd to 44-7.fc17 again and rebooted one more time. Still everything is fine. I had this problem until I removed NetworkManager.service from /etc/systemd/system/multi-user.target.wants System boots faster and wireless works OOTB (In reply to comment #35) > I think it's NetworkManager and not systemd. > > I first downgraded Networkmanager and rebootet, but the error was still > present. I then downgraded NetworkManager to 0.9.4.0-1.git20120328.fc17. On > reboot, the problem did not occur again but wifi devices were present right > from the start. I then upgraded systemd to 44-7.fc17 again and rebooted one > more time. Still everything is fine. NM can hack around the problem of wpa_supplicant startup by retrying the AddInterface() call a few times or something like that, but that's a hack and a workaround. I'd *also* like to figure out why the problem happens in the first place, since covering it up will simply cause problems later. It seems like Bill's poked around and figured that out. We should still be getting more information from systemd about why the service isn't starting in time though, which would have helped determine the cause earlier. Moving to wpa_supplicant since we'll fix the ordering issues there that are causing these problems. wpa_supplicant-1.0-0.4.fc17 has been submitted as an update for Fedora 17. https://admin.fedoraproject.org/updates/wpa_supplicant-1.0-0.4.fc17 Anyone hitting this bug, please install the wpa_supplicant update and restart, then check your /var/log/messages for the systemd dbus activation failure message for the supplicant during restart. Thanks! Seems to be fixed. Autoconnecting to wireless networks works fine :-) Update seems to have fixed it. wpa_supplicant-1.0-0.4.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report. |