Bug 1662857
Summary: | lightdm does mlockall() which is not compatible with systemd-240 | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Wolfgang Ulbrich <fedora>
Component: | lightdm | Assignee: | Alternative GTK desktop environments <alt-gtk-de-sig>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | urgent | Docs Contact: |
Priority: | unspecified | |
Version: | 31 | CC: | alt-gtk-de-sig, christoph.wickert, fedora, leigh123linux, lnykryn, msekleta, ppywlkiqletw, prd-fedora, rdieter, s, systemd-maint, watanabe.yu, zbyszek
Target Milestone: | --- | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | systemd-240-3.gitf02b547.fc30, lightdm-1.28.0-5.fc30 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2020-06-01 01:24:00 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Wolfgang Ulbrich
2019-01-02 08:43:00 UTC
Created attachment 1517878
stacktrace lightdm-gtk

Thanks Wolfgang. I think someone else suggested that. See bug 1662080. I'm going to try to downgrade and see what happens.

(In reply to Paul DeStefano from comment #2)
> Thanks Wolfgang. I think someone else suggested that. See bug 1662080.
>
> I'm going to try to downgrade and see what happens.

Downgrading to systemd-239-10.git3bf819c.fc30 makes the problem go away.

This change will most likely be reverted upstream.

Upstream fix is merged:
https://github.com/systemd/systemd/pull/11327
Can this be added as a patch to the current 240 build, please? I am happy to test an update.

Hmm,
I added the commit to the latest systemd build but it doesn't fix the problem.
Slick-greeter doesn't start.
Maybe someone else can test this scratch build?
https://koji.fedoraproject.org/koji/taskinfo?taskID=31879037

(In reply to Wolfgang Ulbrich from comment #6)
> Hmm,
> I added the commit to the latest systemd build but it doesn't fix the problem.
> Slick-greeter doesn't start.
> Maybe someone else can test this scratch build?
> https://koji.fedoraproject.org/koji/taskinfo?taskID=31879037

However, it might solve a different problem: look at /etc/X11/xinit/xinitrc.d/00-start-message-bus.sh, which will start another dbus daemon, and inside xfce4-session there is code to start another dbus daemon when DBUS_SESSION_BUS_ADDRESS is not set.

lxdm still works and can start xfce.

I'm going to test lightdm-gtk.

(In reply to Villy Kruse from comment #7)
> (In reply to Wolfgang Ulbrich from comment #6)
> > Hmm,
> > I added the commit to the latest systemd build but it doesn't fix the problem.
> > Slick-greeter doesn't start.
> > Maybe someone else can test this scratch build?
> > https://koji.fedoraproject.org/koji/taskinfo?taskID=31879037
>
> However, it might solve a different problem: look at
> /etc/X11/xinit/xinitrc.d/00-start-message-bus.sh, which will start another
> dbus daemon, and inside xfce4-session there is code to start another dbus
> daemon when DBUS_SESSION_BUS_ADDRESS is not set.
>
> lxdm still works and can start xfce.
>
> I'm going to test lightdm-gtk.

lightdm-gtk greeter is still broken.

An strace of the lightdm greeter shows many mmap syscalls failing with errno = EAGAIN. With systemd version 239 that was not the case.

I believe the cause is this: lightdm-gtk-greeter calls mlockall (MCL_CURRENT | MCL_FUTURE).

With systemd version 239, the ulimit for RLIMIT_MEMLOCK was set to 16 MiB, and therefore the mlockall call would fail. This was lucky, because with nothing locked the subsequent mmap calls would not fail.

With systemd version 240, RLIMIT_MEMLOCK is now set to 64 MiB and the mlockall call no longer fails. However, MCL_FUTURE then requires every later mapping to be locked as well, and the greeter cannot mmap all the memory it needs because that would still exceed the MEMLOCK limit.

Workaround: create a wrapper for lightdm-gtk-greeter that sets ulimit -l to 16384, and lightdm works again.

    [root@ext2 sbin]# more lightdm-gtk-greeter-wrapper
    #!/bin/sh
    ulimit -l 16384
    exec strace -o /var/tmp/greeter-trace -ff /usr/sbin/lightdm-gtk-greeter "$@"

Well, this is a workaround for a single local installation.
But how should we fix our desktop spins?
Sadly, Mr. Lennart Poettering closed the upstream report and ignores posts from me and others.
It looks like he doesn't want to help us.
Can we expect help from the Fedora systemd maintainers?

Thanks for rebuilding systemd.
https://koji.fedoraproject.org/koji/buildinfo?buildID=1178844
I just tested the latest build with slick-greeter and lightdm-gtk, but no luck.
Neither greeter starts.
Please re-open.

(In reply to Wolfgang Ulbrich from comment #10)
> Well, this is a workaround for a single local installation.
> But how should we fix our desktop spins?
> Sadly, Mr. Lennart Poettering closed the upstream report and ignores posts from
> me and others.
> It looks like he doesn't want to help us.
> Can we expect help from the Fedora systemd maintainers?

The workaround just shows that the upstream issue https://github.com/systemd/systemd/issues/11293 has nothing to do with the lightdm greeters failing.

What you could do is make the maintainer of lightdm add the following to the lightdm service file:

LimitMEMLOCK=4G

Then argue that this issue is a blocking issue, as the Fedora spins live DVD might not work.

(In reply to Villy Kruse from comment #12)
> (In reply to Wolfgang Ulbrich from comment #10)
> > Well, this is a workaround for a single local installation.
> > But how should we fix our desktop spins?
> > Sadly, Mr. Lennart Poettering closed the upstream report and ignores posts from
> > me and others.
> > It looks like he doesn't want to help us.
> > Can we expect help from the Fedora systemd maintainers?
>
> The workaround just shows that the upstream issue
> https://github.com/systemd/systemd/issues/11293 has nothing to do with the
> lightdm greeters failing.
>
> What you could do is make the maintainer of lightdm add the following to the
> lightdm service file:
>
> LimitMEMLOCK=4G
>
> Then argue that this issue is a blocking issue, as the Fedora spins live DVD
> might not work.

xfce-spin for arm is release blocking ;)
The fact is that something changed in systemd-240 and broke lightdm-gtk and slick-greeter, because with systemd-239-10.git3bf819c.fc30 both greeters start fine.

Ok thanks, adding LimitMEMLOCK=4G to the lightdm service file helps a lot.
But isn't this a proper fix?

I have rebuilt lightdm with `LimitMEMLOCK=16777216` for f30, which works fine.
https://koji.fedoraproject.org/koji/taskinfo?taskID=32015519
... and thank you for the fish.

Just kill the mlockall() call. E.g. set LimitMEMLOCK=4k for now in the unit file as a workaround, and work with upstream to replace the blanket mlockall() with a mlock() just on the password string.

LimitMEMLOCK=16777216 is not a suitable long-term fix. mlockall() is generally a bad idea and certainly has no place in a graphical program. A program like this uses lots of memory and it is crucial that this memory can be paged out to relieve memory pressure.

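
To illustrate the suggestion of locking only the password instead of the whole address space, here is a minimal sketch in C. It is not lightdm code: `secret_buf`, `secret_wipe`, `secret_buf_init` and `secret_buf_destroy` are hypothetical names introduced for this example. The idea is to keep the secret in its own page-aligned allocation, `mlock()` just those pages, and wipe them before release. Copies of the password held elsewhere (for example by a toolkit entry widget or by PAM) are outside the scope of this sketch.

```c
#define _POSIX_C_SOURCE 200809L /* for posix_memalign() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical container for one secret; not a lightdm type. */
typedef struct {
    unsigned char *data;   /* page-aligned buffer */
    size_t         size;   /* allocation size, a multiple of the page size */
    int            locked; /* nonzero if mlock() succeeded */
} secret_buf;

/* Overwrite through a volatile pointer so the stores are not optimized away. */
static void secret_wipe(void *p, size_t n)
{
    volatile unsigned char *v = p;
    while (n--)
        *v++ = 0;
}

/* Allocate a zeroed, page-aligned buffer of at least min_size bytes and try
 * to pin only that buffer in RAM. Returns 0 on success, -1 on failure. */
static int secret_buf_init(secret_buf *sb, size_t min_size)
{
    assert(sb != NULL && min_size > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    size_t psz = (size_t)page;
    size_t size = (min_size + psz - 1) / psz * psz; /* round up to whole pages */
    void *p = NULL;
    if (posix_memalign(&p, psz, size) != 0)
        return -1;

    sb->data = p;
    sb->size = size;
    secret_wipe(sb->data, sb->size);

    /* A page or two fits under even a small RLIMIT_MEMLOCK; if locking
     * still fails, carry on unlocked and record that fact. */
    sb->locked = (mlock(sb->data, sb->size) == 0);
    return 0;
}

/* Wipe, unlock and free the buffer. */
static void secret_buf_destroy(secret_buf *sb)
{
    if (sb == NULL || sb->data == NULL)
        return;
    secret_wipe(sb->data, sb->size);
    if (sb->locked)
        munlock(sb->data, sb->size);
    free(sb->data);
    sb->data = NULL;
    sb->size = 0;
    sb->locked = 0;
}
```

Because the buffer occupies whole pages of its own, locking and unlocking it does not affect unrelated data, and the rest of the process stays pageable under memory pressure.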
Strangely, upstream has "Selectively lock memory rather than calling mlockall for main daemon" in NEWS from a few years ago. It seems it was announced but not implemented or reverted or ?.

Let's talk about snake oil security.

If a program does mlockall(), and hibernation is requested, the machine hibernates. mlockall() does not prevent hibernation.

Now let's imagine we know a password is stored in memory (swappable or not) and we would like to extract it. The easiest way to do this is to hibernate the machine, and then either boot using a rescue media, or even using the original system in rescue mode. Then we do "forensics" on the swap partition (call 'strings' + 'grep', nothing fancy), and after we have extracted the password, reset and resume the previous system. Unless the user looks at the logs, they won't even know that the whole procedure happened. The alternative option of trying to extract the password from the running system is much harder, and entirely pointless if we can hibernate with unencrypted swap.

So what is the effect of mlockall()? It doesn't hinder password extraction, it mostly prevents the machine from making effective use of RAM and swap. That's why I titled the bug like I did. Feel free to shoot the messenger if this helps, but please fix this.

(In reply to Zbigniew Jędrzejewski-Szmek from comment #18)
> Feel free to shoot the messenger

I promise not to shoot if you can report the issue upstream.
https://github.com/CanonicalLtd/lightdm/issues
Your understanding of the problem would help in the report.

@Zbigniew Jędrzejewski-Szmek
Thank you.

(In reply to Zbigniew Jędrzejewski-Szmek from comment #17)
> Strangely, upstream has "Selectively lock memory rather than calling
> mlockall for main daemon" in NEWS from a few years ago. It seems it was
> announced but not implemented or reverted or ?.

If I run lightdm with strace I can see that only the greeter calls mlockall; lightdm itself doesn't.

Normally, the greeter will not stay around for very long after accepting the login name and password. However, lightdm itself needs at some point to get the login name and password from the greeter in order to be able to call the pam modules. Whether the password stays in the process memory of lightdm, I can't tell. Ideally, the password should be wiped out as soon as it is not needed any more.

From src/session-child.c in lightdm:

    int
    session_child_run (int argc, char **argv)
    {
    #if !defined(GLIB_VERSION_2_36)
        g_type_init ();
    #endif

        if (config_get_boolean (config_get_instance (), "LightDM", "lock-memory"))
        {
            /* Protect memory from being paged to disk, as we deal with passwords */
            mlockall (MCL_CURRENT | MCL_FUTURE);
        }

This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

FEDORA-2020-280b5c3c27 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-280b5c3c27

FEDORA-2020-6ef2ebf4a6 has been pushed to the Fedora 32 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-6ef2ebf4a6`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-6ef2ebf4a6
See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2020-280b5c3c27 has been pushed to the Fedora 31 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-280b5c3c27`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-280b5c3c27
See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2020-6ef2ebf4a6 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make note of it in this bug report.

FEDORA-2020-280b5c3c27 has been pushed to the Fedora 31 stable repository.
If the problem still persists, please make note of it in this bug report.