Bug 1662857

Summary: lightdm does mlockall() which is not compatible with systemd-240
Product: [Fedora] Fedora
Component: lightdm
Version: 31
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Keywords: Reopened
Reporter: Wolfgang Ulbrich <fedora>
Assignee: Alternative GTK desktop environments <alt-gtk-de-sig>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: alt-gtk-de-sig, christoph.wickert, fedora, leigh123linux, lnykryn, msekleta, ppywlkiqletw, prd-fedora, rdieter, s, systemd-maint, watanabe.yu, zbyszek
Fixed In Version: systemd-240-3.gitf02b547.fc30, lightdm-1.28.0-5.fc30
Last Closed: 2020-06-01 01:24:00 UTC
Type: Bug

Attachments (description, flags):
stackstrace slick-greeter, none
stacktrace lightdm-gtk, none

Description Wolfgang Ulbrich 2019-01-02 08:43:00 UTC
Created attachment 1517877
stackstrace slick-greeter

Description of problem:
The slick-greeter and lightdm-gtk greeters no longer start after the update to systemd-240 in Fedora Rawhide.

Version-Release number of selected component (if applicable):
systemd-240-2.fc30

How reproducible and Steps to Reproduce:
1. Install a MATE, Cinnamon, or other F30 live CD that ships a systemd older than v240 and uses slick-greeter or lightdm-gtk (probably the Xfce live CD as well).
2. Update everything except systemd.
3. Reboot: everything is fine and the DM starts.
4. Update systemd to 240 and reboot.
5. Boom: slick-greeter or lightdm-gtk no longer starts.


Actual results:
Several desktop live CDs are broken because the desktop can no longer be reached after updating to systemd-240.

Expected results:
Booting into a graphical DE should work.

Additional info:
Upstream bug report:
https://github.com/systemd/systemd/issues/11293

Please fix before we branch f30.
Currently, we can't test several DE spins any more.

Comment 1 Wolfgang Ulbrich 2019-01-02 08:44:42 UTC
Created attachment 1517878
stacktrace lightdm-gtk

Comment 2 Paul DeStefano 2019-01-02 17:43:57 UTC
Thanks Wolfgang.  I think someone else suggested that.  See bug 1662080.

I'm going to try to downgrade and see what happens.

Comment 3 Villy Kruse 2019-01-04 12:43:15 UTC
(In reply to Paul DeStefano from comment #2)
> Thanks Wolfgang.  I think someone else suggested that.  See bug 1662080.
> 
> I'm going to try to downgrade and see what happens.

Downgrading to systemd-239-10.git3bf819c.fc30 makes the problem go away.

Comment 4 Zbigniew Jędrzejewski-Szmek 2019-01-04 13:25:15 UTC
This change will most likely be reverted upstream.

Comment 5 Wolfgang Ulbrich 2019-01-07 16:51:13 UTC
The upstream fix is merged: https://github.com/systemd/systemd/pull/11327
Can this be added as a patch to the current 240 build, please?
I am happy to test an update.

Comment 6 Wolfgang Ulbrich 2019-01-07 17:18:46 UTC
Hmm,
I added the commit to the latest systemd build, but it doesn't fix the problem.
Slick-greeter doesn't start.
Could someone else test this scratch build?
https://koji.fedoraproject.org/koji/taskinfo?taskID=31879037

Comment 7 Villy Kruse 2019-01-07 19:10:07 UTC
(In reply to Wolfgang Ulbrich from comment #6)

However, it might solve a different problem: look at /etc/X11/xinit/xinitrc.d/00-start-message-bus.sh, which starts another D-Bus daemon; xfce4-session also contains code that starts another D-Bus daemon when DBUS_SESSION_BUS_ADDRESS is not set. lxdm still works and can start Xfce.

I'm going to test lightdm-gtk.

Comment 8 Villy Kruse 2019-01-07 19:58:22 UTC
(In reply to Villy Kruse from comment #7)

The lightdm-gtk greeter is still broken.

An strace of the lightdm greeter shows many mmap syscalls failing with errno = EAGAIN. With systemd version 239 that was not the case.

Comment 9 Villy Kruse 2019-01-08 07:02:09 UTC
I believe the cause is this:

In lightdm-gtk-greeter there is a call to mlockall(MCL_CURRENT | MCL_FUTURE).

With systemd version 239 the RLIMIT_MEMLOCK limit was set to 16 MiB, so the mlockall call failed. That was lucky, because the subsequent mmap calls then did not fail.

With systemd version 240 RLIMIT_MEMLOCK is set to 64 MiB and the mlockall call no longer fails. However, it is not possible to mmap all the memory the greeter needs, because that would still exceed the MEMLOCK limit.
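
The sequence described above can be tried outside the greeter. What follows is only a minimal sketch, not code from lightdm or systemd: probe_memlock() and the enum names are invented for illustration. It forks a child, caps the soft RLIMIT_MEMLOCK at a given value, calls mlockall(MCL_CURRENT | MCL_FUTURE) and then attempts one anonymous mmap. Processes with CAP_IPC_LOCK (root, for example) are not subject to the limit, so the result depends on who runs it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Outcomes reported by probe_memlock(); the names are illustrative only. */
enum probe_result {
    PROBE_ALL_OK = 0,       /* mlockall() and the later mmap() both succeeded */
    PROBE_MLOCKALL_FAILED,  /* mlockall() failed, so no locking is in effect */
    PROBE_MMAP_FAILED,      /* locking is in effect and the later mmap() failed */
    PROBE_ERROR             /* fork(), getrlimit(), setrlimit() or waitpid() failed */
};

/*
 * Run the greeter's sequence in a child so the caller's own limits and
 * mappings stay untouched. The requested limit is clamped to the hard
 * limit, because an unprivileged process cannot raise it beyond that.
 */
enum probe_result probe_memlock(rlim_t limit_bytes, size_t map_bytes)
{
    assert(map_bytes > 0);

    pid_t pid = fork();
    if (pid < 0)
        return PROBE_ERROR;

    if (pid == 0) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
            _exit(PROBE_ERROR);
        rl.rlim_cur = limit_bytes < rl.rlim_max ? limit_bytes : rl.rlim_max;
        if (setrlimit(RLIMIT_MEMLOCK, &rl) != 0)
            _exit(PROBE_ERROR);

        /* Same flags as the greeter; here the return value is checked. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            _exit(PROBE_MLOCKALL_FAILED);

        /* With MCL_FUTURE active, new mappings count against the limit. */
        void *p = mmap(NULL, map_bytes, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        _exit(p == MAP_FAILED ? PROBE_MMAP_FAILED : PROBE_ALL_OK);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return PROBE_ERROR;
    return (enum probe_result)WEXITSTATUS(status);
}
```

Called unprivileged as probe_memlock(4096, 1 << 20), one would expect PROBE_MLOCKALL_FAILED, which corresponds to the systemd 239 situation: locking never takes effect, so later mappings are unaffected. With a limit that just covers the process and a larger map_bytes, one would expect PROBE_MMAP_FAILED, which corresponds to the EAGAIN failures seen in the strace from comment 8.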

Workaround:

Create a wrapper for lightdm-gtk-greeter that sets ulimit -l to 16384 (KiB, i.e. 16 MiB), and lightdm works again.


 [root@ext2 sbin]# more lightdm-gtk-greeter-wrapper
#!/bin/sh

ulimit  -l 16384

exec strace -o /var/tmp/greeter-trace -ff /usr/sbin/lightdm-gtk-greeter "$@"

Comment 10 Wolfgang Ulbrich 2019-01-12 09:52:23 UTC
Well, this is a workaround for a single local installation.
But how should we fix our desktop spins?
Sadly, Mr. Lennart Poettering closed the upstream report and ignores posts from me and others.
It looks like he doesn't want to help us.
Can we expect help from the Fedora systemd maintainers?

Comment 11 Wolfgang Ulbrich 2019-01-12 12:27:03 UTC
Thanks for rebuilding systemd.
https://koji.fedoraproject.org/koji/buildinfo?buildID=1178844
I just tested the latest build with slick-greeter and lightdm-gtk, but no luck.
Neither greeter starts.
Please reopen.

Comment 12 Villy Kruse 2019-01-12 12:43:57 UTC
(In reply to Wolfgang Ulbrich from comment #10)

The workaround just shows that the upstream issue https://github.com/systemd/systemd/issues/11293 has nothing to do with the lightdm greeters failing.

What you could do is ask the maintainer of lightdm to add the following to the lightdm service file:

LimitMEMLOCK=4G

Then argue that this is a blocking issue, as the Fedora spins' live DVDs might not work.

Comment 13 Wolfgang Ulbrich 2019-01-12 13:07:31 UTC
(In reply to Villy Kruse from comment #12)

The Xfce spin for ARM is release-blocking ;)

The fact is that something changed in systemd-240 and broke lightdm-gtk and slick-greeter,
because with systemd-239-10.git3bf819c.fc30 both greeters start fine.

Comment 14 Wolfgang Ulbrich 2019-01-12 13:18:19 UTC
OK, thanks,
adding LimitMEMLOCK=4G to the lightdm service file helps a lot.
But isn't this a proper fix?

Comment 15 Wolfgang Ulbrich 2019-01-14 09:45:44 UTC
I have rebuilt lightdm with `LimitMEMLOCK=16777216` for F30, which works fine.
https://koji.fedoraproject.org/koji/taskinfo?taskID=32015519
... and thank you for the fish.

Comment 16 Zbigniew Jędrzejewski-Szmek 2019-01-15 09:00:00 UTC
Just kill the mlockall() call.  E.g. set LimitMEMLOCK=4k for now in the unit file as a work-around, and work with upstream to replace the blanket mlockall() with a mlock() just on the password string. LimitMEMLOCK=16777216 is not a suitable long-term fix.

mlockall() is generally a bad idea and certainly has no place in a graphical program. A program like this uses lots of memory and it is crucial that this memory can be paged out to relieve memory pressure.
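
A rough sketch of the "mlock() just on the password string" approach suggested above. This is not lightdm code: struct secret_buf and the secret_buf_* functions are hypothetical names chosen for illustration. The buffer is a private anonymous mapping rounded up to whole pages, so locking it does not pin unrelated heap data. If mlock() fails, for example because of RLIMIT_MEMLOCK, the buffer stays usable and the failure is recorded in the locked field so the caller can decide what to do.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical holder for a single secret; not a lightdm type. */
struct secret_buf {
    char  *data;    /* page-aligned mapping, or NULL if allocation failed */
    size_t size;    /* mapping size in bytes, a multiple of the page size */
    int    locked;  /* 1 if mlock() succeeded, 0 otherwise */
};

/* Allocate at least len bytes in their own pages and try to lock them. */
struct secret_buf secret_buf_new(size_t len)
{
    struct secret_buf sb = { NULL, 0, 0 };
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return sb;

    /* Round up to whole pages (sketch: no overflow check for huge len). */
    size_t size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return sb;

    sb.data = p;
    sb.size = size;
    sb.locked = (mlock(p, size) == 0);  /* only these pages are locked */
    return sb;
}

/* Overwrite the secret; volatile keeps the stores from being optimized out. */
void secret_buf_wipe(struct secret_buf *sb)
{
    assert(sb != NULL);
    volatile char *p = sb->data;
    for (size_t i = 0; p != NULL && i < sb->size; i++)
        p[i] = 0;
}

/* Wipe, then unmap; unmapping the pages also releases the lock. */
void secret_buf_free(struct secret_buf *sb)
{
    assert(sb != NULL);
    if (sb->data == NULL)
        return;
    secret_buf_wipe(sb);
    munmap(sb->data, sb->size);
    sb->data = NULL;
    sb->size = 0;
    sb->locked = 0;
}
```

A caller would create the buffer with secret_buf_new(), read the password into data, hand it to the authentication code, and call secret_buf_free() as soon as it is no longer needed. The rest of the process memory stays pageable.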

Comment 17 Zbigniew Jędrzejewski-Szmek 2019-01-15 09:02:27 UTC
Strangely, upstream has "Selectively lock memory rather than calling mlockall for main daemon" in NEWS from a few years ago. It seems it was announced but not implemented or reverted or ?.

Comment 18 Zbigniew Jędrzejewski-Szmek 2019-01-15 09:49:40 UTC
Let's talk about snake oil security.

If a program does mlockall(), and hibernation is requested, the machine hibernates. mlockall() does not prevent hibernation. Now let's imagine we know a password is stored in memory (swappable or not) and we would like to extract it. The easiest way to do this is to hibernate the machine, and then either boot using a rescue media, or even using the original system in rescue mode. Then we do "forensics" on the swap partition (call 'strings' + 'grep', nothing fancy), and after we have extracted the password, reset and resume the previous system. Unless the user looks at the logs, they won't even know that the whole procedure happened. The alternative option of trying to extract the password from the running system is much harder, and entirely pointless if we can hibernate with unencrypted swap.

So what is the effect of mlockall()? It doesn't hinder password extraction, it mostly prevents the machine from making effective use of RAM and swap. That's why I titled the bug like I did. Feel free to shoot the messenger if this helps, but please fix this.

Comment 19 leigh scott 2019-01-15 10:05:09 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #18)

> Feel free to shoot the messenger


I promise not to shoot if you can report the issue upstream.


https://github.com/CanonicalLtd/lightdm/issues


Your understanding of the problem would help in the report.

Comment 20 leigh scott 2019-01-15 10:52:24 UTC
@Zbigniew Jędrzejewski-Szmek


Thank you.

Comment 21 Villy Kruse 2019-01-15 14:26:25 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #17)
> Strangely, upstream has "Selectively lock memory rather than calling
> mlockall for main daemon" in NEWS from a few years ago. It seems it was
> announced but not implemented or reverted or ?.

If I run lightdm with strace I can see that only the greeter calls mlockall;  lightdm itself doesn't.  Normally, the greeter will not stay around for very long after accepting the login name and password.

However, lightdm itself at some point needs to get the login name and password from the greeter in order to call the PAM modules. Whether the password stays in lightdm's process memory, I can't tell. Ideally, the password should be wiped as soon as it is no longer needed.

From src/session-child.c in lightdm

int
session_child_run (int argc, char **argv)
{
#if !defined(GLIB_VERSION_2_36)
    g_type_init ();
#endif
  
    if (config_get_boolean (config_get_instance (), "LightDM", "lock-memory"))
    {
        /* Protect memory from being paged to disk, as we deal with passwords */
        mlockall (MCL_CURRENT | MCL_FUTURE);
    }

Comment 22 Ben Cotton 2019-08-13 17:02:16 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 23 Ben Cotton 2019-08-13 19:18:01 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to 31.

Comment 24 Fedora Update System 2020-05-23 14:10:16 UTC
FEDORA-2020-280b5c3c27 has been submitted as an update to Fedora 31. https://bodhi.fedoraproject.org/updates/FEDORA-2020-280b5c3c27

Comment 25 Fedora Update System 2020-05-24 04:49:12 UTC
FEDORA-2020-6ef2ebf4a6 has been pushed to the Fedora 32 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-6ef2ebf4a6`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-6ef2ebf4a6

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 26 Fedora Update System 2020-05-24 05:02:35 UTC
FEDORA-2020-280b5c3c27 has been pushed to the Fedora 31 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-280b5c3c27`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-280b5c3c27

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 27 Fedora Update System 2020-06-01 01:24:00 UTC
FEDORA-2020-6ef2ebf4a6 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 28 Fedora Update System 2020-06-01 03:36:52 UTC
FEDORA-2020-280b5c3c27 has been pushed to the Fedora 31 stable repository.
If the problem still persists, please make note of it in this bug report.