Bug 1815487 - X11 session is broken ('Something has gone wrong') after a quick re-login from Wayland to X11
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gnome-session
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ray Strode [halfline]
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker
Depends On:
Blocks: F32FinalFreezeException
 
Reported: 2020-03-20 12:25 UTC by Kamil Páral
Modified: 2020-04-27 11:04 UTC (History)
17 users (show)

Fixed In Version: gnome-session-3.36.0-2.fc32
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-29 00:16:16 UTC
Type: Bug
Embargoed:


Attachments
system journal (350.12 KB, text/plain)
2020-03-20 12:26 UTC, Kamil Páral
rpm -qa (56.40 KB, text/plain)
2020-03-20 12:26 UTC, Kamil Páral


Links
System ID Private Priority Status Summary Last Updated
GNOME Gitlab GNOME/gnome-session - merge_requests 39 0 None None None 2020-03-21 15:07:06 UTC

Description Kamil Páral 2020-03-20 12:25:03 UTC
Description of problem:
In a VM with clean F32 Beta installation and all updates, I can't log in using X11 session, only Wayland. If I try X11 session, I see "Oh no, something has gone wrong" screen and a Log out button. I'm using a standard libvirt VM in virt-manager, using spice+virtio (no 3d acceleration).

See the attached journal log. The login attempt starts at:
Mar 20 13:10:48 f32 systemd-logind[713]: New session 5 of user kparal.

In the log, I see the following errors:
Mar 20 13:10:49 f32 /usr/libexec/gdm-x-session[1993]: (EE) open /dev/fb0: Permission denied
Mar 20 13:10:49 f32 /usr/libexec/gdm-x-session[1993]: (EE) modeset(0): glamor initialization failed
Mar 20 13:10:49 f32 gnome-session[2071]: gnome-session-check-accelerated: GL Helper exited with code 512
Mar 20 13:10:49 f32 gnome-session[2088]: libEGL warning: DRI2: failed to authenticate
Mar 20 13:10:49 f32 gnome-session[2071]: gnome-session-check-accelerated: GLES Helper exited with code 512
Mar 20 13:10:49 f32 systemd[880]: gnome-session-x11: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session.target: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session-pre.target: Requested dependency OnFailure=gnome-session-shutdown.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session-initialized.target: Requested dependency OnFailure=gnome-session-shutdown.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session-x11.target: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session: Requested dependency OnFailure=gnome-session-failed.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 systemd[880]: gnome-session-failed.target: Requested dependency OnFailure=gnome-session-shutdown.target ignored (target units cannot fail).
Mar 20 13:10:49 f32 gnome-session[2023]: gnome-session-binary[2023]: WARNING: Error creating FIFO: File exists
Mar 20 13:10:49 f32 gnome-session-binary[2023]: WARNING: Error creating FIFO: File exists
Mar 20 13:10:49 f32 gnome-session-c[2096]: Error creating FIFO: File exists
Mar 20 13:10:49 f32 systemd[880]: Starting GNOME Shell on X11...
Mar 20 13:10:49 f32 gnome-session[2098]: gnome-session-binary[2098]: GnomeDesktop-WARNING: Could not create transient scope for PID 2111: GDBus.Error:org.freedesktop.DBus.Error.UnixProcessIdUnknown: Process with ID 2111 does not exist.
Mar 20 13:10:51 f32 gsd-xsettings[2343]: Cannot open display:
Mar 20 13:10:51 f32 systemd[880]: gsd-xsettings.service: Main process exited, code=exited, status=1/FAILURE
Mar 20 13:10:51 f32 systemd[880]: gsd-xsettings.service: Failed with result 'exit-code'.
Mar 20 13:10:51 f32 systemd[880]: Failed to start GNOME XSettings.
Mar 20 13:10:51 f32 systemd[880]: Dependency failed for GNOME XSettings.
Mar 20 13:10:51 f32 systemd[880]: gsd-xsettings.target: Job gsd-xsettings.target/start failed with result 'dependency'.
Mar 20 13:10:51 f32 systemd[880]: gsd-xsettings.service: Triggering OnFailure= dependencies.
Mar 20 13:10:51 f32 systemd[880]: Started GNOME Session Failed lockdown screen (user).
Mar 20 13:10:51 f32 systemd[880]: Reached target GNOME Session Failed.
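
For anyone triaging a journal like the attached one, the failure chain can be pulled out with a small filter. This is only a sketch and not part of the original report: the function name `failure_lines`, the marker strings, and the regular expression are assumptions based on the excerpt above, and the time comparison assumes all lines come from the same day.

```python
import re

# Marker substrings picked from the excerpt above; this selection is an assumption.
MARKERS = ("Cannot open display", "Failed", "(EE)", "exited with code")

# Matches the syslog-style prefix "Mar 20 13:10:49 f32 unit[pid]: message".
LINE_RE = re.compile(r"^(\w{3} +\d+ [\d:]{8}) (\S+) ([^:]+?)(?:\[(\d+)\])?: (.*)$")


def failure_lines(journal_text, start="13:10:48"):
    """Return (time, unit, message) for suspicious lines at or after `start`."""
    hits = []
    for line in journal_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        stamp, _host, unit, _pid, message = match.groups()
        time_of_day = stamp.split()[-1]
        # HH:MM:SS strings compare correctly as text within a single day.
        if time_of_day >= start and any(marker in message for marker in MARKERS):
            hits.append((time_of_day, unit, message))
    return hits
```

Called on the text of the attached journal, `failure_lines(text)` would list the `(EE)` lines from gdm-x-session and the `Cannot open display:` line from gsd-xsettings, among others.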



Version-Release number of selected component (if applicable):
gnome-shell-3.36.0-3.fc32.x86_64
gnome-session-3.36.0-1.fc32.x86_64
gnome-settings-daemon-3.36.0-1.fc32.x86_64

How reproducible:
always

Steps to Reproduce:
1. install F32 Workstation Live Beta in a VM, boot it
2. select X11 when logging in
3. see "Oh no" screen

You can also try it on the LiveCD itself:
1. boot F32 Workstation Live Beta
2. passwd liveuser
3. log out
4. try to log in using X11

Comment 1 Kamil Páral 2020-03-20 12:26:02 UTC
Created attachment 1671884 [details]
system journal

Login attempt starts at 13:10:48.

Comment 2 Kamil Páral 2020-03-20 12:26:40 UTC
Created attachment 1671885 [details]
rpm -qa

Comment 3 Kamil Páral 2020-03-20 12:29:32 UTC
Proposing as F32 Final Blocker. This seems to affect only VMs (I run F32 X11 on my laptop just fine), but there still might be good reasons for running X11 session, even in a VM. This is also a major obstacle for QA work, because I currently have at least one bug that I need to test on X11, and I can't, because of this bug.

Comment 4 Michael Catanzaro 2020-03-21 14:00:20 UTC
I hope X11 is not still release blocking three years after we switched to Wayland by default....

That said, of course this looks bad.

P.S. It would be good to have a separate bug for the spam of errors coming from gnome-session, which look mostly unrelated and don't bring down the session.

Comment 5 Michael Catanzaro 2020-03-21 14:01:09 UTC
Um... is Rui still the right package maintainer for g-s-d? Haven't heard much from him since he left Red Hat.

Comment 6 Benjamin Berg 2020-03-21 14:14:26 UTC
I can think of one scenario in which this could happen. Could it be that you logged out of a wayland session and back in almost immediately into Xorg?

I think what is happening is that the GNOME_SETUP_DISPLAY environment variable is leaking between the sessions. It will only be set in a wayland session, and if it is still there when you log into X11 you would get exactly this error.

We can work around this by either making gnome-shell unset the variable explicitly if not needed, or by purging it in gnome-session at login time.
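
One way to check the leak theory is to look at the environment held by the systemd user manager right after logging out of the Wayland session. The sketch below is an illustration only: `leaked_variables` and `check_user_manager` are hypothetical helper names, and it assumes `systemctl --user show-environment` prints one `KEY=value` pair per line.

```python
import subprocess


def leaked_variables(show_environment_output, names=("GNOME_SETUP_DISPLAY",)):
    """Return the entries from `names` that appear in show-environment output."""
    present = {}
    for line in show_environment_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in names:
            present[key] = value
    return present


def check_user_manager():
    """Query the running systemd user manager (needs a systemd user session)."""
    out = subprocess.run(
        ["systemctl", "--user", "show-environment"],
        capture_output=True, text=True, check=True,
    ).stdout
    return leaked_variables(out)
```

If the theory holds, calling `check_user_manager()` from a text console shortly after a Wayland logout should return a non-empty dict, and an empty one once the user manager has exited and been restarted.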

Comment 7 Benjamin Berg 2020-03-21 14:15:58 UTC
Kamil, does the described scenario sound compatible with what you did?

The systemd instance will usually quit 10s after you log out, so that is the window you have for the environment variable disappearing by itself.

Comment 8 Benjamin Berg 2020-03-21 15:07:07 UTC
Opened an upstream pull request to work around this in gnome-session. As such, I am also moving the component here.

Comment 9 Lukas Ruzicka 2020-03-23 16:45:10 UTC
I have tried to install the WS Beta in a VM. It works both in Wayland and Xorg. However, I am running Fedora 31 as the host system. It might be related.

Comment 10 Geoffrey Marr 2020-03-23 19:34:44 UTC
Discussed during the 2020-03-23 blocker review meeting: [0]

This bug was classified as a "RejectedBlocker". So far it is believed to affect virtual environments only, and there is no situation in which the criteria require GNOME-on-X11 to work on our supported virtualization stack (Wayland is the default and should always work).

[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2020-03-23/f32-blocker-review.2020-03-23-16.00.txt

Comment 11 Fedora Update System 2020-03-23 21:11:44 UTC
FEDORA-2020-d798a0dae2 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-d798a0dae2

Comment 12 Ray Strode [halfline] 2020-03-23 21:14:57 UTC
i've put the patch alluded to in comment 8 in the above update.  Note this patch will only fix sessions that were successfully logged in, logged out, and then logged back in again.

It might be there is more than one bug, in which case we may have to do more work.

Comment 13 Fedora Update System 2020-03-24 01:52:44 UTC
FEDORA-2020-d798a0dae2 has been pushed to the Fedora 32 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-d798a0dae2`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-d798a0dae2

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 14 Kamil Páral 2020-03-24 12:13:08 UTC
(In reply to Benjamin Berg from comment #6)
> I can think of one scenario how this could happen. Could it be that you
> logged out of a wayland session and back in almost immediately into Xorg?

Yes and no. Some of my attempts were performed by logging out of wayland and immediately trying X11. But I also rebooted the machine cleanly and tried to log in to X11 on the first try (I believe the attached log is from this attempt) and it still failed. So the problem is not related to session leakage.

Comment 15 Kamil Páral 2020-03-24 12:41:56 UTC
> But I also rebooted the machine cleanly and tried to
> log in to X11 on the first try (I believe the attached log is from this
> attempt) and it still failed. So the problem is not related to session
> leakage.

Hum, I might have been wrong. I can definitely reproduce the issue when logging in to wayland first and then re-logging in to x11 (without waiting on the gdm screen), with both the old and new version. But I can't reproduce it when logging in to x11 right after boot. So your theory might be correct and I just misremembered. Either way, the problem is not gone with gnome-session-3.36.0-2.fc32.

The severity of this issue is considerably lower than what I initially thought, though. And it might affect bare metal as well.

Comment 16 Kamil Páral 2020-03-24 12:47:01 UTC
(In reply to Kamil Páral from comment #15)
> And it might affect bare metal as well.

Yes, this affects bare metal as well, if you re-log quickly. This is not VM specific.

Comment 17 Ray Strode [halfline] 2020-03-24 21:33:50 UTC
so i can confirm that calling systemctl --user unset-environment GNOME_SETUP_DISPLAY makes gsd-xsettings service start (with a manual systemctl --user start gsd-xsettings.service after tweaking the service file a bit).

It's not clear why the equivalent code isn't working in gnome-session. I'll need to debug with gdb probably.
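
The manual check described above can be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, it defaults to a dry run, and it does not cover the unspecified tweak to the service file mentioned above.

```python
import subprocess


def workaround_commands(variable="GNOME_SETUP_DISPLAY", unit="gsd-xsettings.service"):
    """Build the two commands from the comment above, in the order they were run."""
    return [
        ["systemctl", "--user", "unset-environment", variable],
        ["systemctl", "--user", "start", unit],
    ]


def apply_workaround(dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for cmd in workaround_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Running `apply_workaround()` only prints the commands; `apply_workaround(dry_run=False)` would execute them inside the affected user session.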

Comment 18 Benjamin Berg 2020-03-24 22:26:57 UTC
Uh, I am quite confused to be honest. Everything looks correct to me (I checked that the patch was applied in the package) and if the code wasn't working at all, then we should still have some other bugs unresolved …

Maybe one could run a dbus-monitor alongside the session startup (i.e. add "dbus-monitor --session >/tmp/log &" to /usr/bin/gnome-shell)?

Comment 19 Kamil Páral 2020-03-25 14:15:27 UTC
I have noticed that the Power Off/Log Out buttons are subject to the same delay. Every time you see the gdm screen or when you log in, there is a ~10 second timeout during which the Power Off/Log Out buttons don't do anything; clicking on them is just ignored. If you log out and wait until the buttons start to show the relevant dialogs and only then log in, the problem doesn't occur. If you log in before those buttons start working (<~10s), the problem occurs (but only when going from wayland to x11; when going from x11 to wayland, you can do that immediately).

I'll try the dbus-monitor stuff.

Comment 20 Kamil Páral 2020-03-25 14:19:06 UTC
(In reply to Benjamin Berg from comment #18)
> Maybe one could run a dbus-monitor alongside the session startup (i.e. add
> "dbus-monitor --session >/tmp/log &" to /usr/bin/gnome-shell)?

Sorry, which file should I adjust and how? /usr/bin/gnome-shell is a binary.

Comment 21 Benjamin Berg 2020-03-25 14:33:33 UTC
> Sorry, which file should I adjust and how? /usr/bin/gnome-shell is a binary.

Sorry … that was supposed to be /usr/bin/gnome-session

It'll be a *huge* file after a short period of time; what we are looking for is very early in the file. i.e. search for UnsetAndSetEnvironment
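
Since the log grows quickly, it may be easier to stream it and stop at the first hit instead of opening it in an editor. A minimal sketch, assuming the log was written to /tmp/log as suggested in comment 18; the function name and the context size are arbitrary choices.

```python
from collections import deque
from itertools import islice


def first_match_with_context(lines, needle="UnsetAndSetEnvironment", context=5):
    """Return (line_number, window) for the first line containing `needle`, or None.

    `lines` can be any iterable, including an open file, so a huge log is
    streamed rather than loaded into memory.
    """
    before = deque(maxlen=context)  # rolling buffer of the preceding lines
    it = iter(lines)
    for number, line in enumerate(it, start=1):
        if needle in line:
            after = list(islice(it, context))  # the lines following the match
            return number, list(before) + [line] + after
        before.append(line)
    return None
```

Usage would look like `with open("/tmp/log") as f: print(first_match_with_context(f))`.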

Comment 22 Ray Strode [halfline] 2020-03-25 20:30:02 UTC
so I think I figured out the problem.

We restart the user bus daemon on logout to make sure all session clients attached to the bus are kicked.  This means in a typical scenario where there is only one session running, the bus daemon very briefly outlives the rest of the session.  If the user logs in again during this window, the bus daemon gets reused, and it has a stale environment.  It also has no way to clear that stale environment at start up.  The UpdateActivationEnvironment bus method doesn't have a way to unset variables.

We could fix this by unsetting the environment variables in the systemd manager in gnome-session-ctl --shutdown or so (in addition to the place we do it at startup), so the restarted bus would get the clean environment from systemd.  But, I actually think it's better to do it from the shell side.  I mean the shell is what's setting the environment variables, so it should be what's unsetting them after the shell finishes.
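
The stale-environment effect described above can be shown with a toy model. This is not real D-Bus code: the class, the function, and the variable values are placeholders made up for illustration, and the only property carried over from the explanation is that the update operation can set variables but not unset them.

```python
class BusActivationEnvironment:
    """Toy stand-in for a bus daemon's activation environment (not real D-Bus)."""

    def __init__(self):
        self.env = {}

    def update(self, variables):
        """Set-only update: entries can be added or overwritten, never removed."""
        self.env.update(variables)


def relogin(bus_survived):
    """Simulate a Wayland login, a logout, and then an X11 login."""
    # Wayland session: the shell exports GNOME_SETUP_DISPLAY (placeholder value).
    bus = BusActivationEnvironment()
    bus.update({"XDG_SESSION_TYPE": "wayland", "GNOME_SETUP_DISPLAY": ":placeholder"})

    # Logout: if the old bus daemon has already exited, the next login gets a fresh one.
    if not bus_survived:
        bus = BusActivationEnvironment()

    # X11 session start-up can only add or overwrite variables.
    bus.update({"XDG_SESSION_TYPE": "x11", "DISPLAY": ":0"})
    return bus.env
```

With `relogin(bus_survived=True)` the returned environment still contains `GNOME_SETUP_DISPLAY`, while `relogin(bus_survived=False)` does not. That matches the observation that the problem only shows up on a quick re-login.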

Comment 23 Ray Strode [halfline] 2020-03-25 20:39:37 UTC
filed merge request upstream here:

https://gitlab.gnome.org/GNOME/gnome-shell/-/merge_requests/1129

Comment 24 Fedora Update System 2020-03-25 21:14:26 UTC
FEDORA-2020-d798a0dae2 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 25 Fedora Update System 2020-03-25 21:40:34 UTC
FEDORA-2020-d615facab7 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-d615facab7

Comment 26 Fedora Update System 2020-03-26 08:18:02 UTC
FEDORA-2020-d615facab7 has been pushed to the Fedora 32 testing repository.
In a short time you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2020-d615facab7`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-d615facab7

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 27 Benjamin Berg 2020-03-26 09:47:43 UTC
Sure, I am aware that the variable survives on the DBus side. But g-s-d is not started through DBus activation, so I am still confused as to how it might be picking up that variable.

That said, I do like the idea of just unsetting those from ExecStopPost in the systemd unit. And if it works, whatever. :)

Comment 28 Kamil Páral 2020-03-26 10:21:46 UTC
(In reply to Fedora Update System from comment #26)
> https://bodhi.fedoraproject.org/updates/FEDORA-2020-d615facab7

Thanks Ray. That indeed fixes this problem, I can now relogin fast wayland->x11 and it doesn't crash.

Asking for a final freeze exception just to be sure, even though it's not needed yet.

Comment 29 Ray Strode [halfline] 2020-03-26 13:51:48 UTC
(In reply to Benjamin Berg from comment #27)
> Sure, I am aware that the variable survives on the DBus side. But g-s-d is
> not started through DBus activation, so I am still confused as to how it
> might be picking up that variable.
Well, I didn't investigate that side of things, just noticed that if I ran
ps -ef after logout that the bus daemon was running.

That said, I wouldn't be surprised if one of the g-s-d daemons tanks if it
can't talk to something that is dbus activated (like dconf or whatever)

Comment 30 Fedora Update System 2020-03-29 00:16:16 UTC
FEDORA-2020-d615facab7 has been pushed to the Fedora 32 stable repository.
If the problem still persists, please make note of it in this bug report.

