Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1804993 - abrt-dump-journal-xorg: Cannot read journal data
Summary: abrt-dump-journal-xorg: Cannot read journal data
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: abrt
Version: 32
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: ekulik
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: F32FinalBlocker
TreeView+ depends on / blocked
 
Reported: 2020-02-20 03:35 UTC by Chris Murphy
Modified: 2020-03-06 07:19 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-28 07:17:27 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
journal (372.78 KB, text/plain)
2020-02-20 03:42 UTC, Chris Murphy
no flags Details

Description Chris Murphy 2020-02-20 03:35:57 UTC
Description of problem:

abrt-xorg.service fails to start


Version-Release number of selected component (if applicable):
$ rpm -qa | grep abrt | sort -n
abrt-2.14.0-2.fc32.x86_64
abrt-addon-ccpp-2.14.0-2.fc32.x86_64
abrt-addon-kerneloops-2.14.0-2.fc32.x86_64
abrt-addon-pstoreoops-2.14.0-2.fc32.x86_64
abrt-addon-vmcore-2.14.0-2.fc32.x86_64
abrt-addon-xorg-2.14.0-2.fc32.x86_64
abrt-cli-2.14.0-2.fc32.x86_64
abrt-dbus-2.14.0-2.fc32.x86_64
abrt-desktop-2.14.0-2.fc32.x86_64
abrt-gui-2.14.0-2.fc32.x86_64
abrt-gui-libs-2.14.0-2.fc32.x86_64
abrt-java-connector-1.1.4-1.fc32.x86_64
abrt-libs-2.14.0-2.fc32.x86_64
abrt-plugin-bodhi-2.14.0-2.fc32.x86_64
abrt-retrace-client-2.14.0-2.fc32.x86_64
abrt-tui-2.14.0-2.fc32.x86_64
gnome-abrt-1.3.1-1.fc32.x86_64
python3-abrt-2.14.0-2.fc32.x86_64
python3-abrt-addon-2.14.0-2.fc32.noarch


How reproducible:
Always


Steps to Reproduce:
1.
2.
3.

Actual results:

[chris@vm ~]$ systemctl --failed
  UNIT              LOAD   ACTIVE SUB    DESCRIPTION          
● abrt-xorg.service loaded failed failed ABRT Xorg log watcher

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed.
[chris@vm ~]$ systemctl status abrt-xorg.service
● abrt-xorg.service - ABRT Xorg log watcher
     Loaded: loaded (/usr/lib/systemd/system/abrt-xorg.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2020-02-19 20:24:35 MST; 2min 0s ago
    Process: 818 ExecStart=/usr/bin/abrt-dump-journal-xorg -fxtD (code=exited, status=1/FAILURE)
   Main PID: 818 (code=exited, status=1/FAILURE)
        CPU: 33ms

Feb 19 20:24:35 vm.local systemd[1]: Started ABRT Xorg log watcher.
Feb 19 20:24:35 vm.local abrt-dump-journal-xorg[818]: abrt-dump-journal-xorg: Cannot read journal data.
Feb 19 20:24:35 vm.local systemd[1]: abrt-xorg.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 20:24:35 vm.local systemd[1]: abrt-xorg.service: Failed with result 'exit-code'.
[chris@vm ~]$ 



Expected results:

Service units enabled by default should not fail.


Additional info:

enforcing=0 so selinux isn't related

Comment 1 Fedora Blocker Bugs Application 2020-02-20 03:37:11 UTC
Proposed as a Blocker for 32-final by Fedora user chrismurphy using the blocker tracking app because:

 Final: All system services present after installation with one of the release-blocking package sets must start properly, unless they require hardware which is not present.

Comment 2 Chris Murphy 2020-02-20 03:42:17 UTC
Created attachment 1664246 [details]
journal

Comment 3 František Zatloukal 2020-02-24 20:37:36 UTC
Discussed during the 2020-02-24 blocker review meeting: [1]

The decision to classify this bug as an AcceptedBlocker was made:

"All system services present after installation with one of the release-blocking package sets must start properly, unless they require hardware which is not present."

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2020-02-24/f32-blocker-review.2020-02-24-17.00.log.txt

Comment 4 ekulik 2020-02-25 07:55:48 UTC
I’ll take a look.

Comment 5 ekulik 2020-02-25 08:38:27 UTC
I am unable to reproduce the issue on Fedora-Workstation-Live-x86_64-32-20200224.n.0.

Comment 6 ekulik 2020-02-25 08:52:03 UTC
And the service is running after installation without me touching anything. I’m going to close this unless you can give me a clear reproducer.

Comment 7 ekulik 2020-02-25 08:58:24 UTC
(In reply to Chris Murphy from comment #2)
> Created attachment 1664246 [details]
> journal

This is full of noise.

Comment 8 ekulik 2020-02-25 09:31:16 UTC
Please take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1433652#c13. Could be an issue with your storage configuraiton.

Comment 9 ekulik 2020-02-28 07:17:27 UTC
I’ll add a better error message to suggest verifying the state of the journal, but this cannot be blocked on without information from you.

Comment 10 Chris Murphy 2020-02-28 17:34:06 UTC
I can't reproduce with Fedora-Workstation-Live-x86_64-32-20200226.n.0.iso on baremetal. I did get two reproducers in VMs over a week ago. Clean install, reboot, and the unit fails.

re: bug 1433652#c13, I don't have a corrupt journal in this case, but if there were a corrupt journal I think abrt-xorg.service needs to be able to handle it gracefully, rather than the service unit failing. Journal corrupt is less common these days but it's still an ordinary event. Systemd devs have said it's not a big deal, the file can still be read, just the corrupt records are skipped over, and the journal is rotated out anyway.

If I find a more consistent reproducer, I'll reopen.

Comment 11 ekulik 2020-03-05 16:01:03 UTC
I can suggest adding -v, -vv or -vvv to the service unit and seeing what kind of error libsystemd is returning (genius idea to print the error at a higher log level, so it doesn’t normally appear), because it seems to be happening on that side exclusively. We only try to get some data from it and, if it returns an error, then it returns an error and we can only bail out. If we can narrow down the error, maybe we can add some fail-safes and not explode, but I do not have a reliable reproducer.

Comment 12 Chris Murphy 2020-03-05 19:13:02 UTC
Merely having a system and/or user journal corrupt, doesn't trigger this...


$ sudo journalctl --verify
[sudo] password for chris: 
Sorry, try again.
[sudo] password for chris: 
PASS: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/system~                             
4baf58: Invalid entry item (7/32 offset: 000000                                                                                       
4baf58: Invalid object contents: Bad message                                                                                          
File corruption detected at /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/user-1000~:4baf58 (of 8388608 bytes, 59%).
FAIL: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/user-1000~ (Bad message)
2f76f00: Invalid entry item (12/27 offset: 000000                                                                                     
2f76f00: Invalid object contents: Bad message                                                                                         
File corruption detected at /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/system~:2f76f00 (of 50331648 bytes, 98%).
FAIL: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/system~ (Bad message)
PASS: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/system.journal                                                                
PASS: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/user-1000~                          
PASS: /var/log/journal/26336922e1044e80ae4bd42e1d6b9099/user-1000.journal                                                             
$ systemctl status abrt-xorg.service
● abrt-xorg.service - ABRT Xorg log watcher
     Loaded: loaded (/usr/lib/systemd/system/abrt-xorg.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2020-03-05 12:08:52 MST; 1min 20s ago
   Main PID: 764 (abrt-dump-journ)
      Tasks: 1 (limit: 3459)
     Memory: 19.5M
        CPU: 36ms
     CGroup: /system.slice/abrt-xorg.service
             └─764 /usr/bin/abrt-dump-journal-xorg -fxtD

Mar 05 12:08:52 localhost.localdomain systemd[1]: Started ABRT Xorg log watcher.
$ 


It's fairly trivial to corrupt a systemd journal, just by forcing power off on a VM. Such sabotage probably makes any discovered bug not a blocker, but still a valid way of checking for proper error handling. Yet I still can't trigger it.

Comment 13 ekulik 2020-03-06 07:19:43 UTC
Okay, so I’ll just go through the code path and bump the log level for error messages, so we can get more information the next time this happens.


Note You need to log in before you can comment on or make changes to this bug.