Bug 1769063 - kernel-5.3.8-200 will not boot Dell Inspiron
Summary: kernel-5.3.8-200 will not boot Dell Inspiron
Keywords:
Status: CLOSED DUPLICATE of bug 1779611
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-05 20:55 UTC by dc.hart
Modified: 2020-01-03 18:04 UTC (History)
CC List: 21 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-03 18:04:04 UTC
Type: Bug
Embargoed:



Description dc.hart 2019-11-05 20:55:15 UTC
1. Please describe the problem:

I get the grub menu. Selecting the kernel causes the screen to go blank, with no further progress. Nothing is captured in the logs. It just sits there until I hit the power button (it shuts down immediately).

I hope that this is the information required from sudo lshw:

reptile.swamp               
    description: Notebook
    product: Inspiron 5567 (0767)
    vendor: Dell Inc.
    serial: 3PK1Qxx
    width: 64 bits
    capabilities: smbios-2.8 dmi-2.8 smp vsyscall32
    configuration: boot=normal chassis=docking family=Inspiron sku=0767 uuid=44454C4C-5000-104B-8031-B3C04F514332
  *-core
       description: Motherboard
       product: 04M49V
       vendor: Dell Inc.
       physical id: 0
       version: A00
       serial: /3PK1QC2/CN129636A40662/
     *-firmware
          description: BIOS
          vendor: Dell Inc.
          physical id: 0
          version: 1.2.8
          date: 05/22/2019
          size: 64KiB
          capacity: 16MiB
          capabilities: pci pnp upgrade shadowing cdboot bootselect edd int13floppynec int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb smartbattery biosbootspecification netboot uefi
     *-memory
          description: System Memory
          physical id: 3a
          slot: System board or motherboard
          size: 8GiB

2. What is the Version-Release number of the kernel:

5.3.8-200


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

5.2.18-200.fc30.x86_64 works.


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Reboot to default kernel


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

No log output is created.

Comment 1 dc.hart 2019-11-14 20:46:08 UTC
This problem persists through kernel-5.3.11-200.fc30.x86_64. I changed the severity to high (I am not sure if it is now urgent).

Comment 2 dc.hart 2019-11-17 20:14:39 UTC
I don't mean to be a pest but nearly two weeks have elapsed with no response whatsoever. I am sure that someone needs additional information from me to debug this problem. Here is lscpu if that helps:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       39 bits physical, 48 bits virtual
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  2
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               142
Model name:          Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Stepping:            9
CPU MHz:             3370.651
CPU max MHz:         3500.0000
CPU min MHz:         400.0000
BogoMIPS:            5808.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            4096K
NUMA node0 CPU(s):   0-3
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d

Comment 3 Carl Byington 2019-11-19 16:49:16 UTC
I have exactly the same issue. I am currently running 5.2.18-200.fc30.x86_64. I have attempted to boot into kernel-5.3.11-200.fc30.x86_64 which was automatically installed via dnf. However, none of the journalctl logs show those attempts. My system has an encrypted disk, and I never get to the prompt for the disk key. So probably those boots were unable to flush the in memory log to disk.


for i in {0..-3}; do journalctl -b $i | head -3 | tail -1; done


Nov 19 08:08:42 localhost.localdomain kernel: Linux version 5.2.18-200.fc30.x86_64 (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Tue Oct 1 13:14:07 UTC 2019
Nov 18 10:31:13 localhost.localdomain kernel: Linux version 5.2.18-200.fc30.x86_64 (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Tue Oct 1 13:14:07 UTC 2019
Nov 17 08:50:16 localhost.localdomain kernel: Linux version 5.2.18-200.fc30.x86_64 (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Tue Oct 1 13:14:07 UTC 2019
Nov 16 07:41:30 localhost.localdomain kernel: Linux version 5.2.18-200.fc30.x86_64 (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Tue Oct 1 13:14:07 UTC 2019
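
The loop above prints the whole kernel banner line for each boot. A minimal sketch that reduces this to just the kernel version per recorded boot is shown here; the function name kernel_version_of is made up for illustration, and it assumes the banner has the "kernel: Linux version ..." form seen in the output above.

```shell
# Extract the kernel version from journal lines read on stdin.
# Assumes lines of the form "... kernel: Linux version <version> (...) ...".
kernel_version_of() {
    sed -n 's/.*kernel: Linux version \([^ ]*\) .*/\1/p'
}

# Show the kernel version of the current boot and the three before it.
# Boots that never flushed their log to disk will simply be absent.
if command -v journalctl >/dev/null 2>&1; then
    for i in 0 -1 -2 -3; do
        journalctl -k -b "$i" 2>/dev/null | kernel_version_of | head -n 1
    done
fi
```

If a 5.3.x version never shows up here after a failed attempt, that supports the idea that the hang happens before the journal can be written.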

One other issue - this Dell laptop (with 5.2.18-200 and earlier) dumps a few messages about hardware errors to the screen very early in the boot sequence - this is apparently a known issue on these laptops. It does not seem to cause any problems with those earlier kernels.

Message from syslogd@localhost at Nov 19 08:08:42 ...
 kernel:mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 9: ee2000000040110a

Message from syslogd@localhost at Nov 19 08:08:42 ...
 kernel:mce: [Hardware Error]: TSC 0 ADDR fef1ff00 MISC 3880010086 

Message from syslogd@localhost at Nov 19 08:08:42 ...
 kernel:mce: [Hardware Error]: PROCESSOR 0:906e9 TIME 1574179720 SOCKET 0 APIC 0 microcode b4

Those messages don't appear on the screen when attempting to boot 5.3.x kernels.

Comment 4 Carl Byington 2019-11-19 17:25:02 UTC
Removing "quiet rhgb" from the kernel args, and adding some or all of acpi=off, pci=noacpi, earlyprintk=vga gives a mostly blank screen with just:

EFI stub: UEFI secureboot is enabled

After that message, it just hangs. I never get the LUKS prompt for the disk key.

Comment 5 dc.hart 2019-11-23 18:36:27 UTC
Latest rawhide kernel - same problem. Tried secure boot change in bios. No help. I am wondering if I should try to convert from UEFI to legacy. We NEED to hear from someone at Fedora.

Comment 6 alxndr13 2019-11-26 08:33:55 UTC
Same error here on an OptiPlex 7050. Error occurred only on 5.3.11. Booting with 5.3.8 works.

lshw output: 
alpha              
    description: Desktop Computer
    product: OptiPlex 7050 (07A1)
    vendor: Dell Inc.
    serial: 8V6QCM2
    width: 64 bits
    capabilities: smbios-3.0.0 dmi-3.0.0 smp vsyscall32
    configuration: boot=normal chassis=desktop family=OptiPlex sku=07A1 uuid=44454C4C-5600-1036-8051-B8C04F434D32
  *-core
       description: Motherboard
       product: 0XHGV1
       vendor: Dell Inc.
       physical id: 0
       version: A00
       serial: /8V6QCM2/CNWS20078B01AC/
     *-firmware
          description: BIOS
          vendor: Dell Inc.
          physical id: 0
          version: 1.6.5
          date: 09/09/2017
          size: 64KiB
          capacity: 16MiB
          capabilities: pci pnp upgrade shadowing cdboot bootselect edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification netboot uefi
     *-memory
          description: System Memory
          physical id: 9
          slot: System board or motherboard
          size: 16GiB

Comment 7 Hans de Goede 2019-11-26 17:29:16 UTC
d.c.hart: Have you tried booting with "rhgb quiet" removed from the kernel commandline?  (you can edit the kernel cmdline in the grub menu).

Maybe that will give some messages / hints as to what is going on. If that does not help, please try adding: "nomodeset" to the kernel commandline and see if that helps.
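
For anyone unsure what the edited line should look like, here is a rough sketch that only prints the result of the edit described above. It is plain text processing and changes nothing on the system; the function name debug_cmdline is made up for illustration.

```shell
# Print a kernel command line with "rhgb" and "quiet" dropped and any
# extra arguments appended. Nothing is written to the boot configuration.
# Note: arguments are split on whitespace, so this assumes the command
# line contains no shell glob characters.
debug_cmdline() {
    cmdline="$1"
    shift
    out=""
    for arg in $cmdline; do
        case "$arg" in
            rhgb|quiet) ;;                      # drop splash and quiet flags
            *) out="${out:+$out }$arg" ;;
        esac
    done
    for extra in "$@"; do
        out="${out:+$out }$extra"               # append e.g. nomodeset
    done
    printf '%s\n' "$out"
}

# Show what the running kernel's command line would become.
if [ -r /proc/cmdline ]; then
    debug_cmdline "$(cat /proc/cmdline)" nomodeset
fi
```

The printed line is what the "linux" (or "linuxefi") entry should end with after editing it in the grub menu.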

Comment 8 dc.hart 2019-11-26 19:03:55 UTC
No messages or hints. Someone else said it returns something like "uefi secureboot"

Nomodeset - same result. Straight to a blank screen. I tried changing the bios to secure boot. Same result. I tried a bare minimum 5.4 kernel with no modules. Same thing. F-31 live image boots from a USB stick. Something changed in 5.3 onward that is affecting a small number of users. I have the latest BIOS according to Dell.

Comment 9 Hans de Goede 2019-11-26 20:56:07 UTC
Hmm, can you check what the exact kernel version is on the livecd? I think it is 5.3.6. You can still boot the machine using 5.2.18 right? 

You can download 5.3.6 here:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1400114

Here are instructions for installing a kernel directly from koji:
https://fedorapeople.org/~jwrdegoede/kernel-test-instructions.txt

If the livecd has a different version than 5.3.6, please try the livecd version. You can find all official kernel builds here:
https://koji.fedoraproject.org/koji/packageinfo?packageID=8

If the livecd-version does boot when installed on the system please also test 5.3.7:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1402630

That will narrow down the possible causes to the changes in a single 5.3.z version, which should help find the cause.

Comment 10 dc.hart 2019-11-26 21:21:58 UTC
Same result with 5.3.6 and 5.3.7. I am baffled.

Comment 11 Hans de Goede 2019-11-27 10:26:39 UTC
(In reply to dc.hart from comment #10)
> Same result with 5.3.6 and 5.3.7. I am baffled.

Are you using classic BIOS boot or UEFI boot? If you do not know, try running "ls /sys/firmware/efi/efivars". If you get a "No such file or directory" error, then your system is booting in classic BIOS mode; if you get a bunch of files, you are running in UEFI mode.
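
The same check can be wrapped in a small function so that it gives a one-word answer on both the installed system and the livecd. This is only a sketch: it tests for /sys/firmware/efi, the parent of the efivars directory, and the optional argument is a made-up alternate root that exists purely so the function can be tried against a scratch directory.

```shell
# Report the firmware mode the running system was booted in.
# $1 (optional, hypothetical): alternate root directory, for testing only.
boot_mode() {
    root="${1:-}"
    if [ -d "$root/sys/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

boot_mode
```

Run it once on the installed system and once from the livecd; if the two answers differ, the boot modes differ.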

It is possible that you are using one mode for the installed version and another for the livecd. Typically with the livecd, your BIOS's boot menu will let you choose the USB device as boot source twice: once labelled EFI, and the one without EFI typically is classic BIOS mode. Either way you can use the same check under the livecd too.

If the 2 boot methods are different, that might explain it. In that case try to boot the livecd in the same mode as the install to see if that helps.

If both methods are the same and they are both classic BIOS, then this might be an issue with the bootloader. For classic BIOS the livecd uses syslinux, whereas the install uses grub2.

If your install is using classic BIOS one thing to try is updating the installed grub version, with classic BIOS grub gets installed into the mbr, and the version in the MBR stays at the version from installation time, even though the grub package itself may have been updated later. To get the newer version into your MBR you need to re-install grub in the MBR, see:

https://fedoraproject.org/wiki/GRUB_2#Updating_GRUB_2_configuration_on_BIOS_systems

Note that /dev/sda is an example. If you are using a single SATA disk and booting from that disk then it is correct, but your setup might be more complex: you need to specify the disk which your system is booting from, which might be a different disk than the one with Linux on it if you have multiple disks.

Note you only need to run the grub2-install command, you shouldn't need to run grub2-mkconfig, although giving that a try does not hurt.

Comment 12 Ben Cotton 2019-11-27 14:19:25 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 dc.hart 2019-11-27 14:51:14 UTC
Version = 30! 

Please revert the status

Comment 14 dc.hart 2019-11-27 15:46:37 UTC
Both are UEFI. BTW (grasping at straws) the EFI System Partition is fat16. Is that correct?

[dch@reptile ~]$ sudo parted -l
Model: ATA Samsung SSD 840 (scsi)
Disk /dev/sda: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  211MB   210MB   fat16        EFI System Partition  boot, esp
 2      211MB   1285MB  1074MB  ext4
 3      1285MB  500GB   499GB                                      lvm

The entry in fstab seems conflicting:

UUID=88D3-9C76  /boot/efi       vfat    umask=0077,shortname=winnt      0       2

Comment 15 Hans de Goede 2019-11-27 20:19:00 UTC
(In reply to dc.hart from comment #14)
> Both are UEFI. BTW (grasping at straws) the EFI System Partition is fat16. Is that correct?

That should not be a problem.

You are on F30, right? Only thing I can think of is that there is a bug in the somewhat older grub in F30 which triggers on some Dell machines in combination with newer kernels. Mind you this is just a theory.

What you can do is install the F31 grub on F30, go to:

https://koji.fedoraproject.org/koji/buildinfo?buildID=1417369

Then download: grub2-efi-x64-2.02-103.fc31.x86_64.rpm

And then run:

sudo rpm -Uvh grub2-efi-x64-2.02-103.fc31.x86_64.rpm

rpm might complain about some other grub bits being too old when you do that; in that case download the other bits and then add them to the rpm -Uvh commandline.

I'm assuming here that the shim from F30 will also be happy with the signatures on the F31 grub, I'm not familiar with the key management for the keys used for this, I guess they might be per distro. So if you get some secure boot related error after this and you cannot load the grub menu at all any more, try disabling secureboot in your BIOS settings.

Comment 16 dc.hart 2019-12-02 18:41:11 UTC
Worked ... and then it didn't.

On Wednesday evening I decided to upgrade grub by upgrading from 30 to 31. The system booted from 5.3.12. The system booted on Thursday and Friday. On Saturday the system would not boot from the 5.3.x kernel but continues to boot from 5.2.18-200.fc30.x86_64. I reinstalled grub2\* and efi\* - same immediate blank screen - nothing to the logs. No clues from zapping rhgb quiet.

I have no software from any source other than the official repositories. I am extremely busy over the next few days (in spite of being retired). I don't know enough about what happens once a kernel is selected from the grub menu but wonder if this has something to do with akmods or akmod-VirtualBox. VB is mission critical for me as I run a discrete virtual machine (Mint) with a vpn.

I am also wondering if I might have a virus. 2019 is my 20th year of using Redhat or a derivative. This is a first. I will backup and experiment on Friday.

Comment 17 dc.hart 2019-12-08 21:04:10 UTC
I tried everything and ended up doing a clean install which is an adventure. The latency of dnf had me chewing my keyboard. But I digress.

After install the machine boots from 5.3.7. After dnf upgrade it will not boot from either 5.3 kernel but will boot from 5.2.18. This leads me to believe that this is NOT a kernel issue but, rather, a problem with grub2. I have not made a change on this report. I will leave that up to someone at Redhat.

What I do not understand is why this doesn't seem to be affecting a large number of users. Dell, i7, SSD doesn't seem all that unique. Later in the week I might experiment with extlinux. With each new kernel I would create initramfs and vmlinux symlinks.

Comment 18 Hans de Goede 2019-12-18 13:20:19 UTC
(In reply to dc.hart from comment #17)
> I tried everything and ended up doing a clean install which is an adventure.
> The latency of dnf had me chewing my keyboard. But I digress.
> 
> After install the machine boots from 5.3.7. After dnf upgrade it will not
> boot from either 5.3 kernel but will boot from 5.2.18. This leads me to
> believe that this is NOT a kernel issue but, rather, a problem with grub2. I
> have not made a change on this report. I will leave that up to someone at
> Redhat.

Hmm, weird. Did the dnf upgrade also upgrade grub2 perhaps?

You could try doing:

sudo dnf downgrade 'grub2*'

That will give you an older version. Please try to run it twice, the first
time to go from updates-testing version to the updates one and then another
time to go the release version. The second run may fail because you may end
up at the release version on the first run.

If that does not help, you can also try downgrading the shim:

sudo dnf downgrade 'shim*'
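
To keep track of which versions each downgrade actually leaves installed, something like the following sketch can be run before and after each step. The package list is an assumption (grub2-efi-x64 is the package named earlier in this bug; grub2-common and shim-x64 are the usual companions on a UEFI install), and the function name boot_pkg_versions is made up for illustration.

```shell
# Print the installed version of each boot-chain package.
# The package names are an assumed typical UEFI set; adjust as needed.
boot_pkg_versions() {
    for pkg in grub2-efi-x64 grub2-common shim-x64; do
        rpm -q "$pkg" || true    # a missing package is reported, not fatal
    done
}

# Only query when rpm is available on this system.
if command -v rpm >/dev/null 2>&1; then
    boot_pkg_versions
fi
```

Pasting the before and after output into the bug makes it clear which grub and shim builds were tested.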

Comment 19 Hans de Goede 2019-12-18 13:21:54 UTC
p.s.

I take it the original/release F31 kernel, which you can also still select after the dnf upgrade, is also broken after the dnf upgrade?

Comment 20 Hans de Goede 2019-12-19 08:46:04 UTC
I just noticed that we also have bug 1779611 opened by another user now, which is about the F31 release version of grub working and the one from the updates repo causing the system to not boot with any 5.3 kernels. I have the feeling these 2 bugs might be the same issue.

Comment 21 Andy 2019-12-19 20:50:24 UTC
Still an issue with Dell Inspiron 5567 on kernel 5.3.16-200.fc30.x86_64 - exact same issue, standard dnf upgrade on Fedora 30. Boots fine with 5.2.18 kernel.

Comment 22 dc.hart 2019-12-19 21:09:24 UTC
I just received a new Dell 5584 with the newer i7. We'll see what happens. More importantly I can experiment with the older machine. I am going to start with sgdisk -Z /dev/sda and see if that yields a different result.

Comment 23 Hans de Goede 2019-12-20 08:33:03 UTC
Andy, d.c.hart, has either of you tried to downgrade grub as I suggested in comment 18 ?  Comments in bug 1779611 suggest that that bug is the same issue and there downgrading grub helps.

Comment 24 Andy 2019-12-20 12:58:52 UTC
Downgraded GRUB to 1:2.02-84.fc30 (the lowest it will go on F30).   

Same behavior - still will not boot 5.3.x and still boots 5.2.18

Comment 25 Hans de Goede 2020-01-03 18:04:04 UTC
We've received one more similar bug report, at this time the most likely cause is that this is a grub issue and most information wrt debugging this from the grub side is located in bug 1779611, so I'm marking this as a duplicate of that bug.

*** This bug has been marked as a duplicate of bug 1779611 ***

