Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1538550 - failed to find a suitable stage1 device
Summary: failed to find a suitable stage1 device
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: python-blivet
Version: 28
Hardware: s390x
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Vendula Poncova
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
TreeView+ depends on / blocked
 
Reported: 2018-01-25 09:50 UTC by Dan Horák
Modified: 2018-03-19 10:03 UTC (History)
16 users (show)

Fixed In Version: python-blivet-3.0.0-0.4.b1.fc28
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1550617 (view as bug list)
Environment:
Last Closed: 2018-03-14 01:51:13 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
storage.log (272.95 KB, text/plain)
2018-01-25 09:50 UTC, Dan Horák
no flags Details
anaconda.log (35.86 KB, text/plain)
2018-01-25 09:52 UTC, Dan Horák
no flags Details
program.log (34.10 KB, text/plain)
2018-01-25 09:53 UTC, Dan Horák
no flags Details
syslog (271.96 KB, text/plain)
2018-02-06 16:09 UTC, Vendula Poncova
no flags Details

Description Dan Horák 2018-01-25 09:50:58 UTC
Created attachment 1385980 [details]
storage.log

When installing current Rawhide (20180123) in a s390x guest, I get the "failed to find a suitable stage1 device" error during storage configuration.


Version-Release number of selected component (if applicable):
anaconda 28.18-1.fc28

How reproducible:
100%

Comment 1 Dan Horák 2018-01-25 09:52:29 UTC
Created attachment 1385981 [details]
anaconda.log

Comment 2 Dan Horák 2018-01-25 09:53:09 UTC
Created attachment 1385982 [details]
program.log

Comment 3 Vendula Poncova 2018-02-06 16:09:50 UTC
Created attachment 1392197 [details]
syslog

I am able to reproduce the bug, but I am not sure what is the cause. However, there are some errors in the attached syslog:

15:24:33,372 ERR systemd-udevd:Invalid rule /usr/lib/udev/rules.d/60-readahead.rules:10: unknown key 'WAIT_FOR'

15:24:37,332 WARNING kernel:Call Trace:
15:24:37,332 WARNING kernel:([<040000007cb6fc30>] 0x40000007cb6fc30)
15:24:37,332 WARNING kernel: [<00000000004f7664>] blk_mq_run_hw_queue+0x4c/0xb0 
15:24:37,332 WARNING kernel: [<00000000004f7730>] blk_mq_run_hw_queues+0x68/0x88 
15:24:37,332 WARNING kernel: [<00000000001e0610>] flush_smp_call_function_queue+0xc0/0x160 
15:24:37,332 WARNING kernel: [<0000000000116830>] smp_handle_ext_call+0xb8/0xc0 
15:24:37,332 WARNING kernel: [<000000000010bc74>] do_ext_interrupt+0xbc/0xe0 
15:24:37,332 WARNING kernel: [<00000000001b222c>] __handle_irq_event_percpu+0x64/0x240 
15:24:37,332 WARNING kernel: [<00000000001b2446>] handle_irq_event_percpu+0x3e/0x88 
15:24:37,332 WARNING kernel: [<00000000001b7154>] handle_percpu_irq+0x6c/0x98 
15:24:37,332 WARNING kernel: [<00000000001b11e2>] generic_handle_irq+0x42/0x60 
15:24:37,332 WARNING kernel: [<000000000010c08c>] do_IRQ+0x84/0xb8 
15:24:37,332 WARNING kernel: [<00000000008488d4>] ext_int_handler+0x100/0x104 
15:24:37,332 WARNING kernel: [<0000000000102be6>] enabled_wait+0x4e/0xe0 
15:24:37,332 WARNING kernel:([<0000000000000000>]           (null))
15:24:37,332 WARNING kernel: [<0000000000102f22>] arch_cpu_idle+0x32/0x48 
15:24:37,332 WARNING kernel: [<00000000001942f8>] do_idle+0xe8/0x170 
15:24:37,332 WARNING kernel: [<0000000000194556>] cpu_startup_entry+0x3e/0x48 
15:24:37,332 WARNING kernel: [<0000000000116760>] smp_start_secondary+0x120/0x138 
15:24:37,332 WARNING kernel: [<0000000000848cd6>] restart_int_handler+0x62/0x78 
15:24:37,332 WARNING kernel: [<0000000000000000>]           (null) 
15:24:37,332 WARNING kernel:Last Breaking-Event-Address:
15:24:37,332 WARNING kernel: [<00000000004f75e4>] __blk_mq_delay_run_hw_queue+0xbc/0xd8

Comment 4 Vendula Poncova 2018-02-06 16:19:57 UTC
Let's start with fixing the error from s390utils:

15:24:33,372 ERR systemd-udevd:Invalid rule /usr/lib/udev/rules.d/60-readahead.rules:10: unknown key 'WAIT_FOR'

Comment 5 Fedora End Of Life 2018-02-20 15:23:24 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 28 development cycle.
Changing version to '28'.

Comment 6 Michal Schmidt 2018-02-23 16:06:31 UTC
The use of "WAIT_FOR" in the udev rules is bug 1326715. It's probably harmless, because udev removed the implementation of WAIT_FOR already back in 2015.

The kernel warning looks more important. It's likely a bug in the dasd driver. 
The warning is this one:

static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
                                        unsigned long msecs)
{
>>>>    if (WARN_ON_ONCE(!blk_mq_hw_queue_mapped(hctx)))
                return;


dasd is not expressly implicated in the call trace, but it's possible for the flow to go like this:
__dasd_cleanup_cqr
  blk_mq_complete_request
    __blk_mq_complete_request
      rq->csd.func = __blk_mq_complete_request_remote;
      smp_call_function_single_async(ctx->cpu, &rq->csd);

Then a CPU gets the IPI and handles it:
flush_smp_call_function_queue
  __blk_mq_complete_request_remote
    rq->q->softirq_done_fn(rq);
      // tail call. softirq_done_fn was dasd_request_done.
      dasd_request_done
        // tail call
        blk_mq_run_hw_queues

The above only explains why the call trace looks the way it does. It does not explain the actual bug.

commit 673514aff5d797b0cdb3f09097ae0e253b5667c3
("s390/dasd: fix race during dasd initialization") supposedly fixed a similar bug before (already in v4.14).

Comment 7 Dan Horák 2018-02-23 16:19:49 UTC
I've opened https://github.com/ibm-s390-tools/s390-tools/issues/27 for the WAIT_FOR issue.

Comment 8 Dan Horák 2018-02-23 16:42:21 UTC
The kernel traceback is likely unrelated to the problem. The problem seem to be related to the fact there are partitions on the DASDs (from a previously installed system). When I removed partitions from all DASDs from a shell with fdasd, refreshed/rescanned the storage and then selected again all DASDs as the target for the installation, then there was no error for the storage spoke and actual installation could be started. This is with Fedora-28-20180221.n.0 compose.

Comment 9 Dan Horák 2018-02-23 17:24:32 UTC
Hmm, or maybe not ... The storage is set only after I unselect the "install bootloader" checkbox.

Comment 10 Dan Horák 2018-02-26 08:59:30 UTC
The anaconda.log contains

...
12:07:46,458 DBG bootloader: _is_valid_disklabel(dasda) returning False
12:07:46,458 DBG bootloader: _is_valid_size(dasda) returning True
12:07:46,458 DBG bootloader: _is_valid_location(dasda) returning True
12:07:46,459 DBG bootloader: _is_valid_format(dasda) returning True
12:07:46,459 DBG bootloader: is_valid_stage1_device(dasda) returning False
12:07:46,828 DBG bootloader: new disk order: []
12:07:46,829 DBG kickstart.kickstart.bootloader: Bootloader: use /boot partition's disk 'dasda' as boot drive, dry run False
12:07:46,838 DBG bootloader: _is_valid_disklabel(dasda) returning False
12:07:46,838 DBG bootloader: _is_valid_size(dasda) returning True
12:07:46,838 DBG bootloader: _is_valid_location(dasda) returning True
12:07:46,838 DBG bootloader: _is_valid_format(dasda) returning True
12:07:46,839 DBG bootloader: is_valid_stage1_device(dasda) returning False
12:07:46,840 DBG bootloader: _is_valid_disklabel(dasda1) returning False
12:07:46,840 DBG bootloader: _is_valid_size(dasda1) returning True
12:07:46,840 DBG bootloader: _is_valid_location(dasda1) returning True
12:07:46,841 WRN bootloader: dasda1 not bootable
12:07:46,841 DBG bootloader: _is_valid_format(dasda1) returning True
12:07:46,841 DBG bootloader: is_valid_stage1_device(dasda1) returning False
12:07:46,842 DBG bootloader: _is_valid_disklabel(dasda2) returning False
12:07:46,843 DBG bootloader: _is_valid_size(dasda2) returning True
12:07:46,843 DBG bootloader: _is_valid_location(dasda2) returning True
12:07:46,843 WRN bootloader: dasda2 not bootable
12:07:46,843 DBG bootloader: _is_valid_format(dasda2) returning True
12:07:46,843 DBG bootloader: is_valid_stage1_device(dasda2) returning False
12:07:46,844 DBG bootloader: stage1 device cannot be of type lvmvg
12:07:46,845 DBG bootloader: stage1 device cannot be of type lvmlv
12:07:46,845 DBG bootloader: stage1 device cannot be of type lvmlv
12:07:46,846 ERR ui.gui.spokes.storage: BootLoader setup failed: failed to find a suitable stage1 device
...

for a dasd patitioned as
(parted) p                                                                
Model: IBM S390 DASD drive (dasd)
Disk /dev/dasda: 2462MB
Sector size (logical/physical): 512B/4096B
Partition Table: dasd
Disk Flags: 

Number  Start   End     Size    File system  Flags
 1      98.3kB  1074MB  1074MB  ext4
 2      1074MB  2462MB  1388MB               lvm


I see the problem in the "_is_valid_disklabel(dasda1)" call as it returns False. Or it could be blivet's fault - https://github.com/storaged-project/blivet/blob/master/blivet/formats/disklabel.py#L222

Comment 11 Vendula Poncova 2018-02-27 15:54:29 UTC
Fixed in a pull request: https://github.com/storaged-project/blivet/pull/663

Comment 12 Dan Horák 2018-02-28 09:24:40 UTC
Just verified the proposed fix makes F-28 installable.

Comment 13 Fedora Update System 2018-03-07 15:29:16 UTC
python-blivet-3.0.0-0.3.b1.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-538fc64b1c

Comment 14 Fedora Update System 2018-03-08 15:28:12 UTC
python-blivet-3.0.0-0.3.b1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-538fc64b1c

Comment 15 Fedora Update System 2018-03-13 15:09:19 UTC
anaconda-28.22.2-3.fc28, python-blivet-3.0.0-0.4.b1.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-b623ca92af

Comment 16 Fedora Update System 2018-03-14 01:51:13 UTC
anaconda-28.22.2-3.fc28, python-blivet-3.0.0-0.4.b1.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Dan Horák 2018-03-19 10:03:16 UTC
Fix has been verified with the Fedora-28-20180318.n.0 compose, thanks.


Note You need to log in before you can comment on or make changes to this bug.