Bug 510524 - fix zFCP in anaconda and also make it work with changed sysfs interface of device driver
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: s390x
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: David Cantrell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
 
Reported: 2009-07-09 16:06 UTC by Steffen Maier
Modified: 2009-08-25 09:33 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-08-25 09:33:48 UTC
Type: ---
Embargoed:


Attachments
[PATCH 1/5] correctly activate zFCP LUN on s390 (4.03 KB, patch)
2009-07-10 20:27 UTC, Steffen Maier
[PATCH 2/5] correctly delete a SCSI device provided by a zFCP LUN on s390 (2.61 KB, patch)
2009-07-10 20:27 UTC, Steffen Maier
[PATCH 3/5] correctly deactivate zFCP LUN on s390 (3.70 KB, patch)
2009-07-10 20:28 UTC, Steffen Maier
[PATCH 4/5] error messages of zFCP on s390: log or pass to the UI (2.69 KB, patch)
2009-07-10 20:28 UTC, Steffen Maier
[PATCH 5/5] prevent getting started up or shutdown again while already in such state (1.34 KB, patch)
2009-07-10 20:28 UTC, Steffen Maier

Description Steffen Maier 2009-07-09 16:06:00 UTC
Description of problem:
zFCP LUNs (more specifically, disks in this context) cannot be activated in anaconda, neither by specifying FCP_* options in the parmfile or conffile nor by using "add zFCP" in the advanced storage configuration of the GUI.

When going back through the wizard screens in anaconda, the storage subsystem gets shut down at roughly the first screen. If zFCP LUNs were active before, they are shut down incorrectly, causing all kinds of kernel error messages, e.g. SCSI devices can no longer access their LUN.

Version-Release number of selected component (if applicable):
anaconda-11.5.0.51-1.fc11.s390x

How reproducible:
In parm file, conf file, or the anaconda UI, try to add zFCP LUNs.
With activated zFCP LUNs, go backwards in the anaconda UI up to the first screen.

Actual results:
zFCP LUNs (disks) cannot be activated and also do not get deactivated correctly.

Expected results:
zFCP LUNs can be activated and deactivated by all means provided by anaconda.

Additional info:
Patch with fix will follow.

More details can be found in a somewhat related bug against RHEL 5.3:
Bug 494033 - upgrade on FCP disks impossible (possibly also on iSCSI)

The essential parts of #494033 which apply here:

> 13:57:12 INFO    : moving (1) to step partitionobjinit
> 13:57:12 DEBUG   : echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_add
> 13:57:12 DEBUG   : echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> 13:57:12 DEBUG   : echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> 13:57:12 DEBUG   : echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> 13:57:12 DEBUG   : echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> 13:57:12 DEBUG   : echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> 13:57:12 WARNING : error bringing zfcp device 0.0.3c1b online: [Errno 22] Invalid argument
> 13:57:12 DEBUG   : echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> 13:57:12 WARNING : error bringing zfcp device 0.0.3c1b online: [Errno 22] Invalid argument
> 13:57:12 DEBUG   : starting mpaths

The steps are executed in the wrong order. Also, anaconda seems to try
to add each LUN twice. Since the drives appear, it does not seem to
matter. However, I strongly suggest getting it right in order not to
provoke any other issues in the future. This would be the correct
order with a simplified scheme (not taking into account which LUNs
share a WWPN that has already been added for the first of its
LUNs):

forall defined FCP disks do
1) set FCP adapter device online
2) add WWPN to adapter
3) add LUN to WWPN
done

I.e. the above log should come out as follows:

> echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_add
> echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_add
> echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add

The absolutely correct version would be (but it probably requires too
much code for checking dependencies, and the above also works):

> echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_add
> echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
> echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_add
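
The ordering above can be sketched in Python as follows. This is only a
minimal illustration, not the code from the attached patches: the function
name activation_writes and the (adapter, WWPN, LUN) tuple layout are
assumptions made for this example. It only computes the (value, path)
pairs and does not touch sysfs. The sysfs attribute names are the ones
that appear in the log above.

```python
ZFCP_SYSFS = "/sys/bus/ccw/drivers/zfcp"


def activation_writes(luns):
    """Return ordered (value, path) pairs for activating zFCP LUNs.

    luns is a list of (adapter, wwpn, lun) string tuples (hypothetical
    layout for this sketch). Each adapter is set online once, each WWPN
    is added once per adapter, and every LUN is added after its WWPN.
    """
    writes = []
    adapters_online = set()
    ports_added = set()
    for adapter, wwpn, lun in luns:
        base = f"{ZFCP_SYSFS}/{adapter}"
        # 1) set FCP adapter device online (only once per adapter)
        if adapter not in adapters_online:
            writes.append(("1", f"{base}/online"))
            adapters_online.add(adapter)
        # 2) add WWPN to adapter (only once per adapter/WWPN pair)
        if (adapter, wwpn) not in ports_added:
            writes.append((wwpn, f"{base}/port_add"))
            ports_added.add((adapter, wwpn))
        # 3) add LUN to WWPN
        writes.append((lun, f"{base}/{wwpn}/unit_add"))
    return writes


if __name__ == "__main__":
    example = [
        ("0.0.3c1b", "0x500507630300c562", "0x401040ea00000000"),
        ("0.0.3c1b", "0x500507630300c562", "0x401040eb00000000"),
    ]
    for value, path in activation_writes(example):
        print(f"echo {value} > {path}")
```

For the two LUNs from the log, this yields the four writes of the
"absolutely correct" version: online, port_add, then one unit_add per LUN.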

> 13:57:13 DEBUG   : done starting mpaths.  Drivelist: ['sda', 'sdb', 'dasda']
> 13:57:13 INFO    : pv is /dev/dasda2 in vg VolGroup00, size is 6943
> 13:57:13 INFO    : vg VolGroup00, size is 6912, pesize is 32768
> 13:57:13 DEBUG   : VolumeGroupRequestSpec('VolGroup00').preexist_size is 6912.0
> 13:57:13 INFO    : lv is VolGroup00/LogVol00, size of 4896
> 13:57:13 INFO    : lv is VolGroup00/LogVol01, size of 2016
> 13:57:13 DEBUG   : /dev/VolGroup00/LogVol00 not probed as ext4dev
> 13:57:13 DEBUG   : /dev/VolGroup00/LogVol00 not probed as ext4
> 13:57:13 INFO    : moving (1) to step parttype

OK, the SCSI disks are there now. To my surprise the VolGroup01 on sda
does not appear in the log.

> 13:57:38 INFO    : moving (-1) to step partitionobjinit
> 13:57:38 DEBUG   : removing drive dasda from disk lists
> 13:57:38 DEBUG   : removing drive sda from disk lists
> 13:57:38 DEBUG   : removing drive sdb from disk lists
> 13:57:38 DEBUG   : echo 1 > /sys/bus/scsi/devices/0:0:0:1/delete
> 13:57:38 DEBUG   : echo 0 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> 13:57:38 DEBUG   : echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> 13:57:38 DEBUG   : echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_remove
> 13:57:38 WARNING : error bringing zfcp device 0.0.3c1b offline: [Errno 6] No such device or address
> 13:57:38 DEBUG   : echo 1 > /sys/bus/scsi/devices/0:0:0:2/delete
> 13:57:38 DEBUG   : echo 0 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> 13:57:38 DEBUG   : echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> 13:57:38 DEBUG   : echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_remove
> 13:57:38 INFO    : moving (-1) to step findinstall

Even worse, when I step back, hoping that the SCSI LUNs are now
activated and that my installation on VolGroup01 on sda would appear
in the upgrade systems dropdown box, anaconda unconfigures all FCP
stuff again.

It does so in the wrong order again, and this time the wrong order
does matter: it even generates ugly error messages from the zfcp
device driver on the console:

*** result of wrong deletion of zfcp scsi disks on the console: ***

> zfcp: unit erp failed on unit 0x401040ea00000000 on port 0x500507630300c562  on adapter 0.0.3c1b
> zfcp: unit erp failed on unit 0x401040eb00000000 on port 0x500507630300c562  on adapter 0.0.3c1b
>  rport-0:0-0: blocked FC remote port time out: saving binding

In order to prevent this, the following order must be used for
unconfiguring FCP SCSI devices (again simplified as above):

forall SCSI devices that are FCP attached: remove SCSI device
forall defined LUNs remove unit from corresponding WWPN
forall defined WWPNs remove port from corresponding adapter
forall defined FCP adapters set adapter offline

I.e. the above log should come out as follows:

> echo 1 > /sys/bus/scsi/devices/0:0:0:1/delete
> echo 1 > /sys/bus/scsi/devices/0:0:0:2/delete
> echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_remove
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_remove
> echo 0 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
> echo 0 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online

The absolutely correct version would be (but it probably requires too
much code for checking dependencies, and the above also works):

> echo 1 > /sys/bus/scsi/devices/0:0:0:1/delete
> echo 1 > /sys/bus/scsi/devices/0:0:0:2/delete
> echo 0x401040ea00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> echo 0x401040eb00000000 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/0x500507630300c562/unit_remove
> echo 0x500507630300c562 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/port_remove
> echo 0 > /sys/bus/ccw/drivers/zfcp/0.0.3c1b/online
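
The reverse ordering can be sketched the same way. Again, this is only an
illustration under assumptions, not the code from the attached patches:
the function name deactivation_writes, the tuple layout, and passing the
SCSI device IDs as a separate list are choices made for this example. It
only computes the (value, path) pairs. A real implementation would also
have to check that no other LUNs or ports outside the given list are still
configured before removing a port or setting an adapter offline.

```python
ZFCP_SYSFS = "/sys/bus/ccw/drivers/zfcp"
SCSI_SYSFS = "/sys/bus/scsi/devices"


def deactivation_writes(luns, scsi_ids):
    """Return ordered (value, path) pairs for deactivating zFCP LUNs.

    luns is a list of (adapter, wwpn, lun) string tuples and scsi_ids a
    list of H:C:T:L strings of the FCP-attached SCSI devices (both are
    hypothetical layouts for this sketch).
    """
    # 1) remove all FCP-attached SCSI devices from the SCSI stack
    writes = [("1", f"{SCSI_SYSFS}/{scsi_id}/delete") for scsi_id in scsi_ids]
    # 2) remove every LUN (unit) from its WWPN
    for adapter, wwpn, lun in luns:
        writes.append((lun, f"{ZFCP_SYSFS}/{adapter}/{wwpn}/unit_remove"))
    # 3) remove every WWPN (port) from its adapter, once per pair;
    #    dict.fromkeys removes duplicates while keeping first-seen order
    for adapter, wwpn in dict.fromkeys((a, w) for a, w, _ in luns):
        writes.append((wwpn, f"{ZFCP_SYSFS}/{adapter}/port_remove"))
    # 4) set every adapter offline, once per adapter
    for adapter in dict.fromkeys(a for a, _, _ in luns):
        writes.append(("0", f"{ZFCP_SYSFS}/{adapter}/online"))
    return writes


if __name__ == "__main__":
    example = [
        ("0.0.3c1b", "0x500507630300c562", "0x401040ea00000000"),
        ("0.0.3c1b", "0x500507630300c562", "0x401040eb00000000"),
    ]
    for value, path in deactivation_writes(example, ["0:0:0:1", "0:0:0:2"]):
        print(f"echo {value} > {path}")
```

For the two LUNs from the log, this yields the six writes of the
"absolutely correct" version above.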

*** most important constraints on configuring zFCP: ***

http://www.ibm.com/developerworks/linux/linux390/documentation_dev.html
Device Drivers, Features, and Commands - SC33-8411-02
May 2009, Linux Kernel 2.6 - Development stream
http://download.boulder.ibm.com/ibmdl/pub/software/dw/linux390/docu/l26ddd02.pdf

Chapter 6. SCSI-over-Fibre Channel device driver
Working with the zfcp device driver

Setting an FCP channel online or offline
By default, FCP channels are offline. Set an FCP channel online before
you perform any other tasks.

Configuring and removing ports
Before you start: The FCP channel must be online.
...
You cannot remove a port while SCSI devices are configured for it (see
"Configuring SCSI devices" on page 72) or if the port is in use, for
example, by error recovery.

Configuring SCSI devices
To configure a SCSI device for a target port write the device's LUN to
the port's unit_add attribute.
...
Adding a SCSI device also registers the device with the SCSI stack and
creates a sysfs entry in the SCSI branch (see "Mapping the
representations of a SCSI device in sysfs").

Removing SCSI devices
To remove a SCSI device from a target port you need to first
unregister the device from the SCSI stack and then remove it from the
target port.
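
The constraints quoted above can be captured in a small in-memory model,
which may help when reasoning about the ordering or when unit-testing
code that generates the sysfs writes. This is a toy sketch only: the class
ZfcpModel, the exception ZfcpOrderError, and all method names are
hypothetical and do not correspond to the zfcp driver or to anaconda code.
It models only the rules quoted above, not error recovery or any other
driver behaviour.

```python
class ZfcpOrderError(Exception):
    """Raised when an operation violates one of the ordering rules."""


class ZfcpModel:
    """Toy model of the documented zFCP configuration ordering rules."""

    def __init__(self):
        self.online = set()  # adapters that are online
        self.ports = set()   # (adapter, wwpn) pairs
        self.units = set()   # (adapter, wwpn, lun) configured units
        self.scsi = set()    # units still registered with the SCSI stack

    def set_online(self, adapter):
        self.online.add(adapter)

    def port_add(self, adapter, wwpn):
        # "Before you start: The FCP channel must be online."
        if adapter not in self.online:
            raise ZfcpOrderError("FCP channel must be online")
        self.ports.add((adapter, wwpn))

    def unit_add(self, adapter, wwpn, lun):
        if (adapter, wwpn) not in self.ports:
            raise ZfcpOrderError("port must be configured first")
        # Adding a unit also registers the device with the SCSI stack.
        self.units.add((adapter, wwpn, lun))
        self.scsi.add((adapter, wwpn, lun))

    def scsi_delete(self, adapter, wwpn, lun):
        self.scsi.discard((adapter, wwpn, lun))

    def unit_remove(self, adapter, wwpn, lun):
        # Unregister from the SCSI stack first, then remove from the port.
        if (adapter, wwpn, lun) in self.scsi:
            raise ZfcpOrderError("unregister the SCSI device first")
        self.units.discard((adapter, wwpn, lun))

    def port_remove(self, adapter, wwpn):
        # A port cannot be removed while SCSI devices are configured for it.
        if any(unit[:2] == (adapter, wwpn) for unit in self.units):
            raise ZfcpOrderError("port still has configured SCSI devices")
        self.ports.discard((adapter, wwpn))
```

Replaying the wrong sequence from the anaconda log against such a model
(for example unit_remove before scsi_delete, or port_add before
set_online) raises ZfcpOrderError, while the corrected sequences pass.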

Comment 1 Steffen Maier 2009-07-10 20:27:16 UTC
Created attachment 351297 [details]
[PATCH 1/5] correctly activate zFCP LUN on s390

Comment 2 Steffen Maier 2009-07-10 20:27:43 UTC
Created attachment 351298 [details]
[PATCH 2/5] correctly delete a SCSI device provided by a zFCP LUN on s390

Comment 3 Steffen Maier 2009-07-10 20:28:09 UTC
Created attachment 351299 [details]
[PATCH 3/5] correctly deactivate zFCP LUN on s390

Comment 4 Steffen Maier 2009-07-10 20:28:36 UTC
Created attachment 351300 [details]
[PATCH 4/5] error messages of zFCP on s390: log or pass to the UI

Comment 5 Steffen Maier 2009-07-10 20:28:52 UTC
Created attachment 351301 [details]
[PATCH 5/5] prevent getting started up or shutdown again while already in such state

Comment 6 Steffen Maier 2009-07-10 20:40:05 UTC
The patches are tested as far as is currently possible: anaconda executed in a running F11, getting multiple LUNs over different paths, partly using /tmp/fcpconfig and partly the "add zFCP" GUI, plus going back and forth through the wizard screens to start up and shut down at will.

Comment 7 David Cantrell 2009-08-25 02:45:29 UTC
Steffen,

Didn't I already apply these patches to the git repo?  I remember going through a lot of zFCP patches on the mailing list.

FYI, you don't need to open bugs *and* post to the list.  The list is sufficient.

If these are already in the git repo, let's close this bug.

