Bug 957786 - test putting ~/.kde symlinks on local storage
Summary: test putting ~/.kde symlinks on local storage
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kdelibs
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Rex Dieter
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-04-29 14:22 UTC by Juha Tuomala
Modified: 2015-02-23 12:52 UTC

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-09-23 17:22:22 UTC
Type: Bug
Embargoed:



Description Juha Tuomala 2013-04-29 14:22:30 UTC
Description of problem:
A system that uses NFS home directories suffers a complete freeze if something interrupts this disk service. The interruption can come from the network, from the service level, or from the NFS software itself. Unfortunately, such interruptions are not that uncommon these days.

It becomes more problematic because NFS depends on so many subsystems to keep working continuously.

When this freeze happens, the mouse stops moving, the keyboard stops responding, no pixel changes on screen, and the underlying virtual consoles are not accessible. The only thing that helps is the power button or a complete desktop kill via ssh.
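
From such an ssh session, one hedged way to check whether the NFS-backed home directory is still answering, without hanging the shell as well, is to probe it under a time limit. This is only a sketch: the helper name `path_responds` is made up for illustration, and it assumes GNU coreutils `timeout` and `stat` are available.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path answers a stat() call
# within SECS seconds (default 5), fail otherwise.
# Assumes GNU coreutils `timeout`. SIGKILL is requested because a process
# blocked on an unresponsive NFS mount may not react to gentler signals.
path_responds() {
    target=$1
    secs=${2:-5}
    timeout -s KILL "$secs" stat -- "$target" >/dev/null 2>&1
}
```

For example, `path_responds "$HOME" 5 || echo "home directory not responding"` reports a stalled or missing home directory instead of blocking indefinitely.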

In the worst case, the small network hiccup that makes NFS stop working is caused by a KDE program, which then gets frozen by its own actions and is unable to complete the task that would let the system recover. This starvation then leads to an unrecoverable freeze.

In my understanding, most of these freezes could be avoided completely, or recovered from quickly, if Fedora did not access KDE's IPC sockets via symlinks that reside in the $HOME/.kde directory (pointing to sockets in /tmp).

I'm not asking to remove those symlinks; they could remain there for a transition period. The kdelibs binaries, however, should not use those symlinks at all and should instead use the sockets directly from /tmp.


Version-Release number of selected component (if applicable):
kdelibs-4.10.2-2.fc18.x86_64


How reproducible:
Always.


Steps to Reproduce:
1. Move the home directory to an NFS share and mount it, then start KDE on top of it.
2. Interrupt NFS somehow; best would be via a KDE program like the nm applet.
3. The system freezes.
  

Actual results:
Frozen system.


Expected results:
Recovery from service interrupt.


Additional info:

% ls -l .kde/socket-*
lrwxrwxrwx. 1 tuju tuju 17 Nov  5 12:22 .kde/socket-wasa.example.com -> /tmp/ksocket-tuju
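
A rough way to see which filesystem each such link and its target actually live on is sketched below. It assumes GNU coreutils (`stat -f -c %T` and `readlink -f`); the function names `fs_of` and `report_kde_links` are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the type of the filesystem holding a path
# (e.g. "nfs", "tmpfs"). Relies on GNU coreutils `stat -f -c %T`.
fs_of() {
    stat -f -c %T -- "$1"
}

# For every socket-* symlink in a KDE home directory, print where the link
# is stored and where its target resolves to.
# $1 is the KDE home directory; it defaults to ~/.kde.
report_kde_links() {
    kdehome=${1:-$HOME/.kde}
    for link in "$kdehome"/socket-*; do
        # Skip the unexpanded glob and anything that is not a symlink.
        [ -L "$link" ] || continue
        target=$(readlink -f -- "$link") || continue
        # A dangling link has no target filesystem to report.
        [ -e "$target" ] || continue
        printf '%s -> %s (link on %s, target on %s)\n' \
            "$link" "$target" "$(fs_of "$kdehome")" "$(fs_of "$target")"
    done
}
```

Calling `report_kde_links` with no argument inspects ~/.kde. In the situation described here, the link would be expected to sit on an NFS filesystem while the target sits on local storage.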


An example situation today:

Installing system updates. 

/var/log/messages:
Apr 29 11:42:23 wasa yum[1831]: Updated: 1:autofs-5.0.7-13.fc18.x86_64
Apr 29 11:42:25 wasa systemd[1]: Reloading.
Apr 29 11:42:25 wasa systemd[1]: Stopping Automounts filesystems on demand...
Apr 29 11:42:26 wasa automount[644]: umount_autofs_indirect: ask umount returned busy /net
Apr 29 11:42:28 wasa systemd[1]: Starting Automounts filesystems on demand...
Apr 29 11:45:28 wasa systemd[1]: autofs.service operation timed out. Terminating.

I have had hangs when updating NetworkManager or its KDE applets and losing the network briefly. It never comes back, because nm-applet is dying from the disappearing sockets.

Or when updating packages with a KDE GUI tool: either running software breaks when files that provide an NFS dependency change on disk, or a service gets restarted, and there goes the home directory and the sockets.

NFS depends on so many components, software- and hardware-wise, that it's not reasonable to expect those sockets to be 100% available.

I mentioned this issue to rdieter on IRC and he asked me to write all the details into a bug report.

Comment 1 Juha Tuomala 2013-04-29 14:25:18 UTC
Additional details for today's hiccup:


% rpm -q --scripts autofs
postinstall scriptlet (using /bin/sh):

if [ $1 -eq 1 ] ; then 
        # Initial installation 
        /usr/bin/systemctl preset autofs.service >/dev/null 2>&1 || : 
fi
preuninstall scriptlet (using /bin/sh):

if [ $1 -eq 0 ] ; then 
        # Package removal, not upgrade 
        /usr/bin/systemctl --no-reload disable autofs.service > /dev/null 2>&1 || : 
        /usr/bin/systemctl stop autofs.service > /dev/null 2>&1 || : 
fi
postuninstall scriptlet (using /bin/sh):

/usr/bin/systemctl daemon-reload >/dev/null 2>&1 || : 
if [ $1 -ge 1 ] ; then 
        # Package upgrade, not uninstall 
        /usr/bin/systemctl try-restart autofs.service >/dev/null 2>&1 || : 
fi

Comment 2 Rex Dieter 2013-04-29 14:35:11 UTC
The point of this exercise is to see:

1.  whether putting ~/.kde symlinks elsewhere is easy, without too much fuss
2.  whether putting them on local disk makes KDE less error-prone to $HOME-on-NFS hiccups

Comment 3 Juha Tuomala 2013-05-08 09:28:55 UTC
Why does there have to be duct tape (symlinks) in the first place? Put that socket directly there.

Comment 4 Rex Dieter 2013-06-11 01:28:38 UTC
Per IRC discussion recently...

Turns out patching this will be non-trivial.  But, here's a better/easier/safer test.  Per
http://techbase.kde.org/KDE_System_Administration/KDE_Filesystem_Hierarchy#KDEHOME


# some local (ie, non NFS) dir to use temporarily as a test:
mkdir -p /usr/local/tmp/${USER}

cp -a ~/.kde/ /usr/local/tmp/${USER}/


# create a snippet ~/.kde/env/KDEHOME.sh containing:
KDEHOME=/usr/local/tmp/${USER}/.kde
export KDEHOME


Restart the KDE session.


If doing this fixes the symptoms described here, we have ample evidence to pursue this more.
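
The manual steps above could be wrapped into a small function along these lines. This is only a sketch under stated assumptions: the function name `setup_local_kdehome` and its two parameters are made up for illustration, and unlike the snippet above it writes the resolved path into KDEHOME.sh rather than a `${USER}` expression.

```shell
#!/bin/sh
# Hypothetical wrapper around the KDEHOME test described in this comment.
# Usage: setup_local_kdehome SRC_HOME LOCAL_BASE
#   SRC_HOME   - home directory holding the existing .kde (e.g. "$HOME")
#   LOCAL_BASE - local (non-NFS) directory to copy into,
#                e.g. /usr/local/tmp/$USER
setup_local_kdehome() {
    src_home=$1
    local_base=$2

    if [ ! -d "$src_home/.kde" ]; then
        echo "no $src_home/.kde to copy" >&2
        return 1
    fi

    mkdir -p "$local_base" || return 1
    # Copy the whole tree, preserving attributes and symlinks.
    cp -a "$src_home/.kde" "$local_base/" || return 1

    # Write the env snippet into the original ~/.kde/env so that the next
    # session start picks up the local copy.
    mkdir -p "$src_home/.kde/env" || return 1
    cat > "$src_home/.kde/env/KDEHOME.sh" <<EOF
KDEHOME="$local_base/.kde"
export KDEHOME
EOF
}
```

It would be invoked as `setup_local_kdehome "$HOME" "/usr/local/tmp/$USER"`, followed by a restart of the KDE session as described above.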

Comment 5 Fedora End Of Life 2013-12-21 13:18:36 UTC
This message is a reminder that Fedora 18 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 18. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '18'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 18's end of life.

Thank you for reporting this issue and we are sorry that we may not be 
able to fix it before Fedora 18 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to Fedora 18's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 6 Rex Dieter 2013-12-21 13:38:08 UTC
I consider comment #4 probably good enough for sites with less than fully functional (e.g. broken locking) NFS implementations.

Re-open if that doesn't work.

Comment 7 Juha Tuomala 2013-12-21 13:48:26 UTC
File locks work on NFS. This case was opened to question the use of duct tape as construction material. That was mentioned in comment #3.

There is no need to prove that it's wrong. Everyone knows it already.

Comment 8 Rex Dieter 2013-12-21 13:51:55 UTC
so, can you please test if setting KDEHOME to point to local storage does make things better, per comment #3 request for feedback?

Comment 9 Rex Dieter 2014-09-23 17:22:22 UTC
Closing (again), feel free to reopen and provide the requested testing/feedback if able, thanks.

Comment 10 Juha Tuomala 2015-02-23 12:52:06 UTC
(In reply to Rex Dieter from comment #8)
> so, can you please test if setting KDEHOME to point to local storage does
> make things better, per comment #3 request for feedback?

I guess nick tibbs confirmed, that it does.

Related information: http://fedoraproject.org/wiki/KDE/NFS

