
Bug 1208176

Summary: Race starting multiple libvirtd user sessions at the same time
Product: Red Hat Enterprise Linux 7
Reporter: Richard W.M. Jones <rjones>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 7.1
CC: agedosier, berrange, clalancette, dyuan, extras-qa, itamar, jforbes, jsuchane, kchamart, laine, libvirt-maint, mkletzan, mzhan, rbalakri, rjones, shyu, veillard, virt-maint, yafu, zhwang
Target Milestone: rc
Keywords: Upstream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-1.2.15-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1200149
Environment:
Last Closed: 2015-11-19 06:26:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1200149
Bug Blocks: 910269, 1194593

Description Richard W.M. Jones 2015-04-01 14:53:16 UTC
Since libvirt is to be rebased to 1.2.13, I'm cloning this
bug as it affects 1.2.13 on Rawhide and thus will probably
affect RHEL 7.2.

+++ This bug was initially created as a clone of Bug #1200149 +++

Description of problem:

Run the following command as a *non-root* user:

killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done

Some virsh processes may fail (Exit 1).  If so, examine the error
messages in the /tmp/log* files.
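
Because the race is timing-dependent, a single run may not fail. A wrapper
along these lines (a sketch, not part of the original report) repeats the
reproducer until a failure is captured:

  # Sketch: rerun the reproducer until one of the virsh jobs fails.
  # Assumes the same non-root shell as above.
  while :; do
      killall libvirtd 2>/dev/null
      for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done
      wait                                 # let all five virsh jobs finish
      grep -q Fail /tmp/log* && break      # stop once an error was logged
  done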

--- Additional comment from Richard W.M. Jones on 2015-04-01 07:20:32 EDT ---

Still happens in:

libvirt-1.2.13-2.fc23.x86_64

--- Additional comment from Kashyap Chamarthy on 2015-04-01 09:51:58 EDT ---

I can reproduce the error with the versions below (same as Rich):

  $ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86
  4.0.0-0.rc5.git4.1.fc22.x86_64
  libvirt-daemon-kvm-1.2.13-2.fc22.x86_64
  qemu-system-x86-2.3.0-0.2.rc1.fc22.x86_64


As noted by Rich earlier, the test was done as a NON-root user:

$ id -u -n
kashyapc
$ killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done
[. . .]

Hit enter:

$
[1]   Exit 1                  virsh list > /tmp/log$i 2>&1
[2]   Done                    virsh list > /tmp/log$i 2>&1
[3]   Exit 1                  virsh list > /tmp/log$i 2>&1
[4]-  Exit 1                  virsh list > /tmp/log$i 2>&1
[5]+  Exit 1                  virsh list > /tmp/log$i 2>&1
[kashyapc@foo ~]$ 


`grep` the logs:

$ grep Fail /tmp/log*
/tmp/log1:error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory
/tmp/log3:error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory
/tmp/log4:error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory
/tmp/log5:error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory

Comment 1 Michal Privoznik 2015-04-15 15:53:37 UTC
I've just pushed the patch upstream:

commit be78814ae07f092d9c4e71fd82dd1947aba2f029
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Apr 2 14:41:17 2015 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Wed Apr 15 13:39:13 2015 +0200

    virNetSocketNewConnectUNIX: Use flocks when spawning a daemon
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1200149
    
    Even though we have a mutex mechanism so that two clients don't spawn
    two daemons, it's not strong enough. It can happen that while one
    client is spawning the daemon, the other one fails to connect.
    Basically two possible errors can happen:
    
      error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused
    
    or:
    
      error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory
    
    The problem in both cases is, the daemon is only starting up, while we
    are trying to connect (and fail). We should postpone the connecting
    phase until the daemon is started (by the other thread that is
    spawning it). In order to do that, create a file lock 'libvirt-lock'
    in the directory where session daemon would create its socket. So even
    when called from multiple processes, spawning a daemon will serialize
    on the file lock. So only the first to come will spawn the daemon.
    
    Tested-by: Richard W. M. Jones <rjones>
    Signed-off-by: Michal Privoznik <mprivozn>

v1.2.14-174-gbe78814
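
For illustration, the serialization described in the commit message can be
approximated in shell with flock(1). This is only a sketch of the idea; the
actual fix is in the C implementation of virNetSocketNewConnectUNIX, and the
daemon invocation here merely mirrors the transcripts below:

  # Sketch: serialize daemon spawning on 'libvirt-lock' in the socket directory.
  # Later clients block on flock until the first one has spawned the daemon.
  (
      flock 9                                  # take an exclusive lock on fd 9
      if [ ! -S "$HOME/.cache/libvirt/libvirt-sock" ]; then
          /usr/sbin/libvirtd --timeout=30 &    # first caller spawns the session daemon
      fi
  ) 9>"$HOME/.cache/libvirt/libvirt-lock"

In the real patch the lock is held until the connection attempt succeeds, so
a client never sees the socket in a half-created state.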

Comment 3 vivian zhang 2015-06-26 03:00:32 UTC
I can reproduce this bug with build libvirt-1.2.8-16.el7.x86_64

1. Log in as a NON-root user
# su - test1
Last login: Thu May 28 16:02:04 CST 2015 on pts/1
$ virsh list
 Id    Name                           State
----------------------------------------------------
$ ps aux |grep libvirtd
test1    16546  1.5  0.1 645588 14076 ?        Sl   10:50   0:00 /usr/sbin/libvirtd --timeout=30
test1    16575  0.0  0.0 112640   960 pts/1    S+   10:50   0:00 grep --color=auto libvirtd
root     27111  0.0  0.2 1024300 18636 ?       Ssl  Jun23   0:00 /usr/sbin/libvirtd

2. Run the command below and observe that one of the virsh jobs exits 1 with an error

$ killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done;
libvirtd(27111): Operation not permitted
libvirtd: no process found
[1] 16619
[2] 16620
[3] 16621
[4] 16622
[5] 16623
$ virsh list
 Id    Name                           State
----------------------------------------------------

[1]   Done                    virsh list > /tmp/log$i 2>&1
[2]   Done                    virsh list > /tmp/log$i 2>&1
[3]   Done                    virsh list > /tmp/log$i 2>&1
[4]-  Done                    virsh list > /tmp/log$i 2>&1
[5]+  Exit 1                  virsh list > /tmp/log$i 2>&1

3. Check the logs; the following errors are found
$ grep Fail /tmp/log*
/tmp/log5:error: Failed to connect socket to '/home/test1/.cache/libvirt/libvirt-sock': Connection refused

$ cat /tmp/log?
 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/home/test1/.cache/libvirt/libvirt-sock': Connection refused



Verifying this bug with build libvirt-1.2.16-1.el7.x86_64:

1. Log in as a NON-root user
# su - test1
Last login: Fri Jun 26 10:40:18 HKT 2015 on pts/11

$ ps aux |grep libvirtd
test1    19200  0.0  0.0 112640   964 pts/11   S+   10:41   0:00 grep --color=auto libvirtd
root     31153  0.0  0.0 155440  3780 pts/3    S+   Jun17   0:00 vim /etc/libvirt/libvirtd.conf

$ virsh list
 Id    Name                           State
----------------------------------------------------


$ ps aux |grep libvirtd
test1    19204  5.0  0.2 811588 17896 ?        Sl   10:41   0:00 /usr/sbin/libvirtd --timeout=30
test1    19243  0.0  0.0 112640   964 pts/11   S+   10:41   0:00 grep --color=auto libvirtd
root     31153  0.0  0.0 155440  3780 pts/3    S+   Jun17   0:00 vim /etc/libvirt/libvirtd.conf

2. Run the command below
$ killall libvirtd ; for i in `seq 1 5`; do virsh list >/tmp/log$i 2>&1 & done;
[1] 19246
[2] 19247
[3] 19248
[4] 19249
[5] 19250

[1]   Done                    virsh list > /tmp/log$i 2>&1
[2]   Done                    virsh list > /tmp/log$i 2>&1
[3]   Done                    virsh list > /tmp/log$i 2>&1
[4]-  Done                    virsh list > /tmp/log$i 2>&1
[5]+  Done                    virsh list > /tmp/log$i 2>&1

3. Check the logs; no failures are logged
$ grep Fail /tmp/log*

$ cat /tmp/log?
 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

 Id    Name                           State
----------------------------------------------------

Repeated the steps several times and always got the same result; moving to VERIFIED.

Comment 4 Michal Privoznik 2015-06-26 08:09:13 UTC
(In reply to vivian zhang from comment #3)

> Repeated the steps several times and always got the same result; moving to
> VERIFIED.

Did you actually forget to change the bug status? :-)

Comment 5 vivian zhang 2015-06-26 10:16:32 UTC
(In reply to Michal Privoznik from comment #4)
> (In reply to vivian zhang from comment #3)
> 
> > Repeated the steps several times and always got the same result; moving to
> > VERIFIED.
> 
> Did you actually forget to change the bug status? :-)

Yes, thanks for the reminder.

Comment 7 errata-xmlrpc 2015-11-19 06:26:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html