
Bug 1319483

Summary: Linking to -lrbd causes process startup times to balloon
Product: [Fedora] Fedora
Reporter: Richard W.M. Jones <rjones>
Component: ceph
Assignee: Boris Ranto <branto>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: berrange, branto, crobinso, david, fedora, jdillama, kchamart, ldachary, steve, vumrao
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-04-11 14:59:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 910269
Attachments:
  perf report (flags: none)

Description Richard W.M. Jones 2016-03-20 14:00:51 UTC
Created attachment 1138304 [details]
perf report

Description of problem:

Take a trivial program:

  $ cat test.c 
  #include <stdlib.h>
  int main () { exit (0); }

and compare the startup time with and without linking to -lrados -lrbd.

Without:

  $ gcc test.c -o test
  $ TIMEFORMAT='%R' ; time ./test
  0.001

With:

  $ gcc test.c -o test -lrbd
  $ TIMEFORMAT='%R' ; time ./test
  0.044

This really matters: currently, initializing librbd consumes
15% of the total time taken to start up the libguestfs appliance.

I looked at the code and did some profiling with perf [see attachment]
and it seems as if the following code is responsible:

  https://github.com/ceph/ceph/blob/master/src/common/Cycles.cc#L50

This code is really wrong, but in lieu of being able to fix it
properly, it would be nice at least to have an environment variable
we can use to skip the madness.

Version-Release number of selected component (if applicable):

ceph-9.2.0-4.fc24.x86_64

How reproducible:

100%

Steps to Reproduce:
1. See above.

Comment 1 Richard W.M. Jones 2016-03-20 14:05:03 UTC
Adding Cole and Dan:

Because qemu links to ceph, this kills qemu startup times too.
Just doing `qemu-system-x86_64 -help' takes 0.1 seconds, and
according to perf that's basically because of the above.

Comment 2 Richard W.M. Jones 2016-03-20 14:08:07 UTC
Alternate fix would be to defer Cycles::init until Ceph is actually
used for something.  That way we could do our qemu feature detection
without hitting the code, but running qemu "for real" to mount a Ceph
drive would do the right thing.

Comment 3 Richard W.M. Jones 2016-03-21 17:43:07 UTC
Scratch build containing my experimental fix:
http://koji.fedoraproject.org/koji/taskinfo?taskID=13411776

Comment 4 Richard W.M. Jones 2016-03-21 22:49:06 UTC
Although this patch fixes process startup times in general, it
unfortunately does not fix them for qemu.  qemu is still slow
because of the large number of external libraries it uses, and
because it links to gtk.  (Removing the gtk dependency halves the
qemu startup time.)

Comment 5 Richard W.M. Jones 2016-04-11 14:59:24 UTC
I backported the upstream commit and pushed it to Rawhide.
Fixed in ceph-9.2.0-5.fc25