Bug 1317190 - Performance of version 4.4.4 drastically reduced from prior versions
Summary: Performance of version 4.4.4 drastically reduced from prior versions
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1317147
Depends On:
Blocks:
Reported: 2016-03-12 22:21 UTC by Jason H.
Modified: 2016-04-08 20:19 UTC (History)
CC List: 13 users

Fixed In Version: kernel-4.5.0-302.fc24 kernel-4.4.6-301.fc23
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-02 15:54:00 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
OpenSSL Speed Test on 4.2.3-300 (4.03 KB, text/plain)
2016-03-12 22:21 UTC, Jason H.
OpenSSL Speed Test on 4.4.4-301 (4.03 KB, text/plain)
2016-03-12 22:22 UTC, Jason H.
SysBench Test on 4.2.3-300 (2.25 KB, text/plain)
2016-03-12 22:23 UTC, Jason H.
SysBench Test on 4.4.4-301 (2.26 KB, text/plain)
2016-03-12 22:23 UTC, Jason H.
/proc/cpuinfo for kernel 4.4.3-300 (7.94 KB, text/plain)
2016-03-14 11:32 UTC, Jason H.
/proc/cpuinfo for kernel 4.4.4-301 (7.93 KB, text/plain)
2016-03-14 11:33 UTC, Jason H.
lspci (Jon W.) (2.02 KB, text/plain)
2016-03-14 21:55 UTC, Jon W.
dmesg 4.4.4-301.fc23.x86_64 (75.01 KB, text/plain)
2016-03-14 21:58 UTC, Jon W.
dmesg 4.2.3-300 (69.53 KB, text/plain)
2016-03-14 22:20 UTC, Jason H.
dmesg 4.4.4-301 (68.24 KB, text/plain)
2016-03-14 22:21 UTC, Jason H.
lspci (1.67 KB, text/plain)
2016-03-14 22:21 UTC, Jason H.
dmesg 4.4.3-300.fc23.x86_64 (75.67 KB, text/plain)
2016-03-15 15:25 UTC, Jon W.
prash-openssl-speed (49.48 KB, application/vnd.oasis.opendocument.spreadsheet)
2016-03-17 09:15 UTC, Prash
/sys/devices/system/cpu/intel_pstate for 4.4.4-301.fc23.x86_64 (slow) (257 bytes, text/plain)
2016-03-17 20:18 UTC, Jason H.
/sys/devices/system/cpu/intel_pstate for 4.4.5-300.perfdropreverts.fc23.x86_64 (fast) (257 bytes, text/plain)
2016-03-17 20:19 UTC, Jason H.
/sys/class/thermal for 4.4.4-301.fc23.x86_64 (slow) (3.31 KB, text/plain)
2016-03-17 20:20 UTC, Jason H.
/sys/class/thermal for 4.4.5-300.perfdropreverts.fc23.x86_64 (3.31 KB, text/plain)
2016-03-17 20:21 UTC, Jason H.

Description Jason H. 2016-03-12 22:21:39 UTC
Created attachment 1135699 [details]
OpenSSL Speed Test on 4.2.3-300

Description of problem: The performance of the 4.4.4-301 kernel is significantly reduced (to about 18%) compared with the release kernel, version 4.2.3-300.  This manifests itself as laggy application performance and general system sluggishness.


Version-Release number of selected component (if applicable): kernel-4.4.4-301.fc23.x86_64


How reproducible: Always


Steps to Reproduce:
1. Install Fedora 23 release
2. Upgrade all system components to current versions, except for kernel-*
3. Reboot
4. Run "openssl speed" test, and save results.
5. Upgrade kernel to 4.4.4-301
6. Re-Run "openssl speed" test, and save results

Actual results:
See attachments.  Speed test results after the kernel upgrade are approximately 18% of the values prior to the upgrade.


Expected results:
Similar speed test results.


Additional info:
CPU, quad core 64-bit: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz

Comment 1 Jason H. 2016-03-12 22:22:28 UTC
Created attachment 1135700 [details]
OpenSSL Speed Test on 4.4.4-301

Comment 2 Jason H. 2016-03-12 22:23:14 UTC
Created attachment 1135705 [details]
SysBench Test on 4.2.3-300

Comment 3 Jason H. 2016-03-12 22:23:45 UTC
Created attachment 1135706 [details]
SysBench Test on 4.4.4-301

Comment 4 Jason H. 2016-03-13 18:28:24 UTC
I just tested against all prior kernel versions released through "updates"; everything up through version 4.4.3-300 worked normally, while version 4.4.4-300 is where the slowdown occurs.  I have also tested version 4.4.5-301, currently in the "updates-testing" repository, and the problem persists.

Comment 5 Jason H. 2016-03-14 11:31:44 UTC
This seems to be related to the Intel SpeedStep governors.  In 4.4.3 and prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz.  In 4.4.4, the processor fluctuates between 400 and 700 MHz, according to /proc/cpuinfo.

Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to "performance" instead of the default "powersave" forces the CPU to 2400 MHz and improves performance greatly, but still not to the same level as in 4.4.3.
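
For anyone who wants to compare clock speeds between kernels, here is a minimal Python sketch of how the readings above could be collected. It assumes the usual "cpu MHz" lines in /proc/cpuinfo on x86 and the policy0 path mentioned above. The function names are only illustrative and are not part of any existing tool.

```python
import re
from pathlib import Path


def core_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value reported for each logical CPU."""
    matches = re.findall(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, re.MULTILINE)
    return [float(value) for value in matches]


def current_governor(policy="policy0"):
    """Return the active scaling governor, or None if the file is absent."""
    path = Path("/sys/devices/system/cpu/cpufreq") / policy / "scaling_governor"
    return path.read_text().strip() if path.exists() else None


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        # Run this while the machine is under load to see the effective clocks.
        print("MHz per CPU:", core_mhz(cpuinfo.read_text()))
    print("governor:", current_governor())
```

Running it under load on both kernels should make a gap like the 400-700 MHz versus 1000-3400 MHz one described above easy to spot.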

Attached /proc/cpuinfo from both kernels, while under load.

Comment 6 Jason H. 2016-03-14 11:32:33 UTC
Created attachment 1136112 [details]
/proc/cpuinfo for kernel 4.4.3-300

Comment 7 Jason H. 2016-03-14 11:33:02 UTC
Created attachment 1136113 [details]
/proc/cpuinfo for kernel 4.4.4-301

Comment 8 Jon W. 2016-03-14 18:47:05 UTC
(In reply to Jason H. from comment #5)
> This seems to have to do with the Intel SpeedStep governors.  In 4.4.3 and
> prior, my 2.40 GHz processor would fluctuate between 1000 and 3400 MHz.  In
> 4.4.4, the processor would fluctuate between 400 and 700 MHz, according to
> /proc/cpuinfo.
> 
> Setting /sys/devices/system/cpu/cpufreq/policy0/scaling_governor to
> performance, instead of the default "powersave" forces the CPU to 2400 MHz,
> and improves performance greatly, but still not to the same level as in
> 4.4.3.
> 
> Attached /proc/cpuinfo from both kernels, while under load.

This is also affecting me and has ever since I installed the updates for this kernel.  The above helped me get to usable performance but it is still degraded.

Comment 9 Laura Abbott 2016-03-14 21:48:21 UTC
I'm not seeing any degradation on sysbench when I run on my machine with 4.4.5. I'm also not seeing any major changes to SpeedStep between 4.4.3 and 4.4.4, or to any other system, that jump out at me. Can you share the following:

1) dmesg from working and non-working kernels
2) lspci

You can also try the bisection scripts at https://pagure.io/fedbisect to see which commit between 4.4.3 and 4.4.4 may have broken it.

Comment 10 Jon W. 2016-03-14 21:55:11 UTC
Created attachment 1136316 [details]
lspci (Jon W.)

Comment 11 Jon W. 2016-03-14 21:58:03 UTC
Created attachment 1136317 [details]
dmesg 4.4.4-301.fc23.x86_64

Comment 12 Jon W. 2016-03-14 21:59:06 UTC
At this point I can only get the dmesg from the currently running 4.4.4 kernel, as I am still using the computer.  I will have to get the old dmesg later.

Comment 13 Jason H. 2016-03-14 22:20:45 UTC
Created attachment 1136322 [details]
dmesg 4.2.3-300

Comment 14 Jason H. 2016-03-14 22:21:08 UTC
Created attachment 1136323 [details]
dmesg 4.4.4-301

Comment 15 Jason H. 2016-03-14 22:21:36 UTC
Created attachment 1136324 [details]
lspci

Comment 16 Jason H. 2016-03-14 22:45:47 UTC
The Bisect scripts failed miserably:
$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 5, in <module>
    import koji_cli
  File "/home/jason/fedbisect/scripts/koji_cli.py", line 29, in <module>
    import koji
ImportError: No module named koji


After installing "koji" package:

$ ./fedbisect.sh sync something
~/fedbisect/scripts ~/fedbisect
Traceback (most recent call last):
  File "./fedbisect-run.py", line 3, in <module>
    import bisect_state
  File "/home/jason/fedbisect/scripts/bisect_state.py", line 8, in <module>
    from git import Repo
ImportError: No module named git


I installed every package for python that includes the name "git", and couldn't get it working.

Please include instructions on all required prerequisites.

Comment 17 Laura Abbott 2016-03-15 01:59:00 UTC
Sorry, the docs are missing the dependencies. I'm going to update them. The
packages you should need are

   koji
   GitPython
   fedpkg
   hmaccalc
   pesign
   gcc
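
As a rough pre-flight check, the two Python imports that failed in the tracebacks in comment 16 (koji, and git, which GitPython provides) could be tested before running the scripts. This is only a sketch and not part of fedbisect. The helper name is made up, and the non-Python packages in the list above (fedpkg, hmaccalc, pesign, gcc) are not covered.

```python
import importlib.util

# Import names taken from the tracebacks in comment 16.
# GitPython is the package that installs the "git" module.
REQUIRED_MODULES = ("koji", "git")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules were found.")
```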

Comment 18 Jon W. 2016-03-15 15:25:14 UTC
Created attachment 1136645 [details]
dmesg 4.4.3-300.fc23.x86_64

Comment 19 Jason H. 2016-03-16 13:25:15 UTC
I ran through fedbisect, and it returned "Found your commit!".  I did a little research about git bisect to get the details.  I would recommend adding a section to your script that outputs the details of the first bad commit when it is found, or adding a note to your readme on what to do when "Found your commit!" is displayed.


"git bisect log" returns:

# first bad commit: [774ac8b7eff69e0786970157de2157e68b22f456] Thermal: initialize thermal zone device correctly

"git bisect visualize" returns:

commit 774ac8b7eff69e0786970157de2157e68b22f456
Author: Zhang Rui <rui.zhang>
Date:   Fri Oct 30 16:31:47 2015 +0800

    Thermal: initialize thermal zone device correctly
    
    commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 upstream.
    
    After thermal zone device registered, as we have not read any
    temperature before, thus tz->temperature should not be 0,
    which actually means 0C, and thermal trend is not available.
    In this case, we need specially handling for the first
    thermal_zone_device_update().
    
    Both thermal core framework and step_wise governor is
    enhanced to handle this. And since the step_wise governor
    is the only one that uses trends, so it's the only thermal
    governor that needs to be updated.
    
    Tested-by: Manuel Krause <manuelkrause>
    Tested-by: szegad <szegadlo.pl>
    Tested-by: prash <prash.n.rao>
    Tested-by: amish <ammdispose-arch>
    Tested-by: Matthias <morpheusxyz123>
    Reviewed-by: Javi Merino <javi.merino>
    Signed-off-by: Zhang Rui <rui.zhang>
    Signed-off-by: Chen Yu <yu.c.chen>
    Signed-off-by: Greg Kroah-Hartman <gregkh>
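
A sketch of how the first bad commit could be pulled out of the "git bisect log" output automatically, assuming the "# first bad commit: [hash] subject" line format shown above. The function names are hypothetical and are not taken from the fedbisect scripts.

```python
import re
import subprocess

# Matches the summary line that git writes once bisection has converged.
FIRST_BAD = re.compile(r"^# first bad commit: \[([0-9a-f]{7,40})\] (.*)$", re.MULTILINE)


def first_bad_commit(bisect_log):
    """Return (hash, subject) of the first bad commit, or None if not found yet."""
    match = FIRST_BAD.search(bisect_log)
    return (match.group(1), match.group(2)) if match else None


def read_bisect_log(repo_dir):
    """Run 'git bisect log' in repo_dir and return its output as text."""
    result = subprocess.run(
        ["git", "bisect", "log"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling first_bad_commit(read_bisect_log(path_to_kernel_tree)) after "Found your commit!" would then give the hash to pass to "git show".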

Comment 20 Jason H. 2016-03-16 15:17:59 UTC
FedBisect also requires the following dependencies (from a base install of Fedora):
openssl-devel
bc
gcc
m4
net-tools

Comment 21 Laura Abbott 2016-03-16 19:11:01 UTC
Thanks for using the bisect scripts. They are still a work in progress, so I will update them to handle the "Found your commit!" case.

What you found is a good candidate for causing a perf drop. It's the first in a series, so I can't revert it by itself to test. Can you test http://koji.fedoraproject.org/koji/taskinfo?taskID=13369275 when it finishes? This reverts the thermal series on top of 4.4.5.

Comment 22 Jason H. 2016-03-16 22:01:05 UTC
I just tested kernel-4.4.5-300.perfdropreverts.fc23.x86_64, and it works great, no performance issues experienced.

Thanks for all the work on this!

Comment 23 Laura Abbott 2016-03-16 22:55:57 UTC
The upstream developers want to know if this happens on 4.5 as well. Can you test this build? http://koji.fedoraproject.org/koji/buildinfo?buildID=744823

Comment 24 Jason H. 2016-03-16 23:24:35 UTC
kernel-4.5.0-300.fc24.x86_64 also experiences the same significant performance decrease.

In addition, with this kernel, my system completely freezes any time X is launched, so I had to boot into "run level" 3.  This is unrelated to the original problem, so I'm not going to look into it further; I just found it interesting.

Comment 25 Laura Abbott 2016-03-17 00:55:11 UTC
Thanks for testing. More requests for info:

The output of "grep . /sys/class/thermal/*/*" on both a working and a non-working kernel (for the working one, preferably the 4.4.5 scratch build I gave as a test, since that will be fairly close).

On the bad kernel, the output of "grep . /sys/devices/system/cpu/intel_pstate/*".

Do you still see the problem if you set /sys/class/thermal/thermal_zone*/mode to "disabled"?
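
If it is easier to gather these from a script, the following is a rough Python equivalent of the "grep . <glob>" commands requested above. It prints each non-empty line prefixed with its file path. The function name is illustrative only, and unreadable attributes and directories are skipped silently rather than reported as grep would.

```python
import glob


def dump_sysfs(pattern):
    """Return 'path:line' strings for every non-empty line in files matching pattern."""
    lines = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as handle:
                lines.extend(
                    f"{path}:{line.rstrip()}" for line in handle if line.rstrip("\n")
                )
        except OSError:
            # Directories and attributes that cannot be read are skipped.
            continue
    return lines


if __name__ == "__main__":
    for pattern in ("/sys/class/thermal/*/*", "/sys/devices/system/cpu/intel_pstate/*"):
        print("\n".join(dump_sysfs(pattern)))
```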

Comment 26 Prash 2016-03-17 09:15:49 UTC
Created attachment 1137332 [details]
prash-openssl-speed

I can't reproduce the problem on my system.

For reference, I'm one of the original reporters of the bug in the handling of the thermal subsystem. My affected device is a HP ProBook 4410s laptop running Archlinux. The patches by Rui Zhang and Chen Yu fixed my problem, and I have been running patched kernels for a year now. I did not notice a drop in performance at any time.

For this bug report, I ran "openssl speed" like Jason H. I tested three different kernel versions: 4.1.19 (LTS), 4.5.0-rc6-g18558ca, and 4.5.0, the last two of which include the patches by Rui Zhang and Chen Yu. The 4.5.x kernels are 0.02% slower than the LTS kernel, but for my system, I can't say whether the difference is (1) significant and (2) attributable to these patches.

Seems like these patches are incompatible with newer processors or chipsets.

Comment 27 Jason H. 2016-03-17 10:25:08 UTC
Laura, I will get you that information tonight.

Prash, since you mentioned Arch, I did test against different versions of the kernel on Arch as well, and the "good" baseline was a little slower than the comparable Fedora kernel, but nothing critical.  While using the 4.4.4 kernel from Arch, performance dropped by around 25%.  Not as significant a decrease as in Fedora (~82% drop), but still a noticeable decrease for me.

I'm sure someone smarter than I will find a way to solve both our issues!
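
To relate the two ways the slowdown is quoted in this report ("about 18% of the prior values" and "an ~82% drop"), here is a small sketch of the arithmetic. The input numbers are placeholders, not values from the attached results.

```python
def percent_drop(baseline, measured):
    """Return the drop from baseline to measured, as a percentage of baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (1 - measured / baseline) * 100


if __name__ == "__main__":
    # A result that is 18% of the baseline corresponds to a drop of about 82%.
    print(round(percent_drop(100.0, 18.0), 1))
    # A result that is 75% of the baseline corresponds to a 25% drop.
    print(round(percent_drop(100.0, 75.0), 1))
```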

Comment 28 Jason H. 2016-03-17 20:18:07 UTC
Created attachment 1137504 [details]
/sys/devices/system/cpu/intel_pstate for 4.4.4-301.fc23.x86_64 (slow)

Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for slow kernel 4.4.4.

Comment 29 Jason H. 2016-03-17 20:19:30 UTC
Created attachment 1137505 [details]
/sys/devices/system/cpu/intel_pstate for 4.4.5-300.perfdropreverts.fc23.x86_64 (fast)

Output of "grep . /sys/devices/system/cpu/intel_pstate/*" for working kernel

Comment 30 Jason H. 2016-03-17 20:20:47 UTC
Created attachment 1137507 [details]
/sys/class/thermal for 4.4.4-301.fc23.x86_64 (slow)

Output of "grep . /sys/class/thermal/*/*" for slow kernel.

Comment 31 Jason H. 2016-03-17 20:21:36 UTC
Created attachment 1137508 [details]
/sys/class/thermal for 4.4.5-300.perfdropreverts.fc23.x86_64

Output of "grep . /sys/class/thermal/*/*" command for working kernel.

Comment 32 Jason H. 2016-03-17 20:23:16 UTC
I have attached the requested outputs of /sys/class/thermal and /sys/devices/system/cpu/intel_pstate.

Setting /sys/class/thermal/thermal_zone0/mode to "disabled" had no effect on the performance issues on the 4.4.4 kernel.  My thermal_zone1 does not have a "mode" parameter to set.

Comment 33 Jacek Pawlyta 2016-03-20 13:17:47 UTC
*** Bug 1317147 has been marked as a duplicate of this bug. ***

Comment 34 Laura Abbott 2016-03-21 17:43:48 UTC
The patch authors gave a fix that someone else confirmed fixes the performance issue for them. The output from the thermal files here shows the same trip point weirdness so it looks like the same issue. I pulled in the patch to the tree. It should be available when 4.4.7 comes out (later this week or next).

Thanks again for reporting and following up.

Comment 35 Fedora Update System 2016-03-31 15:58:18 UTC
kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

Comment 36 Fedora Update System 2016-03-31 16:02:05 UTC
kernel-4.4.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb

Comment 37 Jason H. 2016-03-31 20:18:57 UTC
(In reply to Fedora Update System from comment #35)
> kernel-4.4.6-301.fc23 has been submitted as an update to Fedora 23.
> https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

I have tested kernel-4.4.6-301.fc23.x86_64, and it fixes my performance issues.

Thanks!

Comment 38 Fedora Update System 2016-04-01 01:55:36 UTC
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ed5110c4bb

Comment 39 Fedora Update System 2016-04-01 15:22:57 UTC
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e602c0e5e

Comment 40 Fedora Update System 2016-04-01 20:57:10 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-81fd1b03aa

Comment 41 Fedora Update System 2016-04-02 00:44:04 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 42 Fedora Update System 2016-04-02 15:51:55 UTC
kernel-4.5.0-302.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 43 Fedora Update System 2016-04-08 15:52:05 UTC
kernel-4.4.6-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 44 Fedora Update System 2016-04-08 20:19:57 UTC
kernel-4.4.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.

