1175466 – add timeout option to repo conf

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	dnf
Sub Component:
Version:	21
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Packaging Maintenance Team
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1185553
Depends On:
Blocks:

Reported:	2014-12-17 19:15 UTC by Wolfgang Rupprecht
Modified:	2015-02-20 08:32 UTC
CC List:	12 users
Fixed In Version:	hawkey-0.5.3-2.fc21
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-02-20 08:32:24 UTC
Type:	Bug
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1183998	0	unspecified	CLOSED	[RFE] Moving of non-responsive mirror at the end of the queue	2022-05-16 11:32:56 UTC

Internal Links: 1183998

Description Wolfgang Rupprecht 2014-12-17 19:15:48 UTC

Description of problem:
dnf hangs for a very long time banging on non-responsive mirrors.

Downloading Packages:
[SKIPPED] PackageKit-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[SKIPPED] PackageKit-glib-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[SKIPPED] PackageKit-cached-metadata-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[MIRROR] PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120002 milliseconds]
[MIRROR] PackageKit-gstreamer-plugin-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-gstreamer-plugin-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120002 milliseconds]
[MIRROR] PackageKit-command-not-found-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-command-not-found-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120001 milliseconds]
(4-6/59): PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm  57% [=====================================================================================-    ] ---  B/s |  66 MB     --:-- ETA


Version-Release number of selected component (if applicable):
dnf.noarch                         0.6.3-2.fc21                          @System

How reproducible:
always

Steps to Reproduce:
1. dnf upgrade -y

Actual results:
dnf hangs for a very long time, eventually moves on to another mirror for the first 3 downloads, and then hangs again on the 4th to 6th downloads as it returns to the dead mirrors.

Expected results:
1) dnf has more reasonable timeouts. 10 seconds should do it. If a mirror takes longer than that to respond, we probably shouldn't be using it.
2) Past failures should be remembered and those mirrors blacklisted for a certain length of time, certainly at least for this session, perhaps with the same timeout as the metadata.

Additional info:

Comment 1 Honza Silhan 2015-01-06 18:31:49 UTC

Thanks for the report.

1) Setting constants for non-responsiveness is always subjective. I personally think that the current behavior is good enough. Maybe it could retry failed downloads at the end of the queue instead.
2) Is this possible, Tomas? Or better could it be marked as the least preferred mirror?

Comment 2 Petr Spacek 2015-01-07 08:59:24 UTC

(In reply to Jan Silhan from comment #1)
> 2) Is this possible, Tomas? Or better could it be marked as the least
> preferred mirror?

Personally, I would be in favor of moving a timing-out mirror to the last position in the priority list. It could be just an intermittent failure, or just one package missing on that particular mirror.

Maybe DNF could be clever and remove a mirror completely after some large number X of failures (like 50)?

Comment 3 Tomas Mlcoch 2015-01-19 09:57:57 UTC

Hi all,
Librepo has several options that can be used for fine-tuning such behavior.

LRO_CONNECTTIMEOUT - Max time in sec for connection phase. (Default: 300 seconds)

LRO_LOWSPEEDLIMIT - The transfer speed in bytes per second that the transfer should be below during LRO_LOWSPEEDTIME seconds for the library to consider it too slow and abort. (Default: 0)

LRO_LOWSPEEDTIME - The time in seconds that the transfer should be below the LRO_LOWSPEEDLIMIT for the library to consider it too slow and abort. (Default: 120 seconds)

LRO_ALLOWEDMIRRORFAILURES - Max number of allowed failures per mirror. If a mirror exceeds this number and there was no successful download, the mirror is ignored for the rest of the session. (Default: 4)

LRO_ADAPTIVEMIRRORSORTING - After each finished transfer, the mirrors are re-sorted. A mirror is moved forward or backward by one position depending on its rank (calculated as the ratio between successful and failed downloads) and the ranks of its neighbors. (Default: True)

JFYI, as you can see, in Wolfgang's case it's the combination of LRO_LOWSPEEDLIMIT and LRO_LOWSPEEDTIME that kills the transfer after 120 sec (because the default connection timeout is far higher, at 300 sec). So maybe it could be useful to also add these two options to the repo conf.
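The interplay of LRO_LOWSPEEDLIMIT and LRO_LOWSPEEDTIME described above can be read as a "slow for N consecutive seconds" check. The following Python sketch is only a simplified model of that rule, not librepo's or curl's actual code. The function name, the one-sample-per-second input, and the 1000 bytes/s limit in the usage lines are assumptions made for illustration.

```python
def low_speed_abort_second(speeds, low_speed_limit, low_speed_time):
    """Return the second at which a transfer would be aborted, or None.

    speeds: measured throughput in bytes/s, one sample per elapsed second
            (a simplification; a real library measures this internally).
    The transfer is aborted once the speed has stayed below
    low_speed_limit for low_speed_time consecutive seconds.
    """
    slow_run = 0  # length of the current run of too-slow seconds
    for second, speed in enumerate(speeds, start=1):
        if speed < low_speed_limit:
            slow_run += 1
        else:
            slow_run = 0  # any fast-enough second resets the run
        if slow_run >= low_speed_time:
            return second
    return None


# A mirror that never sends a byte is dropped after 120 s,
# well before a 300 s connection timeout would expire.
stalled = [0] * 300
print(low_speed_abort_second(stalled, low_speed_limit=1000, low_speed_time=120))

# A transfer that briefly recovers restarts the 120 s window.
flaky = [0] * 100 + [5000] + [0] * 119
print(low_speed_abort_second(flaky, low_speed_limit=1000, low_speed_time=120))
```

In this model a shorter LRO_LOWSPEEDTIME shortens the worst-case wait on a dead mirror directly, which is why exposing both options in the repo conf would help.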

Moving a non-responsive mirror to the end of the queue, as suggested by Petr, is possible and could work.
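As a rough illustration of the two mirror-handling options listed above (LRO_ALLOWEDMIRRORFAILURES and LRO_ADAPTIVEMIRRORSORTING), here is a small Python model. It is a sketch under stated assumptions, not librepo's implementation: the Mirror class, the function names, the example hosts, and the exact rank formula (share of successful transfers, with untried mirrors ranked highest) are all hypothetical.

```python
from dataclasses import dataclass

ALLOWED_MIRROR_FAILURES = 4  # default quoted for LRO_ALLOWEDMIRRORFAILURES


@dataclass
class Mirror:
    url: str
    successes: int = 0
    failures: int = 0

    def rank(self):
        # Assumption: rank is the share of successful transfers;
        # a mirror that has not been tried yet gets the best rank.
        total = self.successes + self.failures
        return 1.0 if total == 0 else self.successes / total

    def ignored(self):
        # Ignored once failures exceed the limit with no success at all.
        return self.successes == 0 and self.failures > ALLOWED_MIRROR_FAILURES


def record_transfer(mirrors, index, ok):
    """Record a finished transfer, then move that mirror by at most one
    position based on its neighbors' ranks. Returns its new index."""
    mirror = mirrors[index]
    if ok:
        mirror.successes += 1
    else:
        mirror.failures += 1
    if index > 0 and mirror.rank() > mirrors[index - 1].rank():
        mirrors[index - 1], mirrors[index] = mirrors[index], mirrors[index - 1]
        return index - 1
    if index + 1 < len(mirrors) and mirror.rank() < mirrors[index + 1].rank():
        mirrors[index], mirrors[index + 1] = mirrors[index + 1], mirrors[index]
        return index + 1
    return index


def usable(mirrors):
    """Mirrors still eligible for downloads in this session."""
    return [m for m in mirrors if not m.ignored()]


# Hypothetical hosts: the dead mirror sinks one step, then gets ignored.
mirrors = [Mirror("ftp://dead.example/"), Mirror("http://ok.example/")]
pos = 0
for _ in range(5):
    pos = record_transfer(mirrors, pos, ok=False)
print([m.url for m in mirrors])
print([m.url for m in usable(mirrors)])
```

Moving a failed mirror straight to the end of the queue, as proposed in comment 2, would replace the one-step swap with a single move to the last position.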

Petr or Jan, could one of you open an RFE for me in Bugzilla to get this tracked? Thanks

Tomas

Comment 4 Honza Silhan 2015-01-26 11:20:59 UTC

PR: https://github.com/rpm-software-management/dnf/pull/199

Comment 5 Honza Silhan 2015-01-26 11:21:07 UTC

*** Bug 1185553 has been marked as a duplicate of this bug. ***

Comment 6 Zbigniew Jędrzejewski-Szmek 2015-01-26 22:57:51 UTC

Making it configurable is a nice step. But c'mon, a 300s timeout (or 120s, as it currently seems to be)? This should be changed to some value that "just works" for the most common cases, rather than having people discover this configuration option on their own. Long connection timeouts make sense for random pages on the web, but not for accessing mirrors which are supposed to be fast.

Comment 7 Tomas Mlcoch 2015-01-27 09:11:42 UTC

It depends. Yes, mirrors are supposed to be fast but they are also supposed to be available most of the time.

The world is not perfect and there are still people with dial-up, GPRS and similar types of connection. Such connections are slow, lossy and have high latency. We need to use values that work for the majority of people, and 120s looks like such a value. It works for them (people with slow connections with a high loss rate and high latency) but also for others with reliable high-speed connections. The only drawback is that the second group can sometimes hit a two-minute delay. But I guess we could make some changes and use a shorter timeout as the default (maybe something like 30s).

Comment 8 Zbigniew Jędrzejewski-Szmek 2015-01-27 12:43:25 UTC

> can sometimes hit a two-minute delay
If one mirror is nonresponsive. Sometimes more than one fails.

People who are on "bad" connections usually have slow transfers and/or unreliable packet delivery, but they usually do not have an extreme latency. Even for countries connected through satellite networks, round-trip latencies are usually below half a second. Let's say that determining whether a connection is up or down might take 10 roundtrips, so 10s should be enough.

> 30s
Still rather high though, but certainly better than 120s.

Comment 9 Rodrigo de Farias Gomes 2015-01-29 20:30:40 UTC

I believe that I am just unlucky :-)

[root@localhost ~]# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=13 ttl=42 time=947 ms
64 bytes from 8.8.8.8: icmp_seq=14 ttl=41 time=1606 ms
64 bytes from 8.8.8.8: icmp_seq=15 ttl=40 time=745 ms
64 bytes from 8.8.8.8: icmp_seq=16 ttl=41 time=7849 ms
64 bytes from 8.8.8.8: icmp_seq=17 ttl=41 time=6849 ms
64 bytes from 8.8.8.8: icmp_seq=18 ttl=41 time=6027 ms
64 bytes from 8.8.8.8: icmp_seq=19 ttl=41 time=7206 ms
64 bytes from 8.8.8.8: icmp_seq=20 ttl=41 time=6386 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=41 time=6087 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=42 time=5105 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=42 time=4926 ms
64 bytes from 8.8.8.8: icmp_seq=24 ttl=41 time=5506 ms
64 bytes from 8.8.8.8: icmp_seq=25 ttl=41 time=5705 ms
64 bytes from 8.8.8.8: icmp_seq=26 ttl=40 time=5466 ms
64 bytes from 8.8.8.8: icmp_seq=27 ttl=41 time=10063 ms
64 bytes from 8.8.8.8: icmp_seq=28 ttl=41 time=9105 ms
64 bytes from 8.8.8.8: icmp_seq=29 ttl=41 time=8766 ms
^C
--- 8.8.8.8 ping statistics ---
37 packets transmitted, 17 received, 54% packet loss, time 36001ms
rtt min/avg/max/mdev = 745.763/5785.517/10063.948/2587.143 ms, pipe 11


I live in Brazil. I am using a 3g connection...

English is not my natural language, sorry...
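Plugging the ping statistics above into the ten-round-trip estimate from comment 8 shows why the timeout values under discussion matter. This is back-of-the-envelope arithmetic in Python, not a model of what dnf or librepo does. The function name is made up, and the assumption that round trips happen strictly one after another (with no packet loss) comes from comment 8's rough estimate.

```python
def probe_budget(rtt_seconds, round_trips=10):
    """Time needed for `round_trips` sequential round trips at a given RTT."""
    return rtt_seconds * round_trips


satellite = probe_budget(0.5)      # the half-second RTT assumed in comment 8
reported = probe_budget(5.785517)  # the average RTT from the ping run above

for label, seconds in (("0.5 s RTT", satellite), ("5.8 s RTT", reported)):
    for timeout in (10, 30, 120):
        verdict = "fits within" if seconds <= timeout else "exceeds"
        print(f"{label}: {seconds:.1f} s {verdict} a {timeout} s timeout")
```

With the reported average RTT, ten round trips take just under a minute, which exceeds both a 10 s and a 30 s timeout but fits within 120 s. The 54% packet loss in the same run would add retransmissions on top of that.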

Comment 10 Honza Silhan 2015-02-03 09:26:06 UTC

Fixed upstream. The default timeout is now 30s, the same as in yum.
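To illustrate what a per-repository override of that 30s default could look like, the sketch below parses an INI-style snippet with Python's configparser. It assumes the option is spelled `timeout`, as in the bug summary and in yum, and that a repo section falls back to `[main]`. The repo id, name, and URL are hypothetical, and this is not how dnf itself loads its configuration (dnf keeps `[main]` and repo sections in separate files).

```python
import configparser

# Hypothetical configuration; only the `timeout` option is the point here.
REPO_CONF = """
[main]
timeout = 30

[example-slow-mirror]
name = Example repository (hypothetical)
baseurl = http://mirror.example/fedora/
timeout = 10
"""


def effective_timeout(conf_text, repo_id, default=30):
    """Return the timeout in seconds for repo_id: the repo's own value,
    else the [main] value, else the built-in default."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    for section in (repo_id, "main"):
        # has_option() returns False for a missing section or option
        if parser.has_option(section, "timeout"):
            return parser.getint(section, "timeout")
    return default


print(effective_timeout(REPO_CONF, "example-slow-mirror"))  # per-repo value
print(effective_timeout(REPO_CONF, "some-other-repo"))      # falls back to [main]
```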

Comment 11 Fedora Update System 2015-02-16 00:03:11 UTC

dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21

Comment 12 Fedora Update System 2015-02-17 08:04:06 UTC

Package hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hawkey-0.5.3-2.fc21 dnf-plugins-core-0.1.5-1.fc21 dnf-0.6.4-1.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-2139/dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21
then log in and leave karma (feedback).

Comment 13 Fedora Update System 2015-02-20 08:32:24 UTC

hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.