Bug 562209
Summary: | Booting boot.iso, installer unable to read network package metadata | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | James Laska <jlaska>
Component: | anaconda | Assignee: | Anaconda Maintenance Team <anaconda-maint-list>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high | Docs Contact: |
Priority: | low | |
Version: | rawhide | CC: | awilliam, jeff, jhrozek, jonathan, jturner, kasal, kdudka, nb, notting, tcallawa, vanmeeuwen+fedora
Target Milestone: | --- | Keywords: | Triaged
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | python-urlgrabber-3.9.1-5.fc13 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2010-02-26 21:07:30 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 538273 | |
Attachments: | | |

Description
James Laska
2010-02-05 15:59:00 UTC
This only happens if you do not have the network brought up in advance and must use the netconfig dialog. So if you use updates=http://, or stage2=, or some other method you won't see this problem.

Error 6 is:

    CURLE_COULDNT_RESOLVE_HOST, /* 6 */

Now, when anaconda brings up the network there should be a /etc/resolv.conf getting written out and the resolver cache getting flushed. However, if I break at the right place in yum (yumRepo.py:692) during anaconda and manually flush the resolver cache and then try the uh.urlgrab() call again, nothing's improved. If I start up a different python process on tty2 and do the same thing, it works fine.

I'm at a loss for why we're seeing this in anaconda. Is python-pycurl or libcurl perhaps holding on to the resolver cache information somehow and needs to be told to flush?

Jeff or Stepan, do you have any thoughts on the previous comment#1?

I have no idea although Chris' hypothesis sounds plausible. This sounds more like it should be assigned to libcurl itself as pycurl is just a wrapper around libcurl.

Thanks Jeff, reassigning to libcurl as requested. Kamil, any thoughts on this issue?

Rawhide curl does not use glibc/nss for name resolving. It uses c-ares (bug #514771). Can I somehow reproduce the bug outside of anaconda? It's nearly impossible to debug it there...

I have not tested this, but this procedure ought to work:

(1) Boot a machine without networking.
(2) Start a python shell.
(3) Do some action that uses pycurl, say urlgrabber.urlgrab("http://www.fedoraproject.org")
(4) That should fail with PYCURL ERROR #6.
(5) Without quitting the python shell, bring up the network.
(6) Repeat step #3. That should also fail, if this bug is to be believed.

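The hypothesis behind the six steps above can be modelled without a network at all. The following is a toy sketch, not a reproduction: `ToyResolverHandle`, `simulate_network_bringup`, the temporary resolv.conf-style file and the `10.10.10.1` nameserver are all made up for illustration, and the fallback to `127.0.0.1` when no nameserver is configured is an assumption of the model. It only shows how a handle that reads its resolver configuration once keeps a stale view after the file is rewritten.

```python
import os
import tempfile


def read_nameservers(path):
    """Return the nameserver addresses listed in a resolv.conf-style file."""
    servers = []
    with open(path) as conf:
        for line in conf:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "nameserver":
                servers.append(fields[1])
    return servers


class ToyResolverHandle:
    """Hypothetical stand-in for a handle that reads resolver config once."""

    def __init__(self, path):
        # Assumption of this toy: fall back to localhost when nothing is configured.
        self.nameservers = read_nameservers(path) or ["127.0.0.1"]


def simulate_network_bringup():
    """Walk through the hypothesised sequence and report what each handle sees."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "resolv.conf")

        # Steps 1-4: the network is down, so the file lists no nameservers.
        with open(path, "w") as conf:
            conf.write("# written before the network was up\n")
        old_handle = ToyResolverHandle(path)

        # Step 5: the network comes up and the file is rewritten.
        with open(path, "w") as conf:
            conf.write("nameserver 10.10.10.1\n")

        # Step 6: the old handle keeps its cached view; a fresh handle does not.
        new_handle = ToyResolverHandle(path)
        return old_handle.nameservers, new_handle.nameservers


stale, fresh = simulate_network_bringup()
print("old handle:", stale)
print("new handle:", fresh)
```

In this model the only way to pick up the new configuration is to build a new handle, which is the property the rest of the thread ends up working around.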
(In reply to comment #7)
> I have not tested this, but this procedure ought to work:

    # python
    >>> import urlgrabber
    >>> urlgrabber.urlgrab("http://www.fedoraproject.org")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 612, in urlgrab
        return default_grabber.urlgrab(url, filename, **kwargs)
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 976, in urlgrab
        return self._retry(opts, retryfunc, url, filename)
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 880, in _retry
        r = apply(func, (opts,) + args, {})
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 962, in retryfunc
        fo = PyCurlFileObject(url, filename, opts)
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1056, in __init__
        self._do_open()
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1314, in _do_open
        self._do_grab()
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1444, in _do_grab
        self._do_perform()
      File "/usr/lib/python2.6/site-packages/urlgrabber/grabber.py", line 1301, in _do_perform
        raise err
    urlgrabber.grabber.URLGrabError: [Errno 14] PYCURL ERROR 6 - ""
    >>>
    [1]+ Stopped python
    # ifconfig eth0
    # dhclient eth0
    # ifconfig eth0
    eth0 Link encap:Ethernet HWaddr 52:54:00:7E:25:B5
         inet addr:10.10.10.152 Bcast:10.10.11.255 Mask:255.255.252.0
    # fg
    python
    >>> urlgrabber.urlgrab("http://www.fedoraproject.org")
    ''

Any chance to run anaconda within a debugger? Is it possible to catch a strace of the failure?

Running anaconda in a debugger is not really doable. However, any tree containing anaconda-13.24-1 or later has strace in the loader and stage2, so we can strace anaconda.

Doesn't look like jlaska was able to reproduce it.

I ran strace -p on anaconda before enabling the network via the UI, and this is a snip of what I'm seeing. Note the sin_addr argument to connect.
On a working install, strace will show the expected address here. Full strace output available if needed.

    connect(26, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
    sendto(26, "\364\341\1\0\0\1\0\0\0\0\0\0\7mirrors\rfedoraproje"..., 43, MSG_NOSIGNAL, NULL, 0) = 43
    clock_gettime(CLOCK_MONOTONIC, {285, 962307462}) = 0
    clock_gettime(CLOCK_MONOTONIC, {285, 962516013}) = 0
    clock_gettime(CLOCK_MONOTONIC, {285, 962717607}) = 0
    poll([{fd=26, events=POLLIN|POLLRDNORM}], 1, 8123) = 1 ([{fd=26, revents=POLLERR}])
    clock_gettime(CLOCK_MONOTONIC, {285, 963221462}) = 0
    recvfrom(26, 0x7fffffff6140, 513, 0, 0x7fffffff6350, 0x7fffffff613c) = -1 ECONNREFUSED (Connection refused)
    close(26)

Created attachment 394569 [details]
a reproducer
Attached is a Python script that triggers the state outside of anaconda. Does any documentation say that this is a flaw in curl and/or c-ares? If not, the bug should be reassigned back to anaconda or whatever...

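For readers unfamiliar with the wire format, the sendto() payload in the strace excerpt above is an ordinary DNS query, sent to 127.0.0.1 port 53 where nothing is listening. The sketch below rebuilds such a packet by hand following the RFC 1035 layout. The full host name (`mirrors.fedoraproject.org`) and the query type (A) are assumptions, since strace truncated the buffer; the helper name `build_dns_query` is made up for illustration.

```python
import struct


def build_dns_query(hostname, query_id, qtype=1):
    """Build a minimal single-question DNS query packet (RFC 1035 layout)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, AN/NS/AR counts = 0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE (1 = A, assumed here) followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


# The bytes strace printed before truncating, with octal escapes written as hex
STRACE_PREFIX = (
    b"\xf4\xe1\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00"
    b"\x07mirrors"
    b"\x0dfedoraproje"
)

# Host name is an assumption: the strace output stops at "fedoraproje"
packet = build_dns_query("mirrors.fedoraproject.org", 0xF4E1)
print(len(packet))                       # 43, matching the sendto() length
print(packet.startswith(STRACE_PREFIX))  # True
```

The matching length and prefix suggest the installer really was issuing a normal lookup and simply aiming it at the wrong server, which fits the ECONNREFUSED that follows in the trace.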
Note the last test case only works with c-ares:

    Traceback (most recent call last):
      File "./trigger_bz562209.py", line 13, in perf
        c.perform()
    error: (6, 'Could not resolve host: www.fedoraproject.org (Could not contact DNS servers)')
    open("/etc/resolv.conf", O_RDONLY) = 3
    open("/etc/nsswitch.conf", O_RDONLY) = 3
    open("/dev/urandom", O_RDONLY) = 3
    open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
    open("/etc/hosts", O_RDONLY) = 4
    OK

When using glibc/nss, none of the tested cases works.

In IRC discussion with Kamil, he noted it wasn't clear when I filed this bug that F-12 does not use libcurl in loader, while F-13 does. So it will be difficult to track down a specific change that introduced this, since it was a large change. Kamil offered some suggestions ...

    10:40:35 kdudka: so the portable solution is to not use libcurl before /etc/resolv.conf is valid
    10:40:44 kdudka: or reload the module
    10:41:16 kdudka: a less portable, though less intrusive right now, solution is to close the handle and create a new one
    10:41:23 kdudka: but this will work only with c-ares
    10:41:46 kdudka: we are going to drop c-ares for Fedora 14 because of several bugs, which are not easy to fix

Clumens noted that anaconda uses libcurl directly in stage#1 loader, but switches to using python-urlgrabber in stage#2. So perhaps something in that transition is introducing this behavior?

I've added skvidal to the cc list in case this issue bleeds into the python-pycurl space.

urlgrabber isn't intercepting anything at the perform level, that I'm aware of. If there's a place I need to be looking at, I'll be glad to, but I'm not sure how this happens at the urlgrabber layer.

(In reply to comment #14)
> Clumens noted that anaconda uses libcurl directly in stage#1 loader, but
> switches to using python-urlgrabber in stage#2. So perhaps something in that
> transition is introducing this behavior?
The issue isn't what the loader uses or doesn't use, it appears to be just whether or not the loader sets up networking. If the loader doesn't, libcurl is initialized before networking is set up (presumably on module import), and it doesn't notice when the network actually is brought up.

(In reply to comment #14)
> In IRC discussion with Kamil, he noted it wasn't clear when I filed this bug
> that F-12 does not use libcurl in loader, while F-13 does. So it will be
> difficult to track down a specific change that introduced this, since it was a
> large change.

James, thanks for summarizing it here!

> Clumens noted that anaconda uses libcurl directly in stage#1 loader, but

Is urlinstTransfer() in loader/urls.c the correct place to look at?

The function urlinstTransfer() does not seem to exist in f12 anaconda. What's the equivalent there?

> switches to using python-urlgrabber in stage#2. So perhaps something in that
> transition is introducing this behavior?

Chris, could you please point me to the right place in the code?

Was that code also changed between f12 and f13?

(In reply to comment #16)
> The issue isn't what the loader uses or doesn't use, it appears to be just
> whether or not the loader sets up networking. If the loader doesn't, libcurl is
> initialized before networking is set up (presumably on module import), and it
> doesn't notice when the network actually is brought up.

Which module are you actually talking about?
Neither pycurl nor urlgrabber reads /etc/resolv.conf on import:

    $ strace -e trace=open python <<< "import pycurl; pycurl.global_init(pycurl.GLOBAL_ALL)" 2>&1 | grep etc
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/ld.so.cache", O_RDONLY) = 4
    open("/etc/selinux/config", O_RDONLY) = 4

    $ strace -e trace=open python <<< "import urlgrabber" 2>&1 | grep etc
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/localtime", O_RDONLY) = 7
    open("/etc/ld.so.cache", O_RDONLY) = 8
    open("/etc/selinux/config", O_RDONLY) = 8

However, it is read when performing a transfer:

    $ strace -e trace=open python <<< "import urlgrabber; urlgrabber.urlgrab('http://fedoraproject.org', '/dev/null')" 2>&1 | grep etc
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/localtime", O_RDONLY) = 7
    open("/etc/ld.so.cache", O_RDONLY) = 8
    open("/etc/selinux/config", O_RDONLY) = 8
    open("/etc/nsswitch.conf", O_RDONLY) = 3
    open("/etc/host.conf", O_RDONLY) = 3
    open("/etc/resolv.conf", O_RDONLY) = 3
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/hosts", O_RDONLY|O_CLOEXEC) = 3
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/etc/gai.conf", O_RDONLY) = -1 ENOENT (No such file or directory)

> > Clumens noted that anaconda uses libcurl directly in stage#1 loader, but
>
> Is urlinstTransfer() in loader/urls.c the correct place to look at?

urlinstTransfer is the sole user of libcurl in the loader, so if we want to investigate anaconda's use of libcurl that's the place to do it. Note that there's no way out of urlinstTransfer without calling curl_easy_cleanup.

> The function urlinstTransfer() does not seem to exist in f12 anaconda. What's
> the equivalent there?

urlinstStartTransfer and urlinstFinishTransfer are where you want to start.

> > switches to using python-urlgrabber in stage#2. So perhaps something in that
> > transition is introducing this behavior?
>
> Chris, could you please point me to the right place in the code?
>
> Was that code also changed between f12 and f13?
F12 and earlier uses our own url fetching code in loader, and yum/urlgrabber in stage2 (once the graphical installer starts, basically). F13 and later uses libcurl in loader and yum/urlgrabber in stage2. Note that the yum/urlgrabber stack has also changed in F13 to use pycurl.

I really don't see how changes in what loader uses could affect what happens in stage2, though. As I said, I don't believe there's a way out of urlinstTransfer that doesn't result in curl_easy_cleanup getting called.

urlgrabber in F12 was using pycurl, too. It's not new to f13.

Okay, let's sum things up.

This problem appears to be happening because the nameserver configuration provided by /etc/resolv.conf changes mid-installation. The initial configuration was either empty or created when the network wasn't up, therefore is invalid. The new configuration does not get used because the old configuration is cached.

We've seen problems like this before in anaconda, and the fix has always been to use glibc's res_init function to tell it to re-read the nameserver configuration. However, that does not work in this case because libcurl uses a different resolver. This resolver does not provide a similar function therefore we cannot use this fix.

One potential fix according to the reproducer above is to close down the handle provided by pycurl.Curl() and create a new one, as that effectively creates a new object and forces the nameserver configuration to be re-read. However, anaconda does not have access to any of this information. It doesn't create any pycurl handles - that's all hidden from us by the yum/urlgrabber stack. Therefore, we can't really make that modification in anaconda.

A workaround is to bring up the network really early on, like in the loader by passing updates=.

For me, an unanswered question is why we're only seeing this now. The only libcurl-related changes in anaconda are in the loader, not stage2, and are completely contained within one optional function.
Does this accurately sum up the current situation?

Something to try out:

Yum keeps a grabber object per repo, ultimately. This is where the pycurl.Curl() object gets set up - well - deep inside urlgrabber. But we should be able to shut it all down by deleting the grabber objects in the yum repo objects. So something like:

    for repo in yumbaseobj.repos.listEnabled():
        repo._grab = None
        repo._grab_func = None

Then the next time you use yum to fetch data it should set up a new grabber object. Can you give that a try?

(In reply to comment #21)
> The only libcurl-related changes in anaconda are in the loader, not
> stage2, and are completely contained within one optional function.

f12 libcurl is not built on top of c-ares. It was introduced in f13 - see comment #6. Is it possible to check it by forcing my own libcurl.so? Is it sufficient to put the library on the updates floppy?

> Does this accurately sum up the current situation?

AFAICT, yes.

Just to keep the bug up2date. I've tried to force libcurl.so based on glibc/nss (instead of c-ares) and the error within anaconda did not occur.

I looked into this further, and the suggestions in comment #22 did not help. The reason for this is that the pycurl.Curl instance is created in urlgrabber.grabber outside any class, so merely importing urlgrabber.grabber before bringing up the network will trigger this issue. I think anaconda and yum both do this.

We came up with a couple of ideas for fixes:

(1) Have urlgrabber only create the Curl handle on demand. This means no connection sharing, though.
(2) Add a method to urlgrabber to reset the Curl handle. This means adding a method to the API that we don't really want people to use.
(3) Patching libcurl to handle network changes.

(In reply to comment #25)
> (3) Patching libcurl to handle network changes.

The above implies patching of c-ares, including API/ABI changes in both libcurl and c-ares, or am I wrong?

python-urlgrabber-3.9.1-5.fc13 built to implement #2.
reset_curl_obj()

anaconda-13.29-1.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/anaconda-13.29-1.fc13

python-urlgrabber-3.9.1-5.fc13,anaconda-13.29-1.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/python-urlgrabber-3.9.1-5.fc13,anaconda-13.29-1.fc13

anaconda-13.29-1.fc13, python-urlgrabber-3.9.1-5.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update anaconda python-urlgrabber'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F13/FEDORA-2010-2688

Unable to test without a fix for the 'no input devices recognized' bug#566396. Will retest once that issue is addressed.

python-urlgrabber-3.9.1-5.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/F13/FEDORA-2010-2688

python-urlgrabber-3.9.1-5.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update python-urlgrabber'. You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F13/FEDORA-2010-2688

Tested with F-13-Alpha-RC4 i386/x86_64. The reported problem is resolved.

python-urlgrabber-3.9.1-5.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/F13/FEDORA-2010-2688

python-urlgrabber-3.9.1-5.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.