Bug 747377
Summary: | heap corruption via multi-threaded "git grep"
---|---
Product: | [Fedora] Fedora
Component: | glibc
Version: | 16
Status: | CLOSED ERRATA
Severity: | high
Priority: | high
Reporter: | Jim Meyering <meyering>
Assignee: | Andreas Schwab <schwab>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | agk, agrover, atkac, awilliam, bkearney, bruno, cfergeau, chrisw, fweimer, jakub, mads, marc.c.dionne, mishu, mjg, npajkovs, pbrobinson, pingou, pnemade, rdieter, rjones, schwab, scottt.tw, tmraz, tmz
Keywords: | Reopened
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: | AcceptedBlocker
Fixed In Version: | agg-2.5-12.fc16
Doc Type: | Bug Fix
Last Closed: | 2011-10-29 05:53:06 UTC
Bug Blocks: | 713568
Description
Jim Meyering
2011-10-19 16:11:11 UTC
affects F16 as well as rawhide

Oh, another symptom that may be related: my rawhide VM failed to boot with a very odd/early diagnostic I don't recall, but it booted fine on the second attempt.

valgrind?

I tried valgrind git grep ... many times before resorting to downgrading. It reported nothing.

Actually, valgrind does report something, if you try enough times. Here (this is on rawhide), valgrind finds nothing to report for the first 9 iterations, but on the 10th it hits a NULL-dereference:

$ for i in $(seq 20);do printf .; valgrind -q ./git grep -w stat>k;done
.........==2730== Thread 3:
==2730== Invalid read of size 1
==2730==    at 0x4BCD87: find_cached_object (cache.h:680)
==2730==    by 0x4C000A: read_object (sha1_file.c:2206)
==2730==    by 0x4C0572: read_sha1_file_extended (sha1_file.c:2243)
==2730==    by 0x42C4C0: lock_and_read_sha1_file (cache.h:759)
==2730==    by 0x42C744: run (grep.c:371)
==2730==    by 0x3A56A07D8F: start_thread (in /lib64/libpthread-2.14.90.so)
==2730==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2730==
==2730==
==2730== Process terminating with default action of signal 11 (SIGSEGV)
==2730==  Access not within mapped region at address 0x0
==2730==    at 0x4BCD87: find_cached_object (cache.h:680)
==2730==    by 0x4C000A: read_object (sha1_file.c:2206)
==2730==    by 0x4C0572: read_sha1_file_extended (sha1_file.c:2243)
==2730==    by 0x42C4C0: lock_and_read_sha1_file (cache.h:759)
==2730==    by 0x42C744: run (grep.c:371)
==2730==    by 0x3A56A07D8F: start_thread (in /lib64/libpthread-2.14.90.so)
==2730==  If you believe this happened as a result of a stack
==2730==  overflow in your program's main thread (unlikely but
==2730==  possible), you can try to increase the size of the
==2730==  main thread stack using the --main-stacksize= flag.
==2730==  The main thread stack size used in this run was 8388608.

Then it had two more successful runs and another identical failure, etc.

I have narrowed it down to builtin/grep.c.
Compiling that one file with -O1 -g appears to avoid this problem. Compile it with -O2 -g to provoke failure.

News: this is threading-related. Disabling git's multi-threading avoids the failure, too. So the problem is triggered by the combination of compiling with -O2 and threading (yes, git grep is multi-threaded).

Try -O2 -g -fno-inline if it reproduces too, and/or use __attribute__((optimize (1))) attribute on selected functions (binary search them) to find out which function it is.

Don't use broken software.

(In reply to comment #6)
> Try -O2 -g -fno-inline if it reproduces too, and/or use __attribute__((optimize
> (1))) attribute on selected functions (binary search them) to find out which
> function it is.

Thanks, Jakub. Using -O2 -g -fno-inline to compile builtin/grep.c and -O2 -g for all the rest, there is no failure. Thus, something inline-related somewhere in that file. Which function? Binary search showed that adding __attribute__((optimize(1))) to only one function is enough to avoid the failure:

__attribute__((optimize(1)))
static void *run(void *arg)
{
    int hit = 0;
    struct grep_opt *opt = arg;

    while (1) {
        struct work_item *w = get_work();
        if (!w)
            break;

        opt->output_priv = w;
        if (w->type == WORK_SHA1) {
            unsigned long sz;
            void* data = load_sha1(w->identifier, &sz, w->name);

            if (data) {
                hit |= grep_buffer(opt, w->name, data, sz);
                free(data);
            }
        } else if (w->type == WORK_FILE) {
            size_t sz;
            void* data = load_file(w->identifier, &sz);
            if (data) {
                hit |= grep_buffer(opt, w->name, data, sz);
                free(data);
            }
        } else {
            assert(0);
        }

        work_done(w);
    }
    free_grep_patterns(arg);
    free(arg);

    return (void*) (intptr_t) hit;
}

But most likely many of the calls it does are inlined. Thus, please try adding __attribute__((noinline)) to the functions this function calls (in addition to that optimize attribute on run function), one by one, and see what inlining is essential to reproduce the failure.

By the way, this problem affects /usr/bin/git, too.

(export PATH=/usr/bin:/bin; for i in $(seq 200);do printf .; timeout 1 git grep -w stat>k 2>err || head -1 err;done )

The above evokes plenty of memory corruption warnings.

Replacing the __attribute__((optimize(1))) on "run" with a weaker __attribute__((noinline)), only one other function needed __attribute__((noinline)) to avoid the failure: get_work.

https://git.kernel.org/?p=git/git.git;a=blob;f=builtin/grep.c#l117

Definitely pthread-related:

#define grep_lock() pthread_mutex_lock(&grep_mutex)
#define grep_unlock() pthread_mutex_unlock(&grep_mutex)

__attribute__((noinline))
static struct work_item *get_work(void)
{
    struct work_item *ret;

    grep_lock();
    while (todo_start == todo_end && !all_work_added) {
        pthread_cond_wait(&cond_add, &grep_mutex);
    }

    if (todo_start == todo_end && all_work_added) {
        ret = NULL;
    } else {
        ret = &todo[todo_start];
        todo_start = (todo_start + 1) % ARRAY_SIZE(todo);
    }
    grep_unlock();
    return ret;
}

This is still reproducible on rawhide using glibc-2.14.90-13.x86_64

I'm seeing this bug too. It only happened when I updated git to 1.7.7-1.fc17 (from git-1.7.6-1.fc16.x86_64), but note this pulled in a lot of dependent packages and it could be in any of them.

Crashes happen intermittently, about 1 time in 4. Here is a stack trace from a crash.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7d8b700 (LWP 5385)]
_int_free (av=0x3b489ae700, p=0x7b2f70, have_lock=0) at malloc.c:4005
4005      old_idx = fastbin_index(chunksize(old));
(gdb) bt
#0  _int_free (av=0x3b489ae700, p=0x7b2f70, have_lock=0) at malloc.c:4005
#1  0x000000000042d583 in work_done (w=0x74e9d8) at builtin/grep.c:177
#2  run (arg=0x78a410) at builtin/grep.c:220
#3  0x0000003b48a07d90 in start_thread (arg=0x7ffff7d8b700) at pthread_create.c:309
#4  0x0000003b486eeddd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0x7ffff4584700 (LWP 5392) "git" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  8    Thread 0x7ffff4d85700 (LWP 5391) "git" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  7    Thread 0x7ffff5586700 (LWP 5390) "git" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
  6    Thread 0x7ffff5d87700 (LWP 5389) "git" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  5    Thread 0x7ffff6588700 (LWP 5388) "git" __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
  4    Thread 0x7ffff6d89700 (LWP 5387) "git" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
  3    Thread 0x7ffff758a700 (LWP 5386) "git" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
* 2    Thread 0x7ffff7d8b700 (LWP 5385) "git" _int_free (av=0x3b489ae700, p=0x7b2f70, have_lock=0) at malloc.c:4005
  1    Thread 0x7ffff7d8d700 (LWP 5382) "git" _int_malloc (av=0x3b489ae700, bytes=30) at malloc.c:3463

As well as crashes, I also see:

- hangs
- error: [filename]: short read No such file or directory
- error: [filename]: short read Success
- *** glibc detected *** git: double free or corruption (fasttop): [address]

Basically, a lot of randomness going on.
FYI, I've just checked out git's upstream v1.7.6 tag, built, and ran this:

for i in $(seq 200);do printf o; timeout 1 ./git grep -q stat;done

That prints thousands of lines like this:

oooo*** glibc detected *** ./git: double free or corruption (fasttop): 0x000000000253e0e0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7c606)[0x7f48493b6606]
./git[0x42bbc1]
/lib64/libpthread.so.0(+0x7d90)[0x7f48496f6d90]
/lib64/libc.so.6(clone+0x6d)[0x7f4849428ddd]
======= Memory map: ========
...

Note that the pthread_cond_wait that you see in the traceback above happens to be called from the function listed in comment #11 which, when *not* inlined, made the problem go away.

I ran this under valgrind using:

while valgrind --error-exitcode=1 /usr/libexec/git-core/git-grep -q stat ; do : ; done

I saw the two errors below very often, and didn't see any other type of error.

==2498== Thread 3:
==2498== Syscall param lstat(file_name) points to unaddressable byte(s)
==2498==    at 0x3B486E1A05: _lxstat (lxstat.c:38)
==2498==    by 0x42CF55: load_file (stat.h:464)
==2498==    by 0x42D501: run (grep.c:211)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==  Address 0x4f8df10 is 0 bytes inside a block of size 12 free'd
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D58B: run (grep.c:178)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==
==2498== Syscall param open(filename) points to unaddressable byte(s)
==2498==    at 0x3B48A0EDCD: ??? (syscall-template.S:82)
==2498==    by 0x42CF84: load_file (fcntl2.h:54)
==2498==    by 0x42D501: run (grep.c:211)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==  Address 0x4f8df10 is 0 bytes inside a block of size 12 free'd
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D58B: run (grep.c:178)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==
==2498== Thread 6:
==2498== Invalid free() / delete / delete[]
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D582: run (grep.c:177)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==  Address 0x4f8deb0 is 0 bytes inside a block of size 24 free'd
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D582: run (grep.c:177)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==
==2498== Invalid free() / delete / delete[]
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D58B: run (grep.c:178)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2498==  Address 0x4f8df10 is 0 bytes inside a block of size 12 free'd
==2498==    at 0x4A0662E: free (vg_replace_malloc.c:366)
==2498==    by 0x42D58B: run (grep.c:178)
==2498==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)

----------------------------------------------------------------------

==2452== Thread 2:
==2452== Invalid read of size 1
==2452==    at 0x4C0C37: find_cached_object (cache.h:678)
==2452==    by 0x4C42FA: read_object (sha1_file.c:2203)
==2452==    by 0x4C4892: read_sha1_file_extended (sha1_file.c:2240)
==2452==    by 0x42D3F0: lock_and_read_sha1_file (cache.h:757)
==2452==    by 0x42D674: run (grep.c:371)
==2452==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)
==2452==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2452==
==2452==
==2452== Process terminating with default action of signal 11 (SIGSEGV)
==2452==  Access not within mapped region at address 0x0
==2452==    at 0x4C0C37: find_cached_object (cache.h:678)
==2452==    by 0x4C42FA: read_object (sha1_file.c:2203)
==2452==    by 0x4C4892: read_sha1_file_extended (sha1_file.c:2240)
==2452==    by 0x42D3F0: lock_and_read_sha1_file (cache.h:757)
==2452==    by 0x42D674: run (grep.c:371)
==2452==    by 0x3B48A07D8F: start_thread (pthread_create.c:309)

On my multi-core F16 desktop (linux-3.1.0-0.rc9.git0.0.fc16.x86_64), I ran coreutils' "make -j15 distcheck", and saw this sole test failure:

FAIL: mkdir/t-slash (exit: 134)
===============================
dispose_command: bad command type: 20
Aborting...

I've never seen that before. That diagnostic comes from bash, yet I see no way in which the simple t-slash script could evoke it.

Seconds later, I reran that command and got segfaults from msgmerge and make:

make[5]: *** [all] Segmentation fault (core dumped)

Here are the lines from dmesg:

[519319.155222] msgmerge[25257]: segfault at 18 ip 00000038c561599e sp 00007ffffe6428b0 error 4 in libgettextsrc-0.18.1.so[38c5600000+3e000]
[519386.303692] make[3382]: segfault at 10 ip 0000000000407fb2 sp 00007fffd74d2f80 error 4 in make[400000+29000]

Repeating one more time, I see this:

...
esac
tar: Skipping to next header
xz: coreutils-8.14.15-22f3b.tar.xz: Compressed data is corrupt
tar: Exiting with failure status due to previous errors
gtar: This does not look like a tar archive
gtar: Exiting with failure status due to previous errors

I reran the "make distcheck" command a few more times, and got those same tar/xz diagnostics consistently. Thinking that finally I might be able to debug easily, since it's all serial... Wrong. I realized that I am using a version of xz (built from git) that does multithreaded compression.
Thinking threading could be the problem, I reran it like this (retaining make's parallelism, though):

(export OMP_NUM_THREADS=1; make distcheck)

Now, the tar/xz failures are gone, but I see this ominous error from gcc (with nothing prior):

comm.c:186: confused by earlier errors, bailing out
The bug is not reproducible, so it is likely a hardware or OS problem.

Finally, one more attempt (nothing else changed), and it succeeded, even without OMP_NUM_THREADS=1. And a 2nd success. And a third success. I'm going to put it in a loop and run "make distcheck" for a few hours...

Two possible explanations. Some system-related problem, like whatever is causing this: http://bugzilla.redhat.com/747377 (but note I'm using the earlier glibc-2.14.90-10.x86_64, so that git threading/heap-corruption bug doesn't affect me). Or maybe it's bad memory. But since 747377 is reproducible (thanks, Rich Jones), and glibc may have merely changed something to amplify the likelihood of triggering an existing bug, I'm not going to spend hours running memtest86+ just yet.

BTW, last week when I began investigating 747377, I successfully bootstrapped gcc from git and it passed most of its test cases. That's usually a good indication that RAM is ok.

FYI, I've waited a while, just in case... Now I'm up to 9 consecutive "make bootstrap" successes. No failure.

This bug (747377) is definitely not caused by bad RAM in my case, because I use ECC (and nothing is in the logs). It looks like a glibc bug. Maybe it's time to CC Ulrich Drepper ;-(.

Discussion on the git mailing list: http://comments.gmane.org/gmane.comp.version-control.git/184184

I still don't know why downgrading glibc avoided the problem, but declaring the globals in grep.c to be volatile solves the problem for me. Patch posted here: http://thread.gmane.org/gmane.comp.version-control.git/184184/focus=184209

Jim, please file a PR with the preprocessed source of grep.c to http://gcc.gnu.org/bugzilla/ for analysis.
Actually, it might be a glibc problem, http://sources.redhat.com/git/?p=glibc.git;a=commitdiff;h=aa78043a4aafe5db1a1a76d544a833b63b4c5f5c added leaf attribute, I bet it is undesirable to use leaf attribute on any of the pthread.h/sem.h synchronization primitives. Although they don't call functions from the current compilation unit, other threads might invoke functions from the current compilation unit and by using the synchronization primitives we are or might be waiting on those other threads to perform some changes in the current translation unit.

Precisely. I just noticed that in the preprocessed diffs. That explains why downgrading solved the problem. For the record, here is the key part of the difference in going from the -10 to -13 version of glibc:

$ diff -u builtin/grep-glibc-2.14.90-1[03].i|grep -A2 mutex_lock
 extern int pthread_mutex_lock (pthread_mutex_t *__mutex)
-     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));

Jakub's comment #22 is right: per POSIX, http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11:

The following functions synchronize memory with respect to other threads:

fork
pthread_barrier_wait
pthread_cond_broadcast
pthread_cond_signal
pthread_cond_timedwait
pthread_cond_wait
pthread_create
pthread_join
pthread_mutex_lock
pthread_mutex_timedlock
pthread_mutex_trylock
pthread_mutex_unlock
pthread_spin_lock
pthread_spin_trylock
pthread_spin_unlock
pthread_rwlock_rdlock
pthread_rwlock_timedrdlock
pthread_rwlock_timedwrlock
pthread_rwlock_tryrdlock
pthread_rwlock_trywrlock
pthread_rwlock_unlock
pthread_rwlock_wrlock
sem_post
sem_timedwait
sem_trywait
sem_wait
semctl
semop
wait
waitpid

Adding the leaf attribute would appear to violate that.
Let's see what Ulrich has to say: http://sourceware.org/bugzilla/show_bug.cgi?id=13344

If there is problem with the __leaf__ attribute in glibc, does this mean that every multi-threaded package (or every C package which uses *printf, as Jakub wrote in http://sourceware.org/bugzilla/show_bug.cgi?id=13344#c2) which was built against glibc-2.14.90-12.999 and later should be rebuilt to ensure it is not affected by this bug?

This seems like F16 blocker for me, nominating it...

Reassigning to glibc for further inspection, as per comment #22

(In reply to comment #27)
> If there is problem with the __leaf__ attribute in glibc, does this mean that
> every multi-threaded package (or every C package which uses *printf, as Jakub
> wrote in http://sourceware.org/bugzilla/show_bug.cgi?id=13344#c2) which was
> built against glibc-2.14.90-12.999 and later should be rebuilt to ensure it is
> not affected by this bug?

Yes, many packages will have to be rebuilt once the error-inducing leaf attributes have been removed from glibc's headers.

> This seems like F16 blocker for me, nominating it...

Thanks.

For anyone who just wants to get on with development using the latest glibc headers, but without the offending leaf attributes, I'm attaching a patch that changes the installed .h files. I haven't looked carefully at every pthread/semaphore/mutex-related function, but have merely removed the leaf attribute from those explicitly listed by POSIX, as well as one or two gnu-extended functions that obviously require the same treatment.

Also, I haven't removed the leaf attribute from any *printf function, but it should be done, just in case. IMHO, it is far less likely that they will cause trouble: how many multi-threaded programs register a printf hook function *and* use the augmented *printf in the same compilation unit that defines the hook function?

Created attachment 530272 [details]
remove some of the offending leaf attributes (incomplete)
This patch is not complete.
can we please have a summary for the normal people here? it's quite difficult to figure if this is a blocker bug based on the kind of discussion going on up there.

does this bug mean we're going to have to rebuild everything that went into stable since glibc 12.999 went into the buildroot? if so, christ on a bike.

--
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

(In reply to comment #31)
> can we please have a summary for the normal people here? it's quite difficult
> to figure if this is a blocker bug based on the kind of discussion going on up
> there.
>
> does this bug mean we're going to have to rebuild everything that went into
> stable since glibc 12.999 went into the buildroot? if so, christ on a bike.

If we figure that glibc is buggy, then every package which is multithreaded (i.e. linked against libpthread) and was built against broken glibc needs to be rebuilt.

Yes, I think this should be a blocker. glibc's addition of the leaf attribute to threading-related functions causes gcc -O2 to violate fundamental assumptions that are made in nearly every multi-threaded application.

(In reply to comment #32)
> (In reply to comment #31)
> > can we please have a summary for the normal people here? it's quite difficult
> > to figure if this is a blocker bug based on the kind of discussion going on up
> > there.
> >
> > does this bug mean we're going to have to rebuild everything that went into
> > stable since glibc 12.999 went into the buildroot? if so, christ on a bike.
>
> If we figure that glibc is buggy, then every package which is multithreaded
> (i.e. linked against libpthread) and was built against broken glibc needs to be
> rebuilt.

Although pthread functions are the main issue, other functions are affected too (eg. printf in a rather rare corner case). To be on the safe side I'd recompile every package that touched glibc >= -12.999.
Every C program or library that uses threading-related functions, compiled with gcc-4.6 or higher. C++ programs should be unaffected and gcc-4.5 doesn't know the leaf attribute.

(In reply to comment #32)
> (In reply to comment #31)
...
> > does this bug mean we're going to have to rebuild everything that went into
> > stable since glibc 12.999 went into the buildroot? if so, christ on a bike.
>
> If we figure that glibc is buggy, then every package which is multithreaded
> (i.e. linked against libpthread) and was built against broken glibc needs to be
> rebuilt.

Checking for use of libpthread is a good start. I suspect that will catch the vast majority. But you'd also need to check for code that uses these functions:

sem_post
sem_timedwait
sem_trywait
sem_wait
semctl
semop

and the GNU-specific one, semtimedop

jim: wouldn't simply reverting the commit noted in comment #22 fix this?

(In reply to comment #37)
> jim: wouldn't simply reverting the commit noted in comment #22 fix this?

You need to push out a new glibc with that commit reverted, then rebuild everything that was built against the broken glibc releases.

The reason is that gcc sees the relaxed constraints on these pthread (and other) functions, and generates incorrect code[1]. Any code that was compiled when the glibc -12.999/-13 headers were in the koji buildroot could have thread safety and maybe other issues.

[1] Example showing incorrect code generation around pthread_mutex: http://permalink.gmane.org/gmane.comp.version-control.git/184205

I know we have to rebuild everything, but for right now, I'm just trying to figure what we actually need to change in glibc first. That's step #1.

Created attachment 530360 [details]
patch fedora 16's glibc to revert the upstream leaf optimization

Here's a proposed patch for F16. I've started an x86_64-only scratch build here: http://koji.fedoraproject.org/koji/taskinfo?taskID=3462080

This would be the only change since 2.14.90-13.

okay, I've tested jim's patch.
I confirmed the issue with the reproducer given in comment #10, which caused all sorts of crashes, overflowing my scrollback. Then I rebuilt glibc with the patch, rebuilt git against the rebuilt glibc, re-ran the reproducer, and it completed with no crashes. looks good to me! I'll commit jim's patch and send out a glibc update, since I have provenpackager privs and he doesn't.

F16 glibc build in progress: http://koji.fedoraproject.org/koji/taskinfo?taskID=3462105

It's already widely agreed that this is a blocker (we have +1s from me, dgilmore, jsmith and rbergeron, for the record), so setting AcceptedBlocker.

glibc-2.14.90-14 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/glibc-2.14.90-14

confirmed that git rebuilt with the actual glibc build that I've submitted as an update works well. I think we can go ahead with the mass rebuild.

(In reply to comment #45)
> confirmed that git rebuilt with the actual glibc build that I've submitted as
> an update works well. I think we can go ahead with the mass rebuild.

Is there a Koji build of this git somewhere? If so I could test it on my Fedora box where the bug was highly reproducible.

Not right now, no. I just built it locally. I could kick off a scratch build, though. just a sec.

http://koji.fedoraproject.org/koji/taskinfo?taskID=3462339

try that, when it's done.

(In reply to comment #47)
> Not right now, no. I just built it locally. I could kick off a scratch build,
> though. just a sec.
>
> http://koji.fedoraproject.org/koji/taskinfo?taskID=3462339
>
> try that, when it's done.

Confirmed that this package fixes the problem I was seeing.
agg-2.5-12.fc16,aisleriot-3.2.1-2.fc16,anaconda-16.24-2.fc16,and-1.2.2-14.fc16,apiextractor-0.10.8-2.fc16,apper-0.7.1-0.4.20111021.fc16,arc-5.21o-9.fc16,at-3.1.13-3.fc16,atk-2.2.0-2.fc16,at-spi-1.32.0-6.fc16,at-spi2-atk-2.2.1-2.fc16,at-spi2-core-2.2.1-2.fc16,balance-3.52-3.fc16,clutter-1.8.2-2.fc16,cogl-1.8.2-2.fc16,contacts-0.12-12.fc16,control-center-3.2.1-2.fc16,cronie-1.4.8-10.fc16,cheese-3.2.1-2.fc16,dar-2.3.8-7.fc16,dhcp-4.2.3-3.fc16,dia-0.97-6.fc16,dialog-1.1-14.20110707.fc16,eb-4.4.1-3.fc16,eet-1.4.1-3.fc16,empathy-3.2.1.1-2.fc16,eog-3.2.1-2.fc16,epiphany-3.2.1-2.fc16,evince-3.2.1-2.fc16,file-5.07-6.fc16,file-roller-3.2.1-2.fc16,folks-0.6.4.1-2.fc16,freeipa-2.1.3-5.fc16,freetype-2.4.6-3.fc16,garcon-0.1.9-2.fc16,gc-7.2-0.5.alpha6.fc16,gcc-python-plugin-0.6-4.1.fc16,gconfmm26-2.28.3-2.fc16,gd-2.0.35-13.fc16,gdl-0.9.1-5.fc16,gdlmm-3.2.1-2.fc16,gdm-3.2.1.1-5.fc16,generatorrunner-0.6.14-2.fc16,gigolo-0.4.1-4.fc16,git-1.7.7-2.fc16,glib-1.2.10-35.fc16,glib-networking-2.30.1-2.fc16,gnome-applets-3.2.1-2.fc16,gnome-bluetooth-3.2.1-2.fc16,gnome-contacts-3.2.2-2.fc16,gnome-desktop-2.32.0-9.fc16,gnome-desktop3-3.2.1-2.fc16,gnome-games-3.2.1-2.fc16,gnome-keyring-3.2.1-2.fc16,gnome-online-accounts-3.2.1-2.fc16,gnome-panel-3.2.1-2.fc16,gnome-python2-2.28.1-5.fc16,gnome-python2-extras-2.25.3-37.fc16,gnome-session-3.2.1-2.fc16,gnome-settings-daemon-3.2.1-4.fc16,gnome-shell-3.2.1-2.fc16,gnome-system-monitor-3.2.1-2.fc16,gnome-terminal-3.2.1-2.fc16,gnome-themes-2.32.0-7.fc16,gnome-themes-standard-3.2.1-2.fc16,gnome-utils-3.2.1-2.fc16,grub-0.97-84.fc16,grub2-1.99-12.fc16,gt-0.4-13.fc16,gtk2-2.24.7-2.fc16,gucharmap-3.2.1-2.fc16,gv-3.7.2-2.fc16,gvfs-1.10.1-2.fc16,hplip-3.11.10-7.fc16,ibus-1.4.0-6.fc16,iok-1.3.13-2.fc16,irqbalance-1.0-4.fc16,kdebase-4.7.2-3.fc16,kdebase-workspace-4.7.2-9.fc16,kde-plasma-networkmanagement-0.9-0.64.beta2.nm09.fc16,kernel-3.1.0-5.fc16,konversation-1.3.1-6.fc16,krb5-1.9.1-18.fc16,krb5-auth-dialog-3.2.1-2.fc16,libdrm-2.4.26-3.fc16,libsoup-2.36.1-2.fc16
,libwnck-2.30.7-2.fc16,libwnck3-3.2.1-2.fc16,lorax-16.4.7-2.fc16,make-3.82-7.fc16,mc-4.8.0-2.fc16,mdadm-3.2.2-13.fc16,midori-0.4.1-3.fc16,mingw32-qt-4.8.0-0.2.rc1.fc16,mingw32-qt-qmake-4.8.0-0.2.rc1.fc16,mm-1.4.2-8.fc16,mon-1.2.0-8.fc16,monit-5.2.5-2.fc16,mousetweaks-3.2.1-2.fc16,mutt-1.5.21-7.fc16,mutter-3.2.1-2.fc16,nautilus-3.2.1-2.fc16,nc-1.101-3.fc16,NetworkManager-0.9.1.90-5.git20110927.fc16,notification-daemon-0.7.3-2.fc16,orage-4.8.2-3.fc16,orc-0.4.16-4.fc16,orca-3.2.1-2.fc16,oxygen-gtk-1.1.4-2.fc16,pan-0.135-2.fc16,pl-5.10.2-7.fc16.1,polkit-0.102-3.fc16,polkit-gnome-0.104-2.fc16,ppl-0.11.2-3.fc16,python-2.7.2-5.2.fc16,python-py-1.4.5-4.fc16,python-pyside-1.0.8-2.fc16,q-7.11-11.fc16,qpid-cpp-0.12-4.fc16.2,qt-4.8.0-0.18.rc1.fc16,quagga-0.99.20-3.fc16,ristretto-0.2.1-2.fc16,rmap-1.2-9.fc16,seahorse-3.2.1-2.fc16,shiboken-1.0.9-2.fc16,sks-1.1.1-7.fc16,sl-3.03-11.fc16,soprano-2.7.2-2.fc16,sssd-1.6.2-5.fc16,st-0.1.1-3.fc16,strigi-0.7.6-4.fc16,sushi-0.2.1-2.fc16,tin-1.8.3-9.fc16,tk-8.5.10-2.fc16,tor-0.2.2.33-1601.fc16,totem-3.2.1-2.fc16,trac-0.12.2-6.fc16,transmission-2.42-2.fc16,tre-0.8.0-4.fc16,udisks-1.0.4-3.fc16,util-linux-2.20.1-2.fc16,vinagre-3.2.1-2.fc16,vino-3.2.1-2.fc16,vte3-0.30.1-2.fc16,wireshark-1.6.2-4.fc16,xfce4-panel-4.8.6-3.fc16,yelp-3.2.1-2.fc16 has been submitted as an update for Fedora 16. 
https://admin.fedoraproject.org/updates/agg-2.5-12.fc16,aisleriot-3.2.1-2.fc16,anaconda-16.24-2.fc16,and-1.2.2-14.fc16,apiextractor-0.10.8-2.fc16,apper-0.7.1-0.4.20111021.fc16,arc-5.21o-9.fc16,at-3.1.13-3.fc16,atk-2.2.0-2.fc16,at-spi-1.32.0-6.fc16,at-spi2-atk-2.2.1-2.fc16,at-spi2-core-2.2.1-2.fc16,balance-3.52-3.fc16,clutter-1.8.2-2.fc16,cogl-1.8.2-2.fc16,contacts-0.12-12.fc16,control-center-3.2.1-2.fc16,cronie-1.4.8-10.fc16,cheese-3.2.1-2.fc16,dar-2.3.8-7.fc16,dhcp-4.2.3-3.fc16,dia-0.97-6.fc16,dialog-1.1-14.20110707.fc16,eb-4.4.1-3.fc16,eet-1.4.1-3.fc16,empathy-3.2.1.1-2.fc16,eog-3.2.1-2.fc16,epiphany-3.2.1-2.fc16,evince-3.2.1-2.fc16,file-5.07-6.fc16,file-roller-3.2.1-2.fc16,folks-0.6.4.1-2.fc16,freeipa-2.1.3-5.fc16,freetype-2.4.6-3.fc16,garcon-0.1.9-2.fc16,gc-7.2-0.5.alpha6.fc16,gcc-python-plugin-0.6-4.1.fc16,gconfmm26-2.28.3-2.fc16,gd-2.0.35-13.fc16,gdl-0.9.1-5.fc16,gdlmm-3.2.1-2.fc16,gdm-3.2.1.1-5.fc16,generatorrunner-0.6.14-2.fc16,gigolo-0.4.1-4.fc16,git-1.7.7-2.fc16,glib-1.2.10-35.fc16,glib-networking-2.30.1-2.fc16,gnome-applets-3.2.1-2.fc16,gnome-bluetooth-3.2.1-2.fc16,gnome-contacts-3.2.2-2.fc16,gnome-desktop-2.32.0-9.fc16,gnome-desktop3-3.2.1-2.fc16,gnome-games-3.2.1-2.fc16,gnome-keyring-3.2.1-2.fc16,gnome-online-accounts-3.2.1-2.fc16,gnome-panel-3.2.1-2.fc16,gnome-python2-2.28.1-5.fc16,gnome-python2-extras-2.25.3-37.fc16,gnome-session-3.2.1-2.fc16,gnome-settings-daemon-3.2.1-4.fc16,gnome-shell-3.2.1-2.fc16,gnome-system-monitor-3.2.1-2.fc16,gnome-terminal-3.2.1-2.fc16,gnome-themes-2.32.0-7.fc16,gnome-themes-standard-3.2.1-2.fc16,gnome-utils-3.2.1-2.fc16,grub-0.97-84.fc16,grub2-1.99-12.fc16,gt-0.4-13.fc16,gtk2-2.24.7-2.fc16,gucharmap-3.2.1-2.fc16,gv-3.7.2-2.fc16,gvfs-1.10.1-2.fc16,hplip-3.11.10-7.fc16,ibus-1.4.0-6.fc16,iok-1.3.13-2.fc16,irqbalance-1.0-4.fc16,kdebase-4.7.2-3.fc16,kdebase-workspace-4.7.2-9.fc16,kde-plasma-networkmanagement-0.9-0.64.beta2.nm09.fc16,kernel-3.1.0-5.fc16,konversation-1.3.1-6.fc16,krb5-1.9.1-18.fc16,krb5-auth-dialog-3.2.1-2.fc16,li
bdrm-2.4.26-3.fc16,libsoup-2.36.1-2.fc16,libwnck-2.30.7-2.fc16,libwnck3-3.2.1-2.fc16,lorax-16.4.7-2.fc16,make-3.82-7.fc16,mc-4.8.0-2.fc16,mdadm-3.2.2-13.fc16,midori-0.4.1-3.fc16,mingw32-qt-4.8.0-0.2.rc1.fc16,mingw32-qt-qmake-4.8.0-0.2.rc1.fc16,mm-1.4.2-8.fc16,mon-1.2.0-8.fc16,monit-5.2.5-2.fc16,mousetweaks-3.2.1-2.fc16,mutt-1.5.21-7.fc16,mutter-3.2.1-2.fc16,nautilus-3.2.1-2.fc16,nc-1.101-3.fc16,NetworkManager-0.9.1.90-5.git20110927.fc16,notification-daemon-0.7.3-2.fc16,orage-4.8.2-3.fc16,orc-0.4.16-4.fc16,orca-3.2.1-2.fc16,oxygen-gtk-1.1.4-2.fc16,pan-0.135-2.fc16,pl-5.10.2-7.fc16.1,polkit-0.102-3.fc16,polkit-gnome-0.104-2.fc16,ppl-0.11.2-3.fc16,python-2.7.2-5.2.fc16,python-py-1.4.5-4.fc16,python-pyside-1.0.8-2.fc16,q-7.11-11.fc16,qpid-cpp-0.12-4.fc16.2,qt-4.8.0-0.18.rc1.fc16,quagga-0.99.20-3.fc16,ristretto-0.2.1-2.fc16,rmap-1.2-9.fc16,seahorse-3.2.1-2.fc16,shiboken-1.0.9-2.fc16,sks-1.1.1-7.fc16,sl-3.03-11.fc16,soprano-2.7.2-2.fc16,sssd-1.6.2-5.fc16,st-0.1.1-3.fc16,strigi-0.7.6-4.fc16,sushi-0.2.1-2.fc16,tin-1.8.3-9.fc16,tk-8.5.10-2.fc16,tor-0.2.2.33-1601.fc16,totem-3.2.1-2.fc16,trac-0.12.2-6.fc16,transmission-2.42-2.fc16,tre-0.8.0-4.fc16,udisks-1.0.4-3.fc16,util-linux-2.20.1-2.fc16,vinagre-3.2.1-2.fc16,vino-3.2.1-2.fc16,vte3-0.30.1-2.fc16,wireshark-1.6.2-4.fc16,xfce4-panel-4.8.6-3.fc16,yelp-3.2.1-2.fc16 FYI, fixed upstream like this: http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=3871f58f065dac3917eb18220a479e9591769c8c We need to just double-check this is fixed in a TC3 fresh install (maybe via the git test) and then set VERIFIED. *** Bug 749730 has been marked as a duplicate of this bug. *** TC3 looks good here. glibc-2.14.90-14 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report. 
agg-2.5-12.fc16, aisleriot-3.2.1-2.fc16, anaconda-16.24-2.fc16, and-1.2.2-14.fc16, apiextractor-0.10.8-2.fc16, apper-0.7.1-0.4.20111021.fc16, arc-5.21o-9.fc16, at-3.1.13-3.fc16, atk-2.2.0-2.fc16, at-spi-1.32.0-6.fc16, at-spi2-atk-2.2.1-2.fc16, at-spi2-core-2.2.1-2.fc16, balance-3.52-3.fc16, clutter-1.8.2-2.fc16, cogl-1.8.2-2.fc16, contacts-0.12-12.fc16, control-center-3.2.1-2.fc16, cronie-1.4.8-10.fc16, cheese-3.2.1-2.fc16, dar-2.3.8-7.fc16, dhcp-4.2.3-3.fc16, dia-0.97-6.fc16, dialog-1.1-14.20110707.fc16, eb-4.4.1-3.fc16, eet-1.4.1-3.fc16, empathy-3.2.1.1-2.fc16, eog-3.2.1-2.fc16, epiphany-3.2.1-2.fc16, evince-3.2.1-2.fc16, file-5.07-6.fc16, file-roller-3.2.1-2.fc16, folks-0.6.4.1-2.fc16, freeipa-2.1.3-5.fc16, freetype-2.4.6-3.fc16, garcon-0.1.9-2.fc16, gc-7.2-0.5.alpha6.fc16, gcc-python-plugin-0.6-4.1.fc16, gconfmm26-2.28.3-2.fc16, gd-2.0.35-13.fc16, gdl-0.9.1-5.fc16, gdlmm-3.2.1-2.fc16, gdm-3.2.1.1-5.fc16, generatorrunner-0.6.14-2.fc16, gigolo-0.4.1-4.fc16, git-1.7.7-2.fc16, glib-1.2.10-35.fc16, glib-networking-2.30.1-2.fc16, gnome-applets-3.2.1-2.fc16, gnome-bluetooth-3.2.1-2.fc16, gnome-contacts-3.2.2-2.fc16, gnome-desktop-2.32.0-9.fc16, gnome-desktop3-3.2.1-2.fc16, gnome-games-3.2.1-2.fc16, gnome-keyring-3.2.1-2.fc16, gnome-online-accounts-3.2.1-2.fc16, gnome-panel-3.2.1-2.fc16, gnome-python2-2.28.1-5.fc16, gnome-python2-extras-2.25.3-37.fc16, gnome-session-3.2.1-2.fc16, gnome-settings-daemon-3.2.1-4.fc16, gnome-shell-3.2.1-2.fc16, gnome-system-monitor-3.2.1-2.fc16, gnome-terminal-3.2.1-2.fc16, gnome-themes-2.32.0-7.fc16, gnome-themes-standard-3.2.1-2.fc16, gnome-utils-3.2.1-2.fc16, grub-0.97-84.fc16, grub2-1.99-12.fc16, gt-0.4-13.fc16, gtk2-2.24.7-2.fc16, gucharmap-3.2.1-2.fc16, gv-3.7.2-2.fc16, gvfs-1.10.1-2.fc16, hplip-3.11.10-7.fc16, ibus-1.4.0-6.fc16, iok-1.3.13-2.fc16, irqbalance-1.0-4.fc16, kdebase-4.7.2-3.fc16, kdebase-workspace-4.7.2-9.fc16, kde-plasma-networkmanagement-0.9-0.64.beta2.nm09.fc16, kernel-3.1.0-5.fc16, konversation-1.3.1-6.fc16, 
krb5-1.9.1-18.fc16, krb5-auth-dialog-3.2.1-2.fc16, libdrm-2.4.26-3.fc16, libsoup-2.36.1-2.fc16, libwnck-2.30.7-2.fc16, libwnck3-3.2.1-2.fc16, lorax-16.4.7-2.fc16, make-3.82-7.fc16, mc-4.8.0-2.fc16, mdadm-3.2.2-13.fc16, midori-0.4.1-3.fc16, mingw32-qt-4.8.0-0.2.rc1.fc16, mingw32-qt-qmake-4.8.0-0.2.rc1.fc16, mm-1.4.2-8.fc16, mon-1.2.0-8.fc16, monit-5.2.5-2.fc16, mousetweaks-3.2.1-2.fc16, mutt-1.5.21-7.fc16, mutter-3.2.1-2.fc16, nautilus-3.2.1-2.fc16, nc-1.101-3.fc16, NetworkManager-0.9.1.90-5.git20110927.fc16, notification-daemon-0.7.3-2.fc16, orage-4.8.2-3.fc16, orc-0.4.16-4.fc16, orca-3.2.1-2.fc16, oxygen-gtk-1.1.4-2.fc16, pan-0.135-2.fc16, pl-5.10.2-7.fc16.1, polkit-0.102-3.fc16, polkit-gnome-0.104-2.fc16, ppl-0.11.2-3.fc16, python-2.7.2-5.2.fc16, python-py-1.4.5-4.fc16, python-pyside-1.0.8-2.fc16, q-7.11-11.fc16, qpid-cpp-0.12-4.fc16.2, qt-4.8.0-0.18.rc1.fc16, quagga-0.99.20-3.fc16, ristretto-0.2.1-2.fc16, rmap-1.2-9.fc16, seahorse-3.2.1-2.fc16, shiboken-1.0.9-2.fc16, sks-1.1.1-7.fc16, sl-3.03-11.fc16, soprano-2.7.2-2.fc16, sssd-1.6.2-5.fc16, st-0.1.1-3.fc16, strigi-0.7.6-4.fc16, sushi-0.2.1-2.fc16, tin-1.8.3-9.fc16, tk-8.5.10-2.fc16, tor-0.2.2.33-1601.fc16, totem-3.2.1-2.fc16, trac-0.12.2-6.fc16, transmission-2.42-2.fc16, tre-0.8.0-4.fc16, udisks-1.0.4-3.fc16, util-linux-2.20.1-2.fc16, vinagre-3.2.1-2.fc16, vino-3.2.1-2.fc16, vte3-0.30.1-2.fc16, wireshark-1.6.2-4.fc16, xfce4-panel-4.8.6-3.fc16, yelp-3.2.1-2.fc16, mesa-7.11-7.fc16, xorg-x11-drv-qxl-0.0.21-7.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.

glibc-2.14.90-15.1 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/glibc-2.14.90-15.1

glibc-2.14.90-18 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/glibc-2.14.90-18