
Bug 1303323

Summary: tcsh: interposed malloc is not ABI-compliant due to lack of alignment
Product: [Fedora] Fedora
Reporter: Joachim Frieben <jfrieben>
Component: tcsh
Assignee: David Kaspar // Dee'Kej <deekej>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Docs Contact:
Priority: high
Version: 24
CC: arjun.is, codonell, deekej, dj, fpokorny, fweimer, goeran, herrold, jakub, jared, jchaloup, kdudka, kevin.paetzold, law, marc.c.dionne, mfabian, nalin, ovasik, pfrankli, praiskup, releng, rkollar, siddhesh, yselkowi
Target Milestone: ---
Keywords: Patch
Target Release: ---
Hardware: x86_64
OS: Linux
URL: https://retrace.fedoraproject.org/faf/reports/bthash/80466500043e2a67ec01fd560d3941c0635eebb3
Whiteboard: abrt_hash:21d2ae42a8b0d74075df03753046e74e359f85f0;VARIANT_ID=workstation;
Fixed In Version: tcsh-6.19.00-7.fc24
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-12 01:31:21 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1305208, 1315713
Attachments:
File: backtrace (flags: none)
File: cgroup (flags: none)
File: core_backtrace (flags: none)
File: dso_list (flags: none)
File: environ (flags: none)
File: exploitable (flags: none)
File: limits (flags: none)
File: maps (flags: none)
File: mountinfo (flags: none)
File: open_fds (flags: none)
File: proc_pid_status (flags: none)
File: var_log_messages (flags: none)
Fix detection of system malloc (#1303323, #1308177) (flags: none)

Description Joachim Frieben 2016-01-30 19:24:46 UTC
Version-Release number of selected component:
tcsh-6.19.00-4.fc24

Additional info:
reporter:       libreport-2.6.3
backtrace_rating: 4
cmdline:        -sh
crash_function: __strftime_internal
executable:     /usr/bin/tcsh
global_pid:     2974
kernel:         4.5.0-0.rc1.git1.2.fc24.x86_64
runlevel:       N 5
type:           CCpp
uid:            1000

Truncated backtrace:
Thread no. 1 (1 frames)
 #0 __strftime_internal at strftime_l.c:914

Comment 1 Joachim Frieben 2016-01-30 19:25:00 UTC
Created attachment 1119658 [details]
File: backtrace

Comment 2 Joachim Frieben 2016-01-30 19:25:02 UTC
Created attachment 1119659 [details]
File: cgroup

Comment 3 Joachim Frieben 2016-01-30 19:25:04 UTC
Created attachment 1119660 [details]
File: core_backtrace

Comment 4 Joachim Frieben 2016-01-30 19:25:05 UTC
Created attachment 1119661 [details]
File: dso_list

Comment 5 Joachim Frieben 2016-01-30 19:25:07 UTC
Created attachment 1119662 [details]
File: environ

Comment 6 Joachim Frieben 2016-01-30 19:25:09 UTC
Created attachment 1119663 [details]
File: exploitable

Comment 7 Joachim Frieben 2016-01-30 19:25:10 UTC
Created attachment 1119664 [details]
File: limits

Comment 8 Joachim Frieben 2016-01-30 19:25:12 UTC
Created attachment 1119665 [details]
File: maps

Comment 9 Joachim Frieben 2016-01-30 19:25:14 UTC
Created attachment 1119666 [details]
File: mountinfo

Comment 10 Joachim Frieben 2016-01-30 19:25:16 UTC
Created attachment 1119667 [details]
File: open_fds

Comment 11 Joachim Frieben 2016-01-30 19:25:18 UTC
Created attachment 1119668 [details]
File: proc_pid_status

Comment 12 Joachim Frieben 2016-01-30 19:25:20 UTC
Created attachment 1119669 [details]
File: var_log_messages

Comment 13 Joachim Frieben 2016-02-02 18:19:15 UTC
The crash occurs every time the Tab key is pressed to trigger file-name completion.

Comment 14 Nalin Dahyabhai 2016-02-04 09:32:22 UTC
In case it helps, I started seeing this after updating to glibc-2.22.90-31.fc24.  After backing down to 2.22.90-29.fc24, new shells didn't exhibit the problem.

Comment 15 Jan Kurik 2016-02-24 14:22:28 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Comment 16 Joachim Frieben 2016-03-05 08:57:09 UTC
* Thu Jan 28 2016 Florian Weimer <fweimer redhat com> - 2.22.90-31
- Add workaround for GCC PR69537.

GCC PR69537 was fixed back in gcc-6.0.0-0.7.fc24. Please drop your patch: it is no longer necessary and breaks the C shell by causing a SIGSEGV when using file-name completion. Thanks!

Comment 17 Florian Weimer 2016-03-07 11:06:27 UTC
This is a stack misalignment issue, now observable because glibc was recompiled with a different GCC version.  I have not fully isolated the cause yet.

Comment 18 Joachim Frieben 2016-03-07 11:32:40 UTC
(In reply to Florian Weimer from comment #17)
I see: the switch to GCC 6 happened right around that date (Thu Jan 28 2016). It seems that tcsh-6.19.00-5.fc24 later failed the mass rebuild for Fedora 24, which means that the currently delivered tcsh-6.19.00-4.fc24 is still the package built with gcc-5.3.1-3.fc24. How about fixing that one?

Comment 19 Florian Weimer 2016-03-07 11:56:18 UTC
I have to take that back.  tcsh interposes its own malloc, but this implementation doesn't follow the x86_64 psABI.  It returns pointers which are not aligned to 16 bytes:

Breakpoint 1, malloc (nbytes=256) at tc.alloc.c:177
177     {
(gdb) finish
Run till exit from #0  malloc (nbytes=256) at tc.alloc.c:177
0x00005555555a0729 in Strbuf_store1 (buf=0x7fffffffd630, c=0 L'\000') at tc.str.c:699
699     DO_STRBUF(Strbuf, Char, Strlen);
Value returned is $1 = (void *) 0x555555876008

Please remove this malloc implementation.
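
For illustration only, here is a minimal sketch of the alignment check implied above. The helper names addr_is_aligned and ptr_is_aligned are hypothetical and do not come from tcsh or glibc; the sketch only tests whether the low bits of an address are zero for a given power-of-two alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if addr is a multiple of alignment
 * (which must be a power of two), else 0. */
int addr_is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Hypothetical helper: same check for a pointer, e.g. the value
 * returned by malloc(). */
int ptr_is_aligned(const void *p, size_t alignment)
{
    return addr_is_aligned((uint64_t)(uintptr_t)p, (uint64_t)alignment);
}
```

Applied to the value from the gdb session, 0x555555876008 passes the check for an alignment of 8 but fails it for 16, which matches the diagnosis.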

Comment 20 Florian Weimer 2016-03-07 14:41:50 UTC
*** Bug 1308177 has been marked as a duplicate of this bug. ***

Comment 21 Yaakov Selkowitz 2016-03-07 17:03:25 UTC
Created attachment 1133876 [details]
Fix detection of system malloc (#1303323, #1308177)

(In reply to Florian Weimer from comment #19)
> I have to take that back.  tcsh interposes its own malloc, but this
> implementation doesn't follow the x86_64 psABI.

Thanks for tracking this down.  The builtin malloc is intended for 
systems without their own, and shouldn't be used with glibc, per the 
following in config_f.h:


However, nothing at this point defines __GLIBC__, as it is not a 
compiler built-in but rather an ordinary define in <features.h>.

Patch attached.
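
As a rough sketch of why the check can fail (this is not the attached patch): the compiler does not predefine __GLIBC__, so a test placed before any libc header has been included does not see it. The two GLIBC_SEEN_* macro names below are made up for this illustration, and <assert.h> merely stands in for "any glibc header".

```c
/* Evaluated before any libc header: on a glibc system __GLIBC__ is
 * expected to be undefined here, unless something earlier in the
 * translation unit already included a libc header. */
#ifdef __GLIBC__
# define GLIBC_SEEN_BEFORE_HEADERS 1
#else
# define GLIBC_SEEN_BEFORE_HEADERS 0
#endif

/* Any glibc header pulls in <features.h>, which defines __GLIBC__. */
#include <assert.h>

#ifdef __GLIBC__
# define GLIBC_SEEN_AFTER_HEADERS 1
#else
# define GLIBC_SEEN_AFTER_HEADERS 0
#endif
```

A configuration header that tests defined(__GLIBC__) at the first point would therefore take the non-glibc branch even when building against glibc.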

Comment 22 Yaakov Selkowitz 2016-03-07 17:21:01 UTC
(In reply to Yaakov Selkowitz from comment #21)
> Thanks for tracking this down.  The builtin malloc is intended for 
> systems without their own, and shouldn't be used with glibc, per the 
> following in config_f.h:

#if defined(__MACHTEN__) || defined(PURIFY) || defined(MALLOC_TRACE) || defined(_OSD_POSIX) || defined(__MVS__) || defined (__CYGWIN__) || defined(__GLIBC__) || defined(__OpenBSD__) || defined(__APPLE__)
# define SYSMALLOC
#else
# undef SYSMALLOC
#endif

(Unfortunately git-bz can't tell the difference between a bash comment and a quoted preprocessor directive. :-)

Comment 23 Pavel Raiskup 2016-03-08 08:43:20 UTC
FTR, there used to be (or still is) a glibc issue with ASLR, that is why we
use SYSMALLOC in el6 now.

Comment 24 Carlos O'Donell 2016-03-08 12:50:17 UTC
(In reply to Pavel Raiskup from comment #23)
> FTR, there used to be (or still is) a glibc issue with ASLR, that is why we
> use SYSMALLOC in el6 now.

This doesn't look like a glibc issue. It looks like a kernel VA layout issue, which none of userspace has any control over and needs to be fixed in the kernel.

Comment 25 Carlos O'Donell 2016-03-08 13:59:11 UTC
(In reply to Carlos O'Donell from comment #24)
> (In reply to Pavel Raiskup from comment #23)
> > FTR, there used to be (or still is) a glibc issue with ASLR, that is why we
> > use SYSMALLOC in el6 now.
> 
> This doesn't look like a glibc issue. It looks like a kernel VA layout
> issue, which none of userspace has any control over and needs to be fixed in
> the kernel.

If you have any problems with glibc's allocator please file a ticket and we'll be more than happy to investigate. On the glibc team, DJ Delorie has been working on a project to enhance malloc (we have a thread-local cache added now to get performance up to the levels tcmalloc and jemalloc have) and any input on requirements would really help us now.

Comment 26 Yaakov Selkowitz 2016-03-08 19:05:32 UTC
The OP has reported this issue upstream (thanks!) and my patch is now upstream:

https://github.com/tcsh-org/tcsh/commit/b2c7dbcf2b32ad5ad6dec5575fb630180677555a

Comment 27 jared mauch 2016-04-05 12:56:09 UTC
*** Bug 1321141 has been marked as a duplicate of this bug. ***

Comment 28 Marc Dionne 2016-04-20 15:51:36 UTC
Any timeline for getting a fix for this pushed out?  As things stand, tcsh is basically unusable as an interactive shell in Fedora 24.

The commit referenced in comment 26 does fix the issue for me, as did changing ROUNDUP to 15 in the builtin allocator to force 16-byte alignment.
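
A sketch of the arithmetic behind the ROUNDUP change, assuming (the thread does not quote the allocator source) that ROUNDUP is used as a low-bit mask that is added to a value and then cleared. The function round_up_mask is a hypothetical stand-in, not tcsh's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: rounds n up to the next multiple of (mask + 1).
 * mask must be one less than a power of two, e.g. 7 or 15. */
size_t round_up_mask(size_t n, size_t mask)
{
    assert((mask & (mask + 1)) == 0);
    return (n + mask) & ~mask;
}
```

Under that assumption, a mask of 7 yields multiples of 8, while a mask of 15 yields multiples of 16: round_up_mask(24, 7) stays at 24, whereas round_up_mask(24, 15) becomes 32.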

Comment 29 David Kaspar // Dee'Kej 2016-04-20 17:19:18 UTC
Hello,

I'm currently doing a cleanup of the 'tcsh' package for Fedora 24. Unfortunately, the new package won't be available in time for the beta, but I'm dedicating most of my time to getting it into F24, along with some additional important fixes.

I am sorry for the inconvenience.

Best regards,

Dee'Kej

Comment 30 Joachim Frieben 2016-04-20 17:25:12 UTC
(In reply to David Kaspar [Dee'Kej] from comment #29)
A preliminary build fixing this crasher bug would be very helpful; less important modifications can be implemented in future builds.

Comment 31 jared mauch 2016-05-02 23:54:53 UTC
Did someone make a build yet?  Seems a simple fix, I guess I should rebuild with the patch?

Comment 32 David Kaspar // Dee'Kej 2016-05-03 15:36:52 UTC
Here's the build for rawhide:
http://koji.fedoraproject.org/koji/taskinfo?taskID=13904702

I will get it into F24 today.

Comment 33 Fedora Update System 2016-05-03 16:34:37 UTC
tcsh-6.19.00-7.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-2417b6677a

Comment 34 Fedora Update System 2016-05-04 14:29:30 UTC
tcsh-6.19.00-7.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-2417b6677a

Comment 35 Fedora Update System 2016-05-12 01:31:07 UTC
tcsh-6.19.00-7.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.