Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.

Bug 203915

Summary: 64-bit unclean code in evolution
Product: [Fedora] Fedora Reporter: Paul F. Johnson <paul>
Component: evolutionAssignee: Matthew Barnes <mbarnes>
Status: CLOSED RAWHIDE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 6CC: desktop-bugs, gnomeuser, jakub, redhat
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: evolution-data-server-1.8.0-11.fc6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-09-27 14:37:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 150224    
Attachments:
Description Flags
Patch to clean up 64-bit unclean code none

Description Paul F. Johnson 2006-08-24 13:58:37 UTC
Description of problem:
Since the update to the new -
23 version of glibc (and the 
newer versions of Evolution 
from the 23rd Aug), Evolution 
fails to work. It starts but 
quickly begins to claim 90% 
of the processor time.

The software has not crashed, 
it's just sitting there.

Version-Release number of 
selected component (if 
applicable):
$ rpm -q evolution glibc
evolution-2.7.92-4.fc6
glibc-2.4.90-23
glibc-2.4.90-23

How reproducible:
Always

Steps to Reproduce:
1. Start evolution from 
either the command line or 
from the desktop icon
2.
3.
  
Actual results:
Evolution takes upto 90% of 
the processing time

Running a debugger through 
gives the following

1) Starting evolution hangs 
in an endless loop of

mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)
mmap(NULL, 68719480832, 
PROT_READ|PROT_WRITE,  
MAP_PRIVATE|MAP_ANONYMOUS, -
1, 0) = -1
ENOMEM (Cannot allocate 
memory)

This comes just after it 
tries to read any  
~/.evolution/mail/local/
*.cmeta-file

Deleting all the .cmeta-files 
will at least let evolution 
start.


2) After starting up 
evolution (deleting the
 .cmeta-files) it is still  
unusable.
It is using 90% cpu, but now 
there is no apparent reason 
for the hang.
After abount 30 minutes of 
waiting for any folder to 
show up, I just  
had to kill it to be
able to use the machine at 
all.

An strace of the process now 
onlys shows

poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 12952) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
U7z\0\300\2\17\1\0\0\26\1\0\0\254\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0
poll([{fd=5, events=POLLIN}, 
{fd=4, events=POLLIN, 
revents=POLLIN}, {fd=9,
events=POLLIN|POLLPRI}, 
{fd=11, events=POLLIN|
POLLPRI}, {fd=27,  
events=POLLIN}, {fd=19,
events=POLLIN}, {fd=23, 
events=POLLIN}], 7, 7924) = 1
ioctl(4, FIONREAD, 
[32])                = 0
read(4, "\241 
V7z\0\300\2\17\1\0\0\26\1\0\0\255\r\0\0z\0\300\2\0"...,
 32) = 32
write(4, 
"\31\0\v\0M\0\0\0\0\0\30\0! 
f\0M\0\0\0\17\1\0\0\26\1\0\0"...,
  
44) = 44
ioctl(4, FIONREAD, 
[0])                 = 0
ioctl(4, FIONREAD, 
[0])                 = 0

With a new poll about every 2 
or 3 seconds.

These may be unrelated.

Expected results:
Evolution should give it's 
usual high standard open 
source goodness as a usenet 
and email client.

Additional info:
The problem is only seen on 
x86_64

Comment 1 Jakub Jelinek 2006-08-24 15:59:52 UTC
There seem to be several bugs.
One is that evolution calls malloc (0x1000000030).
camel_object_get_type has (%rax is 1):
lea -20(%rax), %edi    ! Note, 32-bit result
shl $0x4, %rdi
add $0x160, %rdi
and passes that as argument to g_try_malloc.
This corresponds to:
426             guint32 i, count, version;
...
459                     /* we batch up the properties and set them in one go */
460                     if (!(argv = g_try_malloc (sizeof (*argv) + (count -
CAMEL_ARGV_MAX) * sizeof (argv->argv[0]))))
461                             return -1;

which is clearly 64-bit unclean code.  Either you want to allocate at least
sizeof (*argv), then it should be (MIN (count, CAMEL_ARGV_MAX) - CAMEL_ARGV_MAX)
* sizeof (argv->argv[0]) or something similar, or it wants to allocate sometimes
less than sizeof (*argv), in that case it could use something like
offsetof (CamelArgV, argv) + count * CAMEL_ARGV_MAX.
Or use signed count.

Another one is glibc behavior when malloc is called with this and there isn't
enough memory, will look at that now.

Comment 2 Paul F. Johnson 2006-08-26 22:09:53 UTC
Seems to be happy with the glibc update on 26th Aug.

Comment 3 Paul F. Johnson 2006-08-31 01:10:26 UTC
Okay, it's not crashed and burned. Closing the bug

Comment 4 Jakub Jelinek 2006-09-12 14:13:24 UTC
This shouldn't be closed until evolution side is fixed.  Perhaps it doesn't
crash or hang if it the malloc fails, but the 64-bit unclean code is still
causing an allocation which will very likely not succeed ever.

Comment 5 Matthias Clasen 2006-09-26 04:21:55 UTC
Matt, did you investigate the question in comment #1 ?
Do we want to allocate at most (but sometimes less than) CAMEL_ARGV_MAX,
or what is the goal here ?

It appears to me that we should maybe change the array in CamelArgV to be
flexible, and do away with CAMEL_ARGV_MAX altogether ? Then we would simply do

g_try_malloc (sizeof(CamelArgV) + count * sizeof(CamelArg))

Comment 6 Matthew Barnes 2006-09-26 19:45:12 UTC
Wow, what a mess.  Why couldn't they just use GValueArray for this stuff?

I don't see how CAMEL_ARGV_MAX is actually limiting anything as far as memory
allocation goes.  It looks like the following expression is intended to come out
negative:

   (count - CAMEL_ARGV_MAX) * sizeof (argv->argv[0])

But, for example, if count == (CAMEL_ARGV_MAX + 1) it works out to:

   (1) * sizeof (argv->argv[0])

which is positive.  So the allocation limit is bypassed.

I spotted two other places in camel-object.c that use this expression.

Comment 7 Matthew Barnes 2006-09-26 20:09:20 UTC
I forgot to clarify that the intent of the expression seems to be to allocate
just enough memory to hold a given number of CamelArgs.

I like Matthias' idea of using a flexible array, but I think I'll save that for
post-FC6 (eventually I'd like to rewrite the entire Camel vararg mechanism).

For now I'll take a more conservative approach that actually enforces the limit:

   count = MIN (count, CAMEL_ARGV_MAX);
   argv = g_try_malloc (sizeof (CamelArgV) -
           (CAMEL_ARGV_MAX - count) * sizeof (CamelArg));

Comment 8 Matthew Barnes 2006-09-27 02:17:40 UTC
Created attachment 137194 [details]
Patch to clean up 64-bit unclean code

Comment 9 Matthew Barnes 2006-09-27 14:37:51 UTC
Fixed in evolution-data-server-1.8.0-11.fc6

Comment 10 Nicole Dai 2006-10-13 06:09:18 UTC
Retest with evolution-2.8.0-6.fc6, evolution-data-server-1.8.0-11.fc6 and
glibc-2.4.90-36 by the following steps:
1. Launch evolution via command or Email icon on panel
2. Normally use it, such as setup account,new message, send/receive mails, etc.
3. Observe CPU history and other system resouces.

Test result: There was no abnormity showed as evolution can be used normally and
system resouces load was low.