Bug 1689769 - Surprising effect of register volatile on s390x
Summary: Surprising effect of register volatile on s390x
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 29
Hardware: s390x
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
Reported: 2019-03-18 03:46 UTC by Jerry James
Modified: 2019-11-27 22:50 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-27 22:50:54 UTC
Type: Bug
Embargoed:


Attachments
Test code with register volatile (357 bytes, text/plain), 2019-03-18 03:52 UTC, Jerry James
experimental patch (2.86 KB, patch), 2019-03-20 13:00 UTC, IBM Bug Proxy


Links
System: IBM Linux Technology Center
ID: 176259
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2019-07-03 13:01:09 UTC

Description Jerry James 2019-03-18 03:46:39 UTC
Description of problem:
Note: the only s390x box I have access to has old mock configs, and is currently unable to build for F30 and Rawhide.  Therefore, I am reporting this bug against F29, the latest release for which I can build.  It may affect Rawhide as well.

Clisp has been failing to build on s390x for a while.  It reports a stack smash when the binary is invoked.  The debugger shows that the stack pointer is being changed in weird ways.  Changing the optimization level does not make the behavior go away, but it does change where in the code the issue manifests.  Sometimes the code segfaults; sometimes it reports the stack smash.

I have traced the issue to clisp's memory management scheme.  See the attached test file.  If built with "gcc -O0 -g -pipe -Wall -Wextra -fwrapv -fno-strict-aliasing -c test2.c", the asciz_equal prologue contains this instruction:

ldgr    %f0,%r15

and the function epilogue contains this instruction:

lgdr    %r15,%f0

However, if "-DBUG" is added to the gcc command line, then the prologue instruction is NOT generated, but the epilogue instruction is.  The result is that an essentially random value is put into %r15, the stack pointer.  The next time the calling function attempts a stack access, either the segfault or the stack smash report ensues.

The compiler DOES warn about the declaration of __SP:

test2.c:2:1: warning: optimization may eliminate reads and/or writes to register variables [-Wvolatile-register-var]
 register __volatile__ unsigned long __SP __asm__("15");
 ^~~~~~~~

However, I find it very surprising that that declaration can affect code inside functions following the declaration in this way.  Is this considered correct?

Version-Release number of selected component (if applicable):
gcc-8.3.1-2.fc29.s390x

How reproducible:
Always.

Steps to Reproduce:
1. Build the attached code without -DBUG.
2. Observe that the prologue saves %r15 and the epilogue restores it.
3. Build the attached code with -DBUG.
4. Observe that the prologue does NOT save %r15, but the epilogue restores it anyway.

Actual results:
Crashes and incorrect stack smash reports.

Expected results:
I'm not sure what behavior to expect, since the C code is arguably wrong, but not this.

Additional info:

Comment 1 Jerry James 2019-03-18 03:52:44 UTC
Created attachment 1545109
Test code with register volatile
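
The attachment itself is not reproduced in this report. As a rough illustration only, a test file of this kind might look like the sketch below. It assumes the `__SP` declaration quoted in the compiler warning, the `asciz_equal` function name and the `BUG` macro mentioned in the description; the function body and the extra `__s390x__` guard (added so the file still compiles on other architectures) are assumptions, not the contents of the real attachment.

```c
/* Hypothetical reconstruction of a test2.c-style file; NOT the real attachment. */

#if defined(BUG) && defined(__s390x__)
/* Declaration quoted in the warning above: binds __SP to %r15, the s390x
   stack pointer.  The __s390x__ guard is an addition for portability. */
register __volatile__ unsigned long __SP __asm__("15");
#endif

/* Return 1 if the two NUL-terminated strings are equal, 0 otherwise.
   The body is assumed; only the function name comes from the report. */
int asciz_equal(const char *a, const char *b)
{
    for (;;) {
        char c = *a++;
        if (c != *b++)
            return 0;   /* mismatch, or one string ended early */
        if (c == '\0')
            return 1;   /* both strings ended at the same position */
    }
}
```

Compiled on s390x with the command line from the description, once with and once without "-DBUG", the disassembly of asciz_equal can then be compared for the ldgr/lgdr pair described above.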

Comment 2 Jakub Jelinek 2019-03-19 17:50:49 UTC
This changed in http://gcc.gnu.org/r203303

Comment 3 IBM Bug Proxy 2019-03-20 13:00:26 UTC
Created attachment 1546058
experimental patch


------- Comment on attachment From Andreas.Krebbel.com 2019-03-20 08:57 EDT-------


The stack pointer needs saving and restoring even if it is a global register. This is currently not handled correctly in s390_optimize_register_info.

I'm testing the attached patch. Does this fix the problem for you?

Comment 4 Jakub Jelinek 2019-03-20 13:22:34 UTC
Comment on attachment 1546058
experimental patch

Thanks, looks reasonable to me. For the testcase, though, I wonder whether it wouldn't be better to have a runtime testcase: one source with that global register variable and a noipa function that does something needing stack allocation, and an auxiliary source with the rest. The auxiliary source would contain main, which calls that noipa function, and the noipa function could, say, call another noipa function in the auxiliary TU. Some runtime verification would then fail if the stack pointer changed in main (for example, it wouldn't find some variable for a check, or a comparison of addresses would fail).

Comment 5 Jakub Jelinek 2019-03-20 13:56:00 UTC
Something like:
$ cat prNNNNN-1.c
/* PR target/NNNNN */
/* { dg-do run } */
/* { dg-options "-O0 -fomit-frame-pointer" } */
/* { dg-additional-sources "prNNNNN-2.c" } */

register void *sp __asm ("15");

__attribute__((noipa)) int
foo (const char *a, const char *b)
{
  while (1)
    {
      char c = *a++;
      if (c != *b++) return 0;
      if (c == '\0') return 1;
    }
}
$ cat prNNNNN-2.c
/* PR target/NNNNN */
/* { dg-do compile } */

extern int foo (const char *, const char *);

__attribute__((noipa)) void
bar (const char *p)
{
  static const char *x;
  if (!x)
    x = p;
  else if (p != x)
    __builtin_abort ();
}

int
main ()
{
  char a[8] = "abcdefg";
  bar (a);
  if (foo (a, a) != 1)
    __builtin_abort ();
  bar (a);
  return 0;
}

with NNNNN replaced by the number of a GCC Bugzilla bug filed for this.  At least the above fails for me with the current trunk, and commenting out the sp declaration makes it work.

Comment 6 IBM Bug Proxy 2019-03-20 15:10:24 UTC
------- Comment From Andreas.Krebbel.com 2019-03-20 11:00 EDT-------
GCC Bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89775

Comment 7 IBM Bug Proxy 2019-03-20 15:30:27 UTC
------- Comment From Andreas.Krebbel.com 2019-03-20 11:29 EDT-------
Test was successful. I've committed the patch with the testcase from Jakub. I've verified that the testcase fails before and succeeds after the patch. Thanks!

Comment 8 Jerry James 2019-03-21 03:17:20 UTC
Thanks for the fast analysis and fix.  Once a patched gcc is available in the Fedora repos, I'll try it out on the clisp code.

Comment 9 Ben Cotton 2019-10-31 19:08:40 UTC
This message is a reminder that Fedora 29 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 29 on 2019-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '29'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 29 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 10 Ben Cotton 2019-11-27 22:50:54 UTC
Fedora 29 changed to end-of-life (EOL) status on 2019-11-26. Fedora 29 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

