Bug 2273618 - Optimizing with -O2 causes wrong results on s390x
Summary: Optimizing with -O2 causes wrong results on s390x
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 40
Hardware: s390x
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
Reported: 2024-04-05 11:00 UTC by Jonas Ådahl
Modified: 2024-04-12 13:45 UTC
CC List: 13 users

Fixed In Version: gcc-14.0.1-0.14.fc41
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-12 13:45:04 UTC
Type: ---
Embargoed:


Attachments
Reproducer (deleted)
2024-04-05 11:00 UTC, Jonas Ådahl


Links
System: GNU Compiler Collection
ID: 114605
Private: 0
Priority: P1
Status: UNCONFIRMED
Summary: [14 Regression] wrong code with -march=z13 -O0 since r14-5831
Last Updated: 2024-04-05 13:23:55 UTC

Description Jonas Ådahl 2024-04-05 11:00:05 UTC
While investigating faulty rendering in GNOME Shell on s390x, I discovered that compiling mutter with -O0 made the issue go away.

I then narrowed it down to a function that does a memcpy from a local float array to a stack-allocated float array in a callee.

I could also work around it in three ways:

* #pragma GCC optimize ("O0") around the affected function.
* Mark the float array copied from as volatile
* Switch the memcpy to a for loop
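
The first and third workarounds can be sketched roughly as follows. This is not the attached reproducer or the mutter code: the function names, the array size, and the values are assumptions made for illustration only. The volatile variant is left out because memcpy takes a non-volatile pointer, so showing it would require casting the qualifier away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_FLOATS 4 /* hypothetical size, chosen for illustration */

/* Workaround 1 (sketch): compile only this function without
   optimization, using GCC's function-specific option pragmas. */
#pragma GCC push_options
#pragma GCC optimize ("O0")
void
fill_unoptimized (float *dst)
{
  float src[N_FLOATS] = { 0.0f, 0.0f, 1.0f, 1.0f };
  memcpy (dst, src, sizeof src);
}
#pragma GCC pop_options

/* Workaround 3 (sketch): copy element by element instead of
   calling memcpy. */
void
fill_with_loop (float *dst)
{
  const float src[N_FLOATS] = { 0.0f, 0.0f, 1.0f, 1.0f };
  for (size_t i = 0; i < N_FLOATS; i++)
    dst[i] = src[i];
}

/* Hypothetical check: aborts if dst does not hold the copied values. */
void
check_pattern (const float *dst)
{
  assert (dst[0] == 0.0f && dst[1] == 0.0f);
  assert (dst[2] == 1.0f && dst[3] == 1.0f);
}
```

Calling either fill function on a four-element float buffer and then passing that buffer to check_pattern should return without aborting on a correctly behaving compiler.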

With that in mind, I took the relevant code and removed as much as I could while still reproducing the issue. The memcpy alone is not enough; it needs some surrounding noise to reproduce.

I'm attaching a C file that reproduces the issue. When run, it exits cleanly if the issue does not reproduce. If it does reproduce, it prints:

1.000000 == 0.000000 failed
Aborted (core dumped)

The three discovered workarounds are included in the C file, hidden behind `#if 0`.

Reproducible: Always

Comment 1 Jonas Ådahl 2024-04-05 11:00:46 UTC
Created attachment 2025354 [details]
Reproducer

Comment 2 Dan Horák 2024-04-05 11:45:42 UTC
Jonas, could you also make the attachment public? Thanks.

Comment 3 Jonas Ådahl 2024-04-05 11:55:08 UTC
(In reply to Dan Horák from comment #2)
> Jonas, could you also make the attachment public? Thanks.

Done; sorry about that.

Comment 4 Dan Horák 2024-04-05 12:05:02 UTC
Thanks. For the record, it reproduces on z14 with gcc-14.0.1-0.13.fc41.s390x, but not with gcc-13.2.1-4.fc38.s390x.

Comment 5 Jakub Jelinek 2024-04-05 12:11:44 UTC
Simplified for -march=z13 -O0:

typedef struct { const float *a; int b, c; float *d; } S;

__attribute__((noipa)) void
bar (void)
{
}

__attribute__((noinline, optimize (2))) static void
foo (S *e)
{
  const float *f;
  float *g;
  float h[4] = { 0.0, 0.0, 1.0, 1.0 };
  if (!e->b)
    f = h;
  else
    f = e->a;
  g = &e->d[0];
  __builtin_memcpy (g, f, sizeof (float) * 4);
  bar ();
  if (!e->b)
    if (g[0] != 0.0 || g[1] != 0.0 || g[2] != 1.0 || g[3] != 1.0)
      __builtin_abort ();
}

int
main ()
{
  float d[4];
  S e = { .d = d };
  foo (&e);
  return 0;
}

Bisecting now.

Comment 6 Jakub Jelinek 2024-04-05 13:10:07 UTC
Bisected to https://gcc.gnu.org/r14-5831

