Bug 1411116
Summary: | java-1.8.0-openjdk: SIGSEGV (0xb) | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | gil cattaneo <puntogil> |
Component: | java-1.8.0-openjdk | Assignee: | Andrew John Hughes <ahughes> |
Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | | |
Version: | rawhide | CC: | ahughes, arik, cesarb, dbhole, enrico.tagliavini, jerboaa, jvanek, mikko.tiihonen, msrb, omajid, sgehwolf, thelan |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-01-24 22:43:35 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
gil cattaneo
2017-01-08 14:04:49 UTC
I also get this on rawhide after upgrading to openjdk 1.8.0.111-3.b16 that has this in rpm changelog: "java SSL/TLS implementation: should follow the policies of system-wide crypto policy"

After that update opening an SSL connection might randomly crash the JVM.

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x157101] __memmove_avx_unaligned_erms+0x211
C [libsunec.so+0x1220]
C [libsunec.so+0x138a] Java_sun_security_ec_ECKeyPairGenerator_generateECKeyPair+0x12a
j sun.security.ec.ECKeyPairGenerator.generateECKeyPair(I[B[B)[Ljava/lang/Object;+0
j sun.security.ec.ECKeyPairGenerator.generateKeyPair()Ljava/security/KeyPair;+56
j java.security.KeyPairGenerator$Delegate.generateKeyPair()Ljava/security/KeyPair;+23
j sun.security.ssl.ECDHCrypt.<init>(Ljava/security/spec/ECParameterSpec;Ljava/security/SecureRandom;)V+17
j sun.security.ssl.ClientHandshaker.serverKeyExchange(Lsun/security/ssl/HandshakeMessage$ECDH_ServerKeyExchange;)V+44
j sun.security.ssl.ClientHandshaker.processMessage(BI)V+582
j sun.security.ssl.Handshaker.processLoop()V+96
j sun.security.ssl.Handshaker.process_record(Lsun/security/ssl/InputRecord;Z)V+24
j sun.security.ssl.SSLSocketImpl.readRecord(Lsun/security/ssl/InputRecord;Z)V+357
j sun.security.ssl.SSLSocketImpl.performInitialHandshake()V+84

Created attachment 1238744 [details]
Java crash error file containing register values
Hi, is this consistently reproducible?

Yes, easily reproducible.

I just tried to run "mvn -U versions:display-dependency-updates" on my project. 20/20 times it crashed with the same SIGSEGV error.

Luckily, once I get my IDE running after a few tries, it no longer tries to open network connections and stays up.

(In reply to Mikko Tiihonen from comment #4)
> Yes, easily reproducible.
>
> I just tried to run "mvn -U versions:display-dependency-updates" on my
> project. 20/20 times it crashed with the same SIGSEGV error.
>
> Luckily, once I get my IDE running after a few tries, it no longer tries to
> open network connections and stays up.

Thanks, are you able to put up relevant bits of your projects online somewhere so that we can reproduce this?

This morning I got the update to 1.8.0.111-5.b16.fc26 and the problem vanished.

I just tried to downgrade back to 1.8.0.111-3.b16.fc26 (downloaded from koji) that I was using when I reported the problem and the crashes started occurring again.

So it seems that the problem is fixed in the newer version.

A simple way to reproduce this is to run:

rm -rf ~/.m2/repository/ && mvn archetype:generate -DgroupId=com.mycompany.app -DartifactId=my-app -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

With the -3 build it crashes within a few seconds. With -5 it finishes successfully.

(In reply to Mikko Tiihonen from comment #6)
> This morning I got the update to 1.8.0.111-5.b16.fc26 and the problem
> vanished.
>
> I just tried to downgrade back to 1.8.0.111-3.b16.fc26 (downloaded from
> koji) that I was using when I reported the problem and the crashes started
> occurring again.
>
> So it seems that the problem is fixed in the newer version.
>
> A simple way to reproduce this is to run:
>
> rm -rf ~/.m2/repository/ && mvn archetype:generate
> -DgroupId=com.mycompany.app -DartifactId=my-app
> -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
>
> With the -3 build it crashes within a few seconds. With -5 it finishes
> successfully.

Did you happen to have multiple arches installed? The only relevant change between -3 and -5, it seems, was to introduce arch-specific requires for its own subpackages (of java-1.8.0-openjdk).

Nope, only the x86_64 version of the openjdk is installed.

I also wondered about the terse changelog that does not mention any relevant changes. I wonder what the -4 version changed. Can you reproduce it on the older version?

Maybe it is some strange alignment or initialization order issue that just might magically appear again in new builds if it is not fixed.

As an additional data point I also tested the -4 version from koji. It also crashes.

I'm also seeing these crashes, in both Maven and IDEA, but in java-1.8.0-openjdk-1.8.0.111-5.b16.fc24.x86_64, so it's still happening in the -5 version. Mine are in __memcpy_avx_unaligned:
Stack: [0x00007f682c5fc000,0x00007f682c6fd000], sp=0x00007f682c6fa638, free space=1017k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x14a734] __memcpy_avx_unaligned+0x2c4
C [libsunec.so+0x1240]
C [libsunec.so+0x13aa] Java_sun_security_ec_ECKeyPairGenerator_generateECKeyPair+0x12a
j sun.security.ec.ECKeyPairGenerator.generateECKeyPair(I[B[B)[Ljava/lang/Object;+0
j sun.security.ec.ECKeyPairGenerator.generateKeyPair()Ljava/security/KeyPair;+56
j java.security.KeyPairGenerator$Delegate.generateKeyPair()Ljava/security/KeyPair;+23
j sun.security.ssl.ECDHCrypt.<init>(Ljava/security/spec/ECParameterSpec;Ljava/security/SecureRandom;)V+17
j sun.security.ssl.ClientHandshaker.serverKeyExchange(Lsun/security/ssl/HandshakeMessage$ECDH_ServerKeyExchange;)V+44
[...]
Of note are the registers. Mine are:
RAX=0x00000000bad35fc0 is pointing into object: 0x00000000bad35fb0
[B
- klass: {type array byte}
- length: 311656120
RBX=0x00007f68481f0800 is a thread
RCX=0x0000000012937eb8 is an unknown value
RDX=0x0000000012937eb8 is an unknown value
RSP=0x00007f682c6fa638 is pointing into the stack for thread: 0x00007f68481f0800
RBP=0x00007f682c6fa6b0 is pointing into the stack for thread: 0x00007f68481f0800
RSI=0x0000000000000000 is an unknown value
RDI=0x00000000bad35fc0 is pointing into object: 0x00000000bad35fb0
[B
- klass: {type array byte}
- length: 311656120
R8 =0x0000000000000000 is an unknown value
R9 =0x0000000004000001 is an unknown value
R10=0x0000000000000001 is an unknown value
R11=0x0000000000000283 is an unknown value
R12=0x0000000012937eb8 is an unknown value
R13=0x0000000000000000 is an unknown value
R14=0x00007f6810029ad0 is an unknown value
R15=0x00007f68481f0a58 is an unknown value
Notice that RDI is pointing to a Java-allocated array of around 300 megabytes, and RDX (which should be the memcpy size parameter) is exactly the size of that array. And RSI is a null pointer.
The same can be found in attachment #1238744 [details]: RDI points to a Java-allocated array of around 1.7 gigabytes, RDX is exactly that length, and RSI is a null pointer.
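The match between RDX and the array length in the register dump quoted in this comment can be checked with a couple of lines of arithmetic. This is only a small sketch; the variable names are illustrative, and the two values are copied from the dump above.

```python
# Values copied from the register dump in this comment.
rdx = 0x12937EB8          # RDX, which should be memcpy's size argument
array_length = 311656120  # "length" of the byte array that RDI points into

# The hexadecimal register value and the decimal array length are the same number.
sizes_match = rdx == array_length

# Size of the destination array in MiB (2**20 bytes).
size_mib = array_length / 2**20

print(sizes_match)
print(f"{size_mib:.1f} MiB")
```

The array comes out a little under 300 MiB, which agrees with the "around 300 megabytes" estimate.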
The faulting instruction, c5 fe 6f 26, is "vmovdqu (%rsi),%ymm4", so it's a null pointer dereference.
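For anyone who wants to double-check that decoding by hand, here is a minimal sketch that splits the four bytes into their two-byte VEX prefix, opcode and ModRM fields. The function name and the dictionary keys are made up for illustration, and the decoder only handles this one instruction shape (C5 prefix, no SIB byte, no displacement).

```python
def decode_vex2(code: bytes) -> dict:
    """Split a 4-byte instruction of the form C5 <vex> <opcode> <modrm> into fields."""
    if len(code) != 4 or code[0] != 0xC5:
        raise ValueError("expected a 4-byte instruction starting with 0xC5")
    vex, opcode, modrm = code[1], code[2], code[3]
    # VEX.R is stored inverted: a set bit means the register index is NOT extended.
    reg_extension = 0 if vex & 0x80 else 8
    return {
        "vector_bits": 256 if vex & 0x04 else 128,                      # VEX.L
        "simd_prefix": {0: None, 1: 0x66, 2: 0xF3, 3: 0xF2}[vex & 3],   # VEX.pp
        "opcode": opcode,
        "mod": modrm >> 6,
        "reg": ((modrm >> 3) & 7) | reg_extension,
        "rm": modrm & 7,
    }


fields = decode_vex2(bytes.fromhex("c5fe6f26"))
print(fields)
```

Opcode 0x6F with the F3 prefix is the load form of MOVDQU, VEX.L selects 256-bit (ymm) registers, reg=4 is ymm4, and mod=0 with rm=6 is a plain memory operand addressed by RSI, which is consistent with "vmovdqu (%rsi),%ymm4".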
The questions are: *why* is memcpy receiving a null pointer as its source parameter, and *why* is Java allocating a huge array as destination for the memcpy from the null pointer?
(In reply to Cesar Eduardo Barros from comment #10)
> I'm also seeing these crashes, in both Maven and IDEA, but in
> java-1.8.0-openjdk-1.8.0.111-5.b16.fc24.x86_64, so it's still happening in
> the -5 version. Mine are in __memcpy_avx_unaligned:
>
> Stack: [0x00007f682c5fc000,0x00007f682c6fd000], sp=0x00007f682c6fa638, free space=1017k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> C [libc.so.6+0x14a734] __memcpy_avx_unaligned+0x2c4
> C [libsunec.so+0x1240]
> C [libsunec.so+0x13aa] Java_sun_security_ec_ECKeyPairGenerator_generateECKeyPair+0x12a
> j sun.security.ec.ECKeyPairGenerator.generateECKeyPair(I[B[B)[Ljava/lang/Object;+0
> j sun.security.ec.ECKeyPairGenerator.generateKeyPair()Ljava/security/KeyPair;+56
> j java.security.KeyPairGenerator$Delegate.generateKeyPair()Ljava/security/KeyPair;+23
> j sun.security.ssl.ECDHCrypt.<init>(Ljava/security/spec/ECParameterSpec;Ljava/security/SecureRandom;)V+17
> j sun.security.ssl.ClientHandshaker.serverKeyExchange(Lsun/security/ssl/HandshakeMessage$ECDH_ServerKeyExchange;)V+44
> [...]

This suggests you are hitting bug 1415137. You can verify if that is the case by downgrading nss* packages. If the problem goes away for you, you are hitting bug 1415137, not this one.

Actually, it's probably the same bug :) We should close one as a duplicate.

All right, I think I found something interesting. The most probable source for the failure we're seeing is at Java_sun_security_ec_ECKeyPairGenerator_generateECKeyPair *after* the EC_NewKey. The suspicious lines for me are the pair of calls to getEncodedBytes (defined just above it) which do a memcpy from a structure returned by EC_NewKey into a newly allocated Java array.
Now look at https://hg.mozilla.org/projects/nss/rev/047ab976840a which introduces Curve25519; it changes ec_NewKey to use ecParams->pointSize instead of a formula based on ecParams->fieldID.size, when allocating precisely one of the structures I found suspicious. The ecParams comes from the Java code, and where in the Java code is ecParams->pointSize set? Nowhere! If you look at EC_DecodeParams on the Java side, you see that it has code to set ecParams->fieldID.size, but no code to set ecParams->pointSize.

Therefore, my conclusion is that the fault is on the Java side, which is trying to fill the ecParams struct manually instead of calling an NSS function to fill it, and missing a (newly introduced) field. Then the pointSize field has garbage, so when it is large enough, NSS fails the allocation, resulting in both the null pointer and the large enough (garbage) value. When the value is by chance small enough, it doesn't return a null pointer and therefore won't crash (but it can result in the OutOfMemory exceptions I've been seeing in IntelliJ).

So, one possible solution to this particular issue would be to add a dependency on the latest NSS and set that field manually to whichever value it was supposed to have. There's no telling, however, which other crazy things the Java code was doing which would break with that change. Someone who understands both code bases should take a careful look.

*** This bug has been marked as a duplicate of bug 1415137 ***
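On the suggestion above of setting ecParams->pointSize manually: the thread does not say what value the field is supposed to hold. A plausible reading, and it is only an assumption here, is that it should be the length of an uncompressed EC point encoding, which is one format byte (0x04) followed by the X and Y coordinates, each as long as the field size rounded up to whole bytes. The helper name below is hypothetical and this is a sketch of that arithmetic, not the NSS or OpenJDK code.

```python
def uncompressed_point_size(field_bits: int) -> int:
    """Bytes in an uncompressed EC point: 1 format byte plus X and Y coordinates.

    Assumption: this is the value a field like ecParams->pointSize would be
    expected to hold for prime-field curves; the bug thread does not confirm it.
    """
    coordinate_bytes = (field_bits + 7) // 8  # round the field size up to whole bytes
    return 2 * coordinate_bytes + 1


# Field sizes (in bits) of the common NIST prime curves P-256, P-384 and P-521.
for bits in (256, 384, 521):
    print(bits, uncompressed_point_size(bits))
```

Under that assumption the legitimate values are tens of bytes, so a size in the hundreds of megabytes, like the one seen in RDX, would have to come from an uninitialized field.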