Bug 1368765
Summary: | hunspell has problems with input above the BMP, causes this: [abrt] ibus-typing-booster: SuggestMgr::leftcommonsubstring(char*, char const*)(): python3.5 killed by SIGSEGV
---|---
Product: | [Fedora] Fedora
Component: | hunspell
Version: | 24
Hardware: | x86_64
OS: | Unspecified
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
Reporter: | Mike FABIAN <mfabian>
Assignee: | Caolan McNamara <caolanm>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | anish.developer, caolanm, hopparz, i18n-bugs, mfabian, smaitra
Target Milestone: | ---
Target Release: | ---
URL: | https://retrace.fedoraproject.org/faf/reports/bthash/3391ac560b2bed2b86c62550527292850e9f5fc8
Whiteboard: | abrt_hash:ce549929093186e3290aca1694c819a2a6ba930e;
Fixed In Version: | hunspell-1.3.3-10.fc24
Doc Type: | If docs needed, set a value
Story Points: | ---
Last Closed: | 2016-09-01 16:53:11 UTC
Type: | ---
Regression: | ---
Mount Type: | ---
Documentation: | ---
Category: | ---
oVirt Team: | ---
Cloudforms Team: | ---
Description
Mike FABIAN
2016-08-21 09:09:17 UTC
Created attachment 1192545 [details]
File: backtrace
Created attachment 1192546 [details]
File: cgroup
Created attachment 1192547 [details]
File: core_backtrace
Created attachment 1192548 [details]
File: dso_list
Created attachment 1192549 [details]
File: environ
Created attachment 1192550 [details]
File: exploitable
Created attachment 1192551 [details]
File: limits
Created attachment 1192552 [details]
File: maps
Created attachment 1192553 [details]
File: mountinfo
Created attachment 1192554 [details]
File: namespaces
Created attachment 1192555 [details]
File: open_fds
Created attachment 1192556 [details]
File: proc_pid_status
Created attachment 1192557 [details]
File: var_log_messages
Created attachment 1192563 [details]
python-enchant-crash.py
ibus-typing-booster crashes because python3-enchant crashes:
$ python3 python-enchant-crash.py
['Budapest', 'Budapesti', 'Budapesté', 'Budapestű']
[b'Budapest', b'Budapesti', b'Budapest\xc3\xa9', b'Budapest\xc5\xb1']
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
Segmentation fault (core dumped)
mfabian@ari:~
$
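On a system that still has an affected hunspell build, a caller could avoid the crash by not passing such input to the spell checker at all. The following is only a sketch of that idea, not what ibus-typing-booster or python3-enchant actually does: `has_non_bmp` and `safe_suggest` are hypothetical helper names, and the `suggest` argument stands in for whatever suggestion function the caller uses (for example the `suggest` method of a pyenchant `Dict`, assuming pyenchant is installed).

```python
from typing import Callable, List


def has_non_bmp(text: str) -> bool:
    """Return True if any character lies above the BMP (code point > U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def safe_suggest(word: str, suggest: Callable[[str], List[str]]) -> List[str]:
    """Hypothetical guard: skip the spell checker for input above the BMP.

    `suggest` is whatever suggestion function the caller already has;
    it is only invoked for words that stay inside the BMP.
    """
    if has_non_bmp(word):
        return []
    return suggest(word)


def fake_suggest(word: str) -> List[str]:
    """Stand-in for a real spell checker, used only for this demonstration."""
    return ["Budapest"]


print(safe_suggest("Budapxst", fake_suggest))    # ['Budapest']
print(safe_suggest("\U0001F607", fake_suggest))  # []
```

Returning an empty list means the user simply gets no suggestions for such words, which is a small loss compared with a segmentation fault in the input method.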
Created attachment 1192564 [details]
hunspell-conversion-problem.txt
python3-enchant probably crashes because of this problem in hunspell:
$ hunspell -d hu_HU -i utf-8 -l hunspell-conversion-problem.txt
hBudapxst
This UTF-8 encoding can't convert to UTF-16:
😇
This UTF-8 encoding can't convert to UTF-16:
😇
This UTF-8 encoding can't convert to UTF-16:
𐳠
This UTF-8 encoding can't convert to UTF-16:
𐳠
mfabian@ari:~
$
Of course the file converts to UTF-16 just fine:
$ iconv -f utf-8 -t utf-16 < hunspell-conversion-problem.txt | iconv -f utf-16 -t utf-8
Budapxst
😇
𐳠
mfabian@ari:~
$
It looks like hunspell has problems with characters above the BMP (Basic Multilingual Plane).
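To illustrate what "above the BMP" means for the conversion, here is a small Python sketch. It shows that a character above U+FFFF needs two UTF-16 code units (a surrogate pair) instead of one, and it repeats the iconv round trip from above in Python. The helper names `utf16_code_units` and `round_trips_via_utf16` are made up for this example, and the emoji is assumed to be U+1F607, the one in hunspell-conversion-problem.txt.

```python
from typing import List


def utf16_code_units(ch: str) -> List[int]:
    """Return the UTF-16 code units (as integers) that encode the given text."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def round_trips_via_utf16(text: str) -> bool:
    """Mirror the iconv check: UTF-8 -> UTF-16 -> UTF-8 must be lossless."""
    utf8 = text.encode("utf-8")
    utf16 = utf8.decode("utf-8").encode("utf-16")
    return utf16.decode("utf-16").encode("utf-8") == utf8


halo = "\U0001F607"  # assumed code point of the emoji in the test file

print(hex(ord(halo)), ord(halo) > 0xFFFF)         # 0x1f607 True
print([hex(u) for u in utf16_code_units(halo)])   # ['0xd83d', '0xde07']
print([hex(u) for u in utf16_code_units("x")])    # ['0x78']
print(round_trips_via_utf16("Budapxst\n" + halo + "\n"))  # True
```

A converter that assumes every character fits into a single 16-bit unit cannot represent such input, which would be consistent with the "can't convert to UTF-16" messages shown above.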
It works on current rawhide with hunspell-1.4.1:
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ cat /etc/fedora-release
Fedora release 26 (Rawhide)
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ python3 python-enchant-crash.py
['Budapest', 'Budapesti', 'Budapesté', 'Budapestű']
[b'Budapest', b'Budapesti', b'Budapest\xc3\xa9', b'Budapest\xc5\xb1']
[]
[]
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ rpm -q hunspell
hunspell-1.4.1-1.fc25.x86_64
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ hunspell -d hu_HU -i utf-8 -l hunspell-conversion-problem.txt
Budapxst
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$
Can the fix be backported to f24?

Good to know the big rework of stuff in hunspell had a practical worthwhile effect. I'll have to bisect to find when it started working, to see what exactly the cause was and whether it's backportable in isolation.

hunspell-1.3.3-10.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-1a8b18ee44

hunspell-1.3.3-10.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-1a8b18ee44

hunspell-1.3.3-10.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.