
Bug 1368765

Summary: hunspell has problems with input above the BMP, causes this: [abrt] ibus-typing-booster: SuggestMgr::leftcommonsubstring(char*, char const*)(): python3.5 killed by SIGSEGV
Product: [Fedora] Fedora
Reporter: Mike FABIAN <mfabian>
Component: hunspell
Assignee: Caolan McNamara <caolanm>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 24
CC: anish.developer, caolanm, hopparz, i18n-bugs, mfabian, smaitra
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Unspecified
URL: https://retrace.fedoraproject.org/faf/reports/bthash/3391ac560b2bed2b86c62550527292850e9f5fc8
Whiteboard: abrt_hash:ce549929093186e3290aca1694c819a2a6ba930e;
Fixed In Version: hunspell-1.3.3-10.fc24
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-01 16:53:11 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
File: backtrace none
File: cgroup none
File: core_backtrace none
File: dso_list none
File: environ none
File: exploitable none
File: limits none
File: maps none
File: mountinfo none
File: namespaces none
File: open_fds none
File: proc_pid_status none
File: var_log_messages none
python-enchant-crash.py none
hunspell-conversion-problem.txt none

Description Mike FABIAN 2016-08-21 09:09:17 UTC
Description of problem:
I tried to use the hu-rovas-post.mim input method with the hu-HU engine of ibus-typing-booster, and the engine process (python3.5) was killed by SIGSEGV.

Version-Release number of selected component:
ibus-typing-booster-1.5.0-1.fc24

Additional info:
reporter:       libreport-2.7.2
backtrace_rating: 4
cmdline:        /usr/bin/python3 /usr/share/ibus-typing-booster/engine/main.py --profile --ibus
crash_function: SuggestMgr::leftcommonsubstring(char*, char const*)
executable:     /usr/bin/python3.5
global_pid:     9190
kernel:         4.6.4-301.fc24.x86_64
pkg_fingerprint: 73BD E983 81B4 6521
pkg_vendor:     Fedora Project
runlevel:       N 5
type:           CCpp
uid:            10030

Truncated backtrace:
Thread no. 0 (10 frames)
 #0 SuggestMgr::leftcommonsubstring(char*, char const*) at suggestmgr.cxx:1859
 #1 SuggestMgr::ngsuggest(char**, char*, int, HashMgr**, int) at suggestmgr.cxx:1106
 #2 Hunspell::suggest(char***, char const*) at hunspell.cxx:889
 #3 MySpellChecker::suggestWord(char const*, unsigned long, unsigned long*) at myspell_checker.cpp:197
 #4 enchant_dict_suggest at enchant.c:943
 #5 ffi_call_unix64 at ../src/x86/unix64.S:76
 #6 ffi_call at ../src/x86/ffi64.c:525
 #7 _ctypes_callproc at /usr/src/debug/Python-3.5.1/Modules/_ctypes/callproc.c:811
 #9 PyCFuncPtr_call at /usr/src/debug/Python-3.5.1/Modules/_ctypes/_ctypes.c:3869
 #10 PyObject_Call at /usr/src/debug/Python-3.5.1/Objects/abstract.c:2165

Comment 1 Mike FABIAN 2016-08-21 09:09:23 UTC
Created attachment 1192545 [details]
File: backtrace

Comment 2 Mike FABIAN 2016-08-21 09:09:25 UTC
Created attachment 1192546 [details]
File: cgroup

Comment 3 Mike FABIAN 2016-08-21 09:09:27 UTC
Created attachment 1192547 [details]
File: core_backtrace

Comment 4 Mike FABIAN 2016-08-21 09:09:28 UTC
Created attachment 1192548 [details]
File: dso_list

Comment 5 Mike FABIAN 2016-08-21 09:09:30 UTC
Created attachment 1192549 [details]
File: environ

Comment 6 Mike FABIAN 2016-08-21 09:09:31 UTC
Created attachment 1192550 [details]
File: exploitable

Comment 7 Mike FABIAN 2016-08-21 09:09:33 UTC
Created attachment 1192551 [details]
File: limits

Comment 8 Mike FABIAN 2016-08-21 09:09:35 UTC
Created attachment 1192552 [details]
File: maps

Comment 9 Mike FABIAN 2016-08-21 09:09:36 UTC
Created attachment 1192553 [details]
File: mountinfo

Comment 10 Mike FABIAN 2016-08-21 09:09:38 UTC
Created attachment 1192554 [details]
File: namespaces

Comment 11 Mike FABIAN 2016-08-21 09:09:40 UTC
Created attachment 1192555 [details]
File: open_fds

Comment 12 Mike FABIAN 2016-08-21 09:09:41 UTC
Created attachment 1192556 [details]
File: proc_pid_status

Comment 13 Mike FABIAN 2016-08-21 09:09:43 UTC
Created attachment 1192557 [details]
File: var_log_messages

Comment 14 Mike FABIAN 2016-08-21 09:56:04 UTC
Created attachment 1192563 [details]
python-enchant-crash.py

ibus-typing-booster crashes because python3-enchant crashes:

$ python3 python-enchant-crash.py 
['Budapest', 'Budapesti', 'Budapesté', 'Budapestű']
[b'Budapest', b'Budapesti', b'Budapest\xc3\xa9', b'Budapest\xc5\xb1']
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
This UTF-8 encoding can't convert to UTF-16:
𐲂𐳪𐳇𐳀𐳠𐳉𐳤𐳦
Segmentation fault (core dumped)
mfabian@ari:~
$
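
Until a fixed hunspell is installed, a caller could avoid this crash path by not asking for suggestions when the input contains characters above the BMP. The following is only a sketch of that idea, not code from ibus-typing-booster or python-enchant; the names contains_non_bmp and safe_suggest are hypothetical, and `suggest` stands for any callable that takes a word and returns a list (for example the suggest method of an enchant dictionary object).

```python
def contains_non_bmp(text):
    """Return True if any character lies above U+FFFF (outside the BMP)."""
    return any(ord(ch) > 0xFFFF for ch in text)


def safe_suggest(suggest, word):
    """Call suggest(word) only for BMP-only input.

    `suggest` is an assumption: any callable taking a string and
    returning a list of suggestions. For input above the BMP the
    spell checker is not called at all and an empty list is returned.
    """
    if contains_non_bmp(word):
        return []
    return suggest(word)
```

This trades suggestions for non-BMP words against not crashing, so it is a stopgap rather than a fix.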

Comment 15 Mike FABIAN 2016-08-21 10:00:52 UTC
Created attachment 1192564 [details]
hunspell-conversion-problem.txt

python3-enchant probably crashes because of this problem in hunspell:

$ hunspell -d hu_HU -i utf-8 -l hunspell-conversion-problem.txt 
hBudapxst
This UTF-8 encoding can't convert to UTF-16:
😇
This UTF-8 encoding can't convert to UTF-16:
😇
This UTF-8 encoding can't convert to UTF-16:
𐳠
This UTF-8 encoding can't convert to UTF-16:
𐳠
mfabian@ari:~
$ 

Of course the file converts to UTF-16 just fine:

$ iconv -f utf-8 -t utf-16 < hunspell-conversion-problem.txt | iconv -f utf-16 -t utf-8 
Budapxst
😇
𐳠
mfabian@ari:~
$ 

It looks like hunspell has problems with characters above the BMP (Basic Multilingual Plane).
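
For reference, a minimal Python sketch of what "above the BMP" means for a UTF-8 to UTF-16 conversion. This is not hunspell's code; the helper names utf16_code_units and round_trip are made up for illustration. Code points above U+FFFF need a surrogate pair (two 16-bit units) in UTF-16, so a converter that assumes one 16-bit unit per character cannot represent them.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units (as ints) for one character."""
    cp = ord(ch)
    if cp <= 0xFFFF:
        # Inside the BMP: one 16-bit unit is enough.
        return [cp]
    # Above the BMP: split the 20 remaining bits into a surrogate pair.
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


def round_trip(text):
    """UTF-8 -> UTF-16 -> UTF-8, mirroring the iconv pipeline above."""
    utf8 = text.encode("utf-8")
    utf16 = utf8.decode("utf-8").encode("utf-16")
    return utf16.decode("utf-16").encode("utf-8")


if __name__ == "__main__":
    for ch in ["B", "\u0171", "\U0001F607"]:
        units = [hex(u) for u in utf16_code_units(ch)]
        print("U+%04X" % ord(ch), len(ch.encode("utf-8")), "UTF-8 bytes", units)
```

Characters such as 😇 (U+1F607) take four bytes in UTF-8 and two code units in UTF-16, which is the case the error messages above complain about.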

Comment 16 Mike FABIAN 2016-08-21 11:46:31 UTC
It works on current rawhide with hunspell-1.4.1:

[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ cat /etc/fedora-release 
Fedora release 26 (Rawhide)
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ python3 python-enchant-crash.py 
['Budapest', 'Budapesti', 'Budapesté', 'Budapestű']
[b'Budapest', b'Budapesti', b'Budapest\xc3\xa9', b'Budapest\xc5\xb1']
[]
[]
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ rpm -q hunspell
hunspell-1.4.1-1.fc25.x86_64
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$ hunspell -d hu_HU -i utf-8 -l hunspell-conversion-problem.txt 
Budapxst
[mfabian@Fedora-Workstation-netinst-x86_6 ~]$

Comment 17 Mike FABIAN 2016-08-21 11:47:01 UTC
Can the fix be backported to f24?

Comment 18 Caolan McNamara 2016-08-22 16:37:15 UTC
Good to know the big rework in hunspell had a practical, worthwhile effect. I'll have to bisect to find when it started working, to see what exactly the cause was and whether the fix is backportable in isolation.

Comment 19 Fedora Update System 2016-08-29 11:47:24 UTC
hunspell-1.3.3-10.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-1a8b18ee44

Comment 20 Fedora Update System 2016-08-29 22:52:32 UTC
hunspell-1.3.3-10.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-1a8b18ee44

Comment 21 Fedora Update System 2016-09-01 16:53:06 UTC
hunspell-1.3.3-10.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.