Bug 74701
Summary: | Locale rules are inconsistent between X and non-X environments | |
---|---|---|---
Product: | [Retired] Red Hat Linux | Reporter: | Ed Halley <ed>
Component: | bash | Assignee: | Tim Waugh <twaugh>
Status: | CLOSED ERRATA | QA Contact: | David Lawrence <dkl>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 8.0 | CC: | ali, chris.ricker, drepper, jakub, menthos, mitr, twaugh
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | athlon | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2002-10-17 07:45:02 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Ed Halley
2002-10-01 00:25:15 UTC
I am able to replicate this problem. Comparing systems with the LOCALE set to POSIX/C in /etc/sysconfig/i18n and systems that have the default, the only significant difference is that the trace shows hits to /usr/X11R6/lib/X11/locale/en_US.UTF-8/ .. The bug report, again, isn't a LC_COLLATE complaint. It's just that en_US.UTF-8 seems to behave differently under X vs. console. What drives you insane is that once the gnome-terminal or what-not was launched, 'export LANG="POSIX"' or C did ~nothing~ to the above. Completely unexpected. So the only way to get it to work properly is to get the env set before X starts. From memory I'm thinking that X is behaving properly for the given locale, except I don't understand why the 'export' does nothing to change it. So this bug might be shared between X and GLIBC.

Cheers, -Ali

Any differences in output of 'locale' between X and text console?

Nope, no differences at all. And setting them once gnome-terminal was already launched did nothing to help this problem. The guess is that the X locale info is ~different~ than the GLIBC stuff that would be used otherwise. It was really bizarre and Halley and I might have to demo it for Mharris at some point. It took a while to characterize it ~consistently~.

-Ali

Note that I've never had any luck at all getting 'locale' to produce anything useful. I'd advise ignoring it. I suspect something is setting LC_COLLATE or LC_CTYPE differently in the two circumstances.

That is where I think the X locale directory specified above is futzing with things. A difference between the way X and GLIBC handle the same locale..

Cheers, -Ali

Ah, I know what happens, nothing to do with X, but a genuine bash bug. [ Ed, would you mind reassigning? ] The problem is that bash uses the LANG/LC_* variable values from the time it starts. When it is started from gnome-terminal, it gets LANG=en_US from its environment as everything else, and works as expected.
When started on login in console, it *doesn't*: LANG is set *inside* the shell and bash doesn't reflect it. To verify it, run a subshell in console (i.e. type 'bash') and in this subshell (started with LANG=en_US), things behave as in X. This is a bash bug: POSIX 1003.1-2002 (Shell and Utilities, Issue 6) requires in section 2.5.3 Shell Variables that "The following [shell, not environment] variables shall affect the execution of the shell" ... LANG, LC_*, ... - i.e. changing the value *inside the shell* is required to affect it.

I see what you're saying but as I noted above, even once in the shell (under gnome-terminal) doing an 'export LANG=<foo>' wouldn't change the behavior either.. that isn't right. And that only seemed to be broken under gnome-terminal and I was ~guessing~ based on traces where it inherited that (as opposed to traces under konsole). More noted above. Hrmm. I think I'll wait for Mharris to comment. He mentioned some things to me earlier today that he thought might be related. Then again, perhaps I don't quite grasp the vastness of your comment. I'm Forrest-Gumpish at times. ;-)

Cheers, -Ali

Ali, the whole point is that bash ignores changes of LC_* and LANG which are done inside it. In konsole:

touch a A b B c C
LANG=POSIX bash
ls [a-c]
.... a b c
export LANG=cs_CZ
ls [a-c]
... a b c
exit
LANG=cs_CZ bash
ls [a-c]
... a B b C c
export LANG=POSIX
ls [a-c]
... a B b C c
exit

See? LANG=cs_CZ bash shows one behavior, LANG=POSIX bash shows the other, even if *in* both of them, LANG is then set to the same value (after the shell is exec()ed, changing LANG has no effect). And gnome-terminal runs it with LANG=en_US, but getty/login with LANG=C (or unset).

Ah. I see what you meant now, didn't understand it before. So Ed, remember when I was 'LANG=<foo> gnome-terminal' and that all worked? Ok. Hrmm. Something still bothers me here. Switching from UTF to Latin fixes the display problems in programs like 'man' and 'screen' (UNICODE growing pains).
And those programs certainly noticed the change. So ~BASH~ is ignoring it.

SOB, you know why I didn't get your behavior before? Your POSIX comment should've jogged my memory. I thought I saw some anomaly while testing with Ed, but YOU certainly grasped it, I didn't. Watch:

[ali@damascus mitr]$ LANG="en_US.UTF-8" bash --posix
bash-2.05b$ ls
a A b B c C d D
bash-2.05b$ ls [a-z]*
a A b B c C d D
bash-2.05b$ export LANG="POSIX"
bash-2.05b$ ls [a-z]*
a b c d

Note that launching bash as 'sh' defaults to this behavior, as I understand it. Hrmm. Now I don't believe there is a difference between POSIX 1003.2 (from man page vs. classic (mine above)) but I haven't found a firm answer to that yet. Mitr or RH might have that stuff/difference handy. Talk about bloody confusing. I'm never submitting a Bugzilla report again. :-/

Cheers, -Ali

This seems to be a valid bug IMHO. After tracing things a bit, I believe it might be a bash bug. I'm CC'ing the bash maintainer, and Jakub and Uli for consensus/advice/info.

All these discussions do not provide all the needed information. In each and every situation mentioned, also run 'locale' and post the output. This will show what programs should see.

In each usage case above, #4 and #7 as en_US.UTF-8, and #4 and #7 as POSIX (or C), the output from locale was unanimously as expected. All locale variables report the en_US.UTF-8 or POSIX or C setting.
First run of #4 and #7:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Second run of #4 and #7:

$ locale
LANG=POSIX
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

mitr seems to have hit the nail on the head here. The quote is quite clear that LANG et al should behave in a manner similar to HOME. c.f.:

HOME=foo
echo ~

I know that there is already special code in bash which traces certain envvars and directly uses their values. HOME is one example, but there is also LANG. At least some of the LC_* variables are missing, although I'm sure that some were handled in the patch. In any case, quite easy to locate and probably fix.

OK, here is a patch. It fixes the problem for me, but it changes most of the locale variable handling (which was originally really strange IMHO, LANG would override LC_*, assuming that libc sees bash variables), and we already know these issues are quite brittle. Something for the Red Hat QA team to spend a day or two on :-( All I can say is I'm currently running the patched bash and the system seems to shut down, boot and run cleanly. The original code ignores (sort of) most L* variable changes, but LC_ALL is honored. So a workaround for those who don't need to set individual LC_* values seems to be to use LC_ALL instead of LANG in /etc/sysconfig/i18n. As a second note, either bash needs a BuildPrereq on texinfo, or the implicit build requirements should be documented somewhere ;-)

Created attachment 79724
Patch hopefully fixing the bash L* variable weirdness
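
For anyone wanting to try the LC_ALL workaround mentioned above, here is a minimal sketch of the same glob test used earlier in this report, run in a scratch directory. The function name glob_after_lc_all is made up for illustration, and the script assumes mktemp is available; it is not taken from the attached patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of bash or of the patch): assign LC_ALL
# *inside* the running shell, then show which files the range [a-c] matches.
# The body runs in a subshell so the caller's locale settings are untouched.
glob_after_lc_all() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch a A b B c C
    LC_ALL=$1          # changed after the shell has started, as in the report
    export LC_ALL
    echo [a-c]         # pathname expansion is done by the shell itself
    cd / && rm -rf "$dir"
)

glob_after_lc_all C    # in the C locale this prints: a b c
```

Passing another locale name (cs_CZ, for instance) only makes sense if that locale is installed, and the result then depends on its collation rules.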
mitr: You beat me to reporting it upstream.. ;-)

Two more comments:

a) From discussion with the bash maintainer: The original intent was that setlocale (LC_*, "") works, because bash "overrides" getenv (). This was AFAIK never guaranteed to work, and breaks with glibc 2.3, which always calls internal getenv (this means PLT reduction and smaller run-time-linking overhead).

b) In case the QA team does not notice: this means that rc.sysinit has LANG=cs_CZ.UTF-8 (or whatever) set from the start, and all messages translated into Czech are printed translated, but the console is set to UTF-8 mode rather late, which means that most of the messages have "random" pairs of characters instead of non-ASCII characters. This is quite ugly. I would like to humbly propose it's time to fix bug #30469 ;-)

This should be fixed in bash-2.05b-6.

*** Bug 77115 has been marked as a duplicate of this bug. ***

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2003-140.html
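
Comment (a) above also fits with what was seen earlier in this report: 'locale', 'man' and 'screen' noticed the new LANG while bash's own globbing did not. Those are child processes and receive the exported variables in their environment, whereas the shell's own locale state is only updated if the shell itself acts on the assignment. A rough sketch of the child-process half, with a made-up function name (lang_seen_by_child is purely illustrative):

```shell
#!/bin/sh
# Hypothetical helper: report the LANG value that a child process receives
# after LANG is changed and exported inside the running shell.
# The body runs in a subshell so the caller's LANG is left alone.
lang_seen_by_child() (
    LANG=$1
    export LANG
    # The child reads its own environment; it does not depend on whatever
    # locale state the parent shell set up when it started.
    sh -c 'printf "%s\n" "$LANG"'
)

lang_seen_by_child POSIX    # prints: POSIX
```

This only shows that the exported value reaches children, which is why the 'locale' output looked right in every case; it says nothing about whether the shell itself honors the change.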