2003-07-22 12:23:24

by Jan Kasprzak

Subject: [Patch] Non-ASCII chars in visor.c messages

Hello,

what is the general opinion on printing non-ASCII characters in kernel
messages? I think the kernel should print either pure ASCII messages or,
at the very least, UTF-8-encoded ones.

The visor.c module contains three messages
with a non-ASCII character ("e" with an acute accent, encoded in
ISO 8859-1, in the name of the "Sony Clie'" handheld). I propose the attached
patch, which works in all environments (although a UTF-8 variant would be
fine as well, IMHO).

What do you think about it?

-Yenya

--- linux-2.6.0-test1/drivers/usb/serial/visor.c.orig 2003-07-22 14:30:18.835081416 +0200
+++ linux-2.6.0-test1/drivers/usb/serial/visor.c 2003-07-22 14:30:41.597620984 +0200
@@ -169,7 +169,7 @@
*/
#define DRIVER_VERSION "v2.1"
#define DRIVER_AUTHOR "Greg Kroah-Hartman <[email protected]>"
-#define DRIVER_DESC "USB HandSpring Visor, Palm m50x, Sony Clié driver"
+#define DRIVER_DESC "USB HandSpring Visor, Palm m50x, Sony Clie driver"

/* function prototypes for a handspring visor */
static int visor_open (struct usb_serial_port *port, struct file *filp);
@@ -275,7 +275,7 @@
/* All of the device info needed for the Handspring Visor, and Palm 4.0 devices */
static struct usb_serial_device_type handspring_device = {
.owner = THIS_MODULE,
- .name = "Handspring Visor / Treo / Palm 4.0 / Clié 4.x",
+ .name = "Handspring Visor / Treo / Palm 4.0 / Clie 4.x",
.short_name = "visor",
.id_table = id_table,
.num_interrupt_in = NUM_DONT_CARE,
@@ -303,7 +303,7 @@
/* device info for the Sony Clie OS version 3.5 */
static struct usb_serial_device_type clie_3_5_device = {
.owner = THIS_MODULE,
- .name = "Sony Clié 3.5",
+ .name = "Sony Clie 3.5",
.short_name = "clie_3.5",
.id_table = clie_id_3_5_table,
.num_interrupt_in = 0,

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
|__ If you want "aesthetics", go play with microkernels. -Linus Torvalds __|


2003-07-22 12:35:53

by Greg KH

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

On Tue, Jul 22, 2003 at 02:38:21PM +0200, Jan Kasprzak wrote:
> Hello,
>
> what is the general opinion on printing non-ASCII characters in kernel
> messages? I think the kernel should print either pure ASCII messages or,
> at the very least, UTF-8-encoded ones.

"pure ASCII"? Heh, that's the first time I've heard that.

> The visor.c module contains three messages
> with a non-ASCII character ("e" with an acute accent, encoded in
> ISO 8859-1, in the name of the "Sony Clie'" handheld). I propose the attached
> patch, which works in all environments (although a UTF-8 variant would be
> fine as well, IMHO).
>
> What do you think about it?

I don't think it's really needed. Why change this, syslog can't handle
this? It works for me...

thanks,

greg k-h

2003-07-22 12:54:43

by Jan Kasprzak

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

Greg KH wrote:
: >
: > What do you think about it?
:
: I don't think it's really needed. Why change this, syslog can't handle
: this? It works for me...
:
Yes, syslog can handle this, but in order to parse syslog files
you need to have your LC_CTYPE set to something Latin-1 compatible
(which UTF-8 is not, even though UTF-8 is the default on many distros).

Why Latin-1 and not UTF-8? I think UTF-8 is more "correct", while
ASCII is "works for all". Latin-1 is neither "correct" nor "works for all".
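
To make the point above concrete, here is a minimal sketch in C of why a
lone Latin-1 byte breaks a reader that expects UTF-8. It is an illustration
only, not part of the patch; the function name is_valid_utf8 is hypothetical,
and the checks follow the usual UTF-8 well-formedness rules (lead byte,
continuation bytes, no overlong forms, no surrogates).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the n bytes at s are well-formed UTF-8, 0 otherwise.
 * Hypothetical helper for illustration; not taken from the kernel.
 */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;

    assert(s != NULL || n == 0);

    while (i < n) {
        unsigned char c = s[i];
        unsigned long cp, min;
        size_t extra, k;

        if (c < 0x80) {                 /* plain 7-bit ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) { /* 110xxxxx: 2-byte sequence */
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) { /* 1110xxxx: 3-byte sequence */
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) { /* 11110xxx: 4-byte sequence */
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or bad lead byte */
        }

        if (extra > n - i - 1)
            return 0;                   /* sequence truncated at end of input */

        for (k = 1; k <= extra; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a 10xxxxxx continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        i += extra + 1;
    }
    return 1;
}
```

Called on the bytes of "Sony Cli\351 3.5" (the Latin-1 form), this returns 0,
because 0xE9 looks like the start of a three-byte sequence but is followed by
a space. The UTF-8 form "Sony Cli\303\251 3.5" and the ASCII form
"Sony Clie 3.5" both return 1.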

Thanks,

-Yenya


2003-07-22 13:03:55

by Greg KH

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

On Tue, Jul 22, 2003 at 03:09:42PM +0200, Jan Kasprzak wrote:
> Greg KH wrote:
> : >
> : > What do you think about it?
> :
> : I don't think it's really needed. Why change this, syslog can't handle
> : this? It works for me...
> :
> Yes, syslog can handle this, but in order to parse syslog files
> you need to have your LC_CTYPE set to something Latin-1 compatible
> (which UTF-8 is not, even though UTF-8 is the default on many distros).
>
> Why Latin-1 and not UTF-8? I think UTF-8 is more "correct", while
> ASCII is "works for all". Latin-1 is neither "correct" nor "works for all".

So how do you encode that character in UTF-8?

If we are going to print device names, I want to be correct in their
usage...

thanks,

greg k-h

2003-07-22 13:50:02

by Jan Kasprzak

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

Greg KH wrote:
: > Why Latin-1 and not UTF-8? I think UTF-8 is more "correct", while
: > ASCII is "works for all". Latin-1 is neither "correct" nor "works for all".
:
: So how do you encode that character in UTF-8?
:
: If we are going to print device names, I want to be correct in their
: usage...

It is \303\251 in octal (0xc3 0xa9 in hex).
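
For reference, those two bytes follow directly from the UTF-8 rules applied
to U+00E9. The sketch below is an illustration in C, not part of the patch;
the names utf8_encode and driver_desc_utf8 are hypothetical. It also shows
how a UTF-8 variant of one of the strings could be written with octal escapes
so that the source file itself stays pure ASCII.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp cannot be encoded.
 * Hypothetical helper for illustration only.
 */
size_t utf8_encode(unsigned long cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                        /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                       /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                     /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                       /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                   /* 11110xxx plus three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                               /* beyond the Unicode range */
}

/* A UTF-8 variant of the description string, kept ASCII-clean in the source
 * by spelling the two bytes as octal escapes (hypothetical name). */
const char driver_desc_utf8[] =
    "USB HandSpring Visor, Palm m50x, Sony Cli\303\251 driver";
```

For U+00E9, utf8_encode writes two bytes: 0xC3 (octal 303), then 0xA9
(octal 251).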

-Yenya


2003-07-22 14:07:21

by Alan

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

On Tue, 2003-07-22 at 13:50, Greg KH wrote:
> > The visor.c module contains three messages
> > with a non-ASCII character ("e" with an acute accent, encoded in
> > ISO 8859-1, in the name of the "Sony Clie'" handheld). I propose the attached
> > patch, which works in all environments (although a UTF-8 variant would be
> > fine as well, IMHO).
> >
> > What do you think about it?
>
> I don't think it's really needed. Why change this, syslog can't handle
> this? It works for me...

Current syslog has problems handling it, and these problems are a lot worse
than they appear. Since the file system encoding for file names is UTF-8,
the syslog daemon is sometimes logging kernel file path objects that are
in Unicode UTF-8 format.

The corrupted high-bit characters in the C files (as well as being iffy
C) cause problems we just don't need.

It doesn't really matter whether we pick UTF-8 (which does mean things like
names can be handled OK) or the plain 7-bit ASCII C locale, but we need to
pick something.
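
If the choice were plain 7-bit ASCII, spotting offending strings is a simple
high-bit test. The following is a minimal sketch in C, assuming
NUL-terminated message strings; the function name first_non_ascii is
hypothetical and not an existing kernel helper.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the offset of the first byte with the high bit set,
 * or -1 if the NUL-terminated string is pure 7-bit ASCII.
 * Hypothetical helper for illustration only.
 */
long first_non_ascii(const char *s)
{
    size_t i;

    assert(s != NULL);

    for (i = 0; s[i] != '\0'; i++) {
        /* Cast first: plain char may be signed on some platforms. */
        if ((unsigned char)s[i] & 0x80)
            return (long)i;
    }
    return -1;
}
```

For the patched string "Sony Clie 3.5" this returns -1, while for the
Latin-1 original "Sony Cli\351 3.5" it returns 8, the offset of the
accented character.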

2003-07-23 11:04:48

by Pavel Machek

Subject: Re: [Patch] Non-ASCII chars in visor.c messages

Hi!

> > What do you think about it?
>
> I don't think it's really needed. Why change this, syslog can't handle
> this? It works for me...
>

It would not work here. Make it US-ASCII.

--
Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...