2003-05-31 13:57:27

by

[permalink] [raw]
Subject: [PATCH][2.5] UTF-8 support in console

Although the keyboard (keyboard.c) and terminal (vt.c) both have "UTF-8
modes", there are still a couple of short-comings at the kernel level.

* compose tables use 8-bit characters,
* selection doesn't copy/paste UTF-8.

(OK, so there are many more, but these two are the ones that make
it inconvenient to even use extended Latin characters.)

Here is a patch against 2.5.70 that fixes selection. It uses the
existing Unicode font information (ushorts) to create an inverse mapping.

Chris


Attachments:
selection.patch (6.33 kB)

2003-05-31 14:08:15

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH][2.5] UTF-8 support in console


+static void set_inverse_trans_unicode(struct vc_data *conp, struct uni_pagedir *p)

Please linewrap after 80 chars.

+{
+ int i, j, k, glyph;
+ u16 **p1, *p2;
+ u16 *q;
+
+ if (!p) return;

Please split this into two lines. Can p ever be null_

+ q = p->inverse_trans_unicode;
+
+ if (!q) {

Kill the blank line above.

+ q = p->inverse_trans_unicode = (u16 *)
+ kmalloc(MAX_GLYPH * sizeof(u16), GFP_KERNEL);

The cast is not needed. And btw, where is q freed?

+ if (!q) return;

Two lines again.

2003-05-31 14:30:10

by Larry McVoy

[permalink] [raw]
Subject: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

> Please linewrap after 80 chars.

Amen to that.

> + if (!q) {
>
> Kill the blank line above.
>
> + if (!q) return;
>
> Two lines again.

A couple of comments: in the BK source tree, we've diverged from the Linux
coding style a bit (maybe a lot, Linus has read the source, ask him).

One thing is

unless (p) {
....
}
instead of
if (!p) {
....
}

It's just a
#define unless(x) if (!(x))
but it makes some code read quite a bit easier. I'm a stickler for not using
2 lines where one will do, i.e.,

FILE *f;

...
unless (f = fopen(file, "r")) {
error handling;
return (-1);
}

You hiccup the first time you see it, then you can read it, then you
start using it. Yeah, I know, I'm using the value of an assignment in
a conditional, trust me, it works fine.

One other one is the

if (!q) return;

Chris said two lines, we don't do it that way. The coding style we use is
a) one line is fine for a single statement.
b) in all other cases there are curly braces

unless (q) return; /* OK */
unless (q) { /* also OK */
return;
}
unless (q)
return; /* not OK, no "}" */


The point of this style is twofold: save a line when the thing you are
doing is a singe statement, and make it easier for your eyes (or my
tired old eyes) to run over the code. If you see indentation you know
it is a block and there will be a closing } without exception.

It keeps the line counts about 10% smaller or so in our source base.
If you are looking for bragging rights about how big your stuff is that
might be bad but I like it because I can read more code in a window.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-05-31 14:45:43

by Dave Jones

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

On Sat, May 31, 2003 at 07:43:23AM -0700, Larry McVoy wrote:

> One other one is the
>
> if (!q) return;
>
> Chris said two lines, we don't do it that way. The coding style we use is
> a) one line is fine for a single statement.
> b) in all other cases there are curly braces

Saving a line over readability is utterly bogus.
Just look at some of the crap we have in devfs..

if (fs_info->devfsd_task == NULL) return (TRUE);
if (devfsd_queue_empty (fs_info) && fs_info->devfsd_sleeping) return TRUE;
if ( is_devfsd_or_child (fs_info) ) return (FALSE);
set_current_state (TASK_UNINTERRUPTIBLE);
add_wait_queue (&fs_info->revalidate_wait_queue, &wait);
if (!devfsd_queue_empty (fs_info) || !fs_info->devfsd_sleeping)
if (fs_info->devfsd_task) schedule ();
remove_wait_queue (&fs_info->revalidate_wait_queue, &wait);
__set_current_state (TASK_RUNNING);
return (TRUE);

*horror* to my eyes at least.

Parts of the DRI code use similar uglies. Whitespace is a *good* thing.
If you want more lines of code per screen, get a larger xterm, change a
font, whatever, but don't decrease code readability for something so bogus.

Dave

2003-05-31 15:26:27

by Larry McVoy

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

On Sat, May 31, 2003 at 04:01:50PM +0100, Dave Jones wrote:
> Saving a line over readability is utterly bogus.

I agree 100%. If you have anything more complex than

if (error) return (error);

I want it to look like

if ((expr) || (expr2) || (expr3)) {
return (error);
}

> Just look at some of the crap we have in devfs..

No kidding, look at the nested if, that's insane.

> if (fs_info->devfsd_task == NULL) return (TRUE);
> if (devfsd_queue_empty (fs_info) && fs_info->devfsd_sleeping) return TRUE;
> if ( is_devfsd_or_child (fs_info) ) return (FALSE);
> set_current_state (TASK_UNINTERRUPTIBLE);
> add_wait_queue (&fs_info->revalidate_wait_queue, &wait);
> if (!devfsd_queue_empty (fs_info) || !fs_info->devfsd_sleeping)
> if (fs_info->devfsd_task) schedule ();
> remove_wait_queue (&fs_info->revalidate_wait_queue, &wait);
> __set_current_state (TASK_RUNNING);
> return (TRUE);

I took a pass at this, I think this is better (note the use of 1/2 tabs
as "continuation" lines, that's a Sun thing and it works pretty well:

if ((fs_info->devfsd_task == NULL) ||
(devfsd_queue_empty(fs_info) && fs_info->devfsd_sleeping)) {
return (TRUE);
}
if (is_devfsd_or_child(fs_info)) return (FALSE);
set_current_state (TASK_UNINTERRUPTIBLE);
add_wait_queue (&fs_info->revalidate_wait_queue, &wait);
if ((!devfsd_queue_empty (fs_info) || !fs_info->devfsd_sleeping) &&
fs_info->devfsd_task) {
schedule();
}
remove_wait_queue(&fs_info->revalidate_wait_queue, &wait);
__set_current_state(TASK_RUNNING);
return (TRUE);

--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

2003-05-31 15:47:11

by John Bradford

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

> > Saving a line over readability is utterly bogus.

> I agree 100%. If you have anything more complex than
>
> if (error) return (error);
>
> I want it to look like
>
> if ((expr) || (expr2) || (expr3)) {
> return (error);
> }

Ergh, personally I hate reading code like that.

Having said that, I hate code that is indented. Am I the only person who mentally
counts, "in, in, in, out", etc, when reading code? Artificial indenting really
throws off my concentration, because the 'prompting' it provides does nothing to
help me, but the ragged left margin is irritating, and makes moving code around
awkward, because you have to correct the indenting to match the current, (actual),
nesting level of the code, (because indenting is a completely artificial concept).

I'm not trying to say my coding style is right, and yours is wrong, I'm just curious
:-).

John.

2003-05-31 17:00:55

by Steven Cole

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

On Sat, 2003-05-31 at 09:39, Larry McVoy wrote:
> On Sat, May 31, 2003 at 04:01:50PM +0100, Dave Jones wrote:
> > Saving a line over readability is utterly bogus.
>
> I agree 100%. If you have anything more complex than
>
> if (error) return (error);
>
> I want it to look like
>
> if ((expr) || (expr2) || (expr3)) {
> return (error);
> }
>
This may just be pedantic minutiae, but aren't those parenthesis around
"error" unnecessary?

Here is a proposal for coding style: Only use parenthesis in the return
statement when needed.

return -ETOSENDERADDRESSUNKNOWN; /* this is OK */
return (value & ZORRO_MASK); /* so is this */
return (-ENOTENOUGHCOFFEE); /* bogus parenthesis */

Steven

2003-05-31 17:43:13

by Al Viro

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

On Sat, May 31, 2003 at 11:14:08AM -0600, Steven Cole wrote:

> statement when needed.
>
> return -ETOSENDERADDRESSUNKNOWN; /* this is OK */
> return (value & ZORRO_MASK); /* so is this */

Like hell it is. Parenthesis are _not_ needed here - production is
<statement> -> return <expression> ;

The only messy '('-related case in C grammar is sizeof as unary operation
vs. sizeof ( <type> ) (lovely way to torture parsers and students on exam:
sizeof (int)*p). Everything else is pretty straightforward...

2003-05-31 19:24:21

by

[permalink] [raw]
Subject: Re: [PATCH][2.5] UTF-8 support in console

Here's the patch again, with suggested changes.


> + if (!p) return;
>
> Please split this into two lines. Can p ever be null_

p could never be null, as it turns out. So I removed this line. :-)


> + q = p->inverse_trans_unicode = (u16 *)
> + kmalloc(MAX_GLYPH * sizeof(u16), GFP_KERNEL);
>
> The cast is not needed. And btw, where is q freed?

It is freed. Grep for kfree.*inverse_trans_unicode.

Chris


Attachments:
selection.patch (6.33 kB)

2003-06-01 04:28:39

by Matt Mackall

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

On Sat, May 31, 2003 at 05:06:21PM +0100, [email protected] wrote:

> Having said that, I hate code that is indented. Am I the only person
> who mentally counts, "in, in, in, out", etc, when reading code?

Yes.
--
Matt Mackall : http://www.selenic.com : of or relating to the moon

2003-06-01 05:05:02

by Randy.Dunlap

[permalink] [raw]
Subject: Re: coding style (was Re: [PATCH][2.5] UTF-8 support in console)

> On Sat, May 31, 2003 at 05:06:21PM +0100, [email protected] wrote:
>
>> Having said that, I hate code that is indented. Am I the only person who
>> mentally counts, "in, in, in, out", etc, when reading code?
>
> Yes.

I agree. That's a "last resort" type of thing.

~Randy