2005-02-09 15:28:04

by Andries E. Brouwer

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 02:26:54PM +0100, Jirka Bohac wrote:
> Hi folks,
>
> find attached a patch that improves the keycode to keysym mapping in the
> kernel. The current system has its limits, not allowing to implement keyboard
> maps that people in different countries are used to. This patch tries to solve
> this. Please, tell me what you think, and merge if possible.
>
> Current state:
> --------------
>
> The keycodes are mapped into keysyms using so-called keymaps. A keymap is
> an array (of 255 elements per default) of keysyms, and there is one such
> keymap for each modifier combination. There are 9 modifiers (such as Alt,
> Ctrl, ....), so one would need to allocate 2^9 = 512 such keymaps to make
> use of all modifier combinations. However, there is a limit of 256 keymaps
> to prevent them eating too much memory. In short, you need a whole keymap
> to add a new modifier combination to a single key -- bad.
>
> The problem is, that not all keyboard modifiers can actually be assigned a
> keyboard map - CapsLock and NumLock simply aren't on the list.

The current keyboard code is far more powerful than you seem to think.

Keymaps are allocated dynamically, and only a few people use more than 16.
You can have 256 keymaps, but they are not necessarily the 2^8 maps
belonging to all 2^8 combinations of simultaneously pressed modifier keys.
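
A minimal sketch of the "allocated dynamically" point, not the kernel's actual code: one pointer slot per modifier combination, with a map allocated only when a binding for that combination is first set. All names (`sketch_key_maps`, `sketch_bind`, `sketch_lookup`, `sketch_maps_in_use`) and the constants are assumptions made for illustration.

```c
#include <stdlib.h>

#define SKETCH_NR_KEYS     256    /* keycodes per keymap (assumed) */
#define SKETCH_MAX_KEYMAPS 256    /* upper limit on allocated maps */
#define SKETCH_HOLE        0x0200 /* "no binding" placeholder (assumed value) */

/* One slot per modifier combination; NULL until that combination is used. */
static unsigned short *sketch_key_maps[SKETCH_MAX_KEYMAPS];

/* Bind keysym to (modifier mask, keycode), allocating the map on first use. */
static int sketch_bind(unsigned int mods, unsigned int keycode,
                       unsigned short keysym)
{
    unsigned int i;

    if (mods >= SKETCH_MAX_KEYMAPS || keycode >= SKETCH_NR_KEYS)
        return -1;
    if (!sketch_key_maps[mods]) {
        unsigned short *map = malloc(SKETCH_NR_KEYS * sizeof(*map));
        if (!map)
            return -1;
        for (i = 0; i < SKETCH_NR_KEYS; i++)
            map[i] = SKETCH_HOLE;
        sketch_key_maps[mods] = map;
    }
    sketch_key_maps[mods][keycode] = keysym;
    return 0;
}

/* Look up a keysym; combinations that were never used cost no memory. */
static unsigned short sketch_lookup(unsigned int mods, unsigned int keycode)
{
    if (mods >= SKETCH_MAX_KEYMAPS || keycode >= SKETCH_NR_KEYS ||
        !sketch_key_maps[mods])
        return SKETCH_HOLE;
    return sketch_key_maps[mods][keycode];
}

/* Count how many maps have actually been allocated. */
static unsigned int sketch_maps_in_use(void)
{
    unsigned int i, n = 0;

    for (i = 0; i < SKETCH_MAX_KEYMAPS; i++)
        if (sketch_key_maps[i])
            n++;
    return n;
}
```

Binding 'a' with no modifiers and 'A' with Shift allocates two maps; the other 254 slots stay NULL.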

You can assign the "modifier" property to any key you like.
You can assign the effect of each modifier key as you like.
There are modifier keys with action while pressed, and modifier keys
that act on the next non-modifier keystroke (say, for handicapped),
and modifier keys that lock a state (say, to switch between Latin
and Cyrillic keyboards).

It seems very unlikely that you cannot handle Czech with all
combinations of 8 keys pressed, and need 9.
Please document carefully what you want to do and why you want
to do it. I think most reasonable things are possible.

(The weakest part is the support for Unicode / UTF-8 - I don't know
whether improvement would be good - it is clear that one doesn't
want to have full Unicode support in the kernel, but there is
continued pressure to add some support for diacriticals. We might.)

Andries


2005-02-09 16:02:57

by Vojtech Pavlik

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 04:27:40PM +0100, Andries Brouwer wrote:

> > The keycodes are mapped into keysyms using so-called keymaps. A keymap is
> > an array (of 255 elements per default) of keysyms, and there is one such
> > keymap for each modifier combination. There are 9 modifiers (such as Alt,
> > Ctrl, ....), so one would need to allocate 2^9 = 512 such keymaps to make
> > use of all modifier combinations. However, there is a limit of 256 keymaps
> > to prevent them eating too much memory. In short, you need a whole keymap
> > to add a new modifier combination to a single key -- bad.
> >
> > The problem is, that not all keyboard modifiers can actually be assigned a
> > keyboard map - CapsLock and NumLock simply aren't on the list.
>
> The current keyboard code is far more powerful than you seem to think.
>
> Keymaps are allocated dynamically, and only few people use more than 16.
> You can have 256 keymaps, but they are not necessarily the 2^8 maps
> belonging to all 2^8 combinations of simultaneously pressed modifier keys.
>
> You can assign the "modifier" property to any key you like.
> You can assign the effect of each modifier key as you like.
> There are modifier keys with action while pressed, and modifier keys
> that act on the next non-modifier keystroke (say, for handicapped),
> and modifier keys that lock a state (say, to switch between Latin
> and Cyrillic keyboards).
>
> It seems very unlikely that you cannot handle Czech with all
> combinations of 8 keys pressed, and need 9.

A Czech keyboard has the letters 'escrzyaie' with accents on the number
row of keys. With Shift, they are supposed to produce the original
numbers, but with CapsLock, they're supposed to produce the uppercase
accented letters. With right Alt or one of three Czech dead keys they
should produce the !@#$%^&*() symbols.

It's kind of logical, kind of stupid, but anyway it's the national standard.

You can't do that currently. The main problem is that CapsLock is
hardcoded to work as a Shift on keys and you can't make it work
differently for normal letter keys and for the upper row of keys.

> Please document carefully what you want to do and why you want
> to do it. I think most reasonable things are possible.
>
> (The weakest part is the support for Unicode / UTF8 - don't know
> whether improvement would be good - it is clear that one doesnt
> want to have full Unicode support in the kernel, but there is
> continued pressure to add some support for diacriticals. We might.)

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2005-02-09 16:39:07

by Andries E. Brouwer

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 05:03:45PM +0100, Vojtech Pavlik wrote:

> > It seems very unlikely that you cannot handle Czech with all
> > combinations of 8 keys pressed, and need 9.
>
> A czech keyboard has the letters 'escrzyaie' with accents on the number
> row of keys. With a Shift, they are supposed to produce the original
> numbers, but with a CapsLock, they're supposed to produce the uppercase.
> With a right alt or one of three czech dead keys they should produce
> the !@#$%^&*() symbols.
>
> It's kind of logical, kind of stupid, but anyway it's the national standard.
>
> You can't do that currently. The main problem is that CapsLock is
> hardcoded to work as a Shift on keys and you can't make it work
> differently for normal letter keys and for the upper row of keys.

I think the fallacy in that reasoning is the idea that the key
labeled CapsLock has to be bound to the kernel function named capslock.

Andries

2005-02-09 16:53:58

by Vojtech Pavlik

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 05:38:56PM +0100, Andries Brouwer wrote:
> On Wed, Feb 09, 2005 at 05:03:45PM +0100, Vojtech Pavlik wrote:
>
> > > It seems very unlikely that you cannot handle Czech with all
> > > combinations of 8 keys pressed, and need 9.
> >
> > A czech keyboard has the letters 'escrzyaie' with accents on the number
> > row of keys. With a Shift, they are supposed to produce the original
> > numbers, but with a CapsLock, they're supposed to produce the uppercase.
> > With a right alt or one of three czech dead keys they should produce
> > the !@#$%^&*() symbols.
> >
> > It's kind of logical, kind of stupid, but anyway it's the national standard.
> >
> > You can't do that currently. The main problem is that CapsLock is
> > hardcoded to work as a Shift on keys and you can't make it work
> > differently for normal letter keys and for the upper row of keys.
>
> I think the fallacy in that reasoning is the idea that the key
> labeled CapsLock has to be bound to the kernel function named capslock.

How do you make it control the CapsLock LED then?

--
Vojtech Pavlik
SuSE Labs, SuSE CR

2005-02-09 17:21:42

by Jiri Bohac

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 04:27:40PM +0100, Andries Brouwer wrote:
> On Wed, Feb 09, 2005 at 02:26:54PM +0100, Jirka Bohac wrote:
> > Hi folks,
> >
> > find attached a patch that improves the keycode to keysym mapping in the
> > kernel. The current system has its limits, not allowing to implement keyboard
> > maps that people in different countries are used to. This patch tries to solve
> > this. Please, tell me what you think, and merge if possible.
> >
> > Current state:
> > --------------
> >
> > The keycodes are mapped into keysyms using so-called keymaps. A keymap is
> > an array (of 255 elements per default) of keysyms, and there is one such
> > keymap for each modifier combination. There are 9 modifiers (such as Alt,
> > Ctrl, ....), so one would need to allocate 2^9 = 512 such keymaps to make
> > use of all modifier combinations. However, there is a limit of 256 keymaps
> > to prevent them eating too much memory. In short, you need a whole keymap
> > to add a new modifier combination to a single key -- bad.
> >
> > The problem is, that not all keyboard modifiers can actually be assigned a
> > keyboard map - CapsLock and NumLock simply aren't on the list.
>
> The current keyboard code is far more powerful than you seem to think.
>
> Keymaps are allocated dynamically, and only few people use more than 16.
> You can have 256 keymaps, but they are not necessarily the 2^8 maps
> belonging to all 2^8 combinations of simultaneously pressed modifier keys.
>
> You can assign the "modifier" property to any key you like.
> You can assign the effect of each modifier key as you like.
> There are modifier keys with action while pressed, and modifier keys
> that act on the next non-modifier keystroke (say, for handicapped),
> and modifier keys that lock a state (say, to switch between Latin
> and Cyrillic keyboards).

I know that. But there is still a problem with K_CAPS and K_NUM. They are
hardwired into the code in several places. They toggle the state of the keyboard LED,
and later the state of the LED directly influences the keycode->keysym mapping, e.g.:

	if (type == KT_LETTER) {
		type = KT_LATIN;
		if (vc_kbd_led(kbd, VC_CAPSLOCK)) {
			key_map = key_maps[shift_final ^ (1 << KG_SHIFT)];
			if (key_map)
				keysym = key_map[keycode];
		}
	}
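
The effect of that snippet can be isolated in a small sketch, assuming the Shift bit is bit 0 of the modifier mask. `SK_SHIFT_BIT` and `sk_letter_map_index` are hypothetical names standing in for `KG_SHIFT` and the inline logic above. It shows that an active CapsLock LED never selects a map of its own: it can only flip the Shift bit of whatever map index the pressed modifiers already selected.

```c
#define SK_SHIFT_BIT 0 /* stands in for KG_SHIFT; assumed to be bit 0 */

/*
 * Return the keymap index used for a letter key, given the mask of
 * currently active modifiers and the state of the CapsLock LED.
 */
static unsigned int sk_letter_map_index(unsigned int shift_final,
                                        int capslock_led)
{
    if (capslock_led)
        return shift_final ^ (1u << SK_SHIFT_BIT); /* toggle Shift only */
    return shift_final;
}
```

With CapsLock on, "no modifiers" lands in the Shift map and "Shift" lands back in the plain map, so a key that needs a third result for CapsLock (as the Czech number row does) has nowhere to put it.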


>
> It seems very unlikely that you cannot handle Czech with all
> combinations of 8 keys pressed, and need 9.
> Please document carefully what you want to do and why you want
> to do it. I think most reasonable things are possible.

In the standard Czech keyboard, there are letters with diacritics on the
1234567890 keys. This is what you should get when pressing those keys:

1) with CapsLock off:
   1a) no Shift pressed: plus, ecaron, scaron, ccaron, rcaron, zcaron, yacute, aacute, iacute, eacute
   1b) Shift pressed: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0
2) with CapsLock on:
   2a) no Shift pressed: Plus, Ecaron, Scaron, Ccaron, Rcaron, Zcaron, Yacute, Aacute, Iacute, Eacute
   2b) Shift pressed: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0

This is not possible to achieve with the current code, because the K_CAPS
behaviour is hard wired in the code and not controlled by an extra set of
keymaps.

There are presently two ways around this, neither of them good enough:
1) assigning one of the other modifier keysyms to the CapsLock key
   -- the LED will not work
2) assigning a nonstandard keysym to the Shift key -- will break
   programs like mcedit

NumLock is hardwired in the code in a similar way. I think this is a
design misconception. These keys should have been treated like other modifiers.

But by adding two modifiers to almost every keyboard map, you would
increase the space occupied by the keymaps four times. ... erm ... eight
times, because there is also this "applkey" (application keypad) flag,
that will be needed as another modifier.

This, of course, is undesirable. The new implementation solves this
problem:

- you only define the meaning of additional modifiers for those keys for
which they make any difference - not wasting memory by huge keymaps that
are mostly filled by K_HOLEs

- all the clever things you pointed out are still there

- the resulting memory size occupied by the needed structures will
generally be smaller or equal to the current state. Of course there are
insane worst-case examples that end up bloating much more memory.

- the implementation is fully compatible with the old IOCTL interface --
the only drawback is, that the resulting keytables created by the old
IOCTLs are not optimal and actually take up more memory than the
original implementation. But this is a temporary state, which can be
fixed by creating a keyboard map in the new format

- the proposed keyboard map file format is IMHO much much nicer than the
old one, although this is dependent on personal tastes. Maybe by looking
at an example, people will better understand how it works:

keytable Esc {                  # defines the Escape key
    alt = Meta_Escape           # if alt is pressed, produce Meta_Escape
        = Escape                # if not, produce Escape
}

# This effectively defines all 4096 combinations in two lines.
# The first line says: only look at the current state of ALT,
# and if it is on, produce Meta_Escape.
# The second line says: don't look at the state of anything and
# produce Escape.


keytable Enter {
    !shift !altgr !control alt !shiftl !shiftr !ctrll !ctrlr !capsshift = Meta_Control_m
    = Return
}

# In this example, the first line says: look at the state of
# the shift, altgr, control, alt, shiftl, shiftr, ctrll,
# ctrlr and capsshift modifiers. If all of them are off,
# except alt which is on, produce Meta_Control_m.
# The second line falls back to Return.

As you can see, this is quite effective. And this is exactly the way it
is represented in the memory - each of the lines is represented by a
6-byte entry in the keytable.
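
One way such a per-key table could be matched is sketched below. This is not the patch's code: the 2+2+2 byte split of the 6-byte entry, the modifier bit assignments, the keysym numbers and every `sk_` name are assumptions. Each entry stores which modifiers the line looks at (`mask`) and the state they must have (`value`); the first matching entry wins, and an entry with an empty mask matches always.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical modifier bits; the real assignment is not given here. */
enum {
    SK_MOD_SHIFT   = 1u << 0,
    SK_MOD_ALTGR   = 1u << 1,
    SK_MOD_CONTROL = 1u << 2,
    SK_MOD_ALT     = 1u << 3
};

/* Placeholder keysym numbers, for illustration only. */
enum { SK_ESCAPE = 1, SK_META_ESCAPE = 2 };

/* One line of a keytable: 6 bytes under the assumed layout. */
struct sk_entry {
    uint16_t mask;   /* modifiers this line looks at */
    uint16_t value;  /* required state of those modifiers */
    uint16_t keysym; /* result if the line matches */
};

/* "keytable Esc": alt = Meta_Escape, otherwise Escape. */
static const struct sk_entry sk_esc_table[] = {
    { SK_MOD_ALT, SK_MOD_ALT, SK_META_ESCAPE },
    { 0,          0,          SK_ESCAPE      }, /* matches any state */
};

/* Walk the table in order; return 1 and store the keysym on a match. */
static int sk_table_lookup(const struct sk_entry *tbl, size_t n,
                           uint16_t mods, uint16_t *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if ((mods & tbl[i].mask) == tbl[i].value) {
            *out = tbl[i].keysym;
            return 1;
        }
    }
    return 0;
}
```

Because the first Esc line only examines alt, Alt+Shift still yields Meta_Escape, and every state without alt falls through to the second line.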

Have a look at the default keymap, which is included in the patch, to
see just how effective this can get. It saves both typing and memory.
Once you get the simple idea, it is much clearer to understand than the
old format, imho.

> (The weakest part is the support for Unicode / UTF8 - don't know
> whether improvement would be good - it is clear that one doesnt
> want to have full Unicode support in the kernel, but there is
> continued pressure to add some support for diacriticals. We might.)

There is Unicode support for everything BUT the dead keys, with respect to
keyboard mappings. It seems that dead keys were simply forgotten.

regards,

--
Jirka Bohac <[email protected]>
SUSE Labs, SUSE CR

2005-02-09 19:05:42

by Andries E. Brouwer

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 05:55:00PM +0100, Vojtech Pavlik wrote:
> On Wed, Feb 09, 2005 at 05:38:56PM +0100, Andries Brouwer wrote:
>> On Wed, Feb 09, 2005 at 05:03:45PM +0100, Vojtech Pavlik wrote:
>>
>>>> It seems very unlikely that you cannot handle Czech with all
>>>> combinations of 8 keys pressed, and need 9.
>>>
>>> A czech keyboard has the letters 'escrzyaie' with accents on the number
>>> row of keys. With a Shift, they are supposed to produce the original
>>> numbers, but with a CapsLock, they're supposed to produce the uppercase.
>>> With a right alt or one of three czech dead keys they should produce
>>> the !@#$%^&*() symbols.
>>>
>>> It's kind of logical, kind of stupid, but anyway it's the national standard.
>>>
>>> You can't do that currently. The main problem is that CapsLock is
>>> hardcoded to work as a Shift on keys and you can't make it work
>>> differently for normal letter keys and for the upper row of keys.
>>
>> I think the fallacy in that reasoning is the idea that the key
>> labeled CapsLock has to be bound to the kernel function named capslock.
>
> How do you make it control the CapsLock LED then?

OK - I agree. The keyboard can do what you want,
but there is no independent CapsLock LED control.

Andries


[not that I think the proposed change is a good idea,
but now I understand why one would want to extend functionality]

2005-02-09 20:04:12

by Andries E. Brouwer

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 06:19:21PM +0100, Jirka Bohac wrote:

> There are presently two ways around this, neither of them good enough
> 1) assigning one of the other modifier keysyms to the CapsLock key
> -- the LED will not work

True.

> But by adding two modifiers to almost every keyboard map, you would
> increase the space occupied by the keymaps four times. ... erm ... eight
> times, because there is also this "applkey" (application keypad) flag,
> that will be needed as another modifier.

But keymaps are allocated dynamically.
Any number of new modifiers does not cost anything until
one actually uses some particular modifier combination.

New modifiers are not expensive at all.

> - the proposed keyboard map file format is IMHO much much nicer

Keymap files live in user space. The layout of a keymap file
has no bearing on the kernel implementation of keymaps.

We want a map (keystroke, current_modifiers) -> keysym.
The present kernel code organizes things in maps, one for
each modifier combination that people want to use.
You want to organize things per keystroke.

I see no great advantages. Many small arrays allocated
by kmalloc() lead to more overhead - probably your version
would lead to larger memory usage, but I have not checked.
It looks like your code is larger.
It also looks like your code is slower, with a loop instead of
a table lookup.

(Not that those things are very important, but I do not see
significant advantages for your setup. Maybe you have numbers?)

Of more interest are the added features.

You come with a single big patch, but some parts are independent.

For example, I see

+struct kbdiacruc {
+	unsigned char diacr, base;
+	unsigned int result; /* UCS */
+};

Ten years ago we made the mistake of being too careful with memory.
Today it is a very bad idea to introduce new ioctls that act on
8-bit quantities. These must all be ints.

An ioctl somewhat in this style has been proposed several times,
and I have no serious objections. If you want it, separate it out
and make it a patch on its own.
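
A sketch of what an all-int variant might look like, with a compose lookup over it. The names `sk_diacr_uc`, `sk_rules` and `sk_compose` are hypothetical and this is not a proposed ioctl layout. The sample rules use the spacing caron U+02C7, which would not fit an 8-bit `diacr` field.

```c
#include <stddef.h>

/* All three fields are full ints, so any code point fits in each. */
struct sk_diacr_uc {
    unsigned int diacr;  /* dead key, as a Unicode code point */
    unsigned int base;   /* key pressed after the dead key */
    unsigned int result; /* composed character */
};

/* A few Czech compositions, for illustration. */
static const struct sk_diacr_uc sk_rules[] = {
    { 0x02C7, 'c', 0x010D }, /* caron + c -> c with caron */
    { 0x02C7, 'e', 0x011B }, /* caron + e -> e with caron */
    { 0x00B4, 'y', 0x00FD }, /* acute + y -> y with acute */
};

/* Return the composed character, or base unchanged if no rule matches. */
static unsigned int sk_compose(const struct sk_diacr_uc *rules, size_t n,
                               unsigned int diacr, unsigned int base)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (rules[i].diacr == diacr && rules[i].base == base)
            return rules[i].result;
    return base;
}
```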

Andries

2005-02-10 12:56:08

by Jiri Bohac

Subject: Re: [rfc] keytables - the new keycode->keysym mapping

On Wed, Feb 09, 2005 at 09:03:30PM +0100, Andries Brouwer wrote:
> On Wed, Feb 09, 2005 at 06:19:21PM +0100, Jirka Bohac wrote:
>
> > There are presently two ways around this, neither of them good enough
> > 1) assigning one of the other modifier keysyms to the CapsLock key
> > -- the LED will not work
>
> True.

Well, we need to solve this. It seems that CapsLock and NumLock should really
be modifiers -- the assumption that CapsLock simply inverts the state of Shift is
wrong. My patch solves this.

> > But by adding two modifiers to almost every keyboard map, you would
> > increase the space occupied by the keymaps four times. ... erm ... eight
> > times, because there is also this "applkey" (application keypad) flag,
> > that will be needed as another modifier.
>
> But keymaps are allocated dynamically.
> Any number of new modifiers does not cost anything until
> one actually uses some particular modifier combination.
>
> New modifiers are not expensive at all.

Adding two modifiers is not expensive. But adding CapsLock, NumLock (and,
by extension, applkey) would actually require them to be used in most
keyboard maps, probably in combination with all the other modifiers they're
currently using, thus increasing their size 4 or 8 times. (Maybe applkey is
not strictly needed as a modifier, but it makes things much nicer and cleaner
... with the new approach, making applkey a modifier does not hurt at all.)

This was the reason I decided to go the keytable way.
The current default map has 7 keymaps -> 3.5kB ... * 4 = 14kB
The current "us" map has 9 keymaps -> 4.5kB ... * 4 = 18kB
The current "cz" map has 42 keymaps -> 27kB
Now, I don't want to be demagogic here ... maybe not all combinations
with CapsLock and NumLock are really needed, but most of them probably
are ... 27kB * almost 4 = almost 108kB
!! Anyway ... the 27kB is bad enough on its own !!

Also, it seems that in the future it will be necessary to increase NR_KEYS
beyond 256 (probably 512 ?). So, better multiply the above numbers by two
;-)

Now ... with the keytables patch, there is a fixed amount of memory eaten
by the key_tables array ... NR_KEYS * 8 bytes in most cases = 2k (4k in
the future).

By adding many modifier-dependent meanings to a couple of keys, you
increase only the table associated with that couple of keys. The default
map I supplied uses 155 32B blocks = 4960B in 114 tables and 507 entries.
So the total is 7kB for the default keymap. Ok, this is two times worse
than the current 3.5kB but also two times better than the 14kB needed to
implement the current map with CapsLock and NumLock.
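
The arithmetic for the default map can be checked with a small sketch. The function names are made up. The 2 bytes per keymap entry is an assumption, inferred from 7 keymaps of 256 keys taking 3.5kB. The 8 bytes per key and the 32-byte blocks are the figures quoted above.

```c
/* Old scheme: one full array of nr_keys 2-byte keysyms per keymap. */
static unsigned long sk_keymap_bytes(unsigned int nr_keymaps,
                                     unsigned int nr_keys)
{
    return (unsigned long)nr_keymaps * nr_keys * 2UL;
}

/* Keytable scheme: fixed 8 bytes per key plus 32-byte entry blocks. */
static unsigned long sk_keytable_bytes(unsigned int nr_keys,
                                       unsigned int nr_blocks)
{
    return (unsigned long)nr_keys * 8UL + (unsigned long)nr_blocks * 32UL;
}
```

Under these assumptions, `sk_keymap_bytes(7, 256)` is 3584 bytes (3.5kB), four times that is 14336 bytes (14kB), and `sk_keytable_bytes(256, 155)` is 2048 + 4960 = 7008 bytes (about 7kB).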

But this way is much better suited for future extensions. More keycodes
won't hurt. More modifiers won't hurt, even in combinations with the
current ones.

I haven't written the "cz" map in the new format yet, but it is obvious
that it would be just slightly larger than the default map (I'd bet it
would fit in 10k, not 27k, not 108k). Just have a look at these maps in
the old format, and try to count the number of VoidSymbols in there.

> > - the proposed keyboard map file format is IMHO much much nicer
>
> Keymap files live in user space. The layout of a keymap file
> has no bearing on the kernel implementation of keymaps.

Well, not quite true in this case, because the new format is not a
traditional "map". It is a lookup table. Anyway, this is not important...

> The present kernel code organizes things in maps, one for
> each modifier combination that people want to use.
> You want to organize things per keystroke.

Seems logical to me: it defines what individual modifiers do to keys, instead
of having huge maps for every modifier combination you want to use
(possibly for a single key).


> I see no great advantages. Many small arrays allocated
> by kmalloc() leads to more overhead - probably your version
> would lead to larger memory usage, but I have not checked.

For very basic maps yes ... slightly larger. For usable maps smaller.

> It looks like your code is larger.

Well, the big and ugly part is the backwards compatibility code. This
would go away after some time. The rest looks cleaner and
better structured (?)

> It also looks like your code is slower, with a loop instead of a table
> lookup. (Not that those things are very important

True ... both parts ;-) ... no, really, the tables typically have up to
ten elements, this shouldn't hurt

> Of more interest are the added features.
> You come with a single big patch, but some parts are independent.

Sorry, I really should have split this. Will do.

> For example, I see
>
> +struct kbdiacruc {
> + unsigned char diacr, base;
> + unsigned int result; /* UCS */
> +};
>
> Ten years ago we made the mistake of being too careful with memory.
> Today it is a very bad idea to introduce new ioctls that act on
> 8-bit quantities. These must all be int's.

Looks like a good idea. I will probably make the KDGKBTBL and KDSKBTBL
ioctls also use int for the keysym, because I just copied the idea from
the current system, where a Unicode value does not get the full 16 bits
(the ^ 0xf000 hack). You're right, it would be very unfortunate to have to
extend the ioctls once again because of saving on bits here.
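
A rough sketch of the size problem, assuming (as the `^ 0xf000` remark suggests) that a Unicode value is stored XORed with 0xf000 in a 16-bit keysym field. `sk_u16` and `sk_fits16` are hypothetical helpers, not kernel functions.

```c
#include <stdint.h>

/* Encode a code point into a 16-bit keysym field (assumed scheme). */
static uint16_t sk_u16(uint32_t cp)
{
    return (uint16_t)(cp ^ 0xf000u);
}

/* A 16-bit field cannot hold anything above U+FFFF at all. */
static int sk_fits16(uint32_t cp)
{
    return cp <= 0xFFFFu;
}
```

Under this assumption U+010D encodes to 0xF10D and XORing again recovers it, while a code point such as U+1D11E does not fit at all. An int-sized keysym field has no such limit.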

regards,

--
Jirka Bohac <[email protected]>
SUSE Labs, SUSE CR