2017-03-27 10:35:54

by Adam Borowski

Subject: [PATCH 0/2] vt: mouse selection word boundaries

Hi!
Here's a couple of really low priority fixes to how "word chars" for mouse
selection are determined.

Patch 1 (adds "-./") is an epitome of "apply if bored": for two decades,
only gpm used this, and it always ignored the defaults. Bill Allombert made
a second implementation, "consolation", and we wasted a bit of time figuring
out why mouse selection is so limited. As consolation now overrides the
defaults too, this patch will be helpful only for whoever makes a third user
of this code...

Patch 2, besides making a handful of non-ASCII symbols work the same as
everything else, also stops people from mysteriously losing that 'ü' after
naive gpm -l "-A-Za-z0-9_./".

--
⢀⣴⠾⠻⢶⣦⠀ Meow!
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋⠀ Collisions shmolisions, let's see them find a collision or second
⠈⠳⣄⠀⠀⠀⠀ preimage for double rot13!


2017-03-27 10:37:37

by Adam Borowski

Subject: [PATCH 1/2] vt: set mouse selection word-chars to gpm's default

Since forever, gpm was this code's only user, and it overrides the table on
start so the default was never seen -- until Bill Allombert's "consolation"
came in. The in-kernel set is "A-Za-z0-9_" which fails to catch typical
file names, etc. Let's change this to gpm's conservative default, i.e.
"-A-Za-z0-9_./"; most terminals include more, for example xfce4-terminal has
"-A-Za-z0-9,./?%&#:_=+@~".

There's some discussion at https://bugs.debian.org/846587

Signed-off-by: Adam Borowski <[email protected]>
---
drivers/tty/vt/selection.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 36e1b8c7680f..2252e11d8347 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -84,7 +84,7 @@ void clear_selection(void)
*/
static u32 inwordLut[8]={
0x00000000, /* control chars */
- 0x03FF0000, /* digits */
+ 0x03FFE000, /* digits and "-./" */
0x87FFFFFE, /* uppercase and '_' */
0x07FFFFFE, /* lowercase */
0x00000000,
--
2.11.0
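
To see why the second table word changes from 0x03FF0000 to 0x03FFE000, it
may help to decode the bits by hand. The sketch below is not part of the
patch: lut_word_chars() is a hypothetical userspace helper, and it assumes
the layout implied by inword(), where word idx covers characters idx*32
through idx*32+31 and bit n stands for character idx*32+n.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the kernel): list the characters whose bit
 * is set in one 32-bit word of the table.  `out` must hold at least 33
 * bytes.  Returns the number of characters written, excluding the NUL.
 */
static size_t lut_word_chars(uint32_t word, unsigned idx, char *out)
{
	size_t n = 0;

	for (unsigned bit = 0; bit < 32; bit++)
		if ((word >> bit) & 1)
			out[n++] = (char)(idx * 32 + bit);
	out[n] = '\0';
	return n;
}
```

Called with word 0x03FF0000 and idx 1 this yields the ten digits (bits 16 to
25, codes 48 to 57). With 0x03FFE000 the extra bits 13 to 15 add '-', '.'
and '/' (codes 45 to 47), which is the whole effect of the one-line change.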

2017-03-27 10:37:50

by Adam Borowski

Subject: [PATCH 2/2] vt: make mouse selection of non-ASCII consistent

For some reason a handful of ISO-8859-1 symbols are excluded from "word
chars" while the vast majority of Unicode is hard-coded as included, even
when inappropriate (we really would want to _not_ select line-drawing/etc).
Those symbols are: ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷

Thus, let's not special-case any non-ASCII anymore. Attempts to set these
via ioctl will be silently ignored.

As an extra bonus, we debloat the kernel by 128 bytes.

Signed-off-by: Adam Borowski <[email protected]>
---
drivers/tty/vt/selection.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 2252e11d8347..c81c99165ea6 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -80,21 +80,17 @@ void clear_selection(void)

/*
* User settable table: what characters are to be considered alphabetic?
- * 256 bits. Locked by the console lock.
+ * 128 bits. Locked by the console lock.
*/
-static u32 inwordLut[8]={
+static u32 inwordLut[4]={
0x00000000, /* control chars */
0x03FFE000, /* digits and "-./" */
0x87FFFFFE, /* uppercase and '_' */
0x07FFFFFE, /* lowercase */
- 0x00000000,
- 0x00000000,
- 0xFF7FFFFF, /* latin-1 accented letters, not multiplication sign */
- 0xFF7FFFFF /* latin-1 accented letters, not division sign */
};

static inline int inword(const u16 c) {
- return c > 0xff || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
+ return c > 0x7f || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
}

/**
@@ -106,10 +102,10 @@ static inline int inword(const u16 c) {
*/
int sel_loadlut(char __user *p)
{
- u32 tmplut[8];
- if (copy_from_user(tmplut, (u32 __user *)(p+4), 32))
+ u32 tmplut[4];
+ if (copy_from_user(tmplut, (u32 __user *)(p+4), 16))
return -EFAULT;
- memcpy(inwordLut, tmplut, 32);
+ memcpy(inwordLut, tmplut, 16);
return 0;
}

--
2.11.0
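
The behavioural change can be checked outside the kernel with a small model
of inword() before and after this patch. This is only a sketch: the table
values and the two expressions are copied from the diff above, while the
names lut_before, lut_after, inword_before() and inword_after() are made up
for the illustration, and the table is passed as a parameter so that a
user-supplied table can be tried as well.

```c
#include <stdint.h>

/* Default table before this patch (patch 1 already applied): 256 bits. */
static const uint32_t lut_before[8] = {
	0x00000000,	/* control chars */
	0x03FFE000,	/* digits and "-./" */
	0x87FFFFFE,	/* uppercase and '_' */
	0x07FFFFFE,	/* lowercase */
	0x00000000,
	0x00000000,
	0xFF7FFFFF,	/* latin-1 accented letters, not multiplication sign */
	0xFF7FFFFF	/* latin-1 accented letters, not division sign */
};

/* Default table after this patch: 128 bits, ASCII only. */
static const uint32_t lut_after[4] = {
	0x00000000, 0x03FFE000, 0x87FFFFFE, 0x07FFFFFE
};

/* Old rule: only code points above 0xff are unconditionally word chars. */
static int inword_before(const uint32_t lut[8], uint16_t c)
{
	return c > 0xff || ((lut[c >> 5] >> (c & 0x1F)) & 1);
}

/* New rule: everything above 0x7f is a word char, the table is ASCII. */
static int inword_after(const uint32_t lut[4], uint16_t c)
{
	return c > 0x7f || ((lut[c >> 5] >> (c & 0x1F)) & 1);
}
```

With the defaults, 0xD7 (multiplication sign) and 0xA3 (pound sign) are
rejected by inword_before() and accepted by inword_after(). Passing
inword_before() a table whose upper four words are zero, which is roughly
what an ASCII-only user-supplied table amounts to, makes it reject 0xFC
('ü') as well, and inword_after() cannot be put into that state.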

2017-03-27 11:50:39

by Jiri Slaby

Subject: Re: [PATCH 2/2] vt: make mouse selection of non-ASCII consistent

On 03/27/2017, 12:37 PM, Adam Borowski wrote:
> For some reason a handful of ISO-8859-1 symbols are excluded from "word
> chars" while the vast majority of Unicode is hard-coded as included, even
> when inappropriate (we really would want to _not_ select line-drawing/etc).
> Those symbols are: ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷
>
> Thus, let's not special-case any non-ASCII anymore. Attempts to set these
> via ioctl will be silently ignored.
>
> As an extra bonus, we debloat the kernel by 128 bytes.
>
> Signed-off-by: Adam Borowski <[email protected]>
> ---
> drivers/tty/vt/selection.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
> index 2252e11d8347..c81c99165ea6 100644
> --- a/drivers/tty/vt/selection.c
> +++ b/drivers/tty/vt/selection.c
> @@ -80,21 +80,17 @@ void clear_selection(void)
>
> /*
> * User settable table: what characters are to be considered alphabetic?
> - * 256 bits. Locked by the console lock.
> + * 128 bits. Locked by the console lock.
> */
> -static u32 inwordLut[8]={
> +static u32 inwordLut[4]={

No need for the constant here. Let it autosize.

> 0x00000000, /* control chars */
> 0x03FFE000, /* digits and "-./" */
> 0x87FFFFFE, /* uppercase and '_' */
> 0x07FFFFFE, /* lowercase */
> - 0x00000000,
> - 0x00000000,
> - 0xFF7FFFFF, /* latin-1 accented letters, not multiplication sign */
> - 0xFF7FFFFF /* latin-1 accented letters, not division sign */
> };
>
> static inline int inword(const u16 c) {
> - return c > 0xff || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
> + return c > 0x7f || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
> }
>
> /**
> @@ -106,10 +102,10 @@ static inline int inword(const u16 c) {
> */
> int sel_loadlut(char __user *p)
> {
> - u32 tmplut[8];
> - if (copy_from_user(tmplut, (u32 __user *)(p+4), 32))
> + u32 tmplut[4];

ARRAY_SIZE(inwordLut) here.

> + if (copy_from_user(tmplut, (u32 __user *)(p+4), 16))
> return -EFAULT;
> - memcpy(inwordLut, tmplut, 32);
> + memcpy(inwordLut, tmplut, 16);

sizeof(inwordLut) here and for copy_from_user too.

> return 0;
> }

thanks,
--
js
suse labs
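
The three review comments amount to deriving every size from the array
itself. A minimal userspace sketch of that idea follows; it is not the
kernel code. ARRAY_SIZE() here is a simplified stand-in for the kernel
macro (which additionally rejects pointers at build time), and word_lut and
load_lut() are hypothetical names, with a plain memcpy() standing in for
copy_from_user().

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* No explicit length: the initializer list decides the size. */
static uint32_t word_lut[] = {
	0x00000000,	/* control chars */
	0x03FFE000,	/* digits and "-./" */
	0x87FFFFFE,	/* uppercase and '_' */
	0x07FFFFFE,	/* lowercase */
};

/* Replace the table from a caller-supplied buffer of the same size. */
static void load_lut(const uint32_t *src)
{
	uint32_t tmp[ARRAY_SIZE(word_lut)];

	memcpy(tmp, src, sizeof(word_lut));
	memcpy(word_lut, tmp, sizeof(word_lut));
}
```

Adding or removing an initializer line then changes the temporary buffer
and both copy lengths together, so the 8/32 versus 4/16 pairs cannot drift
apart.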

2017-03-27 12:22:12

by Adam Borowski

Subject: [PATCH v2 2/2] vt: make mouse selection of non-ASCII consistent

For some reason a handful of ISO-8859-1 symbols are excluded from "word
chars" while the vast majority of Unicode is hard-coded as included, even
when inappropriate (we really would want to _not_ select line-drawing/etc).
Those symbols are: ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷

Thus, let's not special-case any non-ASCII anymore. Attempts to set these
via ioctl will be silently ignored.

As an extra bonus, we debloat the kernel by 128 bytes.

Signed-off-by: Adam Borowski <[email protected]>
---
v2: Got rid of hard-coded array sizes.

drivers/tty/vt/selection.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 2252e11d8347..accbd1257bc4 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -80,21 +80,17 @@ void clear_selection(void)

/*
* User settable table: what characters are to be considered alphabetic?
- * 256 bits. Locked by the console lock.
+ * 128 bits. Locked by the console lock.
*/
-static u32 inwordLut[8]={
+static u32 inwordLut[]={
0x00000000, /* control chars */
0x03FFE000, /* digits and "-./" */
0x87FFFFFE, /* uppercase and '_' */
0x07FFFFFE, /* lowercase */
- 0x00000000,
- 0x00000000,
- 0xFF7FFFFF, /* latin-1 accented letters, not multiplication sign */
- 0xFF7FFFFF /* latin-1 accented letters, not division sign */
};

static inline int inword(const u16 c) {
- return c > 0xff || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
+ return c > 0x7f || (( inwordLut[c>>5] >> (c & 0x1F) ) & 1);
}

/**
@@ -106,10 +102,10 @@ static inline int inword(const u16 c) {
*/
int sel_loadlut(char __user *p)
{
- u32 tmplut[8];
- if (copy_from_user(tmplut, (u32 __user *)(p+4), 32))
+ u32 tmplut[ARRAY_SIZE(inwordLut)];
+ if (copy_from_user(tmplut, (u32 __user *)(p+4), sizeof(inwordLut)))
return -EFAULT;
- memcpy(inwordLut, tmplut, 32);
+ memcpy(inwordLut, tmplut, sizeof(inwordLut));
return 0;
}

--
2.11.0
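
The remark that attempts to set non-ASCII bits via ioctl are silently
ignored follows from sel_loadlut() now copying only sizeof(inwordLut) bytes
starting at p+4. Below is a rough userspace model of a caller that still
sends the older 256-bit table. Everything in it is an assumption made for
illustration: struct loadlut_req, fill_loadlut_req() and kernel_kept() are
hypothetical names, SELLOADLUT_SUBCODE repeats the value of
TIOCL_SELLOADLUT from <linux/tiocl.h> so that the sketch builds without
kernel headers, and no ioctl is actually issued.

```c
#include <stdint.h>
#include <string.h>

/* Same value as TIOCL_SELLOADLUT; redefined to avoid kernel headers. */
#define SELLOADLUT_SUBCODE 5

/* Hypothetical request layout: subcode byte, padding, table at offset 4. */
struct loadlut_req {
	uint8_t  subcode;
	uint8_t  pad[3];
	uint32_t lut[8];	/* old-style 256-bit table */
};

/* Build the buffer a 256-bit-aware caller would hand to the kernel. */
static void fill_loadlut_req(struct loadlut_req *req, const uint32_t lut[8])
{
	memset(req, 0, sizeof(*req));
	req->subcode = SELLOADLUT_SUBCODE;
	memcpy(req->lut, lut, sizeof(req->lut));
}

/* Model of the patched kernel side: only the first 128 bits are read. */
static void kernel_kept(const struct loadlut_req *req, uint32_t kept[4])
{
	memcpy(kept, (const unsigned char *)req + 4, 4 * sizeof(uint32_t));
}
```

Words 4 to 7 of the request are never looked at in this model, so whatever
the caller puts there for the Latin-1 range has no effect, while the four
ASCII words arrive unchanged.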

2017-03-27 12:22:20

by Adam Borowski

Subject: [PATCH v2 1/2] vt: set mouse selection word-chars to gpm's default

Since forever, gpm was this code's only user, and it overrides the table on
start so the default was never seen -- until Bill Allombert's "consolation"
came in. The in-kernel set is "A-Za-z0-9_" which fails to catch typical
file names, etc. Let's change this to gpm's conservative default, i.e.
"-A-Za-z0-9_./"; most terminals include more, for example xfce4-terminal has
"-A-Za-z0-9,./?%&#:_=+@~".

There's some discussion at https://bugs.debian.org/846587

Signed-off-by: Adam Borowski <[email protected]>
---
v2: no changes in this patch, resent as part of the set.

drivers/tty/vt/selection.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c
index 36e1b8c7680f..2252e11d8347 100644
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -84,7 +84,7 @@ void clear_selection(void)
*/
static u32 inwordLut[8]={
0x00000000, /* control chars */
- 0x03FF0000, /* digits */
+ 0x03FFE000, /* digits and "-./" */
0x87FFFFFE, /* uppercase and '_' */
0x07FFFFFE, /* lowercase */
0x00000000,
--
2.11.0