2018-12-15 14:44:15

by Martin Hostettler

[permalink] [raw]
Subject: vt: Improve CSI parsing

This patch series improves parsing of csi sequences to be more compliant
with current practice.

ECMA-64 defines the format of CSI sequences which allow more characters
than what the vt parser currently accepts. More importantly many of
these characters are used in sequences that more capable terminal
terminal implementations use.

Adjust the parsing of CSI sequences to match xterm* by ignoring all
unknown sequences of the form
(ESC [)|CSI [\x20-\x3f]*[\x40-\x7e]

This avoids printing unwanted characters when application send valid
sequences not supported by linux either while querying the terminal for
it's identity or when applications print sequences without knowing what
terminal implementation they are connected to (e.g. when connected over
serial lines, android's adb, simple tcp connects, etc)

* and other common terminals


2018-12-15 14:41:57

by Martin Hostettler

[permalink] [raw]
Subject: [PATCH 4/4] vt: ignore sequences that contain ':' in parameters.

csi sequences can contain subparameters delimited by ':' characters. For
now just ignore the whole sequence in this case. Such sequences are used by
more capable terminal implementations with T.416 high color modes or
extended underline rendition attributes.

Also ignore sequences with private use characters '?', '>', '='
and '>' that are not at the initial position.

Signed-off-by: Martin Hostettler <[email protected]>
---
drivers/tty/vt/vt.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 24cd0e9c037b..0aaa15c723fa 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1629,9 +1629,9 @@ static void rgb_background(struct vc_data *vc, const struct rgb *c)

/*
* ITU T.416 Higher colour modes. They break the usual properties of SGR codes
- * and thus need to be detected and ignored by hand. Strictly speaking, that
- * standard also wants : rather than ; as separators, contrary to ECMA-48, but
- * no one produces such codes and almost no one accepts them.
+ * and thus need to be detected and ignored by hand. That standard also
+ * wants : rather than ; as separators but sequences containing : are currently
+ * completely ignored by the parser.
*
* Subcommands 3 (CMY) and 4 (CMYK) are so insane there's no point in
* supporting them.
@@ -2259,7 +2259,7 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
vc->vc_par[vc->vc_npar] += c - '0';
return;
}
- if (c >= 0x20 && c <= 0x2f) {
+ if (c >= 0x20 && c <= 0x3f) { /* 0x2x, 0x3a and 0x3c - 0x3f */
vc->vc_state = EScsiignore;
return;
}
--
2.11.0


2018-12-15 14:41:59

by Martin Hostettler

[permalink] [raw]
Subject: [PATCH 2/4] vt: Implement parsing for >, =, < private sequences.

Private sequences can start with '>', '=' and (in theory) '<'.
Implement correct parsing for these. The newly parsable sequences are
cleanly ignored as it is customary with terminal emulators.

This allows the vt to ignore various sequences used by more capable
terminal implementations such as "Secondary Device Attributes",
"Tertiary Device Attributes" and various advanced configuration commands
that don't have dedicated terminfo entries.

Signed-off-by: Martin Hostettler <[email protected]>
---
drivers/tty/vt/vt.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 75826a97b0c3..448b4f6be7d1 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2235,9 +2235,21 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
vc->vc_state=ESfunckey;
return;
}
- vc->vc_priv = (c == '?') ? EPdec : EPecma;
- if (vc->vc_priv != EPecma)
+ switch (c) {
+ case '?':
+ vc->vc_priv = EPdec;
+ return;
+ case '>':
+ vc->vc_priv = EPgt;
+ return;
+ case '=':
+ vc->vc_priv = EPeq;
return;
+ case '<':
+ vc->vc_priv = EPlt;
+ return;
+ }
+ vc->vc_priv = EPecma;
case ESgetpars:
if (c == ';' && vc->vc_npar < NPAR - 1) {
vc->vc_npar++;
@@ -2250,10 +2262,12 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
vc->vc_state = ESnormal;
switch(c) {
case 'h':
- set_mode(vc, 1);
+ if (vc->vc_priv <= EPdec)
+ set_mode(vc, 1);
return;
case 'l':
- set_mode(vc, 0);
+ if (vc->vc_priv <= EPdec)
+ set_mode(vc, 0);
return;
case 'c':
if (vc->vc_priv == EPdec) {
--
2.11.0


2018-12-15 14:42:09

by Martin Hostettler

[permalink] [raw]
Subject: [PATCH 3/4] vt: ignore csi sequences with intermediate characters.

Various csi sequences contain intermediate characters between the
parameters and the final character. Introduce a additional state that
cleanly ignores these sequences.

This allows the vt to ignore these sequences used by more capable
terminal implementations such as "request mode", etc.

Signed-off-by: Martin Hostettler <[email protected]>
---
drivers/tty/vt/vt.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 448b4f6be7d1..24cd0e9c037b 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2023,7 +2023,7 @@ static void restore_cur(struct vc_data *vc)
}

enum { ESnormal, ESesc, ESsquare, ESgetpars, ESfunckey,
- EShash, ESsetG0, ESsetG1, ESpercent, ESignore, ESnonstd,
+ EShash, ESsetG0, ESsetG1, ESpercent, EScsiignore, ESnonstd,
ESpalette, ESosc };

/* console_lock is held (except via vc_init()) */
@@ -2259,6 +2259,10 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
vc->vc_par[vc->vc_npar] += c - '0';
return;
}
+ if (c >= 0x20 && c <= 0x2f) {
+ vc->vc_state = EScsiignore;
+ return;
+ }
vc->vc_state = ESnormal;
switch(c) {
case 'h':
@@ -2421,6 +2425,11 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
return;
}
return;
+ case EScsiignore:
+ if (c >= 20 && c <= 0x3f)
+ return;
+ vc->vc_state = ESnormal;
+ return;
case ESpercent:
vc->vc_state = ESnormal;
switch (c) {
--
2.11.0


2018-12-15 14:42:50

by Martin Hostettler

[permalink] [raw]
Subject: [PATCH 1/4] vt: refactor vc_ques to allow of other private sequences.

The vc_ques keeps track if a csi sequence is a private DEC control
function beginning with '?'. Nowadays some private control functions
begin with '>' and '='. Switch the code to instead use a new 3-bit
vc_priv that allows for all private use parameter prefixes.

Signed-off-by: Martin Hostettler <[email protected]>
---
drivers/tty/vt/vt.c | 20 +++++++++++---------
include/linux/console_struct.h | 2 +-
2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 41ec8e5010f3..75826a97b0c3 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1341,6 +1341,8 @@ struct vc_data *vc_deallocate(unsigned int currcons)
* VT102 emulator
*/

+enum { EPecma = 0, EPdec, EPeq, EPgt, EPlt};
+
#define set_kbd(vc, x) vt_set_kbd_mode_bit((vc)->vc_num, (x))
#define clr_kbd(vc, x) vt_clr_kbd_mode_bit((vc)->vc_num, (x))
#define is_kbd(vc, x) vt_get_kbd_mode_bit((vc)->vc_num, (x))
@@ -1814,7 +1816,7 @@ static void set_mode(struct vc_data *vc, int on_off)
int i;

for (i = 0; i <= vc->vc_npar; i++)
- if (vc->vc_ques) {
+ if (vc->vc_priv == EPdec) {
switch(vc->vc_par[i]) { /* DEC private modes set/reset */
case 1: /* Cursor keys send ^[Ox/^[[x */
if (on_off)
@@ -2030,7 +2032,7 @@ static void reset_terminal(struct vc_data *vc, int do_clear)
vc->vc_top = 0;
vc->vc_bottom = vc->vc_rows;
vc->vc_state = ESnormal;
- vc->vc_ques = 0;
+ vc->vc_priv = EPecma;
vc->vc_translate = set_translate(LAT1_MAP, vc);
vc->vc_G0_charset = LAT1_MAP;
vc->vc_G1_charset = GRAF_MAP;
@@ -2233,8 +2235,8 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
vc->vc_state=ESfunckey;
return;
}
- vc->vc_ques = (c == '?');
- if (vc->vc_ques)
+ vc->vc_priv = (c == '?') ? EPdec : EPecma;
+ if (vc->vc_priv != EPecma)
return;
case ESgetpars:
if (c == ';' && vc->vc_npar < NPAR - 1) {
@@ -2254,7 +2256,7 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
set_mode(vc, 0);
return;
case 'c':
- if (vc->vc_ques) {
+ if (vc->vc_priv == EPdec) {
if (vc->vc_par[0])
vc->vc_cursor_type = vc->vc_par[0] | (vc->vc_par[1] << 8) | (vc->vc_par[2] << 16);
else
@@ -2263,7 +2265,7 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
}
break;
case 'm':
- if (vc->vc_ques) {
+ if (vc->vc_priv == EPdec) {
clear_selection();
if (vc->vc_par[0])
vc->vc_complement_mask = vc->vc_par[0] << 8 | vc->vc_par[1];
@@ -2273,7 +2275,7 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
}
break;
case 'n':
- if (!vc->vc_ques) {
+ if (vc->vc_priv == EPecma) {
if (vc->vc_par[0] == 5)
status_report(tty);
else if (vc->vc_par[0] == 6)
@@ -2281,8 +2283,8 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
}
return;
}
- if (vc->vc_ques) {
- vc->vc_ques = 0;
+ if (vc->vc_priv != EPecma) {
+ vc->vc_priv = EPecma;
return;
}
switch(c) {
diff --git a/include/linux/console_struct.h b/include/linux/console_struct.h
index ab137f97ecbd..ed798e114663 100644
--- a/include/linux/console_struct.h
+++ b/include/linux/console_struct.h
@@ -119,7 +119,7 @@ struct vc_data {
unsigned int vc_s_blink : 1;
unsigned int vc_s_reverse : 1;
/* misc */
- unsigned int vc_ques : 1;
+ unsigned int vc_priv : 3;
unsigned int vc_need_wrap : 1;
unsigned int vc_can_do_color : 1;
unsigned int vc_report_mouse : 2;
--
2.11.0


2019-01-18 13:01:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: vt: Improve CSI parsing

On Sat, Dec 15, 2018 at 03:34:19PM +0100, Martin Hostettler wrote:
> This patch series improves parsing of csi sequences to be more compliant
> with current practice.
>
> ECMA-64 defines the format of CSI sequences which allow more characters
> than what the vt parser currently accepts. More importantly many of
> these characters are used in sequences that more capable terminal
> terminal implementations use.
>
> Adjust the parsing of CSI sequences to match xterm* by ignoring all
> unknown sequences of the form
> (ESC [)|CSI [\x20-\x3f]*[\x40-\x7e]
>
> This avoids printing unwanted characters when application send valid
> sequences not supported by linux either while querying the terminal for
> it's identity or when applications print sequences without knowing what
> terminal implementation they are connected to (e.g. when connected over
> serial lines, android's adb, simple tcp connects, etc)
>
> * and other common terminals

Thanks for these, now queued up.

greg k-h

2023-12-14 12:11:26

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 3/4] vt: ignore csi sequences with intermediate characters.

On 15. 12. 18, 15:34, Martin Hostettler wrote:
> Various csi sequences contain intermediate characters between the
> parameters and the final character. Introduce a additional state that
> cleanly ignores these sequences.
>
> This allows the vt to ignore these sequences used by more capable
> terminal implementations such as "request mode", etc.
>
> Signed-off-by: Martin Hostettler <[email protected]>
> ---
> drivers/tty/vt/vt.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index 448b4f6be7d1..24cd0e9c037b 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -2023,7 +2023,7 @@ static void restore_cur(struct vc_data *vc)
> }
>
> enum { ESnormal, ESesc, ESsquare, ESgetpars, ESfunckey,
> - EShash, ESsetG0, ESsetG1, ESpercent, ESignore, ESnonstd,
> + EShash, ESsetG0, ESsetG1, ESpercent, EScsiignore, ESnonstd,
> ESpalette, ESosc };
>
> /* console_lock is held (except via vc_init()) */
> @@ -2259,6 +2259,10 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
> vc->vc_par[vc->vc_npar] += c - '0';
> return;
> }
> + if (c >= 0x20 && c <= 0x2f) {
> + vc->vc_state = EScsiignore;
> + return;
> + }
> vc->vc_state = ESnormal;
> switch(c) {
> case 'h':
> @@ -2421,6 +2425,11 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
> return;
> }
> return;
> + case EScsiignore:
> + if (c >= 20 && c <= 0x3f)

Staring at the current code, I am confused as I cannot find out why
"20". Was this supposed to be 0x20 (the same as above -- 0x20 is SPACE
and that _is_ sensible)? Or why was this arbitrary 20 chosen?

thanks,
--
js
suse labs

2024-01-14 15:16:18

by Martin Hostettler

[permalink] [raw]
Subject: Re: [PATCH 3/4] vt: ignore csi sequences with intermediate characters.

On Thu, Dec 14, 2023 at 01:10:07PM +0100, Jiri Slaby wrote:
> On 15. 12. 18, 15:34, Martin Hostettler wrote:
> > Various csi sequences contain intermediate characters between the
> > parameters and the final character. Introduce a additional state that
> > cleanly ignores these sequences.
> >
> > This allows the vt to ignore these sequences used by more capable
> > terminal implementations such as "request mode", etc.
> >
> > Signed-off-by: Martin Hostettler <[email protected]>
> > ---
> > drivers/tty/vt/vt.c | 11 ++++++++++-
> > 1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> > index 448b4f6be7d1..24cd0e9c037b 100644
> > --- a/drivers/tty/vt/vt.c
> > +++ b/drivers/tty/vt/vt.c
> > @@ -2023,7 +2023,7 @@ static void restore_cur(struct vc_data *vc)
> > }
> > enum { ESnormal, ESesc, ESsquare, ESgetpars, ESfunckey,
> > - EShash, ESsetG0, ESsetG1, ESpercent, ESignore, ESnonstd,
> > + EShash, ESsetG0, ESsetG1, ESpercent, EScsiignore, ESnonstd,
> > ESpalette, ESosc };
> > /* console_lock is held (except via vc_init()) */
> > @@ -2259,6 +2259,10 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
> > vc->vc_par[vc->vc_npar] += c - '0';
> > return;
> > }
> > + if (c >= 0x20 && c <= 0x2f) {
> > + vc->vc_state = EScsiignore;
> > + return;
> > + }
> > vc->vc_state = ESnormal;
> > switch(c) {
> > case 'h':
> > @@ -2421,6 +2425,11 @@ static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
> > return;
> > }
> > return;
> > + case EScsiignore:
> > + if (c >= 20 && c <= 0x3f)
>
> Staring at the current code, I am confused as I cannot find out why "20".
> Was this supposed to be 0x20 (the same as above -- 0x20 is SPACE and that
> _is_ sensible)? Or why was this arbitrary 20 chosen?

I'm not sure what happend here. But i agree this should be 0x20 or it
should be removed (see below) but not decimal 20.

This is supposed to match what ECMA-48, xterm and the usual terminal
parsers do.

The most important behavior here is that common usages work, which
fortunatly it seems this did not break.

CAN, SUB and ESC are in the range 20 to 0x20, but they are already handled
before the switch. And 0x20 to 0x3f are properly ignored.

I think that is all that really matters for terminal interoperablity.

When comparing with how xterm handles this state [1], xterm ignores more
characters. If we want to match that, i think we would want to remove the
whole `c >= 20 &&` part.

xterm ignores 0-4, 6, 16-23, 25, 28-0x3f and 127.
Or in other words, it does not ignore
5 (ENQ), 7 (BEL), 8 (BS), 9 (TAB), 10-13, 14 (Shift out), 15 (shift in),
24 (CAN), 26 (SUB), 27 (ESC) and 0x40-126

This code already handles 7, 8-15, 24, 26, 27, 127 before the switch.
The code also does not implement ENQ so effectivly ignoring it is the same
as handling it.

But if we match xterm here, we should also match it in other cases like
ESsquare and ESgetpars.

So in summary i think we should fix this, but the fix does not need any
backporting to stable kernels.

Do you want to send a patch to fix this, or should i send a fix.

Do you have thoughts in which of the two plausible directions we should
change the code?

[1] https://github.com/ThomasDickey/xterm-snapshots/blob/3a151f2f31358d135204c8f90759f3cfd0697b9e/VTPrsTbl.c#L5290

>
> thanks,
> --
> js
> suse labs
>