2020-11-24 10:25:35

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

From: Borislav Petkov <[email protected]>

Hi,

here's what I had in mind, finally split into proper patches. The final
goal is for all users of the decoder to simply call insn_decode() and
look at its retval. Simple.

As to amluto's question about detecting partial insns, see the diff
below.

Running that gives:

insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0

and the return value is always success.

Which means that that buf_len that gets supplied to the decoder
functions doesn't really work and I need to look into it.

That is, provided this is how we want to control what the instruction
decoder decodes - by supplying the length of the buffer it should look
at.

We could also say that probably there should be a way to say "decode
only the first insn in the buffer and ignore the rest". That is all up
to the use cases so I'm looking for suggestions here.

In any case, at least the case where I give it

0x48 0xcf 0x48 0x83

and say that buf size is 4, should return an error because the second
insn is incomplete. So I need to go look at that now.

Thx.

---

diff --git a/arch/x86/tools/insn_sanity.c b/arch/x86/tools/insn_sanity.c
index 51309df285b4..41e1ae0cd833 100644
--- a/arch/x86/tools/insn_sanity.c
+++ b/arch/x86/tools/insn_sanity.c
@@ -220,6 +220,45 @@ static void parse_args(int argc, char **argv)
}
}

+static void run_insn_limits_test(void)
+{
+ unsigned char b[MAX_INSN_SIZE];
+ struct insn insn;
+ int ret, i, size;
+
+ memset(b, INSN_NOP, MAX_INSN_SIZE);
+
+ /* IRETQ */
+ b[0] = 0x48;
+ b[1] = 0xcf;
+
+ /* partial add $0x8, %rsp */
+ b[2] = 0x48;
+ b[3] = 0x83;
+
+ printf("insn buffer:\n");
+
+ for (i = 0; i < MAX_INSN_SIZE; i++)
+ printf("0x%hhx ", b[i]);
+ printf("\n");
+
+ size = MAX_INSN_SIZE;
+ ret = insn_decode(&insn, b, size, INSN_MODE_64);
+ printf("supplied buf size: %d, ret %d\n", size, ret);
+
+ size = 2;
+ ret = insn_decode(&insn, b, size, INSN_MODE_64);
+ printf("supplied buf size: %d, ret %d\n", size, ret);
+
+ size = 3;
+ ret = insn_decode(&insn, b, size, INSN_MODE_64);
+ printf("supplied buf size: %d, ret %d\n", size, ret);
+
+ size = 4;
+ ret = insn_decode(&insn, b, size, INSN_MODE_64);
+ printf("supplied buf size: %d, ret %d\n", size, ret);
+}
+
int main(int argc, char **argv)
{
int insns = 0, ret;
@@ -265,5 +304,7 @@ int main(int argc, char **argv)
errors,
seed);

+ run_insn_limits_test();
+
return errors ? 1 : 0;
}

Borislav Petkov (19):
x86/insn: Rename insn_decode() to insn_decode_regs()
x86/insn: Add @buf_len param to insn_init() kernel-doc comment
x86/insn: Add an insn_decode() API
x86/insn-eval: Handle return values from the decoder
x86/boot/compressed/sev-es: Convert to insn_decode()
perf/x86/intel/ds: Check insn_get_length() retval
perf/x86/intel/ds: Check return values of insn decoder functions
x86/alternative: Use insn_decode()
x86/mce: Convert to insn_decode()
x86/kprobes: Convert to insn_decode()
x86/sev-es: Convert to insn_decode()
x86/traps: Convert to insn_decode()
x86/uprobes: Convert to insn_decode()
x86/tools/insn_decoder_test: Convert to insn_decode()
tools/objtool: Convert to insn_decode()
x86/tools/insn_sanity: Convert to insn_decode()
tools/perf: Convert to insn_decode()
x86/insn: Remove kernel_insn_init()
x86/insn: Make insn_complete() static

arch/x86/boot/compressed/sev-es.c | 11 +-
arch/x86/events/intel/ds.c | 4 +-
arch/x86/events/intel/lbr.c | 10 +-
arch/x86/include/asm/insn-eval.h | 4 +-
arch/x86/include/asm/insn.h | 42 ++--
arch/x86/kernel/alternative.c | 6 +-
arch/x86/kernel/cpu/mce/severity.c | 12 +-
arch/x86/kernel/kprobes/core.c | 17 +-
arch/x86/kernel/kprobes/opt.c | 9 +-
arch/x86/kernel/sev-es.c | 15 +-
arch/x86/kernel/traps.c | 7 +-
arch/x86/kernel/umip.c | 2 +-
arch/x86/kernel/uprobes.c | 8 +-
arch/x86/lib/insn-eval.c | 20 +-
arch/x86/lib/insn.c | 190 ++++++++++++++----
arch/x86/tools/insn_decoder_test.c | 10 +-
arch/x86/tools/insn_sanity.c | 8 +-
tools/arch/x86/include/asm/insn.h | 42 ++--
tools/arch/x86/lib/insn.c | 190 ++++++++++++++----
tools/include/linux/kconfig.h | 73 +++++++
tools/objtool/arch/x86/decode.c | 9 +-
tools/perf/arch/x86/tests/insn-x86.c | 9 +-
tools/perf/arch/x86/util/archinsn.c | 9 +-
.../intel-pt-decoder/intel-pt-insn-decoder.c | 17 +-
24 files changed, 507 insertions(+), 217 deletions(-)
create mode 100644 tools/include/linux/kconfig.h

--
2.21.0


2020-11-24 10:26:52

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 11/19] x86/sev-es: Convert to insn_decode()

From: Borislav Petkov <[email protected]>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/kernel/sev-es.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/sev-es.c b/arch/x86/kernel/sev-es.c
index 37736486603e..564cc9fc693d 100644
--- a/arch/x86/kernel/sev-es.c
+++ b/arch/x86/kernel/sev-es.c
@@ -244,8 +244,7 @@ static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
{
char buffer[MAX_INSN_SIZE];
- enum es_result ret;
- int res;
+ int res, ret;

if (user_mode(ctxt->regs)) {
res = insn_fetch_from_user(ctxt->regs, buffer);
@@ -267,13 +266,13 @@ static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
return ES_EXCEPTION;
}

- insn_init(&ctxt->insn, buffer, MAX_INSN_SIZE - res, 1);
- insn_get_length(&ctxt->insn);
+ ret = insn_decode(&ctxt->insn, buffer, MAX_INSN_SIZE - res, INSN_MODE_64);
}

- ret = ctxt->insn.immediate.got ? ES_OK : ES_DECODE_FAILED;
-
- return ret;
+ if (ret < 0)
+ return ES_DECODE_FAILED;
+ else
+ return ES_OK;
}

static enum es_result vc_write_mem(struct es_em_ctxt *ctxt,
--
2.21.0

2020-11-24 10:26:52

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 06/19] perf/x86/intel/ds: Check insn_get_length() retval

From: Borislav Petkov <[email protected]>

intel_pmu_pebs_fixup_ip() needs only the insn length so use the
appropriate helper instead of a full decode. A full decode differs only
in running insn_complete() on the decoded insn but that is not needed
here.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/events/intel/ds.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index fb327d11a04d..56cbcfee0ab1 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1263,14 +1263,14 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
is_64bit = kernel_ip(to) || any_64bit_mode(regs);
#endif
insn_init(&insn, kaddr, size, is_64bit);
- insn_get_length(&insn);
+
/*
* Make sure there was not a problem decoding the
* instruction and getting the length. This is
* doubly important because we have an infinite
* loop if insn.length=0.
*/
- if (!insn.length)
+ if (insn_get_length(&insn) || !insn.length)
break;

to += insn.length;
--
2.21.0

2020-11-24 10:27:03

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 02/19] x86/insn: Add @buf_len param to insn_init() kernel-doc comment

From: Borislav Petkov <[email protected]>

It wasn't documented so add it. No functional changes.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/lib/insn.c | 1 +
tools/arch/x86/lib/insn.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/x86/lib/insn.c b/arch/x86/lib/insn.c
index 404279563891..1ba994862b56 100644
--- a/arch/x86/lib/insn.c
+++ b/arch/x86/lib/insn.c
@@ -37,6 +37,7 @@
* insn_init() - initialize struct insn
* @insn: &struct insn to be initialized
* @kaddr: address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len: length of the insn buffer at @kaddr
* @x86_64: !0 for 64-bit kernel or 64-bit app
*/
void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index 0151dfc6da61..f3277d6e4ef2 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -37,6 +37,7 @@
* insn_init() - initialize struct insn
* @insn: &struct insn to be initialized
* @kaddr: address (in kernel memory) of instruction (or copy thereof)
+ * @buf_len: length of the insn buffer at @kaddr
* @x86_64: !0 for 64-bit kernel or 64-bit app
*/
void insn_init(struct insn *insn, const void *kaddr, int buf_len, int x86_64)
--
2.21.0

2020-11-24 17:50:50

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> In any case, at least the case where I give it
>
> 0x48 0xcf 0x48 0x83
>
> and say that buf size is 4, should return an error because the second
> insn is incomplete. So I need to go look at that now.

Ok, got it:

./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
supplied buf size: 1, ret -22

the current decoder simply decodes the *first* insn in the buffer it
encounters and that's it.

When you give it a buffer of size smaller than the first instruction:

supplied buf size: 1, ret -22

while the first insn is 2 bytes long:

0x48 0xcf (IRETQ)

then it signals an error.

Andy, does that work for your use cases?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-11-25 02:01:00

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 14/19] x86/tools/insn_decoder_test: Convert to insn_decode()

From: Borislav Petkov <[email protected]>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/tools/insn_decoder_test.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
index 34eda63c124b..472540aeabc2 100644
--- a/arch/x86/tools/insn_decoder_test.c
+++ b/arch/x86/tools/insn_decoder_test.c
@@ -120,7 +120,7 @@ int main(int argc, char **argv)

while (fgets(line, BUFSIZE, stdin)) {
char copy[BUFSIZE], *s, *tab1, *tab2;
- int nb = 0;
+ int nb = 0, ret;
unsigned int b;

if (line[0] == '<') {
@@ -148,10 +148,12 @@ int main(int argc, char **argv)
} else
break;
}
+
/* Decode an instruction */
- insn_init(&insn, insn_buff, sizeof(insn_buff), x86_64);
- insn_get_length(&insn);
- if (insn.length != nb) {
+ ret = insn_decode(&insn, insn_buff, sizeof(insn_buff),
+ x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+
+ if (ret < 0 || insn.length != nb) {
warnings++;
pr_warn("Found an x86 instruction decoder bug, "
"please report this.\n", sym);
--
2.21.0

2020-11-25 02:02:11

by Borislav Petkov

[permalink] [raw]
Subject: [RFC PATCH v0 15/19] tools/objtool: Convert to insn_decode()

From: Borislav Petkov <[email protected]>

Simplify code, no functional changes.

Signed-off-by: Borislav Petkov <[email protected]>
---
tools/objtool/arch/x86/decode.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index cde9c36e40ae..67ee8d2a9e5c 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -90,7 +90,7 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec,
struct list_head *ops_list)
{
struct insn insn;
- int x86_64, sign;
+ int x86_64, sign, ret;
unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
modrm_reg = 0, sib = 0;
@@ -101,10 +101,9 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec,
if (x86_64 == -1)
return -1;

- insn_init(&insn, sec->data->d_buf + offset, maxlen, x86_64);
- insn_get_length(&insn);
-
- if (!insn_complete(&insn)) {
+ ret = insn_decode(&insn, sec->data->d_buf + offset, maxlen,
+ x86_64 ? INSN_MODE_64 : INSN_MODE_32);
+ if (ret < 0) {
WARN("can't decode instruction at %s:0x%lx", sec->name, offset);
return -1;
}
--
2.21.0

2020-11-25 08:05:24

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Tue, 24 Nov 2020 18:46:47 +0100
Borislav Petkov <[email protected]> wrote:

> On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > In any case, at least the case where I give it
> >
> > 0x48 0xcf 0x48 0x83
> >
> > and say that buf size is 4, should return an error because the second
> > insn is incomplete. So I need to go look at that now.
>
> Ok, got it:
>
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> supplied buf size: 1, ret -22
>
> the current decoder simply decodes the *first* insn in the buffer it
> encounters and that's it.

Yes, currently the buf_size is only for checking the maximum length of
the buffer, because we expect the user doesn't know the actual length of
the instruction before calling insn_get_length().
But yes, for the insn_sanity.c, the return length should be compared.

Thank you,

>
> When you give it a buffer of size smaller than the first instruction:
>
> supplied buf size: 1, ret -22
>
> while the first insn is 2 bytes long:
>
> 0x48 0xcf (IRETQ)
>
> then it signals an error.
>
> Andy, does that work for your use cases?
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


--
Masami Hiramatsu <[email protected]>

2020-11-27 22:42:56

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Fri, Nov 27, 2020 at 09:45:39AM -0800, Andy Lutomirski wrote:
> Is -22 (-EINVAL) the same error it returns if you pass in garbage?

Define garbage.

Yes, if you have a sequence of bytes which you can unambiguously
determine to be

- an invalid instruction in some of the tables

- REX prefix with the wrong bits set

- a byte says that some insn part like ModRM or SIB is following but
the buffer falls short

- ... other error condition

then maybe you can say, yes, I'm looking at garbage and can error out
right then and there. But you need to have enough bytes of information
to determine that.

For example (those are random bytes):

00000000000011ff <.asm_start>:
11ff: 95 xchg %eax,%ebp
1200: 14 60 adc $0x60,%al
1202: 77 74 ja 1278 <__libc_csu_init+0x28>
1204: 82 (bad)
1205: 67 dc 55 35 fcoml 0x35(%ebp)

The 0x82 is usually in opcode group 1 but that opcode is invalid in
64-bit mode. So if this is a 64-bit executable, you know that that is an
invalid insn.

Another example:

18: a0 .byte 0xa0
19: 17 (bad)
1a: 27 (bad)
1b: ea (bad)
1c: 90 nop
1d: 90 nop
1e: 90 nop
1f: 90 nop

0xa0 is the opcode for

MOV AL, moffset8

where moffset8 is an address-sized memory offset, which in 64-bit mode
is 64-bit. But we have only 7 bytes after the 0xa0 thus we know that the
buffer is truncated.

If it had one byte more, it would be a valid insn:

18: a0 17 27 ea 90 90 90 movabs 0x9090909090ea2717,%al
20: 90 90


I'm sure you get the idea: if you have enough unambiguous bits which
tell you that this cannot be a valid insn, then you can return early
from the decoder and signal that fact.

I'm not sure that you can do that for all possible byte combinations
and also I'm not sure that it won't ever happen that it per chance
misinterprets garbage data as valid instructions.

> How hard would it be to teach it to return a different error code when
> the buffer is too small?

Yap, see above. Unambiguous cases are clear but I don't know it would
work in all cases. For example, let's say you give it a zeroed out
buffer of 8 bytes which doesn't contain anything - you've just zeroed it
out and haven't put any insns in there yet.

But those are perfectly valid insns:

0: 00 00 add %al,(%rax)
2: 00 00 add %al,(%rax)
4: 00 00 add %al,(%rax)
6: 00 00 add %al,(%rax)

So now you go about your merry way working with those although they're
not real instructions which some tool generated.

See what I mean?

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-11-28 00:28:09

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov <[email protected]> wrote:
>
> On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > In any case, at least the case where I give it
> >
> > 0x48 0xcf 0x48 0x83
> >
> > and say that buf size is 4, should return an error because the second
> > insn is incomplete. So I need to go look at that now.
>
> Ok, got it:
>
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> supplied buf size: 1, ret -22
>
> the current decoder simply decodes the *first* insn in the buffer it
> encounters and that's it.
>
> When you give it a buffer of size smaller than the first instruction:
>
> supplied buf size: 1, ret -22
>
> while the first insn is 2 bytes long:
>
> 0x48 0xcf (IRETQ)
>
> then it signals an error.
>
> Andy, does that work for your use cases?

Is -22 (-EINVAL) the same error it returns if you pass in garbage?
How hard would it be to teach it to return a different error code when
the buffer is too small?

>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette



--
Andy Lutomirski
AMA Capital Management, LLC

2020-11-29 08:55:18

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Fri, 27 Nov 2020 09:45:39 -0800
Andy Lutomirski <[email protected]> wrote:

> On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov <[email protected]> wrote:
> >
> > On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote:
> > > In any case, at least the case where I give it
> > >
> > > 0x48 0xcf 0x48 0x83
> > >
> > > and say that buf size is 4, should return an error because the second
> > > insn is incomplete. So I need to go look at that now.
> >
> > Ok, got it:
> >
> > ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x826fdf9c)
> > insn buffer:
> > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> > supplied buf size: 15, ret 0
> > supplied buf size: 2, ret 0
> > supplied buf size: 3, ret 0
> > supplied buf size: 4, ret 0
> > supplied buf size: 1, ret -22
> >
> > the current decoder simply decodes the *first* insn in the buffer it
> > encounters and that's it.
> >
> > When you give it a buffer of size smaller than the first instruction:
> >
> > supplied buf size: 1, ret -22
> >
> > while the first insn is 2 bytes long:
> >
> > 0x48 0xcf (IRETQ)
> >
> > then it signals an error.
> >
> > Andy, does that work for your use cases?
>
> Is -22 (-EINVAL) the same error it returns if you pass in garbage?
> How hard would it be to teach it to return a different error code when
> the buffer is too small?
>

Good point. I think we can return, e.g. -EFAULT if we failed in get_next(). Then, we can read out next page, for example.

Thank you,

--
Masami Hiramatsu <[email protected]>

2020-11-30 13:50:35

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote:
> Good point. I think we can return, e.g. -EFAULT if we failed in
> get_next(). Then, we can read out next page, for example.

Why -EFAULT?

Running this

size = 1;
ret = insn_decode(&insn, b, size, INSN_MODE_64)

i.e., buffer size is 1:

./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x9994a137)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
insn_decode: entry
insn_decode: will get_length
insn_get_immediate: getting immediate
insn_get_displacement: getting displacement
insn_get_sib: getting sib
insn_get_modrm: entry
insn_get_opcode: entry
insn_get_prefixes: entry, prefixes->got: 0
insn_get_prefixes: 1
insn_get_prefixes: REX
insn_get_prefixes: VEX
insn_get_prefixes: validate_next: 0
insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: 0x7ffec297c3e1
insn_get_prefixes: errored out
supplied buf size: 1, ret -22

is caught in validate_next() where ->next_byte == ->end_kaddr.

I'm thinking we should return EOF here, to denote that we're reached the
end and then propagate that error up the callchain.

We don't have "define EOF" in the kernel but we can designate one for
the insn decoder, perhaps

#define EOF -1

as stdio.h does:

/* The value returned by fgetc and similar functions to indicate the
end of the file. */
#define EOF (-1)

Hmm, but then the callers would need to know EOF too so maybe EIO or
something.

In any case, it should be a value which callers should be able to use to
get told that input buffer is truncated...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-11-30 17:26:17

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Mon, 30 Nov 2020 14:44:42 +0100
Borislav Petkov <[email protected]> wrote:

> On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote:
> > Good point. I think we can return, e.g. -EFAULT if we failed in
> > get_next(). Then, we can read out next page, for example.
>
> Why -EFAULT?

Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other
error (except for -EINVAL) is OK :)

> Running this
>
> size = 1;
> ret = insn_decode(&insn, b, size, INSN_MODE_64)
>
> i.e., buffer size is 1:
>
> ./arch/x86/tools/insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x9994a137)
> insn buffer:
> 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
> supplied buf size: 15, ret 0
> supplied buf size: 2, ret 0
> supplied buf size: 3, ret 0
> supplied buf size: 4, ret 0
> insn_decode: entry
> insn_decode: will get_length
> insn_get_immediate: getting immediate
> insn_get_displacement: getting displacement
> insn_get_sib: getting sib
> insn_get_modrm: entry
> insn_get_opcode: entry
> insn_get_prefixes: entry, prefixes->got: 0
> insn_get_prefixes: 1
> insn_get_prefixes: REX
> insn_get_prefixes: VEX
> insn_get_prefixes: validate_next: 0
> insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: 0x7ffec297c3e1
> insn_get_prefixes: errored out
> supplied buf size: 1, ret -22
>
> is caught in validate_next() where ->next_byte == ->end_kaddr.
>
> I'm thinking we should return EOF here, to denote that we're reached the
> end and then propagate that error up the callchain.

EOF means the end of file and it is not an error. Here, since the
decoder fails to decode the data, so it should return an error code.

>
> We don't have "define EOF" in the kernel but we can designate one for
> the insn decoder, perhaps
>
> #define EOF -1
>
> as stdio.h does:
>
> /* The value returned by fgetc and similar functions to indicate the
> end of the file. */
> #define EOF (-1)

It is because libc fgetc() returns -1 in any error case...

>
> Hmm, but then the callers would need to know EOF too so maybe EIO or
> something.
>
> In any case, it should be a value which callers should be able to use to
> get told that input buffer is truncated...

Yes!

Thank you,

>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


--
Masami Hiramatsu <[email protected]>

2020-12-02 18:10:02

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API

On Tue, Dec 01, 2020 at 02:21:45AM +0900, Masami Hiramatsu wrote:
> Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other
> error (except for -EINVAL) is OK :)

ENODATA it is. :)

And propagating that error value is easy because the err_out: labels are
all coming from the validate_next() error path so we basically know that
the buffer has ended.

./insn_sanity: Success: decoded and checked 10000 random instructions with 0 errors (seed:0x7bdfa56e)
insn buffer:
0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90
supplied buf size: 15, ret 0
supplied buf size: 2, ret 0
supplied buf size: 3, ret 0
supplied buf size: 4, ret 0
insn_decode: entry
insn_decode: will get_length
insn_get_immediate: getting immediate
insn_get_displacement: getting displacement
insn_get_sib: getting sib
insn_get_modrm: entry
insn_get_opcode: entry
insn_get_prefixes: entry, prefixes->got: 0
insn_get_prefixes: 1
insn_get_prefixes: REX
insn_get_prefixes: VEX
insn_get_prefixes: validate_next: 0
insn_get_prefixes: insn->next_byte: 0x7ffc211eb661, insn->end_kaddr: 0x7ffc211eb661
insn_get_prefixes: errored out
supplied buf size: 1, ret -61

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette