2012-08-04 05:38:37

by Nicolas Pitre

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Tue, 31 Jul 2012, Cyril Chemparathy wrote:

> The original phys_to_virt/virt_to_phys patching implementation relied on early
> patching prior to MMU initialization. On PAE systems running out of >4G
> address space, this would have entailed an additional round of patching after
> switching over to the high address space.
>
> The approach implemented here conceptually extends the original PHYS_OFFSET
> patching implementation with the introduction of "early" patch stubs. Early
> patch code is required to be functional out of the box, even before the patch
> is applied. This is implemented by inserting functional (but inefficient)
> load code into the .patch.code init section. Having functional code out of
> the box then allows us to defer the init time patch application until later
> in the init sequence.
>
> In addition to fitting better with our need for physical address-space
> switch-over, this implementation should be somewhat more extensible by virtue
> of its more readable (and hackable) C implementation. This should prove
> useful for other similar init time specialization needs, especially in light
> of our multi-platform kernel initiative.
>
> This code has been boot tested in both ARM and Thumb-2 modes on an ARMv7
> (Cortex-A8) device.
>
> Note: the obtuse use of stringified symbols in patch_stub() and
> early_patch_stub() is intentional. Theoretically this should have been
> accomplished with formal operands passed into the asm block, but this requires
> the use of the 'c' modifier for instantiating the long (e.g. .long %c0).
> However, the 'c' modifier has been found to ICE certain versions of GCC, and
> therefore we resort to stringified symbols here.
>
> Signed-off-by: Cyril Chemparathy <[email protected]>

This looks very nice. Comments below.

> ---
> arch/arm/include/asm/patch.h | 123 +++++++++++++++++++++++++++++

Please find a better name for this file. "patch" is way too generic and
commonly referring to something different. "runtime-patching" or similar
would be more descriptive.

> arch/arm/kernel/module.c | 4 +
> arch/arm/kernel/setup.c | 175 +++++++++++++++++++++++++++++++++++++++++

This is complex enough to warrant a separate source file. Please move
those additions out from setup.c. Given a good name for the header file
above, the c file could share the same name.

> new file mode 100644
> index 0000000..a89749f
> --- /dev/null
> +++ b/arch/arm/include/asm/patch.h
> @@ -0,0 +1,123 @@
> +/*
> + * arch/arm/include/asm/patch.h
> + *
> + * Copyright (C) 2012, Texas Instruments
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Note: this file should not be included by non-asm/.h files
> + */
> +#ifndef __ASM_ARM_PATCH_H
> +#define __ASM_ARM_PATCH_H
> +
> +#include <linux/stringify.h>
> +
> +#ifndef __ASSEMBLY__
> +
> extern unsigned __patch_table_begin, __patch_table_end;

You could use "extern void __patch_table_begin" so those symbols don't
get any type that could be misused by mistake, while you still can take
their addresses.

> +
> +struct patch_info {
> + u32 type;
> + u32 size;

Given the possibly large number of table entries, some effort at making
those entries as compact as possible should be considered. For instance,
the type and size fields could be u8's and insn_end pointer replaced
with another size expressed as an u8. By placing all the u8's together
they would occupy a single word by themselves. The assembly stub would
only need a .align statement to reflect the c structure's padding.
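
To make the size argument concrete, here is a minimal sketch of the two
layouts. Only type, size and insn_end come from the patch under review;
insn_start, insn_size and both struct names are assumptions made up for
illustration, and uint32_t stands in for 32-bit ARM addresses so the
numbers are the same on any build host.

```c
#include <stddef.h>
#include <stdint.h>

/* Wide layout: every field is a full word (insn_start is hypothetical). */
struct patch_info_wide {
	uint32_t type;
	uint32_t size;
	uint32_t insn_start;	/* hypothetical: address of the patch site */
	uint32_t insn_end;	/* end pointer mentioned in the review */
};

/* Packed layout: the three u8 fields share a single word. */
struct patch_info_packed {
	uint32_t insn_start;	/* hypothetical: address of the patch site */
	uint8_t type;
	uint8_t size;
	uint8_t insn_size;	/* hypothetical: replaces insn_end */
	/* one byte of implicit padding, mirrored by .align in the asm stub */
};

/* Recover the end address that the wide layout stored explicitly. */
static uint32_t patch_info_insn_end(const struct patch_info_packed *p)
{
	return p->insn_start + p->insn_size;
}
```

With these assumed fields an entry shrinks from 16 bytes to 8, at the
cost of one addition whenever the end address is needed.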

[...]

Did you verify with some test program that your patching routines do
produce the same opcodes as the assembled equivalent for all possible
shift values? Especially for Thumb2 code which isn't as trivial to get
right as the ARM one.


Nicolas


2012-08-05 13:56:40

by Cyril Chemparathy

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

Hi Nicolas,

On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
> On Tue, 31 Jul 2012, Cyril Chemparathy wrote:
>
>> The original phys_to_virt/virt_to_phys patching implementation relied on early
>> patching prior to MMU initialization. On PAE systems running out of >4G
>> address space, this would have entailed an additional round of patching after
>> switching over to the high address space.
>>
>> The approach implemented here conceptually extends the original PHYS_OFFSET
>> patching implementation with the introduction of "early" patch stubs. Early
>> patch code is required to be functional out of the box, even before the patch
>> is applied. This is implemented by inserting functional (but inefficient)
>> load code into the .patch.code init section. Having functional code out of
>> the box then allows us to defer the init time patch application until later
>> in the init sequence.
>>
>> In addition to fitting better with our need for physical address-space
>> switch-over, this implementation should be somewhat more extensible by virtue
>> of its more readable (and hackable) C implementation. This should prove
>> useful for other similar init time specialization needs, especially in light
>> of our multi-platform kernel initiative.
>>
>> This code has been boot tested in both ARM and Thumb-2 modes on an ARMv7
>> (Cortex-A8) device.
>>
>> Note: the obtuse use of stringified symbols in patch_stub() and
>> early_patch_stub() is intentional. Theoretically this should have been
>> accomplished with formal operands passed into the asm block, but this requires
>> the use of the 'c' modifier for instantiating the long (e.g. .long %c0).
>> However, the 'c' modifier has been found to ICE certain versions of GCC, and
>> therefore we resort to stringified symbols here.
>>
>> Signed-off-by: Cyril Chemparathy <[email protected]>
>
> This looks very nice. Comments below.
>
>> ---
>> arch/arm/include/asm/patch.h | 123 +++++++++++++++++++++++++++++
>
> Please find a better name for this file. "patch" is way too generic and
> commonly referring to something different. "runtime-patching" or similar
> would be more descriptive.
>

Sure. Does init-patch sound about right? We need to reflect the fact
that this is intended for init-time patching only.

>> arch/arm/kernel/module.c | 4 +
>> arch/arm/kernel/setup.c | 175 +++++++++++++++++++++++++++++++++++++++++
>
> This is complex enough to warrant a separate source file. Please move
> those additions out from setup.c. Given a good name for the header file
> above, the c file could share the same name.
>

Sure.

>> new file mode 100644
>> index 0000000..a89749f
>> --- /dev/null
>> +++ b/arch/arm/include/asm/patch.h
>> @@ -0,0 +1,123 @@
>> +/*
>> + * arch/arm/include/asm/patch.h
>> + *
>> + * Copyright (C) 2012, Texas Instruments
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * Note: this file should not be included by non-asm/.h files
>> + */
>> +#ifndef __ASM_ARM_PATCH_H
>> +#define __ASM_ARM_PATCH_H
>> +
>> +#include <linux/stringify.h>
>> +
>> +#ifndef __ASSEMBLY__
>> +
>> extern unsigned __patch_table_begin, __patch_table_end;
>
> You could use "extern void __patch_table_begin" so those symbols don't
> get any type that could be misused by mistake, while you still can take
> their addresses.
>

Sure.

>> +
>> +struct patch_info {
>> + u32 type;
>> + u32 size;
>
> Given the possibly large number of table entries, some effort at making
> those entries as compact as possible should be considered. For instance,
> the type and size fields could be u8's and insn_end pointer replaced
> with another size expressed as an u8. By placing all the u8's together
> they would occupy a single word by themselves. The assembly stub would
> only need a .align statement to reflect the c structure's padding.
>

Thanks, will try and pack this struct up.

> [...]
>
> Did you verify with some test program that your patching routines do
> produce the same opcodes as the assembled equivalent for all possible
> shift values? Especially for Thumb2 code which isn't as trivial to get
> right as the ARM one.
>

Not quite all, but I'm sure I can conjure up an off-line test harness to
do so.


Much appreciated feedback. Thanks for taking a look.

--
Thanks
- Cyril

2012-08-07 22:53:19

by Cyril Chemparathy

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

Hi Nicolas,

On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
[...]
>> extern unsigned __patch_table_begin, __patch_table_end;
>
> You could use "extern void __patch_table_begin" so those symbols don't
> get any type that could be misused by mistake, while you still can take
> their addresses.
>

Looks like we'll have to stick with a non-void type here. The compiler
throws a warning when we try to take the address of a void.

[...]
> Did you verify with some test program that your patching routines do
> produce the same opcodes as the assembled equivalent for all possible
> shift values? Especially for Thumb2 code which isn't as trivial to get
> right as the ARM one.
>

We've refactored the patching code into separate functions as:

static int do_patch_imm8_arm(u32 insn, u32 imm, u32 *ninsn);
static int do_patch_imm8_thumb(u32 insn, u32 imm, u32 *ninsn);


With this, the following test code has been used to verify the generated
instruction encoding:

u32 arm_check[] = {
0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
};

u32 thumb_check[] = {
0xf1010081, 0xf5017081, 0xf5017001, 0xf5016081, 0xf5016001,
0xf5015081, 0xf5015001, 0xf5014081, 0xf5014001, 0xf5013081,
0xf5013001, 0xf5012081, 0xf5012001, 0xf5011081, 0xf5011001,
0xf5010081, 0xf5010001, 0xf1017081, 0xf1017001, 0xf1016081,
0xf1016001, 0xf1015081, 0xf1015001, 0xf1014081, 0xf1014001,
};

int do_test(void)
{
	int i, ret;
	u32 ninsn, insn;

	/* ARM: patch 0x41 << i into the first opcode, compare with the table */
	insn = arm_check[0];
	for (i = 0; i < ARRAY_SIZE(arm_check); i++) {
		ret = do_patch_imm8_arm(insn, 0x41 << i, &ninsn);
		if (ret < 0) {
			pr_err("patch failed at shift %d\n", i);
			continue;	/* ninsn is not valid on failure */
		}
		if (ninsn != arm_check[i])
			pr_err("mismatch at %d, expect %x, got %x\n",
			       i, arm_check[i], ninsn);
	}

	/* Thumb-2: same check with 0x81 << i */
	insn = thumb_check[0];
	for (i = 0; i < ARRAY_SIZE(thumb_check); i++) {
		ret = do_patch_imm8_thumb(insn, 0x81 << i, &ninsn);
		if (ret < 0) {
			pr_err("patch failed at shift %d\n", i);
			continue;	/* ninsn is not valid on failure */
		}
		if (ninsn != thumb_check[i])
			pr_err("mismatch at %d, expect %x, got %x\n",
			       i, thumb_check[i], ninsn);
	}

	return 0;
}
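
As a cross-check on the hand-written arm_check table, the expected
opcodes can also be computed. The following is only a sketch: rol32_ref,
arm_imm12_ref and arm_add_ref are hypothetical helpers, not part of the
patch and not the do_patch_imm8_arm implementation. It relies on the ARM
rule that a data-processing immediate is an 8-bit value rotated right by
an even amount, and it picks the smallest rotation field that works.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. */
static uint32_t rol32_ref(uint32_t v, unsigned int n)
{
	n &= 31;
	return n ? (v << n) | (v >> (32 - n)) : v;
}

/*
 * Hypothetical reference encoder: return the 12-bit rot:imm8 field for
 * val, or -1 if val is not an 8-bit value rotated right by 2 * rot.
 */
static int arm_imm12_ref(uint32_t val)
{
	unsigned int rot;

	for (rot = 0; rot < 16; rot++) {
		/* undo the rotate-right by rotating left by the same amount */
		uint32_t imm8 = rol32_ref(val, 2 * rot);

		if (imm8 <= 0xff)
			return (int)((rot << 8) | imm8);
	}
	return -1;
}

/* Expected opcode for "add r0, r1, #val", or 0 if val is unencodable. */
static uint32_t arm_add_ref(uint32_t val)
{
	int imm12 = arm_imm12_ref(val);

	return imm12 < 0 ? 0 : 0xe2810000u | (uint32_t)imm12;
}
```

For example, arm_add_ref(0x41 << 2) gives 0xe2810f41, which is the third
entry of arm_check.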

Any ideas on improving these tests?

--
Thanks
- Cyril

2012-08-08 05:56:46

by Nicolas Pitre

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Tue, 7 Aug 2012, Cyril Chemparathy wrote:

> Hi Nicolas,
>
> On 8/4/2012 1:38 AM, Nicolas Pitre wrote:
> [...]
> > > extern unsigned __patch_table_begin, __patch_table_end;
> >
> > You could use "extern void __patch_table_begin" so those symbols don't
> > get any type that could be misused by mistake, while you still can take
> > their addresses.
> >
>
> Looks like we'll have to stick with a non-void type here. The compiler throws
> a warning when we try to take the address of a void.

Ah, I see. Bummer. This used not to be the case with older gcc
versions.
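
One alternative that is sometimes used for linker-provided boundary
symbols is to declare them as arrays of unknown size. This is offered as
an assumption about what could work here, not as what the patch does.
The entry layout and both helper functions below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical entry layout, for illustration only. */
struct patch_entry {
	unsigned int type;
	unsigned int size;
};

/*
 * Declared as arrays of unknown size: the bare names are already
 * addresses, so no object of a misleading type can be read by mistake.
 * In the kernel the linker script would define both symbols; this
 * sketch never references them, so it links without one.
 */
extern const struct patch_entry __patch_table_begin[], __patch_table_end[];

/* Number of entries in [begin, end). */
static size_t patch_table_count(const struct patch_entry *begin,
				const struct patch_entry *end)
{
	return (size_t)(end - begin);
}

/* Sum of the size fields over [begin, end). */
static unsigned int patch_table_total_size(const struct patch_entry *begin,
					   const struct patch_entry *end)
{
	unsigned int total = 0;

	for (; begin < end; begin++)
		total += begin->size;
	return total;
}
```

In the kernel the walk would then be called as
patch_table_count(__patch_table_begin, __patch_table_end).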

> [...]
> > Did you verify with some test program that your patching routines do
> > produce the same opcodes as the assembled equivalent for all possible
> > shift values? Especially for Thumb2 code which isn't as trivial to get
> > right as the ARM one.
> >
>
> We've refactored the patching code into separate functions as:
>
> static int do_patch_imm8_arm(u32 insn, u32 imm, u32 *ninsn);
> static int do_patch_imm8_thumb(u32 insn, u32 imm, u32 *ninsn);
>
>
> With this, the following test code has been used to verify the generated
> instruction encoding:
>
> u32 arm_check[] = {
> 0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
> 0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
> 0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
> 0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
> 0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
> };

Instead of using this array you could let the assembler do it for you
like this:

asm (" \n\
.arm \n\
arm_check: \n\
.set shft, 0 \n\
.rep 12 \n\
add r1, r2, #0x81 << \shft \n\
.set shft, \shft + 2 \n\
.endr \n\
");

> u32 thumb_check[] = {
> 0xf1010081, 0xf5017081, 0xf5017001, 0xf5016081, 0xf5016001,
> 0xf5015081, 0xf5015001, 0xf5014081, 0xf5014001, 0xf5013081,
> 0xf5013001, 0xf5012081, 0xf5012001, 0xf5011081, 0xf5011001,
> 0xf5010081, 0xf5010001, 0xf1017081, 0xf1017001, 0xf1016081,
> 0xf1016001, 0xf1015081, 0xf1015001, 0xf1014081, 0xf1014001,

Same idea here.
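
For an off-line comparison that does not need a Thumb-2 assembler, the
expected values can also be computed. This is a sketch under stated
assumptions: rol32_t2, thumb2_imm12_ref and thumb2_add_ref are
hypothetical names, the opcode is held with the first halfword in the
upper 16 bits as in thumb_check, and only the plain 8-bit form and the
rotated form of the Thumb-2 modified immediate are handled. The
repeated-byte forms are deliberately left out.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits. */
static uint32_t rol32_t2(uint32_t v, unsigned int n)
{
	n &= 31;
	return n ? (v << n) | (v >> (32 - n)) : v;
}

/*
 * Hypothetical reference encoder: return the i:imm3:imm8 field for val,
 * or -1 if val is outside the two forms handled by this sketch.
 */
static int thumb2_imm12_ref(uint32_t val)
{
	unsigned int rot;

	if (val <= 0xff)
		return (int)val;		/* plain 8-bit form */

	for (rot = 8; rot < 32; rot++) {
		/* undo the rotate-right by rotating left */
		uint32_t u = rol32_t2(val, rot);

		/* rotated form: 8 bits wide with bit 7 set, bit 7 implied */
		if (u <= 0xff && (u & 0x80))
			return (int)((rot << 7) | (u & 0x7f));
	}
	return -1;
}

/* Expected opcode for "add.w r0, r1, #val", or 0 if not handled. */
static uint32_t thumb2_add_ref(uint32_t val)
{
	int imm12 = thumb2_imm12_ref(val);
	uint32_t i, imm3, imm8;

	if (imm12 < 0)
		return 0;
	i = ((uint32_t)imm12 >> 11) & 1;
	imm3 = ((uint32_t)imm12 >> 8) & 7;
	imm8 = (uint32_t)imm12 & 0xff;
	return 0xf1010000u | (i << 26) | (imm3 << 12) | imm8;
}
```

For example, thumb2_add_ref(0x81 << 1) gives 0xf5017081, the second
entry of thumb_check.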


Nicolas

2012-08-08 13:19:06

by Cyril Chemparathy

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On 08/08/12 01:56, Nicolas Pitre wrote:
> On Tue, 7 Aug 2012, Cyril Chemparathy wrote:
[...]
>> u32 arm_check[] = {
>> 0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
>> 0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
>> 0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
>> 0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
>> 0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
>> };
>
> Instead of using this array you could let the assembler do it for you
> like this:
>
> asm (" \n\
> .arm \n\
> arm_check: \n\
> .set shft, 0 \n\
> .rep 12 \n\
> add r1, r2, #0x81 << \shft \n\
> .set shft, \shft + 2 \n\
> .endr \n\
> ");
>

Neat macro magic. Are you thinking that we build this in as a self test
in the code?

Thanks
-- Cyril.

2012-08-08 13:55:17

by Nicolas Pitre

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Wed, 8 Aug 2012, Cyril Chemparathy wrote:

> On 08/08/12 01:56, Nicolas Pitre wrote:
> > On Tue, 7 Aug 2012, Cyril Chemparathy wrote:
> [...]
> > > u32 arm_check[] = {
> > > 0xe2810041, 0xe2810082, 0xe2810f41, 0xe2810f82, 0xe2810e41,
> > > 0xe2810e82, 0xe2810d41, 0xe2810d82, 0xe2810c41, 0xe2810c82,
> > > 0xe2810b41, 0xe2810b82, 0xe2810a41, 0xe2810a82, 0xe2810941,
> > > 0xe2810982, 0xe2810841, 0xe2810882, 0xe2810741, 0xe2810782,
> > > 0xe2810641, 0xe2810682, 0xe2810541, 0xe2810582, 0xe2810441,
> > > };
> >
> > Instead of using this array you could let the assembler do it for you
> > like this:
> >
> > asm (" \n\
> > .arm \n\
> > arm_check: \n\
> > .set shft, 0 \n\
> > .rep 12 \n\
> > add r1, r2, #0x81 << \shft \n\
> > .set shft, \shft + 2 \n\
> > .endr \n\
> > ");
> >
>
> Neat macro magic. Are you thinking that we build this in as a self test in
> the code?

For such things, this is never a bad idea to have some test alongside
with the main code, especially if this is extended to more cases in the
future. It is too easy to break it in subtle ways.

See arch/arm/kernel/kprobes-test*.c for a precedent.


Nicolas

2012-08-08 16:06:37

by Russell King - ARM Linux

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Wed, Aug 08, 2012 at 09:55:12AM -0400, Nicolas Pitre wrote:
> On Wed, 8 Aug 2012, Cyril Chemparathy wrote:
> > Neat macro magic. Are you thinking that we build this in as a self test in
> > the code?
>
> For such things, this is never a bad idea to have some test alongside
> with the main code, especially if this is extended to more cases in the
> future. It is too easy to break it in subtle ways.
>
> See arch/arm/kernel/kprobes-test*.c for a precedent.

Done correctly, it shouldn't be a problem, but I wouldn't say that
arch/arm/kernel/kprobes-test*.c is done correctly. It's seen quite
a number of patching attempts since it was introduced for various
problems, and I've seen quite a number of builds fail for various
reasons in this file (none which I could be bothered to investigate.)

When the test code ends up causing more problems than the code it's
testing, something is definitely wrong.

2012-08-08 16:56:59

by Nicolas Pitre

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Wed, 8 Aug 2012, Russell King - ARM Linux wrote:

> On Wed, Aug 08, 2012 at 09:55:12AM -0400, Nicolas Pitre wrote:
> > On Wed, 8 Aug 2012, Cyril Chemparathy wrote:
> > > Neat macro magic. Are you thinking that we build this in as a self test in
> > > the code?
> >
> > For such things, this is never a bad idea to have some test alongside
> > with the main code, especially if this is extended to more cases in the
> > future. It is too easy to break it in subtle ways.
> >
> > See arch/arm/kernel/kprobes-test*.c for a precedent.
>
> Done correctly, it shouldn't be a problem, but I wouldn't say that
> arch/arm/kernel/kprobes-test*.c is done correctly. It's seen quite
> a number of patching attempts since it was introduced for various
> problems, and I've seen quite a number of builds fail for various
> reasons in this file (none which I could be bothered to investigate.)
>
> When the test code ends up causing more problems than the code it's
> testing, something is definitely wrong.

I think we shouldn't compare the complexity of test code for kprobes and
test code for runtime patching code. The former, while more difficult
to keep compiling, has found loads of issues in the former kprobes code.
So it certainly paid back many times its cost in maintenance.

My mention of it wasn't about the actual test code implementation, but
rather about the fact that we do have test code in the tree which can be
enabled with a config option.
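
A rough sketch of what such an optional self-test could look like
follows. CONFIG_ARM_PATCH_SELFTEST is a made-up option name, forced on
here so the sketch builds outside the kernel, and demo_patch_imm8 is a
stand-in that only handles unrotated immediates. Only the function
signature is taken from do_patch_imm8_arm as posted earlier in the
thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical Kconfig symbol, forced on for this stand-alone demo. */
#define CONFIG_ARM_PATCH_SELFTEST 1

/* Same shape as do_patch_imm8_arm(insn, imm, ninsn) from the thread. */
typedef int (*patch_fn)(uint32_t insn, uint32_t imm, uint32_t *ninsn);

struct patch_test_case {
	uint32_t imm;		/* immediate to patch in */
	uint32_t expect;	/* opcode the assembler would produce */
};

#ifdef CONFIG_ARM_PATCH_SELFTEST
/* Run every case through fn and return the number of failures. */
static int patch_selftest(patch_fn fn, uint32_t insn,
			  const struct patch_test_case *cases, int n)
{
	int i, failures = 0;

	for (i = 0; i < n; i++) {
		uint32_t ninsn = 0;

		if (fn(insn, cases[i].imm, &ninsn) < 0 ||
		    ninsn != cases[i].expect) {
			printf("case %d: expect %x, got %x\n", i,
			       (unsigned int)cases[i].expect,
			       (unsigned int)ninsn);
			failures++;
		}
	}
	return failures;
}
#else
/* Compiled out: the call site stays, the test costs nothing. */
static inline int patch_selftest(patch_fn fn, uint32_t insn,
				 const struct patch_test_case *cases, int n)
{
	return 0;
}
#endif

/* Stand-in patcher for the demo: unrotated 8-bit immediates only. */
static int demo_patch_imm8(uint32_t insn, uint32_t imm, uint32_t *ninsn)
{
	if (imm > 0xff)
		return -1;
	*ninsn = (insn & ~0xfffu) | imm;
	return 0;
}
```

The point of the #else branch is that callers do not need their own
#ifdef; with the option disabled the test simply reports no failures.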

As for build failures with that test code, I'd suggest you simply drop a
note to Tixy who is normally very responsive. I randomly enable it
myself and didn't run into any issues yet.


Nicolas

2012-08-09 07:30:26

by Tixy

Subject: Re: [PATCH 01/22] ARM: add mechanism for late code patching

On Wed, 2012-08-08 at 12:56 -0400, Nicolas Pitre wrote:
> On Wed, 8 Aug 2012, Russell King - ARM Linux wrote:
> > Done correctly, it shouldn't be a problem, but I wouldn't say that
> > arch/arm/kernel/kprobes-test*.c is done correctly. It's seen quite
> > a number of patching attempts since it was introduced for various
> > problems, and I've seen quite a number of builds fail for various
> > reasons in this file (none which I could be bothered to investigate.)
<snip>
> >
> As for build failures with that test code, I'd suggest you simply drop a
> note to Tixy who is normally very responsive.

Indeed. If there are build failures, I'm happy to investigate and fix.

--
Tixy