2014-11-01 13:11:29

by Chen Gang

[permalink] [raw]
Subject: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

Not only memcpy() is faster than pointer copy, but also let code more
clearer and simple, which can avoid compiling warning (the original
implementation copy registers by exceeding member array border).

The related warning (with allmodconfig under tile):

CC arch/tile/kernel/kgdb.o
arch/tile/kernel/kgdb.c: In function 'sleeping_thread_to_gdb_regs':
arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
*(ptr++) = thread_regs->regs[reg];
^
arch/tile/kernel/kgdb.c:139:2: note: containing loop
for (reg = 0; reg <= TREG_LAST_GPR; reg++)
^

Signed-off-by: Chen Gang <[email protected]>
---
arch/tile/kernel/kgdb.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/tile/kernel/kgdb.c b/arch/tile/kernel/kgdb.c
index 4cd8838..ff5335a 100644
--- a/arch/tile/kernel/kgdb.c
+++ b/arch/tile/kernel/kgdb.c
@@ -125,9 +125,7 @@ int dbg_set_reg(int regno, void *mem, struct pt_regs *regs)
void
sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *task)
{
- int reg;
struct pt_regs *thread_regs;
- unsigned long *ptr = gdb_regs;

if (task == NULL)
return;
@@ -136,9 +134,7 @@ sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *task)
memset(gdb_regs, 0, NUMREGBYTES);

thread_regs = task_pt_regs(task);
- for (reg = 0; reg <= TREG_LAST_GPR; reg++)
- *(ptr++) = thread_regs->regs[reg];
-
+ memcpy(gdb_regs, thread_regs, TREG_LAST_GPR * sizeof(unsigned long));
gdb_regs[TILEGX_PC_REGNUM] = thread_regs->pc;
gdb_regs[TILEGX_FAULTNUM_REGNUM] = thread_regs->faultnum;
}
--
1.9.3


2014-11-12 02:05:23

by Chen Gang

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

Hello maintainer:

Please help check this patch, when you have time (sorry for my original
typo for your email address, so maybe you did not notice about the
original mail).

Thanks.

On 11/1/14 21:17, Chen Gang wrote:
> Not only memcpy() is faster than pointer copy, but also let code more
> clearer and simple, which can avoid compiling warning (the original
> implementation copy registers by exceeding member array border).
>
> The related warning (with allmodconfig under tile):
>
> CC arch/tile/kernel/kgdb.o
> arch/tile/kernel/kgdb.c: In function 'sleeping_thread_to_gdb_regs':
> arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
> *(ptr++) = thread_regs->regs[reg];
> ^
> arch/tile/kernel/kgdb.c:139:2: note: containing loop
> for (reg = 0; reg <= TREG_LAST_GPR; reg++)
> ^
>
> Signed-off-by: Chen Gang <[email protected]>
> ---
> arch/tile/kernel/kgdb.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/arch/tile/kernel/kgdb.c b/arch/tile/kernel/kgdb.c
> index 4cd8838..ff5335a 100644
> --- a/arch/tile/kernel/kgdb.c
> +++ b/arch/tile/kernel/kgdb.c
> @@ -125,9 +125,7 @@ int dbg_set_reg(int regno, void *mem, struct pt_regs *regs)
> void
> sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *task)
> {
> - int reg;
> struct pt_regs *thread_regs;
> - unsigned long *ptr = gdb_regs;
>
> if (task == NULL)
> return;
> @@ -136,9 +134,7 @@ sleeping_thread_to_gdb_regs(unsigned long *gdb_regs, struct task_struct *task)
> memset(gdb_regs, 0, NUMREGBYTES);
>
> thread_regs = task_pt_regs(task);
> - for (reg = 0; reg <= TREG_LAST_GPR; reg++)
> - *(ptr++) = thread_regs->regs[reg];
> -
> + memcpy(gdb_regs, thread_regs, TREG_LAST_GPR * sizeof(unsigned long));
> gdb_regs[TILEGX_PC_REGNUM] = thread_regs->pc;
> gdb_regs[TILEGX_FAULTNUM_REGNUM] = thread_regs->faultnum;
> }
>

--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed

2014-11-12 13:27:48

by Jeff Epler

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

It seems there's additional background required to understand the
diagnostic:

asm/ptrace.h has
struct pt_regs {
/* tp, sp, and lr must immediately follow regs[] for aliasing. */
pt_reg_t regs[53];
pt_reg_t tp; /* aliases regs[TREG_TP] */
pt_reg_t sp; /* aliases regs[TREG_SP] */
pt_reg_t lr; /* aliases regs[TREG_LR] */
and the intended copy overwites all of regs[], plus tp, sp, and lr.

It's intended for thread_regs.regs[TREG_TP] to alias to thread_regs.tp,
though in C this is undefined behavior (it dereferences a pointer past
the end of the structure).
> > arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
> > *(ptr++) = thread_regs->regs[reg];

If compilers are beginning to exploit the rule that indexing past the
end of an array is UB, then the way that these register aliases are
created may need to be revisited with careful attention to what the C
standard actually says; I'm just going by memory. (I assume the
compiler could do things like replace an intended load from memory with
a constant load or even no load at all)

Jeff

2014-11-12 15:43:20

by Chen Gang

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

On 11/12/2014 09:27 PM, Jeff Epler wrote:
> It seems there's additional background required to understand the
> diagnostic:
>
> asm/ptrace.h has
> struct pt_regs {
> /* tp, sp, and lr must immediately follow regs[] for aliasing. */
> pt_reg_t regs[53];
> pt_reg_t tp; /* aliases regs[TREG_TP] */
> pt_reg_t sp; /* aliases regs[TREG_SP] */
> pt_reg_t lr; /* aliases regs[TREG_LR] */
> and the intended copy overwites all of regs[], plus tp, sp, and lr.
>
> It's intended for thread_regs.regs[TREG_TP] to alias to thread_regs.tp,
> though in C this is undefined behavior (it dereferences a pointer past
> the end of the structure).

Yeah.

>>> arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
>>> *(ptr++) = thread_regs->regs[reg];
>
> If compilers are beginning to exploit the rule that indexing past the
> end of an array is UB, then the way that these register aliases are
> created may need to be revisited with careful attention to what the C
> standard actually says; I'm just going by memory.

"thread_regs.tp" and "TREG_SP" (not "TREG_TP") are all used, at present.
For me, both of them can let code readable and clearer, and they are not
conflict (variable let address clearer, macro let index clearer).

So for me, only "memcpy() instead of 'for' looping" is engough. Not
consider about compiler's warning, memcpy() can get a little higher
performance and simple code.

> (I assume the
> compiler could do things like replace an intended load from memory with
> a constant load or even no load at all)
>

Excuse me, my English is not quite well, I can not understand what you
said above. (If necessary, please help provide more details for it).

Thanks.
--
Chen Gang

Open share and attitude like air water and life which God blessed

2014-11-12 19:29:36

by Jeff Epler

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

On Wed, Nov 12, 2014 at 11:43:08PM +0800, Chen Gang wrote:
> > (I assume the
> > compiler could do things like replace an intended load from memory with
> > a constant load or even no load at all)
> >
>
> Excuse me, my English is not quite well, I can not understand what you
> said above. (If necessary, please help provide more details for it).

I am concerned that writing regs[TREG_TP] is "undefined behavior"
according to the C standard.

This expression is equivalent to *(regs + TREG_TP). The expression
(regs + TREG_TP) does not result in a pointer to any element of regs[],
so dereferencing it is undefined behavior. (Source: C99 draft standard
WG14/N1256, annex J.2, "[The behavior is undefined if t]he operand of
the unary * operator has an invalid value")

That is why the compiler showed the original diagnostic, but the same
logic that made the loop's behavior undefined also makes the expression
regs[TREG_TP] undefined whereever it appears.

None of this is a specific problem with your proposed patch. Rather, it
is a suggestion that the whole structure's design needs to be revisited
in light of compilers beginning to notice that regs[TREG_TP] is
undefined behavior and change their generated code as a result.

Unfortunately it looks like this header is also a part of the userspace
API, so it can't simply be changed just in case all in-kernel uses are
changed.

Jeff

2014-11-12 20:15:45

by Chris Metcalf

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

On 11/1/14 21:17, Chen Gang wrote:
> >Not only memcpy() is faster than pointer copy, but also let code more
> >clearer and simple, which can avoid compiling warning (the original
> >implementation copy registers by exceeding member array border).
> >
> >The related warning (with allmodconfig under tile):
> >
> > CC arch/tile/kernel/kgdb.o
> > arch/tile/kernel/kgdb.c: In function 'sleeping_thread_to_gdb_regs':
> > arch/tile/kernel/kgdb.c:140:31: warning: iteration 53u invokes undefined behavior [-Waggressive-loop-optimizations]
> > *(ptr++) = thread_regs->regs[reg];
> > ^
> > arch/tile/kernel/kgdb.c:139:2: note: containing loop
> > for (reg = 0; reg <= TREG_LAST_GPR; reg++)
> > ^
> >
> >Signed-off-by: Chen Gang<[email protected]>
> >---
> > arch/tile/kernel/kgdb.c | 6 +-----
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> >

Taken into the tile tree - thanks!

See my recent email about changing the shape of the pt_regs and sigcontext
structures to make the regs[] usage explicit. But even with that, I
think your change makes sense to simplify the code in this location.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com

2014-11-13 00:08:53

by Chen Gang

[permalink] [raw]
Subject: Re: [PATCH] arch: tile: kernel: kgdb.c: Use memcpy() instead of pointer copy one by one

Thank you for your detail information, and what you said sounds
reasonable to me.

Send from Lenovo A788t.

Jeff Epler <[email protected]> wrote:

>On Wed, Nov 12, 2014 at 11:43:08PM +0800, Chen Gang wrote:
>> > (I assume the
>> > compiler could do things like replace an intended load from memory with
>> > a constant load or even no load at all)
>> >
>>
>> Excuse me, my English is not quite well, I can not understand what you
>> said above. (If necessary, please help provide more details for it).
>
>I am concerned that writing regs[TREG_TP] is "undefined behavior"
>according to the C standard.
>
>This expression is equivalent to *(regs + TREG_TP). The expression
>(regs + TREG_TP) does not result in a pointer to any element of regs[],
>so dereferencing it is undefined behavior. (Source: C99 draft standard
>WG14/N1256, annex J.2, "[The behavior is undefined if t]he operand of
>the unary * operator has an invalid value")
>
>That is why the compiler showed the original diagnostic, but the same
>logic that made the loop's behavior undefined also makes the expression
>regs[TREG_TP] undefined whereever it appears.
>
>None of this is a specific problem with your proposed patch. Rather, it
>is a suggestion that the whole structure's design needs to be revisited
>in light of compilers beginning to notice that regs[TREG_TP] is
>undefined behavior and change their generated code as a result.
>
>Unfortunately it looks like this header is also a part of the userspace
>API, so it can't simply be changed just in case all in-kernel uses are
>changed.
>
>Jeff
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?