2011-01-06 13:15:39

by Mario 'BitKoenig' Holbe

[permalink] [raw]
Subject: Re: 2.6.37-rc7: Regression: b43: crashes in hwrng_register()

Hello Herbert,

On Thu, Jan 06, 2011 at 05:12:41PM +1100, Herbert Xu wrote:
> On Wed, Jan 05, 2011 at 02:16:22PM +0100, Mario 'BitKoenig' Holbe wrote:
> > attached. I hope I got the gcc call right. However, I prefer the objdump
> > output anyways, so this one is attached as well.
> I see what you mean now.
> Please let me know if this patch (still against vanilla) helps.

The patch helps. No crashes, meaningful random data - perfect.
I still have 2 small annotations...

1. Having ECX on the clobber-list is not really necessary.
XSTORE doesn't touch ECX at all.
REP XSTORE would touch it, but for this ECX would be an input anyways.

2. Would you mind doing the same for EDX as you did for EDI?
This doesn't really change anything currently, but will probably help
avoiding a debug-session like ours somewhere in the future :)

A patch that does both is attached - to apply on top of your patch, if
you like. I tested this patch - it passed all my tests: no crashes,
meaningful random data.


Thanks for your help & regards
Mario
--
It is a capital mistake to theorize before one has data.
Insensibly one begins to twist facts to suit theories instead of theories
to suit facts. -- Sherlock Holmes by Arthur Conan Doyle


Attachments:
(No filename) (0.00 B)
signature.asc (482.00 B)
Digital signature
Download all attachments

2011-01-06 13:35:51

by Herbert Xu

[permalink] [raw]
Subject: Re: 2.6.37-rc7: Regression: b43: crashes in hwrng_register()

On Thu, Jan 06, 2011 at 02:15:16PM +0100, Mario 'BitKoenig' Holbe wrote:
>
> The patch helps. No crashes, meaningful random data - perfect.
> I still have 2 small annotations...

Thanks for testing!

> 1. Having ECX on the clobber-list is not really necessary.
> XSTORE doesn't touch ECX at all.
> REP XSTORE would touch it, but for this ECX would be an input anyways.

The documentation wasn't clear whether ECX would be updated without
the REP prefix so I included it to be on the safe side. Unfortunately
my only VIA machine is on another continent at the moment so I can't
test it myself. Can you verify that ECX isn't changed without the
REP prefix?

> 2. Would you mind doing the same for EDX as you did for EDI?
> This doesn't really change anything currently, but will probably help
> avoiding a debug-session like ours somewhere in the future :)

According to my documentation EDX isn't be modified (nor would it
make sense as it would break REP XSTORE). Are you seeing anything
different?

Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2011-01-06 13:55:57

by Larry Finger

[permalink] [raw]
Subject: Re: 2.6.37-rc7: Regression: b43: crashes in hwrng_register()

On 01/06/2011 07:35 AM, Herbert Xu wrote:
> On Thu, Jan 06, 2011 at 02:15:16PM +0100, Mario 'BitKoenig' Holbe wrote:
>>
>> The patch helps. No crashes, meaningful random data - perfect.
>> I still have 2 small annotations...
>
> Thanks for testing!
>
>> 1. Having ECX on the clobber-list is not really necessary.
>> XSTORE doesn't touch ECX at all.
>> REP XSTORE would touch it, but for this ECX would be an input anyways.
>
> The documentation wasn't clear whether ECX would be updated without
> the REP prefix so I included it to be on the safe side. Unfortunately
> my only VIA machine is on another continent at the moment so I can't
> test it myself. Can you verify that ECX isn't changed without the
> REP prefix?
>
>> 2. Would you mind doing the same for EDX as you did for EDI?
>> This doesn't really change anything currently, but will probably help
>> avoiding a debug-session like ours somewhere in the future :)
>
> According to my documentation EDX isn't be modified (nor would it
> make sense as it would break REP XSTORE). Are you seeing anything
> different?

As an interested observer, I think this routine needs some really detailed comments.

Larry

2011-01-06 14:43:12

by Mario 'BitKoenig' Holbe

[permalink] [raw]
Subject: Re: 2.6.37-rc7: Regression: b43: crashes in hwrng_register()

On Fri, Jan 07, 2011 at 12:35:41AM +1100, Herbert Xu wrote:
> On Thu, Jan 06, 2011 at 02:15:16PM +0100, Mario 'BitKoenig' Holbe wrote:
> > 1. Having ECX on the clobber-list is not really necessary.
> > XSTORE doesn't touch ECX at all.
> > REP XSTORE would touch it, but for this ECX would be an input anyways.
>
> The documentation wasn't clear whether ECX would be updated without
> the REP prefix so I included it to be on the safe side. Unfortunately
> my only VIA machine is on another continent at the moment so I can't
> test it myself. Can you verify that ECX isn't changed without the
> REP prefix?

session-log (including small test case) attached: ECX is not changed.

> > 2. Would you mind doing the same for EDX as you did for EDI?
> According to my documentation EDX isn't be modified (nor would it
> make sense as it would break REP XSTORE). Are you seeing anything
> different?

http://linux.via.com.tw/support/beginDownload.action?eleid=181&fid=261
VIA PadLock Programming Guide, v1.66, 4th August 2005
2.1 XSTORE Instructions (page 9)
RNG Quality Factor: EDX
[...] Only the lower two bits of EDX are meaningful; the upper 30 bits
are ignored by the instruction and may be set to zero.
^^^^^^^^^^^^^^^^^^

http://hackipedia.org/Hardware/CPU/x86/chip,%20VIA/nano/Padlock,%20quick%20reference%20v0.95%20(July%2025th,%202008).pdf
PadLock Quick Reference, v0.95, 25th July 2008
RANDOM NUMBER GENERATION (page 3)
Register Usage: Output
EDX Bits 0:1 are unchanged, all higher order bits are zero.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

See the attached session-log as well: those EDX bits are indeed zeroed.


Btw: I believe both documents are quite clear regarding ECX for XSTORE.


Mario
--
User sind wie ideale Gase - sie verteilen sich gleichmaessig ueber alle Platten


Attachments:
(No filename) (0.00 B)
signature.asc (482.00 B)
Digital signature
Download all attachments

2011-01-07 03:50:01

by Herbert Xu

[permalink] [raw]
Subject: Re: 2.6.37-rc7: Regression: b43: crashes in hwrng_register()

On Thu, Jan 06, 2011 at 03:42:55PM +0100, Mario 'BitKoenig' Holbe wrote:
>
> holbe@ideapad ~ % gcc -o via-rng-test via-rng-test.c
> holbe@ideapad ~ % ./via-rng-test
> 0xbff09d40: 00000000
> ecx: aa55aa55 edx: ffffff03 edi: 0xbff09d40
> ecx: aa55aa55 edx: 00000003 edi: 0xbff09d48
> 0xbff09d40: 339d4525
> holbe@ideapad ~ %

Thanks, I've applied this patch:

commit 0735ac1f2551d9f9d356126aaf3b1110150918e6
Author: Herbert Xu <[email protected]>
Date: Fri Jan 7 14:48:57 2011 +1100

hwrng: via_rng - Fix asm constraints

The inline asm to invoke xstore did not specify the constraints
correctly. In particular, dx/di should have been marked as output
registers as well as input as they're modified by xstore.

Thanks to Mario Holbe for creating this patch and testing it.

Tested-by: Mario 'BitKoenig' Holbe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 794aacb..7f86666 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -81,8 +81,7 @@ static inline u32 xstore(u32 *addr, u32 edx_in)
ts_state = irq_ts_save();

asm(".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */"
- :"=m"(*addr), "=a"(eax_out)
- :"D"(addr), "d"(edx_in));
+ : "=m" (*addr), "=a" (eax_out), "+d" (edx_in), "+D" (addr));

irq_ts_restore(ts_state);
return eax_out;

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2011-01-07 03:55:01

by Herbert Xu

[permalink] [raw]
Subject: crypto: padlock - Move padlock.h into include/crypto

On Fri, Jan 07, 2011 at 02:49:48PM +1100, Herbert Xu wrote:
>
> commit 0735ac1f2551d9f9d356126aaf3b1110150918e6
> Author: Herbert Xu <[email protected]>
> Date: Fri Jan 7 14:48:57 2011 +1100

Here are the other patches to fix this problem:

commit 21493088733e6e09dac6f54595a1b6b8ab1e68fd
Author: Herbert Xu <[email protected]>
Date: Fri Jan 7 14:52:00 2011 +1100

crypto: padlock - Move padlock.h into include/crypto

This patch moves padlock.h from drivers/crypto into include/crypto
so that it may be used by the via-rng driver.

Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index 2e992bc..2e56508 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -9,6 +9,7 @@

#include <crypto/algapi.h>
#include <crypto/aes.h>
+#include <crypto/padlock.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/types.h>
@@ -21,7 +22,6 @@
#include <asm/byteorder.h>
#include <asm/processor.h>
#include <asm/i387.h>
-#include "padlock.h"

/*
* Number of data blocks actually fetched for each xcrypt insn.
diff --git a/drivers/crypto/padlock-sha.c b/drivers/crypto/padlock-sha.c
index d3a27e0..adf075b 100644
--- a/drivers/crypto/padlock-sha.c
+++ b/drivers/crypto/padlock-sha.c
@@ -13,6 +13,7 @@
*/

#include <crypto/internal/hash.h>
+#include <crypto/padlock.h>
#include <crypto/sha.h>
#include <linux/err.h>
#include <linux/module.h>
@@ -22,13 +23,6 @@
#include <linux/kernel.h>
#include <linux/scatterlist.h>
#include <asm/i387.h>
-#include "padlock.h"
-
-#ifdef CONFIG_64BIT
-#define STACK_ALIGN 16
-#else
-#define STACK_ALIGN 4
-#endif

struct padlock_sha_desc {
struct shash_desc fallback;
diff --git a/drivers/crypto/padlock.h b/drivers/crypto/padlock.h
deleted file mode 100644
index b728e45..0000000
--- a/drivers/crypto/padlock.h
+++ /dev/null
@@ -1,23 +0,0 @@
-/*
- * Driver for VIA PadLock
- *
- * Copyright (c) 2004 Michal Ludvig <[email protected]>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License as published by the Free
- * Software Foundation; either version 2 of the License, or (at your option)
- * any later version.
- *
- */
-
-#ifndef _CRYPTO_PADLOCK_H
-#define _CRYPTO_PADLOCK_H
-
-#define PADLOCK_ALIGNMENT 16
-
-#define PFX "padlock: "
-
-#define PADLOCK_CRA_PRIORITY 300
-#define PADLOCK_COMPOSITE_PRIORITY 400
-
-#endif /* _CRYPTO_PADLOCK_H */
diff --git a/include/crypto/padlock.h b/include/crypto/padlock.h
new file mode 100644
index 0000000..d2cfa2e
--- /dev/null
+++ b/include/crypto/padlock.h
@@ -0,0 +1,29 @@
+/*
+ * Driver for VIA PadLock
+ *
+ * Copyright (c) 2004 Michal Ludvig <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ */
+
+#ifndef _CRYPTO_PADLOCK_H
+#define _CRYPTO_PADLOCK_H
+
+#define PADLOCK_ALIGNMENT 16
+
+#define PFX KBUILD_MODNAME ": "
+
+#define PADLOCK_CRA_PRIORITY 300
+#define PADLOCK_COMPOSITE_PRIORITY 400
+
+#ifdef CONFIG_64BIT
+#define STACK_ALIGN 16
+#else
+#define STACK_ALIGN 4
+#endif
+
+#endif /* _CRYPTO_PADLOCK_H */

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2011-01-07 03:55:45

by Herbert Xu

[permalink] [raw]
Subject: hwrng: via_rng - Fix memory scribbling on some CPUs

commit 55db8387a5e8d07407f0b7c6b2526417a2bc6243
Author: Herbert Xu <[email protected]>
Date: Fri Jan 7 14:55:06 2011 +1100

hwrng: via_rng - Fix memory scribbling on some CPUs

It has been reported that on at least one Nano CPU the xstore
instruction will write as many as 16 bytes of data to the output
buffer.

This causes memory corruption as we use rng->priv which is only
4-8 bytes long.

This patch fixes this by using an intermediate buffer on the stack
with at least 16 bytes and aligned to a 16-byte boundary.

The problem was observed on the following processor:

processor : 0
vendor_id : CentaurHauls
cpu family : 6
model : 15
model name : VIA Nano processor U2250 (1.6GHz Capable)
stepping : 3
cpu MHz : 1600.000
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng rng_en ace ace_en ace2 phe phe_en lahf_lm
bogomips : 3192.08
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:

Tested-by: Mario 'BitKoenig' Holbe <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>

diff --git a/drivers/char/hw_random/via-rng.c b/drivers/char/hw_random/via-rng.c
index 7f86666..d0387a8 100644
--- a/drivers/char/hw_random/via-rng.c
+++ b/drivers/char/hw_random/via-rng.c
@@ -24,6 +24,7 @@
* warranty of any kind, whether express or implied.
*/

+#include <crypto/padlock.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/hw_random.h>
@@ -34,7 +35,6 @@
#include <asm/i387.h>


-#define PFX KBUILD_MODNAME ": "


enum {
@@ -89,8 +89,10 @@ static inline u32 xstore(u32 *addr, u32 edx_in)

static int via_rng_data_present(struct hwrng *rng, int wait)
{
+ char buf[16 + PADLOCK_ALIGNMENT - STACK_ALIGN] __attribute__
+ ((aligned(STACK_ALIGN)));
+ u32 *via_rng_datum = (u32 *)PTR_ALIGN(&buf[0], PADLOCK_ALIGNMENT);
u32 bytes_out;
- u32 *via_rng_datum = (u32 *)(&rng->priv);
int i;

/* We choose the recommended 1-byte-per-instruction RNG rate,
@@ -114,6 +116,7 @@ static int via_rng_data_present(struct hwrng *rng, int wait)
break;
udelay(10);
}
+ rng->priv = *via_rng_datum;
return bytes_out ? 1 : 0;
}

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt