2014-07-12 18:16:13

by Oleg Nesterov

[permalink] [raw]
Subject: bit fields && data tearing

Hello,

I am not sure I should ask here, but since Documentation/memory-barriers.txt
mentions load/store tearing perhaps my question is not completely off-topic...

I am fighting with mysterious RHEL bug, it can be reproduced on ppc and s390
but not on x86. Finally I seem to understand the problem, and I even wrote the
stupid kernel module to ensure, see it below at the end.

It triggers the problem immediately, kt_2() sees the wrong value in freeze_stop.
(If I turn ->freeze_stop int "long", the problem goes away).

So the question is: is this gcc bug or the code below is buggy?

If it is buggy, then probably memory-barriers.txt could mention that you should
be carefull with bit fields, even ACCESS_ONCE() obviously can't help.

Or this just discloses my ignorance and you need at least aligned(long) after a
bit field to be thread-safe ? I thought that compiler should take care and add
the necessary alignment if (say) CPU can't update a single byte/uint.

gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC). Asm:

0000000000000000 <.kt_2>:
0: 7c 08 02 a6 mflr r0
4: fb 81 ff e0 std r28,-32(r1)
8: fb a1 ff e8 std r29,-24(r1)
c: fb c1 ff f0 std r30,-16(r1)
10: fb e1 ff f8 std r31,-8(r1)
14: eb c2 00 00 ld r30,0(r2)
18: f8 01 00 10 std r0,16(r1)
1c: f8 21 ff 71 stdu r1,-144(r1)
20: 7c 7d 1b 78 mr r29,r3
24: 3b e0 00 00 li r31,0
28: 78 3c 04 64 rldicr r28,r1,0,49
2c: 3b 9c 00 80 addi r28,r28,128
30: 48 00 00 2c b 5c <.kt_2+0x5c>
34: 60 00 00 00 nop
38: 60 00 00 00 nop
3c: 60 00 00 00 nop
40: 93 fd 00 04 stw r31,4(r29)
44: e8 9d 00 06 lwa r4,4(r29)
48: 7f 84 f8 00 cmpw cr7,r4,r31
4c: 40 de 00 4c bne- cr7,98 <.kt_2+0x98>
50: e8 1c 00 00 ld r0,0(r28)
54: 78 09 f7 e3 rldicl. r9,r0,62,63
58: 40 c2 00 54 bne- ac <.kt_2+0xac>
5c: 48 00 00 01 bl 5c <.kt_2+0x5c>
60: 60 00 00 00 nop
64: 3b ff 00 01 addi r31,r31,1
68: 2f a3 00 00 cmpdi cr7,r3,0
6c: 7f ff 07 b4 extsw r31,r31
70: 41 9e ff d0 beq+ cr7,40 <.kt_2+0x40>
74: 38 21 00 90 addi r1,r1,144
78: 38 60 00 00 li r3,0
7c: e8 01 00 10 ld r0,16(r1)
80: eb 81 ff e0 ld r28,-32(r1)
84: eb a1 ff e8 ld r29,-24(r1)
88: eb c1 ff f0 ld r30,-16(r1)
8c: eb e1 ff f8 ld r31,-8(r1)
90: 7c 08 03 a6 mtlr r0
94: 4e 80 00 20 blr
98: e8 7e 80 28 ld r3,-32728(r30)
9c: 7f e5 fb 78 mr r5,r31
a0: 48 00 00 01 bl a0 <.kt_2+0xa0>
a4: 60 00 00 00 nop
a8: 4b ff ff a8 b 50 <.kt_2+0x50>
ac: 48 00 00 01 bl ac <.kt_2+0xac>
b0: 60 00 00 00 nop
b4: 4b ff ff a8 b 5c <.kt_2+0x5c>
b8: 60 00 00 00 nop
bc: 60 00 00 00 nop

00000000000000c0 <.kt_1>:
c0: 7c 08 02 a6 mflr r0
c4: fb 81 ff e0 std r28,-32(r1)
c8: fb a1 ff e8 std r29,-24(r1)
cc: fb c1 ff f0 std r30,-16(r1)
d0: fb e1 ff f8 std r31,-8(r1)
d4: eb c2 00 00 ld r30,0(r2)
d8: f8 01 00 10 std r0,16(r1)
dc: f8 21 ff 71 stdu r1,-144(r1)
e0: 7c 7d 1b 78 mr r29,r3
e4: 3b e0 00 00 li r31,0
e8: 78 3c 04 64 rldicr r28,r1,0,49
ec: 3b 9c 00 80 addi r28,r28,128
f0: 48 00 00 38 b 128 <.kt_1+0x68>
f4: 60 00 00 00 nop
f8: 60 00 00 00 nop
fc: 60 00 00 00 nop
100: e8 1d 00 00 ld r0,0(r29)
104: 79 20 e8 0e rldimi r0,r9,61,0
108: f8 1d 00 00 std r0,0(r29)
10c: 80 1d 00 00 lwz r0,0(r29)
110: 54 00 1f 7e rlwinm r0,r0,3,29,31
114: 7f 80 f8 00 cmpw cr7,r0,r31
118: 40 de 00 6c bne- cr7,184 <.kt_1+0xc4>
11c: e8 1c 00 00 ld r0,0(r28)
120: 78 09 f7 e3 rldicl. r9,r0,62,63
124: 40 c2 00 70 bne- 194 <.kt_1+0xd4>
128: 48 00 00 01 bl 128 <.kt_1+0x68>
12c: 60 00 00 00 nop
130: 3b ff 00 01 addi r31,r31,1
134: 2f a3 00 00 cmpdi cr7,r3,0
138: 7f ff 07 b4 extsw r31,r31
13c: 2f 1f 00 07 cmpwi cr6,r31,7
140: 7b e9 07 60 clrldi r9,r31,61
144: 40 9e 00 1c bne- cr7,160 <.kt_1+0xa0>
148: 40 9a ff b8 bne+ cr6,100 <.kt_1+0x40>
14c: 39 20 00 00 li r9,0
150: 3b e0 00 00 li r31,0
154: 4b ff ff ac b 100 <.kt_1+0x40>
158: 60 00 00 00 nop
15c: 60 00 00 00 nop
160: 38 21 00 90 addi r1,r1,144
164: 38 60 00 00 li r3,0
168: e8 01 00 10 ld r0,16(r1)
16c: eb 81 ff e0 ld r28,-32(r1)
170: eb a1 ff e8 ld r29,-24(r1)
174: eb c1 ff f0 ld r30,-16(r1)
178: eb e1 ff f8 ld r31,-8(r1)
17c: 7c 08 03 a6 mtlr r0
180: 4e 80 00 20 blr
184: e8 7e 80 30 ld r3,-32720(r30)
188: 48 00 00 01 bl 188 <.kt_1+0xc8>
18c: 60 00 00 00 nop
190: 4b ff ff 8c b 11c <.kt_1+0x5c>
194: 48 00 00 01 bl 194 <.kt_1+0xd4>
198: 60 00 00 00 nop
19c: 4b ff ff 8c b 128 <.kt_1+0x68>

Unfortunately it tells me nothing, I do not know ppc.

Oleg.

--------------------------------------------------------------------------------
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>

struct utrace {
unsigned int resume:3;
int freeze_stop;
};

static int kt_1(void *arg)
{
struct utrace *u = arg;
int r = 0;

while (!kthread_should_stop()) {
if (++r == 7)
r = 0;

u->resume = r;
barrier();
if (u->resume != r)
printk(KERN_CRIT "BUG! bitfield\n");

if (need_resched())
schedule();
}

return 0;
}

static int kt_2(void *arg)
{
struct utrace *u = arg;
int f = 0;

while (!kthread_should_stop()) {
u->freeze_stop = ++f;
barrier();
if (u->freeze_stop != f)
printk(KERN_CRIT "BUG! freeze_stop %d != %d\n", u->freeze_stop, f);

if (need_resched())
schedule();
}

return 0;
}

static struct task_struct *t_1, *t_2;

static struct utrace utrace;

static int __init mod_init(void)
{
WARN_ON(IS_ERR(t_1 = kthread_run(kt_1, &utrace, "kt_1")));
WARN_ON(IS_ERR(t_2 = kthread_run(kt_2, &utrace, "kt_2")));

return 0;
}

static void __exit mod_exit(void)
{
kthread_stop(t_1);
kthread_stop(t_2);
}

MODULE_LICENSE("GPL");
module_init(mod_init);
module_exit(mod_exit);


2014-07-12 20:53:12

by Oleg Nesterov

[permalink] [raw]
Subject: Re: bit fields && data tearing

OK, looks like this is compiler bug,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080

Thanks to Dan who informed me privately.

On 07/12, Oleg Nesterov wrote:
>
> Hello,
>
> I am not sure I should ask here, but since Documentation/memory-barriers.txt
> mentions load/store tearing perhaps my question is not completely off-topic...
>
> I am fighting with mysterious RHEL bug, it can be reproduced on ppc and s390
> but not on x86. Finally I seem to understand the problem, and I even wrote the
> stupid kernel module to ensure, see it below at the end.
>
> It triggers the problem immediately, kt_2() sees the wrong value in freeze_stop.
> (If I turn ->freeze_stop int "long", the problem goes away).
>
> So the question is: is this gcc bug or the code below is buggy?
>
> If it is buggy, then probably memory-barriers.txt could mention that you should
> be carefull with bit fields, even ACCESS_ONCE() obviously can't help.
>
> Or this just discloses my ignorance and you need at least aligned(long) after a
> bit field to be thread-safe ? I thought that compiler should take care and add
> the necessary alignment if (say) CPU can't update a single byte/uint.
>
> gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC). Asm:
>
> 0000000000000000 <.kt_2>:
> 0: 7c 08 02 a6 mflr r0
> 4: fb 81 ff e0 std r28,-32(r1)
> 8: fb a1 ff e8 std r29,-24(r1)
> c: fb c1 ff f0 std r30,-16(r1)
> 10: fb e1 ff f8 std r31,-8(r1)
> 14: eb c2 00 00 ld r30,0(r2)
> 18: f8 01 00 10 std r0,16(r1)
> 1c: f8 21 ff 71 stdu r1,-144(r1)
> 20: 7c 7d 1b 78 mr r29,r3
> 24: 3b e0 00 00 li r31,0
> 28: 78 3c 04 64 rldicr r28,r1,0,49
> 2c: 3b 9c 00 80 addi r28,r28,128
> 30: 48 00 00 2c b 5c <.kt_2+0x5c>
> 34: 60 00 00 00 nop
> 38: 60 00 00 00 nop
> 3c: 60 00 00 00 nop
> 40: 93 fd 00 04 stw r31,4(r29)
> 44: e8 9d 00 06 lwa r4,4(r29)
> 48: 7f 84 f8 00 cmpw cr7,r4,r31
> 4c: 40 de 00 4c bne- cr7,98 <.kt_2+0x98>
> 50: e8 1c 00 00 ld r0,0(r28)
> 54: 78 09 f7 e3 rldicl. r9,r0,62,63
> 58: 40 c2 00 54 bne- ac <.kt_2+0xac>
> 5c: 48 00 00 01 bl 5c <.kt_2+0x5c>
> 60: 60 00 00 00 nop
> 64: 3b ff 00 01 addi r31,r31,1
> 68: 2f a3 00 00 cmpdi cr7,r3,0
> 6c: 7f ff 07 b4 extsw r31,r31
> 70: 41 9e ff d0 beq+ cr7,40 <.kt_2+0x40>
> 74: 38 21 00 90 addi r1,r1,144
> 78: 38 60 00 00 li r3,0
> 7c: e8 01 00 10 ld r0,16(r1)
> 80: eb 81 ff e0 ld r28,-32(r1)
> 84: eb a1 ff e8 ld r29,-24(r1)
> 88: eb c1 ff f0 ld r30,-16(r1)
> 8c: eb e1 ff f8 ld r31,-8(r1)
> 90: 7c 08 03 a6 mtlr r0
> 94: 4e 80 00 20 blr
> 98: e8 7e 80 28 ld r3,-32728(r30)
> 9c: 7f e5 fb 78 mr r5,r31
> a0: 48 00 00 01 bl a0 <.kt_2+0xa0>
> a4: 60 00 00 00 nop
> a8: 4b ff ff a8 b 50 <.kt_2+0x50>
> ac: 48 00 00 01 bl ac <.kt_2+0xac>
> b0: 60 00 00 00 nop
> b4: 4b ff ff a8 b 5c <.kt_2+0x5c>
> b8: 60 00 00 00 nop
> bc: 60 00 00 00 nop
>
> 00000000000000c0 <.kt_1>:
> c0: 7c 08 02 a6 mflr r0
> c4: fb 81 ff e0 std r28,-32(r1)
> c8: fb a1 ff e8 std r29,-24(r1)
> cc: fb c1 ff f0 std r30,-16(r1)
> d0: fb e1 ff f8 std r31,-8(r1)
> d4: eb c2 00 00 ld r30,0(r2)
> d8: f8 01 00 10 std r0,16(r1)
> dc: f8 21 ff 71 stdu r1,-144(r1)
> e0: 7c 7d 1b 78 mr r29,r3
> e4: 3b e0 00 00 li r31,0
> e8: 78 3c 04 64 rldicr r28,r1,0,49
> ec: 3b 9c 00 80 addi r28,r28,128
> f0: 48 00 00 38 b 128 <.kt_1+0x68>
> f4: 60 00 00 00 nop
> f8: 60 00 00 00 nop
> fc: 60 00 00 00 nop
> 100: e8 1d 00 00 ld r0,0(r29)
> 104: 79 20 e8 0e rldimi r0,r9,61,0
> 108: f8 1d 00 00 std r0,0(r29)
> 10c: 80 1d 00 00 lwz r0,0(r29)
> 110: 54 00 1f 7e rlwinm r0,r0,3,29,31
> 114: 7f 80 f8 00 cmpw cr7,r0,r31
> 118: 40 de 00 6c bne- cr7,184 <.kt_1+0xc4>
> 11c: e8 1c 00 00 ld r0,0(r28)
> 120: 78 09 f7 e3 rldicl. r9,r0,62,63
> 124: 40 c2 00 70 bne- 194 <.kt_1+0xd4>
> 128: 48 00 00 01 bl 128 <.kt_1+0x68>
> 12c: 60 00 00 00 nop
> 130: 3b ff 00 01 addi r31,r31,1
> 134: 2f a3 00 00 cmpdi cr7,r3,0
> 138: 7f ff 07 b4 extsw r31,r31
> 13c: 2f 1f 00 07 cmpwi cr6,r31,7
> 140: 7b e9 07 60 clrldi r9,r31,61
> 144: 40 9e 00 1c bne- cr7,160 <.kt_1+0xa0>
> 148: 40 9a ff b8 bne+ cr6,100 <.kt_1+0x40>
> 14c: 39 20 00 00 li r9,0
> 150: 3b e0 00 00 li r31,0
> 154: 4b ff ff ac b 100 <.kt_1+0x40>
> 158: 60 00 00 00 nop
> 15c: 60 00 00 00 nop
> 160: 38 21 00 90 addi r1,r1,144
> 164: 38 60 00 00 li r3,0
> 168: e8 01 00 10 ld r0,16(r1)
> 16c: eb 81 ff e0 ld r28,-32(r1)
> 170: eb a1 ff e8 ld r29,-24(r1)
> 174: eb c1 ff f0 ld r30,-16(r1)
> 178: eb e1 ff f8 ld r31,-8(r1)
> 17c: 7c 08 03 a6 mtlr r0
> 180: 4e 80 00 20 blr
> 184: e8 7e 80 30 ld r3,-32720(r30)
> 188: 48 00 00 01 bl 188 <.kt_1+0xc8>
> 18c: 60 00 00 00 nop
> 190: 4b ff ff 8c b 11c <.kt_1+0x5c>
> 194: 48 00 00 01 bl 194 <.kt_1+0xd4>
> 198: 60 00 00 00 nop
> 19c: 4b ff ff 8c b 128 <.kt_1+0x68>
>
> Unfortunately it tells me nothing, I do not know ppc.
>
> Oleg.
>
> --------------------------------------------------------------------------------
> #include <linux/module.h>
> #include <linux/kernel.h>
> #include <linux/kthread.h>
>
> struct utrace {
> unsigned int resume:3;
> int freeze_stop;
> };
>
> static int kt_1(void *arg)
> {
> struct utrace *u = arg;
> int r = 0;
>
> while (!kthread_should_stop()) {
> if (++r == 7)
> r = 0;
>
> u->resume = r;
> barrier();
> if (u->resume != r)
> printk(KERN_CRIT "BUG! bitfield\n");
>
> if (need_resched())
> schedule();
> }
>
> return 0;
> }
>
> static int kt_2(void *arg)
> {
> struct utrace *u = arg;
> int f = 0;
>
> while (!kthread_should_stop()) {
> u->freeze_stop = ++f;
> barrier();
> if (u->freeze_stop != f)
> printk(KERN_CRIT "BUG! freeze_stop %d != %d\n", u->freeze_stop, f);
>
> if (need_resched())
> schedule();
> }
>
> return 0;
> }
>
> static struct task_struct *t_1, *t_2;
>
> static struct utrace utrace;
>
> static int __init mod_init(void)
> {
> WARN_ON(IS_ERR(t_1 = kthread_run(kt_1, &utrace, "kt_1")));
> WARN_ON(IS_ERR(t_2 = kthread_run(kt_2, &utrace, "kt_2")));
>
> return 0;
> }
>
> static void __exit mod_exit(void)
> {
> kthread_stop(t_1);
> kthread_stop(t_2);
> }
>
> MODULE_LICENSE("GPL");
> module_init(mod_init);
> module_exit(mod_exit);

2014-07-12 23:35:20

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: bit fields && data tearing

On Sat, 2014-07-12 at 22:51 +0200, Oleg Nesterov wrote:
> OK, looks like this is compiler bug,
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
>
> Thanks to Dan who informed me privately.

So yes, there's is this compiler bug which means a bitfield
access can cause a r-m-w access to a neighbouring field but
in general, I would be weary of bitfields anyway since accessing
them isn't going to be atomic anyway... it's too easy to get things
wrong and in most cases the benefit is yet to be demonstrated.

In your example, I don't see the point of the bitfield.

Cheers,
Ben.

> On 07/12, Oleg Nesterov wrote:
> >
> > Hello,
> >
> > I am not sure I should ask here, but since Documentation/memory-barriers.txt
> > mentions load/store tearing perhaps my question is not completely off-topic...
> >
> > I am fighting with mysterious RHEL bug, it can be reproduced on ppc and s390
> > but not on x86. Finally I seem to understand the problem, and I even wrote the
> > stupid kernel module to ensure, see it below at the end.
> >
> > It triggers the problem immediately, kt_2() sees the wrong value in freeze_stop.
> > (If I turn ->freeze_stop int "long", the problem goes away).
> >
> > So the question is: is this gcc bug or the code below is buggy?
> >
> > If it is buggy, then probably memory-barriers.txt could mention that you should
> > be carefull with bit fields, even ACCESS_ONCE() obviously can't help.
> >
> > Or this just discloses my ignorance and you need at least aligned(long) after a
> > bit field to be thread-safe ? I thought that compiler should take care and add
> > the necessary alignment if (say) CPU can't update a single byte/uint.
> >
> > gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC). Asm:
> >
> > 0000000000000000 <.kt_2>:
> > 0: 7c 08 02 a6 mflr r0
> > 4: fb 81 ff e0 std r28,-32(r1)
> > 8: fb a1 ff e8 std r29,-24(r1)
> > c: fb c1 ff f0 std r30,-16(r1)
> > 10: fb e1 ff f8 std r31,-8(r1)
> > 14: eb c2 00 00 ld r30,0(r2)
> > 18: f8 01 00 10 std r0,16(r1)
> > 1c: f8 21 ff 71 stdu r1,-144(r1)
> > 20: 7c 7d 1b 78 mr r29,r3
> > 24: 3b e0 00 00 li r31,0
> > 28: 78 3c 04 64 rldicr r28,r1,0,49
> > 2c: 3b 9c 00 80 addi r28,r28,128
> > 30: 48 00 00 2c b 5c <.kt_2+0x5c>
> > 34: 60 00 00 00 nop
> > 38: 60 00 00 00 nop
> > 3c: 60 00 00 00 nop
> > 40: 93 fd 00 04 stw r31,4(r29)
> > 44: e8 9d 00 06 lwa r4,4(r29)
> > 48: 7f 84 f8 00 cmpw cr7,r4,r31
> > 4c: 40 de 00 4c bne- cr7,98 <.kt_2+0x98>
> > 50: e8 1c 00 00 ld r0,0(r28)
> > 54: 78 09 f7 e3 rldicl. r9,r0,62,63
> > 58: 40 c2 00 54 bne- ac <.kt_2+0xac>
> > 5c: 48 00 00 01 bl 5c <.kt_2+0x5c>
> > 60: 60 00 00 00 nop
> > 64: 3b ff 00 01 addi r31,r31,1
> > 68: 2f a3 00 00 cmpdi cr7,r3,0
> > 6c: 7f ff 07 b4 extsw r31,r31
> > 70: 41 9e ff d0 beq+ cr7,40 <.kt_2+0x40>
> > 74: 38 21 00 90 addi r1,r1,144
> > 78: 38 60 00 00 li r3,0
> > 7c: e8 01 00 10 ld r0,16(r1)
> > 80: eb 81 ff e0 ld r28,-32(r1)
> > 84: eb a1 ff e8 ld r29,-24(r1)
> > 88: eb c1 ff f0 ld r30,-16(r1)
> > 8c: eb e1 ff f8 ld r31,-8(r1)
> > 90: 7c 08 03 a6 mtlr r0
> > 94: 4e 80 00 20 blr
> > 98: e8 7e 80 28 ld r3,-32728(r30)
> > 9c: 7f e5 fb 78 mr r5,r31
> > a0: 48 00 00 01 bl a0 <.kt_2+0xa0>
> > a4: 60 00 00 00 nop
> > a8: 4b ff ff a8 b 50 <.kt_2+0x50>
> > ac: 48 00 00 01 bl ac <.kt_2+0xac>
> > b0: 60 00 00 00 nop
> > b4: 4b ff ff a8 b 5c <.kt_2+0x5c>
> > b8: 60 00 00 00 nop
> > bc: 60 00 00 00 nop
> >
> > 00000000000000c0 <.kt_1>:
> > c0: 7c 08 02 a6 mflr r0
> > c4: fb 81 ff e0 std r28,-32(r1)
> > c8: fb a1 ff e8 std r29,-24(r1)
> > cc: fb c1 ff f0 std r30,-16(r1)
> > d0: fb e1 ff f8 std r31,-8(r1)
> > d4: eb c2 00 00 ld r30,0(r2)
> > d8: f8 01 00 10 std r0,16(r1)
> > dc: f8 21 ff 71 stdu r1,-144(r1)
> > e0: 7c 7d 1b 78 mr r29,r3
> > e4: 3b e0 00 00 li r31,0
> > e8: 78 3c 04 64 rldicr r28,r1,0,49
> > ec: 3b 9c 00 80 addi r28,r28,128
> > f0: 48 00 00 38 b 128 <.kt_1+0x68>
> > f4: 60 00 00 00 nop
> > f8: 60 00 00 00 nop
> > fc: 60 00 00 00 nop
> > 100: e8 1d 00 00 ld r0,0(r29)
> > 104: 79 20 e8 0e rldimi r0,r9,61,0
> > 108: f8 1d 00 00 std r0,0(r29)
> > 10c: 80 1d 00 00 lwz r0,0(r29)
> > 110: 54 00 1f 7e rlwinm r0,r0,3,29,31
> > 114: 7f 80 f8 00 cmpw cr7,r0,r31
> > 118: 40 de 00 6c bne- cr7,184 <.kt_1+0xc4>
> > 11c: e8 1c 00 00 ld r0,0(r28)
> > 120: 78 09 f7 e3 rldicl. r9,r0,62,63
> > 124: 40 c2 00 70 bne- 194 <.kt_1+0xd4>
> > 128: 48 00 00 01 bl 128 <.kt_1+0x68>
> > 12c: 60 00 00 00 nop
> > 130: 3b ff 00 01 addi r31,r31,1
> > 134: 2f a3 00 00 cmpdi cr7,r3,0
> > 138: 7f ff 07 b4 extsw r31,r31
> > 13c: 2f 1f 00 07 cmpwi cr6,r31,7
> > 140: 7b e9 07 60 clrldi r9,r31,61
> > 144: 40 9e 00 1c bne- cr7,160 <.kt_1+0xa0>
> > 148: 40 9a ff b8 bne+ cr6,100 <.kt_1+0x40>
> > 14c: 39 20 00 00 li r9,0
> > 150: 3b e0 00 00 li r31,0
> > 154: 4b ff ff ac b 100 <.kt_1+0x40>
> > 158: 60 00 00 00 nop
> > 15c: 60 00 00 00 nop
> > 160: 38 21 00 90 addi r1,r1,144
> > 164: 38 60 00 00 li r3,0
> > 168: e8 01 00 10 ld r0,16(r1)
> > 16c: eb 81 ff e0 ld r28,-32(r1)
> > 170: eb a1 ff e8 ld r29,-24(r1)
> > 174: eb c1 ff f0 ld r30,-16(r1)
> > 178: eb e1 ff f8 ld r31,-8(r1)
> > 17c: 7c 08 03 a6 mtlr r0
> > 180: 4e 80 00 20 blr
> > 184: e8 7e 80 30 ld r3,-32720(r30)
> > 188: 48 00 00 01 bl 188 <.kt_1+0xc8>
> > 18c: 60 00 00 00 nop
> > 190: 4b ff ff 8c b 11c <.kt_1+0x5c>
> > 194: 48 00 00 01 bl 194 <.kt_1+0xd4>
> > 198: 60 00 00 00 nop
> > 19c: 4b ff ff 8c b 128 <.kt_1+0x68>
> >
> > Unfortunately it tells me nothing, I do not know ppc.
> >
> > Oleg.
> >
> > --------------------------------------------------------------------------------
> > #include <linux/module.h>
> > #include <linux/kernel.h>
> > #include <linux/kthread.h>
> >
> > struct utrace {
> > unsigned int resume:3;
> > int freeze_stop;
> > };
> >
> > static int kt_1(void *arg)
> > {
> > struct utrace *u = arg;
> > int r = 0;
> >
> > while (!kthread_should_stop()) {
> > if (++r == 7)
> > r = 0;
> >
> > u->resume = r;
> > barrier();
> > if (u->resume != r)
> > printk(KERN_CRIT "BUG! bitfield\n");
> >
> > if (need_resched())
> > schedule();
> > }
> >
> > return 0;
> > }
> >
> > static int kt_2(void *arg)
> > {
> > struct utrace *u = arg;
> > int f = 0;
> >
> > while (!kthread_should_stop()) {
> > u->freeze_stop = ++f;
> > barrier();
> > if (u->freeze_stop != f)
> > printk(KERN_CRIT "BUG! freeze_stop %d != %d\n", u->freeze_stop, f);
> >
> > if (need_resched())
> > schedule();
> > }
> >
> > return 0;
> > }
> >
> > static struct task_struct *t_1, *t_2;
> >
> > static struct utrace utrace;
> >
> > static int __init mod_init(void)
> > {
> > WARN_ON(IS_ERR(t_1 = kthread_run(kt_1, &utrace, "kt_1")));
> > WARN_ON(IS_ERR(t_2 = kthread_run(kt_2, &utrace, "kt_2")));
> >
> > return 0;
> > }
> >
> > static void __exit mod_exit(void)
> > {
> > kthread_stop(t_1);
> > kthread_stop(t_2);
> > }
> >
> > MODULE_LICENSE("GPL");
> > module_init(mod_init);
> > module_exit(mod_exit);

2014-07-13 12:31:45

by Oleg Nesterov

[permalink] [raw]
Subject: Re: bit fields && data tearing

On 07/13, Benjamin Herrenschmidt wrote:
>
> On Sat, 2014-07-12 at 22:51 +0200, Oleg Nesterov wrote:
> > OK, looks like this is compiler bug,
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
> >
> > Thanks to Dan who informed me privately.
>
> So yes, there's is this compiler bug which means a bitfield
> access can cause a r-m-w access to a neighbouring field

Thanks. So I can forward this all back to bugzilla.

> but
> in general, I would be weary of bitfields anyway since accessing
> them isn't going to be atomic anyway... it's too easy to get things
> wrong and in most cases the benefit is yet to be demonstrated.

Sure, bit fields should be used with care. But the same arguments apply
to bitmasks, they are often used without "atomic" set/clear_bit.

> In your example, I don't see the point of the bitfield.

This is just test-case. The real code has more adjacent bit fields, only
the tracee can modify them, and only debugger can change ->freeze_stop.

Thanks,

Oleg.

2014-07-13 13:15:26

by Peter Hurley

[permalink] [raw]
Subject: Re: bit fields && data tearing

On 07/12/2014 07:34 PM, Benjamin Herrenschmidt wrote:
> On Sat, 2014-07-12 at 22:51 +0200, Oleg Nesterov wrote:
>> OK, looks like this is compiler bug,
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
>>
>> Thanks to Dan who informed me privately.
>
> So yes, there's is this compiler bug which means a bitfield
> access can cause a r-m-w access to a neighbouring field but
> in general, I would be weary of bitfields anyway since accessing
> them isn't going to be atomic anyway... it's too easy to get things
> wrong and in most cases the benefit is yet to be demonstrated.

I'm not sure I understand your point here, Ben.

Suppose that two different spinlocks are used independently to
protect r-m-w access to adjacent data. In Oleg's example,
suppose spinlock 1 is used for access to the bitfield and
spinlock 2 is used for access to freeze_stop.

What would prevent an accidental write to freeze_stop from the
kt_1 thread?

Regards,
Peter Hurley

> In your example, I don't see the point of the bitfield.
>
> Cheers,
> Ben.
>
>> On 07/12, Oleg Nesterov wrote:
>>>
>>> Hello,
>>>
>>> I am not sure I should ask here, but since Documentation/memory-barriers.txt
>>> mentions load/store tearing perhaps my question is not completely off-topic...
>>>
>>> I am fighting with mysterious RHEL bug, it can be reproduced on ppc and s390
>>> but not on x86. Finally I seem to understand the problem, and I even wrote the
>>> stupid kernel module to ensure, see it below at the end.
>>>
>>> It triggers the problem immediately, kt_2() sees the wrong value in freeze_stop.
>>> (If I turn ->freeze_stop int "long", the problem goes away).
>>>
>>> So the question is: is this gcc bug or the code below is buggy?
>>>
>>> If it is buggy, then probably memory-barriers.txt could mention that you should
>>> be carefull with bit fields, even ACCESS_ONCE() obviously can't help.
>>>
>>> Or this just discloses my ignorance and you need at least aligned(long) after a
>>> bit field to be thread-safe ? I thought that compiler should take care and add
>>> the necessary alignment if (say) CPU can't update a single byte/uint.
>>>
>>> gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC). Asm:
>>>
>>> 0000000000000000 <.kt_2>:
>>> 0: 7c 08 02 a6 mflr r0
>>> 4: fb 81 ff e0 std r28,-32(r1)
>>> 8: fb a1 ff e8 std r29,-24(r1)
>>> c: fb c1 ff f0 std r30,-16(r1)
>>> 10: fb e1 ff f8 std r31,-8(r1)
>>> 14: eb c2 00 00 ld r30,0(r2)
>>> 18: f8 01 00 10 std r0,16(r1)
>>> 1c: f8 21 ff 71 stdu r1,-144(r1)
>>> 20: 7c 7d 1b 78 mr r29,r3
>>> 24: 3b e0 00 00 li r31,0
>>> 28: 78 3c 04 64 rldicr r28,r1,0,49
>>> 2c: 3b 9c 00 80 addi r28,r28,128
>>> 30: 48 00 00 2c b 5c <.kt_2+0x5c>
>>> 34: 60 00 00 00 nop
>>> 38: 60 00 00 00 nop
>>> 3c: 60 00 00 00 nop
>>> 40: 93 fd 00 04 stw r31,4(r29)
>>> 44: e8 9d 00 06 lwa r4,4(r29)
>>> 48: 7f 84 f8 00 cmpw cr7,r4,r31
>>> 4c: 40 de 00 4c bne- cr7,98 <.kt_2+0x98>
>>> 50: e8 1c 00 00 ld r0,0(r28)
>>> 54: 78 09 f7 e3 rldicl. r9,r0,62,63
>>> 58: 40 c2 00 54 bne- ac <.kt_2+0xac>
>>> 5c: 48 00 00 01 bl 5c <.kt_2+0x5c>
>>> 60: 60 00 00 00 nop
>>> 64: 3b ff 00 01 addi r31,r31,1
>>> 68: 2f a3 00 00 cmpdi cr7,r3,0
>>> 6c: 7f ff 07 b4 extsw r31,r31
>>> 70: 41 9e ff d0 beq+ cr7,40 <.kt_2+0x40>
>>> 74: 38 21 00 90 addi r1,r1,144
>>> 78: 38 60 00 00 li r3,0
>>> 7c: e8 01 00 10 ld r0,16(r1)
>>> 80: eb 81 ff e0 ld r28,-32(r1)
>>> 84: eb a1 ff e8 ld r29,-24(r1)
>>> 88: eb c1 ff f0 ld r30,-16(r1)
>>> 8c: eb e1 ff f8 ld r31,-8(r1)
>>> 90: 7c 08 03 a6 mtlr r0
>>> 94: 4e 80 00 20 blr
>>> 98: e8 7e 80 28 ld r3,-32728(r30)
>>> 9c: 7f e5 fb 78 mr r5,r31
>>> a0: 48 00 00 01 bl a0 <.kt_2+0xa0>
>>> a4: 60 00 00 00 nop
>>> a8: 4b ff ff a8 b 50 <.kt_2+0x50>
>>> ac: 48 00 00 01 bl ac <.kt_2+0xac>
>>> b0: 60 00 00 00 nop
>>> b4: 4b ff ff a8 b 5c <.kt_2+0x5c>
>>> b8: 60 00 00 00 nop
>>> bc: 60 00 00 00 nop
>>>
>>> 00000000000000c0 <.kt_1>:
>>> c0: 7c 08 02 a6 mflr r0
>>> c4: fb 81 ff e0 std r28,-32(r1)
>>> c8: fb a1 ff e8 std r29,-24(r1)
>>> cc: fb c1 ff f0 std r30,-16(r1)
>>> d0: fb e1 ff f8 std r31,-8(r1)
>>> d4: eb c2 00 00 ld r30,0(r2)
>>> d8: f8 01 00 10 std r0,16(r1)
>>> dc: f8 21 ff 71 stdu r1,-144(r1)
>>> e0: 7c 7d 1b 78 mr r29,r3
>>> e4: 3b e0 00 00 li r31,0
>>> e8: 78 3c 04 64 rldicr r28,r1,0,49
>>> ec: 3b 9c 00 80 addi r28,r28,128
>>> f0: 48 00 00 38 b 128 <.kt_1+0x68>
>>> f4: 60 00 00 00 nop
>>> f8: 60 00 00 00 nop
>>> fc: 60 00 00 00 nop
>>> 100: e8 1d 00 00 ld r0,0(r29)
>>> 104: 79 20 e8 0e rldimi r0,r9,61,0
>>> 108: f8 1d 00 00 std r0,0(r29)
>>> 10c: 80 1d 00 00 lwz r0,0(r29)
>>> 110: 54 00 1f 7e rlwinm r0,r0,3,29,31
>>> 114: 7f 80 f8 00 cmpw cr7,r0,r31
>>> 118: 40 de 00 6c bne- cr7,184 <.kt_1+0xc4>
>>> 11c: e8 1c 00 00 ld r0,0(r28)
>>> 120: 78 09 f7 e3 rldicl. r9,r0,62,63
>>> 124: 40 c2 00 70 bne- 194 <.kt_1+0xd4>
>>> 128: 48 00 00 01 bl 128 <.kt_1+0x68>
>>> 12c: 60 00 00 00 nop
>>> 130: 3b ff 00 01 addi r31,r31,1
>>> 134: 2f a3 00 00 cmpdi cr7,r3,0
>>> 138: 7f ff 07 b4 extsw r31,r31
>>> 13c: 2f 1f 00 07 cmpwi cr6,r31,7
>>> 140: 7b e9 07 60 clrldi r9,r31,61
>>> 144: 40 9e 00 1c bne- cr7,160 <.kt_1+0xa0>
>>> 148: 40 9a ff b8 bne+ cr6,100 <.kt_1+0x40>
>>> 14c: 39 20 00 00 li r9,0
>>> 150: 3b e0 00 00 li r31,0
>>> 154: 4b ff ff ac b 100 <.kt_1+0x40>
>>> 158: 60 00 00 00 nop
>>> 15c: 60 00 00 00 nop
>>> 160: 38 21 00 90 addi r1,r1,144
>>> 164: 38 60 00 00 li r3,0
>>> 168: e8 01 00 10 ld r0,16(r1)
>>> 16c: eb 81 ff e0 ld r28,-32(r1)
>>> 170: eb a1 ff e8 ld r29,-24(r1)
>>> 174: eb c1 ff f0 ld r30,-16(r1)
>>> 178: eb e1 ff f8 ld r31,-8(r1)
>>> 17c: 7c 08 03 a6 mtlr r0
>>> 180: 4e 80 00 20 blr
>>> 184: e8 7e 80 30 ld r3,-32720(r30)
>>> 188: 48 00 00 01 bl 188 <.kt_1+0xc8>
>>> 18c: 60 00 00 00 nop
>>> 190: 4b ff ff 8c b 11c <.kt_1+0x5c>
>>> 194: 48 00 00 01 bl 194 <.kt_1+0xd4>
>>> 198: 60 00 00 00 nop
>>> 19c: 4b ff ff 8c b 128 <.kt_1+0x68>
>>>
>>> Unfortunately it tells me nothing, I do not know ppc.
>>>
>>> Oleg.
>>>
>>> --------------------------------------------------------------------------------
>>> #include <linux/module.h>
>>> #include <linux/kernel.h>
>>> #include <linux/kthread.h>
>>>
>>> struct utrace {
>>> unsigned int resume:3;
>>> int freeze_stop;
>>> };
>>>
>>> static int kt_1(void *arg)
>>> {
>>> struct utrace *u = arg;
>>> int r = 0;
>>>
>>> while (!kthread_should_stop()) {
>>> if (++r == 7)
>>> r = 0;
>>>
>>> u->resume = r;
>>> barrier();
>>> if (u->resume != r)
>>> printk(KERN_CRIT "BUG! bitfield\n");
>>>
>>> if (need_resched())
>>> schedule();
>>> }
>>>
>>> return 0;
>>> }
>>>
>>> static int kt_2(void *arg)
>>> {
>>> struct utrace *u = arg;
>>> int f = 0;
>>>
>>> while (!kthread_should_stop()) {
>>> u->freeze_stop = ++f;
>>> barrier();
>>> if (u->freeze_stop != f)
>>> printk(KERN_CRIT "BUG! freeze_stop %d != %d\n", u->freeze_stop, f);
>>>
>>> if (need_resched())
>>> schedule();
>>> }
>>>
>>> return 0;
>>> }
>>>
>>> static struct task_struct *t_1, *t_2;
>>>
>>> static struct utrace utrace;
>>>
>>> static int __init mod_init(void)
>>> {
>>> WARN_ON(IS_ERR(t_1 = kthread_run(kt_1, &utrace, "kt_1")));
>>> WARN_ON(IS_ERR(t_2 = kthread_run(kt_2, &utrace, "kt_2")));
>>>
>>> return 0;
>>> }
>>>
>>> static void __exit mod_exit(void)
>>> {
>>> kthread_stop(t_1);
>>> kthread_stop(t_2);
>>> }
>>>
>>> MODULE_LICENSE("GPL");
>>> module_init(mod_init);
>>> module_exit(mod_exit);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2014-07-13 22:26:07

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: bit fields && data tearing

On Sun, 2014-07-13 at 09:15 -0400, Peter Hurley wrote:
>
> I'm not sure I understand your point here, Ben.
>
> Suppose that two different spinlocks are used independently to
> protect r-m-w access to adjacent data. In Oleg's example,
> suppose spinlock 1 is used for access to the bitfield and
> spinlock 2 is used for access to freeze_stop.
>
> What would prevent an accidental write to freeze_stop from the
> kt_1 thread?

My point was to be weary of bitfields in general because access
to them is always R-M-W, never atomic and that seem to escape
people regularily :-) (Among other problems such as endian etc...)

As for Oleg's example, it *should* have worked because the bitfield and
the adjacent freeze_stop should have been accessed using load/stores
that don't actually overlap, but the compiler bug causes the bitfield
access to not properly use the basic type of the bitfield, but escalate
to a full 64-bit R-M-W instead, thus incorrectly R-M-W'ing the field
next door.

Cheers,
Ben.



2014-07-15 13:54:26

by Peter Hurley

[permalink] [raw]
Subject: Re: bit fields && data tearing

On 07/13/2014 06:25 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2014-07-13 at 09:15 -0400, Peter Hurley wrote:
>>
>> I'm not sure I understand your point here, Ben.
>>
>> Suppose that two different spinlocks are used independently to
>> protect r-m-w access to adjacent data. In Oleg's example,
>> suppose spinlock 1 is used for access to the bitfield and
>> spinlock 2 is used for access to freeze_stop.
>>
>> What would prevent an accidental write to freeze_stop from the
>> kt_1 thread?
>
> My point was to be weary of bitfields in general because access
> to them is always R-M-W, never atomic and that seem to escape
> people regularily :-) (Among other problems such as endian etc...)
>
> As for Oleg's example, it *should* have worked because the bitfield and
> the adjacent freeze_stop should have been accessed using load/stores
> that don't actually overlap, but the compiler bug causes the bitfield
> access to not properly use the basic type of the bitfield, but escalate
> to a full 64-bit R-M-W instead, thus incorrectly R-M-W'ing the field
> next door.

Yeah, ok, so just a generic heads-up about non-atomicity of bitfields,
and not something specific to Oleg's example. Thanks.

Jonathan Corbet wrote a LWN article about this back in 2012:
http://lwn.net/Articles/478657/

I guess it's fixed in gcc 4.8, but too bad there's not a workaround for
earlier compilers (akin to -fstrict_volatile_bitfields without requiring
the volatile keyword).

Regards,
Peter Hurley


2014-07-15 15:02:44

by Richard Henderson

[permalink] [raw]
Subject: Re: bit fields && data tearing

On 07/15/2014 06:54 AM, Peter Hurley wrote:
>
> Jonathan Corbet wrote a LWN article about this back in 2012:
> http://lwn.net/Articles/478657/
>
> I guess it's fixed in gcc 4.8, but too bad there's not a workaround for
> earlier compilers (akin to -fstrict_volatile_bitfields without requiring
> the volatile keyword)

>From the gcc pr, it looks like the patch was backported to 4.7.
But we didn't fix it in versions earlier than that.


r~