2014-11-06 11:59:58

by Christian Riesch

Subject: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

The current implementation of put_tty_queue() causes a race condition
when re-arranged by the compiler.

On my build with gcc 4.8.3, cross-compiling for ARM, the line

*read_buf_addr(ldata, ldata->read_head++) = c;

was re-arranged by the compiler to something like

x = ldata->read_head
ldata->read_head++
*read_buf_addr(ldata, x) = c;

which causes a race condition. Invalid data is read if data is read
before it is actually written to the read buffer.

This race condition was introduced with commit
6d76bd2618535c581f1673047b8341fd291abc67 ("n_tty: Make N_TTY ldisc receive
path lockless").

This patch adds memory barriers to resolve this race condition.

Signed-off-by: Christian Riesch <[email protected]>
Cc: Peter Hurley <[email protected]>
Cc: <[email protected]>
---

Hi all,

I noticed that since an upgrade to kernel 3.12 my ARM device communicating
via serial port with a GPS receiver module reports frequent communication
errors.

After several attempts to resolve these problems I created a small
test setup. It uses a small microcontroller that just sends data via
serial port to the ARM processor at 9600 baud, 8O1 (8 data bits, odd
parity, 1 stop bit).

The code on the microcontroller looks like this:

	char c = 48;
	while (1) {
		if (c > 126)
			c = 48;
		send_char(c);	/* placeholder for the MCU's UART transmit routine */
		c++;
	}

On the ARM/Linux side I ran the serial port in non-canonical mode,
received the data and checked if the data is what we expect it to be:

	struct termios tio;
	memset(&tio, 0, sizeof(tio));
	tio.c_cflag = CREAD | CLOCAL | B9600 | CS8 | PARENB | PARODD;
	tio.c_iflag = INPCK;
	tio.c_cc[VTIME] = 0;
	tio.c_cc[VMIN] = 0;
	tcsetattr(fd, TCSANOW, &tio);

	...
	c = 48;
	while (1) {
		...
		poll(pfds, 1, 1000);
		if (pfds[0].revents & POLLIN) {
			ret = read(fd, buf, 200);
			for (i = 0; i < ret; i++) {
				c++;
				if (c > 126)
					c = 48;
				if (c != buf[i]) {
					printf("expected %d - received %d, ret = %d, i = %d\n",
					       c, buf[i], ret, i);
					c = buf[i];
				}
			}
		}
	}

I ran this test for about 5 days. The result was:

expected 51 - received 63, ret = 11, i = 10
expected 64 - received 52, ret = 13, i = 0
expected 64 - received 76, ret = 11, i = 10
expected 77 - received 65, ret = 5, i = 0
expected 105 - received 117, ret = 18, i = 17
expected 118 - received 106, ret = 6, i = 0
expected 120 - received 53, ret = 16, i = 15
expected 54 - received 121, ret = 8, i = 0
expected 105 - received 117, ret = 3, i = 2
expected 118 - received 106, ret = 5, i = 0
expected 79 - received 91, ret = 20, i = 19
expected 92 - received 80, ret = 4, i = 0
expected 72 - received 84, ret = 15, i = 14
expected 85 - received 73, ret = 9, i = 0
expected 54 - received 66, ret = 13, i = 12
expected 67 - received 55, ret = 3, i = 0
expected 86 - received 98, ret = 25, i = 24
expected 99 - received 87, ret = 15, i = 0
expected 86 - received 98, ret = 14, i = 13
expected 99 - received 87, ret = 42, i = 0
expected 93 - received 105, ret = 34, i = 33
expected 106 - received 94, ret = 6, i = 0
expected 92 - received 104, ret = 16, i = 15
expected 105 - received 93, ret = 8, i = 0
expected 53 - received 65, ret = 8, i = 7
expected 66 - received 54, ret = 8, i = 0

The first line shows that we expected buf[i] to be 51 but actually
received 63. We therefore set c = 63 and expect 64 as the next
character. But we receive 52, so everything is back to normal. No bytes
are missing and no additional bytes are received; a single byte simply
has the wrong content.

The affected byte is always the last byte in buf, i.e. buf[ret - 1].

Furthermore, the wrong byte is always off by 12 (modulo the length of
the repeating sequence), i.e. instead of 51 we received 63
(63 - 51 = 12), instead of 64 we received 76 (76 - 64 = 12) etc.

The race described in the commit message above produces exactly this
behavior.

In the example below read_head has already been incremented, but the new
content has not yet been written to ldata->read_buf.

48 49 50 51 64 65 66 67
^^
read_head

The receive buffer is 4096 bytes, and we are sending a character
sequence that repeats every 126 - 47 = 79 bytes. Therefore the offset between
the old data and the new data is 4096 mod 79 = 67.

Instead of the new value 52, we still read the old value 52 - 67 (wrapping
around within the range 48 to 126, i.e. adding 79) = 64 = 52 + 12.

I have now applied my proposed fix below. I will run a test for the next
few days and report whether this finally solved my problem.

However, since I am not familiar with memory barriers, I would like to
ask for comments on whether this is the correct way to solve this problem.

Thank you!

Regards, Christian


drivers/tty/n_tty.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 89c4cee..831137e 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -321,7 +321,13 @@ static void n_tty_check_unthrottle(struct tty_struct *tty)

 static inline void put_tty_queue(unsigned char c, struct n_tty_data *ldata)
 {
-	*read_buf_addr(ldata, ldata->read_head++) = c;
+	*read_buf_addr(ldata, ldata->read_head) = c;
+	/*
+	 * make sure read_head is incremented _after_ data is written
+	 * to the buffer.
+	 */
+	smp_wmb();
+	ldata->read_head++;
 }

 /**
@@ -1539,6 +1545,11 @@ n_tty_receive_buf_real_raw(struct tty_struct *tty, const unsigned char *cp,
 	n = N_TTY_BUF_SIZE - max(read_cnt(ldata), head);
 	n = min_t(size_t, count, n);
 	memcpy(read_buf_addr(ldata, head), cp, n);
+	/*
+	 * make sure read_head is incremented _after_ data is written
+	 * to the buffer, here ...
+	 */
+	smp_wmb();
 	ldata->read_head += n;
 	cp += n;
 	count -= n;
@@ -1547,6 +1558,8 @@ n_tty_receive_buf_real_raw(struct tty_struct *tty, const unsigned char *cp,
 	n = N_TTY_BUF_SIZE - max(read_cnt(ldata), head);
 	n = min_t(size_t, count, n);
 	memcpy(read_buf_addr(ldata, head), cp, n);
+	/* ... and here again. */
+	smp_wmb();
 	ldata->read_head += n;
 }

@@ -1947,6 +1960,11 @@ static int copy_from_read_buf(struct tty_struct *tty,
 		is_eof = n == 1 && read_buf(ldata, tail) == EOF_CHAR(tty);
 		tty_audit_add_data(tty, read_buf_addr(ldata, tail), n,
 				ldata->icanon);
+		/*
+		 * make sure read_tail is incremented _after_ data is read
+		 * from the buffer.
+		 */
+		smp_mb();
 		ldata->read_tail += n;
 		/* Turn single EOF into zero-length read */
 		if (L_EXTPROC(tty) && ldata->icanon && is_eof && !read_cnt(ldata))
--
1.7.9.5


2014-11-06 20:38:37

by Greg KH

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
> The current implementation of put_tty_queue() causes a race condition
> when re-arranged by the compiler.
>
> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>
> *read_buf_addr(ldata, ldata->read_head++) = c;
>
> was re-arranged by the compiler to something like
>
> x = ldata->read_head
> ldata->read_head++
> *read_buf_addr(ldata, x) = c;
>
> which causes a race condition. Invalid data is read if data is read
> before it is actually written to the read buffer.

Really? A compiler can rearrange things like that and expect things to
actually work? How is that valid?

Is this the "broken gcc" version that ARM developers keep running into
all the time with odd crashes and problems? Can you upgrade to 4.9 and
see if that solves the issue for you?

thanks,

greg k-h

2014-11-06 20:49:08

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Greg Kroah-Hartman <[email protected]> writes:

> On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> The current implementation of put_tty_queue() causes a race condition
>> when re-arranged by the compiler.
>>
>> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>>
>> *read_buf_addr(ldata, ldata->read_head++) = c;
>>
>> was re-arranged by the compiler to something like
>>
>> x = ldata->read_head
>> ldata->read_head++
>> *read_buf_addr(ldata, x) = c;
>>
>> which causes a race condition. Invalid data is read if data is read
>> before it is actually written to the read buffer.
>
> Really? A compiler can rearange things like that and expect things to
> actually work? How is that valid?

This is actually required by the C spec. There is a sequence point
before a function call, after the arguments have been evaluated. Thus
all side-effects, such as the post-increment, must be complete before
the function is called, just like in the example.

There is no "re-arranging" here. The code is simply wrong.

--
Måns Rullgård
[email protected]

2014-11-06 20:56:48

by Greg KH

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
> >> The current implementation of put_tty_queue() causes a race condition
> >> when re-arranged by the compiler.
> >>
> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
> >>
> >> *read_buf_addr(ldata, ldata->read_head++) = c;
> >>
> >> was re-arranged by the compiler to something like
> >>
> >> x = ldata->read_head
> >> ldata->read_head++
> >> *read_buf_addr(ldata, x) = c;
> >>
> >> which causes a race condition. Invalid data is read if data is read
> >> before it is actually written to the read buffer.
> >
> > Really? A compiler can rearange things like that and expect things to
> > actually work? How is that valid?
>
> This is actually required by the C spec. There is a sequence point
> before a function call, after the arguments have been evaluated. Thus
> all side-effects, such as the post-increment, must be complete before
> the function is called, just like in the example.
>
> There is no "re-arranging" here. The code is simply wrong.

Ah, ok, time to dig out the C spec...

Anyway, because of this, no need for the wmb() calls, just rearrange the
logic and all should be good, right? Christian, can you test that
instead?

thanks,

greg k-h

2014-11-06 21:01:41

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Greg Kroah-Hartman <[email protected]> writes:

> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> The current implementation of put_tty_queue() causes a race condition
>> >> when re-arranged by the compiler.
>> >>
>> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >>
>> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >>
>> >> was re-arranged by the compiler to something like
>> >>
>> >> x = ldata->read_head
>> >> ldata->read_head++
>> >> *read_buf_addr(ldata, x) = c;
>> >>
>> >> which causes a race condition. Invalid data is read if data is read
>> >> before it is actually written to the read buffer.
>> >
>> > Really? A compiler can rearange things like that and expect things to
>> > actually work? How is that valid?
>>
>> This is actually required by the C spec. There is a sequence point
>> before a function call, after the arguments have been evaluated. Thus
>> all side-effects, such as the post-increment, must be complete before
>> the function is called, just like in the example.
>>
>> There is no "re-arranging" here. The code is simply wrong.
>
> Ah, ok, time to dig out the C spec...
>
> Anyway, because of this, no need for the wmb() calls, just rearrange the
> logic and all should be good, right? Christian, can you test that
> instead?

Weakly ordered SMP systems probably need some kind of barrier. I didn't
look at it carefully.

--
Måns Rullgård
[email protected]

2014-11-06 21:17:57

by Greg KH

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
> >> Greg Kroah-Hartman <[email protected]> writes:
> >>
> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
> >> >> The current implementation of put_tty_queue() causes a race condition
> >> >> when re-arranged by the compiler.
> >> >>
> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
> >> >>
> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
> >> >>
> >> >> was re-arranged by the compiler to something like
> >> >>
> >> >> x = ldata->read_head
> >> >> ldata->read_head++
> >> >> *read_buf_addr(ldata, x) = c;
> >> >>
> >> >> which causes a race condition. Invalid data is read if data is read
> >> >> before it is actually written to the read buffer.
> >> >
> >> > Really? A compiler can rearange things like that and expect things to
> >> > actually work? How is that valid?
> >>
> >> This is actually required by the C spec. There is a sequence point
> >> before a function call, after the arguments have been evaluated. Thus
> >> all side-effects, such as the post-increment, must be complete before
> >> the function is called, just like in the example.
> >>
> >> There is no "re-arranging" here. The code is simply wrong.
> >
> > Ah, ok, time to dig out the C spec...
> >
> > Anyway, because of this, no need for the wmb() calls, just rearrange the
> > logic and all should be good, right? Christian, can you test that
> > instead?
>
> Weakly ordered SMP systems probably need some kind of barrier. I didn't
> look at it carefully.

It shouldn't need a barrier, as it is a sequence point with the function
call. Well, it's an inline function, but that "shouldn't" matter here,
right?

thanks,

greg k-h

2014-11-06 21:39:05

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Greg Kroah-Hartman <[email protected]> writes:

> On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> >> Greg Kroah-Hartman <[email protected]> writes:
>> >>
>> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> >> The current implementation of put_tty_queue() causes a race condition
>> >> >> when re-arranged by the compiler.
>> >> >>
>> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >> >>
>> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >> >>
>> >> >> was re-arranged by the compiler to something like
>> >> >>
>> >> >> x = ldata->read_head
>> >> >> ldata->read_head++
>> >> >> *read_buf_addr(ldata, x) = c;
>> >> >>
>> >> >> which causes a race condition. Invalid data is read if data is read
>> >> >> before it is actually written to the read buffer.
>> >> >
>> >> > Really? A compiler can rearange things like that and expect things to
>> >> > actually work? How is that valid?
>> >>
>> >> This is actually required by the C spec. There is a sequence point
>> >> before a function call, after the arguments have been evaluated. Thus
>> >> all side-effects, such as the post-increment, must be complete before
>> >> the function is called, just like in the example.
>> >>
>> >> There is no "re-arranging" here. The code is simply wrong.
>> >
>> > Ah, ok, time to dig out the C spec...
>> >
>> > Anyway, because of this, no need for the wmb() calls, just rearrange the
>> > logic and all should be good, right? Christian, can you test that
>> > instead?
>>
>> Weakly ordered SMP systems probably need some kind of barrier. I didn't
>> look at it carefully.
>
> It shouldn't need a barier, as it is a sequence point with the function
> call. Well, it's an inline function, but that "shouldn't" matter here,
> right?

Sequence points say nothing about the order in which stores become
visible to other CPUs. That's why there are barrier instructions.

--
Måns Rullgård
[email protected]

2014-11-06 21:40:41

by Christian Riesch

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 6, 2014 at 9:56 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> The current implementation of put_tty_queue() causes a race condition
>> >> when re-arranged by the compiler.
>> >>
>> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >>
>> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >>
>> >> was re-arranged by the compiler to something like
>> >>
>> >> x = ldata->read_head
>> >> ldata->read_head++
>> >> *read_buf_addr(ldata, x) = c;
>> >>
>> >> which causes a race condition. Invalid data is read if data is read
>> >> before it is actually written to the read buffer.
>> >
>> > Really? A compiler can rearange things like that and expect things to
>> > actually work? How is that valid?
>>
>> This is actually required by the C spec. There is a sequence point
>> before a function call, after the arguments have been evaluated. Thus
>> all side-effects, such as the post-increment, must be complete before
>> the function is called, just like in the example.
>>
>> There is no "re-arranging" here. The code is simply wrong.

Oh, I didn't know that, thanks a lot for this!

> Anyway, because of this, no need for the wmb() calls, just rearrange the
> logic and all should be good, right?

I came up with the wmb() stuff after getting scared from reading
Documentation/memory-barriers.txt (which I didn't understand) and
Documentation/circular-buffers.txt (which I understood partly). But it
is actually a circular buffer and circular-buffers.txt says I need
memory barriers for circular buffers. Though I do not know how
function calls fit into this picture.

> Christian, can you test that
> instead?

Sure, but it will probably not happen before Monday. Thanks a lot for your help!

Christian

2014-11-06 22:02:55

by Greg KH

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
> >> Greg Kroah-Hartman <[email protected]> writes:
> >>
> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
> >> >> Greg Kroah-Hartman <[email protected]> writes:
> >> >>
> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
> >> >> >> The current implementation of put_tty_queue() causes a race condition
> >> >> >> when re-arranged by the compiler.
> >> >> >>
> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
> >> >> >>
> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
> >> >> >>
> >> >> >> was re-arranged by the compiler to something like
> >> >> >>
> >> >> >> x = ldata->read_head
> >> >> >> ldata->read_head++
> >> >> >> *read_buf_addr(ldata, x) = c;
> >> >> >>
> >> >> >> which causes a race condition. Invalid data is read if data is read
> >> >> >> before it is actually written to the read buffer.
> >> >> >
> >> >> > Really? A compiler can rearange things like that and expect things to
> >> >> > actually work? How is that valid?
> >> >>
> >> >> This is actually required by the C spec. There is a sequence point
> >> >> before a function call, after the arguments have been evaluated. Thus
> >> >> all side-effects, such as the post-increment, must be complete before
> >> >> the function is called, just like in the example.
> >> >>
> >> >> There is no "re-arranging" here. The code is simply wrong.
> >> >
> >> > Ah, ok, time to dig out the C spec...
> >> >
> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the
> >> > logic and all should be good, right? Christian, can you test that
> >> > instead?
> >>
> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't
> >> look at it carefully.
> >
> > It shouldn't need a barier, as it is a sequence point with the function
> > call. Well, it's an inline function, but that "shouldn't" matter here,
> > right?
>
> Sequence points say nothing about the order in which stores become
> visible to other CPUs. That's why there are barrier instructions.

Yes, but "order" matters.

If I write code that does:

100 x = ldata->read_head;
101 ldata->read_buf[x & SOME_VALUE] = y;
102 ldata->read_head++;

the compiler can not reorder lines 102 and 101 just because it feels
like it, right? Or is it time to go spend some time reading the C spec
again...

thanks,

greg k-h

2014-11-06 22:13:06

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Greg Kroah-Hartman <[email protected]> writes:

> On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
>> >> Greg Kroah-Hartman <[email protected]> writes:
>> >>
>> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> >> >> Greg Kroah-Hartman <[email protected]> writes:
>> >> >>
>> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> >> >> The current implementation of put_tty_queue() causes a race condition
>> >> >> >> when re-arranged by the compiler.
>> >> >> >>
>> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >> >> >>
>> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >> >> >>
>> >> >> >> was re-arranged by the compiler to something like
>> >> >> >>
>> >> >> >> x = ldata->read_head
>> >> >> >> ldata->read_head++
>> >> >> >> *read_buf_addr(ldata, x) = c;
>> >> >> >>
>> >> >> >> which causes a race condition. Invalid data is read if data is read
>> >> >> >> before it is actually written to the read buffer.
>> >> >> >
>> >> >> > Really? A compiler can rearange things like that and expect things to
>> >> >> > actually work? How is that valid?
>> >> >>
>> >> >> This is actually required by the C spec. There is a sequence point
>> >> >> before a function call, after the arguments have been evaluated. Thus
>> >> >> all side-effects, such as the post-increment, must be complete before
>> >> >> the function is called, just like in the example.
>> >> >>
>> >> >> There is no "re-arranging" here. The code is simply wrong.
>> >> >
>> >> > Ah, ok, time to dig out the C spec...
>> >> >
>> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the
>> >> > logic and all should be good, right? Christian, can you test that
>> >> > instead?
>> >>
>> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't
>> >> look at it carefully.
>> >
>> > It shouldn't need a barier, as it is a sequence point with the function
>> > call. Well, it's an inline function, but that "shouldn't" matter here,
>> > right?
>>
>> Sequence points say nothing about the order in which stores become
>> visible to other CPUs. That's why there are barrier instructions.
>
> Yes, but "order" matters.
>
> If I write code that does:
>
> 100 x = ldata->read_head;
> 101 &ldata->read_head[x & SOME_VALUE] = y;
> 102 ldata->read_head++;
>
> the compiler can not reorder lines 102 and 101 just because it feels
> like it, right? Or is it time to go spend some reading of the C spec
> again...

The compiler can't. The hardware can. All the hardware promises is
that at some unspecified time in the future, both memory locations will
have the correct values. Another CPU might see 'read_head' updated
before it sees the corresponding data value. A wmb() between the writes
forces the CPU to complete preceding stores before it begins subsequent
ones.

Documentation/memory-barriers.txt explains it all in detail.

--
Måns Rullgård
[email protected]

2014-11-06 22:31:17

by Greg KH

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 06, 2014 at 10:12:54PM +0000, Måns Rullgård wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
> >> Greg Kroah-Hartman <[email protected]> writes:
> >>
> >> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
> >> >> Greg Kroah-Hartman <[email protected]> writes:
> >> >>
> >> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
> >> >> >> Greg Kroah-Hartman <[email protected]> writes:
> >> >> >>
> >> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
> >> >> >> >> The current implementation of put_tty_queue() causes a race condition
> >> >> >> >> when re-arranged by the compiler.
> >> >> >> >>
> >> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
> >> >> >> >>
> >> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
> >> >> >> >>
> >> >> >> >> was re-arranged by the compiler to something like
> >> >> >> >>
> >> >> >> >> x = ldata->read_head
> >> >> >> >> ldata->read_head++
> >> >> >> >> *read_buf_addr(ldata, x) = c;
> >> >> >> >>
> >> >> >> >> which causes a race condition. Invalid data is read if data is read
> >> >> >> >> before it is actually written to the read buffer.
> >> >> >> >
> >> >> >> > Really? A compiler can rearange things like that and expect things to
> >> >> >> > actually work? How is that valid?
> >> >> >>
> >> >> >> This is actually required by the C spec. There is a sequence point
> >> >> >> before a function call, after the arguments have been evaluated. Thus
> >> >> >> all side-effects, such as the post-increment, must be complete before
> >> >> >> the function is called, just like in the example.
> >> >> >>
> >> >> >> There is no "re-arranging" here. The code is simply wrong.
> >> >> >
> >> >> > Ah, ok, time to dig out the C spec...
> >> >> >
> >> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the
> >> >> > logic and all should be good, right? Christian, can you test that
> >> >> > instead?
> >> >>
> >> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't
> >> >> look at it carefully.
> >> >
> >> > It shouldn't need a barier, as it is a sequence point with the function
> >> > call. Well, it's an inline function, but that "shouldn't" matter here,
> >> > right?
> >>
> >> Sequence points say nothing about the order in which stores become
> >> visible to other CPUs. That's why there are barrier instructions.
> >
> > Yes, but "order" matters.
> >
> > If I write code that does:
> >
> > 100 x = ldata->read_head;
> > 101 &ldata->read_head[x & SOME_VALUE] = y;
> > 102 ldata->read_head++;
> >
> > the compiler can not reorder lines 102 and 101 just because it feels
> > like it, right? Or is it time to go spend some reading of the C spec
> > again...
>
> The compiler can't. The hardware can. All the hardware promises is
> that at some unspecified time in the future, both memory locations will
> have the correct values. Another CPU might see 'read_head' updated
> before it sees the corresponding data value. A wmb() between the writes
> forces the CPU to complete preceding stores before it begins subsequent
> ones.

Yes, sorry, I'm not talking about other CPUs and what they see, I'm
talking about the local one. I'm not assuming that this is SMP "safe"
at all. If it is supposed to be, then yes, we do have problems, but
there should be a lock _somewhere_ protecting this.

Peter's emails seem to be bouncing horridly right now, otherwise he
would chime in and set me straight as to how this all should be
working...

> Documentation/memory-barriers.txt explains it all in detail.

In _great_ detail :)

thanks,

greg k-h

2014-11-06 22:54:22

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Greg Kroah-Hartman <[email protected]> writes:

> On Thu, Nov 06, 2014 at 10:12:54PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
>> >> Greg Kroah-Hartman <[email protected]> writes:
>> >>
>> >> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
>> >> >> Greg Kroah-Hartman <[email protected]> writes:
>> >> >>
>> >> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> >> >> >> Greg Kroah-Hartman <[email protected]> writes:
>> >> >> >>
>> >> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> >> >> >> The current implementation of put_tty_queue() causes a race condition
>> >> >> >> >> when re-arranged by the compiler.
>> >> >> >> >>
>> >> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >> >> >> >>
>> >> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >> >> >> >>
>> >> >> >> >> was re-arranged by the compiler to something like
>> >> >> >> >>
>> >> >> >> >> x = ldata->read_head
>> >> >> >> >> ldata->read_head++
>> >> >> >> >> *read_buf_addr(ldata, x) = c;
>> >> >> >> >>
>> >> >> >> >> which causes a race condition. Invalid data is read if data is read
>> >> >> >> >> before it is actually written to the read buffer.
>> >> >> >> >
>> >> >> >> > Really? A compiler can rearange things like that and expect things to
>> >> >> >> > actually work? How is that valid?
>> >> >> >>
>> >> >> >> This is actually required by the C spec. There is a sequence point
>> >> >> >> before a function call, after the arguments have been evaluated. Thus
>> >> >> >> all side-effects, such as the post-increment, must be complete before
>> >> >> >> the function is called, just like in the example.
>> >> >> >>
>> >> >> >> There is no "re-arranging" here. The code is simply wrong.
>> >> >> >
>> >> >> > Ah, ok, time to dig out the C spec...
>> >> >> >
>> >> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the
>> >> >> > logic and all should be good, right? Christian, can you test that
>> >> >> > instead?
>> >> >>
>> >> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't
>> >> >> look at it carefully.
>> >> >
>> >> > It shouldn't need a barier, as it is a sequence point with the function
>> >> > call. Well, it's an inline function, but that "shouldn't" matter here,
>> >> > right?
>> >>
>> >> Sequence points say nothing about the order in which stores become
>> >> visible to other CPUs. That's why there are barrier instructions.
>> >
>> > Yes, but "order" matters.
>> >
>> > If I write code that does:
>> >
>> > 100 x = ldata->read_head;
>> > 101 &ldata->read_head[x & SOME_VALUE] = y;
>> > 102 ldata->read_head++;
>> >
>> > the compiler can not reorder lines 102 and 101 just because it feels
>> > like it, right? Or is it time to go spend some reading of the C spec
>> > again...
>>
>> The compiler can't. The hardware can. All the hardware promises is
>> that at some unspecified time in the future, both memory locations will
>> have the correct values. Another CPU might see 'read_head' updated
>> before it sees the corresponding data value. A wmb() between the writes
>> forces the CPU to complete preceding stores before it begins subsequent
>> ones.
>
> Yes, sorry, I'm not talking about other CPUs and what they see, I'm
> talking about the local one. I'm not assuming that this is SMP "safe"
> at all. If it is supposed to be, then yes, we do have problems, but
> there should be a lock _somewhere_ protecting this.

Within the confines of a single CPU + memory, barriers are never needed.
The moment another CPU or master-capable peripheral enters the mix,
proper ordering must be enforced somehow.

If the buffer is already protected by a lock of some kind, this will
provide the necessary barriers, so nothing further is necessary. If
it's a lock-less design, there will need to be barriers somewhere.

From the patch context, I can't tell which category this case falls
into, and I'm far too lazy to read the entire file.

--
Måns Rullgård
[email protected]

2014-11-07 06:50:51

by Christian Riesch

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

[sent again due to stupid HTML mail problems, sorry]

On Thu, Nov 6, 2014 at 11:54 PM, Måns Rullgård <[email protected]> wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
>> On Thu, Nov 06, 2014 at 10:12:54PM +0000, Måns Rullgård wrote:
>>> Greg Kroah-Hartman <[email protected]> writes:
>>>
>>> > On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
>>> >> Greg Kroah-Hartman <[email protected]> writes:
>>> >>
>>> >> > On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
>>> >> >> Greg Kroah-Hartman <[email protected]> writes:
>>> >> >>
>>> >> >> > On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>>> >> >> >> Greg Kroah-Hartman <[email protected]> writes:
>>> >> >> >>
>>> >> >> >> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>>> >> >> >> >> The current implementation of put_tty_queue() causes a race condition
>>> >> >> >> >> when re-arranged by the compiler.
>>> >> >> >> >>
>>> >> >> >> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>>> >> >> >> >>
>>> >> >> >> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>>> >> >> >> >>
>>> >> >> >> >> was re-arranged by the compiler to something like
>>> >> >> >> >>
>>> >> >> >> >> x = ldata->read_head
>>> >> >> >> >> ldata->read_head++
>>> >> >> >> >> *read_buf_addr(ldata, x) = c;
>>> >> >> >> >>
>>> >> >> >> >> which causes a race condition. Invalid data is read if data is read
>>> >> >> >> >> before it is actually written to the read buffer.
>>> >> >> >> >
>>> >> >> >> > Really? A compiler can rearrange things like that and expect things to
>>> >> >> >> > actually work? How is that valid?
>>> >> >> >>
>>> >> >> >> This is actually required by the C spec. There is a sequence point
>>> >> >> >> before a function call, after the arguments have been evaluated. Thus
>>> >> >> >> all side-effects, such as the post-increment, must be complete before
>>> >> >> >> the function is called, just like in the example.
>>> >> >> >>
>>> >> >> >> There is no "re-arranging" here. The code is simply wrong.
>>> >> >> >
>>> >> >> > Ah, ok, time to dig out the C spec...
>>> >> >> >
>>> >> >> > Anyway, because of this, no need for the wmb() calls, just rearrange the
>>> >> >> > logic and all should be good, right? Christian, can you test that
>>> >> >> > instead?
>>> >> >>
>>> >> >> Weakly ordered SMP systems probably need some kind of barrier. I didn't
>>> >> >> look at it carefully.
>>> >> >
>>> >> >> > It shouldn't need a barrier, as it is a sequence point with the function
>>> >> > call. Well, it's an inline function, but that "shouldn't" matter here,
>>> >> > right?
>>> >>
>>> >> Sequence points say nothing about the order in which stores become
>>> >> visible to other CPUs. That's why there are barrier instructions.
>>> >
>>> > Yes, but "order" matters.
>>> >
>>> > If I write code that does:
>>> >
>>> > 100 x = ldata->read_head;
>>> > 101 &ldata->read_head[x & SOME_VALUE] = y;
>>> > 102 ldata->read_head++;
>>> >
>>> > the compiler can not reorder lines 102 and 101 just because it feels
>>> > like it, right? Or is it time to go spend some reading of the C spec
>>> > again...
>>>
>>> The compiler can't. The hardware can. All the hardware promises is
>>> that at some unspecified time in the future, both memory locations will
>>> have the correct values. Another CPU might see 'read_head' updated
>>> before it sees the corresponding data value. A wmb() between the writes
>>> forces the CPU to complete preceding stores before it begins subsequent
>>> ones.
>>
>> Yes, sorry, I'm not talking about other CPUs and what they see, I'm
>> talking about the local one. I'm not assuming that this is SMP "safe"
>> at all. If it is supposed to be, then yes, we do have problems, but
>> there should be a lock _somewhere_ protecting this.
>
> Within the confines of a single CPU + memory, barriers are never needed.
> The moment another CPU or master-capable peripheral enters the mix,
> proper ordering must be enforced somehow.
>
> If the buffer is already protected by a lock of some kind, this will
> provide the necessary barriers, so nothing further is necessary. If
> it's a lock-less design, there will need to be barriers somewhere.

It was changed to lock-less with 3.12 in commit
6d76bd2618535c581f1673047b8341fd291abc67 ("n_tty: Make N_TTY ldisc receive
path lockless"). So I will try to read the memory barrier docs again.

Of course my little ARM system is no SMP system, but I guess this
should also be fixed for the SMP case, right?

Thanks,
Christian

2014-11-07 13:45:46

by Peter Hurley

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On 11/06/2014 05:31 PM, Greg Kroah-Hartman wrote:
> On Thu, Nov 06, 2014 at 10:12:54PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>>> On Thu, Nov 06, 2014 at 09:38:59PM +0000, Måns Rullgård wrote:
>>>> Greg Kroah-Hartman <[email protected]> writes:
>>>>
>>>>> On Thu, Nov 06, 2014 at 09:01:36PM +0000, Måns Rullgård wrote:
>>>>>> Greg Kroah-Hartman <[email protected]> writes:
>>>>>>
>>>>>>> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>>>>>>>> Greg Kroah-Hartman <[email protected]> writes:
>>>>>>>>
>>>>>>>>> On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>>>>>>>>>> The current implementation of put_tty_queue() causes a race condition
>>>>>>>>>> when re-arranged by the compiler.
>>>>>>>>>>
>>>>>>>>>> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>>>>>>>>>>
>>>>>>>>>> *read_buf_addr(ldata, ldata->read_head++) = c;
>>>>>>>>>>
>>>>>>>>>> was re-arranged by the compiler to something like
>>>>>>>>>>
>>>>>>>>>> x = ldata->read_head
>>>>>>>>>> ldata->read_head++
>>>>>>>>>> *read_buf_addr(ldata, x) = c;
>>>>>>>>>>
>>>>>>>>>> which causes a race condition. Invalid data is read if data is read
>>>>>>>>>> before it is actually written to the read buffer.
>>>>>>>>>
>>>>>>>>> Really? A compiler can rearrange things like that and expect things to
>>>>>>>>> actually work? How is that valid?
>>>>>>>>
>>>>>>>> This is actually required by the C spec. There is a sequence point
>>>>>>>> before a function call, after the arguments have been evaluated. Thus
>>>>>>>> all side-effects, such as the post-increment, must be complete before
>>>>>>>> the function is called, just like in the example.
>>>>>>>>
>>>>>>>> There is no "re-arranging" here. The code is simply wrong.
>>>>>>>
>>>>>>> Ah, ok, time to dig out the C spec...
>>>>>>>
>>>>>>> Anyway, because of this, no need for the wmb() calls, just rearrange the
>>>>>>> logic and all should be good, right? Christian, can you test that
>>>>>>> instead?
>>>>>>
>>>>>> Weakly ordered SMP systems probably need some kind of barrier. I didn't
>>>>>> look at it carefully.
>>>>>
>>>>> It shouldn't need a barrier, as it is a sequence point with the function
>>>>> call. Well, it's an inline function, but that "shouldn't" matter here,
>>>>> right?
>>>>
>>>> Sequence points say nothing about the order in which stores become
>>>> visible to other CPUs. That's why there are barrier instructions.
>>>
>>> Yes, but "order" matters.
>>>
>>> If I write code that does:
>>>
>>> 100 x = ldata->read_head;
>>> 101 &ldata->read_head[x & SOME_VALUE] = y;
>>> 102 ldata->read_head++;
>>>
>>> the compiler can not reorder lines 102 and 101 just because it feels
>>> like it, right? Or is it time to go spend some reading of the C spec
>>> again...
>>
>> The compiler can't. The hardware can. All the hardware promises is
>> that at some unspecified time in the future, both memory locations will
>> have the correct values. Another CPU might see 'read_head' updated
>> before it sees the corresponding data value. A wmb() between the writes
>> forces the CPU to complete preceding stores before it begins subsequent
>> ones.
>
> Yes, sorry, I'm not talking about other CPUs and what they see, I'm
> talking about the local one. I'm not assuming that this is SMP "safe"
> at all. If it is supposed to be, then yes, we do have problems, but
> there should be a lock _somewhere_ protecting this.
>
> Peter's emails seem to be bouncing horridly right now, otherwise he
> would chime in and set me straight as to how this all should be
> working...

Sorry for the bouncing emails; something is wrong with my hosting
because I'm just now seeing these emails but not my inbox mails :/

I need to spend some time looking at this.

Regards,
Peter Hurley

2014-11-10 07:51:57

by Christian Riesch

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On Thu, Nov 6, 2014 at 9:56 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>> >> The current implementation of put_tty_queue() causes a race condition
>> >> when re-arranged by the compiler.
>> >>
>> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>> >>
>> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>> >>
>> >> was re-arranged by the compiler to something like
>> >>
>> >> x = ldata->read_head
>> >> ldata->read_head++
>> >> *read_buf_addr(ldata, x) = c;
>> >>
>> >> which causes a race condition. Invalid data is read if data is read
>> >> before it is actually written to the read buffer.
>> >
>> > Really? A compiler can rearrange things like that and expect things to
>> > actually work? How is that valid?
>>
>> This is actually required by the C spec. There is a sequence point
>> before a function call, after the arguments have been evaluated. Thus
>> all side-effects, such as the post-increment, must be complete before
>> the function is called, just like in the example.
>>
>> There is no "re-arranging" here. The code is simply wrong.
>
> Ah, ok, time to dig out the C spec...
>
> Anyway, because of this, no need for the wmb() calls, just rearrange the
> logic and all should be good, right? Christian, can you test that
> instead?

I ran a test with the patch that I posted in my first email for the
last 4 days. No communication errors occurred so the patch actually
fixes my problem. I will run another test as suggested by Greg, just
with rearranging the logic.
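
To make the two variants concrete, here is a small self-contained sketch. The struct and function names (`demo_ldata`, `demo_buf_addr()`, `put_original()`, `put_rearranged()`) are hypothetical stand-ins, not the n_tty code. The helper records the head value it observes, which shows that in the original form the increment has already taken effect when the address helper runs, while in the rearranged form the byte is stored before the head moves.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUF_SIZE 16 /* hypothetical size; must be a power of two */

struct demo_ldata {
    size_t read_head;
    char read_buf[DEMO_BUF_SIZE];
    size_t last_index; /* index passed to the most recent demo_buf_addr() call */
    size_t head_seen;  /* read_head as observed inside that call */
};

/* Returns the slot address and records what the head looked like on entry. */
static char *demo_buf_addr(struct demo_ldata *ld, size_t i)
{
    assert(ld != NULL);
    ld->last_index = i;
    ld->head_seen = ld->read_head;
    return &ld->read_buf[i & (DEMO_BUF_SIZE - 1)];
}

/*
 * Original form: the post-increment is an argument side effect, so it is
 * complete at the sequence point before the call. The head is advanced
 * before the byte is stored.
 */
static void put_original(struct demo_ldata *ld, char c)
{
    *demo_buf_addr(ld, ld->read_head++) = c;
}

/* Rearranged form: store the byte first, then advance the head. */
static void put_rearranged(struct demo_ldata *ld, char c)
{
    *demo_buf_addr(ld, ld->read_head) = c;
    ld->read_head++;
}
```

The rearranged form only fixes the program order on the writing CPU. Whether another CPU sees the two stores in that order is a separate question.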
Best regards, Christian

2014-11-10 09:26:04

by Måns Rullgård

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Christian Riesch <[email protected]> writes:

> On Thu, Nov 6, 2014 at 9:56 PM, Greg Kroah-Hartman
> <[email protected]> wrote:
>> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>>> Greg Kroah-Hartman <[email protected]> writes:
>>>
>>> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>>> >> The current implementation of put_tty_queue() causes a race condition
>>> >> when re-arranged by the compiler.
>>> >>
>>> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>>> >>
>>> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>>> >>
>>> >> was re-arranged by the compiler to something like
>>> >>
>>> >> x = ldata->read_head
>>> >> ldata->read_head++
>>> >> *read_buf_addr(ldata, x) = c;
>>> >>
>>> >> which causes a race condition. Invalid data is read if data is read
>>> >> before it is actually written to the read buffer.
>>> >
>>> > Really? A compiler can rearrange things like that and expect things to
>>> > actually work? How is that valid?
>>>
>>> This is actually required by the C spec. There is a sequence point
>>> before a function call, after the arguments have been evaluated. Thus
>>> all side-effects, such as the post-increment, must be complete before
>>> the function is called, just like in the example.
>>>
>>> There is no "re-arranging" here. The code is simply wrong.
>>
>> Ah, ok, time to dig out the C spec...
>>
>> Anyway, because of this, no need for the wmb() calls, just rearrange the
>> logic and all should be good, right? Christian, can you test that
>> instead?
>
> I ran a test with the patch that I posted in my first email for the
> last 4 days. No communication errors occurred so the patch actually
> fixes my problem. I will run another test as suggested by Greg, just
> with rearranging the logic.

What hardware are you running on? If it's a single-processor system,
it won't break without barriers even if they are required for SMP.
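
A userspace sketch of that distinction, using C11 fences rather than the kernel primitives (the names `demo_data`, `demo_ready`, `publish_up()`, `publish_smp()` and `try_consume()` are made up for illustration): `atomic_signal_fence()` only constrains the compiler, which is all a single CPU needs, while `atomic_thread_fence()` may also emit a hardware barrier so that other CPUs observe the stores in order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static _Atomic int demo_data;
static _Atomic int demo_ready;

/* Single-CPU flavour: only the compiler is prevented from reordering. */
static void publish_up(int v)
{
    atomic_store_explicit(&demo_data, v, memory_order_relaxed);
    atomic_signal_fence(memory_order_release); /* compiler-only fence */
    atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* SMP flavour: the fence also orders the stores as seen by other CPUs. */
static void publish_smp(int v)
{
    atomic_store_explicit(&demo_data, v, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* may emit a barrier instruction */
    atomic_store_explicit(&demo_ready, 1, memory_order_relaxed);
}

/* Reader: check the flag, then fence before reading the data it guards. */
static bool try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&demo_ready, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = atomic_load_explicit(&demo_data, memory_order_relaxed);
    atomic_store_explicit(&demo_ready, 0, memory_order_relaxed);
    return true;
}
```

A test on a single-processor board can therefore pass with either variant and says little about whether the SMP ordering is right.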

--
Måns Rullgård
[email protected]

2014-11-10 09:38:40

by Christian Riesch

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Hi Måns,

On Mon, Nov 10, 2014 at 10:25 AM, Måns Rullgård <[email protected]> wrote:
> Christian Riesch <[email protected]> writes:
>
>> On Thu, Nov 6, 2014 at 9:56 PM, Greg Kroah-Hartman
>> <[email protected]> wrote:
>>> On Thu, Nov 06, 2014 at 08:49:01PM +0000, Måns Rullgård wrote:
>>>> Greg Kroah-Hartman <[email protected]> writes:
>>>>
>>>> > On Thu, Nov 06, 2014 at 12:39:59PM +0100, Christian Riesch wrote:
>>>> >> The current implementation of put_tty_queue() causes a race condition
>>>> >> when re-arranged by the compiler.
>>>> >>
>>>> >> On my build with gcc 4.8.3, cross-compiling for ARM, the line
>>>> >>
>>>> >> *read_buf_addr(ldata, ldata->read_head++) = c;
>>>> >>
>>>> >> was re-arranged by the compiler to something like
>>>> >>
>>>> >> x = ldata->read_head
>>>> >> ldata->read_head++
>>>> >> *read_buf_addr(ldata, x) = c;
>>>> >>
>>>> >> which causes a race condition. Invalid data is read if data is read
>>>> >> before it is actually written to the read buffer.
>>>> >
>>>> > Really? A compiler can rearrange things like that and expect things to
>>>> > actually work? How is that valid?
>>>>
>>>> This is actually required by the C spec. There is a sequence point
>>>> before a function call, after the arguments have been evaluated. Thus
>>>> all side-effects, such as the post-increment, must be complete before
>>>> the function is called, just like in the example.
>>>>
>>>> There is no "re-arranging" here. The code is simply wrong.
>>>
>>> Ah, ok, time to dig out the C spec...
>>>
>>> Anyway, because of this, no need for the wmb() calls, just rearrange the
>>> logic and all should be good, right? Christian, can you test that
>>> instead?
>>
>> I ran a test with the patch that I posted in my first email for the
>> last 4 days. No communication errors occurred so the patch actually
>> fixes my problem. I will run another test as suggested by Greg, just
>> with rearranging the logic.
>
> What hardware are you running on? If it's a single-processor system,
> it won't break without barriers even if they are required for SMP.

Yes, single processor. Texas Instruments AM1808 SoC.

Thanks,
Christian

2014-12-30 19:02:43

by Denis Du

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

Hi, guys:

I confirmed that the patch works well on a non-SMP 3.12 kernel, but on SMP the race condition still occurs.

Does anyone have another patch for the SMP case, as mentioned in commit
19e2ad6a09f0c06dbca19c98e5f4584269d913dd?

Denis Du

2014-12-30 19:18:57

by Peter Hurley

Subject: Re: [PATCH] n_tty: Add memory barrier to fix race condition in receive path

On 12/30/2014 02:02 PM, Denis Du wrote:
> Hi, guys:
>
> I confirmed the Patch worked great on non-SMP 3.12 kernel. But on SMP it will still have race condition happened.
>
> Does anyone have another patch for the SMP as mentioned in commit
> 19e2ad6a09f0c06dbca19c98e5f4584269d913dd

My apologies for not cc'ing you on that fix.

https://lkml.org/lkml/2014/12/30/66

However, it requires 3.14+. I still need to backport it to 3.12.

Regards,
Peter Hurley