2023-07-20 20:28:48

by Alan Huang

[permalink] [raw]
Subject: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

Hi,

I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
and a related discussion [1].

After reading the whole discussion, it seems like that ptr->field was cached by gcc even with the deprecated
ACCESS_ONCE(), so my question is:

Is that a compiler bug? If so, has this bug been fixed today, ten years later?

What about READ_ONCE(ptr->field)?


[1] https://lore.kernel.org/all/1369699930.3301.494.camel@edumazet-glaptop/

Thanks,
Alan


2023-07-21 12:03:52

by David Laight

[permalink] [raw]
Subject: RE: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

From: Alan Huang
> Sent: 20 July 2023 19:54
>
> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
> and a related discussion [1].

Hmmm... that was all about the retry loop in ipv4/udp.c

AFAICT that retry got deleted by ca065d0c.

That also changes the list from hlist_nulls_xxx to hlist_xxx.
(I'm not sure of the difference)

This might be why we're seeing unexpected 'port unreachable' messages?

Quite why that has just started happening is another issue.
Most of the UDP sockets we create aren't 'connected' so I don't
believe they get moved between hash chains - just deleted.
The deletion should leave the hash chain intact.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-07-21 15:47:12

by Alan Huang

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()


> 2023年7月21日 22:47,Eric Dumazet <[email protected]> 写道:
>
> On Fri, Jul 21, 2023 at 4:31 PM Alan Huang <[email protected]> wrote:
>>
>>
>>> 2023年7月21日 05:11,Eric Dumazet <[email protected]> 写道:
>>>
>>> On Thu, Jul 20, 2023 at 10:00 PM Alan Huang <[email protected]> wrote:
>>>>
>>>>
>>>>> 2023年7月21日 03:22,Eric Dumazet <[email protected]> 写道:
>>>>>
>>>>> On Thu, Jul 20, 2023 at 8:54 PM Alan Huang <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
>>>>>> and a related discussion [1].
>>>>>>
>>>>>> After reading the whole discussion, it seems like that ptr->field was cached by gcc even with the deprecated
>>>>>> ACCESS_ONCE(), so my question is:
>>>>>>
>>>>>> Is that a compiler bug? If so, has this bug been fixed today, ten years later?
>>>>>>
>>>>>> What about READ_ONCE(ptr->field)?
>>>>>
>>>>> Make sure sparse is happy.
>>>>
>>>> It caused a problem without barrier(), and the deprecated ACCESS_ONCE() didn’t help:
>>>>
>>>> https://lore.kernel.org/all/[email protected]/
>>>>
>>>> So, my real question is: With READ_ONCE(ptr->field), are there still some unusual cases where gcc
>>>> decides not to reload ptr->field?
>>>
>>> I can not really answer without seeing an actual patch...
>>
>> The content of the potential patch:
>>
>> diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
>> index 89186c499dd4..bcd39670f359 100644
>> --- a/include/linux/rculist_nulls.h
>> +++ b/include/linux/rculist_nulls.h
>> @@ -158,15 +158,9 @@ static inline void hlist_nulls_add_fake(struct hlist_nulls_node *n)
>> * @pos: the &struct hlist_nulls_node to use as a loop cursor.
>> * @head: the head of the list.
>> * @member: the name of the hlist_nulls_node within the struct.
>> - *
>> - * The barrier() is needed to make sure compiler doesn't cache first element [1],
>> - * as this loop can be restarted [2]
>> - * [1] Documentation/memory-barriers.txt around line 1533
>> - * [2] Documentation/RCU/rculist_nulls.rst around line 146
>> */
>> #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \
>> - for (({barrier();}), \
>> - pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \
>> + for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \
>> (!is_a_nulls(pos)) && \
>> ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \
>> pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos)))
>> @@ -180,8 +174,7 @@ static inline void hlist_nulls_add_fake(struct hlist_nulls_node *n)
>> * @member: the name of the hlist_nulls_node within the struct.
>> */
>> #define hlist_nulls_for_each_entry_safe(tpos, pos, head, member) \
>> - for (({barrier();}), \
>> - pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \
>> + for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \
>> (!is_a_nulls(pos)) && \
>> ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); \
>> pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos)); 1; });)
>>
>>
>>>
>>> Why are you asking ? Are you tracking compiler bug fixes ?
>>
>> The barrier() here makes me confused.
>>
>> If we really need that, do we need:
>>
>> READ_ONCE(head->first);
>> barrier();
>> READ_ONCE(head->first);
>>
>
> Nope, the patch you want to revert (while it did fix (by pure luck
> ???) a real bug back in the days) was replacing
>
> ACCESS_ONCE()
>
> by
>
> barrier();
> ACCESS_ONCE();

Yeah.

The commit message and related discussions all indicate
that the compiler cached a value accessed through volatile.

That’s why I asked here.

>
> (There is one ACCESS_ONCE(), not two of them)
>
> BTW,
> barrier();
> followed by an arbitrary number of barrier(); back to back,
> translates to one barrier()
>
> Frankly, I would not change the code, unless someone can explain what
> was the issue.
> (Perhaps there was a missing barrier elsewhere)

Fair enough, although I feel like this is masking the real issue.

(I feel like no one will know the real issue ten years later, when no one knew it ten years ago.)

Anyway, thanks for your time.





2023-07-21 16:05:25

by Alan Huang

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()


> 2023年7月21日 23:21,Joel Fernandes <[email protected]> 写道:
>
> On 7/21/23 10:27, Alan Huang wrote:
>>> 2023年7月21日 20:54,Joel Fernandes <[email protected]> 写道:
>>>
>>>
>>>
>>>> On Jul 20, 2023, at 4:00 PM, Alan Huang <[email protected]> wrote:
>>>>
>>>> 
>>>>> 2023年7月21日 03:22,Eric Dumazet <[email protected]> 写道:
>>>>>
>>>>>> On Thu, Jul 20, 2023 at 8:54 PM Alan Huang <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
>>>>>> and a related discussion [1].
>>>>>>
>>>>>> After reading the whole discussion, it seems like that ptr->field was cached by gcc even with the deprecated
>>>>>> ACCESS_ONCE(), so my question is:
>>>>>>
>>>>>> Is that a compiler bug? If so, has this bug been fixed today, ten years later?
>>>>>>
>>>>>> What about READ_ONCE(ptr->field)?
>>>>>
>>>>> Make sure sparse is happy.
>>>>
>>>> It caused a problem without barrier(), and the deprecated ACCESS_ONCE() didn’t help:
>>>>
>>>> https://lore.kernel.org/all/[email protected]/
>>>>
>>>> So, my real question is: With READ_ONCE(ptr->field), are there still some unusual cases where gcc
>>>> decides not to reload ptr->field?
>>>
>>> I am a bit doubtful there will be strong (any?) interest in replacing the barrier() with READ_ONCE() without any tangible reason, regardless of whether a gcc issue was fixed.
>>>
>>> But hey, if you want to float the idea…
>> We already had the READ_ONCE() in rcu_deference_raw().
>> The barrier() here makes me think we need write code like below:
>>
>> READ_ONCE(head->first);
>> barrier();
>> READ_ONCE(head->first);
>> With READ_ONCE (or the deprecated ACCESS_ONCE),
>> I don’t think a compiler should cache the value of head->first.
>
>
> Right, it shouldn't need to cache. To Eric's point it might be risky to remove the barrier() and someone needs to explain that issue first (or IMO there needs to be another tangible reason like performance etc). Anyway, FWIW I wrote a simple program and I am not seeing the head->first cached with the pattern you shared above:
>
> #include <stdlib.h>
>
> #define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
> #define barrier() __asm__ __volatile__("": : :"memory")
>
> typedef struct list_head {
> int first;
> struct list_head *next;
> } list_head;
>
> int main() {
> list_head *head = (list_head *)malloc(sizeof(list_head));
> head->first = 1;
> head->next = 0;
>
> READ_ONCE(head->first);
> barrier();

Thanks for your time!

However, what I'm trying to say here is that without this barrier(), GCC wouldn't cache this value either.

So, I removed the barrier() and tested, GCC didn’t cache the value of head->first.
(Only tested on x86-64 (all the possible versions of gcc that Compiler Explorer has) where the original issue occurred [1].)

Therefore, the commit message and the related discussion ten years ago is misleading.

Thanks again!

[1] https://lkml.org/lkml/2013/4/16/371


> READ_ONCE(head->first);
>
> free(head);
> return 0;
> }
>
> On ARM 32-bit, 64-bit and x86_64, with -Os and then another experiment with -O2 on new gcc versions.



2023-07-21 16:52:47

by David Laight

[permalink] [raw]
Subject: RE: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

....
> Right, it shouldn't need to cache. To Eric's point it might be risky to remove
> the barrier() and someone needs to explain that issue first (or IMO there needs
> to be another tangible reason like performance etc). Anyway, FWIW I wrote a
> simple program and I am not seeing the head->first cached with the pattern you
> shared above:
>
> #include <stdlib.h>
>
> #define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
> #define barrier() __asm__ __volatile__("": : :"memory")
>
> typedef struct list_head {
> int first;
> struct list_head *next;
> } list_head;
>
> int main() {
> list_head *head = (list_head *)malloc(sizeof(list_head));
> head->first = 1;
> head->next = 0;
>
> READ_ONCE(head->first);
> barrier();
> READ_ONCE(head->first);
>
> free(head);
> return 0;
> }

You probably need to try harder to generate the error.
It probably has something to do code surrounding the
sk_nulls_for_each_rcu() in the ca065d0c^ version of udp.c.

That patch removes the retry loop - and probably breaks udp receive.
The issue is that sockets can be moved between the 'hash2' chains
(eg by connect()) without being freed.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2023-07-21 16:55:43

by Alan Huang

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()


> 2023年7月21日 19:51,David Laight <[email protected]> 写道:
>
> From: Alan Huang
>> Sent: 20 July 2023 19:54
>>
>> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
>> and a related discussion [1].
>
> Hmmm... that was all about the retry loop in ipv4/udp.c
>
> AFAICT that retry got deleted by ca065d0c.
>
> That also changes the list from hlist_nulls_xxx to hlist_xxx.
> (I'm not sure of the difference)
>
> This might be why we're seeing unexpected 'port unreachable' messages?
>
> Quite why that has just started happening is another issue.
> Most of the UDP sockets we create aren't 'connected' so I don't
> believe they get moved between hash chains - just deleted.
> The deletion should leave the hash chain intact.

Thanks for the information! :)

>
> David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)


2023-07-21 17:02:32

by Joel Fernandes

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()



> On Jul 21, 2023, at 11:55 AM, Alan Huang <[email protected]> wrote:
>
> 
>> 2023年7月21日 23:21,Joel Fernandes <[email protected]> 写道:
>>
>> On 7/21/23 10:27, Alan Huang wrote:
>>>> 2023年7月21日 20:54,Joel Fernandes <[email protected]> 写道:
>>>>
>>>>
>>>>
>>>>> On Jul 20, 2023, at 4:00 PM, Alan Huang <[email protected]> wrote:
>>>>>
>>>>> 
>>>>>> 2023年7月21日 03:22,Eric Dumazet <[email protected]> 写道:
>>>>>>
>>>>>>>> On Thu, Jul 20, 2023 at 8:54 PM Alan Huang <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
>>>>>>>> and a related discussion [1].
>>>>>>>>
>>>>>>>> After reading the whole discussion, it seems like that ptr->field was cached by gcc even with the deprecated
>>>>>>>> ACCESS_ONCE(), so my question is:
>>>>>>>>
>>>>>>>> Is that a compiler bug? If so, has this bug been fixed today, ten years later?
>>>>>>>>
>>>>>>>> What about READ_ONCE(ptr->field)?
>>>>>>>
>>>>>>> Make sure sparse is happy.
>>>>>>
>>>>>> It caused a problem without barrier(), and the deprecated ACCESS_ONCE() didn’t help:
>>>>>>
>>>>>> https://lore.kernel.org/all/[email protected]/
>>>>>>
>>>>>> So, my real question is: With READ_ONCE(ptr->field), are there still some unusual cases where gcc
>>>>>> decides not to reload ptr->field?
>>>>>
>>>>> I am a bit doubtful there will be strong (any?) interest in replacing the barrier() with READ_ONCE() without any tangible reason, regardless of whether a gcc issue was fixed.
>>>>>
>>>>> But hey, if you want to float the idea…
>>> We already had the READ_ONCE() in rcu_deference_raw().
>>> The barrier() here makes me think we need write code like below:
>>>
>>> READ_ONCE(head->first);
>>> barrier();
>>> READ_ONCE(head->first);
>>> With READ_ONCE (or the deprecated ACCESS_ONCE),
>>> I don’t think a compiler should cache the value of head->first.
>>
>>
>> Right, it shouldn't need to cache. To Eric's point it might be risky to remove the barrier() and someone needs to explain that issue first (or IMO there needs to be another tangible reason like performance etc). Anyway, FWIW I wrote a simple program and I am not seeing the head->first cached with the pattern you shared above:
>>
>> #include <stdlib.h>
>>
>> #define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
>> #define barrier() __asm__ __volatile__("": : :"memory")
>>
>> typedef struct list_head {
>> int first;
>> struct list_head *next;
>> } list_head;
>>
>> int main() {
>> list_head *head = (list_head *)malloc(sizeof(list_head));
>> head->first = 1;
>> head->next = 0;
>>
>> READ_ONCE(head->first);
>> barrier();
>
> Thanks for your time!
>
> However, what I'm trying to say here is that without this barrier(), GCC wouldn't cache this value either.
>

Yes that is what I tried too, removing the barrier.

Apologies for confusing, the code I shared did have it but I had also tried removing it.

Thanks,

- Joel


> So, I removed the barrier() and tested, GCC didn’t cache the value of head->first.
> (Only tested on x86-64 (all the possible versions of gcc that Compiler Explorer has) where the original issue occurred [1].)
>
> Therefore, the commit message and the related discussion ten years ago is misleading.
>
> Thanks again!
>
> [1] https://lkml.org/lkml/2013/4/16/371
>
>
>> READ_ONCE(head->first);
>>
>> free(head);
>> return 0;
>> }
>>
>> On ARM 32-bit, 64-bit and x86_64, with -Os and then another experiment with -O2 on new gcc versions.
>
>

2023-07-21 17:27:43

by Joel Fernandes

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

On Fri, Jul 21, 2023 at 11:59 AM David Laight <[email protected]> wrote:
>
> ....
> > Right, it shouldn't need to cache. To Eric's point it might be risky to remove
> > the barrier() and someone needs to explain that issue first (or IMO there needs
> > to be another tangible reason like performance etc). Anyway, FWIW I wrote a
> > simple program and I am not seeing the head->first cached with the pattern you
> > shared above:
> >
> > #include <stdlib.h>
> >
> > #define READ_ONCE(x) (*(volatile typeof(x) *)&(x))
> > #define barrier() __asm__ __volatile__("": : :"memory")
> >
> > typedef struct list_head {
> > int first;
> > struct list_head *next;
> > } list_head;
> >
> > int main() {
> > list_head *head = (list_head *)malloc(sizeof(list_head));
> > head->first = 1;
> > head->next = 0;
> >
> > READ_ONCE(head->first);
> > barrier();
> > READ_ONCE(head->first);
> >
> > free(head);
> > return 0;
> > }
>
> You probably need to try harder to generate the error.
> It probably has something to do code surrounding the
> sk_nulls_for_each_rcu() in the ca065d0c^ version of udp.c.
>
> That patch removes the retry loop - and probably breaks udp receive.
> The issue is that sockets can be moved between the 'hash2' chains
> (eg by connect()) without being freed.

I was just replying to Alan's question on the behavior of READ_ONCE()
since I myself recently got surprised by compiler optimizations
related to it. I haven't looked into the actual UDP code.

- Joel

2023-08-03 14:35:28

by Paul E. McKenney

[permalink] [raw]
Subject: Re: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

On Thu, Aug 03, 2023 at 09:40:11PM +0800, Alan Huang wrote:
>
> > 2023年8月1日 上午4:09,Paul E. McKenney <[email protected]> 写道:
> >
> > On Fri, Jul 21, 2023 at 10:27:04PM +0800, Alan Huang wrote:
> >>
> >>> 2023年7月21日 20:54,Joel Fernandes <[email protected]> 写道:
> >>>
> >>>
> >>>
> >>>> On Jul 20, 2023, at 4:00 PM, Alan Huang <[email protected]> wrote:
> >>>>
> >>>> 
> >>>>> 2023年7月21日 03:22,Eric Dumazet <[email protected]> 写道:
> >>>>>
> >>>>>> On Thu, Jul 20, 2023 at 8:54 PM Alan Huang <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I noticed a commit c87a124a5d5e(“net: force a reload of first item in hlist_nulls_for_each_entry_rcu”)
> >>>>>> and a related discussion [1].
> >>>>>>
> >>>>>> After reading the whole discussion, it seems like that ptr->field was cached by gcc even with the deprecated
> >>>>>> ACCESS_ONCE(), so my question is:
> >>>>>>
> >>>>>> Is that a compiler bug? If so, has this bug been fixed today, ten years later?
> >>>>>>
> >>>>>> What about READ_ONCE(ptr->field)?
> >>>>>
> >>>>> Make sure sparse is happy.
> >>>>
> >>>> It caused a problem without barrier(), and the deprecated ACCESS_ONCE() didn’t help:
> >>>>
> >>>> https://lore.kernel.org/all/[email protected]/
> >>>>
> >>>> So, my real question is: With READ_ONCE(ptr->field), are there still some unusual cases where gcc
> >>>> decides not to reload ptr->field?
> >>>
> >>> I am a bit doubtful there will be strong (any?) interest in replacing the barrier() with READ_ONCE() without any tangible reason, regardless of whether a gcc issue was fixed.
> >>>
> >>> But hey, if you want to float the idea…
> >>
> >> We already had the READ_ONCE() in rcu_deference_raw().
> >>
> >> The barrier() here makes me think we need write code like below:
> >>
> >> READ_ONCE(head->first);
> >> barrier();
> >> READ_ONCE(head->first);
> >>
> >> With READ_ONCE (or the deprecated ACCESS_ONCE),
> >> I don’t think a compiler should cache the value of head->first.
> >
> > Apologies for the late reply!
> >
> > If both are READ_ONCE(), you should not need the barrier(). Unless there
> > is some other code not shown in your example that requires it, that is.
>
> And unless the compiler has a bug. :)
>
> So, the barrier() in hlist_nulls_for_each_entry_rcu() is a workaround for a compiler bug.

Fair enough!!! ;-)


Thanx, Paul

> >>> Thanks,
> >>>
> >>> - Joel
> >>>
> >>>>
> >>>>>
> >>>>> Do you have a patch for review ?
> >>>>
> >>>> Possibly next month. :)
> >>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> [1] https://lore.kernel.org/all/1369699930.3301.494.camel@edumazet-glaptop/
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Alan
>

2023-08-03 15:11:21

by David Laight

[permalink] [raw]
Subject: RE: Question about the barrier() in hlist_nulls_for_each_entry_rcu()

From: Paul E. McKenney
> Sent: 03 August 2023 14:54
....
> > > If both are READ_ONCE(), you should not need the barrier(). Unless there
> > > is some other code not shown in your example that requires it, that is.
> >
> > And unless the compiler has a bug. :)
> >
> > So, the barrier() in hlist_nulls_for_each_entry_rcu() is a workaround for a compiler bug.
>
> Fair enough!!! ;-)

Except that it is likely that the compiler bug is avoided by the
implementation of READ_ONCE() rather than ACCESS_ONCE().

Also the code that looped forever (UDP receive socket lookup)
no longer has the retry - which is a different bug.
If a socket rehash hits the lookup then an erroneous ICMP
'port unreachable' is sent rather than doing a rescan.

David


-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)