2022-03-03 05:22:18

by David Laight

[permalink] [raw]
Subject: RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

From: Xiaomeng Tong
> Sent: 03 March 2022 02:27
>
> On Wed, 2 Mar 2022 14:04:06 +0000, David Laight
> <[email protected]> wrote:
> > I think that it would be better to make any alternate loop macro
> > just set the variable to NULL on the loop exit.
> > That is easier to code for and the compiler might be persuaded to
> > not redo the test.
>
> No, that would lead to a NULL dereference.

Why, it would make it b ethe same as the 'easy to use':
for (item = head; item; item = item->next) {
...
if (...)
break;
...
}
if (!item)
return;

> The problem is the mis-use of iterator outside the loop on exit, and
> the iterator will be the HEAD's container_of pointer which pointers
> to a type-confused struct. Sidenote: The *mis-use* here refers to
> mistakely access to other members of the struct, instead of the
> list_head member which acutally is the valid HEAD.

The problem is that the HEAD's container_of pointer should never
be calculated at all.
This is what is fundamentally broken about the current definition.

> IOW, you would dereference a (NULL + offset_of_member) address here.

Where?

> Please remind me if i missed something, thanks.
>
> Can you share your "alternative definitions" details? thanks!

The loop should probably use as extra variable that points
to the 'list node' in the next structure.
Something like:
for (xxx *iter = head->next;
iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1));
iter = item->member->next) {
...
With a bit of casting you can use 'item' to hold 'iter'.

>
> > OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kerrnel.
>
> The "list_for_each_entry_inside(pos, type, head, member)" way makes
> the iterator invisiable outside the loop, and would be catched by
> compiler if use-after-loop things happened.

It is also a compete PITA for anything doing a search.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2022-03-03 07:49:20

by Xiaomeng Tong

[permalink] [raw]
Subject: RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

On Thu, 3 Mar 2022 04:58:23 +0000, David Laight wrote:
> on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > The problem is the mis-use of iterator outside the loop on exit, and
> > the iterator will be the HEAD's container_of pointer which pointers
> > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > mistakely access to other members of the struct, instead of the
> > list_head member which acutally is the valid HEAD.
>
> The problem is that the HEAD's container_of pointer should never
> be calculated at all.
> This is what is fundamentally broken about the current definition.

Yes, the rule is "the HEAD's container_of pointer should never be
calculated at all outside the loop", but how do you make sure everyone
follows this rule?
Everyone makes mistakes, but we can eliminate them all from the beginning
with the help of compiler which can catch such use-after-loop things.

> > IOW, you would dereference a (NULL + offset_of_member) address here.
>
>Where?

In the case where a developer do not follows the above rule, and mistakely
access a non-list-head member of the HEAD's container_of pointer outside
the loop. For example:
struct req{
int a;
struct list_head h;
}
struct req *r;
list_for_each_entry(r, HEAD, h) {
if (r->a == 0x10)
break;
}
// the developer made a mistake: he didn't take this situation into
// account where all entries in the list are *r->a != 0x10*, and now
// the r is the HEAD's container_of pointer.
r->a = 0x20;
Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
address here.

> > Please remind me if i missed something, thanks.
> >
> > Can you share your "alternative definitions" details? thanks!
>
> The loop should probably use as extra variable that points
> to the 'list node' in the next structure.
> Something like:
> for (xxx *iter = head->next;
> iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1));
> iter = item->member->next) {
> ...
> With a bit of casting you can use 'item' to hold 'iter'.

you still can not make sure everyone follows this rule:
"do not use iterator outside the loop" without the help of compiler,
because item is declared outside the loop.

BTW, to avoid ambiguity,the "alternative definitions" here i asked is
something from you in this context:
"OTOH there may be alternative definitions that can be used to get
the compiler (or other compiler-like tools) to detect broken code.
Even if the definition can't possibly generate a working kerrnel."

> >
> > > OTOH there may be alternative definitions that can be used to get
> > > the compiler (or other compiler-like tools) to detect broken code.
> > > Even if the definition can't possibly generate a working kerrnel.
> >
> > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > the iterator invisiable outside the loop, and would be catched by
> > compiler if use-after-loop things happened.

> It is also a compete PITA for anything doing a search.

You mean it would be a burden on search? can you show me some examples?

Or you mean it is too long the list_for_each_entry_inside* string to live
in one single line, and should spilt into two line? If it is the case, there
are 2 way to mitigate it.
1. pass a shorter t as struct type to the macro
2. after all list_for_each_entry macros be replaced with
list_for_each_entry_inside, remove the list_for_each_entry implementations
and rename all list_for_each_entry_inside use back to the origin
list_for_each_entry in a single patch.

For me, it is acceptable and not a such big problem.

--
Xiaomeng Tong

2022-03-03 09:44:09

by David Laight

[permalink] [raw]
Subject: RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

From: Xiaomeng Tong
> Sent: 03 March 2022 07:27
>
> On Thu, 3 Mar 2022 04:58:23 +0000, David Laight wrote:
> > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > > The problem is the mis-use of iterator outside the loop on exit, and
> > > the iterator will be the HEAD's container_of pointer which pointers
> > > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > > mistakely access to other members of the struct, instead of the
> > > list_head member which acutally is the valid HEAD.
> >
> > The problem is that the HEAD's container_of pointer should never
> > be calculated at all.
> > This is what is fundamentally broken about the current definition.
>
> Yes, the rule is "the HEAD's container_of pointer should never be
> calculated at all outside the loop", but how do you make sure everyone
> follows this rule?
> Everyone makes mistakes, but we can eliminate them all from the beginning
> with the help of compiler which can catch such use-after-loop things.
>
> > > IOW, you would dereference a (NULL + offset_of_member) address here.
> >
> >Where?
>
> In the case where a developer do not follows the above rule, and mistakely
> access a non-list-head member of the HEAD's container_of pointer outside
> the loop. For example:
> struct req{
> int a;
> struct list_head h;
> }
> struct req *r;
> list_for_each_entry(r, HEAD, h) {
> if (r->a == 0x10)
> break;
> }
> // the developer made a mistake: he didn't take this situation into
> // account where all entries in the list are *r->a != 0x10*, and now
> // the r is the HEAD's container_of pointer.
> r->a = 0x20;
> Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
> address here.

That is just a bug.
No different to failing to check anything else might 'return'
a NULL pointer.
Because it is a NULL dereference you find out pretty quickly.
The existing loop leaves you with a valid pointer to something
that isn't a list item.

> > > Please remind me if i missed something, thanks.
> > >
> > > Can you share your "alternative definitions" details? thanks!
> >
> > The loop should probably use as extra variable that points
> > to the 'list node' in the next structure.
> > Something like:
> > for (xxx *iter = head->next;
> > iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1));
> > iter = item->member->next) {
> > ...
> > With a bit of casting you can use 'item' to hold 'iter'.
>
> you still can not make sure everyone follows this rule:
> "do not use iterator outside the loop" without the help of compiler,
> because item is declared outside the loop.

That one has 'iter' defined in the loop.

> BTW, to avoid ambiguity,the "alternative definitions" here i asked is
> something from you in this context:
> "OTOH there may be alternative definitions that can be used to get
> the compiler (or other compiler-like tools) to detect broken code.
> Even if the definition can't possibly generate a working kerrnel."

I was thinking of something like:
if ((pos = list_first)), 1) pos = NULL else
so that unchecked dereferences after the loop will be detectable
as NULL pointer offsets - but that in itself isn't enough to avoid
other warnings.

> > > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > > the iterator invisiable outside the loop, and would be catched by
> > > compiler if use-after-loop things happened.
>
> > It is also a compete PITA for anything doing a search.
>
> You mean it would be a burden on search? can you show me some examples?

The whole business of having to save the pointer to the located item
before breaking the loop, remembering to have set it to NULL earlier etc.

It is so much better if you can just do:
if (found)
break;

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-03-03 13:04:54

by Xiaomeng Tong

[permalink] [raw]
Subject: RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

> From: Xiaomeng Tong
> > Sent: 03 March 2022 07:27
> >
> > On Thu, 3 Mar 2022 04:58:23 +0000, David Laight wrote:
> > > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > > > The problem is the mis-use of iterator outside the loop on exit, and
> > > > the iterator will be the HEAD's container_of pointer which pointers
> > > > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > > > mistakely access to other members of the struct, instead of the
> > > > list_head member which acutally is the valid HEAD.
> > >
> > > The problem is that the HEAD's container_of pointer should never
> > > be calculated at all.
> > > This is what is fundamentally broken about the current definition.
> >
> > Yes, the rule is "the HEAD's container_of pointer should never be
> > calculated at all outside the loop", but how do you make sure everyone
> > follows this rule?
> > Everyone makes mistakes, but we can eliminate them all from the beginning
> > with the help of compiler which can catch such use-after-loop things.
> >
> > > > IOW, you would dereference a (NULL + offset_of_member) address here.
> > >
> > >Where?
> >
> > In the case where a developer do not follows the above rule, and mistakely
> > access a non-list-head member of the HEAD's container_of pointer outside
> > the loop. For example:
> > struct req{
> > int a;
> > struct list_head h;
> > }
> > struct req *r;
> > list_for_each_entry(r, HEAD, h) {
> > if (r->a == 0x10)
> > break;
> > }
> > // the developer made a mistake: he didn't take this situation into
> > // account where all entries in the list are *r->a != 0x10*, and now
> > // the r is the HEAD's container_of pointer.
> > r->a = 0x20;
> > Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
> > address here.
>
> That is just a bug.
> No different to failing to check anything else might 'return'
> a NULL pointer.

Yes, but it‘s a mistake everyone has made and will make, we should avoid
this at the beginning with the help of compiler.

> Because it is a NULL dereference you find out pretty quickly.

AFAIK,NULL dereference is a undefined bahavior, can compiler quickly
catch it? Or it can only be applied to some simple/restricted cases.

> The existing loop leaves you with a valid pointer to something
> that isn't a list item.
>
> > > > Please remind me if i missed something, thanks.
> > > >
> > > > Can you share your "alternative definitions" details? thanks!
> > >
> > > The loop should probably use as extra variable that points
> > > to the 'list node' in the next structure.
> > > Something like:
> > > for (xxx *iter = head->next;
> > > iter == &head ? ((item = NULL),0) : ((item = list_item(iter),1));
> > > iter = item->member->next) {
> > > ...
> > > With a bit of casting you can use 'item' to hold 'iter'.
> >
> > you still can not make sure everyone follows this rule:
> > "do not use iterator outside the loop" without the help of compiler,
> > because item is declared outside the loop.
>
> That one has 'iter' defined in the loop.

Oh, sorry, I misunderstood. Then this is the same way with my way of
list_for_each_entry_inside(pos, type, head, member), which declare
the iterator inside the loop.
You go further and make things like "&pos->member == (head)" goes away
to avoid calculate the HEAD's container_of pointer, although the
calculation is harmless.

>
> > BTW, to avoid ambiguity,the "alternative definitions" here i asked is
> > something from you in this context:
> > "OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kerrnel."
>
> I was thinking of something like:
> if ((pos = list_first)), 1) pos = NULL else
> so that unchecked dereferences after the loop will be detectable
> as NULL pointer offsets - but that in itself isn't enough to avoid
> other warnings.
>

Do you mean put this right after the loop (I changed somthing that i
do not understand, please correct me if i am worng, thanks):
if (pos == list_first) pos = NULL; else
and compiler can detect the following NULL derefernce?
But if the NULL derefernce is just right after the loop originally,
it will be masked by the *else* keyword。

> > > > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > > > the iterator invisiable outside the loop, and would be catched by
> > > > compiler if use-after-loop things happened.
> >
> > > It is also a compete PITA for anything doing a search.
> >
> > You mean it would be a burden on search? can you show me some examples?
>
> The whole business of having to save the pointer to the located item
> before breaking the loop, remembering to have set it to NULL earlier etc.

Ok, i see. And then you need pass a "item" to the list_for_each_entry macro
as a new argument.

>
> It is so much better if you can just do:
> if (found)
> break;
>
> David

this confused me. this way is better or the "save the pointer to the located item
before breaking the loop" one?

--
Xiaomeng Tong