2021-09-22 00:59:25

by Kees Cook

[permalink] [raw]
Subject: Re: [patch 9/9] mm/vmalloc: add __alloc_size attributes for better bounds checking

On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
> On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
> >
> > +__alloc_size(1)
> > extern void *vmalloc(unsigned long size);
> [...]
>
> All of these are added in the wrong place - inconsistent with the very
> compiler documentation the patches add.
>
> The function attributes are generally added _after_ the function,
> although admittedly we've been quite confused here before.
>
> But the very compiler documentation you point to in the patch that
> adds these macros gives that as the examples both for gcc and clang:
>
> + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
> + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
>
> and honestly I think that is the preferred format because this is
> about the *function*, not about the return type.
>
> Do both placements work? Yes.

I'm cleaning this up now, and have discovered that the reason for the
before-function placement is consistency with static inlines. If I do this:

static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
{
...
}

GCC is very angry:

./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
| ^~~~~~

It's happy if I treat it as a "return type attribute" in the ordering,
though:

static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)

I'll do that unless you have a preference for somewhere else...

-Kees

--
Kees Cook


2021-09-22 02:04:53

by Joe Perches

[permalink] [raw]
Subject: Re: [patch 9/9] mm/vmalloc: add __alloc_size attributes for better bounds checking

On Tue, 2021-09-21 at 16:37 -0700, Kees Cook wrote:
> On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
> > On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
> > >
> > > +__alloc_size(1)
> > > ?extern void *vmalloc(unsigned long size);
> > [...]
> >
> > All of these are added in the wrong place - inconsistent with the very
> > compiler documentation the patches add.
> >
> > The function attributes are generally added _after_ the function,
> > although admittedly we've been quite confused here before.
> >
> > But the very compiler documentation you point to in the patch that
> > adds these macros gives that as the examples both for gcc and clang:
> >
> > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
> > + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
> >
> > and honestly I think that is the preferred format because this is
> > about the *function*, not about the return type.
> >
> > Do both placements work? Yes.
>
> I'm cleaning this up now, and have discovered that the reason for the
> before-function placement is consistency with static inlines. If I do this:
>
> static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
> {
> ...
> }
>
> GCC is very angry:
>
> ./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
> ??519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
> ??????| ^~~~~~
>
> It's happy if I treat it as a "return type attribute" in the ordering,
> though:
>
> static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)
>
> I'll do that unless you have a preference for somewhere else...

_please_ put it before the return type on a separate line.

[__attributes]
[static inline const] <return type> function(<args...>)




2021-09-22 02:27:25

by Kees Cook

[permalink] [raw]
Subject: function prototype element ordering

On Tue, Sep 21, 2021 at 04:45:44PM -0700, Joe Perches wrote:
> On Tue, 2021-09-21 at 16:37 -0700, Kees Cook wrote:
> > On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
> > > On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
> > > >
> > > > +__alloc_size(1)
> > > > ?extern void *vmalloc(unsigned long size);
> > > [...]
> > >
> > > All of these are added in the wrong place - inconsistent with the very
> > > compiler documentation the patches add.
> > >
> > > The function attributes are generally added _after_ the function,
> > > although admittedly we've been quite confused here before.
> > >
> > > But the very compiler documentation you point to in the patch that
> > > adds these macros gives that as the examples both for gcc and clang:
> > >
> > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
> > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
> > >
> > > and honestly I think that is the preferred format because this is
> > > about the *function*, not about the return type.
> > >
> > > Do both placements work? Yes.
> >
> > I'm cleaning this up now, and have discovered that the reason for the
> > before-function placement is consistency with static inlines. If I do this:
> >
> > static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
> > {
> > ...
> > }
> >
> > GCC is very angry:
> >
> > ./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
> > ??519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
> > ??????| ^~~~~~
> >
> > It's happy if I treat it as a "return type attribute" in the ordering,
> > though:
> >
> > static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)
> >
> > I'll do that unless you have a preference for somewhere else...
>
> _please_ put it before the return type on a separate line.
>
> [__attributes]
> [static inline const] <return type> function(<args...>)

Somehow Linus wasn't in CC. :P

Linus, what do you want here? I keep getting conflicting (or
uncompilable) advice. I'm also trying to prepare a patch for
Documentation/process/coding-style.rst ...

Looking through what was written before[1] and through examples in the
source tree, I find the following categories:

1- storage class: static extern inline __always_inline
2- storage class attributes/hints/???: __init __cold
3- return type: void *
4- return type attributes: __must_check __noreturn __assume_aligned(n)
5- function attributes: __attribute_const__ __malloc
6- function argument attributes: __printf(n, m) __alloc_size(n)

Everyone seems to basically agree on:

[storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)

There is a lot of disagreement over where 5 and 6 should fit in above. And
there is a lot of confusion over 4 (mixed between before and after the
function name) and 2 (see below).

What's currently blocking me is that 6 cannot go after the function
(for definitions) because it angers GCC (see quoted bit above), but 5
can (e.g. __attribute_const__).

Another inconsistency seems to be 2 (mainly section markings like
__init). Sometimes it's after the storage class and sometimes after the
return type, but it certainly feels more like a storage class than a
return type attribute:

$ git grep 'static __init int' | wc -l
349
$ git grep 'static int __init' | wc -l
8402

But it's clearly positioned like a return type attribute in most of the
tree. What's correct?

Regardless, given the constraints above, it seems like what Linus may
want is (on "one line", though it will get wrapped in pathological cases
like kmem_cache_alloc_node_trace):

[storage class] [storage class attributes] [return type] [return type attributes] [function argument attributes] [name]([arg1type] [arg1name], ...) [function attributes]

Joe appears to want (on two lines):

[storage class attributes] [function attributes] [function argument attributes]
[storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)

I would just like to have an arrangement that won't get NAKed by
someone. ;) And I'm willing to document it. :)

-Kees

[1] https://lore.kernel.org/mm-commits/CAHk-=wiOCLRny5aifWNhr621kYrJwhfURsa0vFPeUEm8mF0ufg@mail.gmail.com/

--
Kees Cook

2021-09-22 04:26:15

by Joe Perches

[permalink] [raw]
Subject: Re: function prototype element ordering

On Tue, 2021-09-21 at 19:25 -0700, Kees Cook wrote:
> On Tue, Sep 21, 2021 at 04:45:44PM -0700, Joe Perches wrote:
> > On Tue, 2021-09-21 at 16:37 -0700, Kees Cook wrote:
> > > On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
> > > > On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
> > > > >
> > > > > +__alloc_size(1)
> > > > > ?extern void *vmalloc(unsigned long size);
> > > > [...]
> > > >
> > > > All of these are added in the wrong place - inconsistent with the very
> > > > compiler documentation the patches add.
> > > >
> > > > The function attributes are generally added _after_ the function,
> > > > although admittedly we've been quite confused here before.
> > > >
> > > > But the very compiler documentation you point to in the patch that
> > > > adds these macros gives that as the examples both for gcc and clang:
> > > >
> > > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
> > > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
> > > >
> > > > and honestly I think that is the preferred format because this is
> > > > about the *function*, not about the return type.
> > > >
> > > > Do both placements work? Yes.
> > >
> > > I'm cleaning this up now, and have discovered that the reason for the
> > > before-function placement is consistency with static inlines. If I do this:
> > >
> > > static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
> > > {
> > > ...
> > > }
> > >
> > > GCC is very angry:
> > >
> > > ./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
> > > ??519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
> > > ??????| ^~~~~~
> > >
> > > It's happy if I treat it as a "return type attribute" in the ordering,
> > > though:
> > >
> > > static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)
> > >
> > > I'll do that unless you have a preference for somewhere else...
> >
> > _please_ put it before the return type on a separate line.
> >
> > [__attributes]
> > [static inline const] <return type> function(<args...>)
>
> Somehow Linus wasn't in CC. :P
>
> Linus, what do you want here? I keep getting conflicting (or
> uncompilable) advice. I'm also trying to prepare a patch for
> Documentation/process/coding-style.rst ...
>
> Looking through what was written before[1] and through examples in the
> source tree, I find the following categories:
>
> 1- storage class: static extern inline __always_inline
> 2- storage class attributes/hints/???: __init __cold
> 3- return type: void *
> 4- return type attributes: __must_check __noreturn __assume_aligned(n)
> 5- function attributes: __attribute_const__ __malloc
> 6- function argument attributes: __printf(n, m) __alloc_size(n)
>
> Everyone seems to basically agree on:
>
> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>
> There is a lot of disagreement over where 5 and 6 should fit in above. And
> there is a lot of confusion over 4 (mixed between before and after the
> function name) and 2 (see below).
>
> What's currently blocking me is that 6 cannot go after the function
> (for definitions) because it angers GCC (see quoted bit above), but 5
> can (e.g. __attribute_const__).
>
> Another inconsistency seems to be 2 (mainly section markings like
> __init). Sometimes it's after the storage class and sometimes after the
> return type, but it certainly feels more like a storage class than a
> return type attribute:
>
> $ git grep 'static __init int' | wc -l
> 349
> $ git grep 'static int __init' | wc -l
> 8402
>
> But it's clearly positioned like a return type attribute in most of the
> tree. What's correct?

Neither really. 'Correct' is such a difficult concept.
'Preferred' might be better.

btw: there are about another 100 other uses with '__init' as the
initial attribute, mostly in trace.

And I still think that return type attributes like __init, which is
just a __section define, should go before the function storage class
and ideally on a separate line to simplify the parsing of the actual
function declaration. Attributes like __section, __aligned, __cold,
etc... don't have much value when looking up a function definition.

> Regardless, given the constraints above, it seems like what Linus may
> want is (on "one line", though it will get wrapped in pathological cases
> like kmem_cache_alloc_node_trace):

Pathological is pretty common these days as the function name length
is rather longer now than earlier times.

> [storage class] [storage class attributes] [return type] [return type attributes] [function argument attributes] [name]([arg1type] [arg1name], ...) [function attributes]
>
> Joe appears to want (on two lines):
>
> [storage class attributes] [function attributes] [function argument attributes]
> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)

I would put [return type attributes] on the initial separate line
even though that's not the most common use today.

> I would just like to have an arrangement that won't get NAKed by
> someone. ;)

Bikeshed building dreamer...

btw:

Scouting through kernel code for frequency of use examples really
should have some age of code checking associated to the use.

Older code was far more freeform than more recently written code.

But IMO the desire here is to ask for a bit more uniformity, not
require it.


2021-09-22 07:28:30

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: function prototype element ordering

On Tue, Sep 21, 2021 at 07:25:53PM -0700, Kees Cook wrote:
> On Tue, Sep 21, 2021 at 04:45:44PM -0700, Joe Perches wrote:
> > On Tue, 2021-09-21 at 16:37 -0700, Kees Cook wrote:
> > > On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
> > > > On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
> > > > >
> > > > > +__alloc_size(1)
> > > > >  extern void *vmalloc(unsigned long size);
> > > > [...]
> > > >
> > > > All of these are added in the wrong place - inconsistent with the very
> > > > compiler documentation the patches add.
> > > >
> > > > The function attributes are generally added _after_ the function,
> > > > although admittedly we've been quite confused here before.
> > > >
> > > > But the very compiler documentation you point to in the patch that
> > > > adds these macros gives that as the examples both for gcc and clang:
> > > >
> > > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
> > > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
> > > >
> > > > and honestly I think that is the preferred format because this is
> > > > about the *function*, not about the return type.
> > > >
> > > > Do both placements work? Yes.
> > >
> > > I'm cleaning this up now, and have discovered that the reason for the
> > > before-function placement is consistency with static inlines. If I do this:
> > >
> > > static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
> > > {
> > > ...
> > > }
> > >
> > > GCC is very angry:
> > >
> > > ./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
> > >   519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
> > >       | ^~~~~~
> > >
> > > It's happy if I treat it as a "return type attribute" in the ordering,
> > > though:
> > >
> > > static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)
> > >
> > > I'll do that unless you have a preference for somewhere else...
> >
> > _please_ put it before the return type on a separate line.
> >
> > [__attributes]
> > [static inline const] <return type> function(<args...>)
>
> Somehow Linus wasn't in CC. :P
>
> Linus, what do you want here? I keep getting conflicting (or
> uncompilable) advice. I'm also trying to prepare a patch for
> Documentation/process/coding-style.rst ...
>
> Looking through what was written before[1] and through examples in the
> source tree, I find the following categories:
>
> 1- storage class: static extern inline __always_inline
> 2- storage class attributes/hints/???: __init __cold
> 3- return type: void *
> 4- return type attributes: __must_check __noreturn __assume_aligned(n)
> 5- function attributes: __attribute_const__ __malloc
> 6- function argument attributes: __printf(n, m) __alloc_size(n)
>
> Everyone seems to basically agree on:
>
> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>
> There is a lot of disagreement over where 5 and 6 should fit in above. And
> there is a lot of confusion over 4 (mixed between before and after the
> function name) and 2 (see below).
>
> What's currently blocking me is that 6 cannot go after the function
> (for definitions) because it angers GCC (see quoted bit above), but 5
> can (e.g. __attribute_const__).
>
> Another inconsistency seems to be 2 (mainly section markings like
> __init). Sometimes it's after the storage class and sometimes after the
> return type, but it certainly feels more like a storage class than a
> return type attribute:
>
> $ git grep 'static __init int' | wc -l
> 349
> $ git grep 'static int __init' | wc -l
> 8402
>
> But it's clearly positioned like a return type attribute in most of the
> tree. What's correct?
>
> Regardless, given the constraints above, it seems like what Linus may
> want is (on "one line", though it will get wrapped in pathological cases
> like kmem_cache_alloc_node_trace):
>
> [storage class] [storage class attributes] [return type] [return type attributes] [function argument attributes] [name]([arg1type] [arg1name], ...) [function attributes]
>
> Joe appears to want (on two lines):
>
> [storage class attributes] [function attributes] [function argument attributes]
> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>
> I would just like to have an arrangement that won't get NAKed by
> someone. ;) And I'm willing to document it. :)

Attributes should be on their own line, they can be quite lengthy.

__attribute__((...))
[static] [inline] T f(A1 arg1, ...)
{
...
}

There will be even more attributes in the future, both added by
compilers and developers (const, pure, WUR), so let's make "prototype lane"
for them.

Same for structures:

__attribute__((packed))
struct S {
};

Kernel practice of hiding attributes under defines (__ro_after_init)
breaks ctags which parses the last identifier before semicolon as object
name. Naturally, it is ctags bug, but placing attributes before
declaration will autmatically unbreak such cases.

2021-09-22 08:55:51

by Joe Perches

[permalink] [raw]
Subject: Re: function prototype element ordering

On Wed, 2021-09-22 at 10:24 +0300, Alexey Dobriyan wrote:

> Attributes should be on their own line, they can be quite lengthy.
>
> __attribute__((...))
> [static] [inline] T f(A1 arg1, ...)
> {
> ...
> }
>
> There will be even more attributes in the future, both added by
> compilers and developers (const, pure, WUR), so let's make "prototype lane"
> for them.
>
> Same for structures:
>
> __attribute__((packed))
> struct S {
> };

Do you know if placing attributes like __packed/__aligned() before
definitions would work for all cases for structs/substructs/unions?


2021-09-22 10:46:20

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: function prototype element ordering

On Wed, Sep 22, 2021 at 01:51:34AM -0700, Joe Perches wrote:
> On Wed, 2021-09-22 at 10:24 +0300, Alexey Dobriyan wrote:
>
> > Attributes should be on their own line, they can be quite lengthy.
> >
> > __attribute__((...))
> > [static] [inline] T f(A1 arg1, ...)
> > {
> > ...
> > }
> >
> > There will be even more attributes in the future, both added by
> > compilers and developers (const, pure, WUR), so let's make "prototype lane"
> > for them.
> >
> > Same for structures:
> >
> > __attribute__((packed))
> > struct S {
> > };
>
> Do you know if placing attributes like __packed/__aligned() before
> definitions would work for all cases for structs/substructs/unions?

Somehow, it doesn't.

But it works for members:

struct S {
__attribute__((aligned(16)))
int a;
};

2021-09-22 11:20:51

by Jani Nikula

[permalink] [raw]
Subject: Re: function prototype element ordering

On Wed, 22 Sep 2021, Alexey Dobriyan <[email protected]> wrote:
> On Tue, Sep 21, 2021 at 07:25:53PM -0700, Kees Cook wrote:
>> On Tue, Sep 21, 2021 at 04:45:44PM -0700, Joe Perches wrote:
>> > On Tue, 2021-09-21 at 16:37 -0700, Kees Cook wrote:
>> > > On Fri, Sep 10, 2021 at 10:23:48AM -0700, Linus Torvalds wrote:
>> > > > On Thu, Sep 9, 2021 at 8:10 PM Andrew Morton <[email protected]> wrote:
>> > > > >
>> > > > > +__alloc_size(1)
>> > > > >  extern void *vmalloc(unsigned long size);
>> > > > [...]
>> > > >
>> > > > All of these are added in the wrong place - inconsistent with the very
>> > > > compiler documentation the patches add.
>> > > >
>> > > > The function attributes are generally added _after_ the function,
>> > > > although admittedly we've been quite confused here before.
>> > > >
>> > > > But the very compiler documentation you point to in the patch that
>> > > > adds these macros gives that as the examples both for gcc and clang:
>> > > >
>> > > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
>> > > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
>> > > >
>> > > > and honestly I think that is the preferred format because this is
>> > > > about the *function*, not about the return type.
>> > > >
>> > > > Do both placements work? Yes.
>> > >
>> > > I'm cleaning this up now, and have discovered that the reason for the
>> > > before-function placement is consistency with static inlines. If I do this:
>> > >
>> > > static __always_inline void * kmalloc(size_t size, gfp_t flags) __alloc_size(1)
>> > > {
>> > > ...
>> > > }
>> > >
>> > > GCC is very angry:
>> > >
>> > > ./include/linux/slab.h:519:1: error: attributes should be specified before the declarator in a function definition
>> > >   519 | static __always_inline void *kmalloc_large(size_t size, gfp_t flags) __alloc_size(1)
>> > >       | ^~~~~~
>> > >
>> > > It's happy if I treat it as a "return type attribute" in the ordering,
>> > > though:
>> > >
>> > > static __always_inline void * __alloc_size(1) kmalloc(size_t size, gfp_t flags)
>> > >
>> > > I'll do that unless you have a preference for somewhere else...
>> >
>> > _please_ put it before the return type on a separate line.
>> >
>> > [__attributes]
>> > [static inline const] <return type> function(<args...>)
>>
>> Somehow Linus wasn't in CC. :P
>>
>> Linus, what do you want here? I keep getting conflicting (or
>> uncompilable) advice. I'm also trying to prepare a patch for
>> Documentation/process/coding-style.rst ...
>>
>> Looking through what was written before[1] and through examples in the
>> source tree, I find the following categories:
>>
>> 1- storage class: static extern inline __always_inline
>> 2- storage class attributes/hints/???: __init __cold
>> 3- return type: void *
>> 4- return type attributes: __must_check __noreturn __assume_aligned(n)
>> 5- function attributes: __attribute_const__ __malloc
>> 6- function argument attributes: __printf(n, m) __alloc_size(n)
>>
>> Everyone seems to basically agree on:
>>
>> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>>
>> There is a lot of disagreement over where 5 and 6 should fit in above. And
>> there is a lot of confusion over 4 (mixed between before and after the
>> function name) and 2 (see below).
>>
>> What's currently blocking me is that 6 cannot go after the function
>> (for definitions) because it angers GCC (see quoted bit above), but 5
>> can (e.g. __attribute_const__).
>>
>> Another inconsistency seems to be 2 (mainly section markings like
>> __init). Sometimes it's after the storage class and sometimes after the
>> return type, but it certainly feels more like a storage class than a
>> return type attribute:
>>
>> $ git grep 'static __init int' | wc -l
>> 349
>> $ git grep 'static int __init' | wc -l
>> 8402
>>
>> But it's clearly positioned like a return type attribute in most of the
>> tree. What's correct?
>>
>> Regardless, given the constraints above, it seems like what Linus may
>> want is (on "one line", though it will get wrapped in pathological cases
>> like kmem_cache_alloc_node_trace):
>>
>> [storage class] [storage class attributes] [return type] [return type attributes] [function argument attributes] [name]([arg1type] [arg1name], ...) [function attributes]
>>
>> Joe appears to want (on two lines):
>>
>> [storage class attributes] [function attributes] [function argument attributes]
>> [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>>
>> I would just like to have an arrangement that won't get NAKed by
>> someone. ;) And I'm willing to document it. :)
>
> Attributes should be on their own line, they can be quite lengthy.
>
> __attribute__((...))
> [static] [inline] T f(A1 arg1, ...)
> {
> ...
> }
>
> There will be even more attributes in the future, both added by
> compilers and developers (const, pure, WUR), so let's make "prototype lane"
> for them.
>
> Same for structures:
>
> __attribute__((packed))
> struct S {
> };
>
> Kernel practice of hiding attributes under defines (__ro_after_init)
> breaks ctags which parses the last identifier before semicolon as object
> name. Naturally, it is ctags bug, but placing attributes before
> declaration will autmatically unbreak such cases.

git grep seems to suggest __packed is preferred over
__attribute__((packed)), and at the end of the struct declaration
instead of at front:

struct S {
/* ... */
} __packed;

And GNU Global handles this just fine. ;)


BR,
Jani.

--
Jani Nikula, Intel Open Source Graphics Center

2021-09-22 21:17:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: function prototype element ordering

On Wed, Sep 22, 2021 at 12:24 AM Alexey Dobriyan <[email protected]> wrote:
>
> Attributes should be on their own line, they can be quite lengthy.

No, no no. They really shouldn't.

First off, no normal code should use that "__attribute__(())" syntax
anyway. It's ugly and big, and many of the attributes are
compiler-specific anyway.

So the "quite lengthy" argument is bogus, because the actual names you
should use are things like "__packed" or "__pure" or "__user" etc.

But the "on their own line" is complete garbage to begin with. That
will NEVER be a kernel rule. We should never have a rule that assumes
things are so long that they need to be on multiple lines.

We don't put function return types on their own lines either, even if
some other projects have that rule (just to get function names at the
beginning of lines or some other odd reason).

So no, attributes do not go on their own lines, and they also
generally don't go before the thing they describe. Your examples are
wrong, and explicitly against kernel rules.

Linus

2021-09-23 05:18:01

by Joe Perches

[permalink] [raw]
Subject: Re: function prototype element ordering

On Wed, 2021-09-22 at 14:15 -0700, Linus Torvalds wrote:
> On Wed, Sep 22, 2021 at 12:24 AM Alexey Dobriyan <[email protected]> wrote:
> >
> > Attributes should be on their own line, they can be quite lengthy.
>
> No, no no. They really shouldn't.
>
> First off, no normal code should use that "__attribute__(())" syntax
> anyway. It's ugly and big, and many of the attributes are
> compiler-specific anyway.
>
> So the "quite lengthy" argument is bogus, because the actual names you
> should use are things like "__packed" or "__pure" or "__user" etc.
>
> But the "on their own line" is complete garbage to begin with. That
> will NEVER be a kernel rule. We should never have a rule that assumes
> things are so long that they need to be on multiple lines.

I think it's not so much that lines are long, it's more that the
information provided by these markings aren't particularly useful to
a caller/user of a function.

Under what circumstance is a marking like __pure/__cold or __section
useful to someone that just wants to call a particular function?

A secondary reason why these should be separate or at least put
at the begining of a function declaration is compatibility with
existing tools like ctags.


2021-09-25 06:21:53

by Kees Cook

[permalink] [raw]
Subject: Re: function prototype element ordering

On Tue, Sep 21, 2021 at 09:24:04PM -0700, Joe Perches wrote:
> On Tue, 2021-09-21 at 19:25 -0700, Kees Cook wrote:
> > [...]
> > Looking through what was written before[1] and through examples in the
> > source tree, I find the following categories:
> >
> > 1- storage class: static extern inline __always_inline
> > 2- storage class attributes/hints/???: __init __cold
> > 3- return type: void *
> > 4- return type attributes: __must_check __noreturn __assume_aligned(n)
> > 5- function attributes: __attribute_const__ __malloc
> > 6- function argument attributes: __printf(n, m) __alloc_size(n)
> >
> > Everyone seems to basically agree on:
> >
> > [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
> >
> > There is a lot of disagreement over where 5 and 6 should fit in above. And
> > there is a lot of confusion over 4 (mixed between before and after the
> > function name) and 2 (see below).
> >
> > What's currently blocking me is that 6 cannot go after the function
> > (for definitions) because it angers GCC (see quoted bit above), but 5
> > can (e.g. __attribute_const__).
> >
> > Another inconsistency seems to be 2 (mainly section markings like
> > __init). Sometimes it's after the storage class and sometimes after the
> > return type, but it certainly feels more like a storage class than a
> > return type attribute:
> >
> > $ git grep 'static __init int' | wc -l
> > 349
> > $ git grep 'static int __init' | wc -l
> > 8402
> >
> > But it's clearly positioned like a return type attribute in most of the
> > tree. What's correct?
>
> Neither really. 'Correct' is such a difficult concept.
> 'Preferred' might be better.

Right -- I expect it to be a guideline.

> btw: there are about another 100 other uses with '__init' as the
> initial attribute, mostly in trace.

Hah, yeah.

> And I still think that return type attributes like __init, which is
> just a __section define, should go before the function storage class
> and ideally on a separate line to simplify the parsing of the actual
> function declaration. Attributes like __section, __aligned, __cold,
> etc... don't have much value when looking up a function definition.
>
> > Regardless, given the constraints above, it seems like what Linus may
> > want is (on "one line", though it will get wrapped in pathological cases
> > like kmem_cache_alloc_node_trace):
>
> Pathological is pretty common these days as the function name length
> is rather longer now than earlier times.

Agreed!

> > [storage class] [storage class attributes] [return type] [return type attributes] [function argument attributes] [name]([arg1type] [arg1name], ...) [function attributes]
> >
> > Joe appears to want (on two lines):
> >
> > [storage class attributes] [function attributes] [function argument attributes]
> > [storage class] [return type] [return type attributes] [name]([arg1type] [arg1name], ...)
>
> I would put [return type attributes] on the initial separate line
> even though that's not the most common use today.

I found a few other people wanting separate lines too, so at the risk of
annoying Linus, I guess I'll attempt this (again).

> > I would just like to have an arrangement that won't get NAKed by
> > someone. ;)
>
> Bikeshed building dreamer...

I just want to know the right place to put stuff. :P

> But IMO the desire here is to ask for a bit more uniformity, not
> require it.

Yeah.

--
Kees Cook

2021-09-25 19:43:24

by David Laight

[permalink] [raw]
Subject: RE: function prototype element ordering

From: Linus Torvalds
> Sent: 22 September 2021 22:16
...
> We don't put function return types on their own lines either, even if
> some other projects have that rule (just to get function names at the
> beginning of lines or some other odd reason).

If the function name starts at the beginning of a line it is
much easier to grep for the definition.
Trying to find function definitions in the Linux kernel tree
is a PITA - unless they are exported when 'EXPORT.*(function_name)'
will tend to work.

Trying to compile:
static int x(int y) __attribute__((section("x"))) { return y;}
with gcc generates "error: attributes are not allowed on a function-definition".

Putting the attribute anywhere before the function name works fine.
gcc probably accepts:
__inline static __inline int __inline x(void) {return 0;}

So any of those locations is plausible.
But after the arguments isn't allowed.
So an (extern) function declaration probably should not put them
there - if only for consistency.

I think I'd go for 'first' - optionally on their own line.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2021-09-26 21:13:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: function prototype element ordering

On Sat, Sep 25, 2021 at 12:40 PM David Laight <[email protected]> wrote:
>
> If the function name starts at the beginning of a line it is
> much easier to grep for the definition.

That has always been a completely bogus argument. I grep to look up
the type as often as I grep for the function definition, plus it's not
at all unlikely that the "function" is actually a macro wrapper, so
grepping for the beginning of line is just completely wrong.

It's completely wrong for another reason too: it assumes a style of
programming that has never actually been all that common. It's a very
specific pattern to very specific projects, and anybody who learnt
that pattern for their project is going to be completely lost anywhere
else. So don't do it. It's just a bad idea.

So a broken "easier to grep for" is not an excuse for "make the code
harder to read" particularly when it just makes another type of
grepping harder, and it's not actually nearly universal enough to
actually be a useful pattern in the first place.

It's not only never been the pattern in the kernel, but it's generally
not been the pattern anywhere else either. It's literally one of the
broken GNU coding standards - and the fact that almost every other
part of the GNU coding standards were wrong (indentation, placement of
braces, you name it) should give you a hint about how good _that_ one
was.

Here's an exercise for you: go search for C coding examples on the
web, and see how many of them do

int main(int argc, char **argv)

vs how many of them do

int
main(int argc, char **argv)

and then realize that in order for the "grep for ^main" pattern to be
useful, the second version has to not just be more common, it has to
be practically *universal*.

Hint: it isn't even remotely more common, much less universal. In
Debian code search, I had to go to the third page to find any example
at all of people putting the "int" and the "main" on different lines,
and even that one didn't place the "main()" at the beginning of the
line - they had been separated because of other reasons and looked
like this:

int
#ifdef _WIN32
__cdecl
#endif // _WIN32
main(int argc, char** argv)

instead.

Maybe Dbian code search isn't the place to go, but I think it proves
my case: the "function name at beginning of line" story is pure
make-believe, and has absolutely no relevance in the real world.

It's a bad straightjacket. Just get over it, and stop perpetuating the
idiotic myth.

If you care so much about grepping for function declarations, and you
use that old-fashioned GNU coding standard policy as an argument, just
be *properly* old-fashioned instead, and use etags or something.

Don't make the rest of us suffer.

Because I grep for functions all the time, and I'd rather have useful
output - which very much includes the type of the function. That's
often one reason _why_ I grep for things in the first place.

Other grep tricks for when the function really is used everywhere, and
you are having trouble finding the definition vs the use:

- grep in the headers for the type, and actually use the type (either
of the function, or the first argument) as part of the pattern.

If you really have no idea where it might be, you'll want to start
off with the header grep anyway, to find the macro case (or the inline
case)

Yeah, splitting the declaration will screw the type information up.
So don't do that, then.

- if it's so widely used that you find it all over, it's probably
exported. grep for 'EXPORT.*fnname' to see where it is defined.

We used to (brokenly) export things separately from the definition.
If you find cases of that, let's fix them.

Of course, usually I know roughly where it is defined, so I just limit
the pathnames for 'git grep'.

But the 'add the type of the return value or first argument to the
pattern' is often my second go-to (particularly for the cases where
you might be looking for _multiple_ definitions because it's some
architecture-specific thing, or you have some partial pattern because
every filesystem does their own thing).

Other 'git grep' patterns that often work for kernel sources:

- looking for a structure declaration? Use

git grep 'struct name {'

which mostly works, but obviously depends on coding style so it's
not guaranteed. Good first thing to try, though.

- use

git grep '\t\.name\>.*='

to find things like particular inode operations.

That second case is because we have almost universally converted our
filesystem operation initializers to use that named format (and really
strive to have a policy of constant structures of function pointers
only), and it's really convenient if you are doing VFS changes and
need to find all the places that use a particular VFS interface (eg
".get_acl" or similar).

It used to be a nightmare to find those things back when most of our
initializers were using the traditional unnamed ordered structure
initializers, so this is one area where we've introduced coding style
policies to make it really easy to grep for things (but also much
easier to add new fields and not have to add pointless NULL
initializer elements, of course).

Linus

2021-09-27 08:25:47

by David Laight

[permalink] [raw]
Subject: RE: function prototype element ordering

From: Linus Torvalds
> Sent: 26 September 2021 22:04
>
> On Sat, Sep 25, 2021 at 12:40 PM David Laight <[email protected]> wrote:
> >
> > If the function name starts at the beginning of a line it is
> > much easier to grep for the definition.
>
> That has always been a completely bogus argument. I grep to look up
> the type as often as I grep for the function definition, plus it's not
> at all unlikely that the "function" is actually a macro wrapper, so
> grepping for the beginning of line is just completely wrong.
>
> It's completely wrong for another reason too: it assumes a style of
> programming that has never actually been all that common. It's a very
> specific pattern to very specific projects, and anybody who learnt
> that pattern for their project is going to be completely lost anywhere
> else. So don't do it. It's just a bad idea.
>
> So a broken "easier to grep for" is not an excuse for "make the code
> harder to read" particularly when it just makes another type of
> grepping harder, and it's not actually nearly universal enough to
> actually be a useful pattern in the first place.
>
> It's not only never been the pattern in the kernel, but it's generally
> not been the pattern anywhere else either. It's literally one of the
> broken GNU coding standards - and the fact that almost every other
> part of the GNU coding standards were wrong (indentation, placement of
> braces, you name it) should give you a hint about how good _that_ one
> was.
>
> Here's an exercise for you: go search for C coding examples on the
> web, and see how many of them do
>
> int main(int argc, char **argv)
>
> vs how many of them do
>
> int
> main(int argc, char **argv)

It makes a bigger difference with:

struct frobulate *find_frobulate(args)
which is going to need a line break somewhere.
Especially with the (strange) rule about aligning the continued
arguments with the (.

But I didn't expect such a long response :-)

I'm sure the netBSD tree (mostly) puts the function name in column 1.
But after that uses the K&R location for {} (as does Linux).

It true that a lot of 'coding standards' are horrid.
Putting '} else {' on one line is important when reading code.
Especially if the '}' would be at the bottom of the screen,
or worse still turning the page on a fan-fold paper listing to find
a floating 'else' = with no idea which 'if' it goes with.

The modern example of why { and } shouldn't be on their own lines is:
...
}
while (...........................
{
...
Is that a loop bottom followed by a code block or
a conditional followed by a loop?

But none of this is related to the location of attributes unless
you need to split long lines and put the attribute before the
function name where you may need.

static struct frobulate *
__inline ....
find_frobulate(....)

Especially if you need #if around the attributes.

David


David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2021-09-27 09:25:09

by Willy Tarreau

[permalink] [raw]
Subject: Re: function prototype element ordering

On Mon, Sep 27, 2021 at 08:21:24AM +0000, David Laight wrote:
> Putting '} else {' on one line is important when reading code.

I used not to like that due to "else if ()" being less readable and less
easy to spot, but the arguments you gave regarding the end of screen are
valid and are similar to my hate of GNU's broken "while ()" on its own
line especially after a "do { }" block where it immediately looks like
an accidental infinite loop.

However:

> But none of this is related to the location of attributes unless
> you need to split long lines and put the attribute before the
> function name where you may need.
>
> static struct frobulate *
> __inline ....
> find_frobulate(....)

This is exactly the case where I hate to dig into code looking like
that: you build, it fails to find symbol "find_frobulate()", you run
"git grep -w find_frobulate" to figure what file provides it, or even
"grep ^find_frobulate" if you want. And you find it in frobulate.c. You
double-check, you find that frobulate.o was built and linked into your
executable. Despite this it fails to find the symbol. Finally you open
the file to discover this painful "static" two lines above, which made
you waste 3 minutes of your time digging at the wrong place.

*Just* for this reason I'm much more careful to always put the type and
name on the same line nowadays.

> Especially if you need #if around the attributes.

This is the only exception I still have to the rule above. But #if by
definition require multi-line processing anyway and they're not welcome
in the middle of control flows.

Willy