The select() implementation is carefully tuned to put a sensible amount
of data on the stack for holding a copy of the user space fd_set,
but not too large to risk overflowing the kernel stack.
When building a 32-bit kernel with clang, we need a little more space
than with gcc, which often triggers a warning:
fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
I experimentally found that for 32-bit ARM, reducing the maximum
stack usage by 64 bytes keeps us reliably under the warning limit
again.
Signed-off-by: Arnd Bergmann <[email protected]>
---
include/linux/poll.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/linux/poll.h b/include/linux/poll.h
index 7e0fdcf905d2..1cdc32b1f1b0 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -16,7 +16,11 @@
extern struct ctl_table epoll_table[]; /* for sysctl */
/* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
additional memory. */
+#ifdef __clang__
+#define MAX_STACK_ALLOC 768
+#else
#define MAX_STACK_ALLOC 832
+#endif
#define FRONTEND_STACK_ALLOC 256
#define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
#define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC
--
2.20.0
On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
> The select() implementation is carefully tuned to put a sensible amount
> of data on the stack for holding a copy of the user space fd_set,
> but not too large to risk overflowing the kernel stack.
>
> When building a 32-bit kernel with clang, we need a little more space
> than with gcc, which often triggers a warning:
>
> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>
> I experimentally found that for 32-bit ARM, reducing the maximum
> stack usage by 64 bytes keeps us reliably under the warning limit
> again.
Could just use 768 bytes unconditionally. I doubt a few bytes more or less
will make too much difference.
Other than that
Reviewed-by: Andi Kleen <[email protected]>
-Andi
On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
> The select() implementation is carefully tuned to put a sensible amount
> of data on the stack for holding a copy of the user space fd_set,
> but not too large to risk overflowing the kernel stack.
>
> When building a 32-bit kernel with clang, we need a little more space
> than with gcc, which often triggers a warning:
>
> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>
> I experimentally found that for 32-bit ARM, reducing the maximum
> stack usage by 64 bytes keeps us reliably under the warning limit
> again.
>
> Signed-off-by: Arnd Bergmann <[email protected]>
> ---
> include/linux/poll.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/poll.h b/include/linux/poll.h
> index 7e0fdcf905d2..1cdc32b1f1b0 100644
> --- a/include/linux/poll.h
> +++ b/include/linux/poll.h
> @@ -16,7 +16,11 @@
> extern struct ctl_table epoll_table[]; /* for sysctl */
> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
> additional memory. */
> +#ifdef __clang__
> +#define MAX_STACK_ALLOC 768
Hi Arnd,
Upon a toolchain upgrade for Android, our 32b x86 image used for
first-party developer VMs started tripping -Wframe-larger-than= again
(thanks -Werror) which is blocking our ability to upgrade our toolchain.
I've attached the zstd compressed .config file that reproduces with ToT
LLVM:
$ cd linux
$ zstd -d path/to/config.zst -o .config
$ make ARCH=i386 LLVM=1 -j128 fs/select.o
fs/select.c:625:5: error: stack frame size (1028) exceeds limit (1024)
in 'core_sys_select' [-Werror,-Wframe-larger-than]
int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
^
As you can see, we're just barely tipping over the limit. Should I send
a patch to reduce this again? If so, any thoughts by how much?
Decrementing the current value by 4 builds the config in question, but
seems brittle.
Do we need to only do this if !CONFIG_64BIT?
commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
seems to allude to this being more problematic on 32b targets?
> +#else
> #define MAX_STACK_ALLOC 832
> +#endif
> #define FRONTEND_STACK_ALLOC 256
> #define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
> #define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC
> --
> 2.20.0
>
>
On Fri, Oct 7, 2022, at 12:21 AM, Nick Desaulniers wrote:
> On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
>> The select() implementation is carefully tuned to put a sensible amount
>> of data on the stack for holding a copy of the user space fd_set,
>> but not too large to risk overflowing the kernel stack.
>>
>> When building a 32-bit kernel with clang, we need a little more space
>> than with gcc, which often triggers a warning:
>>
>> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
>> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>>
>> I experimentally found that for 32-bit ARM, reducing the maximum
>> stack usage by 64 bytes keeps us reliably under the warning limit
>> again.
>>
>> Signed-off-by: Arnd Bergmann <[email protected]>
>> ---
>> include/linux/poll.h | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/include/linux/poll.h b/include/linux/poll.h
>> index 7e0fdcf905d2..1cdc32b1f1b0 100644
>> --- a/include/linux/poll.h
>> +++ b/include/linux/poll.h
>> @@ -16,7 +16,11 @@
>> extern struct ctl_table epoll_table[]; /* for sysctl */
>> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
>> additional memory. */
>> +#ifdef __clang__
>> +#define MAX_STACK_ALLOC 768
>
> Hi Arnd,
> Upon a toolchain upgrade for Android, our 32b x86 image used for
> first-party developer VMs started tripping -Wframe-larger-than= again
> (thanks -Werror) which is blocking our ability to upgrade our toolchain.
>
> I've attached the zstd compressed .config file that reproduces with ToT
> LLVM:
>
> $ cd linux
> $ zstd -d path/to/config.zst -o .config
> $ make ARCH=i386 LLVM=1 -j128 fs/select.o
> fs/select.c:625:5: error: stack frame size (1028) exceeds limit (1024)
> in 'core_sys_select' [-Werror,-Wframe-larger-than]
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> ^
>
> As you can see, we're just barely tipping over the limit. Should I send
> a patch to reduce this again? If so, any thoughts by how much?
> Decrementing the current value by 4 builds the config in question, but
> seems brittle.
>
> Do we need to only do this if !CONFIG_64BIT?
> commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> seems to allude to this being more problematic on 32b targets?
I think we should keep the limit consistent between 32 bit and 64 bit
kernels. Lowering the allocation a bit more would of course have a
performance impact for users that are just below the current limit,
so I think it would be best to first look at what might be going
wrong in the compiler.
I managed to reproduce the issue and had a look at what happens
here. A few random observations:
- the kernel is built with -fsanitize=local-bounds, dropping this
option reduces the stack allocation for this function by around
100 bytes, which would be the easiest change for you to build
those kernels again without any source changes, but it may also
be possible to change the core_sys_select function in a way that
avoids the insertion of runtime bounds checks.
- If I mark 'do_select' as noinline_for_stack, the reported frame
size is decreased a lot and is suddenly independent of
-fsanitize=local-bounds:
fs/select.c:625:5: error: stack frame size (336) exceeds limit (100) in 'core_sys_select' [-Werror,-Wframe-larger-than]
int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fs/select.c:479:21: error: stack frame size (684) exceeds limit (100) in 'do_select' [-Werror,-Wframe-larger-than]
static noinline int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
However, I don't even see how this makes sense at all, given that
the actual frame size should be at least SELECT_STACK_ALLOC!
- The behavior of -ftrivial-auto-var-init= is a bit odd here: with =zero or
=pattern, the stack usage is just below the limit (1020), without the
option it is up to 1044. It looks like your .config picks =zero, which
was dropped in the latest clang version, so it falls back to not
initializing. Setting it to =pattern should give you the old
behavior, but I don't understand why clang uses more stack without
the initialization, rather than using less, as it would likely cause
fewer spills
Arnd
+ Kees, Paul
On Fri, Oct 7, 2022 at 1:28 AM Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Oct 7, 2022, at 12:21 AM, Nick Desaulniers wrote:
> > On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
> >> The select() implementation is carefully tuned to put a sensible amount
> >> of data on the stack for holding a copy of the user space fd_set,
> >> but not too large to risk overflowing the kernel stack.
> >>
> >> When building a 32-bit kernel with clang, we need a little more space
> >> than with gcc, which often triggers a warning:
> >>
> >> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
> >> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> >>
> >> I experimentally found that for 32-bit ARM, reducing the maximum
> >> stack usage by 64 bytes keeps us reliably under the warning limit
> >> again.
> >>
> >> Signed-off-by: Arnd Bergmann <[email protected]>
> >> ---
> >> include/linux/poll.h | 4 ++++
> >> 1 file changed, 4 insertions(+)
> >>
> >> diff --git a/include/linux/poll.h b/include/linux/poll.h
> >> index 7e0fdcf905d2..1cdc32b1f1b0 100644
> >> --- a/include/linux/poll.h
> >> +++ b/include/linux/poll.h
> >> @@ -16,7 +16,11 @@
> >> extern struct ctl_table epoll_table[]; /* for sysctl */
> >> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
> >> additional memory. */
> >> +#ifdef __clang__
> >> +#define MAX_STACK_ALLOC 768
> >
> > Hi Arnd,
> > Upon a toolchain upgrade for Android, our 32b x86 image used for
> > first-party developer VMs started tripping -Wframe-larger-than= again
> > (thanks -Werror) which is blocking our ability to upgrade our toolchain.
> >
> > I've attached the zstd compressed .config file that reproduces with ToT
> > LLVM:
> >
> > $ cd linux
> > $ zstd -d path/to/config.zst -o .config
> > $ make ARCH=i386 LLVM=1 -j128 fs/select.o
> > fs/select.c:625:5: error: stack frame size (1028) exceeds limit (1024)
> > in 'core_sys_select' [-Werror,-Wframe-larger-than]
> > int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> > ^
> >
> > As you can see, we're just barely tipping over the limit. Should I send
> > a patch to reduce this again? If so, any thoughts by how much?
> > Decrementing the current value by 4 builds the config in question, but
> > seems brittle.
> >
> > Do we need to only do this if !CONFIG_64BIT?
> > commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> > seems to allude to this being more problematic on 32b targets?
>
> I think we should keep the limit consistent between 32 bit and 64 bit
> kernels. Lowering the allocation a bit more would of course have a
> performance impact for users that are just below the current limit,
> so I think it would be best to first look at what might be going
> wrong in the compiler.
>
> I managed to reproduce the issue and had a look at what happens
> here. A few random observations:
>
> - the kernel is built with -fsanitize=local-bounds, dropping this
> option reduces the stack allocation for this function by around
> 100 bytes, which would be the easiest change for you to build
> those kernels again without any source changes, but it may also
> be possible to change the core_sys_select function in a way that
> avoids the insertion of runtime bounds checks.
Thanks for taking a look Arnd; ++beers_owed;
I'm not sure we'd want to disable CONFIG_UBSAN_LOCAL_BOUNDS=y for this
particular configuration of the kernel over this, or remove
-fsanitize=local-bounds for this translation unit (even if we did so
specifically for 32b targets). FWICT, the parameter n of function
core_sys_select() is used to index into the stack allocated stack_fds,
which is what -fsanitize=local-bounds is inserting runtime guards for.
If I dump the compiler IR (before register allocation), the only
explicit stack allocations I observe once the middle end optimizations
have run are:
1. a single 64b value...looks like the ktime_t passed to
poll_schedule_timeout IIUC.
2. a struct poll_wqueues inlined from do_select.
3. 64x32b array, probably stack_fds.
(oh, yeah, those are correct, if I rebuild with `KCFLAGS="-g0
-fno-discard-value-names"` the IR retains identifiers for locals. I
should send a patch for that for kbuild).
I think that implies that the final stack slots emitted are a result
of the register allocator failing to keep all temporary values live in
registers; they are spilled to the stack.
Paul has been playing with visualizing stack slots recently, and might
be able to provide more color.
I worry that the back end might do tail duplication or if conversion
and potentially split large stack values into two distinct
(non-overlapping) stack slots, but haven't seen that yet in reality.
We've also seen poor stack slot reuse with KASAN with clang as well...
>
> - If I mark 'do_select' as noinline_for_stack, the reported frame
> size is decreased a lot and is suddenly independent of
> -fsanitize=local-bounds:
> fs/select.c:625:5: error: stack frame size (336) exceeds limit (100) in 'core_sys_select' [-Werror,-Wframe-larger-than]
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> fs/select.c:479:21: error: stack frame size (684) exceeds limit (100) in 'do_select' [-Werror,-Wframe-larger-than]
> static noinline int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
I think this approach makes the most sense to me; the caller
core_sys_select() has a large stack allocation `stack_fds`, and so
does the callee do_select with `table`. Add in inlining and long live
ranges and it makes sense that stack spills are going to tip us over
the threshold set by -Wframe-larger-than.
Whether you make do_select() `noinline_for_stack` conditional on
additional configs like CC_IS_CLANG or CONFIG_UBSAN_LOCAL_BOUNDS is
perhaps also worth considering.
How would you feel about a patch that:
1. reverts commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
2. marks do_select noinline_for_stack
?
I assume the point of "small string optimization" going on with
`stack_fds` in core_sys_select() is that the potential overhead for
kmalloc is much much higher than the cost of not inlining do_select()
into core_sys_select(). The above approach does solve this .config's
instance, and seems slightly less brittle to me.
> However, I don't even see how this makes sense at all, given that
> the actual frame size should be at least SELECT_STACK_ALLOC!
I think the math checks out:
#define FRONTEND_STACK_ALLOC 256
#define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
sizeof(long) == 4; // i386 ilp32
sizeof(stack_fds) == sizeof(long) * 256 / sizeof(long) == 256
>
> - The behavior of -ftrivial-auto-var-init= is a bit odd here: with =zero or
> =pattern, the stack usage is just below the limit (1020), without the
> option it is up to 1044. It looks like your .config picks =zero, which
> was dropped in the latest clang version, so it falls back to not
Huh? What do you mean by "was dropped?"
The config I sent has:
CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
# CONFIG_INIT_STACK_NONE is not set
CONFIG_INIT_STACK_ALL_ZERO=y
Disabling INIT_STACK_ALL_ZERO from the config provided doesn't elide
the diagnostic.
Enabling INIT_STACK_ALL_PATTERN does... explicit stack allocations in
IR haven't changed. In the generated assembly we're pushing 3x 4B
GPRs, subtracting 8B from the stack pointer, then another 1008B.
(Filed: https://github.com/llvm/llvm-project/issues/58237)
So that's 3 * 4B + 8B + 1008B == 1028. But CONFIG_FRAME_WARN is set
to 1024. I wonder if this diagnostic is not as precise as it could
be, or my math is wrong?
It looks like for arrays INIT_STACK_ALL_PATTERN uses memset to fill
the array with 0xFF rather than 0x00 used by INIT_STACK_ALL_ZERO. Not
sure why that would make a difference, but curious that it does.
Looking at the delta in the (massive) IR between the two, it looks
like the second for loop preheader and body differ. That's going to
result in different choices by the register allocator. The second
loop is referencing `can_busy_loop`, `busy_flag`, and `retval1` FWICT
when looking at IR. That looks like one of the loops in do_select.
> initializing. Setting it to =pattern should give you the old
> behavior, but I don't understand why clang uses more stack without
> the initialization, rather than using less, as it would likely cause
> fewer spills
>
> Arnd
--
Thanks,
~Nick Desaulniers
Effectively a revert of
commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
Various configs can still push the stack useage of core_sys_select()
over the CONFIG_FRAME_WARN threshold (1024B on 32b targets).
fs/select.c:619:5: error: stack frame size of 1048 bytes in function
'core_sys_select' [-Werror,-Wframe-larger-than=]
core_sys_select() has a large stack allocation for `stack_fds` where it
tries to do something equivalent to "small string optimization" to
potentially avoid a malloc.
core_sys_select() calls do_select() which has another potentially large
stack allocation, `table`. Both of these values depend on
FRONTEND_STACK_ALLOC.
Mix those two large allocation with register spills which are
exacerbated by various configs and compiler versions and we can just
barely exceed the 1024B limit.
Rather than keep trying to find the right value of MAX_STACK_ALLOC or
FRONTEND_STACK_ALLOC, mark do_select() as noinline_for_stack for 32b
targets.
The intent of FRONTEND_STACK_ALLOC is to help potentially avoid a
dynamic memory allocation. In that spirit, restore the previous
threshold but separate the stack frames for 32b targets.
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: ad312f95d41c ("fs/select: avoid clang stack usage warning")
Suggested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
fs/select.c | 7 +++++++
include/linux/poll.h | 4 ----
2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/fs/select.c b/fs/select.c
index 0ee55af1a55c..945d04b9cf5a 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -476,6 +476,13 @@ static inline void wait_key_set(poll_table *wait, unsigned long in,
wait->_key |= POLLOUT_SET;
}
+#ifdef CONFIG_64BIT
+#define noinline_for_stack_32b
+#else
+#define noinline_for_stack_32b noinline_for_stack
+#endif
+
+noinline_for_stack_32b
static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
{
ktime_t expire, *to = NULL;
diff --git a/include/linux/poll.h b/include/linux/poll.h
index a9e0e1c2d1f2..d1ea4f3714a8 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -14,11 +14,7 @@
/* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
additional memory. */
-#ifdef __clang__
-#define MAX_STACK_ALLOC 768
-#else
#define MAX_STACK_ALLOC 832
-#endif
#define FRONTEND_STACK_ALLOC 256
#define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
#define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC
base-commit: 93ed07a23fd08b8613f64cf0a15d7fbdaca010fd
--
2.38.0.rc2.412.g84df46c1b4-goog
On Fri, Oct 7, 2022, at 9:04 PM, Nick Desaulniers wrote:
> On Fri, Oct 7, 2022 at 1:28 AM Arnd Bergmann <[email protected]> wrote:
>> On Fri, Oct 7, 2022, at 12:21 AM, Nick Desaulniers wrote:
>> > On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
>>
>> - the kernel is built with -fsanitize=local-bounds, dropping this
>> option reduces the stack allocation for this function by around
>> 100 bytes, which would be the easiest change for you to build
>> those kernels again without any source changes, but it may also
>> be possible to change the core_sys_select function in a way that
>> avoids the insertion of runtime bounds checks.
>
> Thanks for taking a look Arnd; ++beers_owed;
>
> I'm not sure we'd want to disable CONFIG_UBSAN_LOCAL_BOUNDS=y for this
> particular configuration of the kernel over this, or remove
> -fsanitize=local-bounds for this translation unit (even if we did so
> specifically for 32b targets). FWICT, the parameter n of function
> core_sys_select() is used to index into the stack allocated stack_fds,
> which is what -fsanitize=local-bounds is inserting runtime guards for.
Right, so what I was thinking is that the existing runtime check
'if (size > sizeof(stack_fds) / 6)' could be rewritten in a way that
clang sees the bounds correctly, as the added check would test for
the exact same limit, right? It might be too hard to figure out, or
it might have other side-effects.
>> - If I mark 'do_select' as noinline_for_stack, the reported frame
>> size is decreased a lot and is suddenly independent of
>> -fsanitize=local-bounds:
>> fs/select.c:625:5: error: stack frame size (336) exceeds limit (100) in 'core_sys_select' [-Werror,-Wframe-larger-than]
>> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>> fs/select.c:479:21: error: stack frame size (684) exceeds limit (100) in 'do_select' [-Werror,-Wframe-larger-than]
>> static noinline int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
>
> I think this approach makes the most sense to me; the caller
> core_sys_select() has a large stack allocation `stack_fds`, and so
> does the callee do_select with `table`. Add in inlining and long live
> ranges and it makes sense that stack spills are going to tip us over
> the threshold set by -Wframe-larger-than.
>
> Whether you make do_select() `noinline_for_stack` conditional on
> additional configs like CC_IS_CLANG or CONFIG_UBSAN_LOCAL_BOUNDS is
> perhaps also worth considering.
>
> How would you feel about a patch that:
> 1. reverts commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> 2. marks do_select noinline_for_stack
>
> ?
That is probably ok, but it does need proper testing to ensure that
there are no performance regressions. Do you know if gcc inlines the
function by default? If not, we probably don't need to make it
conditional.
> I assume the point of "small string optimization" going on with
> `stack_fds` in core_sys_select() is that the potential overhead for
> kmalloc is much much higher than the cost of not inlining do_select()
> into core_sys_select(). The above approach does solve this .config's
> instance, and seems slightly less brittle to me.
>
>> However, I don't even see how this makes sense at all, given that
>> the actual frame size should be at least SELECT_STACK_ALLOC!
>
> I think the math checks out:
>
> #define FRONTEND_STACK_ALLOC 256
> #define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
> long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
>
> sizeof(long) == 4; // i386 ilp32
> sizeof(stack_fds) == sizeof(long) * 256 / sizeof(long) == 256
Ah right, I misread what the code actually does here.
>> - The behavior of -ftrivial-auto-var-init= is a bit odd here: with =zero or
>> =pattern, the stack usage is just below the limit (1020), without the
>> option it is up to 1044. It looks like your .config picks =zero, which
>> was dropped in the latest clang version, so it falls back to not
>
> Huh? What do you mean by "was dropped?"
>
> The config I sent has:
> CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
> CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
> CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
> # CONFIG_INIT_STACK_NONE is not set
> CONFIG_INIT_STACK_ALL_ZERO=y
When I use this config on my kernel tree (currently on top of
next-20220929 for unrelated work) and build with clang-16,
CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO is disabled, so it falls
back from CONFIG_INIT_STACK_NONE instead of the unavailable
CONFIG_INIT_STACK_ALL_ZERO.
> Disabling INIT_STACK_ALL_ZERO from the config provided doesn't elide
> the diagnostic.
> Enabling INIT_STACK_ALL_PATTERN does... explicit stack allocations in
> IR haven't changed. In the generated assembly we're pushing 3x 4B
> GPRs, subtracting 8B from the stack pointer, then another 1008B.
> (Filed: https://github.com/llvm/llvm-project/issues/58237)
> So that's 3 * 4B + 8B + 1008B == 1028. But CONFIG_FRAME_WARN is set
> to 1024. I wonder if this diagnostic is not as precise as it could
> be, or my math is wrong?
>
> It looks like for arrays INIT_STACK_ALL_PATTERN uses memset to fill
> the array with 0xFF rather than 0x00 used by INIT_STACK_ALL_ZERO. Not
> sure why that would make a difference, but curious that it does.
The warning sizes I see are:
zero: 1028
pattern: 1020
none: 1044
So you are right, even with =pattern I get the warning, even though it's
16 bytes less than the =none case I was looking at first.
Arnd
On Fri, Oct 7, 2022 at 2:43 PM Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Oct 7, 2022, at 9:04 PM, Nick Desaulniers wrote:
> > On Fri, Oct 7, 2022 at 1:28 AM Arnd Bergmann <[email protected]> wrote:
> >> On Fri, Oct 7, 2022, at 12:21 AM, Nick Desaulniers wrote:
> >> > On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
> >>
> >> - If I mark 'do_select' as noinline_for_stack, the reported frame
> >> size is decreased a lot and is suddenly independent of
> >> -fsanitize=local-bounds:
> >> fs/select.c:625:5: error: stack frame size (336) exceeds limit (100) in 'core_sys_select' [-Werror,-Wframe-larger-than]
> >> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> >> fs/select.c:479:21: error: stack frame size (684) exceeds limit (100) in 'do_select' [-Werror,-Wframe-larger-than]
> >> static noinline int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
> >
> > I think this approach makes the most sense to me; the caller
> > core_sys_select() has a large stack allocation `stack_fds`, and so
> > does the callee do_select with `table`. Add in inlining and long live
> > ranges and it makes sense that stack spills are going to tip us over
> > the threshold set by -Wframe-larger-than.
> >
> > Whether you make do_select() `noinline_for_stack` conditional on
> > additional configs like CC_IS_CLANG or CONFIG_UBSAN_LOCAL_BOUNDS is
> > perhaps also worth considering.
> >
> > How would you feel about a patch that:
> > 1. reverts commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> > 2. marks do_select noinline_for_stack
> >
> > ?
>
> That is probably ok, but it does need proper testing to ensure that
> there are no performance regressions.
Any recommendations on how to do so?
> Do you know if gcc inlines the
> function by default? If not, we probably don't need to make it
> conditional.
Ah good idea. For i386 defconfig and x86_64 defconfig, it does not!
Here's how I tested that:
$ make -j128 defconfig fs/select.o
$ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
grep do_select
This seems to be affected by -fno-conserve-stack, a currently gcc-only
command line flag. If I remove that, then i386 defconfig will inline
do_select but x86_64 defconfig will not.
I have a sneaking suspicion that -fno-conserve-stack and
-Wframe-larger-than conspire in GCC to avoid inlining when doing so
would trip `-Wframe-larger-than` warnings, but it's just a conspiracy
theory; I haven't read the source. Probably should implement exactly
that behavior in LLVM.
I'll triple check 32b+64b arm configs next week to verify. But if GCC
is not inlining do_select into core_sys_select then I think my patch
https://lore.kernel.org/llvm/[email protected]/
is on the right track; probably could drop the 32b-only condition and
make a note of GCC in the commit message.
Also, my colleague Paul just whipped up a neat tool to help debug
-Wframe-larger-than.
https://reviews.llvm.org/D135488
See the output from my run here:
https://paste.debian.net/1256338/
It's a very early WIP, but I think it would be incredibly helpful to
have this, and will probably help us improve Clang's stack usage.
--
Thanks,
~Nick Desaulniers
On Fri, Oct 07, 2022 at 11:42:51PM +0200, Arnd Bergmann wrote:
> On Fri, Oct 7, 2022, at 9:04 PM, Nick Desaulniers wrote:
> > On Fri, Oct 7, 2022 at 1:28 AM Arnd Bergmann <[email protected]> wrote:
> >> - The behavior of -ftrivial-auto-var-init= is a bit odd here: with =zero or
> >> =pattern, the stack usage is just below the limit (1020), without the
> >> option it is up to 1044. It looks like your .config picks =zero, which
> >> was dropped in the latest clang version, so it falls back to not
> >
> > Huh? What do you mean by "was dropped?"
> >
> > The config I sent has:
> > CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
> > CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_BARE=y
> > CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
> > # CONFIG_INIT_STACK_NONE is not set
> > CONFIG_INIT_STACK_ALL_ZERO=y
>
> When I use this config on my kernel tree (currently on top of
> next-20220929 for unrelated work) and build with clang-16,
> CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO is disabled, so it falls
> back from CONFIG_INIT_STACK_NONE instead of the unavailable
> CONFIG_INIT_STACK_ALL_ZERO.
I think you have a very recent Clang but are building a tree that
doesn't have commit 607e57c6c62c ("hardening: Remove Clang's enable flag
for -ftrivial-auto-var-init=zero").
--
Kees Cook
On 10/6/2022 3:21 PM, Nick Desaulniers wrote:
> On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
>> The select() implementation is carefully tuned to put a sensible amount
>> of data on the stack for holding a copy of the user space fd_set,
>> but not too large to risk overflowing the kernel stack.
>>
>> When building a 32-bit kernel with clang, we need a little more space
>> than with gcc, which often triggers a warning:
>>
>> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
>> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>>
>> I experimentally found that for 32-bit ARM, reducing the maximum
>> stack usage by 64 bytes keeps us reliably under the warning limit
>> again.
>>
>> Signed-off-by: Arnd Bergmann <[email protected]>
>> ---
>> include/linux/poll.h | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/include/linux/poll.h b/include/linux/poll.h
>> index 7e0fdcf905d2..1cdc32b1f1b0 100644
>> --- a/include/linux/poll.h
>> +++ b/include/linux/poll.h
>> @@ -16,7 +16,11 @@
>> extern struct ctl_table epoll_table[]; /* for sysctl */
>> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
>> additional memory. */
>> +#ifdef __clang__
>> +#define MAX_STACK_ALLOC 768
> Hi Arnd,
> Upon a toolchain upgrade for Android, our 32b x86 image used for
> first-party developer VMs started tripping -Wframe-larger-than= again
> (thanks -Werror) which is blocking our ability to upgrade our toolchain.
I wonder if there is a way to disable the warning or increase the
threshold just for this function. I don't think attribute optimize would
work, but perhaps some pragma?
-Andi
From: Nick Desaulniers
> Sent: 07 October 2022 20:04
...
> It looks like for arrays INIT_STACK_ALL_PATTERN uses memset to fill
> the array with 0xFF rather than 0x00 used by INIT_STACK_ALL_ZERO. Not
> sure why that would make a difference, but curious that it does.
How much performance does filling the array cost?
I'd suspect it is enough to be measurable.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Fri, Oct 07, 2022 at 01:11:40PM -0700, Nick Desaulniers wrote:
> Effectively a revert of
> commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
>
> Various configs can still push the stack useage of core_sys_select()
> over the CONFIG_FRAME_WARN threshold (1024B on 32b targets).
>
> fs/select.c:619:5: error: stack frame size of 1048 bytes in function
> 'core_sys_select' [-Werror,-Wframe-larger-than=]
Btw, I also see a warning here with all my KASAN x86_64 gcc builds.
On Mon, Oct 10, 2022 at 12:44 AM Christoph Hellwig <[email protected]> wrote:
>
> On Fri, Oct 07, 2022 at 01:11:40PM -0700, Nick Desaulniers wrote:
> > Effectively a revert of
> > commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> >
> > Various configs can still push the stack useage of core_sys_select()
> > over the CONFIG_FRAME_WARN threshold (1024B on 32b targets).
> >
> > fs/select.c:619:5: error: stack frame size of 1048 bytes in function
> > 'core_sys_select' [-Werror,-Wframe-larger-than=]
>
> Btw, I also see a warning here with all my KASAN x86_64 gcc builds.
Thanks for the report. That might be another interesting data point;
I haven't been able to reproduce that locally though:
$ make -j$(nproc) defconfig
$ ./scripts/config -e KASAN
$ make -j$(nproc) olddefconfig fs/select.o
$ gcc --version | head -n1
gcc (Debian 12.2.0-1) 12.2.0
I also tried enabling CONFIG_KASAN_INLINE=y but couldn't reproduce.
Mind sending me your .config that reproduces this?
--
Thanks,
~Nick Desaulniers
On Fri, Oct 7, 2022 at 3:54 PM Nick Desaulniers <[email protected]> wrote:
>
> This seems to be affected by -fno-conserve-stack, a currently gcc-only
> command line flag. If I remove that, then i386 defconfig will inline
> do_select but x86_64 defconfig will not.
>
> I have a sneaking suspicion that -fno-conserve-stack and
> -Wframe-larger-than conspire in GCC to avoid inlining when doing so
> would trip `-Wframe-larger-than` warnings, but it's just a conspiracy
> theory; I haven't read the source. Probably should implement exactly
> that behavior in LLVM.
Sorry, that should have read `-fconserve-stack` (not `-fno-conserve-stack`).
Playing with:
https://godbolt.org/z/hE67j1Y9G
experimentally, it looks like irrespective of -Wframe-larger-than,
-fconserve-stack will try to avoid inlining callees if their frame
size is 512B or greater for x86-64 targets, and 410B or greater for
32b targets. aarch64 is 410B though, perhaps that's leftover from the
32b ARM port. There's probably more to the story there though.
> I'll triple check 32b+64b arm configs next week to verify. But if GCC
> is not inlining do_select into core_sys_select then I think my patch
> https://lore.kernel.org/llvm/[email protected]/
> is on the right track; probably could drop the 32b-only condition and
> make a note of GCC in the commit message.
arm64 does not inline do_select into core_sys_select with
aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
for defconfig.
$ CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j128 defconfig fs/select.o
$ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
grep do_select
1a48: 2e fb ff 97 bl 0x700 <do_select>
Same for 32b ARM.
arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110
$ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm make -j128 defconfig fs/select.o
$ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
grep do_select
1620: 07 fc ff eb bl #-4068 <do_select>
Is there a set of configs or different compiler version for which
that's not the case? Perhaps. But it doesn't look like marking
do_select noinline_for_stack changes the default behavior for GCC
builds, which is good.
So it looks like it's just clang being aggressive with inlining since
it doesn't have -fconserve-stack. I think
https://lore.kernel.org/lkml/[email protected]/
is still on the right track, though I'd remove the 32b only guard for
v2.
Christophe mentioned something about KASAN and GCC. I failed to
reproduce, and didn't see any reports on lore that seemed relevant.
https://lore.kernel.org/lkml/[email protected]/
I'll wait a day to see if there's more info (a config that reproduces)
before sending a v2 though.
> Also, my colleague Paul just whipped up a neat tool to help debug
> -Wframe-larger-than.
> https://reviews.llvm.org/D135488
> See the output from my run here:
> https://paste.debian.net/1256338/
> It's a very early WIP, but I think it would be incredibly helpful to
> have this, and will probably help us improve Clang's stack usage.
Paul also mentioned that -finline-max-stacksize is a thing, at least for clang.
https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-finline-max-stacksize
Though this only landed recently
https://reviews.llvm.org/rG8564e2fea559 and wont ship until clang-16.
That feels like a large hammer for core_sys_select/do_select; I think
we can use a fine scalpel. But it might be interesting to use that
with KASAN.
--
Thanks,
~Nick Desaulniers
On Fri, Oct 7, 2022, at 10:11 PM, Nick Desaulniers wrote:
> +#ifdef CONFIG_64BIT
> +#define noinline_for_stack_32b
> +#else
> +#define noinline_for_stack_32b noinline_for_stack
> +#endif
> +
> +noinline_for_stack_32b
I don't see much value in making it behave differently for 32 bit:
it doesn't reduce the total frame size on 32-bit machines but only
hides the warning. The bug you are working around also looks i386
specific (because of limited number of registers vs ubsan needing
a lot of them), so just make it a simple 'noinline_for_stack'.
Arnd
On Mon, Oct 10, 2022, at 8:40 PM, Nick Desaulniers wrote:
> On Fri, Oct 7, 2022 at 3:54 PM Nick Desaulniers <[email protected]> wrote:
> $ CROSS_COMPILE=aarch64-linux-gnu- ARCH=arm64 make -j128 defconfig fs/select.o
> $ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
> grep do_select
> 1a48: 2e fb ff 97 bl 0x700 <do_select>
>
> Same for 32b ARM.
> arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110
>
> $ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm make -j128 defconfig fs/select.o
> $ llvm-objdump -Dr --disassemble-symbols=core_sys_select fs/select.o |
> grep do_select
> 1620: 07 fc ff eb bl #-4068 <do_select>
>
> Is there a set of configs or different compiler version for which
> that's not the case? Perhaps. But it doesn't look like marking
> do_select noinline_for_stack changes the default behavior for GCC
> builds, which is good.
I checked all arm32 defconfigs, and all supported gcc versions for arm32,
they all behave the same.
> So it looks like it's just clang being aggressive with inlining since
> it doesn't have -fconserve-stack. I think
> https://lore.kernel.org/lkml/[email protected]/
> is still on the right track, though I'd remove the 32b only guard for
> v2.
I think it's again the difference between top-down and bottom-up inlining.
> Paul also mentioned that -finline-max-stacksize is a thing, at least
> for clang.
> https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-finline-max-stacksize
> Though this only landed recently
> https://reviews.llvm.org/rG8564e2fea559 and wont ship until clang-16.
> That feels like a large hammer for core_sys_select/do_select; I think
> we can use a fine scalpel. But it might be interesting to use that
> with KASAN.
It's an interesting question whether it would help or hurt with
KASAN_STACK: Normally the idea is that KASAN_STACK intentionally
makes stack slots inside of a function non-overlapping, similar
to the use-after-scope sanitizer that we no longer use because
it caused too many stack overflows. Making it inline less should
help reduce the actual stack consumption (not just the reported
usage) because it makes called functions reuse the same stack slots,
but it also makes KASAN_STACK less effective because of the same
thing.
If -finline-max-stacksize is the equivalent of gcc's
-fconserve-stack, we could of course just always enable that.
Arnd
Effectively a revert of
commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
Various configs can still push the stack useage of core_sys_select()
over the CONFIG_FRAME_WARN threshold (1024B on 32b targets).
fs/select.c:619:5: error: stack frame size of 1048 bytes in function
'core_sys_select' [-Werror,-Wframe-larger-than=]
core_sys_select() has a large stack allocation for `stack_fds` where it
tries to do something equivalent to "small string optimization" to
potentially avoid a kmalloc.
core_sys_select() calls do_select() which has another potentially large
stack allocation, `table`. Both of these values depend on
FRONTEND_STACK_ALLOC.
Mix those two large allocation with register spills which are
exacerbated by various configs and compiler versions and we can just
barely exceed the 1024B limit.
Rather than keep trying to find the right value of MAX_STACK_ALLOC or
FRONTEND_STACK_ALLOC, mark do_select() as noinline_for_stack.
The intent of FRONTEND_STACK_ALLOC is to help potentially avoid a
dynamic memory allocation. In that spirit, restore the previous
threshold but separate the stack frames.
Many tests of various configs for different architectures and various
versions of GCC were performed; do_select() was never inlined into
core_sys_select() or compat_core_sys_select(). The kernel is built with
the GCC specific flag `-fconserve-stack` which can limit inlining
depending on per-target thresholds of callee stack size, which helps
avoid the issue when using GCC. Clang is being more aggressive and not
considering the stack size when decided whether to inline or not. We may
consider using the clang-16+ flag `-finline-max-stacksize=` in the
future.
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: ad312f95d41c ("fs/select: avoid clang stack usage warning")
Suggested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes v1 -> v2:
* Drop the 32b specific guard, since I could reproduce the no-inlining
w/ aarch64-linux-gnu-gcc-10 ARCH=arm64 defconfig, and per Arnd.
* Drop references to 32b in commit message.
* Add new paragraph in commit message at the end about -fconserve-stack
and -finline-max-stacksize=.
* s/malloc/kmalloc/ in commit message.
fs/select.c | 1 +
include/linux/poll.h | 4 ----
2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/select.c b/fs/select.c
index 0ee55af1a55c..794e2a91b1fa 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -476,6 +476,7 @@ static inline void wait_key_set(poll_table *wait, unsigned long in,
wait->_key |= POLLOUT_SET;
}
+noinline_for_stack
static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
{
ktime_t expire, *to = NULL;
diff --git a/include/linux/poll.h b/include/linux/poll.h
index a9e0e1c2d1f2..d1ea4f3714a8 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -14,11 +14,7 @@
/* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
additional memory. */
-#ifdef __clang__
-#define MAX_STACK_ALLOC 768
-#else
#define MAX_STACK_ALLOC 832
-#endif
#define FRONTEND_STACK_ALLOC 256
#define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
#define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC
--
2.38.0.rc2.412.g84df46c1b4-goog
On Fri, Oct 7, 2022 at 6:46 PM Andi Kleen <[email protected]> wrote:
>
>
> On 10/6/2022 3:21 PM, Nick Desaulniers wrote:
> > On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
> >> The select() implementation is carefully tuned to put a sensible amount
> >> of data on the stack for holding a copy of the user space fd_set,
> >> but not too large to risk overflowing the kernel stack.
> >>
> >> When building a 32-bit kernel with clang, we need a little more space
> >> than with gcc, which often triggers a warning:
> >>
> >> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
> >> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
> >>
> >> I experimentally found that for 32-bit ARM, reducing the maximum
> >> stack usage by 64 bytes keeps us reliably under the warning limit
> >> again.
> >>
> >> Signed-off-by: Arnd Bergmann <[email protected]>
> >> ---
> >> include/linux/poll.h | 4 ++++
> >> 1 file changed, 4 insertions(+)
> >>
> >> diff --git a/include/linux/poll.h b/include/linux/poll.h
> >> index 7e0fdcf905d2..1cdc32b1f1b0 100644
> >> --- a/include/linux/poll.h
> >> +++ b/include/linux/poll.h
> >> @@ -16,7 +16,11 @@
> >> extern struct ctl_table epoll_table[]; /* for sysctl */
> >> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
> >> additional memory. */
> >> +#ifdef __clang__
> >> +#define MAX_STACK_ALLOC 768
> > Hi Arnd,
> > Upon a toolchain upgrade for Android, our 32b x86 image used for
> > first-party developer VMs started tripping -Wframe-larger-than= again
> > (thanks -Werror) which is blocking our ability to upgrade our toolchain.
>
>
> I wonder if there is a way to disable the warning or increase the
> threshold just for this function. I don't think attribute optimize would
> work, but perhaps some pragma?
Here's what I would have guessed, the pragma approach seems a little broken.
https://godbolt.org/z/vY7fGYv7f
Maybe I'm holding it wrong?
>
>
> -Andi
>
>
>
--
Thanks,
~Nick Desaulniers
Hi Nick,
I love your patch! Perhaps something to improve:
[auto build test WARNING on 93ed07a23fd08b8613f64cf0a15d7fbdaca010fd]
url: https://github.com/intel-lab-lkp/linux/commits/Nick-Desaulniers/fs-select-mark-do_select-noinline_for_stack-for-32b/20221008-041539
base: 93ed07a23fd08b8613f64cf0a15d7fbdaca010fd
patch link: https://lore.kernel.org/r/20221007201140.1744961-1-ndesaulniers%40google.com
patch subject: [PATCH] fs/select: mark do_select noinline_for_stack for 32b
config: powerpc-randconfig-c003-20221012
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 791a7ae1ba3efd6bca96338e10ffde557ba83920)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/58064d22945d0fed666adc4cef463401981e2719
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Nick-Desaulniers/fs-select-mark-do_select-noinline_for_stack-for-32b/20221008-041539
git checkout 58064d22945d0fed666adc4cef463401981e2719
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
>> fs/select.c:981:12: warning: stack frame size (1056) exceeds limit (1024) in 'do_sys_poll' [-Wframe-larger-than]
static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
^
1 warning generated.
vim +/do_sys_poll +981 fs/select.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 977
70674f95c0a2ea Andi Kleen 2006-03-28 978 #define N_STACK_PPS ((sizeof(stack_pps) - sizeof(struct poll_list)) / \
70674f95c0a2ea Andi Kleen 2006-03-28 979 sizeof(struct pollfd))
70674f95c0a2ea Andi Kleen 2006-03-28 980
e99ca56ce03dd9 Al Viro 2017-04-08 @981 static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
766b9f928bd5b9 Deepa Dinamani 2016-05-19 982 struct timespec64 *end_time)
^1da177e4c3f41 Linus Torvalds 2005-04-16 983 {
^1da177e4c3f41 Linus Torvalds 2005-04-16 984 struct poll_wqueues table;
43e11fa2d1d3b6 Gustavo A. R. Silva 2019-07-16 985 int err = -EFAULT, fdcount, len;
30c14e40ed8546 Jes Sorensen 2006-03-31 986 /* Allocate small arguments on the stack to save memory and be
30c14e40ed8546 Jes Sorensen 2006-03-31 987 faster - use long to make sure the buffer is aligned properly
30c14e40ed8546 Jes Sorensen 2006-03-31 988 on 64 bit archs to avoid unaligned access */
30c14e40ed8546 Jes Sorensen 2006-03-31 989 long stack_pps[POLL_STACK_ALLOC/sizeof(long)];
252e5725cfb55a Oleg Nesterov 2007-10-16 990 struct poll_list *const head = (struct poll_list *)stack_pps;
252e5725cfb55a Oleg Nesterov 2007-10-16 991 struct poll_list *walk = head;
252e5725cfb55a Oleg Nesterov 2007-10-16 992 unsigned long todo = nfds;
^1da177e4c3f41 Linus Torvalds 2005-04-16 993
d554ed895dc8f2 Jiri Slaby 2010-03-05 994 if (nfds > rlimit(RLIMIT_NOFILE))
^1da177e4c3f41 Linus Torvalds 2005-04-16 995 return -EINVAL;
^1da177e4c3f41 Linus Torvalds 2005-04-16 996
252e5725cfb55a Oleg Nesterov 2007-10-16 997 len = min_t(unsigned int, nfds, N_STACK_PPS);
252e5725cfb55a Oleg Nesterov 2007-10-16 998 for (;;) {
252e5725cfb55a Oleg Nesterov 2007-10-16 999 walk->next = NULL;
252e5725cfb55a Oleg Nesterov 2007-10-16 1000 walk->len = len;
252e5725cfb55a Oleg Nesterov 2007-10-16 1001 if (!len)
252e5725cfb55a Oleg Nesterov 2007-10-16 1002 break;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1003
252e5725cfb55a Oleg Nesterov 2007-10-16 1004 if (copy_from_user(walk->entries, ufds + nfds-todo,
252e5725cfb55a Oleg Nesterov 2007-10-16 1005 sizeof(struct pollfd) * walk->len))
^1da177e4c3f41 Linus Torvalds 2005-04-16 1006 goto out_fds;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1007
252e5725cfb55a Oleg Nesterov 2007-10-16 1008 todo -= walk->len;
252e5725cfb55a Oleg Nesterov 2007-10-16 1009 if (!todo)
252e5725cfb55a Oleg Nesterov 2007-10-16 1010 break;
252e5725cfb55a Oleg Nesterov 2007-10-16 1011
252e5725cfb55a Oleg Nesterov 2007-10-16 1012 len = min(todo, POLLFD_PER_PAGE);
43e11fa2d1d3b6 Gustavo A. R. Silva 2019-07-16 1013 walk = walk->next = kmalloc(struct_size(walk, entries, len),
0bcfe68b876748 Linus Torvalds 2021-09-07 1014 GFP_KERNEL);
252e5725cfb55a Oleg Nesterov 2007-10-16 1015 if (!walk) {
252e5725cfb55a Oleg Nesterov 2007-10-16 1016 err = -ENOMEM;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1017 goto out_fds;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1018 }
^1da177e4c3f41 Linus Torvalds 2005-04-16 1019 }
9f72949f679df0 David Woodhouse 2006-01-18 1020
252e5725cfb55a Oleg Nesterov 2007-10-16 1021 poll_initwait(&table);
ccec5ee302d5cb Mateusz Guzik 2016-01-06 1022 fdcount = do_poll(head, &table, end_time);
252e5725cfb55a Oleg Nesterov 2007-10-16 1023 poll_freewait(&table);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1024
ef0ba05538299f Linus Torvalds 2021-01-07 1025 if (!user_write_access_begin(ufds, nfds * sizeof(*ufds)))
ef0ba05538299f Linus Torvalds 2021-01-07 1026 goto out_fds;
ef0ba05538299f Linus Torvalds 2021-01-07 1027
252e5725cfb55a Oleg Nesterov 2007-10-16 1028 for (walk = head; walk; walk = walk->next) {
^1da177e4c3f41 Linus Torvalds 2005-04-16 1029 struct pollfd *fds = walk->entries;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1030 int j;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1031
ef0ba05538299f Linus Torvalds 2021-01-07 1032 for (j = walk->len; j; fds++, ufds++, j--)
ef0ba05538299f Linus Torvalds 2021-01-07 1033 unsafe_put_user(fds->revents, &ufds->revents, Efault);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1034 }
ef0ba05538299f Linus Torvalds 2021-01-07 1035 user_write_access_end();
252e5725cfb55a Oleg Nesterov 2007-10-16 1036
^1da177e4c3f41 Linus Torvalds 2005-04-16 1037 err = fdcount;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1038 out_fds:
252e5725cfb55a Oleg Nesterov 2007-10-16 1039 walk = head->next;
252e5725cfb55a Oleg Nesterov 2007-10-16 1040 while (walk) {
252e5725cfb55a Oleg Nesterov 2007-10-16 1041 struct poll_list *pos = walk;
252e5725cfb55a Oleg Nesterov 2007-10-16 1042 walk = walk->next;
252e5725cfb55a Oleg Nesterov 2007-10-16 1043 kfree(pos);
^1da177e4c3f41 Linus Torvalds 2005-04-16 1044 }
252e5725cfb55a Oleg Nesterov 2007-10-16 1045
^1da177e4c3f41 Linus Torvalds 2005-04-16 1046 return err;
ef0ba05538299f Linus Torvalds 2021-01-07 1047
ef0ba05538299f Linus Torvalds 2021-01-07 1048 Efault:
ef0ba05538299f Linus Torvalds 2021-01-07 1049 user_write_access_end();
ef0ba05538299f Linus Torvalds 2021-01-07 1050 err = -EFAULT;
ef0ba05538299f Linus Torvalds 2021-01-07 1051 goto out_fds;
^1da177e4c3f41 Linus Torvalds 2005-04-16 1052 }
9f72949f679df0 David Woodhouse 2006-01-18 1053
--
0-DAY CI Kernel Test Service
https://01.org/lkp
>> I wonder if there is a way to disable the warning or increase the
>> threshold just for this function. I don't think attribute optimize would
>> work, but perhaps some pragma?
> Here's what I would have guessed, the pragma approach seems a little broken.
> https://godbolt.org/z/vY7fGYv7f
> Maybe I'm holding it wrong?
Thanks. Looks like the warning setting is not propagated to the point
where the warning is generated in the backend. Could file a gcc bug on
that, but yes the solution won't work for now.
-Andi
On Tue, Oct 11, 2022 at 01:55:47PM -0700, Nick Desaulniers wrote:
> Effectively a revert of
> commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
>
> Various configs can still push the stack useage of core_sys_select()
> over the CONFIG_FRAME_WARN threshold (1024B on 32b targets).
>
> fs/select.c:619:5: error: stack frame size of 1048 bytes in function
> 'core_sys_select' [-Werror,-Wframe-larger-than=]
>
> core_sys_select() has a large stack allocation for `stack_fds` where it
> tries to do something equivalent to "small string optimization" to
> potentially avoid a kmalloc.
>
> core_sys_select() calls do_select() which has another potentially large
> stack allocation, `table`. Both of these values depend on
> FRONTEND_STACK_ALLOC.
>
> Mix those two large allocation with register spills which are
> exacerbated by various configs and compiler versions and we can just
> barely exceed the 1024B limit.
>
> Rather than keep trying to find the right value of MAX_STACK_ALLOC or
> FRONTEND_STACK_ALLOC, mark do_select() as noinline_for_stack.
>
> The intent of FRONTEND_STACK_ALLOC is to help potentially avoid a
> dynamic memory allocation. In that spirit, restore the previous
> threshold but separate the stack frames.
>
> Many tests of various configs for different architectures and various
> versions of GCC were performed; do_select() was never inlined into
> core_sys_select() or compat_core_sys_select(). The kernel is built with
> the GCC specific flag `-fconserve-stack` which can limit inlining
> depending on per-target thresholds of callee stack size, which helps
> avoid the issue when using GCC. Clang is being more aggressive and not
> considering the stack size when decided whether to inline or not. We may
> consider using the clang-16+ flag `-finline-max-stacksize=` in the
> future.
>
> Link: https://lore.kernel.org/lkml/[email protected]/
> Fixes: ad312f95d41c ("fs/select: avoid clang stack usage warning")
> Suggested-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
> ---
> Changes v1 -> v2:
> * Drop the 32b specific guard, since I could reproduce the no-inlining
> w/ aarch64-linux-gnu-gcc-10 ARCH=arm64 defconfig, and per Arnd.
> * Drop references to 32b in commit message.
> * Add new paragraph in commit message at the end about -fconserve-stack
> and -finline-max-stacksize=.
> * s/malloc/kmalloc/ in commit message.
>
> fs/select.c | 1 +
> include/linux/poll.h | 4 ----
> 2 files changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/select.c b/fs/select.c
> index 0ee55af1a55c..794e2a91b1fa 100644
> --- a/fs/select.c
> +++ b/fs/select.c
> @@ -476,6 +476,7 @@ static inline void wait_key_set(poll_table *wait, unsigned long in,
> wait->_key |= POLLOUT_SET;
> }
>
> +noinline_for_stack
> static int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
> {
> ktime_t expire, *to = NULL;
> diff --git a/include/linux/poll.h b/include/linux/poll.h
> index a9e0e1c2d1f2..d1ea4f3714a8 100644
> --- a/include/linux/poll.h
> +++ b/include/linux/poll.h
> @@ -14,11 +14,7 @@
>
> /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
> additional memory. */
> -#ifdef __clang__
> -#define MAX_STACK_ALLOC 768
> -#else
> #define MAX_STACK_ALLOC 832
> -#endif
> #define FRONTEND_STACK_ALLOC 256
> #define SELECT_STACK_ALLOC FRONTEND_STACK_ALLOC
> #define POLL_STACK_ALLOC FRONTEND_STACK_ALLOC
> --
> 2.38.0.rc2.412.g84df46c1b4-goog
>
>