2018-07-23 09:20:43

by Mukesh Ojha

Subject: Re: Issue related cpuhotplug failure path on 4.9.x version

Adding stable and lkml.

Sorry for spamming the others.

-Mukesh

On 7/23/2018 1:57 PM, Mukesh Ojha wrote:
>
> Hi All,
>
> I wanted to discuss a corner case that exists in the 4.9 kernel
> (4.9.x): the hotplug of a CPU fails because of a failure in one of
> the callbacks that is called after "notify:online" (notify_online
> creates the sysfs nodes for the CPU being hotplugged).
>
> During cleanup, notify_dead() does not get called, because step
> <https://elixir.bootlin.com/linux/v4.9/ident/step>->skip_onerr is set
> to true for "notify:prepare". As a result, the sysfs nodes of that CPU
> do not get cleaned up, which can cause issues in the next hotplug
> attempt of that CPU.
>
> cpuhp_up_callbacks
> <https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_up_callbacks>
>   => cpuhp_invoke_callback (fails)
>      <https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_invoke_callback>
>   => undo_cpu_up <https://elixir.bootlin.com/linux/v4.9/ident/undo_cpu_up>
>
> .name = "notify:prepare",
> .teardown.single = notify_dead <https://elixir.bootlin.com/linux/v4.9/ident/notify_dead>,
> .skip_onerr = true,
>
> I think a possible solution here is to remove
> -            .skip_onerr = true,
>
> from "notify:prepare" so that the CPU_DEAD notification gets sent.
>
> Please feel free to point out any side effects of this change; I do
> not see any.
>
> Ref:
>
> https://elixir.bootlin.com/linux/v4.9/source/kernel/cpu.c#L458
>
> Cheers,
> Mukesh
>
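
The rollback behaviour described in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the step table is reduced to three entries, the failing step "later:callback" and the helper failed_bringup_leaves_nodes() are hypothetical, and the CPU's sysfs nodes are represented by a single flag that notify_online sets and notify_dead clears, following the description in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model of a hotplug step (not the kernel struct). */
struct step {
	const char *name;
	int (*startup)(void);
	void (*teardown)(void);
	bool skip_onerr;	/* if true, rollback skips this step's teardown */
};

/* Stands in for the sysfs nodes of the CPU being brought up. */
static bool sysfs_nodes_present;

static int notify_prepare(void) { return 0; }
static void notify_dead(void) { sysfs_nodes_present = false; }
static int notify_online(void) { sysfs_nodes_present = true; return 0; }
static void notify_down_prepare(void) { }
/* Hypothetical later callback that always fails. */
static int failing_callback(void) { return -1; }

/*
 * Run a bring-up that fails in the last step, then roll back.
 * Returns true if the modelled sysfs nodes are left behind.
 */
bool failed_bringup_leaves_nodes(bool prepare_skip_onerr)
{
	struct step steps[] = {
		{ "notify:prepare", notify_prepare, notify_dead, prepare_skip_onerr },
		{ "notify:online", notify_online, notify_down_prepare, true },
		{ "later:callback", failing_callback, NULL, false },
	};
	size_t n = sizeof(steps) / sizeof(steps[0]);
	size_t i;

	sysfs_nodes_present = false;

	/* Bring-up: run startup callbacks in order until one fails. */
	for (i = 0; i < n; i++) {
		if (steps[i].startup() != 0)
			break;
	}
	assert(i < n);	/* the last step always fails in this model */

	/* Rollback: undo the completed steps in reverse order. */
	while (i-- > 0) {
		if (!steps[i].skip_onerr && steps[i].teardown)
			steps[i].teardown();
	}

	return sysfs_nodes_present;
}
```

In this model, failed_bringup_leaves_nodes(true) reports that the nodes are left behind, while failed_bringup_leaves_nodes(false), which corresponds to the proposed removal of skip_onerr from "notify:prepare", reports that they are cleaned up. Whether the change has other side effects in the real kernel is not something this sketch can show.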


2018-07-23 09:41:28

by Greg Kroah-Hartman

Subject: Re: Issue related cpuhotplug failure path on 4.9.x version

On Mon, Jul 23, 2018 at 02:49:27PM +0530, Mukesh Ojha wrote:
> On 7/23/2018 1:57 PM, Mukesh Ojha wrote:
> >
> > Hi All,
> >
> > I wanted to discuss a corner case that exists in the 4.9 kernel
> > (4.9.x) where

4.9 is over 1 1/2 years old now. Please test this on 4.17 or better
yet, 4.18-rc as the cpu hotplug path has been radically cleaned up and
changed since 4.9 was released.

If you are stuck with 4.9 for some odd reason, sorry, but you really are
on your own. I recommend going and asking for support from the people
that are forcing you to stick with that kernel.

good luck!

greg k-h