2022-09-30 00:29:39

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Fri, Sep 30, 2022 at 01:46:46AM +0300, Serge Semin wrote:
> Inability to allocate a buffer is a critical error which shouldn't be
> tolerated since most likely the rest of the driver won't work correctly.
> Thus instead of returning the zero status let's return the -ENOMEM error
> if the nvme_hwmon_data structure instance couldn't be created.

Nak for this one. The hwmon is not necessary for the rest of the driver to
function, so having the driver detach from the device seems a bit harsh. The
driver can participate in memory reclaim, so failing on a low memory condition
can make matters worse.

The rest looks good, though.

> @@ -230,7 +230,7 @@ int nvme_hwmon_init(struct nvme_ctrl *ctrl)
>
> data = kzalloc(sizeof(*data), GFP_KERNEL);
> if (!data)
> - return 0;
> + return -ENOMEM;
>
> data->ctrl = ctrl;
> mutex_init(&data->read_lock);
> --


2022-09-30 10:58:35

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Thu, Sep 29, 2022 at 05:53:43PM -0600, Keith Busch wrote:
> On Fri, Sep 30, 2022 at 01:46:46AM +0300, Serge Semin wrote:
> > Inability to allocate a buffer is a critical error which shouldn't be
> > tolerated since most likely the rest of the driver won't work correctly.
> > Thus instead of returning the zero status let's return the -ENOMEM error
> > if the nvme_hwmon_data structure instance couldn't be created.
>

> Nak for this one. The hwmon is not necessary for the rest of the driver to
> function, so having the driver detach from the device seems a bit harsh.

Even if it is as you say, neither the method semantic nor the way it's
called imply that. Any failures except the allocation one are
perceived as erroneous.

> The
> driver can participate in memory reclaim, so failing on a low memory condition
> can make matters worse.

Yes it can, so can many other places in the driver utilizing kmalloc
with just GFP_KERNEL flag passed including on the same path as the
nvme_hwmon_init() execution. Kmalloc will make sure the reclaim is
either finished or executed in background anyway in all cases. Don't
really see why memory allocation failure is less worse in this case
than in many others in the same driver especially seeing as I said
above the method semantic doesn't imply the optional feature
detection. Moreover the allocated structure isn't huge at all. So
failing to allocate that would indeed mean problems with the memory
resource.

>
> The rest looks good, though.

but you ok with kmalloc in the next line. Seems like contradicting.

@Christoph, what do you think about this?

-Sergey

>
> > @@ -230,7 +230,7 @@ int nvme_hwmon_init(struct nvme_ctrl *ctrl)
> >
> > data = kzalloc(sizeof(*data), GFP_KERNEL);
> > if (!data)
> > - return 0;
> > + return -ENOMEM;
> >
> > data->ctrl = ctrl;
> > mutex_init(&data->read_lock);
> > --

2022-09-30 15:02:08

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Fri, Sep 30, 2022 at 12:52:47PM +0300, Serge Semin wrote:
> On Thu, Sep 29, 2022 at 05:53:43PM -0600, Keith Busch wrote:
> > On Fri, Sep 30, 2022 at 01:46:46AM +0300, Serge Semin wrote:
> > > Inability to allocate a buffer is a critical error which shouldn't be
> > > tolerated since most likely the rest of the driver won't work correctly.
> > > Thus instead of returning the zero status let's return the -ENOMEM error
> > > if the nvme_hwmon_data structure instance couldn't be created.
> >
>
> > Nak for this one. The hwmon is not necessary for the rest of the driver to
> > function, so having the driver detach from the device seems a bit harsh.
>
> Even if it is as you say, neither the method semantic nor the way it's
> called imply that. Any failures except the allocation one are
> perceived as erroneous.

This is called by nvme_init_ctrl_finish(), and returns the error to
nvme_reset_work() only if it's < 0, which indicates we can't go on and the
driver unbinds.

This particular condition for hwmon is not something that prevents us from
making forward progress.

> > The
> > driver can participate in memory reclaim, so failing on a low memory condition
> > can make matters worse.
>
> Yes it can, so can many other places in the driver utilizing kmalloc
> with just GFP_KERNEL flag passed including on the same path as the
> nvme_hwmon_init() execution. Kmalloc will make sure the reclaim is
> either finished or executed in background anyway in all cases.

This path is in the first initialization before we've set up a namespace that
can be used as a reclaim destination.

> Don't
> really see why memory allocation failure is less worse in this case
> than in many others in the same driver especially seeing as I said

The other initialization kmalloc's are required to make forward progress toward
setting up a namespace. This one is not required.

2022-10-04 15:37:03

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Fri, Sep 30, 2022 at 08:57:02AM -0600, Keith Busch wrote:
> On Fri, Sep 30, 2022 at 12:52:47PM +0300, Serge Semin wrote:
> > On Thu, Sep 29, 2022 at 05:53:43PM -0600, Keith Busch wrote:
> > > On Fri, Sep 30, 2022 at 01:46:46AM +0300, Serge Semin wrote:
> > > > Inability to allocate a buffer is a critical error which shouldn't be
> > > > tolerated since most likely the rest of the driver won't work correctly.
> > > > Thus instead of returning the zero status let's return the -ENOMEM error
> > > > if the nvme_hwmon_data structure instance couldn't be created.
> > >
> >
> > > Nak for this one. The hwmon is not necessary for the rest of the driver to
> > > function, so having the driver detach from the device seems a bit harsh.
> >
> > Even if it is as you say, neither the method semantic nor the way it's
> > called imply that. Any failures except the allocation one are
> > perceived as erroneous.
>

> This is called by nvme_init_ctrl_finish(), and returns the error to
> nvme_reset_work() only if it's < 0, which indicates we can't go on and the
> driver unbinds.

That's obvious. One of the my question was that what makes the no
memory error different from the rest of the errors causing the
nvme_hwmon_init() method to fail?

>
> This particular condition for hwmon is not something that prevents us from
> making forward progress.

If you consider the hwmon functionality as optional (AFAIU you are),
then just ignore the return value no matter the reason. If the problem
caused the hwmon initialization process to fail turns to be critical
it will be raised in some other place which is required for the NVME
driver to work properly. Otherwise the hwmon module initialization may
still cause the probe procedure to halt, which makes it not optional.
That's what I meant when was saying about "the function and its
caller semantics not implying that".

>
> > > The
> > > driver can participate in memory reclaim, so failing on a low memory condition
> > > can make matters worse.
> >
> > Yes it can, so can many other places in the driver utilizing kmalloc
> > with just GFP_KERNEL flag passed including on the same path as the
> > nvme_hwmon_init() execution. Kmalloc will make sure the reclaim is
> > either finished or executed in background anyway in all cases.
>
> This path is in the first initialization before we've set up a namespace that
> can be used as a reclaim destination.
>
> > Don't
> > really see why memory allocation failure is less worse in this case
> > than in many others in the same driver especially seeing as I said
>
> The other initialization kmalloc's are required to make forward progress toward
> setting up a namespace. This one is not required.

Anyway what you say seems still contradicting. First you said that the
hwmon functionality was optional, but the only error being ignored was
the no-memory one which was very rare and turned to be not ignored in
the most of the other places. Second you got to accept the second
patch of the series, which introduced a one more kmalloc followed
right after the first one in the same function nvme_hwmon_init(). That
kmalloc failure wasn't ignored but caused the nvme_hwmon_init()
function to return an error. If you suggest to forget about the first
part (which IMO still counts, but AFAICS is a common pattern in the
NVME core driver, i.e. nvme_configure_apst() and
nvme_configure_host_options()), the second part still applies.

-Sergey

2022-10-04 18:19:20

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Tue, Oct 04, 2022 at 05:50:49PM +0300, Serge Semin wrote:
> >
> > This particular condition for hwmon is not something that prevents us from
> > making forward progress.
>
> If you consider the hwmon functionality as optional (AFAIU you are),
> then just ignore the return value no matter the reason.

That is not an option. This function does IO, and the controller may not be
usable on the other side of that, which means initialization must abort. We
can't just ignore errors; we just don't need to report errors that don't
prevent forward progress.

> If the problem
> caused the hwmon initialization process to fail turns to be critical
> it will be raised in some other place which is required for the NVME
> driver to work properly. Otherwise the hwmon module initialization may
> still cause the probe procedure to halt, which makes it not optional.
> That's what I meant when was saying about "the function and its
> caller semantics not implying that".
>
> >
> > > > The
> > > > driver can participate in memory reclaim, so failing on a low memory condition
> > > > can make matters worse.
> > >
> > > Yes it can, so can many other places in the driver utilizing kmalloc
> > > with just GFP_KERNEL flag passed including on the same path as the
> > > nvme_hwmon_init() execution. Kmalloc will make sure the reclaim is
> > > either finished or executed in background anyway in all cases.
> >
> > This path is in the first initialization before we've set up a namespace that
> > can be used as a reclaim destination.
> >
> > > Don't
> > > really see why memory allocation failure is less worse in this case
> > > than in many others in the same driver especially seeing as I said
> >
> > The other initialization kmalloc's are required to make forward progress toward
> > setting up a namespace. This one is not required.
>
> Anyway what you say seems still contradicting. First you said that the
> hwmon functionality was optional, but the only error being ignored was
> the no-memory one which was very rare and turned to be not ignored in
> the most of the other places.

> Second you got to accept the second
> patch of the series, which introduced a one more kmalloc followed
> right after the first one in the same function nvme_hwmon_init().

My comments on this patch were intended to be applied to all similiarly added
uses.

2022-10-07 10:28:34

by Serge Semin

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] nvme-hwmon: Return error on kzalloc failure

On Tue, Oct 04, 2022 at 11:33:44AM -0600, Keith Busch wrote:
> On Tue, Oct 04, 2022 at 05:50:49PM +0300, Serge Semin wrote:
> > >
> > > This particular condition for hwmon is not something that prevents us from
> > > making forward progress.
> >
> > If you consider the hwmon functionality as optional (AFAIU you are),
> > then just ignore the return value no matter the reason.
>
> That is not an option. This function does IO, and the controller may not be
> usable on the other side of that, which means initialization must abort. We
> can't just ignore errors; we just don't need to report errors that don't
> prevent forward progress.
>
> > If the problem
> > caused the hwmon initialization process to fail turns to be critical
> > it will be raised in some other place which is required for the NVME
> > driver to work properly. Otherwise the hwmon module initialization may
> > still cause the probe procedure to halt, which makes it not optional.
> > That's what I meant when was saying about "the function and its
> > caller semantics not implying that".
> >
> > >
> > > > > The
> > > > > driver can participate in memory reclaim, so failing on a low memory condition
> > > > > can make matters worse.
> > > >
> > > > Yes it can, so can many other places in the driver utilizing kmalloc
> > > > with just GFP_KERNEL flag passed including on the same path as the
> > > > nvme_hwmon_init() execution. Kmalloc will make sure the reclaim is
> > > > either finished or executed in background anyway in all cases.
> > >
> > > This path is in the first initialization before we've set up a namespace that
> > > can be used as a reclaim destination.
> > >
> > > > Don't
> > > > really see why memory allocation failure is less worse in this case
> > > > than in many others in the same driver especially seeing as I said
> > >
> > > The other initialization kmalloc's are required to make forward progress toward
> > > setting up a namespace. This one is not required.
> >
> > Anyway what you say seems still contradicting. First you said that the
> > hwmon functionality was optional, but the only error being ignored was
> > the no-memory one which was very rare and turned to be not ignored in
> > the most of the other places.
>
> > Second you got to accept the second
> > patch of the series, which introduced a one more kmalloc followed
> > right after the first one in the same function nvme_hwmon_init().
>

> My comments on this patch were intended to be applied to all similiarly added
> uses.

How could I've possibly got that from your "The rest looks good,
though." and nacking only this patch with the message "The hwmon is
not necessary for the rest of the driver..." ?

Anyway due to all these uncertainties it's better to have a second
opinion on the patches before re-spining the series.

@Christoph, since you've already started reviewing the pathchset could
you have a look at the patches #1 and #2 of the series? Please note
the @Keith' comments regarding the memory allocation failure handling
in there.

-Sergey