2022-08-12 06:22:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 1/5] sysfs: Add attribute info for /sys/devices/.../coredump_disabled

On Thu, Aug 11, 2022 at 04:21:54PM -0700, Manish Mandlik wrote:
> On Wed, Aug 10, 2022 at 9:21 AM Greg Kroah-Hartman <
> [email protected]> wrote:
>
> > On Wed, Aug 10, 2022 at 06:03:37PM +0200, Johannes Berg wrote:
> > > On Wed, 2022-08-10 at 09:00 -0700, Manish Mandlik wrote:
> > > > This patch adds the specification for
> > /sys/devices/.../coredump_disabled
> > > > attribute which allows the userspace to enable/disable devcoredump for
> > a
> > > > particular device and drivers can use it to enable/disable
> > functionality
> > > > accordingly. It is available when the CONFIG_DEV_COREDUMP is enabled
> > and
> > > > driver has implemented the .coredump() callback.
> > > >
> > >
> > > It would be nice to say _why_? What problem does this solve? You could
> > > just create the dump and discard it, instead, for example?
> >
> > Agreed, I do not understand the need for this at all.
> >
>
> The existing /sys/class/devcoredump/disabled (devcd) switch has two
> limitations - it disables dev_coredump for everyone who's using it;

Which is good and is the design of the thing.

> and
> drivers don't have visibility if devcd is disabled or not, so, the
> dev_coredump API simply lets drivers collect the coredump from a device but
> then later discards it if devcd is disabled.

Why would a driver care?

> Now that there are more subsystems using the base dev_coredump API, having
> a granular control will make it easier to selectively disable dev_coredump
> only for a particular device. For ChromeOS, this is useful to allow drivers
> to develop coredump functionality and deploy it without affecting other
> drivers with stable devcoredump implementations (example, we've had some
> devcoredumps that take 12s to run and we only want to enable it on test
> builds because it has lots of PII). The drivers can use this flag to
> refrain from collecting or triggering coredump when undesirable.

This feels odd. You have various out-of-tree drivers that take too long
when they crash to make a dump and it causes some unknown issue
elsewhere?

I don't really understand, sorry.

If you need something for development of a system, that's one thing, but
this feels like you are adding fine-grained tweaks that no one in a real
system would ever use.

What is broken with the current system of "on/off" that does not work
for you now? Why would a normal user only want to turn this on for one
driver and not another?

thanks,

greg k-h


2022-08-13 09:40:53

by Arend Van Spriel

[permalink] [raw]
Subject: Re: [PATCH v5 1/5] sysfs: Add attribute info for /sys/devices/.../coredump_disabled

On August 12, 2022 8:09:59 AM Greg Kroah-Hartman
<[email protected]> wrote:

> On Thu, Aug 11, 2022 at 04:21:54PM -0700, Manish Mandlik wrote:
>> On Wed, Aug 10, 2022 at 9:21 AM Greg Kroah-Hartman <
>> [email protected]> wrote:
>>
>>> On Wed, Aug 10, 2022 at 06:03:37PM +0200, Johannes Berg wrote:
>>>> On Wed, 2022-08-10 at 09:00 -0700, Manish Mandlik wrote:
>>>>> This patch adds the specification for
>>> /sys/devices/.../coredump_disabled
>>>>> attribute which allows the userspace to enable/disable devcoredump for
>>> a
>>>>> particular device and drivers can use it to enable/disable
>>> functionality
>>>>> accordingly. It is available when the CONFIG_DEV_COREDUMP is enabled
>>> and
>>>>> driver has implemented the .coredump() callback.
>>>>
>>>> It would be nice to say _why_? What problem does this solve? You could
>>>> just create the dump and discard it, instead, for example?
>>>
>>> Agreed, I do not understand the need for this at all.
>>
>> The existing /sys/class/devcoredump/disabled (devcd) switch has two
>> limitations - it disables dev_coredump for everyone who's using it;
>
> Which is good and is the design of the thing.
>
>> and
>> drivers don't have visibility if devcd is disabled or not, so, the
>> dev_coredump API simply lets drivers collect the coredump from a device but
>> then later discards it if devcd is disabled.
>
> Why would a driver care?
>
>> Now that there are more subsystems using the base dev_coredump API, having
>> a granular control will make it easier to selectively disable dev_coredump
>> only for a particular device. For ChromeOS, this is useful to allow drivers
>> to develop coredump functionality and deploy it without affecting other
>> drivers with stable devcoredump implementations (example, we've had some
>> devcoredumps that take 12s to run and we only want to enable it on test
>> builds because it has lots of PII). The drivers can use this flag to
>> refrain from collecting or triggering coredump when undesirable.
>
> This feels odd. You have various out-of-tree drivers that take too long
> when they crash to make a dump and it causes some unknown issue
> elsewhere?

If you have drivers taking 12s for coredump you could/should consider doing
it asynchronous, eg. schedule a worker for it. The coredump callback has
void return type so it would be fairly easy.

Regards,
Arend