2021-06-14 15:54:30

by Greg Kroah-Hartman

Subject: Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context

On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
> Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
> offline if it is not a valid cpu in a proper cpuset context when the
> cpuset_bound_cpuinfo sysctl parameter is turned on.

This says _what_ you are doing, but I do not understand _why_ you want
to do this.

What is going to use this information? And now you are showing more
files than you previously did, so what userspace tool is now going to
break?



>
> Signed-off-by: Waiman Long <[email protected]>
> ---
> drivers/base/core.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 54ba506e5a89..176b927fade2 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -29,6 +29,7 @@
> #include <linux/sched/mm.h>
> #include <linux/sysfs.h>
> #include <linux/dma-map-ops.h> /* for dma_default_coherent */
> +#include <linux/cpuset.h>
>
> #include "base.h"
> #include "power/power.h"
> @@ -2378,11 +2379,24 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr,
> }
> static DEVICE_ATTR_RW(uevent);
>
> +static bool is_device_cpu(struct device *dev)
> +{
> + return dev->bus && dev->bus->dev_name
> + && !strcmp(dev->bus->dev_name, "cpu");
> +}

No, this is not ok, there is a reason we did not put RTTI in struct
devices, so don't try to fake one here please.

> +
> static ssize_t online_show(struct device *dev, struct device_attribute *attr,
> char *buf)
> {
> bool val;
>
> + /*
> + * Show a cpu as offline if the cpu number is not valid in a
> + * proper cpuset bounding cpuinfo context.
> + */
> + if (is_device_cpu(dev) && !cpuset_current_cpu_valid(dev->id))
> + return sysfs_emit(buf, "0\n");

Why are you changing the driver core for a single random, tiny set of
devices? The device code for those devices can handle this just fine,
do NOT modify the driver core for each individual driver type, that way
lies madness.

This change is not ok, sorry.

greg k-h


2021-06-14 16:36:28

by Waiman Long

Subject: Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context

On 6/14/21 11:52 AM, Greg KH wrote:
> On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
>> Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
>> offline if it is not a valid cpu in a proper cpuset context when the
>> cpuset_bound_cpuinfo sysctl parameter is turned on.
> This says _what_ you are doing, but I do not understand _why_ you want
> to do this.
>
> What is going to use this information? And now you are showing more
> files than you previously did, so what userspace tool is now going to
> break?

One reason that is provided by the customer asking for this
functionality is that some applications use the number of cpu cores
for licensing purposes. Even though the applications are running in a
container with a smaller set of cpus, they may still charge as if all
the cpus are available. They ended up using a bind mount to mount over
the cpuX/online file.

I should have included this information in the patchset.
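
For illustration only, and not part of the patch: a container-aware application could count the CPUs it is actually allowed to run on rather than reading the cpuX/online files. The sketch below assumes Linux with glibc, where sched_getaffinity() and CPU_COUNT() are available under _GNU_SOURCE. The helper names allowed_cpu_count() and online_cpu_count() are made up for this example.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Number of CPUs the calling task may run on. This reflects cpuset and
 * affinity restrictions. Returns -1 on failure, for example when the
 * system has more CPUs than a fixed-size cpu_set_t can describe.
 */
static int allowed_cpu_count(void)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}

/*
 * Number of CPUs the system reports as online, regardless of where
 * this task is allowed to run.
 */
static long online_cpu_count(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```

In a task confined by a cpuset to a subset of the host's cpus, allowed_cpu_count() would typically be smaller than online_cpu_count(), which is the discrepancy described above.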


>
>
>> Signed-off-by: Waiman Long <[email protected]>
>> ---
>> drivers/base/core.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 54ba506e5a89..176b927fade2 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -29,6 +29,7 @@
>> #include <linux/sched/mm.h>
>> #include <linux/sysfs.h>
>> #include <linux/dma-map-ops.h> /* for dma_default_coherent */
>> +#include <linux/cpuset.h>
>>
>> #include "base.h"
>> #include "power/power.h"
>> @@ -2378,11 +2379,24 @@ static ssize_t uevent_store(struct device *dev, struct device_attribute *attr,
>> }
>> static DEVICE_ATTR_RW(uevent);
>>
>> +static bool is_device_cpu(struct device *dev)
>> +{
>> + return dev->bus && dev->bus->dev_name
>> + && !strcmp(dev->bus->dev_name, "cpu");
>> +}
> No, this is not ok, there is a reason we did not put RTTI in struct
> devices, so don't try to fake one here please.
>
>> +
>> static ssize_t online_show(struct device *dev, struct device_attribute *attr,
>> char *buf)
>> {
>> bool val;
>>
>> + /*
>> + * Show a cpu as offline if the cpu number is not valid in a
>> + * proper cpuset bounding cpuinfo context.
>> + */
>> + if (is_device_cpu(dev) && !cpuset_current_cpu_valid(dev->id))
>> + return sysfs_emit(buf, "0\n");
> Why are you changing the driver core for a single random, tiny set of
> devices? The device code for those devices can handle this just fine,
> do NOT modify the driver core for each individual driver type, that way
> lies madness.
>
> This change is not ok, sorry.

OK, thanks for the comments. I will see if there is an alternative way
of doing it.

Cheers,
Longman

2021-06-14 17:03:43

by Greg Kroah-Hartman

Subject: Re: [PATCH 4/4] driver core: Allow showing cpu as offline if not valid in cpuset context

On Mon, Jun 14, 2021 at 12:32:01PM -0400, Waiman Long wrote:
> On 6/14/21 11:52 AM, Greg KH wrote:
> > On Mon, Jun 14, 2021 at 11:23:06AM -0400, Waiman Long wrote:
> > > Make /sys/devices/system/cpu/cpu<n>/online file to show a cpu as
> > > offline if it is not a valid cpu in a proper cpuset context when the
> > > cpuset_bound_cpuinfo sysctl parameter is turned on.
> > This says _what_ you are doing, but I do not understand _why_ you want
> > to do this.
> >
> > What is going to use this information? And now you are showing more
> > files than you previously did, so what userspace tool is now going to
> > break?
>
> One reason that is provided by the customer asking for this functionality is
> that some applications use the number of cpu cores for licensing purposes.
> Even though the applications are running in a container with a smaller set
> of cpus, they may still charge as if all the cpus are available. They ended
> up using a bind mount to mount over the cpuX/online file.

Great, then stick with the bind mount for foolish things like that.

There's no technical reason for doing this then, just marketing?

thanks,

greg k-h