Subject: [PATCH v5 0/3] use bin_attribute to break the size limitation of cpumap ABI

v5:
-remove the bitmap API bitmap_print_to_buf, alternatively, only provide
cpumap_print_to_buf API as what we really care about is cpumask for
this moment. we can freely take bitmap_print_to_buf back once we find
the second user.
hopefully this can alleviate Yury's worries on possible abuse of a new
bitmap API.
-correct the document of cpumap_print_to_buf;
-In patch1, clearly explain why we need this new API in commit log;
-Also refine the commit log of patch 2 and 3;
-As the modification is narrowed to the scope of cpumask, the kunit
test of bitmap_print_to_buf doesn't apply in the new patchset. so
test case patch4/4 is removed.

Thanks for the comments of Greg, Yury, Andy. Thanks for Jonathan's
review.

v4:
-add test cases for bitmap_print_to_buf API;
-add Reviewed-by of Jonathan Cameron for patches 1-3, thanks!

v3:
-fixed the strlen issue and patch #1,#2,#3 minor formatting issues, thanks
to Andy Shevchenko and Jonathan Cameron.

v2:
-split the original patch #1 into two patches and use kasprintf() in
-patch #1 to simplify the code. do some minor formatting adjustments.

Background:

the whole story began from this thread when Jonatah and me tried to add a
new topology level-cluster which exists on kunpeng920 and X86 Jacobsville:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

in the discussion, Greg had some concern about the potential one page size
limitation of sysfs ABI for topology. Greg's comment is reasonable
and I think we should address the problem.

For this moment, numa node, cpu topology and some other drivers are using
cpu bitmap and list to expose hardware topology. When cpu number is large,
the page buffer of sysfs won't be able to hold the whole bitmask or list.
This doesn't really happen nowadays for bitmask as the maximum NR_CPUS
is 8196 for X86_64 and 4096 for ARM64 since
8196 * 9 / 32 = 2305
is still smaller than 4KB page size.

So the existing BUILD_BUG_ON() in drivers/base/node.c is pretty much
preventing future problems when hardware gets more and more CPUs:
static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
{
cpumask_var_t mask;
struct node *node_dev = to_node(dev);

/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
}

But those ABIs exposing cpu lists are much more tricky as a list could be
like: 0, 3, 5, 7, 9, 11... etc. so nobody knows the size till the last
moment. Comparing to bitmask, list is easier to exceed one page.

In the previous discussion, Greg and Dave Hansen preferred to remove this
kind of limitation totally and remove the BUILD_BUG_ON() in
drivers/base/node.c together:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

Todo:

right now, only topology and node are addressed. there are many other
drivers are calling cpumap_print_to_pagebuf() and have the similar
problems. we are going to address them one by one after this patchset
settles down.

Tian Tao (3):
cpumask: introduce cpumap_print_to_buf to support large bitmask and
list
topology: use bin_attribute to break the size limitation of cpumap ABI
drivers/base/node.c: use bin_attribute to break the size limitation of
cpumap ABI

drivers/base/node.c | 52 +++++++++++-------
drivers/base/topology.c | 115 ++++++++++++++++++++++------------------
include/linux/cpumask.h | 19 +++++++
lib/cpumask.c | 18 +++++++
4 files changed, 133 insertions(+), 70 deletions(-)

--
2.25.1


Subject: [PATCH v5 1/3] cpumask: introduce cpumap_print_to_buf to support large bitmask and list

From: Tian Tao <[email protected]>

The existing cpumap_print_to_pagebuf() is used by cpu topology and other
drivers to export hexadecimal bitmask and decimal list to userspace by
sysfs ABI.

Right now, those drivers are using a normal attribute for this kind of
ABIs. A normal attribute typically has show entry as below:

static ssize_t example_dev_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
...
return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
}
show entry of attribute has no offset and count parameters and this
means the file is limited to one page only.

cpumap_print_to_pagebuf() API works terribly well for this kind of
normal attribute with buf parameter and without offset, count:

static inline ssize_t
cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
{
return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
nr_cpu_ids);
}

The problem is once we have many cpus, we have a chance to make bitmask
or list more than one page. Especially for list, it could be as complex
as 0,3,5,7,9,...... We have no simple way to know it exact size.

It turns out bin_attribute is a way to break this limit. bin_attribute
has show entry as below:
static ssize_t
example_bin_attribute_show(struct file *filp, struct kobject *kobj,
struct bin_attribute *attr, char *buf,
loff_t offset, size_t count)
{
...
}

With the new offset and count parameters, this makes sysfs ABI be able
to support file size more than one page. For example, offset could be
>= 4096.

This patch introduces cpumap_print_to_buf() so that those drivers can
move to bin_attribute to support large bitmask and list. In result,
we have to pass the corresponding parameters from bin_attribute to this
new API.

Signed-off-by: Tian Tao <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Stefano Brivio <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: "Ma, Jianpeng" <[email protected]>
Cc: Yury Norov <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Barry Song <[email protected]>
---
include/linux/cpumask.h | 19 +++++++++++++++++++
lib/cpumask.c | 18 ++++++++++++++++++
2 files changed, 37 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index bfc4690de4f4..24f410a2e793 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -983,6 +983,25 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
nr_cpu_ids);
}

+/**
+ * cpumap_print_to_buf - copies the cpumask into the buffer either
+ * as comma-separated list of cpus or hex values of cpumask;
+ * Typically used by bin_attribute to export cpumask bitmask and
+ * list ABI.
+ * @list: indicates whether the cpumap must be list
+ * true: print in decimal list format
+ * fasle: print in hexadecimal bitmask format
+ * @mask: the cpumask to copy
+ * @buf: the buffer to copy into
+ * @off: in the string from which we are copying, We copy to @buf
+ * @count: the maximum number of bytes to print
+ *
+ * Returns the length of how many bytes have been copied.
+ */
+extern ssize_t
+cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
+ loff_t off, size_t count);
+
#if NR_CPUS <= BITS_PER_LONG
#define CPU_MASK_ALL \
(cpumask_t) { { \
diff --git a/lib/cpumask.c b/lib/cpumask.c
index c3c76b833384..40421a6d31bc 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -279,3 +279,21 @@ int cpumask_any_distribute(const struct cpumask *srcp)
return next;
}
EXPORT_SYMBOL(cpumask_any_distribute);
+
+ssize_t cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
+ loff_t off, size_t count)
+{
+ const char *fmt = list ? "%*pbl\n" : "%*pb\n";
+ ssize_t size;
+ void *data;
+
+ data = kasprintf(GFP_KERNEL, fmt, nr_cpu_ids, cpumask_bits(mask));
+ if (!data)
+ return -ENOMEM;
+
+ size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
+ kfree(data);
+
+ return size;
+}
+EXPORT_SYMBOL(cpumap_print_to_buf);
--
2.25.1

Subject: [PATCH v5 2/3] topology: use bin_attribute to break the size limitation of cpumap ABI

From: Tian Tao <[email protected]>

Reading /sys/devices/system/cpu/cpuX/topology/ returns cpu topology.
However, the size of this file is limited to PAGE_SIZE because of
the limitation for sysfs attribute.
This patch moves to use bin_attribute to extend the ABI to be more
than one page so that cpumap bitmask and list won't be potentially
trimmed.

Signed-off-by: Tian Tao <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Barry Song <[email protected]>
---
drivers/base/topology.c | 115 ++++++++++++++++++++++------------------
1 file changed, 63 insertions(+), 52 deletions(-)

diff --git a/drivers/base/topology.c b/drivers/base/topology.c
index 4d254fcc93d1..dd3980124e33 100644
--- a/drivers/base/topology.c
+++ b/drivers/base/topology.c
@@ -21,25 +21,27 @@ static ssize_t name##_show(struct device *dev, \
return sysfs_emit(buf, "%d\n", topology_##name(dev->id)); \
}

-#define define_siblings_show_map(name, mask) \
-static ssize_t name##_show(struct device *dev, \
- struct device_attribute *attr, char *buf) \
-{ \
- return cpumap_print_to_pagebuf(false, buf, topology_##mask(dev->id));\
+#define define_siblings_read_func(name, mask) \
+static ssize_t name##_read(struct file *file, struct kobject *kobj, \
+ struct bin_attribute *attr, char *buf, \
+ loff_t off, size_t count) \
+{ \
+ struct device *dev = kobj_to_dev(kobj); \
+ \
+ return cpumap_print_to_buf(false, buf, topology_##mask(dev->id), \
+ off, count); \
+} \
+ \
+static ssize_t name##_list_read(struct file *file, struct kobject *kobj, \
+ struct bin_attribute *attr, char *buf, \
+ loff_t off, size_t count) \
+{ \
+ struct device *dev = kobj_to_dev(kobj); \
+ \
+ return cpumap_print_to_buf(true, buf, topology_##mask(dev->id), \
+ off, count); \
}

-#define define_siblings_show_list(name, mask) \
-static ssize_t name##_list_show(struct device *dev, \
- struct device_attribute *attr, \
- char *buf) \
-{ \
- return cpumap_print_to_pagebuf(true, buf, topology_##mask(dev->id));\
-}
-
-#define define_siblings_show_func(name, mask) \
- define_siblings_show_map(name, mask); \
- define_siblings_show_list(name, mask)
-
define_id_show_func(physical_package_id);
static DEVICE_ATTR_RO(physical_package_id);

@@ -49,71 +51,80 @@ static DEVICE_ATTR_RO(die_id);
define_id_show_func(core_id);
static DEVICE_ATTR_RO(core_id);

-define_siblings_show_func(thread_siblings, sibling_cpumask);
-static DEVICE_ATTR_RO(thread_siblings);
-static DEVICE_ATTR_RO(thread_siblings_list);
+define_siblings_read_func(thread_siblings, sibling_cpumask);
+static BIN_ATTR_RO(thread_siblings, 0);
+static BIN_ATTR_RO(thread_siblings_list, 0);

-define_siblings_show_func(core_cpus, sibling_cpumask);
-static DEVICE_ATTR_RO(core_cpus);
-static DEVICE_ATTR_RO(core_cpus_list);
+define_siblings_read_func(core_cpus, sibling_cpumask);
+static BIN_ATTR_RO(core_cpus, 0);
+static BIN_ATTR_RO(core_cpus_list, 0);

-define_siblings_show_func(core_siblings, core_cpumask);
-static DEVICE_ATTR_RO(core_siblings);
-static DEVICE_ATTR_RO(core_siblings_list);
+define_siblings_read_func(core_siblings, core_cpumask);
+static BIN_ATTR_RO(core_siblings, 0);
+static BIN_ATTR_RO(core_siblings_list, 0);

-define_siblings_show_func(die_cpus, die_cpumask);
-static DEVICE_ATTR_RO(die_cpus);
-static DEVICE_ATTR_RO(die_cpus_list);
+define_siblings_read_func(die_cpus, die_cpumask);
+static BIN_ATTR_RO(die_cpus, 0);
+static BIN_ATTR_RO(die_cpus_list, 0);

-define_siblings_show_func(package_cpus, core_cpumask);
-static DEVICE_ATTR_RO(package_cpus);
-static DEVICE_ATTR_RO(package_cpus_list);
+define_siblings_read_func(package_cpus, core_cpumask);
+static BIN_ATTR_RO(package_cpus, 0);
+static BIN_ATTR_RO(package_cpus_list, 0);

#ifdef CONFIG_SCHED_BOOK
define_id_show_func(book_id);
static DEVICE_ATTR_RO(book_id);
-define_siblings_show_func(book_siblings, book_cpumask);
-static DEVICE_ATTR_RO(book_siblings);
-static DEVICE_ATTR_RO(book_siblings_list);
+define_siblings_read_func(book_siblings, book_cpumask);
+static BIN_ATTR_RO(book_siblings, 0);
+static BIN_ATTR_RO(book_siblings_list, 0);
#endif

#ifdef CONFIG_SCHED_DRAWER
define_id_show_func(drawer_id);
static DEVICE_ATTR_RO(drawer_id);
-define_siblings_show_func(drawer_siblings, drawer_cpumask);
-static DEVICE_ATTR_RO(drawer_siblings);
-static DEVICE_ATTR_RO(drawer_siblings_list);
+define_siblings_read_func(drawer_siblings, drawer_cpumask);
+static BIN_ATTR_RO(drawer_siblings, 0);
+static BIN_ATTR_RO(drawer_siblings_list, 0);
#endif

+static struct bin_attribute *bin_attrs[] = {
+ &bin_attr_core_cpus,
+ &bin_attr_core_cpus_list,
+ &bin_attr_thread_siblings,
+ &bin_attr_thread_siblings_list,
+ &bin_attr_core_siblings,
+ &bin_attr_core_siblings_list,
+ &bin_attr_die_cpus,
+ &bin_attr_die_cpus_list,
+ &bin_attr_package_cpus,
+ &bin_attr_package_cpus_list,
+#ifdef CONFIG_SCHED_BOOK
+ &bin_attr_book_siblings,
+ &bin_attr_book_siblings_list,
+#endif
+#ifdef CONFIG_SCHED_DRAWER
+ &bin_attr_drawer_siblings,
+ &bin_attr_drawer_siblings_list,
+#endif
+ NULL
+};
+
static struct attribute *default_attrs[] = {
&dev_attr_physical_package_id.attr,
&dev_attr_die_id.attr,
&dev_attr_core_id.attr,
- &dev_attr_thread_siblings.attr,
- &dev_attr_thread_siblings_list.attr,
- &dev_attr_core_cpus.attr,
- &dev_attr_core_cpus_list.attr,
- &dev_attr_core_siblings.attr,
- &dev_attr_core_siblings_list.attr,
- &dev_attr_die_cpus.attr,
- &dev_attr_die_cpus_list.attr,
- &dev_attr_package_cpus.attr,
- &dev_attr_package_cpus_list.attr,
#ifdef CONFIG_SCHED_BOOK
&dev_attr_book_id.attr,
- &dev_attr_book_siblings.attr,
- &dev_attr_book_siblings_list.attr,
#endif
#ifdef CONFIG_SCHED_DRAWER
&dev_attr_drawer_id.attr,
- &dev_attr_drawer_siblings.attr,
- &dev_attr_drawer_siblings_list.attr,
#endif
NULL
};

static const struct attribute_group topology_attr_group = {
.attrs = default_attrs,
+ .bin_attrs = bin_attrs,
.name = "topology"
};

--
2.25.1

Subject: [PATCH v5 3/3] drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI

From: Tian Tao <[email protected]>

Reading sys/devices/system/cpu/cpuX/nodeX/ returns cpumap and cpulist.
However, the size of this file is limited to PAGE_SIZE because of the
limitation for sysfs attribute.

This patch moves to use bin_attribute to extend the ABI to be more
than one page so that cpumap bitmask and list won't be potentially
trimmed.

Signed-off-by: Tian Tao <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Barry Song <[email protected]>
---
drivers/base/node.c | 52 +++++++++++++++++++++++++++++----------------
1 file changed, 34 insertions(+), 18 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 9db297431b97..add53df53b45 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -27,42 +27,45 @@ static struct bus_type node_subsys = {
};


-static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
+static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf,
+ loff_t off, size_t count)
{
ssize_t n;
cpumask_var_t mask;
struct node *node_dev = to_node(dev);

- /* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
- BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
-
if (!alloc_cpumask_var(&mask, GFP_KERNEL))
return 0;

cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
- n = cpumap_print_to_pagebuf(list, buf, mask);
+ n = cpumap_print_to_buf(list, buf, mask, off, count);
free_cpumask_var(mask);

return n;
}

-static inline ssize_t cpumap_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
+static inline ssize_t cpumap_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *attr, char *buf,
+ loff_t off, size_t count)
{
- return node_read_cpumap(dev, false, buf);
+ struct device *dev = kobj_to_dev(kobj);
+
+ return node_read_cpumap(dev, false, buf, off, count);
}

-static DEVICE_ATTR_RO(cpumap);

-static inline ssize_t cpulist_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
+static BIN_ATTR_RO(cpumap, 0);
+
+static inline ssize_t cpulist_read(struct file *file, struct kobject *kobj,
+ struct bin_attribute *attr, char *buf,
+ loff_t off, size_t count)
{
- return node_read_cpumap(dev, true, buf);
+ struct device *dev = kobj_to_dev(kobj);
+
+ return node_read_cpumap(dev, true, buf, off, count);
}

-static DEVICE_ATTR_RO(cpulist);
+static BIN_ATTR_RO(cpulist, 0);

/**
* struct node_access_nodes - Access class device to hold user visible
@@ -557,15 +560,28 @@ static ssize_t node_read_distance(struct device *dev,
static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);

static struct attribute *node_dev_attrs[] = {
- &dev_attr_cpumap.attr,
- &dev_attr_cpulist.attr,
&dev_attr_meminfo.attr,
&dev_attr_numastat.attr,
&dev_attr_distance.attr,
&dev_attr_vmstat.attr,
NULL
};
-ATTRIBUTE_GROUPS(node_dev);
+
+static struct bin_attribute *node_dev_bin_attrs[] = {
+ &bin_attr_cpumap,
+ &bin_attr_cpulist,
+ NULL
+};
+
+static const struct attribute_group node_dev_group = {
+ .attrs = node_dev_attrs,
+ .bin_attrs = node_dev_bin_attrs
+};
+
+static const struct attribute_group *node_dev_groups[] = {
+ &node_dev_group,
+ NULL
+};

#ifdef CONFIG_HUGETLBFS
/*
--
2.25.1

2021-07-02 10:07:21

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v5 3/3] drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI

On Fri, Jul 02, 2021 at 09:25:59PM +1200, Barry Song wrote:
> From: Tian Tao <[email protected]>
>
> Reading sys/devices/system/cpu/cpuX/nodeX/ returns cpumap and cpulist.
> However, the size of this file is limited to PAGE_SIZE because of the
> limitation for sysfs attribute.
>
> This patch moves to use bin_attribute to extend the ABI to be more
> than one page so that cpumap bitmask and list won't be potentially
> trimmed.

...

> }
>
> -static DEVICE_ATTR_RO(cpumap);
>
> -static inline ssize_t cpulist_show(struct device *dev,
> - struct device_attribute *attr,
> - char *buf)
> +static BIN_ATTR_RO(cpumap, 0);

It will be extra blank line after this change.

--
With Best Regards,
Andy Shevchenko


2021-07-02 10:09:31

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] cpumask: introduce cpumap_print_to_buf to support large bitmask and list

On Fri, Jul 02, 2021 at 09:25:57PM +1200, Barry Song wrote:
> From: Tian Tao <[email protected]>
>
> The existing cpumap_print_to_pagebuf() is used by cpu topology and other
> drivers to export hexadecimal bitmask and decimal list to userspace by
> sysfs ABI.
>
> Right now, those drivers are using a normal attribute for this kind of
> ABIs. A normal attribute typically has show entry as below:
>
> static ssize_t example_dev_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> ...
> return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
> }
> show entry of attribute has no offset and count parameters and this
> means the file is limited to one page only.
>
> cpumap_print_to_pagebuf() API works terribly well for this kind of
> normal attribute with buf parameter and without offset, count:
>
> static inline ssize_t
> cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
> {
> return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
> nr_cpu_ids);
> }
>
> The problem is once we have many cpus, we have a chance to make bitmask
> or list more than one page. Especially for list, it could be as complex
> as 0,3,5,7,9,...... We have no simple way to know it exact size.
>
> It turns out bin_attribute is a way to break this limit. bin_attribute
> has show entry as below:
> static ssize_t
> example_bin_attribute_show(struct file *filp, struct kobject *kobj,
> struct bin_attribute *attr, char *buf,
> loff_t offset, size_t count)
> {
> ...
> }
>
> With the new offset and count parameters, this makes sysfs ABI be able
> to support file size more than one page. For example, offset could be
> >= 4096.
>
> This patch introduces cpumap_print_to_buf() so that those drivers can
> move to bin_attribute to support large bitmask and list. In result,
> we have to pass the corresponding parameters from bin_attribute to this
> new API.

...

> +/**
> + * cpumap_print_to_buf - copies the cpumask into the buffer either
> + * as comma-separated list of cpus or hex values of cpumask;
> + * Typically used by bin_attribute to export cpumask bitmask and
> + * list ABI.

It can be split to

* cpumap_print_to_buf - copies the cpumask into the buffer

here...

> + * @list: indicates whether the cpumap must be list
> + * true: print in decimal list format
> + * fasle: print in hexadecimal bitmask format
> + * @mask: the cpumask to copy
> + * @buf: the buffer to copy into
> + * @off: in the string from which we are copying, We copy to @buf
> + * @count: the maximum number of bytes to print

..and

*
* The function copies the cpumask into the buffer either as comma-separated
* list of cpus or hex values of cpumask; Typically used by bin_attribute to
* export cpumask bitmask and list ABI.

here.

> + *
> + * Returns the length of how many bytes have been copied.
> + */

--
With Best Regards,
Andy Shevchenko


2021-07-02 13:19:25

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v5 2/3] topology: use bin_attribute to break the size limitation of cpumap ABI

Hi Barry,

I love your patch! Yet something to improve:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on staging/staging-testing linus/master v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Barry-Song/use-bin_attribute-to-break-the-size-limitation-of-cpumap-ABI/20210702-172727
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 3b1f941536af17537da09a7552c8e74804dd6823
config: nds32-allnoconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/647347a6350cefabb7ef767a0ab304be4395eeb2
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Barry-Song/use-bin_attribute-to-break-the-size-limitation-of-cpumap-ABI/20210702-172727
git checkout 647347a6350cefabb7ef767a0ab304be4395eeb2
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=nds32 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

nds32le-linux-ld: drivers/base/topology.o: in function `thread_siblings_list_read':
topology.c:(.text+0x4c): undefined reference to `cpumap_print_to_buf'
>> nds32le-linux-ld: topology.c:(.text+0x50): undefined reference to `cpumap_print_to_buf'
nds32le-linux-ld: drivers/base/topology.o: in function `thread_siblings_read':
topology.c:(.text+0xa8): undefined reference to `cpumap_print_to_buf'
nds32le-linux-ld: topology.c:(.text+0xac): undefined reference to `cpumap_print_to_buf'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.19 kB)
.config.gz (5.74 kB)
Download all attachments

2021-07-02 16:35:36

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v5 2/3] topology: use bin_attribute to break the size limitation of cpumap ABI

Hi Barry,

I love your patch! Yet something to improve:

[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on staging/staging-testing linus/master v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Barry-Song/use-bin_attribute-to-break-the-size-limitation-of-cpumap-ABI/20210702-172727
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 3b1f941536af17537da09a7552c8e74804dd6823
config: parisc-randconfig-r004-20210702 (attached as .config)
compiler: hppa64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/647347a6350cefabb7ef767a0ab304be4395eeb2
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Barry-Song/use-bin_attribute-to-break-the-size-limitation-of-cpumap-ABI/20210702-172727
git checkout 647347a6350cefabb7ef767a0ab304be4395eeb2
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross O=build_dir ARCH=parisc SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

hppa64-linux-ld: drivers/base/topology.o: in function `.LC3':
>> (.data.rel.ro+0x18): undefined reference to `cpumap_print_to_buf'
hppa64-linux-ld: drivers/base/topology.o: in function `.LC15':
(.data.rel.ro+0x58): undefined reference to `cpumap_print_to_buf'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (1.98 kB)
.config.gz (24.55 kB)
Download all attachments

2021-07-02 21:38:30

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] cpumask: introduce cpumap_print_to_buf to support large bitmask and list

On Fri, Jul 02, 2021 at 09:25:57PM +1200, Barry Song wrote:
> From: Tian Tao <[email protected]>
>
> The existing cpumap_print_to_pagebuf() is used by cpu topology and other
> drivers to export hexadecimal bitmask and decimal list to userspace by
> sysfs ABI.
>
> Right now, those drivers are using a normal attribute for this kind of
> ABIs. A normal attribute typically has show entry as below:
>
> static ssize_t example_dev_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> ...
> return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
> }
> show entry of attribute has no offset and count parameters and this
> means the file is limited to one page only.
>
> cpumap_print_to_pagebuf() API works terribly well for this kind of
> normal attribute with buf parameter and without offset, count:
>
> static inline ssize_t
> cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
> {
> return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
> nr_cpu_ids);
> }
>
> The problem is once we have many cpus, we have a chance to make bitmask
> or list more than one page. Especially for list, it could be as complex
> as 0,3,5,7,9,...... We have no simple way to know it exact size.
>
> It turns out bin_attribute is a way to break this limit. bin_attribute
> has show entry as below:
> static ssize_t
> example_bin_attribute_show(struct file *filp, struct kobject *kobj,
> struct bin_attribute *attr, char *buf,
> loff_t offset, size_t count)
> {
> ...
> }
>
> With the new offset and count parameters, this makes sysfs ABI be able
> to support file size more than one page. For example, offset could be
> >= 4096.
>
> This patch introduces cpumap_print_to_buf() so that those drivers can
> move to bin_attribute to support large bitmask and list. In result,
> we have to pass the corresponding parameters from bin_attribute to this
> new API.
>
> Signed-off-by: Tian Tao <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Andy Shevchenko <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Cc: Stefano Brivio <[email protected]>
> Cc: Alexander Gordeev <[email protected]>
> Cc: "Ma, Jianpeng" <[email protected]>
> Cc: Yury Norov <[email protected]>
> Cc: Valentin Schneider <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Daniel Bristot de Oliveira <[email protected]>
> Signed-off-by: Barry Song <[email protected]>
> ---
> include/linux/cpumask.h | 19 +++++++++++++++++++
> lib/cpumask.c | 18 ++++++++++++++++++
> 2 files changed, 37 insertions(+)
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index bfc4690de4f4..24f410a2e793 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -983,6 +983,25 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
> nr_cpu_ids);
> }
>
> +/**
> + * cpumap_print_to_buf - copies the cpumask into the buffer either
> + * as comma-separated list of cpus or hex values of cpumask;
> + * Typically used by bin_attribute to export cpumask bitmask and
> + * list ABI.
> + * @list: indicates whether the cpumap must be list
> + * true: print in decimal list format
> + * fasle: print in hexadecimal bitmask format
> + * @mask: the cpumask to copy
> + * @buf: the buffer to copy into
> + * @off: in the string from which we are copying, We copy to @buf
> + * @count: the maximum number of bytes to print
> + *
> + * Returns the length of how many bytes have been copied.
> + */
> +extern ssize_t
> +cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
> + loff_t off, size_t count);
> +
> #if NR_CPUS <= BITS_PER_LONG
> #define CPU_MASK_ALL \
> (cpumask_t) { { \
> diff --git a/lib/cpumask.c b/lib/cpumask.c
> index c3c76b833384..40421a6d31bc 100644
> --- a/lib/cpumask.c
> +++ b/lib/cpumask.c
> @@ -279,3 +279,21 @@ int cpumask_any_distribute(const struct cpumask *srcp)
> return next;
> }
> EXPORT_SYMBOL(cpumask_any_distribute);
> +
> +ssize_t cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
> + loff_t off, size_t count)
> +{
> + const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> + ssize_t size;
> + void *data;
> +
> + data = kasprintf(GFP_KERNEL, fmt, nr_cpu_ids, cpumask_bits(mask));
> + if (!data)
> + return -ENOMEM;
> +
> + size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
> + kfree(data);

Barry,

It looks like my comments for previous iteration were ignored. I don't
like the approach where you allocate potentially big amount of kernel
memory just to free it almost immediately. Nor in lib/bitmap, neither
in lib/cpumask.

For next iterations, please move this function back to lib/bitmap
because there's no specific here for cpumasks.

Thaks,
Yury

> + return size;
> +}
> +EXPORT_SYMBOL(cpumap_print_to_buf);
> --
> 2.25.1

Subject: RE: [PATCH v5 1/3] cpumask: introduce cpumap_print_to_buf to support large bitmask and list



> -----Original Message-----
> From: Yury Norov [mailto:[email protected]]
> Sent: Saturday, July 3, 2021 9:31 AM
> To: Song Bao Hua (Barry Song) <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; tangchengchang
> <[email protected]>; Zengtao (B) <[email protected]>;
> yangyicong <[email protected]>; [email protected]; Linuxarm
> <[email protected]>; tiantao (H) <[email protected]>
> Subject: Re: [PATCH v5 1/3] cpumask: introduce cpumap_print_to_buf to support
> large bitmask and list
>
> On Fri, Jul 02, 2021 at 09:25:57PM +1200, Barry Song wrote:
> > From: Tian Tao <[email protected]>
> >
> > The existing cpumap_print_to_pagebuf() is used by cpu topology and other
> > drivers to export hexadecimal bitmask and decimal list to userspace by
> > sysfs ABI.
> >
> > Right now, those drivers are using a normal attribute for this kind of
> > ABIs. A normal attribute typically has show entry as below:
> >
> > static ssize_t example_dev_show(struct device *dev,
> > struct device_attribute *attr, char *buf)
> > {
> > ...
> > return cpumap_print_to_pagebuf(true, buf, &pmu_mmdc->cpu);
> > }
> > show entry of attribute has no offset and count parameters and this
> > means the file is limited to one page only.
> >
> > cpumap_print_to_pagebuf() API works terribly well for this kind of
> > normal attribute with buf parameter and without offset, count:
> >
> > static inline ssize_t
> > cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
> > {
> > return bitmap_print_to_pagebuf(list, buf, cpumask_bits(mask),
> > nr_cpu_ids);
> > }
> >
> > The problem is once we have many cpus, we have a chance to make bitmask
> > or list more than one page. Especially for list, it could be as complex
> > as 0,3,5,7,9,...... We have no simple way to know it exact size.
> >
> > It turns out bin_attribute is a way to break this limit. bin_attribute
> > has show entry as below:
> > static ssize_t
> > example_bin_attribute_show(struct file *filp, struct kobject *kobj,
> > struct bin_attribute *attr, char *buf,
> > loff_t offset, size_t count)
> > {
> > ...
> > }
> >
> > With the new offset and count parameters, this makes sysfs ABI be able
> > to support file size more than one page. For example, offset could be
> > >= 4096.
> >
> > This patch introduces cpumap_print_to_buf() so that those drivers can
> > move to bin_attribute to support large bitmask and list. In result,
> > we have to pass the corresponding parameters from bin_attribute to this
> > new API.
> >
> > Signed-off-by: Tian Tao <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Cc: Andy Shevchenko <[email protected]>
> > Cc: Randy Dunlap <[email protected]>
> > Cc: Stefano Brivio <[email protected]>
> > Cc: Alexander Gordeev <[email protected]>
> > Cc: "Ma, Jianpeng" <[email protected]>
> > Cc: Yury Norov <[email protected]>
> > Cc: Valentin Schneider <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Daniel Bristot de Oliveira <[email protected]>
> > Signed-off-by: Barry Song <[email protected]>
> > ---
> > include/linux/cpumask.h | 19 +++++++++++++++++++
> > lib/cpumask.c | 18 ++++++++++++++++++
> > 2 files changed, 37 insertions(+)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index bfc4690de4f4..24f410a2e793 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -983,6 +983,25 @@ cpumap_print_to_pagebuf(bool list, char *buf, const
> struct cpumask *mask)
> > nr_cpu_ids);
> > }
> >
> > +/**
> > + * cpumap_print_to_buf - copies the cpumask into the buffer either
> > + * as comma-separated list of cpus or hex values of cpumask;
> > + * Typically used by bin_attribute to export cpumask bitmask and
> > + * list ABI.
> > + * @list: indicates whether the cpumap must be list
> > + * true: print in decimal list format
> > + * fasle: print in hexadecimal bitmask format
> > + * @mask: the cpumask to copy
> > + * @buf: the buffer to copy into
> > + * @off: in the string from which we are copying, We copy to @buf
> > + * @count: the maximum number of bytes to print
> > + *
> > + * Returns the length of how many bytes have been copied.
> > + */
> > +extern ssize_t
> > +cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
> > + loff_t off, size_t count);
> > +
> > #if NR_CPUS <= BITS_PER_LONG
> > #define CPU_MASK_ALL \
> > (cpumask_t) { { \
> > diff --git a/lib/cpumask.c b/lib/cpumask.c
> > index c3c76b833384..40421a6d31bc 100644
> > --- a/lib/cpumask.c
> > +++ b/lib/cpumask.c
> > @@ -279,3 +279,21 @@ int cpumask_any_distribute(const struct cpumask *srcp)
> > return next;
> > }
> > EXPORT_SYMBOL(cpumask_any_distribute);
> > +
> > +ssize_t cpumap_print_to_buf(bool list, char *buf, const struct cpumask *mask,
> > + loff_t off, size_t count)
> > +{
> > + const char *fmt = list ? "%*pbl\n" : "%*pb\n";
> > + ssize_t size;
> > + void *data;
> > +
> > + data = kasprintf(GFP_KERNEL, fmt, nr_cpu_ids, cpumask_bits(mask));
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + size = memory_read_from_buffer(buf, count, &off, data, strlen(data) + 1);
> > + kfree(data);
>
> Barry,
>
> It looks like my comments for previous iteration were ignored. I don't
> like the approach where you allocate potentially big amount of kernel
> memory just to free it almost immediately. Nor in lib/bitmap, neither
> in lib/cpumask.
>

Yury, clearly your comment was not ignored. I explained in this reply:
https://lore.kernel.org/lkml/[email protected]/

I explained in that email and I want to make it more clear:

I don't think moving memory allocation to drivers is a correct way
as for its main users - bin attribute, we have no way to reuse the
buffer allocated in drivers.

> For next iterations, please move this function back to lib/bitmap
> because there's no specific here for cpumasks.

I am ok with taking the bitmap API back as actually it is what i
really preferred. Just to easy your worry on somebody else will
abuse bitmap API. So I narrowed the scope of the modification.


>
> Thaks,
> Yury
>
> > + return size;
> > +}
> > +EXPORT_SYMBOL(cpumap_print_to_buf);

Thanks
Barry