2004-06-03 16:48:21

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] Bitmap and Cpumask Cleanup - Overview


Bitmap and Cpumask Cleanup

Andrew,

Please consider the following 10 patches for your *-mm patch set,
to be sent shortly in follow-on email messages.

This patch set removes some 27 kernel source files, simplifies
cpumasks, and in the colorful language of Rusty, it gets rid of:

asm-generic/cpumask_optimized_for_large_smp_with_sparse_array_and_small_stack.h

It also forms the basis for a nodemask_t patch, that is
awaiting in the wings, from Matthew Dobson.

===

This is the sixth version of my cleanup of bitmaps and cpumasks.
It has changed very little from the fifth version of a month ago.

This set of 10 patches applies against 2.6.7-rc2-mm2.

Primay goals:

The primary goal of this patch set is to simplify the code for
cpumask_t and (later) nodemask_t, make them easier to use, and
reduce code duplication.

Several flavors of cpumask have been reduced to one, with some
local special handling to optimize for that vast majority of
systems which have less than 32 (or 64) CPUs.

The bitmap operations upon which cpumasks depend have been
optimized a bit more, with a more careful mix of inline and
outofline code.

By simplifying masks to a single file, it should also be
easier to add other such mask types, such as the nodemask
that Matthew Dobson has waiting in the wings, just by
copying cpumask.h and making a few global edits.

This patch set provides (compared to 2.6.7-rc2-mm2):

1) Some 27 files matching the pattern include/*/*mask*.h
are replaced with the single file include/linux/cpumask.h
The variety of arch-specific redirect headers for various
cpumask implementation flavors is gone.
2) The bitmap operations (bitmap.h, bitmap.c) are optimized
for systems of less than 32 (or 64) CPUs.
3) The cpumask operations are now just a thin layer on top
of the bitmask and a few other operations. A cpumask is a
bitmask of exactly NR_CPUS bits, wrapped in a structure.
4) bitmap_complement and cpumask_complement now take two args,
source and target, instead of working in place.
5) Some uses of these macros elsewhere in the kernel were fine
tuned.
6) On ia64, find_first_bit and find_first_zero_bit are
uninlined - saving quite a bit of kernel text space.
The architectures: alpha, parisc, ppc, sh, sparc, sparc64
have this same bit find code, and might also want to uninline.
7) Comments in bitmap.h and cpumask.h list available ops for ease
of browsing.
8) The MASK_ALL macro zeros out unused bits on multiword bitmaps.
9) This patch includes Bill Irwin's recent rewrite of the
bitmap_shift operators to avoid assuming some fixed upper
bound on bitmap sizes.
10) A few more mask macros have been added, to make it easier to
code mask manipulations correctly and easily. They provide
xor, andnot, intersects and subset operators.

Bug fixes:
1) *_complement macros don't leave unused high bits set
2) MASK_ALL for sizes > 1 word, but not exact word multiple,
doesn't have unused high bits set
3) Explicit, documented semantics for handling these unused high bits.
4) A few missing const attributes in bitmap & bitops added.
5) The (Hamming) cpumask weight macros were using the bitops hweight*()
macros, which don't mask high unused bits - fixed to use the bitmap
weight macro which does this masking.

Do to the rather limited use so far of cpumask macros, I am
not aware of anything that these bugs would actually break in
current mm or linus kernels.

Testing so far:
Kernels have been built for i386 (SMP and not), sparc64 and
ia64 (sn2_defconfig), and booted and minimally tested on SN2.
Kernel text size has been compared and found to be similar or
smaller. Correct function of bit operations has been tested
for a variety of NR_CPUS values in a user level framework,
using gcc compiler versions 2.95.3, 3.2.3 and 3.3.2. Joe Korty
has built and booted Version 5 on Opteron.

Reviews and feedback:
This code has been sent to the arch maintainers, and has been
reviewed in some form or other by several, including:
- William Lee Irwin III <[email protected]>
- Jesse Barnes <[email protected]>
- Joe Korty <[email protected]>
- Matthew Dobson <[email protected]>
- Rusty Russell <[email protected]>
So far as I know, no outstanding issues remain open.


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373


2004-06-03 17:13:42

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 2/10 bitmap cleanup preparation for cpumask overhaul

cpumask 2/10 bitmap cleanup preparation for cpumask overhaul

Document the bitmap bit model and handling of unused bits.

Tighten up bitmap so it does not generate nonzero bits
in the unused tail if it is not given any on input.

Add intersects, subset, xor and andnot operators.
Change bitmap_complement to take two operands.

Add a couple of missing 'const' qualifiers on
bitops test_bit and bitmap_equal args.

arch/x86_64/kernel/smpboot.c | 2
include/asm-generic/bitops.h | 2
include/asm-generic/cpumask_array.h | 2
include/asm-i386/mpspec.h | 2
include/asm-x86_64/mpspec.h | 2
include/linux/bitmap.h | 12 ++++
lib/bitmap.c | 87 +++++++++++++++++++++++++++++++++---
mm/mempolicy.c | 12 ++--
8 files changed, 101 insertions(+), 20 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/arch/x86_64/kernel/smpboot.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/x86_64/kernel/smpboot.c 2004-06-03 05:42:56.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/x86_64/kernel/smpboot.c 2004-06-03 05:56:31.000000000 -0700
@@ -827,7 +827,7 @@
if (apicid == boot_cpu_id || (apicid == BAD_APICID))
continue;

- if (!cpu_isset(apicid, phys_cpu_present_map))
+ if (!physid_isset(apicid, phys_cpu_present_map))
continue;
if ((max_cpus >= 0) && (max_cpus <= cpucount+1))
continue;
Index: 2.6.7-rc2-mm2/include/asm-generic/bitops.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/bitops.h 2004-06-03 05:39:23.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/bitops.h 2004-06-03 05:56:31.000000000 -0700
@@ -42,7 +42,7 @@
return retval;
}

-extern __inline__ int test_bit(int nr, long * addr)
+extern __inline__ int test_bit(int nr, const unsigned long * addr)
{
int mask;

Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_array.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_array.h 2004-06-03 05:39:23.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_array.h 2004-06-03 05:56:31.000000000 -0700
@@ -17,7 +17,7 @@
#define cpus_and(dst,src1,src2) bitmap_and((dst).mask,(src1).mask, (src2).mask, NR_CPUS)
#define cpus_or(dst,src1,src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, NR_CPUS)
#define cpus_clear(map) bitmap_zero((map).mask, NR_CPUS)
-#define cpus_complement(map) bitmap_complement((map).mask, NR_CPUS)
+#define cpus_complement(map) bitmap_complement((map).mask, (map).mask, NR_CPUS)
#define cpus_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, NR_CPUS)
#define cpus_empty(map) bitmap_empty(map.mask, NR_CPUS)
#define cpus_addr(map) ((map).mask)
Index: 2.6.7-rc2-mm2/include/asm-i386/mpspec.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mpspec.h 2004-06-03 05:42:57.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mpspec.h 2004-06-03 05:56:31.000000000 -0700
@@ -53,7 +53,7 @@
#define physids_and(dst, src1, src2) bitmap_and((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_or(dst, src1, src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_clear(map) bitmap_zero((map).mask, MAX_APICS)
-#define physids_complement(map) bitmap_complement((map).mask, MAX_APICS)
+#define physids_complement(map) bitmap_complement((map).mask, (map).mask, MAX_APICS)
#define physids_empty(map) bitmap_empty((map).mask, MAX_APICS)
#define physids_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, MAX_APICS)
#define physids_weight(map) bitmap_weight((map).mask, MAX_APICS)
Index: 2.6.7-rc2-mm2/include/asm-x86_64/mpspec.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-x86_64/mpspec.h 2004-06-03 05:42:57.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-x86_64/mpspec.h 2004-06-03 05:56:31.000000000 -0700
@@ -212,7 +212,7 @@
#define physids_and(dst, src1, src2) bitmap_and((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_or(dst, src1, src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_clear(map) bitmap_zero((map).mask, MAX_APICS)
-#define physids_complement(map) bitmap_complement((map).mask, MAX_APICS)
+#define physids_complement(map) bitmap_complement((map).mask, (map).mask, MAX_APICS)
#define physids_empty(map) bitmap_empty((map).mask, MAX_APICS)
#define physids_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, MAX_APICS)
#define physids_weight(map) bitmap_weight((map).mask, MAX_APICS)
Index: 2.6.7-rc2-mm2/include/linux/bitmap.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/bitmap.h 2004-06-03 05:39:36.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/bitmap.h 2004-06-03 05:56:31.000000000 -0700
@@ -13,8 +13,8 @@
int bitmap_empty(const unsigned long *bitmap, int bits);
int bitmap_full(const unsigned long *bitmap, int bits);
int bitmap_equal(const unsigned long *bitmap1,
- unsigned long *bitmap2, int bits);
-void bitmap_complement(unsigned long *bitmap, int bits);
+ const unsigned long *bitmap2, int bits);
+void bitmap_complement(unsigned long *dst, const unsigned long *src, int bits);

static inline void bitmap_zero(unsigned long *bitmap, int bits)
{
@@ -41,6 +41,14 @@
const unsigned long *bitmap2, int bits);
void bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits);
+void bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+void bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+int bitmap_intersects(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+int bitmap_subset(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
int bitmap_weight(const unsigned long *bitmap, int bits);
int bitmap_scnprintf(char *buf, unsigned int buflen,
const unsigned long *maskp, int bits);
Index: 2.6.7-rc2-mm2/lib/bitmap.c
===================================================================
--- 2.6.7-rc2-mm2.orig/lib/bitmap.c 2004-06-03 05:39:46.000000000 -0700
+++ 2.6.7-rc2-mm2/lib/bitmap.c 2004-06-03 05:56:31.000000000 -0700
@@ -12,6 +12,26 @@
#include <asm/bitops.h>
#include <asm/uaccess.h>

+/*
+ * bitmaps provide an array of bits, implemented using an an
+ * array of unsigned longs. The number of valid bits in a
+ * given bitmap need not be an exact multiple of BITS_PER_LONG.
+ *
+ * The possible unused bits in the last, partially used word
+ * of a bitmap are 'don't care'. The implementation makes
+ * no particular effort to keep them zero. It ensures that
+ * their value will not affect the results of any operation.
+ * The bitmap operations that return Boolean (bitmap_empty,
+ * for example) or scalar (bitmap_weight, for example) results
+ * carefully filter out these unused bits from impacting their
+ * results.
+ *
+ * These operations actually hold to a slightly stronger rule:
+ * if you don't input any bitmaps to these ops that have some
+ * unused bits set, then they won't output any set unused bits
+ * in output bitmaps.
+ */
+
int bitmap_empty(const unsigned long *bitmap, int bits)
{
int k, lim = bits/BITS_PER_LONG;
@@ -43,7 +63,7 @@
EXPORT_SYMBOL(bitmap_full);

int bitmap_equal(const unsigned long *bitmap1,
- unsigned long *bitmap2, int bits)
+ const unsigned long *bitmap2, int bits)
{
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -59,13 +79,14 @@
}
EXPORT_SYMBOL(bitmap_equal);

-void bitmap_complement(unsigned long *bitmap, int bits)
+void bitmap_complement(unsigned long *dst, const unsigned long *src, int bits)
{
- int k;
- int nr = BITS_TO_LONGS(bits);
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ dst[k] = ~src[k];

- for (k = 0; k < nr; ++k)
- bitmap[k] = ~bitmap[k];
+ if (bits % BITS_PER_LONG)
+ dst[k] = ~src[k] & ((1UL << (bits % BITS_PER_LONG)) - 1);
}
EXPORT_SYMBOL(bitmap_complement);

@@ -173,6 +194,60 @@
}
EXPORT_SYMBOL(bitmap_or);

+void bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+
+ for (k = 0; k < nr; k++)
+ dst[k] = bitmap1[k] ^ bitmap2[k];
+}
+EXPORT_SYMBOL(bitmap_xor);
+
+void bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k;
+ int nr = BITS_TO_LONGS(bits);
+
+ for (k = 0; k < nr; k++)
+ dst[k] = bitmap1[k] & ~bitmap2[k];
+}
+EXPORT_SYMBOL(bitmap_andnot);
+
+int bitmap_intersects(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap1[k] & bitmap2[k])
+ return 1;
+
+ if (bits % BITS_PER_LONG)
+ if ((bitmap1[k] & bitmap2[k]) &
+ ((1UL << (bits % BITS_PER_LONG)) - 1))
+ return 1;
+ return 0;
+}
+EXPORT_SYMBOL(bitmap_intersects);
+
+int bitmap_subset(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits)
+{
+ int k, lim = bits/BITS_PER_LONG;
+ for (k = 0; k < lim; ++k)
+ if (bitmap1[k] & ~bitmap2[k])
+ return 0;
+
+ if (bits % BITS_PER_LONG)
+ if ((bitmap1[k] & ~bitmap2[k]) &
+ ((1UL << (bits % BITS_PER_LONG)) - 1))
+ return 0;
+ return 1;
+}
+EXPORT_SYMBOL(bitmap_subset);
+
#if BITS_PER_LONG == 32
int bitmap_weight(const unsigned long *bitmap, int bits)
{
Index: 2.6.7-rc2-mm2/mm/mempolicy.c
===================================================================
--- 2.6.7-rc2-mm2.orig/mm/mempolicy.c 2004-06-03 05:39:47.000000000 -0700
+++ 2.6.7-rc2-mm2/mm/mempolicy.c 2004-06-03 05:56:31.000000000 -0700
@@ -92,14 +92,12 @@
/* Check if all specified nodes are online */
static int nodes_online(unsigned long *nodes)
{
- DECLARE_BITMAP(offline, MAX_NUMNODES);
+ DECLARE_BITMAP(online2, MAX_NUMNODES);

- bitmap_copy(offline, node_online_map, MAX_NUMNODES);
- if (bitmap_empty(offline, MAX_NUMNODES))
- set_bit(0, offline);
- bitmap_complement(offline, MAX_NUMNODES);
- bitmap_and(offline, offline, nodes, MAX_NUMNODES);
- if (!bitmap_empty(offline, MAX_NUMNODES))
+ bitmap_copy(online2, node_online_map, MAX_NUMNODES);
+ if (bitmap_empty(online2, MAX_NUMNODES))
+ set_bit(0, online2);
+ if (!bitmap_subset(nodes, online2, MAX_NUMNODES))
return -EINVAL;
return 0;
}


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:17:02

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 6/10 remove 26 no longer used cpumask*.h files

cpumask 6/10 remove 26 no longer used cpumask*.h files

With the cpumask rewrite in the previous patch, these
various include/asm-*/cpumask*.h headers are no longer used.

include/asm-alpha/cpumask.h | 6 -
include/asm-arm/cpumask.h | 6 -
include/asm-arm26/cpumask.h | 6 -
include/asm-cris/cpumask.h | 6 -
include/asm-generic/cpumask.h | 40 ------
include/asm-generic/cpumask_arith.h | 49 --------
include/asm-generic/cpumask_array.h | 54 ---------
include/asm-generic/cpumask_const_reference.h | 29 ----
include/asm-generic/cpumask_const_value.h | 21 ---
include/asm-generic/cpumask_up.h | 59 ----------
include/asm-h8300/cpumask.h | 6 -
include/asm-i386/cpumask.h | 6 -
include/asm-ia64/cpumask.h | 6 -
include/asm-m68k/cpumask.h | 6 -
include/asm-m68knommu/cpumask.h | 6 -
include/asm-mips/cpumask.h | 6 -
include/asm-parisc/cpumask.h | 6 -
include/asm-ppc/cpumask.h | 6 -
include/asm-ppc64/cpumask.h | 6 -
include/asm-s390/cpumask.h | 6 -
include/asm-sh/cpumask.h | 6 -
include/asm-sparc/cpumask.h | 6 -
include/asm-sparc64/cpumask.h | 6 -
include/asm-um/cpumask.h | 6 -
include/asm-v850/cpumask.h | 6 -
include/asm-x86_64/cpumask.h | 6 -
26 files changed, 372 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/asm-alpha/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-alpha/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-alpha/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_ALPHA_CPUMASK_H
-#define _ASM_ALPHA_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_ALPHA_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-arm/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-arm/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-arm/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_ARM_CPUMASK_H
-#define _ASM_ARM_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_ARM_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-arm26/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-arm26/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-arm26/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_ARM26_CPUMASK_H
-#define _ASM_ARM26_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_ARM26_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-cris/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-cris/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-cris/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_CRIS_CPUMASK_H
-#define _ASM_CRIS_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_CRIS_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,40 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_H
-#define __ASM_GENERIC_CPUMASK_H
-
-#include <linux/config.h>
-#include <linux/kernel.h>
-#include <linux/threads.h>
-#include <linux/types.h>
-#include <linux/bitmap.h>
-
-#if NR_CPUS > BITS_PER_LONG && NR_CPUS != 1
-#define CPU_ARRAY_SIZE BITS_TO_LONGS(NR_CPUS)
-
-struct cpumask
-{
- unsigned long mask[CPU_ARRAY_SIZE];
-};
-
-typedef struct cpumask cpumask_t;
-
-#else
-typedef unsigned long cpumask_t;
-#endif
-
-#ifdef CONFIG_SMP
-#if NR_CPUS > BITS_PER_LONG
-#include <asm-generic/cpumask_array.h>
-#else
-#include <asm-generic/cpumask_arith.h>
-#endif
-#else
-#include <asm-generic/cpumask_up.h>
-#endif
-
-#if NR_CPUS <= 4*BITS_PER_LONG
-#include <asm-generic/cpumask_const_value.h>
-#else
-#include <asm-generic/cpumask_const_reference.h>
-#endif
-
-#endif /* __ASM_GENERIC_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_arith.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_arith.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_arith.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,49 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_ARITH_H
-#define __ASM_GENERIC_CPUMASK_ARITH_H
-
-/*
- * Arithmetic type -based cpu bitmaps. A single unsigned long is used
- * to contain the whole cpu bitmap.
- */
-
-#define cpu_set(cpu, map) set_bit(cpu, &(map))
-#define cpu_clear(cpu, map) clear_bit(cpu, &(map))
-#define cpu_isset(cpu, map) test_bit(cpu, &(map))
-#define cpu_test_and_set(cpu, map) test_and_set_bit(cpu, &(map))
-
-#define cpus_and(dst,src1,src2) do { dst = (src1) & (src2); } while (0)
-#define cpus_or(dst,src1,src2) do { dst = (src1) | (src2); } while (0)
-#define cpus_clear(map) do { map = 0; } while (0)
-#define cpus_complement(map) do { map = ~(map); } while (0)
-#define cpus_equal(map1, map2) ((map1) == (map2))
-#define cpus_empty(map) ((map) == 0)
-#define cpus_addr(map) (&(map))
-
-#if BITS_PER_LONG == 32
-#define cpus_weight(map) hweight32(map)
-#elif BITS_PER_LONG == 64
-#define cpus_weight(map) hweight64(map)
-#endif
-
-#define cpus_shift_right(dst, src, n) do { dst = (src) >> (n); } while (0)
-#define cpus_shift_left(dst, src, n) do { dst = (src) << (n); } while (0)
-
-#define any_online_cpu(map) \
-({ \
- cpumask_t __tmp__; \
- cpus_and(__tmp__, map, cpu_online_map); \
- __tmp__ ? first_cpu(__tmp__) : NR_CPUS; \
-})
-
-#define CPU_MASK_ALL (~((cpumask_t)0) >> (8*sizeof(cpumask_t) - NR_CPUS))
-#define CPU_MASK_NONE ((cpumask_t)0)
-
-/* only ever use this for things that are _never_ used on large boxen */
-#define cpus_coerce(map) ((unsigned long)(map))
-#define cpus_promote(map) ({ map; })
-#define cpumask_of_cpu(cpu) ({ ((cpumask_t)1) << (cpu); })
-
-#define first_cpu(map) find_first_bit(&(map), NR_CPUS)
-#define next_cpu(cpu, map) find_next_bit(&(map), NR_CPUS, cpu + 1)
-
-#endif /* __ASM_GENERIC_CPUMASK_ARITH_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_array.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_array.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_array.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,54 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_ARRAY_H
-#define __ASM_GENERIC_CPUMASK_ARRAY_H
-
-/*
- * Array-based cpu bitmaps. An array of unsigned longs is used to contain
- * the bitmap, and then contained in a structure so it may be passed by
- * value.
- */
-
-#define CPU_ARRAY_SIZE BITS_TO_LONGS(NR_CPUS)
-
-#define cpu_set(cpu, map) set_bit(cpu, (map).mask)
-#define cpu_clear(cpu, map) clear_bit(cpu, (map).mask)
-#define cpu_isset(cpu, map) test_bit(cpu, (map).mask)
-#define cpu_test_and_set(cpu, map) test_and_set_bit(cpu, (map).mask)
-
-#define cpus_and(dst,src1,src2) bitmap_and((dst).mask,(src1).mask, (src2).mask, NR_CPUS)
-#define cpus_or(dst,src1,src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, NR_CPUS)
-#define cpus_clear(map) bitmap_zero((map).mask, NR_CPUS)
-#define cpus_complement(map) bitmap_complement((map).mask, (map).mask, NR_CPUS)
-#define cpus_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, NR_CPUS)
-#define cpus_empty(map) bitmap_empty(map.mask, NR_CPUS)
-#define cpus_addr(map) ((map).mask)
-#define cpus_weight(map) bitmap_weight((map).mask, NR_CPUS)
-#define cpus_shift_right(d, s, n) bitmap_shift_right((d).mask, (s).mask, n, NR_CPUS)
-#define cpus_shift_left(d, s, n) bitmap_shift_left((d).mask, (s).mask, n, NR_CPUS)
-#define first_cpu(map) find_first_bit((map).mask, NR_CPUS)
-#define next_cpu(cpu, map) find_next_bit((map).mask, NR_CPUS, cpu + 1)
-
-/* only ever use this for things that are _never_ used on large boxen */
-#define cpus_coerce(map) ((map).mask[0])
-#define cpus_promote(map) ({ cpumask_t __cpu_mask = CPU_MASK_NONE;\
- __cpu_mask.mask[0] = map; \
- __cpu_mask; \
- })
-#define cpumask_of_cpu(cpu) ({ cpumask_t __cpu_mask = CPU_MASK_NONE;\
- cpu_set(cpu, __cpu_mask); \
- __cpu_mask; \
- })
-#define any_online_cpu(map) \
-({ \
- cpumask_t __tmp__; \
- cpus_and(__tmp__, map, cpu_online_map); \
- find_first_bit(__tmp__.mask, NR_CPUS); \
-})
-
-
-/*
- * um, these need to be usable as static initializers
- */
-#define CPU_MASK_ALL ((cpumask_t) { {[0 ... CPU_ARRAY_SIZE-1] = ~0UL} })
-#define CPU_MASK_NONE ((cpumask_t) { {[0 ... CPU_ARRAY_SIZE-1] = 0UL} })
-
-#endif /* __ASM_GENERIC_CPUMASK_ARRAY_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_const_reference.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_const_reference.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_const_reference.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,29 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_CONST_REFERENCE_H
-#define __ASM_GENERIC_CPUMASK_CONST_REFERENCE_H
-
-struct cpumask_ref {
- const cpumask_t *val;
-};
-
-typedef const struct cpumask_ref cpumask_const_t;
-
-#define mk_cpumask_const(map) ((cpumask_const_t){ &(map) })
-#define cpu_isset_const(cpu, map) cpu_isset(cpu, *(map).val)
-
-#define cpus_and_const(dst,src1,src2) cpus_and(dst,*(src1).val,*(src2).val)
-#define cpus_or_const(dst,src1,src2) cpus_or(dst,*(src1).val,*(src2).val)
-
-#define cpus_equal_const(map1, map2) cpus_equal(*(map1).val, *(map2).val)
-
-#define cpus_copy_const(map1, map2) bitmap_copy((map1).mask, (map2).val->mask, NR_CPUS)
-
-#define cpus_empty_const(map) cpus_empty(*(map).val)
-#define cpus_weight_const(map) cpus_weight(*(map).val)
-#define first_cpu_const(map) first_cpu(*(map).val)
-#define next_cpu_const(cpu, map) next_cpu(cpu, *(map).val)
-
-/* only ever use this for things that are _never_ used on large boxen */
-#define cpus_coerce_const(map) cpus_coerce(*(map).val)
-#define any_online_cpu_const(map) any_online_cpu(*(map).val)
-
-#endif /* __ASM_GENERIC_CPUMASK_CONST_REFERENCE_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_const_value.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_const_value.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_const_value.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,21 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_CONST_VALUE_H
-#define __ASM_GENERIC_CPUMASK_CONST_VALUE_H
-
-typedef const cpumask_t cpumask_const_t;
-
-#define mk_cpumask_const(map) (map)
-#define cpu_isset_const(cpu, map) cpu_isset(cpu, map)
-#define cpus_and_const(dst,src1,src2) cpus_and(dst, src1, src2)
-#define cpus_or_const(dst,src1,src2) cpus_or(dst, src1, src2)
-#define cpus_equal_const(map1, map2) cpus_equal(map1, map2)
-#define cpus_empty_const(map) cpus_empty(map)
-#define cpus_copy_const(map1, map2) do { map1 = (cpumask_t)map2; } while (0)
-#define cpus_weight_const(map) cpus_weight(map)
-#define first_cpu_const(map) first_cpu(map)
-#define next_cpu_const(cpu, map) next_cpu(cpu, map)
-
-/* only ever use this for things that are _never_ used on large boxen */
-#define cpus_coerce_const(map) cpus_coerce(map)
-#define any_online_cpu_const(map) any_online_cpu(map)
-
-#endif /* __ASM_GENERIC_CPUMASK_CONST_VALUE_H */
Index: 2.6.7-rc2-mm2/include/asm-generic/cpumask_up.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-generic/cpumask_up.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-generic/cpumask_up.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,59 +0,0 @@
-#ifndef __ASM_GENERIC_CPUMASK_UP_H
-#define __ASM_GENERIC_CPUMASK_UP_H
-
-#define cpus_coerce(map) (map)
-
-#define cpu_set(cpu, map) do { (void)(cpu); cpus_coerce(map) = 1UL; } while (0)
-#define cpu_clear(cpu, map) do { (void)(cpu); cpus_coerce(map) = 0UL; } while (0)
-#define cpu_isset(cpu, map) ((void)(cpu), cpus_coerce(map) != 0UL)
-#define cpu_test_and_set(cpu, map) ((void)(cpu), test_and_set_bit(0, &(map)))
-
-#define cpus_and(dst, src1, src2) \
- do { \
- if (cpus_coerce(src1) && cpus_coerce(src2)) \
- cpus_coerce(dst) = 1UL; \
- else \
- cpus_coerce(dst) = 0UL; \
- } while (0)
-
-#define cpus_or(dst, src1, src2) \
- do { \
- if (cpus_coerce(src1) || cpus_coerce(src2)) \
- cpus_coerce(dst) = 1UL; \
- else \
- cpus_coerce(dst) = 0UL; \
- } while (0)
-
-#define cpus_clear(map) do { cpus_coerce(map) = 0UL; } while (0)
-
-#define cpus_complement(map) \
- do { \
- cpus_coerce(map) = !cpus_coerce(map); \
- } while (0)
-
-#define cpus_equal(map1, map2) (cpus_coerce(map1) == cpus_coerce(map2))
-#define cpus_empty(map) (cpus_coerce(map) == 0UL)
-#define cpus_addr(map) (&(map))
-#define cpus_weight(map) (cpus_coerce(map) ? 1UL : 0UL)
-#define cpus_shift_right(d, s, n) do { cpus_coerce(d) = 0UL; } while (0)
-#define cpus_shift_left(d, s, n) do { cpus_coerce(d) = 0UL; } while (0)
-#define first_cpu(map) (cpus_coerce(map) ? 0 : 1)
-#define next_cpu(cpu, map) 1
-
-/* only ever use this for things that are _never_ used on large boxen */
-#define cpus_promote(map) \
- ({ \
- cpumask_t __tmp__; \
- cpus_coerce(__tmp__) = map; \
- __tmp__; \
- })
-#define cpumask_of_cpu(cpu) ((void)(cpu), cpus_promote(1))
-#define any_online_cpu(map) (cpus_coerce(map) ? 0 : 1)
-
-/*
- * um, these need to be usable as static initializers
- */
-#define CPU_MASK_ALL 1UL
-#define CPU_MASK_NONE 0UL
-
-#endif /* __ASM_GENERIC_CPUMASK_UP_H */
Index: 2.6.7-rc2-mm2/include/asm-h8300/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-h8300/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-h8300/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_H8300_CPUMASK_H
-#define _ASM_H8300_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_H8300_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-i386/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_I386_CPUMASK_H
-#define _ASM_I386_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_I386_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-m68k/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-m68k/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-m68k/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_M68K_CPUMASK_H
-#define _ASM_M68K_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_M68K_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-m68knommu/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-m68knommu/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-m68knommu/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_M68KNOMMU_CPUMASK_H
-#define _ASM_M68KNOMMU_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_M68KNOMMU_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-mips/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-mips/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-mips/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_MIPS_CPUMASK_H
-#define _ASM_MIPS_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_MIPS_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-parisc/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-parisc/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-parisc/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_PARISC_CPUMASK_H
-#define _ASM_PARISC_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_PARISC_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-ppc/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-ppc/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-ppc/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_PPC_CPUMASK_H
-#define _ASM_PPC_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_PPC_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-ppc64/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-ppc64/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-ppc64/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_PPC64_CPUMASK_H
-#define _ASM_PPC64_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_PPC64_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-s390/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-s390/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-s390/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_S390_CPUMASK_H
-#define _ASM_S390_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_S390_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-sh/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-sh/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-sh/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_SH_CPUMASK_H
-#define _ASM_SH_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_SH_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-sparc/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-sparc/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-sparc/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_SPARC_CPUMASK_H
-#define _ASM_SPARC_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_SPARC_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-sparc64/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-sparc64/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-sparc64/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_SPARC64_CPUMASK_H
-#define _ASM_SPARC64_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_SPARC64_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-um/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-um/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-um/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_UM_CPUMASK_H
-#define _ASM_UM_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_UM_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-v850/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-v850/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-v850/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_V850_CPUMASK_H
-#define _ASM_V850_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_V850_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-x86_64/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-x86_64/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-x86_64/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_X86_64_CPUMASK_H
-#define _ASM_X86_64_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_X86_64_CPUMASK_H */
Index: 2.6.7-rc2-mm2/include/asm-ia64/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-ia64/cpumask.h 2004-06-03 07:03:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-ia64/cpumask.h 1969-12-31 16:00:00.000000000 -0800
@@ -1,6 +0,0 @@
-#ifndef _ASM_IA64_CPUMASK_H
-#define _ASM_IA64_CPUMASK_H
-
-#include <asm-generic/cpumask.h>
-
-#endif /* _ASM_IA64_CPUMASK_H */


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:20:25

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 7/10 remove obsolete cpumask macro uses - i386 arch

cpumask 7/10 remove obsolete cpumask macro uses - i386 arch

Remove by recoding i386 uses of the obsolete cpumask const,
coerce and promote macros.

arch/i386/kernel/io_apic.c | 2
arch/i386/kernel/smp.c | 2
arch/i386/mach-voyager/voyager_smp.c | 30 +++++-----
include/asm-i386/genapic.h | 2
include/asm-i386/mach-bigsmp/mach_apic.h | 10 +--
include/asm-i386/mach-default/mach_apic.h | 10 +--
include/asm-i386/mach-es7000/mach_apic.h | 10 +--
include/asm-i386/mach-numaq/mach_apic.h | 2
include/asm-i386/mach-summit/mach_apic.h | 8 +-
include/asm-i386/mach-visws/mach_apic.h | 4 -
10 files changed, 40 insertions(+), 40 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/arch/i386/kernel/io_apic.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/i386/kernel/io_apic.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/i386/kernel/io_apic.c 2004-06-03 07:08:00.000000000 -0700
@@ -224,7 +224,7 @@
struct irq_pin_list *entry = irq_2_pin + irq;
unsigned int apicid_value;

- apicid_value = cpu_mask_to_apicid(mk_cpumask_const(cpumask));
+ apicid_value = cpu_mask_to_apicid(cpumask);
/* Prepare to do the io_apic_write */
apicid_value = apicid_value << 24;
spin_lock_irqsave(&ioapic_lock, flags);
Index: 2.6.7-rc2-mm2/arch/i386/kernel/smp.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/i386/kernel/smp.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/i386/kernel/smp.c 2004-06-03 07:08:00.000000000 -0700
@@ -159,7 +159,7 @@
*/
inline void send_IPI_mask_bitmask(cpumask_t cpumask, int vector)
{
- unsigned long mask = cpus_coerce(cpumask);
+ unsigned long mask = cpus_addr(cpumask)[0];
unsigned long cfg;
unsigned long flags;

Index: 2.6.7-rc2-mm2/arch/i386/mach-voyager/voyager_smp.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/i386/mach-voyager/voyager_smp.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/i386/mach-voyager/voyager_smp.c 2004-06-03 07:08:00.000000000 -0700
@@ -153,7 +153,7 @@
send_CPI_allbutself(__u8 cpi)
{
__u8 cpu = smp_processor_id();
- __u32 mask = cpus_coerce(cpu_online_map) & ~(1 << cpu);
+ __u32 mask = cpus_addr(cpu_online_map)[0] & ~(1 << cpu);
send_CPI(mask, cpi);
}

@@ -402,11 +402,11 @@
/* set up everything for just this CPU, we can alter
* this as we start the other CPUs later */
/* now get the CPU disposition from the extended CMOS */
- phys_cpu_present_map = cpus_promote(voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK));
- cpus_coerce(phys_cpu_present_map) |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 1) << 8;
- cpus_coerce(phys_cpu_present_map) |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 2) << 16;
- cpus_coerce(phys_cpu_present_map) |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 3) << 24;
- printk("VOYAGER SMP: phys_cpu_present_map = 0x%lx\n", cpus_coerce(phys_cpu_present_map));
+ cpus_addr(phys_cpu_present_map)[0] = voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK);
+ cpus_addr(phys_cpu_present_map)[0] |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 1) << 8;
+ cpus_addr(phys_cpu_present_map)[0] |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 2) << 16;
+ cpus_addr(phys_cpu_present_map)[0] |= voyager_extended_cmos_read(VOYAGER_PROCESSOR_PRESENT_MASK + 3) << 24;
+ printk("VOYAGER SMP: phys_cpu_present_map = 0x%lx\n", cpus_addr(phys_cpu_present_map)[0]);
/* Here we set up the VIC to enable SMP */
/* enable the CPIs by writing the base vector to their register */
outb(VIC_DEFAULT_CPI_BASE, VIC_CPI_BASE_REGISTER);
@@ -706,12 +706,12 @@
/* now that the cat has probed the Voyager System Bus, sanity
* check the cpu map */
if( ((voyager_quad_processors | voyager_extended_vic_processors)
- & cpus_coerce(phys_cpu_present_map)) != cpus_coerce(phys_cpu_present_map)) {
+ & cpus_addr(phys_cpu_present_map)[0]) != cpus_addr(phys_cpu_present_map)[0]) {
/* should panic */
printk("\n\n***WARNING*** Sanity check of CPU present map FAILED\n");
}
} else if(voyager_level == 4)
- voyager_extended_vic_processors = cpus_coerce(phys_cpu_present_map);
+ voyager_extended_vic_processors = cpus_addr(phys_cpu_present_map)[0];

/* this sets up the idle task to run on the current cpu */
voyager_extended_cpus = 1;
@@ -909,7 +909,7 @@

if (!cpumask)
BUG();
- if ((cpumask & cpus_coerce(cpu_online_map)) != cpumask)
+ if ((cpumask & cpus_addr(cpu_online_map)[0]) != cpumask)
BUG();
if (cpumask & (1 << smp_processor_id()))
BUG();
@@ -952,7 +952,7 @@

preempt_disable();

- cpu_mask = cpus_coerce(mm->cpu_vm_mask) & ~(1 << smp_processor_id());
+ cpu_mask = cpus_addr(mm->cpu_vm_mask)[0] & ~(1 << smp_processor_id());
local_flush_tlb();
if (cpu_mask)
flush_tlb_others(cpu_mask, mm, FLUSH_ALL);
@@ -968,7 +968,7 @@

preempt_disable();

- cpu_mask = cpus_coerce(mm->cpu_vm_mask) & ~(1 << smp_processor_id());
+ cpu_mask = cpus_addr(mm->cpu_vm_mask)[0] & ~(1 << smp_processor_id());

if (current->active_mm == mm) {
if (current->mm)
@@ -989,7 +989,7 @@

preempt_disable();

- cpu_mask = cpus_coerce(mm->cpu_vm_mask) & ~(1 << smp_processor_id());
+ cpu_mask = cpus_addr(mm->cpu_vm_mask)[0] & ~(1 << smp_processor_id());
if (current->active_mm == mm) {
if(current->mm)
__flush_tlb_one(va);
@@ -1098,7 +1098,7 @@
int wait)
{
struct call_data_struct data;
- __u32 mask = cpus_coerce(cpu_online_map);
+ __u32 mask = cpus_addr(cpu_online_map)[0];

mask &= ~(1<<smp_processor_id());

@@ -1789,9 +1789,9 @@
unsigned long irq_mask = 1 << irq;
int cpu;

- real_mask = cpus_coerce(mask) & voyager_extended_vic_processors;
+ real_mask = cpus_addr(mask)[0] & voyager_extended_vic_processors;

- if(cpus_coerce(mask) == 0)
+ if(cpus_addr(mask)[0] == 0)
/* can't have no cpu's to accept the interrupt -- extremely
* bad things will happen */
return;
Index: 2.6.7-rc2-mm2/include/asm-i386/genapic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/genapic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/genapic.h 2004-06-03 07:08:00.000000000 -0700
@@ -63,7 +63,7 @@

unsigned (*get_apic_id)(unsigned long x);
unsigned long apic_id_mask;
- unsigned int (*cpu_mask_to_apicid)(cpumask_const_t cpumask);
+ unsigned int (*cpu_mask_to_apicid)(cpumask_t cpumask);

/* ipi */
void (*send_IPI_mask)(cpumask_t mask, int vector);
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-bigsmp/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-bigsmp/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-bigsmp/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -28,11 +28,11 @@
static unsigned long cpu = NR_CPUS;
do {
if (cpu >= NR_CPUS)
- cpu = first_cpu_const(cpu_online_map);
+ cpu = first_cpu(cpu_online_map);
else
- cpu = next_cpu_const(cpu, cpu_online_map);
+ cpu = next_cpu(cpu, cpu_online_map);
} while (cpu >= NR_CPUS);
- return mk_cpumask_const(cpumask_of_cpu(cpu));
+ return cpumask_of_cpu(cpu);
}
#define TARGET_CPUS (target_cpus())

@@ -150,12 +150,12 @@
}

/* As we are using single CPU as destination, pick only one CPU here */
-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
int cpu;
int apicid;

- cpu = first_cpu_const(cpumask);
+ cpu = first_cpu(cpumask);
apicid = cpu_to_logical_apicid(cpu);
return apicid;
}
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-default/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-default/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-default/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -5,12 +5,12 @@

#define APIC_DFR_VALUE (APIC_DFR_FLAT)

-static inline cpumask_const_t target_cpus(void)
+static inline cpumask_t target_cpus(void)
{
#ifdef CONFIG_SMP
- return mk_cpumask_const(cpu_online_map);
+ return cpu_online_map;
#else
- return mk_cpumask_const(cpumask_of_cpu(0));
+ return cpumask_of_cpu(0);
#endif
}
#define TARGET_CPUS (target_cpus())
@@ -118,9 +118,9 @@
return physid_isset(GET_APIC_ID(apic_read(APIC_ID)), phys_cpu_present_map);
}

-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
- return cpus_coerce_const(cpumask);
+ return cpus_addr(cpumask)[0];
}

static inline void enable_apic_mode(void)
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-es7000/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-es7000/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-es7000/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -89,7 +89,7 @@
int apic = bios_cpu_apicid[smp_processor_id()];
printk("Enabling APIC mode: %s. Using %d I/O APICs, target cpus %lx\n",
(apic_version[apic] == 0x14) ?
- "Physical Cluster" : "Logical Cluster", nr_ioapics, cpus_coerce(TARGET_CPUS));
+ "Physical Cluster" : "Logical Cluster", nr_ioapics, cpus_addr(TARGET_CPUS)[0]);
}

static inline int multi_timer_check(int apic, int irq)
@@ -159,14 +159,14 @@
return (1);
}

-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
int num_bits_set;
int cpus_found = 0;
int cpu;
int apicid;

- num_bits_set = cpus_weight_const(cpumask);
+ num_bits_set = cpus_weight(cpumask);
/* Return id to all */
if (num_bits_set == NR_CPUS)
#if defined CONFIG_ES7000_CLUSTERED_APIC
@@ -178,10 +178,10 @@
* The cpus in the mask must all be on the apic cluster. If are not
* on the same apicid cluster return default value of TARGET_CPUS.
*/
- cpu = first_cpu_const(cpumask);
+ cpu = first_cpu(cpumask);
apicid = cpu_to_logical_apicid(cpu);
while (cpus_found < num_bits_set) {
- if (cpu_isset_const(cpu, cpumask)) {
+ if (cpu_isset(cpu, cpumask)) {
int new_apicid = cpu_to_logical_apicid(cpu);
if (apicid_cluster(apicid) !=
apicid_cluster(new_apicid)){
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-numaq/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-numaq/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-numaq/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -136,7 +136,7 @@
* We use physical apicids here, not logical, so just return the default
* physical broadcast to stop people from breaking us
*/
-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
return (int) 0xF;
}
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-summit/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-summit/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-summit/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -140,14 +140,14 @@
{
}

-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
int num_bits_set;
int cpus_found = 0;
int cpu;
int apicid;

- num_bits_set = cpus_weight_const(cpumask);
+ num_bits_set = cpus_weight(cpumask);
/* Return id to all */
if (num_bits_set == NR_CPUS)
return (int) 0xFF;
@@ -155,10 +155,10 @@
* The cpus in the mask must all be on the apic cluster. If are not
* on the same apicid cluster return default value of TARGET_CPUS.
*/
- cpu = first_cpu_const(cpumask);
+ cpu = first_cpu(cpumask);
apicid = cpu_to_logical_apicid(cpu);
while (cpus_found < num_bits_set) {
- if (cpu_isset_const(cpu, cpumask)) {
+ if (cpu_isset(cpu, cpumask)) {
int new_apicid = cpu_to_logical_apicid(cpu);
if (apicid_cluster(apicid) !=
apicid_cluster(new_apicid)){
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-visws/mach_apic.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-visws/mach_apic.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-visws/mach_apic.h 2004-06-03 07:08:00.000000000 -0700
@@ -84,9 +84,9 @@
return physid_isset(boot_cpu_physical_apicid, phys_cpu_present_map);
}

-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
- return cpus_coerce_const(cpumask);
+ return cpus_addr(cpumask)[0];
}

static inline u32 phys_pkg_id(u32 cpuid_apic, int index_msb)


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:18:07

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 8/10 remove obsolete cpumask macro uses - other archs

cpumask 8/10 remove obsolete cpumask macro uses - other archs

Remove by recoding other uses of the obsolete cpumask const,
coerce and promote macros.

arch/ppc64/kernel/open_pic.c | 8 ++++----
arch/ppc64/kernel/rtasd.c | 6 +++---
arch/x86_64/kernel/io_apic.c | 2 +-
arch/x86_64/kernel/pci-gart.c | 4 ++--
arch/x86_64/kernel/smp.c | 2 +-
include/asm-x86_64/smp.h | 5 ++---
6 files changed, 13 insertions(+), 14 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/arch/ppc64/kernel/open_pic.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/ppc64/kernel/open_pic.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/ppc64/kernel/open_pic.c 2004-06-03 07:08:02.000000000 -0700
@@ -591,7 +591,7 @@
void openpic_init_processor(u_int cpumask)
{
openpic_write(&OpenPIC->Global.Processor_Initialization,
- physmask(cpumask & cpus_coerce(cpu_online_map)));
+ physmask(cpumask & cpus_addr(cpu_online_map)[0]));
}

#ifdef CONFIG_SMP
@@ -625,7 +625,7 @@
CHECK_THIS_CPU;
check_arg_ipi(ipi);
openpic_write(&OpenPIC->THIS_CPU.IPI_Dispatch(ipi),
- physmask(cpumask & cpus_coerce(cpu_online_map)));
+ physmask(cpumask & cpus_addr(cpu_online_map)[0]));
}

void openpic_request_IPIs(void)
@@ -711,7 +711,7 @@
{
check_arg_timer(timer);
openpic_write(&OpenPIC->Global.Timer[timer].Destination,
- physmask(cpumask & cpus_coerce(cpu_online_map)));
+ physmask(cpumask & cpus_addr(cpu_online_map)[0]));
}


@@ -844,7 +844,7 @@
cpumask_t tmp;

cpus_and(tmp, cpumask, cpu_online_map);
- openpic_mapirq(irq_nr - open_pic_irq_offset, physmask(cpus_coerce(tmp)));
+ openpic_mapirq(irq_nr - open_pic_irq_offset, physmask(cpus_addr(tmp)[0]));
}

#ifdef CONFIG_SMP
Index: 2.6.7-rc2-mm2/arch/ppc64/kernel/rtasd.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/ppc64/kernel/rtasd.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/ppc64/kernel/rtasd.c 2004-06-03 07:08:02.000000000 -0700
@@ -415,7 +415,7 @@
}

lock_cpu_hotplug();
- cpu = first_cpu_const(mk_cpumask_const(cpu_online_map));
+ cpu = first_cpu(cpu_online_map);
for (;;) {
set_cpus_allowed(current, cpumask_of_cpu(cpu));
do_event_scan(event_scan);
@@ -429,9 +429,9 @@
schedule_timeout((HZ*60/rtas_event_scan_rate) / 2);
lock_cpu_hotplug();

- cpu = next_cpu_const(cpu, mk_cpumask_const(cpu_online_map));
+ cpu = next_cpu(cpu, cpu_online_map);
if (cpu == NR_CPUS)
- cpu = first_cpu_const(mk_cpumask_const(cpu_online_map));
+ cpu = first_cpu(cpu_online_map);
}

error:
Index: 2.6.7-rc2-mm2/arch/x86_64/kernel/io_apic.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/x86_64/kernel/io_apic.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/x86_64/kernel/io_apic.c 2004-06-03 07:08:02.000000000 -0700
@@ -1383,7 +1383,7 @@
unsigned long flags;
unsigned int dest;

- dest = cpu_mask_to_apicid(mk_cpumask_const(mask));
+ dest = cpu_mask_to_apicid(mask);

/*
* Only the first 8 bits are valid.
Index: 2.6.7-rc2-mm2/arch/x86_64/kernel/pci-gart.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/x86_64/kernel/pci-gart.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/x86_64/kernel/pci-gart.c 2004-06-03 07:08:02.000000000 -0700
@@ -148,7 +148,7 @@
{
unsigned long flags;
int bus = dev ? dev->bus->number : -1;
- cpumask_const_t bus_cpumask = pcibus_to_cpumask(bus);
+ cpumask_t bus_cpumask = pcibus_to_cpumask(bus);
int flushed = 0;
int i;

@@ -158,7 +158,7 @@
u32 w;
if (!northbridges[i])
continue;
- if (bus >= 0 && !(cpu_isset_const(i, bus_cpumask)))
+ if (bus >= 0 && !(cpu_isset(i, bus_cpumask)))
continue;
pci_write_config_dword(northbridges[i], 0x9c,
northbridge_flush_word[i] | 1);
Index: 2.6.7-rc2-mm2/arch/x86_64/kernel/smp.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/x86_64/kernel/smp.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/x86_64/kernel/smp.c 2004-06-03 07:08:02.000000000 -0700
@@ -94,7 +94,7 @@

static inline void send_IPI_mask(cpumask_t cpumask, int vector)
{
- unsigned long mask = cpus_coerce(cpumask);
+ unsigned long mask = cpus_addr(cpumask)[0];
unsigned long cfg;
unsigned long flags;

Index: 2.6.7-rc2-mm2/include/asm-x86_64/smp.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-x86_64/smp.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-x86_64/smp.h 2004-06-03 07:08:02.000000000 -0700
@@ -105,7 +105,6 @@
return BAD_APICID;
}

-#define cpu_online(cpu) cpu_isset(cpu, cpu_online_map)
#endif /* !ASSEMBLY */

#define NO_PROC_ID 0xFF /* No processor magic marker */
@@ -115,9 +114,9 @@
#define TARGET_CPUS 1

#ifndef ASSEMBLY
-static inline unsigned int cpu_mask_to_apicid(cpumask_const_t cpumask)
+static inline unsigned int cpu_mask_to_apicid(cpumask_t cpumask)
{
- return cpus_coerce_const(cpumask);
+ return cpus_addr(cpumask)[0];
}
#endif



--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:20:25

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 9/10 Remove no longer used obsolete macro emulation

cpumask 9/10 Remove no longer used obsolete macro emulation

Now that the emulation of the obsolete cpumask macros is no
longer needed, remove it from cpumask.h

include/linux/cpumask.h | 12 ------------
1 files changed, 12 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/linux/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/cpumask.h 2004-06-03 07:04:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/cpumask.h 2004-06-03 07:08:04.000000000 -0700
@@ -364,16 +364,4 @@
#define for_each_online_cpu(cpu) for_each_cpu_mask((cpu), cpu_online_map)
#define for_each_present_cpu(cpu) for_each_cpu_mask((cpu), cpu_present_map)

-/* Begin obsolete cpumask operator emulation */
-#define cpu_isset_const(a,b) cpu_isset(a,b)
-#define cpumask_const_t cpumask_t
-#define cpus_coerce(m) (cpus_addr(m)[0])
-#define cpus_coerce_const cpus_coerce
-#define cpus_promote(x) ({ cpumask_t m; m.bits[0] = x; m; })
-#define cpus_weight_const cpus_weight
-#define first_cpu_const first_cpu
-#define mk_cpumask_const(x) x
-#define next_cpu_const next_cpu
-/* End of obsolete cpumask operator emulation */
-
#endif /* __LINUX_CPUMASK_H */


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:23:33

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 4/10 uninline find_next_bit on ia64

cpumask 4/10 uninline find_next_bit on ia64

Move the page of code (~700 bytes of instructions)
for find_next_bit and find_next_zero_bit from inline
in include/asm-ia64/bitops.h to a real function in
arch/ia64/lib/bitops.c, leaving a declaration and
macro wrapper behind.

The other arch's with almost this same code might want to
also uninline it: alpha, parisc, ppc, sh, sparc, sparc64.

These are too big to inline.

arch/ia64/lib/Makefile | 2
arch/ia64/lib/bitop.c | 88 ++++++++++++++++++++++
include/asm-ia64/bitops.h | 92 ++---------------------
3 files changed, 99 insertions(+), 83 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/asm-ia64/bitops.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-ia64/bitops.h 2004-06-03 05:39:28.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-ia64/bitops.h 2004-06-03 05:57:10.000000000 -0700
@@ -11,7 +11,7 @@

#include <linux/compiler.h>
#include <linux/types.h>
-
+#include <asm/bitops.h>
#include <asm/intrinsics.h>

/**
@@ -359,93 +359,21 @@

#endif /* __KERNEL__ */

-/*
- * Find next zero bit in a bitmap reasonably efficiently..
- */
-static inline int
-find_next_zero_bit (void *addr, unsigned long size, unsigned long offset)
-{
- unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
- unsigned long result = offset & ~63UL;
- unsigned long tmp;
-
- if (offset >= size)
- return size;
- size -= result;
- offset &= 63UL;
- if (offset) {
- tmp = *(p++);
- tmp |= ~0UL >> (64-offset);
- if (size < 64)
- goto found_first;
- if (~tmp)
- goto found_middle;
- size -= 64;
- result += 64;
- }
- while (size & ~63UL) {
- if (~(tmp = *(p++)))
- goto found_middle;
- result += 64;
- size -= 64;
- }
- if (!size)
- return result;
- tmp = *p;
-found_first:
- tmp |= ~0UL << size;
- if (tmp == ~0UL) /* any bits zero? */
- return result + size; /* nope */
-found_middle:
- return result + ffz(tmp);
-}
+extern int __find_next_zero_bit (void *addr, unsigned long size, \
+ unsigned long offset);
+extern int __find_next_bit(const void *addr, unsigned long size, \
+ unsigned long offset);
+
+#define find_next_zero_bit(addr, size, offset) \
+ __find_next_zero_bit((addr), (size), (offset))
+#define find_next_bit(addr, size, offset) \
+ __find_next_bit((addr), (size), (offset))

/*
* The optimizer actually does good code for this case..
*/
#define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)

-/*
- * Find next bit in a bitmap reasonably efficiently..
- */
-static inline int
-find_next_bit(const void *addr, unsigned long size, unsigned long offset)
-{
- unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
- unsigned long result = offset & ~63UL;
- unsigned long tmp;
-
- if (offset >= size)
- return size;
- size -= result;
- offset &= 63UL;
- if (offset) {
- tmp = *(p++);
- tmp &= ~0UL << offset;
- if (size < 64)
- goto found_first;
- if (tmp)
- goto found_middle;
- size -= 64;
- result += 64;
- }
- while (size & ~63UL) {
- if ((tmp = *(p++)))
- goto found_middle;
- result += 64;
- size -= 64;
- }
- if (!size)
- return result;
- tmp = *p;
- found_first:
- tmp &= ~0UL >> (64-size);
- if (tmp == 0UL) /* Are any bits set? */
- return result + size; /* Nope. */
- found_middle:
- return result + __ffs(tmp);
-}
-
#define find_first_bit(addr, size) find_next_bit((addr), (size), 0)

#ifdef __KERNEL__
Index: 2.6.7-rc2-mm2/arch/ia64/lib/bitop.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/ia64/lib/bitop.c 2004-06-03 05:57:10.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/ia64/lib/bitop.c 2004-06-03 05:57:10.000000000 -0700
@@ -0,0 +1,88 @@
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <asm/intrinsics.h>
+#include <linux/module.h>
+#include <asm/bitops.h>
+
+/*
+ * Find next zero bit in a bitmap reasonably efficiently..
+ */
+
+int __find_next_zero_bit (void *addr, unsigned long size, unsigned long offset)
+{
+ unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
+ unsigned long result = offset & ~63UL;
+ unsigned long tmp;
+
+ if (offset >= size)
+ return size;
+ size -= result;
+ offset &= 63UL;
+ if (offset) {
+ tmp = *(p++);
+ tmp |= ~0UL >> (64-offset);
+ if (size < 64)
+ goto found_first;
+ if (~tmp)
+ goto found_middle;
+ size -= 64;
+ result += 64;
+ }
+ while (size & ~63UL) {
+ if (~(tmp = *(p++)))
+ goto found_middle;
+ result += 64;
+ size -= 64;
+ }
+ if (!size)
+ return result;
+ tmp = *p;
+found_first:
+ tmp |= ~0UL << size;
+ if (tmp == ~0UL) /* any bits zero? */
+ return result + size; /* nope */
+found_middle:
+ return result + ffz(tmp);
+}
+EXPORT_SYMBOL(__find_next_zero_bit);
+
+/*
+ * Find next bit in a bitmap reasonably efficiently..
+ */
+int __find_next_bit(const void *addr, unsigned long size, unsigned long offset)
+{
+ unsigned long *p = ((unsigned long *) addr) + (offset >> 6);
+ unsigned long result = offset & ~63UL;
+ unsigned long tmp;
+
+ if (offset >= size)
+ return size;
+ size -= result;
+ offset &= 63UL;
+ if (offset) {
+ tmp = *(p++);
+ tmp &= ~0UL << offset;
+ if (size < 64)
+ goto found_first;
+ if (tmp)
+ goto found_middle;
+ size -= 64;
+ result += 64;
+ }
+ while (size & ~63UL) {
+ if ((tmp = *(p++)))
+ goto found_middle;
+ result += 64;
+ size -= 64;
+ }
+ if (!size)
+ return result;
+ tmp = *p;
+ found_first:
+ tmp &= ~0UL >> (64-size);
+ if (tmp == 0UL) /* Are any bits set? */
+ return result + size; /* Nope. */
+ found_middle:
+ return result + __ffs(tmp);
+}
+EXPORT_SYMBOL(__find_next_bit);
Index: 2.6.7-rc2-mm2/arch/ia64/lib/Makefile
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/ia64/lib/Makefile 2004-06-03 05:43:00.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/ia64/lib/Makefile 2004-06-03 05:57:10.000000000 -0700
@@ -6,7 +6,7 @@

lib-y := __divsi3.o __udivsi3.o __modsi3.o __umodsi3.o \
__divdi3.o __udivdi3.o __moddi3.o __umoddi3.o \
- checksum.o clear_page.o csum_partial_copy.o copy_page.o \
+ bitop.o checksum.o clear_page.o csum_partial_copy.o copy_page.o \
clear_user.o strncpy_from_user.o strlen_user.o strnlen_user.o \
flush.o ip_fast_csum.o do_csum.o \
memset.o strlen.o swiotlb.o


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:26:59

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Major rewrite of cpumask to use a single implementation,
as a struct-wrapped bitmap.

This patch leaves some 26 include/asm-*/cpumask*.h
header files orphaned - to be removed next patch.

Some nine cpumask macros for const variants and to
coerce and promote between an unsigned long and a
cpumask are obsolete. Simple emulation wrappers are
provided in this patch for these obsolete macros,
which can be removed once each of the 3 archs (i386,
ppc64, x86_64) using them are recoded in follow-on
patches to not need them.

The CPU_MASK_ALL macro now avoids leaving possible
garbage one bits in any unused portion of the high word.

An inproved comment lists all available operators, for
convenient browsing.

arch/sparc64/kernel/irq.c | 6
include/linux/cpumask.h | 409 +++++++++++++++++++++--
kernel/rcupdate.c | 7
kernel/sched.c | 5
4 files changed, 387 insertions(+), 40 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/linux/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/cpumask.h 2004-06-03 05:56:00.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/cpumask.h 2004-06-03 07:04:31.000000000 -0700
@@ -1,52 +1,379 @@
#ifndef __LINUX_CPUMASK_H
#define __LINUX_CPUMASK_H

+/*
+ * Cpumasks provide a bitmap suitable for representing the
+ * set of CPU's in a system, one bit position per CPU number.
+ *
+ * See detailed comments in the file linux/bitmap.h describing the
+ * data type on which these cpumasks are based.
+ *
+ * For details of cpumask_scnprintf() and cpumask_parse(),
+ * see bitmap_scnprintf() and bitmap_parse() in lib/bitmap.c.
+ *
+ * The available cpumask operations are:
+ *
+ * void cpu_set(cpu, mask) turn on bit 'cpu' in mask
+ * void cpu_clear(cpu, mask) turn off bit 'cpu' in mask
+ * void cpus_setall(mask) set all bits
+ * void cpus_clear(mask) clear all bits
+ * int cpu_isset(cpu, mask) true iff bit 'cpu' set in mask
+ * int cpu_test_and_set(cpu, mask) test and set bit 'cpu' in mask
+ *
+ * void cpus_and(dst, src1, src2) dst = src1 & src2 [intersection]
+ * void cpus_or(dst, src1, src2) dst = src1 | src2 [union]
+ * void cpus_xor(dst, src1, src2) dst = src1 ^ src2
+ * void cpus_andnot(dst, src1, src2) dst = src1 & ~src2
+ * void cpus_complement(dst, src) dst = ~src
+ *
+ * int cpus_equal(mask1, mask2) Does mask1 == mask2?
+ * int cpus_intersects(mask1, mask2) Do mask1 and mask2 intersect?
+ * int cpus_subset(mask1, mask2) Is mask1 a subset of mask2?
+ * int cpus_empty(mask) Is mask empty (no bits sets)?
+ * int cpus_full(mask) Is mask full (all bits sets)?
+ * int cpus_weight(mask) Hamming weigh - number of set bits
+ *
+ * void cpus_shift_right(dst, src, n) Shift right
+ * void cpus_shift_left(dst, src, n) Shift left
+ *
+ * int first_cpu(mask) Number lowest set bit, or NR_CPUS
+ * int next_cpu(cpu, mask) Next cpu past 'cpu', or NR_CPUS
+ *
+ * cpumask_t cpumask_of_cpu(cpu) Return cpumask with bit 'cpu' set
+ * CPU_MASK_ALL Initializer - all bits set
+ * CPU_MASK_NONE Initializer - no bits set
+ * unsigned long *cpus_addr(mask) Array of unsigned long's in mask
+ *
+ * int cpumask_scnprintf(buf, len, mask) Format cpumask for printing
+ * int cpumask_parse(ubuf, ulen, mask) Parse ascii string as cpumask
+ *
+ * int num_online_cpus() Number of online CPUs
+ * int num_possible_cpus() Number of all possible CPUs
+ * int cpu_online(cpu) Is some cpu online?
+ * int cpu_possible(cpu) Is some cpu possible?
+ * void cpu_set_online(cpu) set cpu in cpu_online_map
+ * void cpu_set_offline(cpu) clear cpu in cpu_online_map
+ * int any_online_cpu(mask) First online cpu in mask
+ *
+ * for_each_cpu_mask(cpu, mask) for-loop cpu over mask
+ * for_each_cpu(cpu) for-loop cpu over cpu_possible_map
+ * for_each_online_cpu(cpu) for-loop cpu over cpu_online_map
+ *
+ * Subtlety:
+ * 1) The 'type-checked' form of cpu_isset() causes gcc (3.3.2, anyway)
+ * to generate slightly worse code. Note for example the additional
+ * 40 lines of assembly code compiling the "for each possible cpu"
+ * loops buried in the disk_stat_read() macros calls when compiling
+ * drivers/block/genhd.c (arch i386, CONFIG_SMP=y). So use a simple
+ * one-line #define for cpu_isset(), instead of wrapping an inline
+ * inside a macro, the way we do the other calls.
+ */
+
#include <linux/threads.h>
#include <linux/bitmap.h>
-#include <asm/cpumask.h>
#include <asm/bug.h>

-#ifdef CONFIG_SMP
-
-extern cpumask_t cpu_online_map;
-extern cpumask_t cpu_possible_map;
-
-#define num_online_cpus() cpus_weight(cpu_online_map)
-#define num_possible_cpus() cpus_weight(cpu_possible_map)
-
-#define cpu_online(cpu) cpu_isset(cpu, cpu_online_map)
-#define cpu_possible(cpu) cpu_isset(cpu, cpu_possible_map)
+typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
+extern cpumask_t _unused_cpumask_arg_;

-#define for_each_cpu_mask(cpu, mask) \
- for (cpu = first_cpu_const(mk_cpumask_const(mask)); \
- cpu < NR_CPUS; \
- cpu = next_cpu_const(cpu, mk_cpumask_const(mask)))
+#define cpu_set(cpu, dst) __cpu_set((cpu), &(dst))
+static inline void __cpu_set(int cpu, volatile cpumask_t *dstp)
+{
+ set_bit(cpu, dstp->bits);
+}
+
+#define cpu_clear(cpu, dst) __cpu_clear((cpu), &(dst))
+static inline void __cpu_clear(int cpu, volatile cpumask_t *dstp)
+{
+ clear_bit(cpu, dstp->bits);
+}
+
+#define cpus_setall(dst) __cpus_setall(&(dst), NR_CPUS)
+static inline void __cpus_setall(cpumask_t *dstp, int nbits)
+{
+ bitmap_fill(dstp->bits, nbits);
+}
+
+#define cpus_clear(dst) __cpus_clear(&(dst), NR_CPUS)
+static inline void __cpus_clear(cpumask_t *dstp, int nbits)
+{
+ bitmap_zero(dstp->bits, nbits);
+}
+
+/* No static inline type checking - see Subtlety (1) above. */
+#define cpu_isset(cpu, cpumask) test_bit((cpu), (cpumask).bits)
+
+#define cpu_test_and_set(cpu, cpumask) __cpu_test_and_set((cpu), &(cpumask))
+static inline int __cpu_test_and_set(int cpu, cpumask_t *addr)
+{
+ return test_and_set_bit(cpu, addr->bits);
+}
+
+#define cpus_and(dst, src1, src2) __cpus_and(&(dst), &(src1), &(src2), NR_CPUS)
+static inline void __cpus_and(cpumask_t *dstp, cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ bitmap_and(dstp->bits, src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_or(dst, src1, src2) __cpus_or(&(dst), &(src1), &(src2), NR_CPUS)
+static inline void __cpus_or(cpumask_t *dstp, cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ bitmap_or(dstp->bits, src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_xor(dst, src1, src2) __cpus_xor(&(dst), &(src1), &(src2), NR_CPUS)
+static inline void __cpus_xor(cpumask_t *dstp, cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ bitmap_xor(dstp->bits, src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_andnot(dst, src1, src2) \
+ __cpus_andnot(&(dst), &(src1), &(src2), NR_CPUS)
+static inline void __cpus_andnot(cpumask_t *dstp, cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ bitmap_andnot(dstp->bits, src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_complement(dst, src) __cpus_complement(&(dst), &(src), NR_CPUS)
+static inline void __cpus_complement(cpumask_t *dstp,
+ cpumask_t *srcp, int nbits)
+{
+ bitmap_complement(dstp->bits, srcp->bits, nbits);
+}
+
+#define cpus_equal(src1, src2) __cpus_equal(&(src1), &(src2), NR_CPUS)
+static inline int __cpus_equal(cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ return bitmap_equal(src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_intersects(src1, src2) __cpus_intersects(&(src1), &(src2), NR_CPUS)
+static inline int __cpus_intersects(cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ return bitmap_intersects(src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_subset(src1, src2) __cpus_subset(&(src1), &(src2), NR_CPUS)
+static inline int __cpus_subset(cpumask_t *src1p,
+ cpumask_t *src2p, int nbits)
+{
+ return bitmap_subset(src1p->bits, src2p->bits, nbits);
+}
+
+#define cpus_empty(src) __cpus_empty(&(src), NR_CPUS)
+static inline int __cpus_empty(cpumask_t *srcp, int nbits)
+{
+ return bitmap_empty(srcp->bits, nbits);
+}
+
+#define cpus_full(cpumask) __cpus_full(&(cpumask), NR_CPUS)
+static inline int __cpus_full(cpumask_t *srcp, int nbits)
+{
+ return bitmap_full(srcp->bits, nbits);
+}
+
+#define cpus_weight(cpumask) __cpus_weight(&(cpumask), NR_CPUS)
+static inline int __cpus_weight(cpumask_t *srcp, int nbits)
+{
+ return bitmap_weight(srcp->bits, nbits);
+}
+
+#define cpus_shift_right(dst, src, n) \
+ __cpus_shift_right(&(dst), &(src), (n), NR_CPUS)
+static inline void __cpus_shift_right(cpumask_t *dstp,
+ cpumask_t *srcp, int n, int nbits)
+{
+ bitmap_shift_right(dstp->bits, srcp->bits, n, nbits);
+}
+
+#define cpus_shift_left(dst, src, n) \
+ __cpus_shift_left(&(dst), &(src), (n), NR_CPUS)
+static inline void __cpus_shift_left(cpumask_t *dstp,
+ cpumask_t *srcp, int n, int nbits)
+{
+ bitmap_shift_left(dstp->bits, srcp->bits, n, nbits);
+}
+
+#define first_cpu(src) __first_cpu(&(src), NR_CPUS)
+static inline int __first_cpu(cpumask_t *srcp, int nbits)
+{
+ return find_first_bit(srcp->bits, nbits);
+}
+
+#define next_cpu(n, src) __next_cpu((n), &(src), NR_CPUS)
+static inline int __next_cpu(int n, cpumask_t *srcp, int nbits)
+{
+ return find_next_bit(srcp->bits, nbits, n+1);
+}
+
+#define cpumask_of_cpu(cpu) \
+({ \
+ typeof(_unused_cpumask_arg_) m; \
+ if (sizeof(m) == sizeof(unsigned long)) { \
+ m.bits[0] = 1UL<<(cpu); \
+ } else { \
+ cpus_clear(m); \
+ cpu_set((cpu), m); \
+ } \
+ m; \
+})
+
+#define CPU_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(NR_CPUS)
+
+#if NR_CPUS <= BITS_PER_LONG
+
+#define CPU_MASK_ALL \
+((cpumask_t) { { \
+ [BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD \
+} })

-#define for_each_cpu(cpu) for_each_cpu_mask(cpu, cpu_possible_map)
-#define for_each_online_cpu(cpu) for_each_cpu_mask(cpu, cpu_online_map)
#else
-#define cpu_online_map cpumask_of_cpu(0)
-#define cpu_possible_map cpumask_of_cpu(0)

-#define num_online_cpus() 1
-#define num_possible_cpus() 1
+#define CPU_MASK_ALL \
+((cpumask_t) { { \
+ [0 ... BITS_TO_LONGS(NR_CPUS)-2] = ~0UL, \
+ [BITS_TO_LONGS(NR_CPUS)-1] = CPU_MASK_LAST_WORD \
+} })

-#define cpu_online(cpu) ({ BUG_ON((cpu) != 0); 1; })
-#define cpu_possible(cpu) ({ BUG_ON((cpu) != 0); 1; })
-
-#define for_each_cpu(cpu) for (cpu = 0; cpu < 1; cpu++)
-#define for_each_online_cpu(cpu) for (cpu = 0; cpu < 1; cpu++)
#endif

+#define CPU_MASK_NONE \
+{ { \
+ [0 ... BITS_TO_LONGS(NR_CPUS)-1] = 0UL \
+} }
+
+#define cpus_addr(src) ((src).bits)
+
+#define cpumask_scnprintf(buf, len, src) \
+ __cpumask_scnprintf((buf), (len), &(src), NR_CPUS)
+static inline int __cpumask_scnprintf(char *buf, int len,
+ cpumask_t *srcp, int nbits)
+{
+ return bitmap_scnprintf(buf, len, srcp->bits, nbits);
+}
+
+#define cpumask_parse(ubuf, ulen, src) \
+ __cpumask_parse((ubuf), (ulen), &(src), NR_CPUS)
+static inline int __cpumask_parse(const char __user *buf, int len,
+ cpumask_t *srcp, int nbits)
+{
+ return bitmap_parse(buf, len, srcp->bits, nbits);
+}
+
+#if NR_CPUS > 1
+#define for_each_cpu_mask(cpu, mask) \
+ for ((cpu) = first_cpu(mask); \
+ (cpu) < NR_CPUS; \
+ (cpu) = next_cpu((cpu), (mask)))
+#else /* NR_CPUS == 1 */
+#define for_each_cpu_mask(cpu, mask) for ((cpu) = 0; (cpu) < 1; (cpu)++)
+#endif /* NR_CPUS */
+
+/*
+ * The following particular system cpumasks and operations manage
+ * possible, present and online cpus. Each of them is a fixed size
+ * bitmap of size NR_CPUS.
+ *
+ * #ifdef CONFIG_HOTPLUG_CPU
+ * cpu_possible_map - all NR_CPUS bits set
+ * cpu_present_map - has bit 'cpu' set iff cpu is populated
+ * cpu_online_map - has bit 'cpu' set iff cpu available to scheduler
+ * #else
+ * cpu_possible_map - has bit 'cpu' set iff cpu is populated
+ * cpu_present_map - copy of cpu_possible_map
+ * cpu_online_map - has bit 'cpu' set iff cpu available to scheduler
+ * #endif
+ *
+ * In either case, NR_CPUS is fixed at compile time, as the static
+ * size of these bitmaps. The cpu_possible_map is fixed at boot
+ * time, as the set of CPU id's that it is possible might ever
+ * be plugged in at anytime during the life of that system boot.
+ * The cpu_present_map is dynamic(*), representing which CPUs
+ * are currently plugged in. And cpu_online_map is the dynamic
+ * subset of cpu_present_map, indicating those CPUs available
+ * for scheduling.
+ *
+ * If HOTPLUG is enabled, then cpu_possible_map is forced to have
+ * all NR_CPUS bits set, otherwise it is just the set of CPUs that
+ * ACPI reports present at boot.
+ *
+ * If HOTPLUG is enabled, then cpu_present_map varies dynamically,
+ * depending on what ACPI reports as currently plugged in, otherwise
+ * cpu_present_map is just a copy of cpu_possible_map.
+ *
+ * (*) Well, cpu_present_map is dynamic in the hotplug case. If not
+ * hotplug, it's a copy of cpu_possible_map, hence fixed at boot.
+ *
+ * Subtleties:
+ * 1) UP arch's (NR_CPUS == 1, CONFIG_SMP not defined) hardcode
+ * assumption that their single CPU is online. The UP
+ * cpu_{online,possible,present}_maps are placebos. Changing them
+ * will have no useful affect on the following num_*_cpus()
+ * and cpu_*() macros in the UP case. This ugliness is a UP
+ * optimization - don't waste any instructions or memory references
+ * asking if you're online or how many CPUs there are if there is
+ * only one CPU.
+ * 2) Most SMP arch's #define some of these maps to be some
+ * other map specific to that arch. Therefore, the following
+ * must be #define macros, not inlines. To see why, examine
+ * the assembly code produced by the following. Note that
+ * set1() writes phys_x_map, but set2() writes x_map:
+ * int x_map, phys_x_map;
+ * #define set1(a) x_map = a
+ * inline void set2(int a) { x_map = a; }
+ * #define x_map phys_x_map
+ * main(){ set1(3); set2(5); }
+ */
+
+extern cpumask_t cpu_possible_map;
+extern cpumask_t cpu_online_map;
extern cpumask_t cpu_present_map;
-#define num_present_cpus() cpus_weight(cpu_present_map)
-#define cpu_present(cpu) cpu_isset(cpu, cpu_present_map)
-#define for_each_present_cpu(cpu) for_each_cpu_mask(cpu, cpu_present_map)

-#define cpumask_scnprintf(buf, buflen, map) \
- bitmap_scnprintf(buf, buflen, cpus_addr(map), NR_CPUS)
+#if NR_CPUS > 1
+#define num_online_cpus() cpus_weight(cpu_online_map)
+#define num_possible_cpus() cpus_weight(cpu_possible_map)
+#define num_present_cpus() cpus_weight(cpu_present_map)
+#define cpu_online(cpu) cpu_isset((cpu), cpu_online_map)
+#define cpu_possible(cpu) cpu_isset((cpu), cpu_possible_map)
+#define cpu_present(cpu) cpu_isset((cpu), cpu_present_map)
+#else
+#define num_online_cpus() 1
+#define num_possible_cpus() 1
+#define num_present_cpus() 1
+#define cpu_online(cpu) ((cpu) == 0)
+#define cpu_possible(cpu) ((cpu) == 0)
+#define cpu_present(cpu) ((cpu) == 0)
+#endif

-#define cpumask_parse(buf, buflen, map) \
- bitmap_parse(buf, buflen, cpus_addr(map), NR_CPUS)
+#define any_online_cpu(mask) \
+({ \
+ int cpu; \
+ for_each_cpu_mask(cpu, (mask)) \
+ if (cpu_online(cpu)) \
+ break; \
+ cpu; \
+})
+
+#define for_each_cpu(cpu) for_each_cpu_mask((cpu), cpu_possible_map)
+#define for_each_online_cpu(cpu) for_each_cpu_mask((cpu), cpu_online_map)
+#define for_each_present_cpu(cpu) for_each_cpu_mask((cpu), cpu_present_map)
+
+/* Begin obsolete cpumask operator emulation */
+#define cpu_isset_const(a,b) cpu_isset(a,b)
+#define cpumask_const_t cpumask_t
+#define cpus_coerce(m) (cpus_addr(m)[0])
+#define cpus_coerce_const cpus_coerce
+#define cpus_promote(x) ({ cpumask_t m; m.bits[0] = x; m; })
+#define cpus_weight_const cpus_weight
+#define first_cpu_const first_cpu
+#define mk_cpumask_const(x) x
+#define next_cpu_const next_cpu
+/* End of obsolete cpumask operator emulation */

#endif /* __LINUX_CPUMASK_H */
Index: 2.6.7-rc2-mm2/kernel/sched.c
===================================================================
--- 2.6.7-rc2-mm2.orig/kernel/sched.c 2004-06-03 05:56:00.000000000 -0700
+++ 2.6.7-rc2-mm2/kernel/sched.c 2004-06-03 06:42:03.000000000 -0700
@@ -3119,6 +3119,11 @@
cpumask_t cpu_present_map;
EXPORT_SYMBOL(cpu_present_map);

+#ifndef CONFIG_SMP
+cpumask_t cpu_online_map = CPU_MASK_ALL;
+cpumask_t cpu_possible_map = CPU_MASK_ALL;
+#endif
+
/**
* sys_sched_getaffinity - get the cpu affinity of a process
* @pid: pid of the process
Index: 2.6.7-rc2-mm2/arch/sparc64/kernel/irq.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/sparc64/kernel/irq.c 2004-06-03 05:42:56.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/sparc64/kernel/irq.c 2004-06-03 05:57:13.000000000 -0700
@@ -687,9 +687,10 @@
* Just Do It.
*/
struct irqaction *ap = bp->irq_info;
- cpumask_t cpu_mask = get_smpaff_in_irqaction(ap);
+ cpumask_t cpu_mask;
unsigned int buddy, ticks;

+ cpus_addr(cpu_mask)[0] = get_smpaff_in_irqaction(ap);
cpus_and(cpu_mask, cpu_mask, cpu_online_map);
if (cpus_empty(cpu_mask))
cpu_mask = cpu_online_map;
@@ -1206,9 +1207,10 @@
{
struct ino_bucket *bp = ivector_table + (long)data;
struct irqaction *ap = bp->irq_info;
- cpumask_t mask = get_smpaff_in_irqaction(ap);
+ cpumask_t mask;
int len;

+ cpus_addr(mask)[0] = get_smpaff_in_irqaction(ap);
if (cpus_empty(mask))
mask = cpu_online_map;

Index: 2.6.7-rc2-mm2/kernel/rcupdate.c
===================================================================
--- 2.6.7-rc2-mm2.orig/kernel/rcupdate.c 2004-06-03 05:46:28.000000000 -0700
+++ 2.6.7-rc2-mm2/kernel/rcupdate.c 2004-06-03 05:57:13.000000000 -0700
@@ -130,17 +130,14 @@
*/
static void rcu_start_batch(int next_pending)
{
- cpumask_t active;
-
if (next_pending)
rcu_ctrlblk.next_pending = 1;

if (rcu_ctrlblk.next_pending &&
rcu_ctrlblk.completed == rcu_ctrlblk.cur) {
/* Can't change, since spin lock held. */
- active = nohz_cpu_mask;
- cpus_complement(active);
- cpus_and(rcu_state.rcu_cpu_mask, cpu_online_map, active);
+ cpus_andnot(rcu_state.rcu_cpu_mask, cpu_online_map,
+ nohz_cpu_mask);
write_seqcount_begin(&rcu_ctrlblk.lock);
rcu_ctrlblk.next_pending = 0;
rcu_ctrlblk.cur++;


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:27:00

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 10/10 optimize various uses of new cpumasks

cpumask 10/10 optimize various uses of new cpumasks

Make use of for_each_cpu_mask() macro to simplify and optimize
a couple of sparc64 per-CPU loops.

Optimize a bit of cpumask code for asm-i386/mach-es7000

Convert physids_complement() to use both args in the files
include/asm-i386/mpspec.h, include/asm-x86_64/mpspec.h.

Remove cpumask hack from asm-x86_64/topology.h routine
pcibus_to_cpumask().

Clarify and slightly optimize several cpumask manipulations
in kernel/sched.c

arch/sparc64/kernel/smp.c | 66 +++-------
include/asm-i386/mach-es7000/mach_ipi.h | 5
include/asm-i386/mpspec.h | 2
include/asm-x86_64/mpspec.h | 2
include/asm-x86_64/topology.h | 6
kernel/sched.c | 18 +-
6 files changed, 39 insertions(+), 60 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/kernel/sched.c
===================================================================
--- 2.6.7-rc2-mm2.orig/kernel/sched.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/kernel/sched.c 2004-06-03 07:08:09.000000000 -0700
@@ -802,10 +802,9 @@
return cpu;

cpus_and(tmp, sd->span, cpu_online_map);
- for_each_cpu_mask(i, tmp) {
- if (!cpu_isset(i, p->cpus_allowed))
- continue;
+ cpus_and(tmp, tmp, p->cpus_allowed);

+ for_each_cpu_mask(i, tmp) {
if (idle_cpu(i))
return i;
}
@@ -3506,7 +3505,7 @@
perfctr_set_cpus_allowed(p, new_mask);

rq = task_rq_lock(p, &flags);
- if (any_online_cpu(new_mask) == NR_CPUS) {
+ if (!cpus_intersects(new_mask, cpu_online_map)) {
ret = -EINVAL;
goto out;
}
@@ -3682,8 +3681,7 @@
if (dest_cpu == NR_CPUS)
dest_cpu = any_online_cpu(tsk->cpus_allowed);
if (dest_cpu == NR_CPUS) {
- cpus_clear(tsk->cpus_allowed);
- cpus_complement(tsk->cpus_allowed);
+ cpus_setall(tsk->cpus_allowed);
dest_cpu = any_online_cpu(tsk->cpus_allowed);

/* Don't tell them about moving exiting tasks
@@ -3999,7 +3997,7 @@
int j;
char str[NR_CPUS];
struct sched_group *group = sd->groups;
- cpumask_t groupmask, tmp;
+ cpumask_t groupmask;

cpumask_scnprintf(str, NR_CPUS, sd->span);
cpus_clear(groupmask);
@@ -4029,8 +4027,7 @@
if (!cpus_weight(group->cpumask))
printk(" ERROR empty group:");

- cpus_and(tmp, groupmask, group->cpumask);
- if (cpus_weight(tmp) > 0)
+ if (cpus_intersects(groupmask, group->cpumask))
printk(" ERROR repeated CPUs:");

cpus_or(groupmask, groupmask, group->cpumask);
@@ -4049,8 +4046,7 @@
sd = sd->parent;

if (sd) {
- cpus_and(tmp, groupmask, sd->span);
- if (!cpus_equal(tmp, groupmask))
+ if (!cpus_subset(groupmask, sd->span))
printk(KERN_DEBUG "ERROR parent span is not a superset of domain->span\n");
}

Index: 2.6.7-rc2-mm2/arch/sparc64/kernel/smp.c
===================================================================
--- 2.6.7-rc2-mm2.orig/arch/sparc64/kernel/smp.c 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/arch/sparc64/kernel/smp.c 2004-06-03 07:08:09.000000000 -0700
@@ -406,14 +406,8 @@
int i;

__asm__ __volatile__("rdpr %%pstate, %0" : "=r" (pstate));
- for (i = 0; i < NR_CPUS; i++) {
- if (cpu_isset(i, mask)) {
- spitfire_xcall_helper(data0, data1, data2, pstate, i);
- cpu_clear(i, mask);
- if (cpus_empty(mask))
- break;
- }
- }
+ for_each_cpu_mask(i, mask)
+ spitfire_xcall_helper(data0, data1, data2, pstate, i);
}

/* Cheetah now allows to send the whole 64-bytes of data in the interrupt
@@ -456,25 +450,19 @@

nack_busy_id = 0;
{
- cpumask_t work_mask = mask;
int i;

- for (i = 0; i < NR_CPUS; i++) {
- if (cpu_isset(i, work_mask)) {
- u64 target = (i << 14) | 0x70;
-
- if (!is_jalapeno)
- target |= (nack_busy_id << 24);
- __asm__ __volatile__(
- "stxa %%g0, [%0] %1\n\t"
- "membar #Sync\n\t"
- : /* no outputs */
- : "r" (target), "i" (ASI_INTR_W));
- nack_busy_id++;
- cpu_clear(i, work_mask);
- if (cpus_empty(work_mask))
- break;
- }
+ for_each_cpu_mask(i, mask) {
+ u64 target = (i << 14) | 0x70;
+
+ if (!is_jalapeno)
+ target |= (nack_busy_id << 24);
+ __asm__ __volatile__(
+ "stxa %%g0, [%0] %1\n\t"
+ "membar #Sync\n\t"
+ : /* no outputs */
+ : "r" (target), "i" (ASI_INTR_W));
+ nack_busy_id++;
}
}

@@ -507,7 +495,6 @@
printk("CPU[%d]: mondo stuckage result[%016lx]\n",
smp_processor_id(), dispatch_stat);
} else {
- cpumask_t work_mask = mask;
int i, this_busy_nack = 0;

/* Delay some random time with interrupts enabled
@@ -518,22 +505,17 @@
/* Clear out the mask bits for cpus which did not
* NACK us.
*/
- for (i = 0; i < NR_CPUS; i++) {
- if (cpu_isset(i, work_mask)) {
- u64 check_mask;
-
- if (is_jalapeno)
- check_mask = (0x2UL << (2*i));
- else
- check_mask = (0x2UL <<
- this_busy_nack);
- if ((dispatch_stat & check_mask) == 0)
- cpu_clear(i, mask);
- this_busy_nack += 2;
- cpu_clear(i, work_mask);
- if (cpus_empty(work_mask))
- break;
- }
+ for_each_cpu_mask(i, mask) {
+ u64 check_mask;
+
+ if (is_jalapeno)
+ check_mask = (0x2UL << (2*i));
+ else
+ check_mask = (0x2UL <<
+ this_busy_nack);
+ if ((dispatch_stat & check_mask) == 0)
+ cpu_clear(i, mask);
+ this_busy_nack += 2;
}

goto retry;
Index: 2.6.7-rc2-mm2/include/asm-i386/mach-es7000/mach_ipi.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mach-es7000/mach_ipi.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mach-es7000/mach_ipi.h 2004-06-03 07:08:09.000000000 -0700
@@ -10,9 +10,8 @@

static inline void send_IPI_allbutself(int vector)
{
- cpumask_t mask = cpumask_of_cpu(smp_processor_id());
- cpus_complement(mask);
- cpus_and(mask, mask, cpu_online_map);
+ cpumask_t mask = cpu_online_map;
+ cpu_clear(smp_processor_id(), mask);
if (!cpus_empty(mask))
send_IPI_mask(mask, vector);
}
Index: 2.6.7-rc2-mm2/include/asm-i386/mpspec.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-i386/mpspec.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-i386/mpspec.h 2004-06-03 07:08:09.000000000 -0700
@@ -53,7 +53,7 @@
#define physids_and(dst, src1, src2) bitmap_and((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_or(dst, src1, src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_clear(map) bitmap_zero((map).mask, MAX_APICS)
-#define physids_complement(map) bitmap_complement((map).mask, (map).mask, MAX_APICS)
+#define physids_complement(dst, src) bitmap_complement((dst).mask,(src).mask, MAX_APICS)
#define physids_empty(map) bitmap_empty((map).mask, MAX_APICS)
#define physids_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, MAX_APICS)
#define physids_weight(map) bitmap_weight((map).mask, MAX_APICS)
Index: 2.6.7-rc2-mm2/include/asm-x86_64/mpspec.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-x86_64/mpspec.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-x86_64/mpspec.h 2004-06-03 07:08:09.000000000 -0700
@@ -212,7 +212,7 @@
#define physids_and(dst, src1, src2) bitmap_and((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_or(dst, src1, src2) bitmap_or((dst).mask, (src1).mask, (src2).mask, MAX_APICS)
#define physids_clear(map) bitmap_zero((map).mask, MAX_APICS)
-#define physids_complement(map) bitmap_complement((map).mask, (map).mask, MAX_APICS)
+#define physids_complement(dst, src) bitmap_complement((dst).mask, (src).mask, MAX_APICS)
#define physids_empty(map) bitmap_empty((map).mask, MAX_APICS)
#define physids_equal(map1, map2) bitmap_equal((map1).mask, (map2).mask, MAX_APICS)
#define physids_weight(map) bitmap_weight((map).mask, MAX_APICS)
Index: 2.6.7-rc2-mm2/include/asm-x86_64/topology.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/asm-x86_64/topology.h 2004-06-03 06:42:03.000000000 -0700
+++ 2.6.7-rc2-mm2/include/asm-x86_64/topology.h 2004-06-03 07:08:09.000000000 -0700
@@ -20,9 +20,11 @@
#define node_to_first_cpu(node) (__ffs(node_to_cpumask[node]))
#define node_to_cpumask(node) (node_to_cpumask[node])

-static inline unsigned long pcibus_to_cpumask(int bus)
+static inline cpumask_t pcibus_to_cpumask(int bus)
{
- return mp_bus_to_cpumask[bus] & cpu_online_map;
+ cpumask_t tmp;
+ cpus_and(tmp, mp_bus_to_cpumask[bus], cpu_online_map);
+ return tmp;
}

#define NODE_BALANCE_RATE 30 /* CHECKME */


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:48:05

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 1/10 cpu_present_map real even on non-smp

cpumask 1/10 cpu_present_map real even on non-smp

This patch makes cpu_present_map a real map for all
configurations, instead of a constant for non-SMP.
It also moves the definition of cpu_present_map out of
kernel/cpu.c into kernel/sched.c, because cpu.c isn't
compiled into non-SMP kernels.

The pattern is that each of the possible, present and
online cpu maps are actual kernel global cpumask_t
variables, for all configurations. They are documented
in include/linux/cpumask.h. Some of the UP (NR_CPUS=1)
code cheats, and hardcodes the assumption that the single
bit position of these maps is always set, as an optimization.


include/linux/cpumask.h | 13 +++++--------
kernel/cpu.c | 8 --------
kernel/sched.c | 10 ++++++++++
3 files changed, 15 insertions(+), 16 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/linux/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/cpumask.h 2004-06-03 05:39:37.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/cpumask.h 2004-06-03 05:56:00.000000000 -0700
@@ -10,15 +10,12 @@

extern cpumask_t cpu_online_map;
extern cpumask_t cpu_possible_map;
-extern cpumask_t cpu_present_map;

#define num_online_cpus() cpus_weight(cpu_online_map)
#define num_possible_cpus() cpus_weight(cpu_possible_map)
-#define num_present_cpus() cpus_weight(cpu_present_map)

#define cpu_online(cpu) cpu_isset(cpu, cpu_online_map)
#define cpu_possible(cpu) cpu_isset(cpu, cpu_possible_map)
-#define cpu_present(cpu) cpu_isset(cpu, cpu_present_map)

#define for_each_cpu_mask(cpu, mask) \
for (cpu = first_cpu_const(mk_cpumask_const(mask)); \
@@ -27,25 +24,25 @@

#define for_each_cpu(cpu) for_each_cpu_mask(cpu, cpu_possible_map)
#define for_each_online_cpu(cpu) for_each_cpu_mask(cpu, cpu_online_map)
-#define for_each_present_cpu(cpu) for_each_cpu_mask(cpu, cpu_present_map)
#else
#define cpu_online_map cpumask_of_cpu(0)
#define cpu_possible_map cpumask_of_cpu(0)
-#define cpu_present_map cpumask_of_cpu(0)

#define num_online_cpus() 1
#define num_possible_cpus() 1
-#define num_present_cpus() 1

#define cpu_online(cpu) ({ BUG_ON((cpu) != 0); 1; })
#define cpu_possible(cpu) ({ BUG_ON((cpu) != 0); 1; })
-#define cpu_present(cpu) ({ BUG_ON((cpu) != 0); 1; })

#define for_each_cpu(cpu) for (cpu = 0; cpu < 1; cpu++)
#define for_each_online_cpu(cpu) for (cpu = 0; cpu < 1; cpu++)
-#define for_each_present_cpu(cpu) for (cpu = 0; cpu < 1; cpu++)
#endif

+extern cpumask_t cpu_present_map;
+#define num_present_cpus() cpus_weight(cpu_present_map)
+#define cpu_present(cpu) cpu_isset(cpu, cpu_present_map)
+#define for_each_present_cpu(cpu) for_each_cpu_mask(cpu, cpu_present_map)
+
#define cpumask_scnprintf(buf, buflen, map) \
bitmap_scnprintf(buf, buflen, cpus_addr(map), NR_CPUS)

Index: 2.6.7-rc2-mm2/kernel/cpu.c
===================================================================
--- 2.6.7-rc2-mm2.orig/kernel/cpu.c 2004-06-03 05:39:46.000000000 -0700
+++ 2.6.7-rc2-mm2/kernel/cpu.c 2004-06-03 05:56:00.000000000 -0700
@@ -20,14 +20,6 @@
DECLARE_MUTEX(cpucontrol);

static struct notifier_block *cpu_chain;
-/*
- * Represents all cpu's present in the system
- * In systems capable of hotplug, this map could dynamically grow
- * as new cpu's are detected in the system via any platform specific
- * method, such as ACPI for e.g.
- */
-cpumask_t cpu_present_map;
-EXPORT_SYMBOL(cpu_present_map);

/* Need to know about CPUs going up/down? */
int register_cpu_notifier(struct notifier_block *nb)
Index: 2.6.7-rc2-mm2/kernel/sched.c
===================================================================
--- 2.6.7-rc2-mm2.orig/kernel/sched.c 2004-06-03 05:46:32.000000000 -0700
+++ 2.6.7-rc2-mm2/kernel/sched.c 2004-06-03 05:56:00.000000000 -0700
@@ -3109,6 +3109,16 @@
return retval;
}

+/*
+ * Represents all cpu's present in the system
+ * In systems capable of hotplug, this map could dynamically grow
+ * as new cpu's are detected in the system via any platform specific
+ * method, such as ACPI for e.g.
+ */
+
+cpumask_t cpu_present_map;
+EXPORT_SYMBOL(cpu_present_map);
+
/**
* sys_sched_getaffinity - get the cpu affinity of a process
* @pid: pid of the process


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-03 17:48:06

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] cpumask 3/10 bitmap inlining and optimizations

cpumask 3/10 bitmap inlining and optimizations

These bitmap improvements make it a suitable basis for
fully supporting cpumask_t and nodemask_t. Inline macros
with compile-time checks enable generating tight code on
both small and large systems (large meaning cpumask_t
requires more than one unsigned long's worth of bits).

The existing bitmap_<op> macros in lib/bitmap.c
are renamed to __bitmap_<op>, and wrappers for each
bitmap_<op> are exposed in include/linux/bitmap.h

This patch _includes_ Bill Irwins rewrite of the
bitmap_shift operators to not require a fixed length
intermediate bitmap.

Improved comments list each available operator for easy
browsing.

include/linux/bitmap.h | 266 ++++++++++++++++++++++----
lib/bitmap.c | 91 ++++----
2 files changed, 276 insertions(+), 81 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>


Index: 2.6.7-rc2-mm2/include/linux/bitmap.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/bitmap.h 2004-06-03 05:56:31.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/bitmap.h 2004-06-03 05:57:06.000000000 -0700
@@ -3,57 +3,249 @@

#ifndef __ASSEMBLY__

-#include <linux/config.h>
-#include <linux/compiler.h>
#include <linux/types.h>
-#include <linux/kernel.h>
#include <linux/bitops.h>
#include <linux/string.h>

-int bitmap_empty(const unsigned long *bitmap, int bits);
-int bitmap_full(const unsigned long *bitmap, int bits);
-int bitmap_equal(const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-void bitmap_complement(unsigned long *dst, const unsigned long *src, int bits);
+/*
+ * bitmaps provide bit arrays that consume one or more unsigned
+ * longs. The bitmap interface and available operations are listed
+ * here, in bitmap.h
+ *
+ * Function implementations generic to all architectures are in
+ * lib/bitmap.c. Functions implementations that are architecture
+ * specific are in various include/asm-<arch>/bitops.h headers
+ * and other arch/<arch> specific files.
+ *
+ * See lib/bitmap.c for more details.
+ */
+
+/*
+ * The available bitmap operations and their rough meaning in the
+ * case that the bitmap is a single unsigned long are thus:
+ *
+ * bitmap_zero(dst, nbits) *dst = 0UL
+ * bitmap_fill(dst, nbits) *dst = ~0UL
+ * bitmap_copy(dst, src, nbits) *dst = *src
+ * bitmap_and(dst, src1, src2, nbits) *dst = *src1 & *src2
+ * bitmap_or(dst, src1, src2, nbits) *dst = *src1 | *src2
+ * bitmap_xor(dst, src1, src2, nbits) *dst = *src1 ^ *src2
+ * bitmap_andnot(dst, src1, src2, nbits) *dst = *src1 & ~(*src2)
+ * bitmap_complement(dst, src, nbits) *dst = ~(*src)
+ * bitmap_equal(src1, src2, nbits) Are *src1 and *src2 equal?
+ * bitmap_intersects(src1, src2, nbits) Do *src1 and *src2 overlap?
+ * bitmap_subset(src1, src2, nbits) Is *src1 a subset of *src2?
+ * bitmap_empty(src, nbits) Are all bits zero in *src?
+ * bitmap_full(src, nbits) Are all bits set in *src?
+ * bitmap_weight(src, nbits) Hamming Weight: number set bits
+ * bitmap_shift_right(dst, src, n, nbits) *dst = *src >> n
+ * bitmap_shift_left(dst, src, n, nbits) *dst = *src << n
+ * bitmap_scnprintf(buf, len, src, nbits) Print bitmap src to buf
+ * bitmap_parse(ubuf, ulen, dst, nbits) Parse bitmap dst from buf
+ */
+
+/*
+ * Also the following operations in asm/bitops.h apply to bitmaps.
+ *
+ * set_bit(bit, addr) *addr |= bit
+ * clear_bit(bit, addr) *addr &= ~bit
+ * change_bit(bit, addr) *addr ^= bit
+ * test_bit(bit, addr) Is bit set in *addr?
+ * test_and_set_bit(bit, addr) Set bit and return old value
+ * test_and_clear_bit(bit, addr) Clear bit and return old value
+ * test_and_change_bit(bit, addr) Change bit and return old value
+ * find_first_zero_bit(addr, nbits) Position first zero bit in *addr
+ * find_first_bit(addr, nbits) Position first set bit in *addr
+ * find_next_zero_bit(addr, nbits, bit) Position next zero bit in *addr >= bit
+ * find_next_bit(addr, nbits, bit) Position next set bit in *addr >= bit
+ */
+
+/*
+ * The DECLARE_BITMAP(name,bits) macro, in linux/types.h, can be used
+ * to declare an array named 'name' of just enough unsigned longs to
+ * contain all bit positions from 0 to 'bits' - 1.
+ */
+
+/*
+ * lib/bitmap.c provides these functions:
+ */
+
+extern int __bitmap_empty(const unsigned long *bitmap, int bits);
+extern int __bitmap_full(const unsigned long *bitmap, int bits);
+extern int __bitmap_equal(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_complement(unsigned long *dst, const unsigned long *src,
+ int bits);
+extern void __bitmap_shift_right(unsigned long *dst,
+ const unsigned long *src, int shift, int bits);
+extern void __bitmap_shift_left(unsigned long *dst,
+ const unsigned long *src, int shift, int bits);
+extern void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_intersects(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_subset(const unsigned long *bitmap1,
+ const unsigned long *bitmap2, int bits);
+extern int __bitmap_weight(const unsigned long *bitmap, int bits);
+
+extern int bitmap_scnprintf(char *buf, unsigned int len,
+ const unsigned long *src, int nbits);
+extern int bitmap_parse(const char __user *ubuf, unsigned int ulen,
+ unsigned long *dst, int nbits);
+
+#define BITMAP_LAST_WORD_MASK(nbits) \
+( \
+ ((nbits) % BITS_PER_LONG) ? \
+ (1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL \
+)

-static inline void bitmap_zero(unsigned long *bitmap, int bits)
+static inline void bitmap_zero(unsigned long *dst, int nbits)
{
- memset(bitmap, 0, BITS_TO_LONGS(bits)*sizeof(unsigned long));
+ if (nbits <= BITS_PER_LONG)
+ *dst = 0UL;
+ else {
+ int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ memset(dst, 0, len);
+ }
}

-static inline void bitmap_fill(unsigned long *bitmap, int bits)
+static inline void bitmap_fill(unsigned long *dst, int nbits)
{
- memset(bitmap, 0xff, BITS_TO_LONGS(bits)*sizeof(unsigned long));
+ size_t nlongs = BITS_TO_LONGS(nbits);
+ if (nlongs > 1) {
+ int len = (nlongs - 1) * sizeof(unsigned long);
+ memset(dst, 0xff, len);
+ }
+ dst[nlongs - 1] = BITMAP_LAST_WORD_MASK(nbits);
}

-static inline void bitmap_copy(unsigned long *dst,
- const unsigned long *src, int bits)
+static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
+ int nbits)
{
- int len = BITS_TO_LONGS(bits)*sizeof(unsigned long);
- memcpy(dst, src, len);
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src;
+ else {
+ int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ memcpy(dst, src, len);
+ }
}

-void bitmap_shift_right(unsigned long *dst,
- const unsigned long *src, int shift, int bits);
-void bitmap_shift_left(unsigned long *dst,
- const unsigned long *src, int shift, int bits);
-void bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-void bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-void bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-void bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-int bitmap_intersects(const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-int bitmap_subset(const unsigned long *bitmap1,
- const unsigned long *bitmap2, int bits);
-int bitmap_weight(const unsigned long *bitmap, int bits);
-int bitmap_scnprintf(char *buf, unsigned int buflen,
- const unsigned long *maskp, int bits);
-int bitmap_parse(const char __user *ubuf, unsigned int ubuflen,
- unsigned long *maskp, int bits);
+static inline void bitmap_and(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src1 & *src2;
+ else
+ __bitmap_and(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src1 | *src2;
+ else
+ __bitmap_or(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src1 ^ *src2;
+ else
+ __bitmap_xor(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_andnot(unsigned long *dst, const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src1 & ~(*src2);
+ else
+ __bitmap_andnot(dst, src1, src2, nbits);
+}
+
+static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
+ int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = ~(*src) & BITMAP_LAST_WORD_MASK(nbits);
+ else
+ __bitmap_complement(dst, src, nbits);
+}
+
+static inline int bitmap_equal(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ return ! ((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_equal(src1, src2, nbits);
+}
+
+static inline int bitmap_intersects(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+ else
+ return __bitmap_intersects(src1, src2, nbits);
+}
+
+static inline int bitmap_subset(const unsigned long *src1,
+ const unsigned long *src2, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ return ! ((*src1 & ~(*src2)) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_subset(src1, src2, nbits);
+}
+
+static inline int bitmap_empty(const unsigned long *src, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_empty(src, nbits);
+}
+
+static inline int bitmap_full(const unsigned long *src, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
+ else
+ return __bitmap_full(src, nbits);
+}
+
+static inline int bitmap_weight(const unsigned long *src, int nbits)
+{
+ return __bitmap_weight(src, nbits);
+}
+
+static inline void bitmap_shift_right(unsigned long *dst,
+ const unsigned long *src, int n, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = *src >> n;
+ else
+ __bitmap_shift_right(dst, src, n, nbits);
+}
+
+static inline void bitmap_shift_left(unsigned long *dst,
+ const unsigned long *src, int n, int nbits)
+{
+ if (nbits <= BITS_PER_LONG)
+ *dst = (*src << n) & BITMAP_LAST_WORD_MASK(nbits);
+ else
+ __bitmap_shift_left(dst, src, n, nbits);
+}

#endif /* __ASSEMBLY__ */

Index: 2.6.7-rc2-mm2/lib/bitmap.c
===================================================================
--- 2.6.7-rc2-mm2.orig/lib/bitmap.c 2004-06-03 05:56:31.000000000 -0700
+++ 2.6.7-rc2-mm2/lib/bitmap.c 2004-06-03 05:57:06.000000000 -0700
@@ -15,7 +15,8 @@
/*
* bitmaps provide an array of bits, implemented using an an
* array of unsigned longs. The number of valid bits in a
- * given bitmap need not be an exact multiple of BITS_PER_LONG.
+ * given bitmap does _not_ need to be an exact multiple of
+ * BITS_PER_LONG.
*
* The possible unused bits in the last, partially used word
* of a bitmap are 'don't care'. The implementation makes
@@ -30,9 +31,14 @@
* if you don't input any bitmaps to these ops that have some
* unused bits set, then they won't output any set unused bits
* in output bitmaps.
+ *
+ * The byte ordering of bitmaps is more natural on little
+ * endian architectures. See the big-endian headers
+ * include/asm-ppc64/bitops.h and include/asm-s390/bitops.h
+ * for the best explanations of this ordering.
*/

-int bitmap_empty(const unsigned long *bitmap, int bits)
+int __bitmap_empty(const unsigned long *bitmap, int bits)
{
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -40,14 +46,14 @@
return 0;

if (bits % BITS_PER_LONG)
- if (bitmap[k] & ((1UL << (bits % BITS_PER_LONG)) - 1))
+ if (bitmap[k] & BITMAP_LAST_WORD_MASK(bits))
return 0;

return 1;
}
-EXPORT_SYMBOL(bitmap_empty);
+EXPORT_SYMBOL(__bitmap_empty);

-int bitmap_full(const unsigned long *bitmap, int bits)
+int __bitmap_full(const unsigned long *bitmap, int bits)
{
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
@@ -55,14 +61,14 @@
return 0;

if (bits % BITS_PER_LONG)
- if (~bitmap[k] & ((1UL << (bits % BITS_PER_LONG)) - 1))
+ if (~bitmap[k] & BITMAP_LAST_WORD_MASK(bits))
return 0;

return 1;
}
-EXPORT_SYMBOL(bitmap_full);
+EXPORT_SYMBOL(__bitmap_full);

-int bitmap_equal(const unsigned long *bitmap1,
+int __bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k, lim = bits/BITS_PER_LONG;
@@ -71,27 +77,26 @@
return 0;

if (bits % BITS_PER_LONG)
- if ((bitmap1[k] ^ bitmap2[k]) &
- ((1UL << (bits % BITS_PER_LONG)) - 1))
+ if ((bitmap1[k] ^ bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
return 0;

return 1;
}
-EXPORT_SYMBOL(bitmap_equal);
+EXPORT_SYMBOL(__bitmap_equal);

-void bitmap_complement(unsigned long *dst, const unsigned long *src, int bits)
+void __bitmap_complement(unsigned long *dst, const unsigned long *src, int bits)
{
int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
dst[k] = ~src[k];

if (bits % BITS_PER_LONG)
- dst[k] = ~src[k] & ((1UL << (bits % BITS_PER_LONG)) - 1);
+ dst[k] = ~src[k] & BITMAP_LAST_WORD_MASK(bits);
}
-EXPORT_SYMBOL(bitmap_complement);
+EXPORT_SYMBOL(__bitmap_complement);

/*
- * bitmap_shift_right - logical right shift of the bits in a bitmap
+ * __bitmap_shift_right - logical right shift of the bits in a bitmap
* @dst - destination bitmap
* @src - source bitmap
* @nbits - shift by this many bits
@@ -101,7 +106,7 @@
* direction. Zeros are fed into the vacated MS positions and the
* LS bits shifted off the bottom are lost.
*/
-void bitmap_shift_right(unsigned long *dst,
+void __bitmap_shift_right(unsigned long *dst,
const unsigned long *src, int shift, int bits)
{
int k, lim = BITS_TO_LONGS(bits), left = bits % BITS_PER_LONG;
@@ -131,10 +136,11 @@
if (off)
memset(&dst[lim - off], 0, off*sizeof(unsigned long));
}
-EXPORT_SYMBOL(bitmap_shift_right);
+EXPORT_SYMBOL(__bitmap_shift_right);
+

/*
- * bitmap_shift_left - logical left shift of the bits in a bitmap
+ * __bitmap_shift_left - logical left shift of the bits in a bitmap
* @dst - destination bitmap
* @src - source bitmap
* @nbits - shift by this many bits
@@ -144,7 +150,8 @@
* direction. Zeros are fed into the vacated LS bit positions
* and those MS bits shifted off the top are lost.
*/
-void bitmap_shift_left(unsigned long *dst,
+
+void __bitmap_shift_left(unsigned long *dst,
const unsigned long *src, int shift, int bits)
{
int k, lim = BITS_TO_LONGS(bits), left = bits % BITS_PER_LONG;
@@ -170,9 +177,9 @@
if (off)
memset(dst, 0, off*sizeof(unsigned long));
}
-EXPORT_SYMBOL(bitmap_shift_left);
+EXPORT_SYMBOL(__bitmap_shift_left);

-void bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k;
@@ -181,9 +188,9 @@
for (k = 0; k < nr; k++)
dst[k] = bitmap1[k] & bitmap2[k];
}
-EXPORT_SYMBOL(bitmap_and);
+EXPORT_SYMBOL(__bitmap_and);

-void bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
+void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k;
@@ -192,9 +199,9 @@
for (k = 0; k < nr; k++)
dst[k] = bitmap1[k] | bitmap2[k];
}
-EXPORT_SYMBOL(bitmap_or);
+EXPORT_SYMBOL(__bitmap_or);

-void bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
+void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k;
@@ -203,9 +210,9 @@
for (k = 0; k < nr; k++)
dst[k] = bitmap1[k] ^ bitmap2[k];
}
-EXPORT_SYMBOL(bitmap_xor);
+EXPORT_SYMBOL(__bitmap_xor);

-void bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k;
@@ -214,9 +221,9 @@
for (k = 0; k < nr; k++)
dst[k] = bitmap1[k] & ~bitmap2[k];
}
-EXPORT_SYMBOL(bitmap_andnot);
+EXPORT_SYMBOL(__bitmap_andnot);

-int bitmap_intersects(const unsigned long *bitmap1,
+int __bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k, lim = bits/BITS_PER_LONG;
@@ -225,14 +232,13 @@
return 1;

if (bits % BITS_PER_LONG)
- if ((bitmap1[k] & bitmap2[k]) &
- ((1UL << (bits % BITS_PER_LONG)) - 1))
+ if ((bitmap1[k] & bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
return 1;
return 0;
}
-EXPORT_SYMBOL(bitmap_intersects);
+EXPORT_SYMBOL(__bitmap_intersects);

-int bitmap_subset(const unsigned long *bitmap1,
+int __bitmap_subset(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
int k, lim = bits/BITS_PER_LONG;
@@ -241,15 +247,14 @@
return 0;

if (bits % BITS_PER_LONG)
- if ((bitmap1[k] & ~bitmap2[k]) &
- ((1UL << (bits % BITS_PER_LONG)) - 1))
+ if ((bitmap1[k] & ~bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
return 0;
return 1;
}
-EXPORT_SYMBOL(bitmap_subset);
+EXPORT_SYMBOL(__bitmap_subset);

#if BITS_PER_LONG == 32
-int bitmap_weight(const unsigned long *bitmap, int bits)
+int __bitmap_weight(const unsigned long *bitmap, int bits)
{
int k, w = 0, lim = bits/BITS_PER_LONG;

@@ -257,13 +262,12 @@
w += hweight32(bitmap[k]);

if (bits % BITS_PER_LONG)
- w += hweight32(bitmap[k] &
- ((1UL << (bits % BITS_PER_LONG)) - 1));
+ w += hweight32(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));

return w;
}
#else
-int bitmap_weight(const unsigned long *bitmap, int bits)
+int __bitmap_weight(const unsigned long *bitmap, int bits)
{
int k, w = 0, lim = bits/BITS_PER_LONG;

@@ -271,13 +275,12 @@
w += hweight64(bitmap[k]);

if (bits % BITS_PER_LONG)
- w += hweight64(bitmap[k] &
- ((1UL << (bits % BITS_PER_LONG)) - 1));
+ w += hweight64(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));

return w;
}
#endif
-EXPORT_SYMBOL(bitmap_weight);
+EXPORT_SYMBOL(__bitmap_weight);

/*
* Bitmap printing & parsing functions: first version by Bill Irwin,
@@ -394,7 +397,7 @@
if (nchunks == 0 && chunk == 0)
continue;

- bitmap_shift_left(maskp, maskp, CHUNKSZ, nmaskbits);
+ __bitmap_shift_left(maskp, maskp, CHUNKSZ, nmaskbits);
*maskp |= chunk;
nchunks++;
nbits += (nchunks == 1) ? nbits_to_hold_value(chunk) : CHUNKSZ;


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 00:04:49

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Paul Jackson <[email protected]> wrote:
>
> Major rewrite of cpumask to use a single implementation,
> as a struct-wrapped bitmap.
>
> ...
>
> +typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
>

We avoided doing this because in some situations the compiler will not pass
such a cpumask_t in a register, ever. An efficiency problem on sparc64,
apparently.

2004-06-04 00:23:12

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew Morton <[email protected]> wrote:
>
> > +typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
> >
>
> We avoided doing this because in some situations the compiler will not pass
> such a cpumask_t in a register, ever. An efficiency problem on sparc64,
> apparently.

Although for some reason your patches shrink my sparc64 build from

text data bss dec hex filename
3508730 895000 302656 4706386 47d052 vmlinux
to 3507586 895080 302656 4705322 47cc2a vmlinux

so we can probably evade the wrath-of-davem.

2004-06-04 01:51:12

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, 2004-06-04 at 03:10, Paul Jackson wrote:
> cpumask 5/10 rewrite cpumask.h - single bitmap based implementation
>
> Major rewrite of cpumask to use a single implementation,
> as a struct-wrapped bitmap.

Go Paul!

> + * 1) The 'type-checked' form of cpu_isset() causes gcc (3.3.2, anyway)
> + * to generate slightly worse code. Note for example the additional
> + * 40 lines of assembly code compiling the "for each possible cpu"
> + * loops buried in the disk_stat_read() macros calls when compiling
> + * drivers/block/genhd.c (arch i386, CONFIG_SMP=y). So use a simple
> + * one-line #define for cpu_isset(), instead of wrapping an inline
> + * inside a macro, the way we do the other calls.

Hmm...

> +/* No static inline type checking - see Subtlety (1) above. */
> +#define cpu_isset(cpu, cpumask) test_bit((cpu), (cpumask).bits)

How about something really grungy like:

#define cpu_isset(cpu, cpumask) \
({ __typeof__(cpumask) __cpumask; \
(void)(&__cpumask) == (cpumask_t *)0); \
test_bit((cpu), (cpumask).bits); })

> +#define cpus_addr(src) ((src).bits)

We've discussed this before when talking about whether it'd be easier to
just make people use raw bitop functions directly, so I know we have
philosophical differences here.

So, opinion alert: if I were doing this, I'd probably live without this
macro; in my mind it crosses the "too much abstraction" line. I did
momentarily wonder what this macro did when I saw it used in the
succeeding patches.

But it's a minor nit; thanks for doing these.

Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-06-04 02:02:43

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Rusty Russell wrote:

> We've discussed this before when talking about whether it'd be easier to
> just make people use raw bitop functions directly, so I know we have
> philosophical differences here.
>
> So, opinion alert: if I were doing this, I'd probably live without this
> macro; in my mind it crosses the "too much abstraction" line. I did
> momentarily wonder what this macro did when I saw it used in the
> succeeding patches.
>

I think if you don't like that abstraction, there should be no
cpumask type at all, just use the bitmap.

I don't see what you gain from having the cpumask type but having
to get at its internals with the bitop functions.


> But it's a minor nit; thanks for doing these.
>

Yeah it looks quite good

2004-06-04 02:20:02

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, 2004-06-04 at 12:02, Nick Piggin wrote:
> Rusty Russell wrote:
> > So, opinion alert: if I were doing this, I'd probably live without this
> > macro; in my mind it crosses the "too much abstraction" line. I did
> > momentarily wonder what this macro did when I saw it used in the
> > succeeding patches.
>
> I think if you don't like that abstraction, there should be no
> cpumask type at all, just use the bitmap.
>
> I don't see what you gain from having the cpumask type but having
> to get at its internals with the bitop functions.

Yes, that was the previous debate to which I alluded, although having a
separate type helps make typesafe functions.

But to clarify: my question here was over the cpu_addr() macro.

Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-06-04 02:43:07

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew wrote:
> Paul Jackson <[email protected]> wrote:
> >
> > Major rewrite of cpumask to use a single implementation,
> > as a struct-wrapped bitmap.
> >
> > ...
> >
> > +typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
>
> We avoided doing this because in some situations the compiler will not pass
> such a cpumask_t in a register, ever. An efficiency problem on sparc64,
> apparently.

When I contacted Dave Miller about this specific problem on March 26,
2004, he explained that this was more of a problem on sparc32, and that
since SMP on sparc32 wasn't in robust shape yet (my words), he seemed
(from what I could tell) not to be objecting too strongly.

I've added Dave to the Cc list, in case he wants to add or something, or
correct my efforts to represent his position.

At this point, if sparc (32 or 64) is a concern, I'd look into adding
arch-specific code for that case. The overall cleanup of cpumasks
pleases me enough that I would seek to minimize the impact on the
generic case for specific arch's that require an alternative
implementation.

My current understanding is that such a special case is not required
for sparc, or any other arch.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 02:53:27

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew wrote:
> Although for some reason your patches shrink my sparc64 build from
>
> text data bss dec hex filename
> 3508730 895000 302656 4706386 47d052 vmlinux
> to 3507586 895080 302656 4705322 47cc2a vmlinux

Yes - these are typical of the kernel text space reductions that
I am seeing as well. Various little tweaks here and there, with
a careful eye to changes in the output of "nm -S".

The one exception being ia64, where I saved an additional
15000 or so kernel text bytes by uninlining find_next_bit().

Other arch's might want to try that same optimization. I ran
out of gas, and lacked the resources, to persue that.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 02:56:38

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Thu, 3 Jun 2004 19:47:55 -0700
Paul Jackson <[email protected]> wrote:

> > We avoided doing this because in some situations the compiler will not pass
> > such a cpumask_t in a register, ever. An efficiency problem on sparc64,
> > apparently.
>
> When I contacted Dave Miller about this specific problem on March 26,
> 2004, he explained that this was more of a problem on sparc32, and that
> since SMP on sparc32 wasn't in robust shape yet (my words), he seemed
> (from what I could tell) not to be objecting too strongly.
>
> I've added Dave to the Cc list, in case he wants to add or something, or
> correct my efforts to represent his position.

Those were my feelings, but looking at this specific case why
can't you just change the type to be an aggregate when possible?
Is it really that hard?

2004-06-04 04:27:22

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Rusty wrote:
> > +/* No static inline type checking - see Subtlety (1) above. */
> > +#define cpu_isset(cpu, cpumask) test_bit((cpu), (cpumask).bits)
>
> How about something really grungy like:
>
> #define cpu_isset(cpu, cpumask) \
> ({ __typeof__(cpumask) __cpumask; \
> (void)(&__cpumask) == (cpumask_t *)0); \
> test_bit((cpu), (cpumask).bits); })

Well ... we agree on the "grungy" part ... ;).

Your flavor has the same problem as I saw with the static inline nested
inside a #define macro flavor. On i386, SMP, gcc 3.3.2, when the
cpu_isset() is part of a for-loop control, both the normal and 'grungy'
flavors cost one extra jump instruction, whereas the flavor I ended up
using places one chunk of code more optimally, saving a hard jump
instruction.

This saves 196 bytes of kernel text space.

If you can show me a type checked cpu_isset() that generates code
as tight as what I ended up using, let me know.

Or if you would prefer I spend the 196 bytes to get this type checked,
that's worthy of consideration as well.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 04:29:48

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] cpumask 10/10 optimize various uses of new cpumasks

On Fri, 2004-06-04 at 03:11, Paul Jackson wrote:
> cpumask 10/10 optimize various uses of new cpumasks
>
> Make use of for_each_cpu_mask() macro to simplify and optimize
> a couple of sparc64 per-CPU loops.

This means we can finally do this, too... Yes!

Name: Cleanup cpumask_t Temporaries
Status: Booted on 2.6.7-rc2-bk4
Depends: Misc/cpumask-tour-de-force.patch.gz
Signed-off-by: Rusty Russell <[email protected]>

Paul Jackson's cpumask tour-de-force allows us to get rid of those
stupid temporaries which we used to hold CPU_MASK_ALL to hand them to
functions. This used to break NR_CPUS > BITS_PER_LONG.

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/irq.c .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/irq.c
--- .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/irq.c 2004-05-31 09:57:03.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/irq.c 2004-06-04 12:06:09.000000000 +1000
@@ -738,7 +738,6 @@ static int irq_affinity_write_proc (stru
irq_desc_t *desc = get_irq_desc(irq);
int ret;
cpumask_t new_value, tmp;
- cpumask_t allcpus = CPU_MASK_ALL;

if (!desc->handler->set_affinity)
return -EIO;
@@ -753,7 +752,7 @@ static int irq_affinity_write_proc (stru
* NR_CPUS == 32 and cpumask is a long), so we mask it here to
* be consistent.
*/
- cpus_and(new_value, new_value, allcpus);
+ cpus_and(new_value, new_value, CPU_MASK_ALL);

/*
* Grab lock here so cpu_online_map can't change, and also
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/rtasd.c .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/rtasd.c
--- .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/rtasd.c 2004-06-04 12:01:24.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/rtasd.c 2004-06-04 12:06:30.000000000 +1000
@@ -364,7 +364,6 @@ static int rtasd(void *unused)
unsigned int err_type;
int cpu = 0;
int event_scan = rtas_token("event-scan");
- cpumask_t all = CPU_MASK_ALL;
int rc;

daemonize("rtasd");
@@ -419,7 +418,7 @@ static int rtasd(void *unused)
for (;;) {
set_cpus_allowed(current, cpumask_of_cpu(cpu));
do_event_scan(event_scan);
- set_cpus_allowed(current, all);
+ set_cpus_allowed(current, CPU_MASK_ALL);

/* Drop hotplug lock, and sleep for a bit (at least
* one second since some machines have problems if we
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/xics.c .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/xics.c
--- .9851-linux-2.6.7-rc2-bk4/arch/ppc64/kernel/xics.c 2004-05-31 09:57:04.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/arch/ppc64/kernel/xics.c 2004-06-04 12:08:16.000000000 +1000
@@ -240,14 +240,13 @@ static unsigned int real_irq_to_virt(uns
static int get_irq_server(unsigned int irq)
{
cpumask_t cpumask = irq_affinity[irq];
- cpumask_t allcpus = CPU_MASK_ALL;
cpumask_t tmp = CPU_MASK_NONE;
unsigned int server;

#ifdef CONFIG_IRQ_ALL_CPUS
/* For the moment only implement delivery to all cpus or one cpu */
if (smp_threads_ready) {
- if (cpus_equal(cpumask, allcpus)) {
+ if (cpus_equal(cpumask, CPU_MASK_ALL)) {
server = default_distrib_server;
} else {
cpus_and(tmp, cpu_online_map, cpumask);
@@ -616,8 +615,7 @@ static void xics_set_affinity(unsigned i
long status;
unsigned long xics_status[2];
unsigned long newmask;
- cpumask_t allcpus = CPU_MASK_ALL;
cpumask_t tmp = CPU_MASK_NONE;

irq = virt_irq_to_real(irq_offset_down(virq));
if (irq == XICS_IPI || irq == NO_IRQ)
@@ -632,7 +629,7 @@ static void xics_set_affinity(unsigned i
}

/* For the moment only implement delivery to all cpus or one cpu */
- if (cpus_equal(cpumask, allcpus)) {
+ if (cpus_equal(cpumask, CPU_MASK_ALL)) {
newmask = default_distrib_server;
} else {
cpus_and(tmp, cpu_online_map, cpumask);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/include/asm-i386/mach-numaq/mach_apic.h .9851-linux-2.6.7-rc2-bk4.updated/include/asm-i386/mach-numaq/mach_apic.h
--- .9851-linux-2.6.7-rc2-bk4/include/asm-i386/mach-numaq/mach_apic.h 2004-06-04 12:01:24.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/include/asm-i386/mach-numaq/mach_apic.h 2004-06-04 12:03:54.000000000 +1000
@@ -8,8 +8,7 @@

static inline cpumask_t target_cpus(void)
{
- cpumask_t tmp = CPU_MASK_ALL;
- return tmp;
+ return CPU_MASK_ALL;
}

#define TARGET_CPUS (target_cpus())
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/include/asm-i386/mach-summit/mach_apic.h .9851-linux-2.6.7-rc2-bk4.updated/include/asm-i386/mach-summit/mach_apic.h
--- .9851-linux-2.6.7-rc2-bk4/include/asm-i386/mach-summit/mach_apic.h 2004-06-04 12:01:24.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/include/asm-i386/mach-summit/mach_apic.h 2004-06-04 12:04:00.000000000 +1000
@@ -19,8 +19,7 @@

static inline cpumask_t target_cpus(void)
{
- cpumask_t tmp = CPU_MASK_ALL;
- return tmp;
+ return CPU_MASK_ALL;
}
#define TARGET_CPUS (target_cpus())

diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/kernel/kmod.c .9851-linux-2.6.7-rc2-bk4.updated/kernel/kmod.c
--- .9851-linux-2.6.7-rc2-bk4/kernel/kmod.c 2004-05-31 09:57:40.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/kernel/kmod.c 2004-06-04 12:01:59.000000000 +1000
@@ -154,7 +154,6 @@ static int ____call_usermodehelper(void
{
struct subprocess_info *sub_info = data;
int retval;
- cpumask_t mask = CPU_MASK_ALL;

/* Unblock all signals. */
flush_signals(current);
@@ -165,7 +164,7 @@ static int ____call_usermodehelper(void
spin_unlock_irq(&current->sighand->siglock);

/* We can run anywhere, unlike our parent keventd(). */
- set_cpus_allowed(current, mask);
+ set_cpus_allowed(current, CPU_MASK_ALL);

retval = -EPERM;
if (current->fs->root)
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/kernel/kthread.c .9851-linux-2.6.7-rc2-bk4.updated/kernel/kthread.c
--- .9851-linux-2.6.7-rc2-bk4/kernel/kthread.c 2004-06-04 08:56:09.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/kernel/kthread.c 2004-06-04 12:03:34.000000000 +1000
@@ -65,7 +65,6 @@ static int kthread(void *_create)
void *data;
sigset_t blocked;
int ret = -EINTR;
- cpumask_t mask = CPU_MASK_ALL;

kthread_exit_files();

@@ -79,7 +78,7 @@ static int kthread(void *_create)
flush_signals(current);

/* By default we can run anywhere, unlike keventd. */
- set_cpus_allowed(current, mask);
+ set_cpus_allowed(current, CPU_MASK_ALL);

/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_INTERRUPTIBLE);
diff -urpN --exclude TAGS -X /home/rusty/devel/kernel/kernel-patches/current-dontdiff --minimal .9851-linux-2.6.7-rc2-bk4/kernel/sched.c .9851-linux-2.6.7-rc2-bk4.updated/kernel/sched.c
--- .9851-linux-2.6.7-rc2-bk4/kernel/sched.c 2004-06-04 12:01:25.000000000 +1000
+++ .9851-linux-2.6.7-rc2-bk4.updated/kernel/sched.c 2004-06-04 12:02:35.000000000 +1000
@@ -3906,16 +3906,15 @@ void __init sched_init(void)
/* Set up an initial dummy domain for early boot */
static struct sched_domain sched_domain_init;
static struct sched_group sched_group_init;
- cpumask_t cpu_mask_all = CPU_MASK_ALL;

memset(&sched_domain_init, 0, sizeof(struct sched_domain));
- sched_domain_init.span = cpu_mask_all;
+ sched_domain_init.span = CPU_MASK_ALL;
sched_domain_init.groups = &sched_group_init;
sched_domain_init.last_balance = jiffies;
sched_domain_init.balance_interval = INT_MAX; /* Don't balance */

memset(&sched_group_init, 0, sizeof(struct sched_group));
- sched_group_init.cpumask = cpu_mask_all;
+ sched_group_init.cpumask = CPU_MASK_ALL;
sched_group_init.next = &sched_group_init;
sched_group_init.cpu_power = SCHED_LOAD_SCALE;
#endif

--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-06-04 04:40:49

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] cpumask 10/10 optimize various uses of new cpumasks

Rusty Russell wrote:
> On Fri, 2004-06-04 at 03:11, Paul Jackson wrote:
>
>>cpumask 10/10 optimize various uses of new cpumasks
>>
>> Make use of for_each_cpu_mask() macro to simplify and optimize
>> a couple of sparc64 per-CPU loops.
>
>
> This means we can finally do this, too... Yes!
>
> Name: Cleanup cpumask_t Temporaries
> Status: Booted on 2.6.7-rc2-bk4
> Depends: Misc/cpumask-tour-de-force.patch.gz
> Signed-off-by: Rusty Russell <[email protected]>
>
> Paul Jackson's cpumask tour-de-force allows us to get rid of those
> stupid temporaries which we used to hold CPU_MASK_ALL to hand them to
> functions. This used to break NR_CPUS > BITS_PER_LONG.

Actually I think this was already possible as of a couple of
months ago, but thanks for doing the cleanup :)

Fix was embarrassingly simple as pointed out by Linus:

-#define CPU_MASK_ALL { {[0 ... CPU_ARRAY_SIZE-1] = ~0UL} }
-#define CPU_MASK_NONE { {[0 ... CPU_ARRAY_SIZE-1] = 0UL} }
+#define CPU_MASK_ALL ((cpumask_t) { {[0 ... CPU_ARRAY_SIZE-1] = ~0UL} })
+#define CPU_MASK_NONE ((cpumask_t) { {[0 ... CPU_ARRAY_SIZE-1] = 0UL} })

2004-06-04 04:42:04

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 10/10 optimize various uses of new cpumasks

> This means we can finally do this, too... Yes!

Nice. Though, actually, I suspect that Linus deserves the
credit for this particular improvement. He suggested adding
a (cpumask_t) typecast to the CPU_MASK_ALL macro, just for
this reason.

Or at least that's my recollection.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 04:57:10

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Dave wrote:
> Is it really that hard?

Well ...

Currently we have two implementations of cpumasks. One (in the main
kernel presently) provides a flexible mechanism for arch-specific
optimizations and the use of alternative data types to represent
cpumasks.

The other (using the patch set I just submitted to Andrew) is one size
fits all generic only (except for arch specific bitmap implementations).

The generic only is quite a bit simpler - it has some 26 fewer kernel
source files, and it saves sparc64 some 1144 bytes of kernel text space,
as measured by Andrew.

I really don't want to go 'back' to the fancy version. If a particular
architecture has specific additional needs, I'm certainly open to
hearing the justifications, tradeoffs and suggestions for ways to meet
those needs.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 05:04:48

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Thu, 3 Jun 2004 22:02:24 -0700
Paul Jackson <[email protected]> wrote:

> The generic only is quite a bit simpler - it has some 26 fewer kernel
> source files, and it saves sparc64 some 1144 bytes of kernel text space,
> as measured by Andrew.

I bet if you do a sparc32 build, you'll get larger text size
and more leaf functions will need stack frames.

> I really don't want to go 'back' to the fancy version. If a particular
> architecture has specific additional needs, I'm certainly open to
> hearing the justifications, tradeoffs and suggestions for ways to meet
> those needs.

Another thing is that only newer gcc's are good at changing structure
accesses such that they are optimized as aggregates when possible.

You're the one doing the work, so it's up to you. :-)

2004-06-04 05:12:31

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> I don't see what you gain from having the cpumask type but having
> to get at its internals with the bitop functions.

There were a few places where arch-specific code had a cpumask_t type,
then took its address and operated on it as if it was a simple unsigned
long. Where I was confident that I could correctly and efficiently
recode them using 'real' cpumask operations, I did so. But in some
cases, I was not clear how to do this. Grep for "cpus_addr()" uses to
find these cases. The old cpumask implementation had similar macros,
including cpus_coerce() and cpus_promote(), for a similar purpose.

The "ideal" solution, in my view, would be to have someone with arch
specific experience in each affected arch code these uses of cpus_addr()
out, then remove cpus_addr() entirely.

Perhaps I should comment the cpus_addr() definition as 'deprecated'?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 05:24:46

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Thu, 3 Jun 2004 22:18:54 -0700
Paul Jackson <[email protected]> wrote:

> The "ideal" solution, in my view, would be to have someone with arch
> specific experience in each affected arch code these uses of cpus_addr()
> out, then remove cpus_addr() entirely.
>
> Perhaps I should comment the cpus_addr() definition as 'deprecated'?

I would suggest just making the build fail for these cases.
You could do this by simply doing something like:

#error Assumes cpumask_t is integral type

at the suspect spots.

2004-06-04 05:24:58

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> I don't see what you gain from having the cpumask type but having
> to get at its internals with the bitop functions.

The essential gain, in my view, of cpumask, is that it encapsulates
the value NR_CPUS. cpumasks are bitmaps of length NR_CPUS.

Yes, there is an open issue of whether cpumasks are worth it.
I think enough code has taken to them that they are.

The getting at internals (via cpus_addr(), I'm guessing you mean)
was a workaround for some code that messed with cpumasks and simple
unsigned longs as if they were interoperable. "cpus_addr" should
be marked deprecated, and its use coded out. Its remaining uses
are in arch-specific areas where I lack the expertise and testing
environment to accomplish such.

I needed some legacy mechanism such as this, in order to avoid
having such existing uses bring the entire cpumask overhaul to
a screeching halt.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 05:35:48

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Paul Jackson wrote:
>>I don't see what you gain from having the cpumask type but having
>>to get at its internals with the bitop functions.
>
>
> The essential gain, in my view, of cpumask, is that it encapsulates
> the value NR_CPUS. cpumasks are bitmaps of length NR_CPUS.
>
> Yes, there is an open issue of whether cpumasks are worth it.
> I think enough code has taken to them that they are.
>

Yes, I'm all for the full cpumask abstraction.

> The getting at internals (via cpus_addr(), I'm guessing you mean)
> was a workaround for some code that messed with cpumasks and simple
> unsigned longs as if they were interoperable. "cpus_addr" should
> be marked deprecated, and its use coded out. Its remaining uses
> are in arch-specific areas where I lack the expertise and testing
> environment to accomplish such.
>
> I needed some legacy mechanism such as this, in order to avoid
> having such existing uses bring the entire cpumask overhaul to
> a screeching halt.
>

No, by getting at the internals, I mean the internals of the
type itself. Its implementation, if you will. (Well I guess
that also *includes* users getting the address and derefing it
as an unsigned long).

But no, I was talking about something more general. Rusty wrote:

>>+#define cpus_addr(src) ((src).bits)
>
>
> We've discussed this before when talking about whether it'd be easier to
> just make people use raw bitop functions directly, so I know we have
> philosophical differences here.
>
> So, opinion alert: if I were doing this, I'd probably live without this
> macro; in my mind it crosses the "too much abstraction" line. I did
> momentarily wonder what this macro did when I saw it used in the
> succeeding patches.

Now in my opinion, it is either all or nothing. I could be wrong,
but I don't think there is any point with a nice cpumask type if
you are just going to get inside it and do bitmap operations on it.

In summary, I think your patches are nice :)

2004-06-04 05:41:24

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Nick Piggin <[email protected]> wrote:
>
> Yes, I'm all for the full cpumask abstraction.

Where do we stand wrt pass-by-reference? I remember there was initially
some concern that lugging 512-bit scalars around by value was expensive, so
Bill's original work was at least geared toward pass-by-reference?

2004-06-04 05:53:42

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew Morton wrote:
> Nick Piggin <[email protected]> wrote:
>
>> Yes, I'm all for the full cpumask abstraction.
>
>
> Where do we stand wrt pass-by-reference? I remember there was initially
> some concern that lugging 512-bit scalars around by value was expensive, so
> Bill's original work was at least geared toward pass-by-reference?
>

That is a valid concern. One I hadn't really thought about
as the patch is coming from SGI :)

kernel/sched.c doesn't pass around cpumask_t's anywhere
critical anymore (this used to be a problem). Any other
important places spring to mind?

2004-06-04 06:42:30

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> Where do we stand wrt pass-by-reference?

I haven't found any place yet that I noticed this to be a problem.

Of course it's not a problem in the internals of the cpumask
implementation, because the very first layer of implementation, the
#define's, converts pass by value to pass by reference (and adds the
NR_CPUS where that helps). Then the bitmap ops in the next layer down
convert right back to single word operations by value, in the usual
NR_CPUS <= BITS_PER_LONG case, all within the scope of the compiler
code-generator.

If there was a place where it was important to pass a cpumask argument
by reference for efficiency, then I would claim that this should be done
by explicitly making the argument in question a (cpumask_t *) pointer,
instead of a cpumask_t value. This is the usual technique when passing
potentially large structures, when a local private copy is not needed.

If further such a place was in generic kernel code, where the vast
majority running on more reasonably sized hardware would object to
such, either because of esthetics, or efficiency (wasting a pointer
dereference) then ... cross that bridge when the water rises.

Hopefully, localized solutions could be developed for such needs,
without dragging back in all the complexity of the generalized
variability we had before.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 06:49:38

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Dave Miller wrote:
> I would suggest just making the build fail for these cases.

Interesting suggestion.

Others with more experience in such matters will have to
comment on whether it's a good idea.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 08:20:57

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Thu, Jun 03, 2004 at 10:10:10AM -0700, Paul Jackson wrote:
> +static inline void __cpu_set(int cpu, volatile cpumask_t *dstp)
> +{
> + set_bit(cpu, dstp->bits);
> +}

Hungarian notation?


On Thu, Jun 03, 2004 at 10:10:10AM -0700, Paul Jackson wrote:
> +#if NR_CPUS > 1
> +#define num_online_cpus() cpus_weight(cpu_online_map)
> +#define num_possible_cpus() cpus_weight(cpu_possible_map)
> +#define num_present_cpus() cpus_weight(cpu_present_map)
> +#define cpu_online(cpu) cpu_isset((cpu), cpu_online_map)
> +#define cpu_possible(cpu) cpu_isset((cpu), cpu_possible_map)
> +#define cpu_present(cpu) cpu_isset((cpu), cpu_present_map)
> +#else
> +#define num_online_cpus() 1
> +#define num_possible_cpus() 1
> +#define num_present_cpus() 1
> +#define cpu_online(cpu) ((cpu) == 0)
> +#define cpu_possible(cpu) ((cpu) == 0)
> +#define cpu_present(cpu) ((cpu) == 0)
> +#endif

#ifdef'ing it anyway?


On Thu, Jun 03, 2004 at 10:10:10AM -0700, Paul Jackson wrote:
> @@ -1206,9 +1207,10 @@
> {
> struct ino_bucket *bp = ivector_table + (long)data;
> struct irqaction *ap = bp->irq_info;
> - cpumask_t mask = get_smpaff_in_irqaction(ap);
> + cpumask_t mask;
> int len;
>
> + cpus_addr(mask)[0] = get_smpaff_in_irqaction(ap);
> if (cpus_empty(mask))
> mask = cpu_online_map;

This is an improvement?


-- wli

2004-06-04 08:46:41

by Keith Owens

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, 4 Jun 2004 01:19:06 -0700,
William Lee Irwin III <[email protected]> wrote:
>On Thu, Jun 03, 2004 at 10:10:10AM -0700, Paul Jackson wrote:
>> @@ -1206,9 +1207,10 @@
>> {
>> struct ino_bucket *bp = ivector_table + (long)data;
>> struct irqaction *ap = bp->irq_info;
>> - cpumask_t mask = get_smpaff_in_irqaction(ap);
>> + cpumask_t mask;
>> int len;
>>
>> + cpus_addr(mask)[0] = get_smpaff_in_irqaction(ap);
>> if (cpus_empty(mask))
>> mask = cpu_online_map;
>
>This is an improvement?

The existing code assumes that cpumask_t fits in a long; struct
irqaction->mask is defined as a long. Paul marked such suspect code
with cpus_addr(), it needs to be reviewed and corrected.

2004-06-04 09:10:45

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
On Thu, Jun 03, 2004 at 10:10:10AM -0700, Paul Jackson wrote:
> > +static inline void __cpu_set(int cpu, volatile cpumask_t *dstp)
> > ...
> Hungarian notation?

You mean the 'p' for pointer? Well, loosely speaking, I guess you could
call it that. Why do you ask?

Well ... I am not being straight forward. I likely know why you ask. I
find an occassional 'p' in a variable name to be helpful. For example,
in this case, I am flipping between referring to the same datum by
reference and by value - so it is useful to reflect that distinction in
the variable names - it's _the_ key distinction. If you wish to state a
case to the contrary, you're welcome to do so.

> #ifdef'ing it anyway?

In certain cases, yes. I had a version of these particular macros that
used inline logic instead, but this looked easier to read to my eye. If
I spoke out against ifdef's carte blanche at some point (which likely I
did) then I was being incautious in my speaking. The question is more
how best to make the code readable, maintainable, robust, fast and small.

> This is an improvement?

... see Keith's reply ...

Thank-you for your review comments.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 09:33:12

by Mikael Pettersson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Paul Jackson writes:
> Perhaps I should comment the cpus_addr() definition as 'deprecated'?

Please don't. cpus_addr() is useful when you need to get a
handle on the representation for non-cpumask_t operations.

Case in point: the perfctr kernel extension needs to communicate
a cpumask_t to user-space because of the asymmetric nature of
HT P4s. Unfortunately, a simple copy_to_user() won't work because:
a) the size depends on kernel .config, and
b) the representation is defined in terms of sequences of ulong,
which breaks 32-bit applications on 64-bit kernels.
So perfctr instead converts a cpumask_t to a sequence of uint,
and copies both the number of uints and the uints themselves
to user-space.

Having to do this conversion with a for-each-CPU type loop would
be slow and ugly, and would IMO show that the cpumask_t ADT had
become an obstacle to the actual work that needs to be done.

So please keep cpus_addr().

/Mikael

2004-06-04 09:38:32

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 11:31:24AM +0200, Mikael Pettersson wrote:
> Please don't. cpus_addr() is useful when you need to get a
> handle on the representation for non-cpumask_t operations.
> Case in point: the perfctr kernel extension needs to communicate
> a cpumask_t to user-space because of the asymmetric nature of
> HT P4s. Unfortunately, a simple copy_to_user() won't work because:
> a) the size depends on kernel .config, and
> b) the representation is defined in terms of sequences of ulong,
> which breaks 32-bit applications on 64-bit kernels.
> So perfctr instead converts a cpumask_t to a sequence of uint,
> and copies both the number of uints and the uints themselves
> to user-space.
> Having to do this conversion with a for-each-CPU type loop would
> be slow and ugly, and would IMO show that the cpumask_t ADT had
> become an obstacle to the actual work that needs to be done.
> So please keep cpus_addr().

If the marshalling code presents different formats to userspace
depending on BITS_PER_LONG then it's buggy.


-- wli

2004-06-04 09:43:09

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Mikael Pettersson <[email protected]> wrote:
>
> Paul Jackson writes:
> > Perhaps I should comment the cpus_addr() definition as 'deprecated'?
>
> Please don't. cpus_addr() is useful when you need to get a
> handle on the representation for non-cpumask_t operations.
>
> Case in point: the perfctr kernel extension needs to communicate
> a cpumask_t to user-space because of the asymmetric nature of
> HT P4s. Unfortunately, a simple copy_to_user() won't work because:
> a) the size depends on kernel .config, and
> b) the representation is defined in terms of sequences of ulong,
> which breaks 32-bit applications on 64-bit kernels.
> So perfctr instead converts a cpumask_t to a sequence of uint,
> and copies both the number of uints and the uints themselves
> to user-space.

In that case the cpumask code should provide some API function which
converts a cpumask_t into (and from?) some canonical and documented form.
Then you copy what it gave you to userspace.

Particular pieces of code shouldn't go poking inside the cpumask_t's
representation. It's different on different architectures and we could
even change it in the future.

2004-06-04 09:50:41

by Mikael Pettersson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III writes:
> On Fri, Jun 04, 2004 at 11:31:24AM +0200, Mikael Pettersson wrote:
> > Please don't. cpus_addr() is useful when you need to get a
> > handle on the representation for non-cpumask_t operations.
> > Case in point: the perfctr kernel extension needs to communicate
> > a cpumask_t to user-space because of the asymmetric nature of
> > HT P4s. Unfortunately, a simple copy_to_user() won't work because:
> > a) the size depends on kernel .config, and
> > b) the representation is defined in terms of sequences of ulong,
> > which breaks 32-bit applications on 64-bit kernels.
> > So perfctr instead converts a cpumask_t to a sequence of uint,
> > and copies both the number of uints and the uints themselves
> > to user-space.
> > Having to do this conversion with a for-each-CPU type loop would
> > be slow and ugly, and would IMO show that the cpumask_t ADT had
> > become an obstacle to the actual work that needs to be done.
> > So please keep cpus_addr().
>
> If the marshalling code presents different formats to userspace
> depending on BITS_PER_LONG then it's buggy.

No. Read what I wrote: binary compatibility was the very problem I
set out to solve, not cause.

For a given cpumask_t value, user-space sees the same binary
representation irregardless of how you combine 32 or 64-bit
user-spaces with 32 or 64-bit kernels.

This has all been worked out on x86 and amd64, and the conversion
is endian-neutral so e.g. ppc32 on ppc64 should work.

2004-06-04 09:54:38

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, 4 Jun 2004 01:19:06 -0700,
>> This is an improvement?

On Fri, Jun 04, 2004 at 06:43:43PM +1000, Keith Owens wrote:
> The existing code assumes that cpumask_t fits in a long; struct
> irqaction->mask is defined as a long. Paul marked such suspect code
> with cpus_addr(), it needs to be reviewed and corrected.

I'd rather just do it.


Index: irqaction-2.6.7-rc2/include/linux/interrupt.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/linux/interrupt.h 2004-05-29 23:26:11.000000000 -0700
+++ irqaction-2.6.7-rc2/include/linux/interrupt.h 2004-06-04 02:24:12.348627000 -0700
@@ -7,6 +7,7 @@
#include <linux/linkage.h>
#include <linux/bitops.h>
#include <linux/preempt.h>
+#include <linux/cpumask.h>
#include <asm/atomic.h>
#include <asm/hardirq.h>
#include <asm/ptrace.h>
@@ -35,7 +36,7 @@
struct irqaction {
irqreturn_t (*handler)(int, void *, struct pt_regs *);
unsigned long flags;
- unsigned long mask;
+ cpumask_t mask;
const char *name;
void *dev_id;
struct irqaction *next;
Index: irqaction-2.6.7-rc2/arch/ppc/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/ppc/kernel/irq.c 2004-05-29 23:26:35.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/ppc/kernel/irq.c 2004-06-04 02:24:12.374623000 -0700
@@ -241,7 +241,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->dev_id = dev_id;
action->next = NULL;
Index: irqaction-2.6.7-rc2/arch/alpha/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/alpha/kernel/irq.c 2004-05-29 23:26:27.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/alpha/kernel/irq.c 2004-06-04 02:24:12.393620000 -0700
@@ -457,7 +457,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/arm/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/arm/kernel/irq.c 2004-05-29 23:25:40.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/arm/kernel/irq.c 2004-06-04 02:24:12.419616000 -0700
@@ -674,7 +674,7 @@

action->handler = handler;
action->flags = irq_flags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/arm26/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/arm26/kernel/irq.c 2004-05-29 23:26:43.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/arm26/kernel/irq.c 2004-06-04 02:24:12.442612000 -0700
@@ -549,7 +549,7 @@

action->handler = handler;
action->flags = irq_flags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/cris/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/cris/kernel/irq.c 2004-06-01 03:11:16.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/cris/kernel/irq.c 2004-06-04 02:24:12.465609000 -0700
@@ -240,7 +240,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/i386/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/i386/kernel/irq.c 2004-06-01 03:11:16.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/i386/kernel/irq.c 2004-06-04 02:24:12.499604000 -0700
@@ -654,7 +654,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/ia64/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/ia64/kernel/irq.c 2004-06-01 03:11:17.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/ia64/kernel/irq.c 2004-06-04 02:24:12.523600000 -0700
@@ -608,7 +608,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/mips/baget/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/mips/baget/irq.c 2004-05-29 23:26:19.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/mips/baget/irq.c 2004-06-04 02:24:12.543597000 -0700
@@ -325,7 +325,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/mips/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/mips/kernel/irq.c 2004-05-29 23:25:45.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/mips/kernel/irq.c 2004-06-04 02:24:12.568593000 -0700
@@ -487,7 +487,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/parisc/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/parisc/kernel/irq.c 2004-05-29 23:26:04.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/parisc/kernel/irq.c 2004-06-04 02:24:12.590590000 -0700
@@ -644,7 +644,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/ppc64/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/ppc64/kernel/irq.c 2004-05-29 23:26:09.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/ppc64/kernel/irq.c 2004-06-04 02:24:12.617586000 -0700
@@ -206,7 +206,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->dev_id = dev_id;
action->next = NULL;
Index: irqaction-2.6.7-rc2/arch/sh/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/sh/kernel/irq.c 2004-05-29 23:26:03.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/sh/kernel/irq.c 2004-06-04 02:24:12.638583000 -0700
@@ -436,7 +436,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/sparc/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/sparc/kernel/irq.c 2004-05-29 23:25:43.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/sparc/kernel/irq.c 2004-06-04 02:24:12.668578000 -0700
@@ -448,7 +448,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->dev_id = NULL;
action->next = NULL;
@@ -528,7 +528,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/sparc/kernel/sun4d_irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/sparc/kernel/sun4d_irq.c 2004-05-29 23:25:40.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/sparc/kernel/sun4d_irq.c 2004-06-04 02:24:12.682576000 -0700
@@ -336,7 +336,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/sparc64/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/sparc64/kernel/irq.c 2004-05-29 23:26:43.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/sparc64/kernel/irq.c 2004-06-04 02:45:41.266681000 -0700
@@ -118,10 +118,6 @@
action->flags |= __irq_ino(irq) << 48;
#define get_ino_in_irqaction(action) (action->flags >> 48)

-#if NR_CPUS > 64
-#error irqaction embedded smp affinity does not work with > 64 cpus, FIXME
-#endif
-
#define put_smpaff_in_irqaction(action, smpaff) (action)->mask = (smpaff)
#define get_smpaff_in_irqaction(action) ((action)->mask)

@@ -454,7 +450,7 @@
action->next = NULL;
action->dev_id = dev_id;
put_ino_in_irqaction(action, irq);
- put_smpaff_in_irqaction(action, 0);
+ put_smpaff_in_irqaction(action, CPU_MASK_NONE);

if (tmp)
tmp->next = action;
@@ -710,7 +706,7 @@
if (++buddy >= NR_CPUS)
buddy = 0;
if (++ticks > NR_CPUS) {
- put_smpaff_in_irqaction(ap, 0);
+ put_smpaff_in_irqaction(ap, CPU_MASK_NONE);
goto out;
}
}
@@ -944,7 +940,7 @@
action->name = name;
action->next = NULL;
put_ino_in_irqaction(action, irq);
- put_smpaff_in_irqaction(action, 0);
+ put_smpaff_in_irqaction(action, CPU_MASK_NONE);

*(bucket->pil + irq_action) = action;
enable_irq(irq);
@@ -1162,45 +1158,6 @@

#ifdef CONFIG_SMP

-#define HEX_DIGITS 16
-
-static unsigned int parse_hex_value (const char *buffer,
- unsigned long count, unsigned long *ret)
-{
- unsigned char hexnum [HEX_DIGITS];
- unsigned long value;
- int i;
-
- if (!count)
- return -EINVAL;
- if (count > HEX_DIGITS)
- count = HEX_DIGITS;
- if (copy_from_user(hexnum, buffer, count))
- return -EFAULT;
-
- /*
- * Parse the first 8 characters as a hex string, any non-hex char
- * is end-of-string. '00e1', 'e1', '00E1', 'E1' are all the same.
- */
- value = 0;
-
- for (i = 0; i < count; i++) {
- unsigned int c = hexnum[i];
-
- switch (c) {
- case '0' ... '9': c -= '0'; break;
- case 'a' ... 'f': c -= 'a'-10; break;
- case 'A' ... 'F': c -= 'A'-10; break;
- default:
- goto out;
- }
- value = (value << 4) | c;
- }
-out:
- *ret = value;
- return 0;
-}
-
static int irq_affinity_read_proc (char *page, char **start, off_t off,
int count, int *eof, void *data)
{
@@ -1219,7 +1176,7 @@
return len;
}

-static inline void set_intr_affinity(int irq, unsigned long hw_aff)
+static inline void set_intr_affinity(int irq, cpumask_t hw_aff)
{
struct ino_bucket *bp = ivector_table + irq;

@@ -1237,22 +1194,17 @@
unsigned long count, void *data)
{
int irq = (long) data, full_count = count, err;
- unsigned long new_value, i;
+ cpumask_t new_value;

- err = parse_hex_value(buffer, count, &new_value);
+ err = cpumask_parse(buffer, count, new_value);

/*
* Do not allow disabling IRQs completely - it's a too easy
* way to make the system unusable accidentally :-) At least
* one online CPU still has to be targeted.
*/
- for (i = 0; i < NR_CPUS; i++) {
- if ((new_value & (1UL << i)) != 0 &&
- !cpu_online(i))
- new_value &= ~(1UL << i);
- }
-
- if (!new_value)
+ cpus_and(new_value, new_value, cpu_online_map);
+ if (cpus_empty(new_value))
return -EINVAL;

set_intr_affinity(irq, new_value);
Index: irqaction-2.6.7-rc2/arch/sparc64/kernel/smp.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/sparc64/kernel/smp.c 2004-05-29 23:25:41.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/sparc64/kernel/smp.c 2004-06-04 02:46:31.379063000 -0700
@@ -420,9 +420,6 @@
* packet, but we have no use for that. However we do take advantage of
* the new pipelining feature (ie. dispatch to multiple cpus simultaneously).
*/
-#if NR_CPUS > 32
-#error Fixup cheetah_xcall_deliver Dave...
-#endif
static void cheetah_xcall_deliver(u64 data0, u64 data1, u64 data2, cpumask_t mask)
{
u64 pstate, ver;
Index: irqaction-2.6.7-rc2/arch/um/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/um/kernel/irq.c 2004-05-29 23:26:27.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/um/kernel/irq.c 2004-06-04 02:24:12.746566000 -0700
@@ -419,7 +419,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/v850/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/v850/kernel/irq.c 2004-05-29 23:26:27.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/v850/kernel/irq.c 2004-06-04 02:24:12.772562000 -0700
@@ -392,7 +392,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;
Index: irqaction-2.6.7-rc2/arch/x86_64/kernel/irq.c
===================================================================
--- irqaction-2.6.7-rc2.orig/arch/x86_64/kernel/irq.c 2004-06-01 03:11:21.000000000 -0700
+++ irqaction-2.6.7-rc2/arch/x86_64/kernel/irq.c 2004-06-04 02:24:12.795559000 -0700
@@ -491,7 +491,7 @@

action->handler = handler;
action->flags = irqflags;
- action->mask = 0;
+ cpus_clear(action->mask);
action->name = devname;
action->next = NULL;
action->dev_id = dev_id;

2004-06-04 09:59:41

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III writes:
>> If the marshalling code presents different formats to userspace
>> depending on BITS_PER_LONG then it's buggy.

On Fri, Jun 04, 2004 at 11:46:49AM +0200, Mikael Pettersson wrote:
> No. Read what I wrote: binary compatibility was the very problem I
> set out to solve, not cause.
> For a given cpumask_t value, user-space sees the same binary
> representation irregardless of how you combine 32 or 64-bit
> user-spaces with 32 or 64-bit kernels.
> This has all been worked out on x86 and amd64, and the conversion
> is endian-neutral so e.g. ppc32 on ppc64 should work.

cpumask_scnprintf() is correct to all appearances... testcase please.


-- wli

2004-06-04 11:17:28

by Mikael Pettersson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III writes:
> William Lee Irwin III writes:
> >> If the marshalling code presents different formats to userspace
> >> depending on BITS_PER_LONG then it's buggy.
>
> On Fri, Jun 04, 2004 at 11:46:49AM +0200, Mikael Pettersson wrote:
> > No. Read what I wrote: binary compatibility was the very problem I
> > set out to solve, not cause.
> > For a given cpumask_t value, user-space sees the same binary
> > representation irregardless of how you combine 32 or 64-bit
> > user-spaces with 32 or 64-bit kernels.
> > This has all been worked out on x86 and amd64, and the conversion
> > is endian-neutral so e.g. ppc32 on ppc64 should work.
>
> cpumask_scnprintf() is correct to all appearances... testcase please.

How large buf does it need? I don't see any spec for that in 2.6.6.

Second, let's just say that while some kernel people think that
converting stuff to ASCII is "neat", I'm not one of them. It's
just a waste of time and space, for both kernel and user-space.
I'd rather do a for-each-CPU loop which strictly keeps to cpumask_t
operations than take a detour via ASCII.

2004-06-04 11:28:08

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III writes:
>> cpumask_scnprintf() is correct to all appearances... testcase please.

On Fri, Jun 04, 2004 at 01:16:35PM +0200, Mikael Pettersson wrote:
> How large buf does it need? I don't see any spec for that in 2.6.6.
> Second, let's just say that while some kernel people think that
> converting stuff to ASCII is "neat", I'm not one of them. It's
> just a waste of time and space, for both kernel and user-space.
> I'd rather do a for-each-CPU loop which strictly keeps to cpumask_t
> operations than take a detour via ASCII.

If you care to export an architecture-neutral and/or 32/64 -bit
compatible binary representation of a bitmap, please provide the
implementation in lib/bitmap.c; I'm relatively agnostic on the ASCII
vs. whatever issue. Others may not be...

The cpu count for cpumask_t should be visible to userspace as the
dreaded sysconf(_SC_NPROCESSORS_CONF)... don't ask how this is
implemented, you don't want to know.

Thanks.

-- wli

2004-06-04 11:33:07

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 04:27:44AM -0700, William Lee Irwin III wrote:
> If you care to export an architecture-neutral and/or 32/64 -bit
> compatible binary representation of a bitmap, please provide the
> implementation in lib/bitmap.c; I'm relatively agnostic on the ASCII
> vs. whatever issue. Others may not be...
> The cpu count for cpumask_t should be visible to userspace as the
> dreaded sysconf(_SC_NPROCESSORS_CONF)... don't ask how this is
> implemented, you don't want to know.
> Thanks.
> -- wli

Hmm. Okay, I'd better confess. It parses /proc/cpuinfo... except there's
no information there about the NR_CPUS used for cpumask_t, but rather
only num_cpus_online().

akpm, apps are in trouble. Some interface is needed to export NR_CPUS
so the kernel doesn't clobber their memory if they guess too low. Andi,
how does libnuma cope with this?


-- wli

2004-06-04 16:00:52

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Mikael writes:
> Case in point: the perfctr kernel extension needs to communicate ...

Nice example. Thank-you.

Yes - doing that 1-bit at a time in a per-cpu loop would be ugly.

We should leave cpus_addr() around, at least until such time as the
cpumask ADT provided routines to support exactly what you are doing -
copying up masks to user space as length specified arrays of uint.

==

Aside - be careful here not to get the two halves of 64 bit longs,
on 64 bit big endian machines, backwards.

>From the depths of my email archives (Joe Korty might recognize this)
comes the following snippet of code and commentary, never put to use,
which ponders the handling of such masks:

+/*
+ * Bitops apply to arrays of unsigned long. This is almost
+ * the same as an array of unsigned ints, except on 64 bit big
+ * endian architectures, in which the two 32-bit int halves of
+ * each long are reversed (big 32-bit halfword first, naturally).
+ *
+ * Use this BIT32X (for "BITop 32-bit indeX") macro to index the
+ * i-th word of a bit mask declared as an array of 32 bit words.
+ *
+ * Usage example accessing 32-bit words in mask[] in order,
+ * smallest first:
+ * u32 mask[MASKLENGTH];
+ * int i;
+ * for (i = 0; i < MASKLENGTH; i++)
+ * ... mask[BIT32X(i)] ...
+ */
+#ifndef BIT32X
+#include <asm/byteorder.h>
+#if BITS_PER_LONG == 64 && defined(__BIG_ENDIAN)
+#define BIT32X(i) ((i)^1)
+#elif BITS_PER_LONG == 32 || defined(__LITTLE_ENDIAN)
+#define BIT32X(i) (i)
+#endif
+#endif


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 16:15:48

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
> Some interface is needed to export NR_CPUS

Well ... technically ... such an interface already exists.
However the word "interface" might be too kind a description.

/*
* Ugly hack to get size of cpumask - keep calling sched_getaffinity
* with masks of increasing size until it stops complaining -EINVAL
* that the mask is too small. Determines size of kernel cpumask,
* in number of bytes. Size will always be some multiple of the
* size of an unsigned long. Since it's ok to be too big, and since
* kernel NR_CPUS is usually a power of two, just try each power
* of two, until one works. Contrary to the man page, a successful
* sched_getaffinity() returns a positive number (the sizeof(cpumask_t),
* to be exact), not zero. Consider any return >= 0 to mean success.
*/

#include <errno.h>
extern int errno;

int cpumasksz()
{
int nbytes;
unsigned long *mask = 0;

for (nbytes = sizeof(unsigned long); nbytes < 10000; nbytes *= 2) {
int r;
mask = (unsigned long *)realloc(mask, nbytes);
if (mask == 0)
return -ENOMEM;
errno = 0;
r = sched_getaffinity(0, nbytes, mask);
if (r < 0 && errno == EINVAL)
continue;
if (r >= 0)
break;
return -errno;
}
free(mask);
return nbytes;
}



--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 16:30:37

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
>> Some interface is needed to export NR_CPUS

On Fri, Jun 04, 2004 at 09:23:16AM -0700, Paul Jackson wrote:
> Well ... technically ... such an interface already exists.
> However the word "interface" might be too kind a description.

I'm thoroughly disgusted.

Also, you need to return the positive value returned by
sched_getaffinity(), not the value of nbytes you stopped at.


-- wli

2004-06-04 16:56:29

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Mikael writes:
>> Case in point: the perfctr kernel extension needs to communicate ...

On Fri, Jun 04, 2004 at 09:03:14AM -0700, Paul Jackson wrote:
> Nice example. Thank-you.
> Yes - doing that 1-bit at a time in a per-cpu loop would be ugly.
> We should leave cpus_addr() around, at least until such time as the
> cpumask ADT provided routines to support exactly what you are doing -
> copying up masks to user space as length specified arrays of uint.

This is patently ridiculous. Make a compat_sched_getaffinity(), and
likewise for whatever else is copying unsigned long arrays to userspace.


-- wli

2004-06-04 17:03:54

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> I'd rather just do it.

Nice. Thanks, Bill.

The patch will collide with 'linus.patch', in Andrew's 2.6.7-rc2-mm2,
which changes the arch/sparc64/kernel/irq.c line:

- static unsigned int parse_hex_value (const char *buffer,
+ static unsigned int parse_hex_value (const char __user *buffer,

Otherwise, it passes my cursory inspection.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 17:24:00

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Bill wrote:
> This is patently ridiculous. Make a compat_sched_getaffinity(), and
> likewise for whatever else is copying unsigned long arrays to userspace.

Earlier, Andrew wrote:
> In that case the cpumask code should provide some API function which
> converts a cpumask_t into (and from?) some canonical and documented form.
> Then you copy what it gave you to userspace.

I'd vote to have the documented form that Andrew speaks of be arrays
of 32-bit words, which is what I understood Mikael was doing. I agree
with Andrew's suggested to/from canonical functions.

I'd prefer not copying arrays of unsigned longs, due to the confusions
of coding to them across 32/64 bit and big/little endian architectures.
At times I have wished the kernel had chosen u32 arrays instead of
unsigned long arrays for bitmaps, for the same reason. The cpumask
sprintf and parse format is intentionally 32-bit chunk friendly.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 17:40:41

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III:
> I'm thoroughly disgusted.

Yup ... LOL. One sick piece of code.

I didn't return the actual return from sched_getaffinity() because (1)
it's ok to estimate the mask size too high, and (2) given that the man
page and kernel don't agree on the return value of sched_getaffinity(),
I figured that the less I relied on it, the longer my user code would
continue functioning in a useful manner. As always, the key to robust
code (code that withstands the perils of time) is minimizing risky
assumptions.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 17:53:14

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 10:29:46AM -0700, Paul Jackson wrote:
> I'd vote to have the documented form that Andrew speaks of be arrays
> of 32-bit words, which is what I understood Mikael was doing. I agree
> with Andrew's suggested to/from canonical functions.
> I'd prefer not copying arrays of unsigned longs, due to the confusions
> of coding to them across 32/64 bit and big/little endian architectures.
> At times I have wished the kernel had chosen u32 arrays instead of
> unsigned long arrays for bitmaps, for the same reason. The cpumask
> sprintf and parse format is intentionally 32-bit chunk friendly.

Index: irqaction-2.6.7-rc2/include/linux/bitmap.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/linux/bitmap.h 2004-05-29 23:26:49.000000000 -0700
+++ irqaction-2.6.7-rc2/include/linux/bitmap.h 2004-06-04 10:35:31.982041000 -0700
@@ -46,6 +46,7 @@
const unsigned long *maskp, int bits);
int bitmap_parse(const char __user *ubuf, unsigned int ubuflen,
unsigned long *maskp, int bits);
+void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nlongs);

#endif /* __ASSEMBLY__ */

Index: irqaction-2.6.7-rc2/lib/bitmap.c
===================================================================
--- irqaction-2.6.7-rc2.orig/lib/bitmap.c 2004-05-29 23:26:27.000000000 -0700
+++ irqaction-2.6.7-rc2/lib/bitmap.c 2004-06-04 10:51:46.878834000 -0700
@@ -330,3 +330,22 @@
return 0;
}
EXPORT_SYMBOL(bitmap_parse);
+
+#if defined(__BIG_ENDIAN) && BITS_PER_LONG == 64
+void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)
+{
+ int i;
+
+ for (i = 0; i < nwords; ++i) {
+ u64 word = src[i];
+ dst[2*i] = word >> 32;
+ dst[2*i+1] = word;
+ }
+}
+#else
+void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)
+{
+ memcpy(dst, src, nwords*sizeof(unsigned long));
+}
+#endif
+EXPORT_SYMBOL(bitmap_to_u32_array);

2004-06-04 18:13:04

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

/* William Lee Irwin III:
>> I'm thoroughly disgusted.

On Fri, Jun 04, 2004 at 10:47:56AM -0700, Paul Jackson wrote:
> Yup ... LOL. One sick piece of code.
> I didn't return the actual return from sched_getaffinity() because (1)
> it's ok to estimate the mask size too high, and (2) given that the man
> page and kernel don't agree on the return value of sched_getaffinity(),
> I figured that the less I relied on it, the longer my user code would
> continue functioning in a useful manner. As always, the key to robust
> code (code that withstands the perils of time) is minimizing risky
> assumptions.

Even the following returns 32 on UP. _SC_NPROCESSOR_CONF is
unimplementable. NR_CPUS serves as an upper bound on the number of cpus
that may at some time be simultaneously present in the future. Without
any way to reliably determine this, luserspace is fscked.

*/


#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <limits.h>
#include <sys/syscall.h>

static int getaffinity(pid_t, size_t, unsigned long *);
static int detect_nr_cpus(void);

int main(void)
{
printf("%d\n", detect_nr_cpus());
return 0;
}

static int detect_nr_cpus(void)
{
unsigned long *cpus = malloc(sizeof(long));
size_t upper, middle, lower = sizeof(long);

for (upper = lower; getaffinity(0, upper, cpus) < 0; upper *= 2) {
if (!realloc(cpus, 2*upper))
return -ENOMEM;
}
while (lower < upper) {
middle = (lower + upper)/2;
if (!realloc(cpus, middle))
return -ENOMEM;
if (getaffinity(0, middle, cpus) < 0)
lower = middle;
else
upper = middle;
}
return CHAR_BIT*upper;
}

static int getaffinity(pid_t pid, size_t size, unsigned long *cpus)
{
return syscall(__NR_sched_getaffinity, pid, size, cpus);
}

2004-06-04 18:21:19

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 11:12:33AM -0700, William Lee Irwin III wrote:
> while (lower < upper) {

--- nr_cpus.c.orig 2004-06-04 11:16:16.492419568 -0700
+++ nr_cpus.c 2004-06-04 11:16:26.508896832 -0700
@@ -23,7 +23,7 @@
if (!realloc(cpus, 2*upper))
return -ENOMEM;
}
- while (lower < upper) {
+ while (lower < upper - 1) {
middle = (lower + upper)/2;
if (!realloc(cpus, middle))
return -ENOMEM;

2004-06-04 18:29:42

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III <[email protected]> wrote:
>
> _SC_NPROCESSOR_CONF is
> unimplementable. NR_CPUS serves as an upper bound on the number of cpus
> that may at some time be simultaneously present in the future.

NR_CPUS is arguably the correct thing when it comes to copying per-cpu info
to and from userspace.

Sometimes userspace wants to know NR_CPUS. Sometimes it wants to know the
index of the max possible CPU. Sometimes, perhaps the index of the max
online CPU. Sometimes the max index of the CPUs upon which this task is
eligible to run. Sometimes (lame) userspace may want to know, at compile
time, the maximum number of CPUs which a Linux kernel will ever support.

It's not completely trivial.

Which of the above is _SC_NPROCESSOR_CONF supposed to return?

2004-06-04 18:37:32

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
> Without any way to reliably determine this, luserspace is fscked.

I don't see why user code needs to determine NR_CPUS exactly. Any
reasonable upper bound should work - reasonable meaning doesn't waste
too many unused words of memory.

It's not really NR_CPUS that users need - its a reasonably close upper
bound to the size of the space that sched_getaffinity() must be provided
they need. And your code does a pretty good job of providing that.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 18:39:12

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III <[email protected]> wrote:
>> _SC_NPROCESSOR_CONF is
>> unimplementable. NR_CPUS serves as an upper bound on the number of cpus
>> that may at some time be simultaneously present in the future.

On Fri, Jun 04, 2004 at 11:27:30AM -0700, Andrew Morton wrote:
> NR_CPUS is arguably the correct thing when it comes to copying per-cpu info
> to and from userspace.
> Sometimes userspace wants to know NR_CPUS. Sometimes it wants to know the
> index of the max possible CPU. Sometimes, perhaps the index of the max
> online CPU. Sometimes the max index of the CPUs upon which this task is
> eligible to run. Sometimes (lame) userspace may want to know, at compile
> time, the maximum number of CPUs which a Linux kernel will ever support.
> It's not completely trivial.
> Which of the above is _SC_NPROCESSOR_CONF supposed to return?

_SC_NPROCESSORS_CONF looks like a glibc extension, and if so it's
somewhat arbitrary. It's not documented in the manpage or the header's
comments. I presumed it was the largest number of cpus that could be
simultaneously online in the running kernel instance (which is NR_CPUS
in current kernels). If it's a standard it's likely to be very poorly-
defined. These things differ as the implementations start varying e.g.
for very sparse cpuid spaces, but not anything we handle now.

I'd have to write more stuff to try to find the rest of these things.
I'm not sure all of them are implementable.

And the luserspace code needs the following atop all that:

--- nr_cpus.c.orig2 2004-06-04 11:23:28.130800560 -0700
+++ nr_cpus.c 2004-06-04 11:27:12.785647832 -0700
@@ -10,28 +10,35 @@

int main(void)
{
+ int cpus = detect_nr_cpus();
printf("%d\n", detect_nr_cpus());
- return 0;
+ return cpus <= 0;
}

static int detect_nr_cpus(void)
{
unsigned long *cpus = malloc(sizeof(long));
size_t upper, middle, lower = sizeof(long);
+ int ret = -ENOMEM;

+ if (!cpus)
+ return -ENOMEM;
for (upper = lower; getaffinity(0, upper, cpus) < 0; upper *= 2) {
if (!realloc(cpus, 2*upper))
- return -ENOMEM;
+ goto out;
}
while (lower < upper - 1) {
middle = (lower + upper)/2;
if (!realloc(cpus, middle))
- return -ENOMEM;
+ goto out;
if (getaffinity(0, middle, cpus) < 0)
lower = middle;
else
upper = middle;
}
+ ret = CHAR_BIT*upper;
+out:
+ free(cpus);
return CHAR_BIT*upper;
}

2004-06-04 18:44:02

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
>> Without any way to reliably determine this, luserspace is fscked.

On Fri, Jun 04, 2004 at 11:42:19AM -0700, Paul Jackson wrote:
> I don't see why user code needs to determine NR_CPUS exactly. Any
> reasonable upper bound should work - reasonable meaning doesn't waste
> too many unused words of memory.
> It's not really NR_CPUS that users need - its a reasonably close upper
> bound to the size of the space that sched_getaffinity() must be provided
> they need. And your code does a pretty good job of providing that.

Wrong. Apps that want to reconfigure the system to e.g. online more cpus
in response to heightened load want to know.


-- wli

2004-06-04 18:53:36

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
> +void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)

Good.

Assuming that you're signing up to push this baby along ...

Would it be better to remove the ENDIAN specific ifdefs from
lib/bitmap.c, and instead add something like the BIT32X() macro I
described earlier on this thread, in an aside to Mikael Pettersson?
If that macro were defined right in the ./linux/byteorder/*_endian.h
files, then endian neutral code could be put in lib/bitmap.c, and
endian aware code kept in the endian.h files.

Could you extend the cpumask_t API with a corresponding routine?

Mikael - does William's routine look like the makings of something
that fits your needs?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-04 19:09:47

by Anton Blanchard

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation


> This is patently ridiculous. Make a compat_sched_getaffinity(), and
> likewise for whatever else is copying unsigned long arrays to userspace.

Did someone say compat_sched_getaffinity?

Anton

--

Patch from Milton Miller that adds the sched_affinity syscalls into the
compat layer.

Signed-off-by: Milton Miller <[email protected]>
Signed-off-by: Anton Blanchard <[email protected]>

--

gr16b-anton/kernel/compat.c | 88 +++++++++++++++++++++++++++++++++++++++-----
1 files changed, 79 insertions(+), 9 deletions(-)

diff -purN linux-2.6.5/kernel/compat.c linux-2.6.5.sys_sched_setaffinity/kernel/compat.c
--- linux-2.6.5/kernel/compat.c 2004-04-04 03:37:07.000000000 +0000
+++ linux-2.6.5.sys_sched_setaffinity/kernel/compat.c 2004-05-10 11:13:20.000000000 +0000
@@ -372,22 +372,64 @@ compat_sys_wait4(compat_pid_t pid, compa
}
}

+/* for maximum compatability, we allow programs to use a single (compat)
+ * unsigned long bitmask if all cpus will fit. If not, you have to have
+ * at least the kernel size available.
+ */
+#define USE_COMPAT_ULONG_CPUMASK (NR_CPUS <= 8*sizeof(compat_ulong_t))
+
asmlinkage long compat_sys_sched_setaffinity(compat_pid_t pid,
unsigned int len,
compat_ulong_t *user_mask_ptr)
{
- unsigned long kernel_mask;
+ cpumask_t kernel_mask;
mm_segment_t old_fs;
int ret;

- if (get_user(kernel_mask, user_mask_ptr))
- return -EFAULT;
+ if (USE_COMPAT_ULONG_CPUMASK) {
+ compat_ulong_t user_mask;
+
+ if (len < sizeof(user_mask))
+ return -EINVAL;
+
+ if (get_user(user_mask, user_mask_ptr))
+ return -EFAULT;
+
+ kernel_mask = cpus_promote(user_mask);
+ } else {
+ if (len < sizeof(kernel_mask))
+ return -EINVAL;
+
+ if (!access_ok(VERIFY_READ, user_mask_ptr, sizeof(kernel_mask)))
+ return -EFAULT;
+ else {
+ int i, j;
+ unsigned long *k, m;
+ compat_ulong_t um;
+
+ k = &cpus_coerce(kernel_mask);
+
+ for (i=0; i < sizeof(kernel_mask)/sizeof(m); i++) {
+ m = 0;
+
+ for (j = 0; j < sizeof(m)/sizeof(um); j++ ) {
+ if (__get_user(um, user_mask_ptr))
+ return -EFAULT;
+ user_mask_ptr++;
+ m <<= 4*sizeof(um);
+ m <<= 4*sizeof(um);
+ m |= um;
+ }
+ *k++ = m;
+ }
+ }
+ }

old_fs = get_fs();
set_fs(KERNEL_DS);
ret = sys_sched_setaffinity(pid,
sizeof(kernel_mask),
- &kernel_mask);
+ (unsigned long *)&kernel_mask);
set_fs(old_fs);

return ret;
@@ -396,21 +438,49 @@ asmlinkage long compat_sys_sched_setaffi
asmlinkage int compat_sys_sched_getaffinity(compat_pid_t pid, unsigned int len,
compat_ulong_t *user_mask_ptr)
{
- unsigned long kernel_mask;
+ cpumask_t kernel_mask;
mm_segment_t old_fs;
int ret;

+ if (len < (USE_COMPAT_ULONG_CPUMASK ? sizeof(compat_ulong_t)
+ : sizeof(kernel_mask)))
+ return -EINVAL;
+
old_fs = get_fs();
set_fs(KERNEL_DS);
ret = sys_sched_getaffinity(pid,
sizeof(kernel_mask),
- &kernel_mask);
+ (unsigned long *)&kernel_mask);
set_fs(old_fs);

if (ret > 0) {
- ret = sizeof(compat_ulong_t);
- if (put_user(kernel_mask, user_mask_ptr))
- return -EFAULT;
+ if (USE_COMPAT_ULONG_CPUMASK) {
+ ret = sizeof(compat_ulong_t);
+ if (put_user(cpus_coerce(kernel_mask), user_mask_ptr))
+ return -EFAULT;
+ } else {
+ int i, j, err;
+ unsigned long *k, m;
+ compat_ulong_t um;
+
+ err = ! access_ok(VERIFY_WRITE, user_mask_ptr, ret);
+
+ k = &cpus_coerce(kernel_mask);
+
+ for (i=0; i < sizeof(kernel_mask)/sizeof(m) && !err; i++) {
+ m = *k++;
+
+ for (j = 0; j < sizeof(m)/sizeof(compat_ulong_t) && !err; j++ ) {
+ um = m;
+ err |= __put_user(um, user_mask_ptr);
+ user_mask_ptr++;
+ m >>= 4*sizeof(compat_ulong_t);
+ m >>= 4*sizeof(compat_ulong_t);
+ }
+ }
+ if (err)
+ ret = -EFAULT;
+ }
}

return ret;

2004-06-04 19:18:15

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

At some point in the past, I wrote:
>> This is patently ridiculous. Make a compat_sched_getaffinity(), and
>> likewise for whatever else is copying unsigned long arrays to userspace.

On Sat, Jun 05, 2004 at 05:08:03AM +1000, Anton Blanchard wrote:
> Did someone say compat_sched_getaffinity?
> Anton

Thank you.

On Sat, Jun 05, 2004 at 05:08:03AM +1000, Anton Blanchard wrote:
> Patch from Milton Miller that adds the sched_affinity syscalls into the
> compat layer.
> Signed-off-by: Milton Miller <[email protected]>
> Signed-off-by: Anton Blanchard <[email protected]>

I'll sign off on it too if that helps any.


-- wli

2004-06-04 20:29:22

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Anton Blanchard <[email protected]> wrote:
>
>
> > This is patently ridiculous. Make a compat_sched_getaffinity(), and
> > likewise for whatever else is copying unsigned long arrays to userspace.
>
> Did someone say compat_sched_getaffinity?
>

aargh! It's back!

>
> --
>
> Patch from Milton Miller that adds the sched_affinity syscalls into the
> compat layer.

There's something about this patch which make me break out in hives. Does
it *really* need to be that complicated?

iirc, the last time I looked through this I was unable to convince myself
that it was endianness-correct. Is it?

2004-06-04 23:59:51

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew - I view the following as a discussion of further work
that might be done, on top of the cpumask patches I
have submitted. Except for one specific issue that
David Mosberger is working off-line with you and me,
I have not yet seen any reason to change the cpumask
patches as submitted. I hope that this additional
discussion is not discouraging you from considering
those cpumask patches for acceptance.

William Lee Irwin III wrote:
> > Yes - doing that 1-bit at a time in a per-cpu loop would be ugly.
> > We should leave cpus_addr() around, at least until such time as the
> > cpumask ADT provided routines to support exactly what you are doing -
> > copying up masks to user space as length specified arrays of uint.
>
> This is patently ridiculous. Make a compat_sched_getaffinity(), and
> likewise for whatever else is copying unsigned long arrays to userspace.

My mind reading skills are failing me. At the risk of opening myself to
further ridicule, which part of what I wrote is patently ridiculous,
why, and how does that differ from whatever you had in mind when you
recommended doing "likewise"?

Putting your comments aside for a moment ...

We have here a bit of suckage. The kernel bitmaps/cpumasks are arrays
of unsigned long, with the low order long in the low order array slot,
and the bytes within the longs in natural byte-order for that arch. The
sched_setaffinity/sched_getaffinity calls in the kernel copy this stuff
directly to/from user space. This doesn't work so well for 32 bit tasks
on a 64 bit big-endian kernel. [Begin off-topic alert] The glibc
sched_setaffinity and sched_getaffinity calls forcibly truncate the size
of masks to some constant hardcoded size -- you have to use
__SYSCALL(__NR_set_mempolicy) and such to get the real syscall. This
doesn't work so well for kernels compiled with NR_CPUS larger than the
hardcoded glibc size. [End off-topic alert] This also doesn't provide
any help to other code needing to move binary masks across the
kernel/user boundary, such as the perfctr kernel extension that Mikael
Pettersson <[email protected]> describes.

I presume that it is too late to change the low level format of masks
that the sched_setaffinity/sched_getaffinity API support. I'd be
delighted to be wrong on this presumption. So there is need for a
compat variant of these calls, for use by 32 bit apps on 64 bit kernels.
My first reaction to Milton Miller's compat_sched_getaffinity patch
that Anton reminded us of is similar to Andrew's. I haven't had the
intestinal fortitude to study the matter closer yet. Before actually
reading the code, I would expect that all it had to handle was the
swapping of 32 bit halves of 64 bit longs on 64 bit big endian kernels,
such as I described in my discussion of a mythical BIT32X() macro,
earlier in this thread. I would not expect it to have to make such a
big deal of the special case of one word masks, as distinct from n word
masks, though I agree that a 32 bit app should be able to use a single
32 bit word mask on a 64 bit kernel compiled with NR_CPUS <= 32.

A key question, since it seems the perfctr stuff Mikael Pettersson
describes is on its way into the main stream kernel, is whether any
other kernel binary bitmap/cpumask API should use the same format as
used by the kernel sched_setaffinity and sched_getaffinity, or use a
more easily portable format - say an array of 32 bit words rather than
an array of unsigned longs. One could make impassioned pleas either
way. Having one kernel represent the same type in two different binary
formats is a bit of a botch. But then again, arrays of 32 bit words are
'nicer'. And in fact, we already _have_ two formats required, since 32
bit apps on 64 bit end endian kernels necessarily see a different format
than their kernel uses natively -- indeed they use a format that is
essentially the same as perfctr is using now.

My vote, already cast when I slid the 32 bit chunk ascii format past
y'all (it's amazing now, that I managed to do that ...) would be to
export the array of 32 bit words format from the kernel, in all calls
except the set/get affinity calls, where we have already cast the die
otherwise. I like what I understand Mikael is trying to do here.

In any case, I'd hope that any big/little endian distinctions could be
encapsulated in macros provided by include/linux/byteorder headers. I'd
hope that whichever one or two formats the kernel exported were
supported by conversion routines in bitmap.c and bitmap.h, and if
useful, also made available via the cpumask_t API. Once cpumask
routines were available to convert the perfctr format, then that would
be one less use of the infamous cpus_addr() macro. We should minimize
'open coding' of the conversion routines outside of the bitmap routines,
which means look for the opportunity to move codes from both perfctr and
compat_sched_setaffinity into lib/bitmap.c.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-05 01:32:16

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
>> This is patently ridiculous. Make a compat_sched_getaffinity(), and
>> likewise for whatever else is copying unsigned long arrays to userspace.

On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> My mind reading skills are failing me. At the risk of opening myself to
> further ridicule, which part of what I wrote is patently ridiculous,
> why, and how does that differ from whatever you had in mind when you
> recommended doing "likewise"?

Ridiculous == some bizarre for_each_cpu() loop doing put put_user() once
for every bit of a cpumask_t.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> Putting your comments aside for a moment ...
> We have here a bit of suckage. The kernel bitmaps/cpumasks are arrays
> of unsigned long, with the low order long in the low order array slot,
> and the bytes within the longs in natural byte-order for that arch. The
> sched_setaffinity/sched_getaffinity calls in the kernel copy this stuff
> directly to/from user space. This doesn't work so well for 32 bit tasks
> on a 64 bit big-endian kernel. [Begin off-topic alert] The glibc
> sched_setaffinity and sched_getaffinity calls forcibly truncate the size
> of masks to some constant hardcoded size -- you have to use
> __SYSCALL(__NR_set_mempolicy) and such to get the real syscall. This
> doesn't work so well for kernels compiled with NR_CPUS larger than the
> hardcoded glibc size. [End off-topic alert] This also doesn't provide
> any help to other code needing to move binary masks across the
> kernel/user boundary, such as the perfctr kernel extension that Mikael
> Pettersson <[email protected]> describes.

Sounds like a glibc bug. It should probably dynamically detect
sizeof(cpumask_t), except of course that the API it's stuck with for
all time won't allow for dynamic allocation of the things.

Except when I look at my glibc headers, it's 1024 bits. And they're not
particularly recent glibc versions. SGI may need to get that bumped up,
but I doubt many others do.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> I presume that it is too late to change the low level format of masks
> that the sched_setaffinity/sched_getaffinity API support. I'd be
> delighted to be wrong on this presumption. So there is need for a
> compat variant of these calls, for use by 32 bit apps on 64 bit kernels.
> My first reaction to Milton Miller's compat_sched_getaffinity patch
> that Anton reminded us of is similar to Andrew's. I haven't had the
> intestinal fortitude to study the matter closer yet. Before actually
> reading the code, I would expect that all it had to handle was the
> swapping of 32 bit halves of 64 bit longs on 64 bit big endian kernels,
> such as I described in my discussion of a mythical BIT32X() macro,
> earlier in this thread. I would not expect it to have to make such a
> big deal of the special case of one word masks, as distinct from n word
> masks, though I agree that a 32 bit app should be able to use a single
> 32 bit word mask on a 64 bit kernel compiled with NR_CPUS <= 32.

I thought something more like this would work, but haven't tried it.
This wants a real copy_bitmap_to_user() helper unlike compat_set_fd_set().

Index: irqaction-2.6.7-rc2/fs/compat.c
===================================================================
--- irqaction-2.6.7-rc2.orig/fs/compat.c 2004-06-01 03:11:30.000000000 -0700
+++ irqaction-2.6.7-rc2/fs/compat.c 2004-06-04 10:28:44.190035000 -0700
@@ -40,6 +40,7 @@
#include <linux/nfsd/nfsd.h>
#include <linux/nfsd/syscall.h>
#include <linux/personality.h>
+#include <linux/cpu.h>

#include <net/sock.h> /* siocdevprivate_ioctl */

@@ -1394,6 +1395,31 @@
return ret;
}

+asmlinkage long compat_sched_getaffinity(compat_pid_t pid,
+ compat_uint_t len, compat_ulong_t __user *cpus)
+{
+ cpumask_t affinity;
+ int ret = 0;
+ task_t *task;
+
+ if (len < sizeof(cpumask_t))
+ return -EINVAL;
+ if (!access_ok(VERIFY_WRITE, cpus, sizeof(cpumask_t)))
+ return -EFAULT;
+ lock_cpu_hotplug();
+ read_lock(&tasklist_lock);
+ if ((task = pid ? find_task_by_pid(pid) : current))
+ cpus_and(affinity, task->cpus_allowed, cpu_possible_map);
+ else
+ ret = -ESRCH;
+ read_unlock(&tasklist_lock);
+ unlock_cpu_hotplug();
+ if (ret)
+ return ret;
+ compat_set_fd_set(NR_CPUS, cpus, cpus_addr(affinity));
+ return sizeof(cpumask_t);
+}
+
#if defined(CONFIG_NFSD) || defined(CONFIG_NFSD_MODULE)
/* Stuff for NFS server syscalls... */
struct compat_nfsctl_svc {


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> A key question, since it seems the perfctr stuff Mikael Pettersson
> describes is on its way into the main stream kernel, is whether any
> other kernel binary bitmap/cpumask API should use the same format as
> used by the kernel sched_setaffinity and sched_getaffinity, or use a
> more easily portable format - say an array of 32 bit words rather than
> an array of unsigned longs. One could make impassioned pleas either
> way. Having one kernel represent the same type in two different binary
> formats is a bit of a botch. But then again, arrays of 32 bit words are
> 'nicer'. And in fact, we already _have_ two formats required, since 32
> bit apps on 64 bit end endian kernels necessarily see a different format
> than their kernel uses natively -- indeed they use a format that is
> essentially the same as perfctr is using now.

This is trivial. Just like we needed ASCII marshalling, we need endian-
correct 32/64-bit bitmap marshalling.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> My vote, already cast when I slid the 32 bit chunk ascii format past
> y'all (it's amazing now, that I managed to do that ...) would be to
> export the array of 32 bit words format from the kernel, in all calls
> except the set/get affinity calls, where we have already cast the die
> otherwise. I like what I understand Mikael is trying to do here.

The only case where this is distinguished at all from copy_to_user() is
64-bit bigendian with 32-bit userspace.


On Fri, Jun 04, 2004 at 05:05:42PM -0700, Paul Jackson wrote:
> In any case, I'd hope that any big/little endian distinctions could be
> encapsulated in macros provided by include/linux/byteorder headers. I'd
> hope that whichever one or two formats the kernel exported were
> supported by conversion routines in bitmap.c and bitmap.h, and if
> useful, also made available via the cpumask_t API. Once cpumask
> routines were available to convert the perfctr format, then that would
> be one less use of the infamous cpus_addr() macro. We should minimize
> 'open coding' of the conversion routines outside of the bitmap routines,
> which means look for the opportunity to move codes from both perfctr and
> compat_sched_setaffinity into lib/bitmap.c.


Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_array.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_array.h 2004-05-29 23:26:10.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_array.h 2004-06-04 18:29:36.984743000 -0700
@@ -27,6 +27,8 @@
#define first_cpu(map) find_first_bit((map).mask, NR_CPUS)
#define next_cpu(cpu, map) find_next_bit((map).mask, NR_CPUS, cpu + 1)

+#define cpus_to_u32_array(d, s) bitmap_to_u32_array(d, (s).mask, sizeof(cpumask_t))
+
/* only ever use this for things that are _never_ used on large boxen */
#define cpus_coerce(map) ((map).mask[0])
#define cpus_promote(map) ({ cpumask_t __cpu_mask = CPU_MASK_NONE;\
Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_arith.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_arith.h 2004-05-29 23:26:26.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_arith.h 2004-06-04 18:29:41.238097000 -0700
@@ -38,6 +38,8 @@
#define CPU_MASK_ALL (~((cpumask_t)0) >> (8*sizeof(cpumask_t) - NR_CPUS))
#define CPU_MASK_NONE ((cpumask_t)0)

+#define cpus_to_u32_array(d, s) bitmap_to_u32_array(d, &(s), sizeof(cpumask_t))
+
/* only ever use this for things that are _never_ used on large boxen */
#define cpus_coerce(map) ((unsigned long)(map))
#define cpus_promote(map) ({ map; })
Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_up.h
===================================================================
--- irqaction-2.6.7-rc2.orig/include/asm-generic/cpumask_up.h 2004-05-29 23:25:55.000000000 -0700
+++ irqaction-2.6.7-rc2/include/asm-generic/cpumask_up.h 2004-06-04 18:29:46.573286000 -0700
@@ -40,6 +40,8 @@
#define first_cpu(map) (cpus_coerce(map) ? 0 : 1)
#define next_cpu(cpu, map) 1

+#define cpus_to_u32_array(d, s) bitmap_to_u32_array(d, &(s), sizeof(cpumask_t))
+
/* only ever use this for things that are _never_ used on large boxen */
#define cpus_promote(map) \
({ \

-- wli

2004-06-05 02:53:47

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 11:38:15AM -0700, William Lee Irwin III wrote:
> if (!realloc(cpus, 2*upper))
> - return -ENOMEM;
> + goto out;

... and as someone pointed out:


--- nr_cpus.c.orig3 2004-06-04 19:49:15.000000000 -0700
+++ nr_cpus.c 2004-06-04 19:49:31.000000000 -0700
@@ -24,12 +24,12 @@
if (!cpus)
return -ENOMEM;
for (upper = lower; getaffinity(0, upper, cpus) < 0; upper *= 2) {
- if (!realloc(cpus, 2*upper))
+ if (!(cpus = realloc(cpus, 2*upper)))
goto out;
}
while (lower < upper - 1) {
middle = (lower + upper)/2;
- if (!realloc(cpus, middle))
+ if (!(cpus = realloc(cpus, middle)))
goto out;
if (getaffinity(0, middle, cpus) < 0)
lower = middle;

2004-06-05 03:29:51

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 07:51:31PM -0700, William Lee Irwin III wrote:
> ... and as someone pointed out:
> - if (!realloc(cpus, 2*upper))
> + if (!(cpus = realloc(cpus, 2*upper)))

--- nr_cpus.c.orig4 2004-06-04 20:25:36.000000000 -0700
+++ nr_cpus.c 2004-06-04 20:26:08.000000000 -0700
@@ -17,20 +17,24 @@

static int detect_nr_cpus(void)
{
- unsigned long *cpus = malloc(sizeof(long));
+ unsigned long *tmp, *cpus = malloc(sizeof(long));
size_t upper, middle, lower = sizeof(long);
int ret = -ENOMEM;

if (!cpus)
return -ENOMEM;
for (upper = lower; getaffinity(0, upper, cpus) < 0; upper *= 2) {
- if (!(cpus = realloc(cpus, 2*upper)))
+ if (!(tmp = realloc(cpus, 2*upper)))
goto out;
+ else
+ cpus = tmp;
}
while (lower < upper - 1) {
middle = (lower + upper)/2;
- if (!(cpus = realloc(cpus, middle)))
+ if (!(tmp = realloc(cpus, middle)))
goto out;
+ else
+ cpus = tmp;
if (getaffinity(0, middle, cpus) < 0)
lower = middle;
else

2004-06-05 06:43:07

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 11:42:19AM -0700, Paul Jackson wrote:
> I don't see why user code needs to determine NR_CPUS exactly.

William Lee Irwin III replied:
> Wrong. Apps that want to reconfigure the system to e.g. online more cpus
> in response to heightened load want to know.

Earlier, Andrew Morton wrote:
> Sometimes userspace wants to know NR_CPUS. Sometimes it wants ...


I disagree, Andrew and William.

What the kernel should provide user space is:

* cpu_possible_map - which CPU's are possible at all
* cpu_present_map - which CPU's are presently plugged in
* cpu_online_map - which CPU's are online for scheduling
* other more specific maps, such as perhaps perfctr stuff
* the size of these maps, for dynamic allocation

See further the include/linux/cpumask.h file in my recent patch set for
consolidated documentation of these maps.

Currently, in the HOTPLUG configuration, cpu_possible_map happens to be
just all the CPUs from 0 to NR_CPUS-1 (i.e. CPU_MASK_ALL), but that is a
detail of the current implementation that should _not_ be used to drive
the design of what we expose to user space.

The other stuff Andrew mentions, such as max CPU number possible or
whatever, can and should be computed by user space code, from the above.

Just knowing NR_CPUS won't be much use to hotplug user code,
except as an overly specific way to determine the size of these maps.

The size of the CPU maps could be in bytes (the usual sizeof unit) or
for the present format, in number of unsigned longs. Probably bytes is
less surprising, even though it's slightly overly specified (the bottom
2 or 3 bits will always be zero).

I see no reason why user space needs to distinguish between NR_CPUS == 8
and NR_CPUS == 4, say, beyond what is visible in such maps as above. And
if at any time one of the maps is sparse, then a single small integer
such as NR_CPUS is immediately inadequate.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-05 06:55:31

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Andrew wrote:
> In that case the cpumask code should provide some API function which
> converts a cpumask_t into (and from?) some canonical and documented form.
> Then you copy what it gave you to userspace.

Exactly. I said something similar in my earlier reply.
But not the same. Andrew nailed it.

And the bitmap code should provide that conversion code,
since cpumask is just (soon, I hope ;) a thin layer on
bitmap.

And there might be bitop or byteorder arch specific code
beneath that.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-05 07:23:14

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William, Anton:

If William recoded his bitmap_to_u32_array() routine, provided elsewhere
on this thread, to take as the length argument 'nwords', not the number
of source u64 words, but rather the number (possibly an odd number) of
u32 dest words, then could that routine be used to significantly simply
this compat_sched_getaffinity() code?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-05 07:59:31

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William wrote:
> Ridiculous == ...

oh - ok - we agree

> Sounds like a glibc bug.

We agree. Someone in glibc land doesn't.

> it's 1024 bits ... SGI may need to get that bumped up,
> but I doubt many others do.

SGI fits, for now.

Someday, someone, won't.

It's baked in to glibc, so someday, someone will have a bit of pain.
Oh well ... Good chance that someone will include me. Not much I
can do about it now.

> +asmlinkage long compat_sched_getaffinity(compat_pid_t pid, ...

That looks more readable. Thanks.

Do you see a sensible way to pass back an odd number of u32's? If
NR_CPUS is say 8, then the 32 bit user code might expect to only need
one compat_ulong_t, not two. If NR_CPUS is 48, then it should only need
3, not 4. And so forth.

As I noted in another reply, perhaps your bitmap_to_u32_array() code
could be modified to handle this.

And I agree with Andrew's suggestion, that cpumask provide the
conversion, to and from kernel memory, separately from copying the
result to user space.

> This is trivial. Just like we needed ASCII marshalling ...

The implementation is trivial.

The API design choice was the point of my analysis. Not how to do it,
but what to do.

Should the kernel support two flavors of bitmap marshalling, where it is
headed now, with the sched_setaffinity() format differing from the
perfctr format? Or should we pick one, and demand that the other
change?

Since I like the perfctr format better, and since I suspect it is to
late to change the sched_setaffinity format, I am resigned to supporting
two binary bitmap formats, across the kernel/user API boundary, forever.

Actually, the two forms are close. They differ just in the big endian
64 bit case. And if you were able to handle an odd number of u32 dest
words in your bitmap_to_u32_array() code, then perhaps that single bit
of code could serve as the marshalling for both. So we end up with two
variants of one flavor, differing only in whether you want 32 or 64 bit
chunks when running on a 64 bit arch, the 64 bit chunks being the native
kernel bitmap representation, for now at least.

That's probably about as good as we are going to do with this.

> The only case where this is distinguished at all from copy_to_user() is
> 64-bit bigendian with 32-bit userspace.

Yes - exactly. Well, almost. Either 32-bit userspace compatibility,
or 32 bit chunks for improved portability, such as perfctr has chosen,
if I understand them correctly.

> Index: irqaction-2.6.7-rc2/include/asm-generic/cpumask_array.h

Hmmm ... do you use Quilt too? That's the only place I recall seeing
this "Index" line. Cool.

If Andrew accepts my cpumask patches, then you will presumably have
to do your addition of cpus_to_u32_array(). Trivial, of course.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-05 08:27:15

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Sat, Jun 05, 2004 at 01:04:44AM -0700, Paul Jackson wrote:
> That looks more readable. Thanks.
> Do you see a sensible way to pass back an odd number of u32's? If
> NR_CPUS is say 8, then the 32 bit user code might expect to only need
> one compat_ulong_t, not two. If NR_CPUS is 48, then it should only need
> 3, not 4. And so forth.
> As I noted in another reply, perhaps your bitmap_to_u32_array() code
> could be modified to handle this.
> And I agree with Andrew's suggestion, that cpumask provide the
> conversion, to and from kernel memory, separately from copying the
> result to user space.

So do it in the caller, e.g..

int copy_cpus_to_user32(const u32 __user *ubuf, cpumask_t cpus)
{
int i, ret, len = ALIGN(NR_CPUS, 32)/(32/sizeof(u32));
u32 *ary = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
if (!ary)
return -ENOMEM;
cpus_to_u32_array(ary, cpus);
ret = copy_to_user(ubuf, ary, len);
kfree(ary);
return ret;
}

int copy_cpus_from_user32(cpumask_t *cpus, const u32 __user *ubuf)
{
int i, ret, len = ALIGN(NR_CPUS, 32)/(32/sizeof(u32));
u32 *ary = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
if (!ary)
return -ENOMEM;
if (!(ret = copy_from_user(ary, ubuf, len)))
cpus_from_u32_array(cpus, ary);
kfree(ary);
return ret;
}

or some such nonsense, or whatever someone can be arsed to consider a
better idea. One should note such a reformatting, as a user ABI change
in a stable series, would be unfriendly to 64-bit userspace on BE boxen.
i.e. do this only for 32-bit target userspace on 64-bit kernels + boxen.


-- wli

2004-06-06 02:08:14

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Sat, 2004-06-05 at 04:12, William Lee Irwin III wrote:
> /* William Lee Irwin III:
> >> I'm thoroughly disgusted.
>
> On Fri, Jun 04, 2004 at 10:47:56AM -0700, Paul Jackson wrote:
> > Yup ... LOL. One sick piece of code.

We've been here before. I argued the userspace interface was broken to
require this looping, Linus said it was fine, Ingo said "userspace will
assume < 1024 cpus" and if we get more than that we'll need a new
interface, and that's what glibc does today with its cpu_set_t.

Shades of select-style pain, but it's not likely to change in the near
future.

Note also that saying "Schedule me on CPU 1 and 999" 'succeeds' at the
moment.

Yes, NR_CPUS needs to get to userspace somehow sanely if we want to fix
this in general.

Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-06-06 08:03:46

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

pj wrote:
> but rather the number (possibly an odd number) of u32 dest words,

or the byte size of the destination buffer ...

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 08:16:24

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

pj wrote:
>> but rather the number (possibly an odd number) of u32 dest words,

On Sun, Jun 06, 2004 at 01:07:47AM -0700, Paul Jackson wrote:
> or the byte size of the destination buffer ...

I posted some code for you to cherrypick and run with here.
i.e. the copy_cpus_to_user32()/copy_cpus_from_user32() stuff.
Should be Message-ID: <[email protected]>

I can resend as a patch if need be.


-- wli

2004-06-06 08:34:50

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William Lee Irwin III wrote:
> or whatever someone can be arsed to consider a better idea.

If anyone lurking feels the urge to drive this puppy home, jump in.

I'm unavailable, and from what I can guess reading between William's
lines, he's not signed up either. Be forewarned - it's an area that can
generate some long lkml threads ;). Both William and I seem to have an
ample supply of keystrokes.


> a user ABI change in a stable series, would be unfriendly

I agree. While I contemplated such, I don't recall advocating such,
for the reason you state. We're stuck at least for now with the
sched_(set/get)affinity ABI.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 12:09:32

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Rusty wrote:
> Yes, NR_CPUS needs to get to userspace somehow sanely if we want to fix
> this in general.

Are you saying that NR_CPUS is needed, or just the number of longs in a
cpumask (sizeof (cpumask_t), essentially)?

I can see where the size is needed, in order to make the system calls to
set and get masks of arbitrary size. Since these sizes are a multiple
of sizeof(long), at a minimum this means user code needs to know the
number of longs in a mask. Though the number of bytes, as in
sizeof(cpumask_t), rather than of longs, is perhaps a less surprising
interface.

I can't see where the user code cares whether NR_CPUS is 47 or 48?

Am I missing something?

I am a firm believer in passing the minimum essential information across
major boundaries. Passing too much creates maintaince problems, and
encourages misuse of information, resulting in bogus user code.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 12:14:26

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

Rusty wrote:
>> Yes, NR_CPUS needs to get to userspace somehow sanely if we want to fix
>> this in general.

On Sun, Jun 06, 2004 at 05:16:57AM -0700, Paul Jackson wrote:
> Are you saying that NR_CPUS is needed, or just the number of longs in a
> cpumask (sizeof (cpumask_t), essentially)?
> I can see where the size is needed, in order to make the system calls to
> set and get masks of arbitrary size. Since these sizes are a multiple
> of sizeof(long), at a minimum this means user code needs to know the
> number of longs in a mask. Though the number of bytes, as in
> sizeof(cpumask_t), rather than of longs, is perhaps a less surprising
> interface.
> I can't see where the user code cares whether NR_CPUS is 47 or 48?
> Am I missing something?
> I am a firm believer in passing the minimum essential information across
> major boundaries. Passing too much creates maintaince problems, and
> encourages misuse of information, resulting in bogus user code.

You've been told, and several times already. The current example is
userspace needing to know when to stop trying to online cpus.

-- wli

2004-06-06 12:20:48

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> You've been told, and several times already. The current example is
> userspace needing to know when to stop trying to online cpus.

To which I've answered, you are describing cpu_possible_map,
cpu_present_map and cpu_online_map, not NR_CPUS.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 12:26:58

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> I can resend as a patch if need be.

No need to. I have it. As I replied on lkml (after you
offered to resend the patch, but before I had read this
offer), I am not available to push this puppy home.

I take it you are not either.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 12:36:19

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

At some point in the past, my attribution was unkindly stripped from:
>> You've been told, and several times already. The current example is
>> userspace needing to know when to stop trying to online cpus.

On Sun, Jun 06, 2004 at 05:28:43AM -0700, Paul Jackson wrote:
> To which I've answered, you are describing cpu_possible_map,
> cpu_present_map and cpu_online_map, not NR_CPUS.

Why do I have to put up with this and why do they always come after me?
All of their sizes are rounded up to CHAR_BIT*sizeof(cpumask_t), and
all of their contents are variable.

Hmm. /proc/config.gz will do for now.


-- wli

2004-06-06 13:34:25

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William wrote:
> Why do I have to put up with this and why do they always come after me?

I do not understand this. Probably just as well.

> All of their sizes are rounded up to CHAR_BIT*sizeof(cpumask_t)

The number of bits in a mask is CHAR_BIT*sizeof(cpumask_t), yes.
So -- your point being?

> and all of their contents are variable.

No - not unless my commentary describing these maps in the cpumask.h
file in this patch is wrong. Some of the contents are fixed at boot,
such as cpu_possible_map. And again, your point being?

> Hmm. /proc/config.gz will do for now.

Only available if IKCONFIG, IKCONFIG_PROC is configured on.

Care to try again ...

Really, if you want to know which CPUs it is possible to have present,
you are describing cpu_possible_map. While it is an accident of the
current implementation that in the HOTPLUG configuration this is set to
all CPUs from 0 to NR_CPUS-1, i.e. to CPU_MASK_ALL, that fact is not a
design constant, and hence not any basis for designing what information
should be exposed by the kernel to user space.

The user code needs to know the value of these maps, and they need to
know how big (bytes or longs) the maps are, so they can get them across
the kernel/user boundary.

If you claim they need to know whether NR_CPUS is 47 or 48, for hotplug
purposes, then I can only presume that you are assuming that
cpu_possible_map is by design always set to CPU_MASK_ALL. We wouldn't
even have a cpu_possible_map if that were a design constant.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 15:09:23

by Mikael Pettersson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, 4 Jun 2004 12:01:46 -0700, Paul Jackson wrote:
>William Lee Irwin III wrote:
>> +void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)
...
>Mikael - does William's routine look like the makings of something
>that fits your needs?

It could, except I think it gets the word order wrong:

+#if defined(__BIG_ENDIAN) && BITS_PER_LONG == 64
+void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)
+{
+ int i;
+
+ for (i = 0; i < nwords; ++i) {
+ u64 word = src[i];
+ dst[2*i] = word >> 32;
+ dst[2*i+1] = word;
+ }
+}
+#else
+void bitmap_to_u32_array(u32 *dst, unsigned long *src, int nwords)
+{
+ memcpy(dst, src, nwords*sizeof(unsigned long));
+}
+#endif

Notice how it emits the high int before the low int.
(Which btw also is the native big-endian storage order,
so the memcpy() would have done the same.)

Now consider the location of bit 0, with mask value 1(*),
on a 64-bit big-endian machine. The code above puts this
in the second int, as bit 0 in *((char*)dst + 7).
But a 32-bit user-space, or a 64-bit user-space that sees
an array of ints not longs, wants it in the first int,
as bit 0 in *((char*)dst + 3).

Perfctr's marshalling procedure for cpumask_t values
(drivers/perfctr/init.c:cpus_copy_to_user() in recent -mm)
is endian-neutral and converts each long by emitting the
ints from least significant to most significant.

Considering the API for retrieving an array of unknown size,
perfctr's marshalling procedure does the following:

> const unsigned int k_nrwords = PERFCTR_CPUMASK_NRLONGS*(sizeof(long)/sizeof(int));
> unsigned int u_nrwords;
> if (get_user(u_nrwords, &argp->nrwords))
> return -EFAULT;
> if (put_user(k_nrwords, &argp->nrwords))
> return -EFAULT;
> if (u_nrwords < k_nrwords)
> return -EOVERFLOW;

That is, it always tells user-space how much space is needed,
and if user-space provided too little, it gets EOVERFLOW.
Knowing the number of words in the encoded cpumask_t also
avoids having to know the exact value of NR_CPUS in user-space.

/Mikael

(*) Normal bit order, not IBM POWER's reversed bit order.

2004-06-06 16:45:33

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Sun, Jun 06, 2004 at 05:07:59PM +0200, Mikael Pettersson wrote:
> Notice how it emits the high int before the low int.
> (Which btw also is the native big-endian storage order,
> so the memcpy() would have done the same.)
> Now consider the location of bit 0, with mask value 1(*),
> on a 64-bit big-endian machine. The code above puts this
> in the second int, as bit 0 in *((char*)dst + 7).
> But a 32-bit user-space, or a 64-bit user-space that sees
> an array of ints not longs, wants it in the first int,
> as bit 0 in *((char*)dst + 3).

Feh. So swap the assignments.


On Sun, Jun 06, 2004 at 05:07:59PM +0200, Mikael Pettersson wrote:
> Perfctr's marshalling procedure for cpumask_t values
> (drivers/perfctr/init.c:cpus_copy_to_user() in recent -mm)
> is endian-neutral and converts each long by emitting the
> ints from least significant to most significant.
> Considering the API for retrieving an array of unknown size,
> perfctr's marshalling procedure does the following:
> > const unsigned int k_nrwords = PERFCTR_CPUMASK_NRLONGS*(sizeof(long)/sizeof(int));
> > unsigned int u_nrwords;
> > if (get_user(u_nrwords, &argp->nrwords))
> > return -EFAULT;
> > if (put_user(k_nrwords, &argp->nrwords))
> > return -EFAULT;
> > if (u_nrwords < k_nrwords)
> > return -EOVERFLOW;
> That is, it always tells user-space how much space is needed,
> and if user-space provided too little, it gets EOVERFLOW.
> Knowing the number of words in the encoded cpumask_t also
> avoids having to know the exact value of NR_CPUS in user-space.
> /Mikael
> (*) Normal bit order, not IBM POWER's reversed bit order.

I don't really care about the particular format exported to userspace,
but cpus_addr() is not a legitimate API. cpus_copy_to_user() is, but it
should belong to the core. Just shove the stuff that's doing cpus_addr()
for internals of cpumask_t into lib/bitmap.c etc. and it should be fine.


-- wli

2004-06-06 17:41:07

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

William wrote:
> I don't really care about the particular format exported to userspace,
> but cpus_addr() is not a legitimate API.

I'd like to thank-you for pointing out cpus_addr() to me several months
ago, when I unwittingly proposed to replace it, with something else of
a different name, doing the same thing.

I agree it is not legitimate - to the extent that it remains, the cleanup
of cpumasks is not yet complete. Though, with my patch set of this week,
I think we're making good progress.

I am a little puzzled at the strength of your latest objections to it.
For all I know, it may well be your own invention. It's been there a
while, since before my time with this code.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-06 23:22:34

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Sun, 2004-06-06 at 22:16, Paul Jackson wrote:
> Rusty wrote:
> > Yes, NR_CPUS needs to get to userspace somehow sanely if we want to fix
> > this in general.
>
> Are you saying that NR_CPUS is needed, or just the number of longs in a
> cpumask (sizeof (cpumask_t), essentially)?

You're right. Three things are required.

1) Access to cpu_online_map (currently usually intuited from
/proc/cpuinfo)
2) Notification of cpu add/remove (currently via /sbin/hotplug)
3) Minimum size of cpumask_t (currently hardcoded, could be detected by
looping).

Although we don't, in general, know the size of long (think i386 binary
on x86_64), in practice if you always round NR_CPUS up to 64-bits you
can get #3.

> I am a firm believer in passing the minimum essential information across
> major boundaries. Passing too much creates maintaince problems, and
> encourages misuse of information, resulting in bogus user code.

In this case, though, the early example programs for setaffinity all
used "unsigned long mask; sys_sched_setaffinity(...&mask,
sizeof(mask))", which was both simple and wrong. Similarly, getaffinity
users who didn't zero the mask before handing it to the kernel.

Oh well,
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-06-07 06:36:43

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

> Three things are required.

Thank-you, Rusty. That makes more sense. The first step in designing
good API's is to be clear what information it is essential to pass.


> In this case, though, the early example programs for setaffinity all
> used "unsigned long mask; sys_sched_setaffinity(...&mask,
> sizeof(mask))", which was both simple and wrong.

Yeah ... several layers of suckage are in the kernel bitmap layout,
kernel/user API for passing bitmaps, and low level glibc sched set and
get affinity API. You identify another one. Not felony (major)
suckage, just petty (minor). Annoying none the less. And the source of
quite a few hours of "software maintenance" work. This margin is too
small to contain the details ;).

Oh well ... I've done my stint for now (hopeful past tense, again)
trying to leave things a little neater than when I arrived.

I hope to turn now back to the cpuset work that I've been assisting
Simon Derr (Bull) on.

Thanks, Rusty, William, Matthew, Nick and others.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-07 07:55:20

by Anton Blanchard

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation


> aargh! It's back!

Its stalking you.

> There's something about this patch which make me break out in hives. Does
> it *really* need to be that complicated?

If we dont want maximum backwards compatibility we can get rid of the <
sizeof(long) bits.

> iirc, the last time I looked through this I was unable to convince myself
> that it was endianness-correct. Is it?

Should be, but we could boot it on something 64bit little endian to prove
it.

Anton

2004-06-07 16:55:05

by Joe Korty

[permalink] [raw]
Subject: fix up compat_sched_[get/set]affinity

On Sat, Jun 05, 2004 at 01:26:47AM -0700, William Lee Irwin III wrote:
> On Sat, Jun 05, 2004 at 01:04:44AM -0700, Paul Jackson wrote:
> > That looks more readable. Thanks.
> > Do you see a sensible way to pass back an odd number of u32's? If
> > NR_CPUS is say 8, then the 32 bit user code might expect to only need
> > one compat_ulong_t, not two. If NR_CPUS is 48, then it should only need
> > 3, not 4. And so forth.
> > As I noted in another reply, perhaps your bitmap_to_u32_array() code
> > could be modified to handle this.
> > And I agree with Andrew's suggestion, that cpumask provide the
> > conversion, to and from kernel memory, separately from copying the
> > result to user space.
>
> So do it in the caller, e.g..
>
> int copy_cpus_to_user32(const u32 __user *ubuf, cpumask_t cpus)
> {
> int i, ret, len = ALIGN(NR_CPUS, 32)/(32/sizeof(u32));
> u32 *ary = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
> if (!ary)
> return -ENOMEM;
> cpus_to_u32_array(ary, cpus);
> ret = copy_to_user(ubuf, ary, len);
> kfree(ary);
> return ret;
> }
>
> int copy_cpus_from_user32(cpumask_t *cpus, const u32 __user *ubuf)
> {
> int i, ret, len = ALIGN(NR_CPUS, 32)/(32/sizeof(u32));
> u32 *ary = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
> if (!ary)
> return -ENOMEM;
> if (!(ret = copy_from_user(ary, ubuf, len)))
> cpus_from_u32_array(cpus, ary);
> kfree(ary);
> return ret;
> }
>
> or some such nonsense, or whatever someone can be arsed to consider a
> better idea. One should note such a reformatting, as a user ABI change
> in a stable series, would be unfriendly to 64-bit userspace on BE boxen.
> i.e. do this only for 32-bit target userspace on 64-bit kernels + boxen.




Possible algorithms for the support routines needed by wli's code, above.
Completely untested, hope to refine and test soon.


void bitmap_to_compat_bitmap(compat_ulong_t *dest, const ulong_t *src, int nmaskbits)
{
ulong_t word;
int i, j;

for (j=0; j < nmaskbits; ) {
word = *dest++;
for (i=0; j < nmaskbits && i < BITS_PER_LONG; ) {
*src++ = word >> i;
i += BITS_PER_COMPAT_LONG;
j += BITS_PER_COMPAT_LONG;
}
}
}

void compat_bitmap_to_bitmap(ulong_t *dest, const compat_ulong_t *src, int nmaskbits)
{
ulong_t word;
int i,j;

for (j=0; j < nmaskbits; ) {
word = 0;
for (i=0; j < nmaskbits && i < BITS_PER_LONG; ) {
word |= *src++ << i;
i += BITS_PER_COMPAT_LONG;
j += BITS_PER_COMPAT_LONG;
}
*dest++ = word;
}
}

2004-06-07 17:07:50

by William Lee Irwin III

[permalink] [raw]
Subject: Re: fix up compat_sched_[get/set]affinity

On Mon, Jun 07, 2004 at 12:54:46PM -0400, Joe Korty wrote:
> Possible algorithms for the support routines needed by wli's code, above.
> Completely untested, hope to refine and test soon.

I had in mind:
#define cpus_to_u32_array(a,cpus) bitmap_to_u32_array(a,cpus,sizeof(cpus))
#define cpus_from_u32_array(a,cpus) bitmap_from_u32_array(a,cpus,sizeof(cpus))

Non-issue since Mikael already has some tested/etc. code to drop into
lib/bitmap.c or wherever.


-- wli

2004-06-09 00:14:15

by Paul Jackson

[permalink] [raw]
Subject: PATCH] cpumask 11/10 comment, spacing tweaks

Andrew - If you take my other cpumask patchs, could you add this one?
I just noticed that my 'big cpumask comment' was a bit stale.

No code changes - just spacing and comments.

Tweak cpumask.h comments, spacing:

1) Add comments for cpu_present_map macros: num_present_cpus() and cpu_present()
2) Remove comments for obsolete macros: cpu_set_online(), cpu_set_offline()
3) Reorder a few comment lines, to match the code and confuse readers of this patch
3) Tabify one chunk of code

include/linux/cpumask.h | 34 +++++++++++++++++++---------------
1 files changed, 19 insertions(+), 15 deletions(-)

Signed-off-by: Paul Jackson <[email protected]>

Index: 2.6.7-rc2-mm2/include/linux/cpumask.h
===================================================================
--- 2.6.7-rc2-mm2.orig/include/linux/cpumask.h 2004-06-07 15:45:25.000000000 -0700
+++ 2.6.7-rc2-mm2/include/linux/cpumask.h 2004-06-08 16:46:05.000000000 -0700
@@ -47,17 +47,21 @@
* int cpumask_scnprintf(buf, len, mask) Format cpumask for printing
* int cpumask_parse(ubuf, ulen, mask) Parse ascii string as cpumask
*
+ * for_each_cpu_mask(cpu, mask) for-loop cpu over mask
+ *
* int num_online_cpus() Number of online CPUs
* int num_possible_cpus() Number of all possible CPUs
+ * int num_present_cpus() Number of present CPUs
+ *
* int cpu_online(cpu) Is some cpu online?
* int cpu_possible(cpu) Is some cpu possible?
- * void cpu_set_online(cpu) set cpu in cpu_online_map
- * void cpu_set_offline(cpu) clear cpu in cpu_online_map
+ * int cpu_present(cpu) Is some cpu present (can schedule)?
+ *
* int any_online_cpu(mask) First online cpu in mask
*
- * for_each_cpu_mask(cpu, mask) for-loop cpu over mask
* for_each_cpu(cpu) for-loop cpu over cpu_possible_map
* for_each_online_cpu(cpu) for-loop cpu over cpu_online_map
+ * for_each_present_cpu(cpu) for-loop cpu over cpu_present_map
*
* Subtlety:
* 1) The 'type-checked' form of cpu_isset() causes gcc (3.3.2, anyway)
@@ -336,19 +340,19 @@
extern cpumask_t cpu_present_map;

#if NR_CPUS > 1
-#define num_online_cpus() cpus_weight(cpu_online_map)
-#define num_possible_cpus() cpus_weight(cpu_possible_map)
-#define num_present_cpus() cpus_weight(cpu_present_map)
-#define cpu_online(cpu) cpu_isset((cpu), cpu_online_map)
-#define cpu_possible(cpu) cpu_isset((cpu), cpu_possible_map)
-#define cpu_present(cpu) cpu_isset((cpu), cpu_present_map)
+#define num_online_cpus() cpus_weight(cpu_online_map)
+#define num_possible_cpus() cpus_weight(cpu_possible_map)
+#define num_present_cpus() cpus_weight(cpu_present_map)
+#define cpu_online(cpu) cpu_isset((cpu), cpu_online_map)
+#define cpu_possible(cpu) cpu_isset((cpu), cpu_possible_map)
+#define cpu_present(cpu) cpu_isset((cpu), cpu_present_map)
#else
-#define num_online_cpus() 1
-#define num_possible_cpus() 1
-#define num_present_cpus() 1
-#define cpu_online(cpu) ((cpu) == 0)
-#define cpu_possible(cpu) ((cpu) == 0)
-#define cpu_present(cpu) ((cpu) == 0)
+#define num_online_cpus() 1
+#define num_possible_cpus() 1
+#define num_present_cpus() 1
+#define cpu_online(cpu) ((cpu) == 0)
+#define cpu_possible(cpu) ((cpu) == 0)
+#define cpu_present(cpu) ((cpu) == 0)
#endif

#define any_online_cpu(mask) \


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-06-09 16:40:06

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] cpumask 5/10 rewrite cpumask.h - single bitmap based implementation

On Fri, Jun 04, 2004 at 02:54:03AM -0700, William Lee Irwin III wrote:
> I'd rather just do it.
> Index: irqaction-2.6.7-rc2/include/linux/interrupt.h
> ===================================================================
> --- irqaction-2.6.7-rc2.orig/include/linux/interrupt.h 2004-05-29 23:26:11.000000000 -0700
> +++ irqaction-2.6.7-rc2/include/linux/interrupt.h 2004-06-04 02:24:12.348627000 -0700

I should mention that I've tested this change on sparc64 and it worked
beautifully.


-- wli