2004-03-29 12:14:18

by Paul Jackson

[permalink] [raw]
Subject: [PATCH] mask ADT: new mask.h file [2/22]

Patch_2_of_22 - New mask ADT
Adds new include/linux/mask.h header file

==> See this mask.h header for more extensive mask documentation <==

diffstat Patch_2_of_22:
mask.h | 372 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 372 insertions(+)

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.1707 -> 1.1708
# (new) -> 1.1 include/linux/mask.h
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 04/03/28 [email protected] 1.1708
# New mask ADT to be used as common basis for cpumask and nodemask types.
# --------------------------------------------
#
diff -Nru a/include/linux/mask.h b/include/linux/mask.h
--- /dev/null Wed Dec 31 16:00:00 1969
+++ b/include/linux/mask.h Mon Mar 29 01:03:28 2004
@@ -0,0 +1,372 @@
+#ifndef _ASM_GENERIC_MASK_H
+#define _ASM_GENERIC_MASK_H
+
+/*
+ * include/linux/mask.h
+ *
+ * Copyright (c) 2004 Silicon Graphics, Inc. All rights reserved.
+ *
+ * Paul Jackson <[email protected]>
+ *
+ * This file is distributed under the GNU GENERAL PUBLIC LICENSE (GPL)
+ * Version 2 (June 1991). See the "COPYING" file distributed with this
+ * software for more info.
+ */
+
+/*
+ * Mask is an abstract data type for multi-word bit masks. Masks
+ * are bitmaps (arrays of unsigned longs) wrapped in a struct.
+ * The struct wrapper enables them to be assigned and passed as
+ * arguments. Masks are useful for representing sets of small
+ * numbers, such as the CPU numbers on a multi-processor system.
+ *
+ * How mask.h fits with bitops.h, bitmap.h, cpumask.h and nodemask.h:
+ *
+ * 1) bitmap.h and lib/bitmap.c provide several operations
+ * for manipulating bitmaps, as arrays of unsigned long.
+ * There are operations that and, or, set, clear, shift,
+ * print, parse, and test these bitmaps. The routines
+ * typically require a pointer to an array of unsigned longs,
+ * and a count of the number of valid bits therein.
+ *
+ * The byte ordering of bitmaps is more natural on little
+ * endian architectures. See the big-endian headers
+ * include/asm-ppc64/bitops.h and include/asm-s390/bitops.h
+ * for the best explanations of this ordering.
+ *
+ * 2) bitops.h provides a few specific inline routines for
+ * finding the first set bit and the Hamming weight of
+ * these same bitmaps. Several architectures provide
+ * include/asm-<arch>/bitops.h variants, optimized for
+ * specific processor instruction sets.
+ *
+ * 3) mask.h makes use of bitmap.h, bitops.h and a few other
+ * kernel utilities to provide a mask abstract data type
+ * (ADT) as a structure of an array of unsigned longs.
+ * mask.h attempts to provide a fairly complete set of
+ * operations, to make it easy for users to write clear
+ * concise and correct mask manipulation code.
+ *
+ * The mask macros use the fairly rich compile time
+ * optimizations of gcc in order to generate optimum code
+ * for each architecture. Some of these macros require the
+ * current mask type or number of bits - the cpumask.h and
+ * nodemask.h macros serve to hide such details.
+ *
+ * 4) cpumask.h and nodemask.h are both based on mask.h.
+ * They provide cpu and node specific macros that hide
+ * such details as NR_CPUS that might be needed by the
+ * lower level mask macros defined below.
+ *
+ * Summary:
+ * Don't use mask.h directly - use cpumask.h and nodemask.h.
+ *
+ * The available mask operations and their rough meaning in the
+ * case that "nbits == 8 * sizeof(single unsigned long)" are:
+ *
+ * __mask(nbits) Use to define new *mask types
+ * mask_setbit(bit, mask) mask |= bit
+ * mask_clearbit(bit, mask) mask &= ~bit
+ * mask_setall(mask, nbits) mask = ~0UL
+ * mask_clearall(mask) mask = 0UL
+ * mask_isset(bit, mask) Is bit set in mask?
+ * mask_test_and_set(bit, mask) mask | bit ? 0 : mask |= bit
+ * mask_and(dst, src1, src2) dst = src1 & src2
+ * mask_or(dst, src1, src2) dst = src1 | src2
+ * mask_xor(dst, src1, src2) dst = src1 ^ src2
+ * mask_andnot(dst, src1, src2) dst = src1 & ~src2
+ * mask_complement(dst, src, nbits) dst = ~src
+ * mask_equal(mask1, mask2) Are mask1 and mask2 equal?
+ * mask_intersects(mask1, mask2) Do mask1 and mask2 overlap?
+ * mask_subset(mask1, mask2) Is mask1 a subset of mask2?
+ * mask_empty(mask) Is mask empty?
+ * mask_full(mask, nbits) Are all bits set in mask?
+ * mask_weight(mask, nbits) Hamming Weight: number set bits
+ * mask_shift_right(dst, src, n, nbits) dst = src >> n
+ * mask_shift_left(dst, src, n, nbits) dst = src << n
+ * mask_first(mask, nbits) find_first_bit(mask, nbits)
+ * mask_next(bit, mask, nbits) find_next_bit(mask, size, bit, nbits)
+ * mask_of_bit(bit, T) returns mask with a single bit set
+ * MASK_ALL(nbits) returns mask with all bits set
+ * MASK_NONE(nbits) returns mask with no bits set
+ * mask_raw(mask) Array of unsigned long's in mask
+ * mask_scnprintf(buf, len, mask, nbits) scnprintf(buf, len, "%lx", mask, nbits)
+ * mask_parse(ubuf, ulen, mask, nbits) parse comma sep 32 bit words to mask
+ *
+ * Various Implementation Details
+ * ==============================
+ *
+ * The parameter 'T' above must be a variable of the appropriate
+ * mask type (cpumask_t or nodemask_t, for instance). This
+ * variable is only used for its typeof() information.
+ *
+ * For details of mask_scnprintf() and mask_parse(), see
+ * bitmap_scnprintf() and bitmap_parse() in lib/bitmap.c
+ *
+ * A new *mask type should be defined, such as cpumask_t or
+ * nodemask_t, for each possibly different sized (number of
+ * bits) bitmask based on this mask ADT. The definition
+ * for example of cpumask_t is:
+ * typedef __mask(NR_CPUS) cpumask_t;
+ *
+ * The previous definitions of CPU_MASK_ALL were inconsistent.
+ * For cpumasks of one word size or less, they set exactly
+ * NR_CPUS bits, leaving any remaining high order bits zero.
+ * For cpumasks of multiple words, all bits were set, which
+ * might result in a cpumask of weight > NR_CPUS, if NR_CPUS is
+ * not an exact multiple of the number of bits in an unsigned
+ * long. The MASK_ALL(nbits) macro fixes this inconsistency
+ * by refining the static initializor to set only valid bits
+ * in the last word.
+ *
+ * These macros presume that all masks passed in a given call
+ * are the same nbits long, and that only bits in positions
+ * b where 0 <= b < nbits might be set in input masks.
+ * They ensure that no additional bits outside this range
+ * become set (however don't protect against improperly set
+ * bits that are outside this range but still inside the array
+ * of unsigned longs representing the mask.) In other words,
+ * any implementation of these ops may assume as a precondition
+ * that any unused bits (bits in the array of unsigned
+ * longs, outside the range 0 to nbits-1) are zero. And any
+ * implementation of these ops must ensure as a postcondition
+ * on all output masks that this same precondition (unused
+ * bits are zero) holds. If you manage to create, by some
+ * other means, a mask with some unused bits non-zero, and
+ * then pass that mask to one of these mask operations, that
+ * operation may malfunction.
+ *
+ * The abstract bit model supported by these masks is that of
+ * an infinite set of bits, in positions numbered 0 and up,
+ * where all but the first 'nbits' bits are always zero.
+ * Calls that implicitly attempt to set any bit outside of
+ * the first 'nbits' bits successfully and quietly leave such
+ * bits as zero. Calls that query or modify specifically
+ * numbered bit positions require as a precondition that the
+ * specified bit position 'n' is the range 0 <= n < nbits, and
+ * may malfunction if handed a bit position outside this range.
+ *
+ * The mask_raw() op enables violating this model. It returns
+ * the address of the start of the array of unsigned
+ * longs, enabling the caller to directly manipulate them.
+ * This can be used to intermix mask ops with classic 'C' bit
+ * operations, usually only done on systems whose cpumask_t
+ * fits in a single unsigned long word.
+ *
+ * The underlying bitmap.c operations such as bitmap_and() and
+ * bitmap_or() don't follow this model. They don't assume
+ * the precondition that unused bits are zero, and they do
+ * mask off any unused portion of input masks in most cases.
+ * However the underlying bitop.h operations, such as set_bit()
+ * and clear_bit(), do no sanitizing of their inputs, depending
+ * heavily on preconditions.
+ *
+ * The declaration of the array _m[] of unsigned longs in the
+ * definition of __mask() below intentionally does not use
+ * the DECLARE_BITMAP() macro. It would be gratuitously opaque
+ * in this case - as the macro implementations below depend on
+ * the internal details of this declaration.
+ *
+ * The MASK_LAST_WORD() macro defines the value of the last (high
+ * order) word of a bitmask. In particular, for the case of a
+ * mask of a size that is an exact multiple of the word size,
+ * with all bits set, it ensures that this value is all one's,
+ * not all zero's.
+ *
+ * The file include/mask.h applies to all architectures.
+ * Architectures requiring custom details should provide
+ * them in their include/asm-<arch>/bitops.h file, and
+ * if necessary modify the common include/linux/mask.h
+ * file to conditionally generate the necessary code,
+ * depending on compile time settings. No need to write
+ * ugly #ifdef's to do this - gcc provides a rich set
+ * of compile time extensions. See further for example:
+ * http://gcc.gnu.org/onlinedocs/gcc-3.3.3/gcc/C-Extensions.html
+ *
+ * Some architectures (I'm told sparc32) do not pass structures
+ * efficiently as arguments to subroutine calls, even if the
+ * structure is just one word long. If you need to pass
+ * a cpumask or nodemask to a subroutine in a performance
+ * critical path on such an architecture, then as an
+ * alternative, pass the first unsigned long of the mask
+ * directly, using cpus_raw() or nodes_raw(). Of course,
+ * if your masks are more than one word long, this won't
+ * be adequate.
+ */
+
+#include "linux/bitops.h"
+#include "linux/string.h"
+#include "linux/types.h"
+#include "linux/kernel.h"
+#include "linux/bitmap.h"
+
+#define __mask(bits) struct { unsigned long _m[BITS_TO_LONGS(bits)]; }
+
+#define MASK_LAST_WORD(nbits) \
+( \
+ ((nbits) % BITS_PER_LONG) ? \
+ (1<<((nbits) % BITS_PER_LONG))-1 : ~0UL \
+)
+
+#define mask_setbit(bit, mask) \
+ set_bit((bit), (mask)._m)
+
+#define mask_clearbit(bit, mask) \
+ clear_bit((bit), (mask)._m)
+
+#define mask_setall(mask, nbits) \
+do { \
+ size_t sz_all_but_last = sizeof(mask) - sizeof(unsigned long); \
+ if (sz_all_but_last > 0) \
+ memset((mask)._m, 0xff, sz_all_but_last); \
+ (mask)._m[BITS_TO_LONGS(nbits)-1] = MASK_LAST_WORD(nbits); \
+} while(0)
+
+#define mask_clearall(mask) \
+do { \
+ if (sizeof(mask) == sizeof(unsigned long)) \
+ (mask)._m[0] = 0UL; \
+ else \
+ memset((mask)._m, 0, sizeof(mask)); \
+} while(0)
+
+#define mask_isset(bit, mask) \
+ test_bit((bit), (mask)._m)
+
+#define mask_test_and_set(bit, mask) \
+ test_and_set_bit(bit, (mask)._m)
+
+#define mask_and(dst, src1, src2) \
+do { \
+ if (sizeof(dst) == sizeof(unsigned long)) \
+ (dst)._m[0] = (src1)._m[0] & (src2)._m[0]; \
+ else \
+ bitmap_and((dst)._m, (src1)._m, (src2)._m, \
+ sizeof(dst) * 8); \
+} while(0)
+
+#define mask_or(dst, src1, src2) \
+do { \
+ if (sizeof(dst) == sizeof(unsigned long)) \
+ (dst)._m[0] = (src1)._m[0] | (src2)._m[0]; \
+ else \
+ bitmap_or((dst)._m, (src1)._m, (src2)._m, \
+ sizeof(dst) * 8); \
+} while(0)
+
+#define mask_xor(dst, src1, src2) \
+do { \
+ if (sizeof(dst) == sizeof(unsigned long)) \
+ (dst)._m[0] = (src1)._m[0] ^ (src2)._m[0]; \
+ else \
+ bitmap_xor((dst)._m, (src1)._m, (src2)._m, \
+ sizeof(dst) * 8); \
+} while(0)
+
+#define mask_andnot(dst, src1, src2) \
+do { \
+ if (sizeof(dst) == sizeof(unsigned long)) \
+ (dst)._m[0] = (src1)._m[0] & ~(src2)._m[0]; \
+ else \
+ bitmap_andnot((dst)._m, (src1)._m, (src2)._m, \
+ sizeof(dst) * 8); \
+} while(0)
+
+#define mask_complement(dst, src, nbits) \
+ bitmap_complement((dst)._m, (src)._m, nbits)
+
+#define mask_equal(mask1, mask2) \
+({ \
+ int r; \
+ if (sizeof(mask1) == sizeof(unsigned long)) \
+ r = ((mask1)._m[0] == (mask2)._m[0]); \
+ else { \
+ int nbits = sizeof(mask1) * 8; \
+ r = bitmap_equal((mask1)._m, (mask2)._m, (nbits)); \
+ } \
+ r; \
+})
+
+#define mask_intersects(mask1, mask2) \
+({ \
+ int r; \
+ if (sizeof(mask1) == sizeof(unsigned long)) \
+ r = (((mask1)._m[0] & (mask2)._m[0]) != 0); \
+ else { \
+ int nbits = sizeof(mask1) * 8; \
+ r = bitmap_intersects((mask1)._m, (mask2)._m, (nbits)); \
+ } \
+ r; \
+})
+
+#define mask_subset(mask1, mask2) \
+({ \
+ int r; \
+ if (sizeof(mask1) == sizeof(unsigned long)) \
+ r = (((mask1)._m[0] & ~(mask2)._m[0]) == 0); \
+ else { \
+ int nbits = sizeof(mask1) * 8; \
+ r = bitmap_subset((mask1)._m, (mask2)._m, (nbits)); \
+ } \
+ r; \
+})
+
+#define mask_empty(mask) \
+({ \
+ int r; \
+ if (sizeof(mask) == sizeof(unsigned long)) \
+ r = ((mask)._m[0] == 0UL); \
+ else { \
+ int nbits = sizeof(mask) * 8; \
+ r = bitmap_empty((mask)._m, nbits); \
+ } \
+ r; \
+})
+
+#define mask_full(mask, nbits) \
+ bitmap_full((mask)._m, (nbits))
+
+#define mask_weight(mask, nbits) \
+ bitmap_weight((mask)._m, (nbits))
+
+#define mask_shift_right(dst, src, n, nbits) \
+ bitmap_shift_right((dst)._m, (src)._m, (n), (nbits))
+
+#define mask_shift_left(dst, src, n, nbits) \
+ bitmap_shift_left((dst)._m, (src)._m, (n), (nbits))
+
+#define mask_first(mask, nbits) \
+ find_first_bit((mask)._m, (nbits))
+
+#define mask_next(bit, mask, nbits) \
+ find_next_bit((mask)._m, (nbits), (bit)+1)
+
+#define mask_of_bit(bit, T) \
+({ \
+ typeof(T) m; \
+ mask_clearall(m); \
+ mask_setbit((bit), m); \
+ m; \
+})
+
+#define MASK_ALL(nbits) \
+{{ \
+ [0 ... BITS_TO_LONGS(nbits)-1] = ~0UL, \
+ [BITS_TO_LONGS(nbits)-1] = MASK_LAST_WORD(nbits) \
+}}
+
+#define MASK_NONE(nbits) \
+{{ \
+ [0 ... BITS_TO_LONGS(nbits)-1] = 0UL \
+}}
+
+#define mask_raw(mask) \
+ ((mask)._m)
+
+#define mask_scnprintf(buf, len, mask, nbits) \
+ bitmap_scnprintf((buf), (len), ((mask)._m), (nbits))
+
+#define mask_parse(ubuf, ulen, mask, nbits) \
+ bitmap_parse((ubuf), (ulen), ((mask)._m), (nbits))
+
+#endif /* _ASM_GENERIC_MASK_H */


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373


2004-03-30 00:31:40

by Matthew Dobson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 04:12, Paul Jackson wrote:
> Patch_2_of_22 - New mask ADT
> Adds new include/linux/mask.h header file
>
> ==> See this mask.h header for more extensive mask documentation <==
>

<snip>

> + * Various Implementation Details
> + * ==============================
> + *
> + * The parameter 'T' above must be a variable of the appropriate
> + * mask type (cpumask_t or nodemask_t, for instance). This
> + * variable is only used for its typeof() information.
> + *
> + * For details of mask_scnprintf() and mask_parse(), see
> + * bitmap_scnprintf() and bitmap_parse() in lib/bitmap.c
> + *
> + * A new *mask type should be defined, such as cpumask_t or
> + * nodemask_t, for each possibly different sized (number of
> + * bits) bitmask based on this mask ADT. The definition
> + * for example of cpumask_t is:
> + * typedef __mask(NR_CPUS) cpumask_t;

Is this necessary, or just convenient? I can see how it would be
necessary for the mask_of_bit() function...


> + * These macros presume that all masks passed in a given call
> + * are the same nbits long, and that only bits in positions
> + * b where 0 <= b < nbits might be set in input masks.
> + * They ensure that no additional bits outside this range
> + * become set (however don't protect against improperly set
> + * bits that are outside this range but still inside the array
> + * of unsigned longs representing the mask.) In other words,
> + * any implementation of these ops may assume as a precondition
> + * that any unused bits (bits in the array of unsigned
> + * longs, outside the range 0 to nbits-1) are zero. And any
> + * implementation of these ops must ensure as a postcondition
> + * on all output masks that this same precondition (unused
> + * bits are zero) holds. If you manage to create, by some
> + * other means, a mask with some unused bits non-zero, and
> + * then pass that mask to one of these mask operations, that
> + * operation may malfunction.
> + *
> + * The abstract bit model supported by these masks is that of
> + * an infinite set of bits, in positions numbered 0 and up,
> + * where all but the first 'nbits' bits are always zero.
> + * Calls that implicitly attempt to set any bit outside of
> + * the first 'nbits' bits successfully and quietly leave such
> + * bits as zero. Calls that query or modify specifically
> + * numbered bit positions require as a precondition that the
> + * specified bit position 'n' is the range 0 <= n < nbits, and
> + * may malfunction if handed a bit position outside this range.

Ok... This implies that my comments on the last patch were valid. I
like this model. Assume the unused bits aren't set, and take care not
to set them. I'm happy to see this spelled out explicitly.


> + * The underlying bitmap.c operations such as bitmap_and() and
> + * bitmap_or() don't follow this model. They don't assume
> + * the precondition that unused bits are zero, and they do
> + * mask off any unused portion of input masks in most cases.
> + * However the underlying bitop.h operations, such as set_bit()
> + * and clear_bit(), do no sanitizing of their inputs, depending
> + * heavily on preconditions.

bitmap_and() & bitmap_or() *do not* mask off the unused input bits.
Unless you add that code in a subsequent patch... This paragraph seems
a bit unclear. You're saying that bitmap_and() & bitmap_or() *don't*
follow the precondition, but *do* mask off unused bits, which I'm not
seeing. Then the 'however' is confusing, because you continue with the
same point about *not* following preconditions. Maybe something like:

"The underlying bitmap.c operations such as bitmap_and() and bitmap_or()
don't follow this model of 'unused bits'. Users must follow the
precondition that unused bits are zero to guarantee no unused bits are
set by the operations. Further, the underlying bitop.h operations, such
as set_bit() and clear_bit(), do no sanitizing of their inputs, also
depending on users not to set/clear unused bits (ie: ensure 0 <= b <
nbits)."


> + * The file include/mask.h applies to all architectures.
> + * Architectures requiring custom details should provide
> + * them in their include/asm-<arch>/bitops.h file, and
> + * if necessary modify the common include/linux/mask.h
> + * file to conditionally generate the necessary code,
> + * depending on compile time settings. No need to write
> + * ugly #ifdef's to do this - gcc provides a rich set
> + * of compile time extensions. See further for example:
> + * http://gcc.gnu.org/onlinedocs/gcc-3.3.3/gcc/C-Extensions.html

I think that it wouldn't be terribly ugly to split out the 1 unsigned
long special cases (bitmap_and, bitmap_or, etc) with #ifdefs. I think
that in some ways it makes it a little more readable because it gets rid
of "if (sizeof(foo) == sizeof(unsigned long))..." all over the place. I
don't feel strongly enough about it to make a stink, though...


> +#define mask_test_and_set(bit, mask) \
> + test_and_set_bit(bit, (mask)._m)

test_and_set_bit((bit), (mask)._m) ?

-Matt

2004-03-30 01:24:36

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Matthew wrote (of my recommendation to not use the mask type directly):
> Is this necessary, or just convenient?

Technically as you suspect, just convenient, except in the case of the
mask_of_bit() macro, as you observe.

I was adhering to the K.I.S.S. school here - just tell the user one
recommended way of using things, suppressing my engineering urge to
explain alternatives that had no real advantages.

> This paragraph seems a bit unclear.

That's an understatment. I left out a 'not' and didn't
explain half the cases. See my updates, below.

> I think that it wouldn't be terribly ugly to split out the 1 unsigned
> long special cases (bitmap_and, bitmap_or, etc) with #ifdefs.

Do you have in mind an ifdef per function, or putting
several functions inside an ifdef? If you think it
looks better - show us the code ;).

> test_and_set_bit((bit), (mask)._m) ?

yup.

Here's my cumulative patch of changes that I have gained so far
from your excellent feedback, and a couple I've noticed:

diff -Nru a/include/linux/mask.h b/include/linux/mask.h
--- a/include/linux/mask.h Mon Mar 29 17:21:41 2004
+++ b/include/linux/mask.h Mon Mar 29 17:21:41 2004
@@ -62,7 +62,7 @@
* Don't use mask.h directly - use cpumask.h and nodemask.h.
*
* The available mask operations and their rough meaning in the
- * case that "nbits == 8 * sizeof(single unsigned long)" are:
+ * case that "nbits == BITS_PER_LONG" are:
*
* __mask(nbits) Use to define new *mask types
* mask_setbit(bit, mask) mask |= bit
@@ -70,7 +70,7 @@
* mask_setall(mask, nbits) mask = ~0UL
* mask_clearall(mask) mask = 0UL
* mask_isset(bit, mask) Is bit set in mask?
- * mask_test_and_set(bit, mask) mask | bit ? 0 : mask |= bit
+ * mask_test_and_set(bit, mask) mask & bit ? 1 : (mask |= bit, 0)
* mask_and(dst, src1, src2) dst = src1 & src2
* mask_or(dst, src1, src2) dst = src1 | src2
* mask_xor(dst, src1, src2) dst = src1 ^ src2
@@ -155,8 +155,10 @@
*
* The underlying bitmap.c operations such as bitmap_and() and
* bitmap_or() don't follow this model. They don't assume
- * the precondition that unused bits are zero, and they do
+ * the precondition that unused bits are zero, and they do not
* mask off any unused portion of input masks in most cases.
+ * The bitmap operations that produce Boolean or scalar results,
+ * such as for empty, full and weight, _do_ filter out unused bits.
* However the underlying bitop.h operations, such as set_bit()
* and clear_bit(), do no sanitizing of their inputs, depending
* heavily on preconditions.
@@ -234,7 +236,7 @@
test_bit((bit), (mask)._m)

#define mask_test_and_set(bit, mask) \
- test_and_set_bit(bit, (mask)._m)
+ test_and_set_bit((bit), (mask)._m)

#define mask_and(dst, src1, src2) \
do { \
diff -Nru a/lib/bitmap.c b/lib/bitmap.c
--- a/lib/bitmap.c Mon Mar 29 17:21:41 2004
+++ b/lib/bitmap.c Mon Mar 29 17:21:41 2004
@@ -47,7 +47,7 @@
int bitmap_equal(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
- int k, lim = bits/BITS_PER_LONG;;
+ int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
if (bitmap1[k] != bitmap2[k])
return 0;
@@ -63,7 +63,7 @@

void bitmap_complement(unsigned long *dst, const unsigned long *src, int bits)
{
- int k, lim = bits/BITS_PER_LONG;;
+ int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
dst[k] = ~src[k];

@@ -149,7 +149,7 @@
int bitmap_intersects(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
- int k, lim = bits/BITS_PER_LONG;;
+ int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
if (bitmap1[k] & bitmap2[k])
return 1;
@@ -165,7 +165,7 @@
int bitmap_subset(const unsigned long *bitmap1,
const unsigned long *bitmap2, int bits)
{
- int k, lim = bits/BITS_PER_LONG;;
+ int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
if (bitmap1[k] & ~bitmap2[k])
return 0;


--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-03-30 01:28:11

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 04:12, Paul Jackson wrote:
>>> * The underlying bitmap.c operations such as bitmap_and() and
>>> * bitmap_or() don't follow this model. They don't assume
>>> * the precondition that unused bits are zero, and they do
>>> * mask off any unused portion of input masks in most cases.
>>> * However the underlying bitop.h operations, such as set_bit()
>>> * and clear_bit(), do no sanitizing of their inputs, depending
>>> * heavily on preconditions.

On Mon, Mar 29, 2004 at 04:30:18PM -0800, Matthew Dobson wrote:
> bitmap_and() & bitmap_or() *do not* mask off the unused input bits.
> Unless you add that code in a subsequent patch... This paragraph seems
> a bit unclear. You're saying that bitmap_and() & bitmap_or() *don't*
> follow the precondition, but *do* mask off unused bits, which I'm not
> seeing. Then the 'however' is confusing, because you continue with the
> same point about *not* following preconditions. Maybe something like:

They actually don't need to. The cases where "or" gives rise to the need
to do a fixup pass is only when the second operand is complemented. If
both the inputs satisfy the trailing zero precondition, it's already
guaranteed that bitmap_or()'s result will do likewise as a
postcondition. If _either_ of the inputs to bitmap_and() satisfy the
trailing zero precondition, its result will also satisfy the trailing
zero postcondition.

This is all assuming the trailing zero invariant; the "don't care"
invariant behaves very differently and incurs its checks only at the
time of examinations whose underlying implementations operate on whole
words, like cpus_equal(), cpus_weight(), and cpus_empty().

For the single-word case, something like this is required for mainline:
#if NR_CPUS % BITS_PER_LONG
#define __CPU_VALID_BITS__ (~((1UL << (NR_CPUS % BITS_PER_LONG)) - 1))
#else
#define __CPU_VALID_BITS__ (~0UL)
#endif

#define cpus_equal(x,y) (!(((x) ^ (y)) & __CPU_VALID_BITS__))
#define cpus_empty(x) (!((x) & __CPU_VALID_BITS__))
#define cpus_coerce(x) ((unsigned long)(x) & __CPU_VALID_BITS__)
#if BITS_PER_LONG == 32
#define cpus_weight(x) hweight32((x) & __CPU_VALID_BITS__)
#else
#define cpus_weight(x) hweight64((x) & __CPU_VALID_BITS__)
#endif

... and all other operations unmodified. In all honesty, the difference
in overhead and/or implementation complexity between the two invariants
is miniscule. I don't care who wants what per se; if you want to change
that invariant, it's not my concern what the invariant is so long as
it's respected and there's a coherent notion of what people want it to
be. Given some of your statements, I wonder sometimes if you actually
understand the "don't care" invariant.

The patch series is a little clunky though. e.g. the cpus_raw()
renaming has zero semantic effect, where the cpus_complement() API
change should probably move people to a renamed function (e.g. dyadic
cpus_not() or some such) so callers can be incrementally converted, it
looks like there are points in the series where various things won't
compile etc. but otherwise innocuous. The really hard questions will
come when those with real concerns about the operational semantics
start coming out of the woodwork. Actually, why don't you start
by asking Ray Bryant, since it was prodding from him about codegen
results directly contradicting yours that originally instigated the
apparently reviled cpumask_const_t. A coherent story out of SGI (e.g.
not so many contradictory statements from different people) would
probably help and/or would have helped get this right the first time.


-- wli

2004-03-30 01:44:40

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]


The following typo needs fixing. I had double opening
brace, instead of parenthesis-brace. Time to increase
my screen font size.

===== include/linux/mask.h 1.3 vs edited =====
--- 1.3/include/linux/mask.h Mon Mar 29 17:10:27 2004
+++ edited/include/linux/mask.h Mon Mar 29 17:37:01 2004
@@ -352,13 +352,13 @@
})

#define MASK_ALL(nbits)
-{{
+({
[0 ... BITS_TO_LONGS(nbits)-1] = ~0UL,
[BITS_TO_LONGS(nbits)-1] = MASK_LAST_WORD(nbits)
}}

#define MASK_NONE(nbits)
-{{
+({
[0 ... BITS_TO_LONGS(nbits)-1] = 0UL
}}

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-03-30 01:54:44

by Matthew Dobson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 16:47, Paul Jackson wrote:
> The following typo needs fixing. I had double opening
> brace, instead of parenthesis-brace. Time to increase
> my screen font size.
>
> ===== include/linux/mask.h 1.3 vs edited =====
> --- 1.3/include/linux/mask.h Mon Mar 29 17:10:27 2004
> +++ edited/include/linux/mask.h Mon Mar 29 17:37:01 2004
> @@ -352,13 +352,13 @@
> })
>
> #define MASK_ALL(nbits)
> -{{
> +({
> [0 ... BITS_TO_LONGS(nbits)-1] = ~0UL,
> [BITS_TO_LONGS(nbits)-1] = MASK_LAST_WORD(nbits)
> }}
>
> #define MASK_NONE(nbits)
> -{{
> +({
> [0 ... BITS_TO_LONGS(nbits)-1] = 0UL
> }}

Might want to fix the double closing braces as well.

Definitely time to increase your font size! ;)

-Matt

2004-03-30 01:57:51

by Matthew Dobson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 16:27, Paul Jackson wrote:
> Matthew wrote (of my recommendation to not use the mask type directly):
> > Is this necessary, or just convenient?
>
> Technically as you suspect, just convenient, except in the case of the
> mask_of_bit() macro, as you observe.
>
> I was adhering to the K.I.S.S. school here - just tell the user one
> recommended way of using things, suppressing my engineering urge to
> explain alternatives that had no real advantages.

That's what I figured. Just looking for a clarification.


> > I think that it wouldn't be terribly ugly to split out the 1 unsigned
> > long special cases (bitmap_and, bitmap_or, etc) with #ifdefs.
>
> Do you have in mind an ifdef per function, or putting
> several functions inside an ifdef? If you think it
> looks better - show us the code ;).

I was thinking of having a large ifdef'd section for the one word case.
I'll run that up and see if it does in fact look any cleaner.


> Here's my cumulative patch of changes that I have gained so far
> from your excellent feedback, and a couple I've noticed:

<diff snipped>

Looks good. I'll keep chugging along through the patches! :)

-Matt

2004-03-30 02:07:23

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, Mar 29, 2004 at 05:27:44PM -0800, William Lee Irwin III wrote:
> ... and all other operations unmodified. In all honesty, the difference
> in overhead and/or implementation complexity between the two invariants
> is miniscule. I don't care who wants what per se; if you want to change
> that invariant, it's not my concern what the invariant is so long as
> it's respected and there's a coherent notion of what people want it to
> be. Given some of your statements, I wonder sometimes if you actually
> understand the "don't care" invariant.

Sorry, that was directed at Paul.


-- wli

2004-03-30 02:08:24

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, Mar 29, 2004 at 04:47:01PM -0800, Paul Jackson wrote:
> The following typo needs fixing. I had double opening
> brace, instead of parenthesis-brace. Time to increase
> my screen font size.

You've got double closing braces, too.


-- wli

2004-03-30 02:29:28

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> You've got double closing braces, too.

Ah so, said the blind man.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-03-30 02:24:30

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> Given some of your statements, I wonder sometimes if you actually
> understand the "don't care" invariant.

Always possible. If you care to point to such confusions, especially in
this patch series, as Matthew is doing, go right ahead. Not much I can
do with "some of your statements".

> the cpus_raw() renaming has zero semantic effect,

Not quite zero, though close. It provides a pointer to the array of
unsigned longs, so one _could_ manipulate masks of two or more words via
that pointer. The previous promote and coerce just jammed the first
word in or out of a mask.

However the main motivation for that change was to make this override a
bit _less_ abstract. I presumed that the typical coder using such an
override was more likely to be comfortable thinking in terms of memory
words and their addresses, rather than coercion and promotion between
two data types. I figured that the closer the call was to how they
think anyway, the better the chance they had of using it correctly.

> the cpus_complement() API change should probably move people to a
> renamed function

Somtimes yes, sometimes no. In this case, the number of uses of
complement was vanishingly small. As in essentially _none_. There
is one physids_complement() macro, using bitmap_complement(),
but that phyids_complement() macro is itself unused. That's it.

A staged migration would be excessive in this case. Just get the
code 'right' and move on.

> Actually, why don't you start by asking Ray Bryant, since it was
> prodding from him about codegen results directly contradicting yours
> that originally instigated the apparently reviled cpumask_const_t.

I have - he has had nothing to add. After a point, trying to use
discussions from a year ago to which I wasn't party (and would have
forgotten if I was) as a guide to this stuff is not helpful.

I am quite willing to believe that one _could_ write the code so that it
looked too inefficient. In the cases that I examined with objdump,
using gcc 3.2 and above, I was able to get nice results.

I recommend you examine the generated code yourself, and point out the
places that are wasting cycles, for the compilers and processors and
systems that you care about.

And even if there are such places (which likely there are), that still
doesn't mean that cpumask_const_t is the necessary solution. Indeed, as
I have explained with some care and detail in responses last week to
Matthew's patch set, I consider cpumask_const_t to be an abuse of
conventional 'C' idioms. In most cases, we should have enough control
of the situation to address such inefficiencies in the mask.h wrappers,
or by making reasonable changes in the invoking code. It is _not_
proper to insist that someone be able to pass a large structure on
the stack by value semantics, without them realizing it or feeling
the pain. This is 'C' we are coding, not Python. Code using this
stuff may have to be written with an awareness of the possible sizes
of things, and choose argument passing conventions appropriately.

> A coherent story out of SGI (e.g. not so many contradictory statements
> from different people) would probably help and/or would have helped get
> this right the first time.

Efforts to dissect the past to determine who did what wrong when are
frequently unproductive. The more productive question to ask is:

What can we do to improve the current code?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-03-30 06:38:31

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

At some point in the past, I wrote:
>> Given some of your statements, I wonder sometimes if you actually
>> understand the "don't care" invariant.

On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> Always possible. If you care to point to such confusions, especially in
> this patch series, as Matthew is doing, go right ahead. Not much I can
> do with "some of your statements".

Okay, let's see if I can give a high-level explanation. There are two
choices of invariants being discussed, with various properties:

(a) zeroed tail invariant
(i) acts as an "ingress filter" on leftover bits
(ii) forces leftover bits to a defined state
(iii) requires leftover bits being set up by (ii) for correctness

(b) don't care invariant
(i) acts as an "egress filter" on leftover bits
(ii) leftover bits are in "undefined" states
(iii) correctness requires (ii)'s bits may never "leak" to callers

These invariants can be changed transparently. Changing them is not a
bugfix of any kind. Cleanups are fine to do. Just separate the bugfixes
from cleanups.


At some point in the past, I wrote:
>> the cpus_raw() renaming has zero semantic effect,

On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> Not quite zero, though close. It provides a pointer to the array of
> unsigned longs, so one _could_ manipulate masks of two or more words via
> that pointer. The previous promote and coerce just jammed the first
> word in or out of a mask.

That's just as bad, since cpus_addr() does that now.


On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> However the main motivation for that change was to make this override a
> bit _less_ abstract. I presumed that the typical coder using such an
> override was more likely to be comfortable thinking in terms of memory
> words and their addresses, rather than coercion and promotion between
> two data types. I figured that the closer the call was to how they
> think anyway, the better the chance they had of using it correctly.

As nice as that is, pure renaming for its own sake isn't worthwhile
except in a very limited number of cases, of which this is not one.


At some point in the past, I wrote:
>> the cpus_complement() API change should probably move people to a
>> renamed function

On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> Somtimes yes, sometimes no. In this case, the number of uses of
> complement was vanishingly small. As in essentially _none_. There
> is one physids_complement() macro, using bitmap_complement(),
> but that phyids_complement() macro is itself unused. That's it.
> A staged migration would be excessive in this case. Just get the
> code 'right' and move on.

Okay, I checked and a staged migration isn't needed given the number
of callers:

./include/asm-x86_64/mpspec.h:215:#define physids_complement(map) bitmap_complement((map).mask, MAX_APICS)
./include/asm-i386/mpspec.h:56:#define physids_complement(map) bitmap_complement((map).mask, MAX_APICS)
./include/asm-i386/mach-es7000/mach_ipi.h:14: cpus_complement(mask);
./kernel/sched.c:3309: cpus_complement(tsk->cpus_allowed);


At some point in the past, I wrote:
>> Actually, why don't you start by asking Ray Bryant, since it was
>> prodding from him about codegen results directly contradicting yours
>> that originally instigated the apparently reviled cpumask_const_t.

On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> I have - he has had nothing to add. After a point, trying to use
> discussions from a year ago to which I wasn't party (and would have
> forgotten if I was) as a guide to this stuff is not helpful.
> I am quite willing to believe that one _could_ write the code so that it
> looked too inefficient. In the cases that I examined with objdump,
> using gcc 3.2 and above, I was able to get nice results.
> I recommend you examine the generated code yourself, and point out the
> places that are wasting cycles, for the compilers and processors and
> systems that you care about.

I'm concerned rather strictly with external requirements as opposed to
my own preferences. Dependency on specific very recent compiler version
is unlikely to be acceptable for all architectures.


On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> And even if there are such places (which likely there are), that still
> doesn't mean that cpumask_const_t is the necessary solution. Indeed, as
> I have explained with some care and detail in responses last week to
> Matthew's patch set, I consider cpumask_const_t to be an abuse of
> conventional 'C' idioms. In most cases, we should have enough control
> of the situation to address such inefficiencies in the mask.h wrappers,
> or by making reasonable changes in the invoking code. It is _not_
> proper to insist that someone be able to pass a large structure on
> the stack by value semantics, without them realizing it or feeling
> the pain. This is 'C' we are coding, not Python. Code using this
> stuff may have to be written with an awareness of the possible sizes
> of things, and choose argument passing conventions appropriately.

cpumask_const_t addressed the need for call-by-reference operational
semantics while not incurring the overhead of indirection for smaller
systems. Your assessment of it appears to be off-base, as that kind of
type ambiguity is effectively mandated by the requirements of not
incurring indirection overhead for smaller systems while simultaneously
transparently falling back to call-by-reference semantics for larger ones.

What I'm really trying to point out on this front is that you should
survey/lobby (sub)arch maintainers to get the requirement lifted, not me.


At some point in the past, I wrote:
>> A coherent story out of SGI (e.g. not so many contradictory statements
>> from different people) would probably help and/or would have helped get
>> this right the first time.

On Mon, Mar 29, 2004 at 05:27:25PM -0800, Paul Jackson wrote:
> Efforts to dissect the past to determine who did what wrong when are
> frequently unproductive. The more productive question to ask is:
> What can we do to improve the current code?

The past is not the issue. As far as I'm concerned, the codegen
concerns still stand until (sub)arch maintainers decide otherwise.

Compiler-specific operational semantics and compiler version
dependencies are *REALLY* scary from a portability POV and have already
burned the cpumask_t codebase once in the case of the bitmap inlines.


-- wli

2004-03-30 09:43:12

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> That's just as bad, since cpus_addr() does that now.

Excellent - I missed that one somehow.

How about I change my cpus_raw back to cpus_addr ?
Would that meet with your approval ?

> What I'm really trying to point out on this front is that you should
> survey/lobby (sub)arch maintainers to get the requirement lifted, not me.

Good point.

> Compiler-specific operational semantics and compiler version
> dependencies are *REALLY* scary from a portability POV and have already
> burned the cpumask_t codebase once in the case of the bitmap inlines.

Mention of such past difficulties, without any elaboration of details,
serves only to discourage change.

If you can provide sufficient details that we can make a useful
evaluation of whether such concerns are still a show stopper for
this work, then I welcome your presentation of such.

If not, then I can do nothing with this observation. In that case,
given the clear and substantial simplifications that appear possible,
I can only push on.

> Okay, I checked and a staged migration isn't needed given the number
> of callers:

Good - thanks for checking.

> Your assessment of it appears to be off-base, as that kind of
> type ambiguity is effectively mandated by the requirements of
> not incurring indirection overhead for smaller systems while
> simultaneously transparently falling back to call-by-reference
> semantics for larger ones.

I agree that if that's the general requirement, then cpumask_const_t
is an appropriate answer.

The only decent example of any such requirement ever being needed that I
have been able to locate so far came from the sparc architecture. After
a useful discussion of this with David Miller, on a subthread of
Matthew's thread last week (IIRC), this only applied to sparc32 and
those folks are a ways from providing SMP. Also, I could not find
any performance critical paths in the kernel that pass a cpumask_t
as an argument to a real (non-inline) function. So all in all, the
chance that we need this, on an architecture with structure passing
constraints, is getting pretty small.

And if ever we do, then I would have to recommend we balance the
'requirement' for transparent argument type adaption with the
'requirement' to keep things simple. If one call had to pass an
explicit pointer, or even had to wrap the real function call with a
macro, that selectively passed a value or a pointer, depending on
cpumask_t size, then that would be an alternative well worth
considering, in my view, over the alternative of providing an entire
parallel set of mask headers that provide this 'const' capability.

Some requirements are better met with narrowly focused special cases,
than with fully generalized solutions.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-03-30 10:20:25

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

At some point in the past, I wrote:
>> Compiler-specific operational semantics and compiler version
>> dependencies are *REALLY* scary from a portability POV and have already
>> burned the cpumask_t codebase once in the case of the bitmap inlines.

On Tue, Mar 30, 2004 at 12:45:40AM -0800, Paul Jackson wrote:
> Mention of such past difficulties, without any elaboration of details,
> serves only to discourage change.
> If you can provide sufficient details that we can make a useful
> evaluation of whether such concerns are still a show stopper for
> this work, then I welcome your presentation of such.
> If not, then I can do nothing with this observation. In that case,
> given the clear and substantial simplifications that appear possible,
> I can only push on.

It can't be made specific. It's a rather unfortunate fact of life that
various arches require magic toolchain versions to produce working
kernels. The bits about dragging this around in front of arch maintainers
is to figure out whether the magic toolchain versions required for the
various arches barf on the constructs you're using.


At some point in the past, I wrote:
>> Your assessment of it appears to be off-base, as that kind of
>> type ambiguity is effectively mandated by the requirements of
>> not incurring indirection overhead for smaller systems while
>> simultaneously transparently falling back to call-by-reference
>> semantics for larger ones.

On Tue, Mar 30, 2004 at 12:45:40AM -0800, Paul Jackson wrote:
> I agree that if that's the general requirement, then cpumask_const_t
> is an appropriate answer.
> The only decent example of any such requirement ever being needed that I
> have been able to locate so far came from the sparc architecture. After
> a useful discussion of this with David Miller, on a subthread of
> Matthew's thread last week (IIRC), this only applied to sparc32 and
> those folks are a ways from providing SMP. Also, I could not find
> any performance critical paths in the kernel that pass a cpumask_t
> as an argument to a real (non-inline) function. So all in all, the
> chance that we need this, on an architecture with structure passing
> constraints, is getting pretty small.

I think it's feasible to get the requirement lifted. You just need to
(once again) run it past arch maintainers.


On Tue, Mar 30, 2004 at 12:45:40AM -0800, Paul Jackson wrote:
> And if ever we do, then I would have to recommend we balance the
> 'requirement' for transparent argument type adaption with the
> 'requirement' to keep things simple. If one call had to pass an
> explicit pointer, or even had to wrap the real function call with a
> macro, that selectively passed a value or a pointer, depending on
> cpumask_t size, then that would be an alternative well worth
> considering, in my view, over the alternative of providing an entire
> parallel set of mask headers that provide this 'const' capability.
> Some requirements are better met with narrowly focused special cases,
> than with fully generalized solutions.

Maybe so. The requirements were actually three competing/conflicting
requirements: one from Ray Bryant/SGI for call-by-reference semantics
available to core API's, one from davem for call-by-value on arithmetic
types in core API's. There was another imposed for zero indirection on
small SMP systems you're probably going to have to work harder to get
an ack on since I don't remember where it came from apart from "on high".
But it sounds like at least one of the requirement competitors may be
backing out here. If Ray or other vaguely independent SGI ppl (yes, this
does need to look like it's more than unilateral) could speak up
regarding the call-by-reference semantics requirement that would lift a
third of it. The unwrapped structures was basically davem and tinyboxen,
and I saw the bit where he lifted the sparc(64) side of that requirement.

As I see it we have:
(a) pester more ppl at SGI to get cpumask_const_t requirements dropped
(b) davem's already backed off cpumask_arith for sparc(32|64) ABI bits
(c) some kind of high-level ack is needed for indirection on tinyboxen
to kill off the second requirement for cpumask_arith
(d) run this past arch maintainers for acks wrt. compiler versions vs.
the costructs you're using in inlines

(a) should be very easy for you, (b) is fait accompli, (c) akpm could
chime in on at any moment. One hopeful thing here for you is that it's
a question of asking the right ppl; the code itself is largely a non-
issue, apart from whether higher-level maintainers regard it as
excessive code churn or too cleanup-heavy.

I'll send a private reply about how to get a hold of arch maintainers,
which should take care of (d).

(davem please don't shoot me -- this stuff could break things otherwise
similar to how i386 hit bad codegen in earlier revisions pre-merge)


-- wli

2004-03-31 00:11:08

by Ray Bryant

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

wli,

I do apologize for not following this discussion until now. Mu mailer was happily filing these
messages away for me in a folder and I hadn't noticed the subject line. Paul Jackson called me to
discuss this and we've run a few test cases to understand the impact of call by value versus call by
reference implementations. We've discovered a couple interesting things:

(1) As near as we can tell, there are only 3 routines that appear to have cpumask_t arguments in
the Linux 2.6.4 kernel that are also called often enough that it might matter. All of these
routines live in sched.c: load_balance(), find_busiest_queue(), and set_cpus_allowed(). The first
two of these routines are called on the order of once every few 10s of milliseconds, at least on our
machines, so they don't really represent a huge load on the system even if they do impose some extra
instructions due to call by value instead of call by reference. The last one is probably called
even less often, so from the standpoint of these routines we can't make a case for requring call by
reference.

(2) The base routines that actually do the bit manipulation appear all to be call by reference in
any case. So that is a good thing. It would be silly to copy an 8 word (or larger??) mask into
registers just to set a single bit.

(3) The current compilers appear to be somewhat better at managing these call by value structures
than the ones I was experimenting with last summer.

As I recall our discussion on this of last summer, I think the notion was to get the call by
reference stuff into 2.6 and then evaluate later what the break even point was (i. e. at how many
words of cpumask does it become more efficient to use call by reference). I think our thinking at
the present time is that there is no cross over point, that the call by value stuff is good enough
for all sizes of bitmasks.

Also, since then, Jesse Barnes has become the official maintainer for 2.6-sn. So all such decisions
need to be blessed by him. That will, in part, reduce the inherent noise level of having several
different SGI folks broadcasting requirements out into the community.

Paul's point to me was that while the transparency of the call by value versus call by reference
approach was virtually transparent (perhaps translucent?) at the source code level, it did introduce
some complexity that could be confusing. Perhaps a better approach might be to explicitly code the
3 routines mentioned above to take a pointer to bitmask argument, and then we would be done.
(Certainly on a UP machine this makes little difference, and I would be surprised if we could
measure the difference on a 32 way IA32 box.)

So, assuming all of the above is correct, I guess I am willing to remove the call by reference
requirement that I was harping on so much last summer.

Jesse do you agree?

William Lee Irwin III wrote:

>
> Maybe so. The requirements were actually three competing/conflicting
> requirements: one from Ray Bryant/SGI for call-by-reference semantics
> available to core API's, one from davem for call-by-value on arithmetic
> types in core API's. There was another imposed for zero indirection on
> small SMP systems you're probably going to have to work harder to get
> an ack on since I don't remember where it came from apart from "on high".
> But it sounds like at least one of the requirement competitors may be
> backing out here. If Ray or other vaguely independent SGI ppl (yes, this
> does need to look like it's more than unilateral) could speak up
> regarding the call-by-reference semantics requirement that would lift a
> third of it. The unwrapped structures was basically davem and tinyboxen,
> and I saw the bit where he lifted the sparc(64) side of that requirement.
>
> As I see it we have:
> (a) pester more ppl at SGI to get cpumask_const_t requirements dropped
> (b) davem's already backed off cpumask_arith for sparc(32|64) ABI bits
> (c) some kind of high-level ack is needed for indirection on tinyboxen
> to kill off the second requirement for cpumask_arith
> (d) run this past arch maintainers for acks wrt. compiler versions vs.
> the costructs you're using in inlines
>
> (a) should be very easy for you, (b) is fait accompli, (c) akpm could
> chime in on at any moment. One hopeful thing here for you is that it's
> a question of asking the right ppl; the code itself is largely a non-
> issue, apart from whether higher-level maintainers regard it as
> excessive code churn or too cleanup-heavy.
>
> I'll send a private reply about how to get a hold of arch maintainers,
> which should take care of (d).
>
> (davem please don't shoot me -- this stuff could break things otherwise
> similar to how i386 hit bad codegen in earlier revisions pre-merge)
>
>
> -- wli
>

--
Best Regards,
Ray
-----------------------------------------------
Ray Bryant
512-453-9679 (work) 512-507-7807 (cell)
[email protected] [email protected]
The box said: "Requires Windows 98 or better",
so I installed Linux.
-----------------------------------------------

2004-03-31 00:15:36

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Tuesday 30 March 2004 4:16 pm, Ray Bryant wrote:
> So, assuming all of the above is correct, I guess I am willing to remove
> the call by reference requirement that I was harping on so much last
> summer.
>
> Jesse do you agree?

Yeah, that's fine by me. Given what I've seen, it doesn't look like the
bitmap manipulation routines need to be particularly speedy--certainly not
for sn2 specific code, so I'm ok with just about any implementation so long
as it supports >64p and >64 nodes.

Thanks,
Jesse

2004-04-01 00:39:49

by Matthew Dobson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 04:12, Paul Jackson wrote:
> Patch_2_of_22 - New mask ADT
> Adds new include/linux/mask.h header file
>
> ==> See this mask.h header for more extensive mask documentation <==

Paul,
2 small changes to include/linux/mask.h:

1) Change #ifndef _ASM_GENERIC_MASK_H to #ifnded __LINUX_MASK_H since
mask.h resides in include/linux/ not include/asm-generic/.

2) In #define mask_complement(), the "nbits" argument to
bitmap_complement wasn't wrapped in parens.

Cheers!

-Matt


Attachments:
mask_h-mcd.patch (886.00 B)

2004-04-01 01:00:11

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> 2 small changes to include/linux/mask.h:

Yup - good ones.

I'm just starting to rebuild this set of patches, with:

1) The various little changes posted so far and your updates
2) The "don't generate non-zero unused bits" strengthening of
lib/bitmap.c's postconditions a clearly distinct member of the
patch set, with a comment explaining the bitmap/bitop "model",
3) Renaming cpus_raw => cpus_addr (it's current name),
4) Whatever else I see in passing, and
5) An actually test of it this time, on an ia64.

Then I should make this update visible and seek further buy in from
any potentially affected arch maintainers.

Much of the above represents what I've learned from Bill Irwin's
detailed review of this so far. I trust Bill will speak up if I
misconstrued one of his recommendations.

Question:

Matthew, Bill, Andrew, anyone - how should I present the next version of
this set of 22 give or take patches in this set? Another new set of 22
lkml posts; one reply or 22 replies to the some post on the existing
threads; tarballs; web site URL; ... The possibilities are varied.

My inclination would be just as I did it the first time - some 22
rapid fire separate posts. But I'm game for considering alternatives.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-01 01:12:28

by Matthew Dobson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Wed, 2004-03-31 at 16:58, Paul Jackson wrote:
> Question:
>
> Matthew, Bill, Andrew, anyone - how should I present the next version of
> this set of 22 give or take patches in this set? Another new set of 22
> lkml posts; one reply or 22 replies to the some post on the existing
> threads; tarballs; web site URL; ... The possibilities are varied.
>
> My inclination would be just as I did it the first time - some 22
> rapid fire separate posts. But I'm game for considering alternatives.

I'd prefer (especially since I'm on the CC line ;) that you send out the
[0/22] mail, and send all the rest of the mails (the actual patches) as
replies to that first mail. That way, at least all of the emails are in
the same thread and can easily be collapsed/expanded. I know when I
checked my mail monday morning and there were all those individual
mails, I thought I'd broken something and people were out for my blood!
;)

-Matt

2004-04-01 01:19:31

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> all the rest of the mails (the actual patches) as replies to that first mail.

Ok - sounds good.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-01 01:25:03

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Paul Jackson <[email protected]> wrote:
>
> Matthew, Bill, Andrew, anyone - how should I present the next version of
> this set of 22 give or take patches in this set?

<Wakes up. Have they finished yet?>

I'll need reminding what these patches actually do.

Can you sit on them for two weeks? Once we're mid-way into 2.6.6-pre I'll
be in better shape. When kernels explode I have too many suspects already
at present.

2004-04-01 01:35:52

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> <Wakes up. Have they finished yet?>

Not yet. (Do you want on/off the CC list on this chatter ...?)


> I'll need reminding what these patches actually do.

Yup - when we are ready to recommend you include them.

The fundamental reason - code and complexity reduction, so that the
users of cpumasks and nodemasks can figure them out with less effort.
And by distilling out the basic 'mask' ADT from cpumasks, the nodemask
type can be added more easily.

But I'll happily repeat that, when the time comes.


> Can you sit on them for two weeks?

Sure. Too easy. It would have been a week before we were ready
to encourage your consideration anyway.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-05 01:26:54

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-03-29 at 22:12, Paul Jackson wrote:
> Patch_2_of_22 - New mask ADT
> Adds new include/linux/mask.h header file
>
> ==> See this mask.h header for more extensive mask documentation <==

Y'know, I dislike this. I really do. I know you've done a lot of work,
but too much abstraction scares me.

Especially when what you're abstracting is so trivial.

We have include/linux/bitmap.h which has a collection of unsigned long[]
operations. I suggest that people who want to use a bitmap should wrap
this in a struct and use the bitops directly, eg:

struct cpumap {
DECLARE_BITMAP(bits,NR_CPUS);
};

This has several advantages:
1) It doesn't add new code to the kernel,
2) Operations on "cpumap.bits" are easy to read, write and understand,
3) Different mask types are not type compatible,
4) Any new operations you need to write can be used by anyone who needs
a bitmap, whether in a struct or not.

Please consider...
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-05 07:07:26

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

First, let me say I am honored to have the pleasure of your
consideration of this work. I enjoy your work and your irreverant
style.

I have a suspicion that what you saw is not what I wrote ;).

> 1) It doesn't add new code to the kernel,

This work reduces the number of kernel source files by about 30, reduces
the number of kernel source lines (after filtering out block comments ;)
by over 200 lines, and for most of the kernel configurations that I've
tried, reduces the kernel text size by a few 100 bytes (numa text sizes
are larger - I'll be working with Matthew on that ...).

Essentially what I am doing here is taking Mathew's nodemask_t proposed
patch, which duplicated much of the cpumask_t mechanisms, each providing
several implementations (arith, array, const or not const), and throwing
away all the duplication with cpumasks and variations of implementation,
to have a single underlying type.

Where before, someone coding an operation on cpumasks had to look in
several nested, configuration ifdef'd files to figure out what
operations were available, and how they worked, now there would be one
file cpumask.h, one file nodemask.h, and one common mask.h file they
share.

I'm removing kernel code, net, not adding, whether you count by source
lines, source files or binary text size.

The resulting API's that a kernel hacker might encounter are
(1) the bitmap/bitops of before,
(2) the cpumask_t ops of before, but more accessible, and
(3) the nodemask_t ops Matthew proposed, just like cpumask_t ops.

> 2) Operations on "cpumap.bits" are easy to read, write and understand,

This sounds like an objection to the cpumask_t type, which was added by
others before me. I'm sure that Bill Irwin would know the origins of
that type, and may well have played an essential role in its creation.

> 3) Different mask types are not type compatible,

I'm confused which way you mean this. I'm not sure what you mean
by 'type compatible', and I am not sure whether you think it is a
good idea I'm missing out on, or a bad idea that I'd be better off
without.

So I will have to await the honor of a future reply before I can
comment on this entirely.

Note however that my declarations for cpumask_t and nodemask_t types
are essentially:

typedef struct { DECLARE_BITMAP(_m,NR_CPUS); } cpumask_t
typedef struct { DECLARE_BITMAP(_m,MAX_NUMNODES); } nodemask_t

which look suspicously like what you recommend. Would you rather
I used "bits" for "_m", and removed the typedef? I inherited the
cpumask_t typedef from those who came before me, but would be quite
open to a recommendation to change the style of declarations from:

cpumask_t foo;
to:
struct cpumask foo;

> 4) Any new operations you need to write can be used by anyone who needs
> a bitmap, whether in a struct or not.

This is still entirely true - new operations (such as the intersects,
subset, xor and andnot that I'm proposing here) _are_ added to bitmaps
or bitops, and directly usable there, as before. That layer is entirely
unchanged, except:

1) I added four bitmap ops (though I might abandon xor, andnot)
2) I added a 'theory' comment, resulting from my confusions
in properly understanding what Bill Irwin had done.
3) I added a couple of missing 'const' attributes
4) I changed bitmap_complement to zero out the unused high bits.

I'm concerned that you thought otherwise, and suspect a serious failure
to communicate on my part.

Again - thank-you for reviewing this. I await your further feedback.

P.S. - Perhaps you real concern here is that I'm not going far enough.

Instead of just putting the cpumask_t internals on a diet
and allowing for a nodemask_t that shares implementation,
rather I should change both outright, to the more explicit
style that is perhaps what you have in mind:

struct cpumap { DECLARE_BITMAP(bits, NR_CPUS); };
struct cpumask s, d1, d2;
bitmap_or(s.bits, d1.bits, d2.bits);

Nuke cpumask_t, nodemask_t and the existing cpus_or,
nodes_or, ... and similar such 60 odd various macros
specific to these two types.

I rather like that approach. It would build nicely on
what I've done so far. It would be a more intrusive patch,
changing all declarations and operations on cpumasks.

If I thought it would sell, I would be most interested.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-05 07:43:01

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-04-05 at 17:05, Paul Jackson wrote:
> P.S. - Perhaps you real concern here is that I'm not going far enough.
>
> Instead of just putting the cpumask_t internals on a diet
> and allowing for a nodemask_t that shares implementation,
> rather I should change both outright, to the more explicit
> style that is perhaps what you have in mind:
>
> struct cpumap { DECLARE_BITMAP(bits, NR_CPUS); };
> struct cpumask s, d1, d2;
> bitmap_or(s.bits, d1.bits, d2.bits);
>
> Nuke cpumask_t, nodemask_t and the existing cpus_or,
> nodes_or, ... and similar such 60 odd various macros
> specific to these two types.
>
> I rather like that approach. It would build nicely on
> what I've done so far. It would be a more intrusive patch,
> changing all declarations and operations on cpumasks.

Yes, this is exactly the point I was incoherently groping towards.
Throw away mask.h, and make any needed enhancements to bitmap.h (eg.
inlines which check for the case of len <= BITS_PER_LONG).

Then change cpumask_t and nodemask_t ops to inlines which just use
bitmap.h, and get rid of the
asm-generic/cpumask_optimized_for_large_smp_with_sparse_array_and_small_stack.h etc. and then finally look at how ugly it would be to change users to directly using the bitmap.h functions on cpumasks.

> If I thought it would sell, I would be most interested.

I guess I'd like to see the cost of perfection. It could just be that
the current cpumask_t headers creep me out...

Thanks!
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-05 07:48:34

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

In my previous reply to Rusty, I wrote:
> struct cpumap { DECLARE_BITMAP(bits, NR_CPUS); };
> struct cpumask s, d1, d2;
> bitmap_or(s.bits, d1.bits, d2.bits);

Brain dead code alert (as well as a typo alert for the 'cpumap') - that
last line needs to be:

bitmap_or(s.bits, d1.bits, d2.bits, NR_CPUS);

Which is why the 60 odd cpumask and nodemask specific operation macros
exist, to avoid having to explicitly specify the bitsize on each call

In other words, I understand that the following three possibilities
exist for coding these masks:

/* specify bitsize both on declarations and operations */
struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); };
struct cpumask s, d1, d2;
bitmap_or(s.bits, d1.bits, d2.bits, NR_CPUS); /* explicit bitsize */

or:

/* specify bitsize on declaration; use specialized operations */
struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); };
struct cpumask s, d1, d2;
cpus_or(s.bits, d1.bits, d2.bits); /* 'cpu' implies NR_CPUS */

or:

/* carry the bitsize in the structure [pseudo C alert] */
struct mask { int nbits = NR_CPUS; DECLARE_BITMAP(bits, NR_CPUS); };
struct mask s, d1, d2;
mask_or(s.bits, d1.bits, d2.bits); /* mask_* ops get size from struct */

Am I missing any choices? Which do you prefer?

I understand that the kernel currently does the 2nd choice, encoding the
bitsize in the operation name.

My personal preference is to continue doing this 2nd choice.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-05 08:10:30

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Rusty wrote:
> Throw away mask.h, and make any needed enhancements to bitmap.h (eg.
> inlines which check for the case of len <= BITS_PER_LONG).

That seems like a good, easy and localized change. Ok,
I will see if how it looks after I try to code it.


> get rid of the
> asm-generic/cpumask_optimized_for_large_smp_with_sparse_array_and_small_stack.h

My mask patch does this.


> then finally look at how ugly it would be to change users to
> directly using the bitmap.h functions on cpumasks.

That boils down to a very straightforward question. Do we ask
them to write:

cpus_or(s.bits, d1.bits, d2.bits)

or:

bitmap_or(s.bits, d1.bits, d2.bits, NR_CPUS);

I prefer the first choice. It requires a thin cpumask.h header
to wrap the bitmap ops, and add the final NR_CPUS to each one.

I believe the 2nd choice introduces one more way to screw up,
by specifying the wrong bitsize. I've got enough such ways
already - don't need more.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 05:00:25

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Mon, 2004-04-05 at 18:08, Paul Jackson wrote:
> > get rid of the
> > asm-generic/cpumask_optimized_for_large_smp_with_sparse_array_and_small_stack.h
>
> My mask patch does this.

Yes, which is why I'm such a fan.

> > then finally look at how ugly it would be to change users to
> > directly using the bitmap.h functions on cpumasks.
>
> That boils down to a very straightforward question. Do we ask
> them to write:
>
> cpus_or(s.bits, d1.bits, d2.bits)
>
> or:
>
> bitmap_or(s.bits, d1.bits, d2.bits, NR_CPUS);
>
> I prefer the first choice. It requires a thin cpumask.h header
> to wrap the bitmap ops, and add the final NR_CPUS to each one.

Well, you'd do presumably:
cpus_or(&s, &d1, &d2);

And make cpus_or() an inline so you get typechecking.

But my rough grepping reveals that there are around 420 uses of all the
cpu macros throughout the kernel. But if you merely implement:

any_online_cpu
cpumask_of_cpu
cpu_isset
cpu_set
cpu_clear

You'll have covered about 300 of them. I don't think a complete
abstraction is actually required or desirable: if someone wants to do
something tricky (like anding, oring, etc), there's nothing wrong with
accessing cpu.bits.

Thanks!
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-06 06:08:42

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]


[[ I removed Dave Hansen <[email protected]> from the Cc list - per his Mar 29 request. ]]

> > My mask patch does this., [email protected]
>
> Yes, which is why I'm such a fan.

Glad to be of service ...


> > That boils down to a very straightforward question. Do we ask
> > them to write:
> >
> > cpus_or(s.bits, d1.bits, d2.bits)
> >
> > or:
> > ...
>
> Well, you'd do presumably:
> cpus_or(&s, &d1, &d2);

Argh - I was being sloppy. I meant to write just what we have now:

cpus_or(d, s1, s2);

- trivial typo fix - my dst and src were backwards
- more significant - no ".bits", no "&"

A patch that goes through 300 lines of kernel source code adding one to
three ampersands per line strikes me as ugly and pointless. Hopefully
it was just my sloppiness that led to such a possibility, not a seriously
proposed change on your part.


> And make cpus_or() an inline so you get typechecking.

Yes - typechecking in this situation is good.


> You'll have covered about 300 of them. I don't think a complete
> abstraction is actually required or desirable:

I suspect we've hit on our first area of actual disagreement here.

You observe that providing inline wrappers for the 5 most commonly
used cpumask macros would cover 300 of the 420 uses. The other 23
or so macros are less commonly used. Sounds about right ...

I prefer to provide all 28 macros. I don't see a cost, but do see
a gain.

The gain is that someone coding some operations on a cpumask doesn't
have to go fishing around in multiple places to find out what ops
are supported, which ops are in nice "cpus_*" form, and which are
obtained by accessing the underling bitmap/bitop operations.

Doing only 5 of the 28, because the other 23 are less frequently used,
seems to me like a false optimization. Either provide no cpumask
abstraction, or provide a more-or-less complete one. Half baked layers
create further overload on my limited brain capacity.

Also eliminating the 23 less frequently used cpumask operations would
generate another ugly and pointless kernel patch, recoding another
120 lines of code (usually arch specific, sometimes touchy, difficult
to get tested, and likely to break someone in ways not obvious at first).

Just to be specific, a typical implementation for such an operator would look like:

typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

static inline void cpus_or(cpumask_t d, const cpumask_t s1, const cpumask_t s2)
{
bitmap_or(d.bits, s1.bits, s2.bits, NR_CPUS);
}

It would be used exactly as it is today:

cpumask_t x, y, z;
cpus_or(x, y, z);

Other than perhaps changing "cpumask_t foo;" to "struct cpumask foo", I
don't see anything in the 420 lines of code that invokes cpumask
operations that I think would gain from wholesale changes.

So ... tell me again what is to be gained by discarding 23 of the 28
cpumask operators?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 06:24:04

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Paul Jackson wrote:

> Other than perhaps changing "cpumask_t foo;" to "struct cpumask foo", I
> don't see anything in the 420 lines of code that invokes cpumask
> operations that I think would gain from wholesale changes.

I like cpumask_t. It isn't conceptually a structure, is it?
And you should not need to look inside it or use it with
anything other than using the cpumask interface, right?

2004-04-06 06:35:19

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Nick wrote:
> I like cpumask_t.

Ok - one vote for cpumask_t.

I could go either way. I see that 'struct foo' is more common than
'foo_t' in kernel code.

I will not actually propose to change cpumask_t to 'struct cpumask'
unless others want it. Without a half-way decent reason, it would just
be stupid churning. But I wouldn't put up much resistance to such a
change.


> And you should not need to look inside it or use it with
> anything other than using the cpumask interface, right?

In my view, right - you (seldom) need to look inside. From what I can
make of Rusty's statements so far, he apparently has a different view ;).

We'll see.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 06:40:12

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Tue, 2004-04-06 at 16:06, Paul Jackson wrote:
> > You'll have covered about 300 of them. I don't think a complete
> > abstraction is actually required or desirable:
>
> I suspect we've hit on our first area of actual disagreement here.
>
> You observe that providing inline wrappers for the 5 most commonly
> used cpumask macros would cover 300 of the 420 uses. The other 23
> or so macros are less commonly used. Sounds about right ...
>
> I prefer to provide all 28 macros. I don't see a cost, but do see
> a gain.

Because I believe one should *always* resist the urge to write
infrastructure. Wait until the users of your functionality gather out
the front of your house with torches because they're all sick of the
burden of using existing infrastructure.

Really.

I don't even want to learn 28 bitops primitives. I certainly don't want
to learn 28 nodemask and 28 cpumask primitives.

I prefer a single set of operators, but preempting complaints means
figuring out what people want. I'd be happy with the obviously
cpu-specific ones, myself:

first_cpu
next_cpu
any_online_cpu
cpumask_of_cpu

> The gain is that someone coding some operations on a cpumask doesn't
> have to go fishing around in multiple places to find out what ops
> are supported

Agreed. That's a big benefit of cutting it out altogether.

> Just to be specific, a typical implementation for such an operator would look like:
>
> typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
>
> static inline void cpus_or(cpumask_t d, const cpumask_t s1, const cpumask_t s2)
> {
> bitmap_or(d.bits, s1.bits, s2.bits, NR_CPUS);
> }

That'd be a noop, I think.

Cheers,
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-06 06:47:14

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> That'd be a noop, I think.

Huh?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 06:49:29

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Paul Jackson wrote:
> Nick wrote:
>
>>I like cpumask_t.
>
>
> Ok - one vote for cpumask_t.
>
> I could go either way. I see that 'struct foo' is more common than
> 'foo_t' in kernel code.
>
> I will not actually propose to change cpumask_t to 'struct cpumask'
> unless others want it. Without a half-way decent reason, it would just
> be stupid churning. But I wouldn't put up much resistance to such a
> change.
>

I think Linus likes keeping struct around if something is
a collection of items both conceptually and in its usage.
And prefers typedefs for things that are single entities
outside their implementation.


>
>
>>And you should not need to look inside it or use it with
>>anything other than using the cpumask interface, right?
>
>
> In my view, right - you (seldom) need to look inside. From what I can
> make of Rusty's statements so far, he apparently has a different view ;).
>

I prefer your complete API. I don't think there is any point
doing the abstraction at all if you only have half the API.

2004-04-06 06:55:24

by Nick Piggin

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Rusty Russell wrote:
> On Tue, 2004-04-06 at 16:06, Paul Jackson wrote:
>
>>>You'll have covered about 300 of them. I don't think a complete
>>>abstraction is actually required or desirable:
>>
>>I suspect we've hit on our first area of actual disagreement here.
>>
>>You observe that providing inline wrappers for the 5 most commonly
>>used cpumask macros would cover 300 of the 420 uses. The other 23
>>or so macros are less commonly used. Sounds about right ...
>>
>>I prefer to provide all 28 macros. I don't see a cost, but do see
>>a gain.
>
>
> Because I believe one should *always* resist the urge to write
> infrastructure. Wait until the users of your functionality gather out
> the front of your house with torches because they're all sick of the
> burden of using existing infrastructure.
>
> Really.
>
> I don't even want to learn 28 bitops primitives. I certainly don't want
> to learn 28 nodemask and 28 cpumask primitives.
>

If they are all equivalent operations, it is a lot saner
than having some "common" half ot the API available to
your abstract type, isn't it?

Surely it would have to be all or nothing...

2004-04-06 07:01:38

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Nick wrote:
> I prefer your complete API. I don't think there is any point
> doing the abstraction at all if you only have half the API.

Thanks. Any ideas how to persuade Rusty?

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 07:03:02

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Rusty wrote:
> Agreed. That's a big benefit of cutting it out altogether.

And if it wasn't that this would result in requiring every call to
specify the number of bits, resulting in one more chance for a stupid
error, and one less for type checking, I'd vote to remove
cpumask_t/nodemask_t, as I understand you would prefer.

One should resist infrastructure if one can nuke a layer entirely.

But half-baked layers introduce one more form of difficult to
remember inconsistency.

Hmmm ...

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 07:05:11

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Nick wrote:
>> I like cpumask_t.

On Mon, Apr 05, 2004 at 11:34:15PM -0700, Paul Jackson wrote:
> Ok - one vote for cpumask_t.
> I could go either way. I see that 'struct foo' is more common than
> 'foo_t' in kernel code.
> I will not actually propose to change cpumask_t to 'struct cpumask'
> unless others want it. Without a half-way decent reason, it would just
> be stupid churning. But I wouldn't put up much resistance to such a
> change.

The reason this wasn't done up-front was that the presence of a "struct"
constructor in the type caused concern about the operational semantics of
argument-passing being less efficient as they would be for arithmetic
types. Since the type had to be (a) ambiguous and (b) potentially
arithmetic the typedef was mandated by this.

If the requirements for ambiguity and arithmetic types are lifted, this
would be possible.

If this makes you happier (though I can't imagine why SGI and you
aren't more concerned with functional or performance issues, e.g.
mmlist_lock or tasklist_lock contention, or headless node or mixed cpu
speed support, or fully automatic large TLB entry use / superpages)
then by all means. Just (a) be careful so you don't make things explode
for some arch; arches rely on this stuff so please run things by arch
maintainers when rearranging their code and (b) please, please, for the
love of $DEITY, **NOT** "struct cpumask_struct".


Nick wrote:
>> And you should not need to look inside it or use it with
>> anything other than using the cpumask interface, right?

On Mon, Apr 05, 2004 at 11:34:15PM -0700, Paul Jackson wrote:
> In my view, right - you (seldom) need to look inside. From what I can
> make of Rusty's statements so far, he apparently has a different view ;).
> We'll see.

You should also bear in mind that the current implementations of these
operations use a macro calling convention, thereby altering their output
operands as a side-effect without call-by-reference. I've lost the
intestinal fortitude to investigate whether you already handled this,
but altering the calling convention to e.g. true call-by-value would
have to sweep all callers to act on the output operands compatibly anyway.


-- wli

2004-04-06 07:10:40

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Nick wrote:
> I think Linus likes keeping struct around if something is ...

makes sense - thanks

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 07:24:38

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Tue, 2004-04-06 at 16:45, Paul Jackson wrote:
> > That'd be a noop, I think.
>
> Huh?

You passed the structs by value: you wanted to pass the address.

Linus dislikes typedefs for various reasons: if it's actually a struct
on some archs and not on others, he likes it. Otherwise he historically
prefers the struct. However, Linus is not the most reliable barometer.

Ingo added task_t because it make some lines fit in the sched.c code,
for example. We removed list_t in 2.5.34.

I dislike typedefs because you have an extra name for something, and can
predeclare structs, which saves gratuitous header file inclusion.

Hope that clarifies,
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-06 07:34:16

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Don't worry, Bill. It doesn't look like anyone wants to change
cpumask_t to struct cpumask. I just wasn't objecting to it - shouldn't
even have mentioned it.

> so please run things by arch maintainers

I'll be doing that. And I'm sure Andrew wouldn't consider it otherwise.

> for the love of $DEITY, **NOT** "struct cpumask_struct".

I think we can all heartily agree to that advice.

> You should also bear in mind that the current implementations of these
> operations use a macro calling convention, thereby altering their output
> operands as a side-effect without call-by-reference.

Ah - I think you just explained to me Rusty's 'That'd be a noop', to which
I had responded 'Huh?'. Thanks.

And the added ampersands that Rusty added a couple of messages before that.

Duh ... smacking forehead.

Output operands need to be passed by pointer (which fact may or may not
be hidden in a macro ...).

At the risk of embarrassing myself again in public, how about this:

typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

#define cpus_or(d,s1,s2) _cpus_or(&d, &s1, &s2)

static inline void _cpus_or(cpumask_t *d, const cpumask_t *s1, const cpumask_t *s2)
{
bitmap_or(d->bits, s1->bits, s2->bits, NR_CPUS);
}

It would be used exactly as it is today:

cpumask_t x, y, z;
cpus_or(x, y, z);



--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 07:35:14

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Nick, responding to Rusty:
> > I certainly don't want
> > to learn 28 nodemask and 28 cpumask primitives.
> >
>
> If they are all equivalent operations,

In this case, they are equivalent. The nodemask.h header in this patch
series is very close to being:

< cpumask.h sed -e 's/cpu/node/' -e 's/NR_CPUS/MAX_NUMNODES/' > nodemask.h

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 07:35:06

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> You passed the structs by value: you wanted to pass the address.

Ah - yes. Blushing.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-06 10:43:38

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Rusty - thank-you very much for your constructive feedback so far.

Seems to me that we are in agreement that slimming down the
internals of cpumask_t is worth proceeding with, but not on possible
changes to the cpumask API seen by the rest of the kernel.

Following your suggestion, I have a new bitmap.c/bitmap.h working in my
workarea, that pulls in the changes I had done before in mask.h, such as
special casing bitmaps of a single word at compile time to use native C
ops.

A sample of the code in my include/linux/bitmap.h:

static inline void bitmap_and(unsigned long *d, const unsigned long *s1,
const unsigned long *s2, int nbits)
{
if (nbits <= BITS_PER_LONG)
d[0] = s1[0] & s2[0];
else
_bitmap_and(d, s1, s2, nbits);
}

And in lib/bitmap.c, the routine that was called bitmap_and() is now
called _bitmap_and(), and is (as before) not inlined.

I will continue with the coding to remove the mask ADT header file
include/linux/mask.h from my patch set, recoding cpumask.h (and
Matthew's nodemask.h) directly on top of bitmap/bitops, and defining
cpumask_t as:

typedef struct { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;

I'd like to table for the moment discussions of changing the cpumask API
seen by the rest of the kernel code. Some of us like what we have, and
we don't have anything resembling a consensus on something else. So
unless you want to agree that whatever I said is entirely right <grin>,
we lack the concensus needed to make pervasive cpumask API changes.

The above changes to the innards of cpumask will open the door to the
alternatives being considered (struct vs typedef, full vs partial vs no
extra 'cpumask' api, explicit vs implicit '&' pass by reference).

I acknowledge once again Bill Irwin's notice that this presents some
arch-specific risks. I've done some reasonable work resolving these for
sparc, i386 and ia64. But I still need to seek further feedback and
approval/disapproval from the arch maintainers. My current expectation
is that any remaining arch-specific issues will be narrow enough, and
that the gains from this infrastructure cleanup are broad enough, that
we will be able to resolve the arch-specific issues without compromising
the overall cleanup. Only time will tell for sure.

For now I expect to preserve the current implicit pass by reference that
(some of) the existing cpumask implementations used, I will preserve all
28 or so cpumask ops with their current signature, and I will likely use
a layer of static inlines to increase type safety.

That is, cpumasks will be bitmap arrays wrapped in a struct, and the ops
on them typically will be bitmap ops wrapped in inlines wrapped in macros.

I agree that there's a whole lot of wrapping going on there ... ;).

Continued feedback welcome.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-07 00:03:25

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

On Tue, 2004-04-06 at 20:40, Paul Jackson wrote:
> Rusty - thank-you very much for your constructive feedback so far.
>
> Seems to me that we are in agreement that slimming down the
> internals of cpumask_t is worth proceeding with, but not on possible
> changes to the cpumask API seen by the rest of the kernel.

OK, cool. We can have that debate later.

> static inline void bitmap_and(unsigned long *d, const unsigned long *s1,
> const unsigned long *s2, int nbits)
> {
> if (nbits <= BITS_PER_LONG)
> d[0] = s1[0] & s2[0];
> else
> _bitmap_and(d, s1, s2, nbits);
> }

Two suggestions:
1) I think you only want the fastpath when it's eliminated by the
compiler, so perhaps:
if (__builtin_constant_p(nbits) && nbits <= BITS_PER_LONG)

2) The normal kernel naming scheme is two underscores (__bitmap_and),
probably because it's clearer visually.

Thanks,
Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

2004-04-07 01:51:17

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

> OK, cool. We can have that debate [cpumask api changes] later.

Good - I look forward to it ;).

> if (__builtin_constant_p(nbits) && nbits <= BITS_PER_LONG)

Good idea. I'll play around with it, and see what goodness I can
find in it.

> The normal kernel naming scheme is two underscores (__bitmap_and),

Ok - two it is.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373

2004-04-07 03:58:12

by Paul Jackson

[permalink] [raw]
Subject: Re: [PATCH] mask ADT: new mask.h file [2/22]

Rusty suggested:
> 1) I think you only want the fastpath when it's eliminated by the
> compiler, so perhaps:
> if (__builtin_constant_p(nbits) && nbits <= BITS_PER_LONG)

It's not quite a cut and dried decision.

Define a 'small' system to be one where NR_CPUS <= BITS_PER_LONG, and a
'large' system to be those that aren't small.

If one had a bitmap operator called in a performance critical path on a
'small' system with a non-constant (unknown to the optimizer) bitmap
size, then one would _not_ want the above check for builtin_constant.

Leaving out the buildin_constant check results in code that checks the
variable size at runtime, performs the fast inline instructions on a
single word if it fits, else jumps to an out-of-line block of code that
calls the real __bitmap_op() function.

Since on a 'small' system, the bitmap size probably fits in a word, and
since on a performance critical path, it's worth checking for the fast
inline instruction opportunity, avoiding a jump out of line and avoiding
a real function call, the code obtained by leaving out the check for
builtin_constant is ideal.

However ... it costs you a half dozen machine instructions of text
space, for the out-of-line code block to handle the slow case that calls
the real __bitmap_op() function.

Since almost all systems are small, and since almost all bitmap operations
are not on critical paths, and since almost all bitmap operations are
called with a constant bitmap size, these half dozen machine instructions
are almost never worth a slug of warm spit.

Better to do just as Rusty suggests - if called with a non-constant
size, just say screw it and call the __bitmap_op() routine. While that
call might not have been necessary (might even be an 'issue' in a
performance critical path), it's not worth the half-dozen machine
instructions to find out.

So I guess the real question is:

Is it worth the slug of warm source code spit, the
extra __builtin_constant_p(nbits) condition, to get rid
of these six instructions for each bitmap_op() call
made with a non-constant bitmap size?

Or the real real question - is this discussion worth a slug of warm
lkml posts ... ;)?

===

Aha - I just built an 8 CPU SMP i386 config with my latest stuff.
Exactly one call appears to any __bitmap_* function, in the routine
bitmap_parse() of lib/bitmap.c:

bitmap_shift_right(maskp, maskp, CHUNKSZ, nmaskbits);

Ok - change this to the following, and save six text instructions
testing for the possibility that nmaskbits < BITS_PER_LONG:

__bitmap_shift_right(maskp, maskp, CHUNKSZ, nmaskbits);

There .. now the inline bitmap_* operators are _always_ seeing
constant bitmap sizes (for this exhaustive sampling of size one ;).

Conclusions:

1) To heck with the extra __builtin_constant_p(nbits) condition.

It's not worth polluting the source code with stuff to futz
with a six instruction space/time tradeoff in some situation
that doesn't yet exist, and if it did exist, might or might
not want such a tradeoff.

2) No - this post wasn't worth a slug of warm spit ;).

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373