Clean up perfctr/virtual by using the new cpus_andnot() operation
instead of messing with cpus_complement() and a new temporary.
Signed-off-by: Mikael Pettersson <[email protected]>
diff -ruN linux-2.6.7-rc3-mm1/drivers/perfctr/virtual.c linux-2.6.7-rc3-mm1.perfctr-update/drivers/perfctr/virtual.c
--- linux-2.6.7-rc3-mm1/drivers/perfctr/virtual.c 2004-06-09 19:38:38.000000000 +0200
+++ linux-2.6.7-rc3-mm1.perfctr-update/drivers/perfctr/virtual.c 2004-06-09 21:04:33.000000000 +0200
@@ -403,13 +403,11 @@
return -EFAULT;
if (control.cpu_control.nractrs || control.cpu_control.nrictrs) {
- cpumask_t tmp, old_mask, new_mask, tmp1;
+ cpumask_t tmp, old_mask, new_mask;
tmp = perfctr_cpus_forbidden_mask;
- cpus_complement(tmp1, tmp);
- tmp = tmp1;
old_mask = tsk->cpus_allowed;
- cpus_and(new_mask, old_mask, tmp);
+ cpus_andnot(new_mask, old_mask, tmp);
if (cpus_empty(new_mask))
return -EINVAL;
> Clean up perfctr/virtual by using the new cpus_andnot() operation
Neat.
Do you still need "tmp" ? Perhaps you could further add the
following patch (untested, unbuilt, ...).
This saves copies and stack space for one cpumask (that's
512 bits on my SN2 systems).
Signed-off-by: Paul Jackson <[email protected]>
Index: 2.6.7-rc3-mm1/drivers/perfctr/virtual.c
===================================================================
--- 2.6.7-rc3-mm1.orig/drivers/perfctr/virtual.c 2004-06-09 15:34:34.000000000 -0700
+++ 2.6.7-rc3-mm1/drivers/perfctr/virtual.c 2004-06-09 15:38:32.000000000 -0700
@@ -403,11 +403,10 @@
return -EFAULT;
if (control.cpu_control.nractrs || control.cpu_control.nrictrs) {
- cpumask_t tmp, old_mask, new_mask;
+ cpumask_t old_mask, new_mask;
- tmp = perfctr_cpus_forbidden_mask;
old_mask = tsk->cpus_allowed;
- cpus_andnot(new_mask, old_mask, tmp);
+ cpus_andnot(new_mask, old_mask, perfctr_cpus_forbidden_mask);
if (cpus_empty(new_mask))
return -EINVAL;
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373
Or ... pushing the point further ... one _could_ remove the old_mask as
well. However I think this makes the code less clear, even if it does
save a stack copy of a cpumask_t. So I'm of mixed feelings on this
patch, edging slightly toward negative. Same utter lack of testing as
the previous patch.
Signed-off-by: Paul Jackson <[email protected]>
Index: 2.6.7-rc3-mm1/drivers/perfctr/virtual.c
===================================================================
--- 2.6.7-rc3-mm1.orig/drivers/perfctr/virtual.c 2004-06-09 15:38:32.000000000 -0700
+++ 2.6.7-rc3-mm1/drivers/perfctr/virtual.c 2004-06-09 15:53:06.000000000 -0700
@@ -403,14 +403,14 @@
return -EFAULT;
if (control.cpu_control.nractrs || control.cpu_control.nrictrs) {
- cpumask_t old_mask, new_mask;
+ cpumask_t new_mask;
- old_mask = tsk->cpus_allowed;
- cpus_andnot(new_mask, old_mask, perfctr_cpus_forbidden_mask);
+ cpus_andnot(new_mask, tsk->cpus_allowed,
+ perfctr_cpus_forbidden_mask);
if (cpus_empty(new_mask))
return -EINVAL;
- if (!cpus_equal(new_mask, old_mask))
+ if (!cpus_equal(tsk->cpus_allowed, new_mask))
set_cpus_allowed(tsk, new_mask);
}
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373
Paul Jackson writes:
> > Clean up perfctr/virtual by using the new cpus_andnot() operation
>
> Neat.
>
> Do you still need "tmp" ? Perhaps you could further add the
> following patch (untested, unbuilt, ...).
>
> This saves copies and stack space for one cpumask (that's
> 512 bits on my SN2 systems).
>
> Signed-off-by: Paul Jackson <[email protected]>
>
> Index: 2.6.7-rc3-mm1/drivers/perfctr/virtual.c
> ===================================================================
> --- 2.6.7-rc3-mm1.orig/drivers/perfctr/virtual.c 2004-06-09 15:34:34.000000000 -0700
> +++ 2.6.7-rc3-mm1/drivers/perfctr/virtual.c 2004-06-09 15:38:32.000000000 -0700
> @@ -403,11 +403,10 @@
> return -EFAULT;
>
> if (control.cpu_control.nractrs || control.cpu_control.nrictrs) {
> - cpumask_t tmp, old_mask, new_mask;
> + cpumask_t old_mask, new_mask;
>
> - tmp = perfctr_cpus_forbidden_mask;
> old_mask = tsk->cpus_allowed;
> - cpus_andnot(new_mask, old_mask, tmp);
> + cpus_andnot(new_mask, old_mask, perfctr_cpus_forbidden_mask);
Doesn't work because cpus_andnot() requires all three parameters
to be lvalues. In UP and PowerPC builds, perfctr_cpus_forbidden_mask
is #define:d to CPU_MASK_NONE to allow client-side optimisations.
Making it always be a variable will slow down UP and PowerPC with
unoptimisable cpumask tests; alternatively I'll have to wrap my
cpumask uses with optimisation macros and/or #ifdefs.
/Mikael
Mikael wrote:
> > - cpus_andnot(new_mask, old_mask, tmp);
> > + cpus_andnot(new_mask, old_mask, perfctr_cpus_forbidden_mask);
>
> Doesn't work because cpus_andnot() requires all three parameters
> to be lvalues. ... CPU_MASK_NONE ...
I think your other fix (also done by Bill Irwin), make the above possible:
#define CPU_MASK_NONE \
-{ { \
+((cpumask_t) { { \
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373
Mikael wrote:
>> Doesn't work because cpus_andnot() requires all three parameters
>> to be lvalues. ... CPU_MASK_NONE ...
On Thu, Jun 10, 2004 at 09:01:57AM -0700, Paul Jackson wrote:
> I think your other fix (also done by Bill Irwin), make the above possible:
> #define CPU_MASK_NONE \
> -{ { \
> +((cpumask_t) { { \
I've posted this at least twice, I think once in isolation for some
driver (MSI?) and once as part of the Alpha fixes.
Please get some cross-compilers together so we don't have every non-x86
arch exploding at once. Alpha vs. cpu_possible_map was horrendous.
-- wli
On Thu, 10 Jun 2004 09:03:50 -0700 William Lee Irwin III wrote:
| Mikael wrote:
| >> Doesn't work because cpus_andnot() requires all three parameters
| >> to be lvalues. ... CPU_MASK_NONE ...
|
| On Thu, Jun 10, 2004 at 09:01:57AM -0700, Paul Jackson wrote:
| > I think your other fix (also done by Bill Irwin), make the above possible:
| > #define CPU_MASK_NONE \
| > -{ { \
| > +((cpumask_t) { { \
|
| I've posted this at least twice, I think once in isolation for some
| driver (MSI?) and once as part of the Alpha fixes.
|
| Please get some cross-compilers together so we don't have every non-x86
| arch exploding at once. Alpha vs. cpu_possible_map was horrendous.
or submit such patches to OSDL PLM. PLM will apply the patch
(to whatever base you say, although you may have to supply the
"base" also; linus and -mm bases are already there)
and then try to cross-compile it on ia32, ia64, ppc32, ppc64, alpha,
sparc32, sparc64, and x86_64.
E.g., here's the results of 2.6.7-rc3-bk3:
https://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=3010
--
~Randy
William Lee Irwin wrote:
> Please get some cross-compilers together so we don't have every non-x86
> arch exploding at once. Alpha vs. cpu_possible_map was horrendous.
Yes sir.
Randy Dunlap wrote:
> or submit such patches to OSDL PLM.
That too.
I'm reminded of the old saw:
Fools rush in where angels fear to tread.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.650.933.1373