Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753371AbbBZGUH (ORCPT ); Thu, 26 Feb 2015 01:20:07 -0500 Received: from linuxhacker.ru ([217.76.32.60]:53671 "EHLO fiona.linuxhacker.ru" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP id S1751352AbbBZGUD (ORCPT ); Thu, 26 Feb 2015 01:20:03 -0500 From: green@linuxhacker.ru To: Rusty Russell , Andrew Morton , "David S. Miller" Cc: linux-kernel@vger.kernel.org, Oleg Drokin Subject: [PATCH 0/2] incorrect cpumask behavior with CPUMASK_OFFSTACK Date: Thu, 26 Feb 2015 01:19:09 -0500 Message-Id: <1424931551-11757-1-git-send-email-green@linuxhacker.ru> X-Mailer: git-send-email 2.1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2571 Lines: 62 From: Oleg Drokin I just got a report today from Tyson Whitehead that Lustre crashes when CPUMASK_OFFSTACK is enabled. A little investigation revealed that this code: cpumask_t mask; ... cpumask_copy(&mask, topology_thread_cpumask(0)); weight = cpus_weight(mask); that was supposed to calculate number of cpu siblings/partitions returns a crazy high number over 3000 which is impossible as I only have 8 cpu cores on my system. So after a bit of digging I found out that: cpumask_copy only copies up to nr_cpumask_bits (actual number of cpus I have in the system), where as cpumask_weight actully tries to count bits up to NR_CPUS. Not only calculating up to NR_CPUS is wasteful in this case, and since we know how many cpus we have in the system - it only makes sense to calculate only this much anyway, it's wrong because the copy only copied 8 bits to our variable and the rest of it is some random stack garbage. So I propose two patches here, the first one I am more certain about - operations that operate on current cpuset like cpus_weight, but also cpus_empty, cpus_$LOGICALop cpus_$BINARYop are converted from NR_CPUS to nr_cpumask_bits (this is ok when CONFIG_CPUMASK_OFFSTACK is not set as it's then defined to NR_CPUS anyway). I am leaving __cpus_setall __cpus_clear out of it as these two look like they deal with entire set and it would be useful for them to operate on all NR_CPUS bits for the case if more cpus are added later and such. The second patch that I am not sure if we wnat, but it seems to be useful until struct cpumask is fully dynamic is to convert what looks like whole-set operations e.g. copies, namely: cpumask_setall, cpumask_clear, cpumask_copy to always operate on NR_CPUS bits to ensure there's no stale garbage left in the mask should the cpu count increases later. I checked the code and allocating cpumasks on stack is not all that uncommon in the code, so this should be a worthwhile fix. Please consider. Oleg Drokin (2): cpumask: Properly calculate cpumask values cpumask: make whole cpumask operations like copy to work with NR_CPUS bits include/linux/cpumask.h | 37 +++++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 16 deletions(-) -- 2.1.0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/