From: Linus Torvalds
Date: Sun, 13 May 2018 11:21:31 -0700
Subject: Re: for_each_cpu() is buggy for UP kernel?
To: Dexuan Cui
Cc: Ingo Molnar, Alexey Dobriyan, Andrew Morton, Peter Zijlstra, Thomas Gleixner, Greg Kroah-Hartman, Rakib Mullick, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 8, 2018 at 11:24 PM Dexuan Cui wrote:
> Should we fix the for_each_cpu() in include/linux/cpumask.h for UP?

As Thomas points out, this has come up before.

One of the issues is historical - we tried very hard to make the SMP code not cause code generation problems for UP, and part of that was just that all these loops were literally designed to entirely go away under UP. It still *looks* syntactically like a loop, but an optimizing compiler will see that there's nothing there, and "for_each_cpu(...) x" essentially just turns into "x" on UP.

An empty mask simply generally doesn't make sense, since on UP you also don't have any masking of CPU ops, so the mask is ignored, and that helps the code generation immensely. If you have to load and test the mask, you immediately lose out badly in code generation.

So honestly, I'd really prefer to keep our current behavior. Perhaps with a debug option that actually tests (on SMP - because that's what every developer is actually _using_ these days) that the mask isn't empty. But I'm not sure that would find this case, since presumably on SMP it might never be empty.
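To make the "loop goes away" point concrete, here is a minimal userspace sketch, not the kernel's actual header. The names `up_cpumask`, `up_for_each_cpu` and `up_count_iterations` are made up for illustration; the macro is only modeled on the general shape of a UP-style iterator that evaluates the mask without testing it.

```c
/* Hypothetical stand-in for a one-CPU cpumask: bit 0 is the only CPU. */
struct up_cpumask {
	unsigned long bits;
};

/*
 * UP-style iteration (illustrative, not copied from the kernel):
 * the mask expression is evaluated and discarded, never tested, so the
 * body runs exactly once with cpu == 0 and the compiler can reduce the
 * whole "loop" to a single execution of the body.
 */
#define up_for_each_cpu(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

/* Count how many times the loop body runs for a given mask. */
static int up_count_iterations(const struct up_cpumask *mask)
{
	int cpu, n = 0;

	up_for_each_cpu(cpu, mask)
		n++;
	return n;
}
```

With this shape, `up_count_iterations()` returns 1 whether or not bit 0 is set, which is exactly the inconsistency being discussed: an empty mask still runs the body once.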
Now, there is likely a fairly good argument that UP is getting _so_ uninteresting that we shouldn't even worry about code generation. But the counter-argument to that is that if people are using UP in this day and age, they probably are using some really crappy hardware that needs all the help it can get.

At least for now, I'd rather have this inconsistency, because it really makes a surprisingly *big* difference in code generation.

From the little test I just did, adding that mask testing to a *single* case of for_each_cpu() added 20 instructions. I didn't look at exactly why that happened (because the code generation was so radically different), but it was very noticeable.

I used your macro replacement in kernel/taskstats.c in case you want to try to dig into what happened, but I'm not surprised. It really turns an unconditional trivial loop into a much more complex thing that needs to look at and test a value that we didn't care about before.

Maybe we should introduce a "for_each_cpu_maybe_empty()" helper for cases like this?

                 Linus
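For readers wondering what such a helper might look like, the following is a rough userspace sketch under stated assumptions. It is neither the macro replacement mentioned above nor anything that exists in the kernel: `up_for_each_cpu_maybe_empty` and `count_maybe_empty` are hypothetical names, and the mask is simplified to a plain `unsigned long` whose bit 0 stands for the only CPU.

```c
/*
 * Hypothetical UP variant that honors an empty mask: the loop condition
 * loads the mask and tests bit 0, so the body is skipped when no CPU is
 * set. This extra load and test is the cost discussed in the message.
 */
#define up_for_each_cpu_maybe_empty(cpu, bits) \
	for ((cpu) = 0; (cpu) < 1 && ((bits) & 1UL); (cpu)++)

/* Count how many times the body runs for a given one-CPU mask value. */
static int count_maybe_empty(unsigned long bits)
{
	int cpu, n = 0;

	up_for_each_cpu_maybe_empty(cpu, bits)
		n++;
	return n;
}
```

Here `count_maybe_empty(0UL)` returns 0 and `count_maybe_empty(1UL)` returns 1. Keeping this as a separate, opt-in helper would leave the common unconditional iterator free of the mask test, so only callers that can really see an empty mask pay for it.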