Date: Sun, 13 May 2018 15:35:48 +0200 (CEST)
From: Thomas Gleixner
To: Andrew Morton
cc: Dexuan Cui, Ingo Molnar, Alexey Dobriyan, Peter Zijlstra, Greg Kroah-Hartman, Rakib Mullick, linux-kernel@vger.kernel.org, Linus Torvalds
Subject: Re: for_each_cpu() is buggy for UP kernel?
In-Reply-To: <20180509162027.95ffa21312f7363d13d5ea1e@linux-foundation.org>
References: <20180509162027.95ffa21312f7363d13d5ea1e@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 9 May 2018, Andrew Morton wrote:
> On Wed, 9 May 2018 06:24:16 +0000 Dexuan Cui wrote:
>
> > In include/linux/cpumask.h, for_each_cpu is defined like this for UP kernel (CONFIG_NR_CPUS=1):
> >
> > #define for_each_cpu(cpu, mask)                 \
> >         for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
> >
> > Here 'mask' is ignored, but what if 'mask' contains 0 CPU? -- in this case, the for loop should not
> > run at all, but with the current code, we run the loop once with cpu==0.
> >
> > I think I'm seeing a bug in my UP kernel that is caused by the buggy for_each_cpu():
> >
> > in kernel/time/tick-broadcast.c: tick_handle_oneshot_broadcast(), tick_broadcast_oneshot_mask
> > contains 0 CPU, but due to the buggy for_each_cpu(), the variable 'next_event' is changed from
> > its default value KTIME_MAX to "next_event = td->evtdev->next_event"; as a result,
> > tick_handle_oneshot_broadcast () -> tick_broadcast_set_event() -> clockevents_program_event()
> > -> pit_next_event() is programming the PIT timer by accident, causing an interrupt storm of PIT
> > interrupts in some way: I'm seeing that the kernel is receiving ~8000 PIT interrupts per second for
> > 1~5 minutes when the UP kernel boots, and it looks the kernel hangs, but in 1~5 minutes, finally
> > somehow the kernel can recover and boot up fine. But, occasionally, the kernel just hangs there
> > forever, receiving ~8000 PIT timers per second.
> >
> > With the below change in kernel/time/tick-broadcast.c, the interrupt storm will go away:
> >
> > +#undef for_each_cpu
> > +#define for_each_cpu(cpu, mask)                 \
> > +       for ((cpu) = 0; (((cpu) < 1) && ((mask)[0].bits[0] & 1)); (cpu)++, (void)mask)
> >
> > Should we fix the for_each_cpu() in include/linux/cpumask.h for UP?
>
> I think so, yes.  That might reveal new peculiarities, but such is life.
>
> I guess we should use bitmap_empty() rather than open-coding it.

Agreed.

FWIW, this had been discussed before, but there was no real conclusion:

  https://lkml.kernel.org/r/alpine.DEB.2.20.1709161850010.2105@nanos

Thanks,

	tglx
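
For readers who want to see the behaviour outside the kernel, the following is a minimal userspace sketch of the two macro definitions quoted in the thread. It is not kernel code: `struct cpumask` here is a simplified one-word stand-in, and the macro names `for_each_cpu_up_current` and `for_each_cpu_up_checked` as well as the helpers `visits_current()` and `visits_checked()` are hypothetical names chosen for illustration. The checked variant open-codes the bit test exactly as in the quoted workaround; it does not use bitmap_empty().

```c
/* Simplified stand-in for the kernel's struct cpumask (assumption: a
 * single word of bits, which is enough for CONFIG_NR_CPUS=1). */
struct cpumask {
	unsigned long bits[1];
};

/* The UP definition quoted in the thread: 'mask' is only evaluated as
 * (void)mask, so the body always runs once with cpu == 0. */
#define for_each_cpu_up_current(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* The workaround quoted in the thread: additionally require that bit 0
 * of the mask is set before entering the body. */
#define for_each_cpu_up_checked(cpu, mask) \
	for ((cpu) = 0; (((cpu) < 1) && ((mask)[0].bits[0] & 1)); (cpu)++, (void)mask)

/* Hypothetical helper: count how often the loop body runs with the
 * current UP definition. */
static int visits_current(const struct cpumask *mask)
{
	int cpu, visits = 0;

	for_each_cpu_up_current(cpu, mask)
		visits++;
	return visits;
}

/* Hypothetical helper: same count, using the mask-checking variant. */
static int visits_checked(const struct cpumask *mask)
{
	int cpu, visits = 0;

	for_each_cpu_up_checked(cpu, mask)
		visits++;
	return visits;
}
```

With an empty mask (`bits[0] == 0`), `visits_current()` returns 1 while `visits_checked()` returns 0; with CPU 0 set in the mask, both return 1. The first case corresponds to what the report describes for tick_broadcast_oneshot_mask in tick_handle_oneshot_broadcast().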