Date: Wed, 9 May 2018 16:20:27 -0700
From: Andrew Morton
To: Dexuan Cui
Cc: Ingo Molnar , Alexey Dobriyan , Peter Zijlstra , Thomas Gleixner , Greg Kroah-Hartman , Rakib Mullick , "'linux-kernel@vger.kernel.org'"
Subject: Re: for_each_cpu() is buggy for UP kernel?
Message-Id: <20180509162027.95ffa21312f7363d13d5ea1e@linux-foundation.org>

On Wed, 9 May 2018 06:24:16 +0000 Dexuan Cui wrote:

> In include/linux/cpumask.h, for_each_cpu is defined like this for UP kernel (CONFIG_NR_CPUS=1):
>
> #define for_each_cpu(cpu, mask) \
> 	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)
>
> Here 'mask' is ignored, but what if 'mask' contains 0 CPU? -- in this case, the for loop should not
> run at all, but with the current code, we run the loop once with cpu==0.
>
> I think I'm seeing a bug in my UP kernel that is caused by the buggy for_each_cpu():
>
> in kernel/time/tick-broadcast.c: tick_handle_oneshot_broadcast(), tick_broadcast_oneshot_mask
> contains 0 CPU, but due to the buggy for_each_cpu(), the variable 'next_event' is changed from
> its default value KTIME_MAX to "next_event = td->evtdev->next_event"; as a result,
> tick_handle_oneshot_broadcast () -> tick_broadcast_set_event() -> clockevents_program_event()
> -> pit_next_event() is programming the PIT timer by accident, causing an interrupt storm of PIT
> interrupts in some way: I'm seeing that the kernel is receiving ~8000 PIT interrupts per second for
> 1~5 minutes when the UP kernel boots, and it looks the kernel hangs, but in 1~5 minutes, finally
> somehow the kernel can recover and boot up fine. But, occasionally, the kernel just hangs there
> forever, receiving ~8000 PIT timers per second.
>
> With the below change in kernel/time/tick-broadcast.c, the interrupt storm will go away:
>
> +#undef for_each_cpu
> +#define for_each_cpu(cpu, mask) \
> +	for ((cpu) = 0; (((cpu) < 1) && ((mask)[0].bits[0] & 1)); (cpu)++, (void)mask)
>
> Should we fix the for_each_cpu() in include/linux/cpumask.h for UP?

I think so, yes.  That might reveal new peculiarities, but such is life.

I guess we should use bitmap_empty() rather than open-coding it.
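
For anyone wanting to see the behaviour outside the kernel, here is a minimal userspace sketch of the two loop forms discussed above. It is only a model: struct cpumask below is a simplified stand-in for the kernel type, and the macro and function names (for_each_cpu_up_ignoring_mask, for_each_cpu_up_checked, count_ignoring_mask, count_checked) are made up for illustration. The checked variant open-codes the bit test as in the quoted change; it does not use bitmap_empty().

```c
/*
 * Userspace model of the UP (CONFIG_NR_CPUS=1) for_each_cpu() question.
 * Everything here is a simplified assumption, not kernel code.
 */

/* Stand-in for the kernel's struct cpumask: a single word of CPU bits. */
struct cpumask {
	unsigned long bits[1];
};

/* Same shape as the UP definition quoted above: 'mask' is ignored. */
#define for_each_cpu_up_ignoring_mask(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

/* Same shape as the proposed change: also require bit 0 to be set. */
#define for_each_cpu_up_checked(cpu, mask) \
	for ((cpu) = 0; (((cpu) < 1) && ((mask)[0].bits[0] & 1)); (cpu)++, (void)(mask))

/* Number of loop iterations when the mask is ignored. */
int count_ignoring_mask(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_up_ignoring_mask(cpu, mask)
		n++;
	return n;
}

/* Number of loop iterations when bit 0 of the mask is tested. */
int count_checked(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_up_checked(cpu, mask)
		n++;
	return n;
}
```

With an empty mask (bits[0] == 0), count_ignoring_mask() still runs the body once while count_checked() runs it zero times; with CPU 0 set, both run it once. That single extra iteration on an empty mask is the case the report attributes the stray next_event update to.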