Message-ID: <1520260903.2637.34.camel@gmail.com>
Subject: Re: regression: SCSI/SATA failure
From: Artem Bityutskiy
Reply-To: dedekind1@gmail.com
To: Christoph Hellwig
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Christian Borntraeger,
    Stefan Haberland, Jens Axboe, "Herring, Jan-kristian Augustin",
    Thorsten Leemhuis
Date: Mon, 05 Mar 2018 16:41:43 +0200
In-Reply-To: <1519311270.2535.53.camel@intel.com>
References: <1519311270.2535.53.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Linux-Regression-ID: lr#15a115

On Thu, 2018-02-22 at 16:54 +0200, Artem Bityutskiy wrote:
> Hi Christoph,
>
> one of our test box Skylake servers does not boot with v4.16-rcX.
> Bisection lead us to this commit:
>
> 84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs
>
> Reverting this single commit fixes the problem.
>
> The server is a Dell R640 machine with the latest Dell BIOS. It has a
> single SATA SSD and we do not use raid, even though the system does
> have a megaraid controller.

Correction: we have RAID0 with this single disk.

> Are you aware of this issue? Below is the failure message and the full
> dmesg with some debugging boot parameters is here:
>
> https://pastebin.com/raw/tTYrTAEQ

FYI, the regression still exists, and reverting this single patch fixes
it. But today Dell server

I did not have time to really debug this, but I think people who work on
this code should quickly see what is going on.

I think the platform reports a possible CPU count that is far too large.
Indeed, in dmesg I see this:

[    0.000000] smpboot: Allowing 328 CPUs, 224 hotplug CPUs

224 is far too many hotplug CPUs for this system. It only has 2 sockets,
but the number looks as if the system had 4 sockets.

The commit changes the IRQ affinity logic from iterating over present
CPUs to iterating over possible CPUs:

-	for_each_present_cpu(cpu)
+	for_each_possible_cpu(cpu)

And it looks like this has an unexpected side effect on this Dell
platform.

Artem.
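
For anyone who wants to check for the same possible-versus-present mismatch on their own machine, here is a minimal userspace sketch. It was not part of the original report. It assumes the standard sysfs files /sys/devices/system/cpu/possible and /sys/devices/system/cpu/present, and it assumes they contain plain comma-separated ranges such as "0-103". The helper names count_cpulist and read_count are made up for this illustration.

```python
#!/usr/bin/env python3
"""Compare the kernel's possible and present CPU masks (illustrative sketch)."""
from pathlib import Path

# Standard Linux sysfs location for the CPU masks.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def count_cpulist(cpulist: str) -> int:
    """Count CPUs in a cpulist string such as "0-103" or "0-3,8-11".

    Assumption: only plain numbers and "first-last" ranges appear.
    """
    total = 0
    for chunk in cpulist.strip().split(","):
        if not chunk:
            continue  # an empty mask is reported as an empty string
        first, _, last = chunk.partition("-")
        # A single CPU ("5") has no "-", so treat it as the range 5-5.
        total += int(last or first) - int(first) + 1
    return total


def read_count(name: str):
    """Return the CPU count for the sysfs mask <name>, or None if unreadable."""
    try:
        return count_cpulist((SYSFS_CPU / name).read_text())
    except OSError:
        return None  # not Linux, or sysfs is not mounted


if __name__ == "__main__":
    possible = read_count("possible")
    present = read_count("present")
    if possible is None or present is None:
        print("sysfs CPU masks are not available on this system")
    else:
        print(f"possible={possible} present={present} "
              f"hotplug-only={possible - present}")
```

If the dmesg line above is read as 328 possible CPUs of which 224 are hotplug-only, that would leave 104 CPUs actually present. On such a box the script would be expected to show a large gap between the two counts, which is the situation where iterating over possible instead of present CPUs behaves very differently.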