Date: Tue, 25 Sep 2018 10:50:23 -0700
From: "Luck, Tony"
To: Borislav Petkov
Cc: Justin Ernst, russ.anderson@hpe.com, Mauro Carvalho Chehab, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Raise maximum number of memory controllers
Message-ID: <20180925175023.GA16725@agluck-desk>
In-Reply-To: <20180925152659.GE23986@zn.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 25, 2018 at 05:26:59PM +0200, Borislav Petkov wrote:
> On Tue, Sep 25, 2018 at 09:34:49AM -0500, Justin Ernst wrote:
> > We observe an oops in the skx_edac module during boot.
> > Examining /var/log/messages:
> > [ 3401.985757] EDAC MC0: Giving out device to module skx_edac controller Skylake Socket#0 IMC#0
> > [ 3401.985887] EDAC MC1: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.986014] EDAC MC2: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > ...
> > [ 3401.987318] EDAC MC13: Giving out device to module skx_edac controller Skylake Socket#0 IMC#1
> > [ 3401.987435] EDAC MC14: Giving out device to module skx_edac controller Skylake Socket#1 IMC#0
> > [ 3401.987556] EDAC MC15: Giving out device to module skx_edac controller Skylake Socket#1 IMC#1
> > [ 3401.987579] Too many memory controllers: 16
> > [ 3402.042614] EDAC MC: Removed device 0 for skx_edac Skylake Socket#0 IMC#0
> >
> > We observe there are two memory controllers per socket, with a limit of 16.
> > Raise the maximum number of memory controllers from 16 to 2 * MAX_NUMNODES (1024).
>
> Tony,
>
> can we read that out from the hardware instead of having this silly
> static number?
>
> Leaving in the rest.

There are way too many places where we use the identifier "bus"
in the edac core and drivers.

But I'm not sure that we need a static array mc_bus[EDAC_MAX_MCS].

Why can't we:

-	mci->bus = &mc_bus[mci->mc_idx];
+	mci->bus = kmalloc(sizeof *(mci->bus), GFP_KERNEL);

and then figure out where to kfree(mci->bus) on driver removal?

Do we ever do arithmetic on different mci->bus pointers that
assume they are all part of a single array?

-Tony
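
For illustration only, here is a rough userspace sketch of the per-controller allocation suggested above. It is not kernel code and has not been checked against the EDAC tree. The struct layouts and the helper names mci_bus_alloc() and mci_bus_free() are hypothetical stand-ins. calloc() and free() stand in for a zeroing kernel allocation and kfree(); zeroing is an assumption beyond the plain kmalloc() in the suggestion.

```c
#include <stdlib.h>

/* Userspace stand-ins. The names echo the kernel ones, but these are
 * NOT the kernel definitions; the fields are illustrative only. */
struct bus_type {
	const char *name;
};

struct mem_ctl_info {
	unsigned int mc_idx;
	struct bus_type *bus;
};

/* Hypothetical helper: give each controller its own bus object instead
 * of pointing into a fixed-size static array indexed by mc_idx.
 * calloc() stands in for a zeroing kernel allocation with GFP_KERNEL. */
static int mci_bus_alloc(struct mem_ctl_info *mci)
{
	mci->bus = calloc(1, sizeof(*mci->bus));
	if (!mci->bus)
		return -1;	/* kernel code would return -ENOMEM here */
	return 0;
}

/* Hypothetical counterpart for the driver removal path.
 * free() stands in for kfree(); clearing the pointer guards against
 * accidental reuse after release. */
static void mci_bus_free(struct mem_ctl_info *mci)
{
	free(mci->bus);
	mci->bus = NULL;
}
```

With this shape the number of controllers is bounded only by available memory, so a 16-entry limit never comes into play. It is only safe if nothing compares or subtracts mci->bus pointers on the assumption that they sit in one array, which is the open question above.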