MIME-Version: 1.0
References: <20180911225630.124502-1-venture@google.com> <585d1c3a-6121-c20d-f6d6-7567595cd1af@acm.org>
In-Reply-To:
From: Patrick Venture
Date: Tue, 18 Sep 2018 11:42:40 -0700
Message-ID:
Subject: Re: [PATCH v2] ipmi: looped device detection
To: Corey Minyard
Cc: Arnd Bergmann, Greg KH, openipmi-developer@lists.sourceforge.net, Linux Kernel Mailing List, OpenBMC Maillist
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 12, 2018 at 3:54 PM Patrick Venture wrote:
>
> On Wed, Sep 12, 2018 at 3:10 PM Corey Minyard wrote:
> >
> > On 09/11/2018 05:56 PM, Patrick Venture wrote:
> > > Try to get the device ID repeatedly during initialization before giving up.
> > > The BMC isn't always responsive, and this allows it to be slightly flaky
> > > during early boot.
> > >
> > > Tested: Installed on a system with the BMC software disabled
> > > such that it was non-responsive. The driver correctly detected this
> > > and gave up as expected. Then I re-enabled the BMC software unloaded
> > > and reloaded the driver and it was detected properly.
> >
> > The patch looks fine, but I wonder if this is something that is really
> > valuable. I have wondered about this before.
> >
> > The question is: If the BMC is unavailable, what are the chances of it
> > becoming available by the time you do 5 attempts?
> > I would guess that is a pretty small chance, which is why I haven't done
> > this already.

Friendly ping. I'd like to get a sense of whether you're likely to
accept this. If not, that's fine; I'll close out the patch in the
current downstream rebase.

Thanks

>
> This patch was actually critical for us to provide a reliable IPMI
> interface. The version of OpenBMC or the state of the BMC at the
> point the kernel was loading was flaky, so following the example in
> the BIOS source, we just re-try a few times. We also can hold boot X
> seconds until it's responding, but, this avoided some issues inherent
> with that.
> >
> > You could have something that re-tested periodically, but there are so many
> > systems with IPMI specified in ACPI or SMBIOS that is wrong, and it would
> > try forever. Also not really a good thing.
>
> If we did a periodic check, it could check X times, but I felt going
> for a simple solution was ideal -- and this idea was proved out on a
> few platforms. We have other drivers that are loaded by the kernel
> (not at run-time) and they depend on IPMI, and without this patch they
> would then have a non-trivial probability of failure.
> >
> > So I've left it to reload the driver or use the hotmod interface.
> >
> > -corey
> >
> > > Signed-off-by: Patrick Venture
> > > ---
> > > v2:
> > >  - removed extra variable that was set but not used.
> > > ---
> > >  drivers/char/ipmi/ipmi_si_intf.c | 23 ++++++++++++++++++++++-
> > >  1 file changed, 22 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> > > index 90ec010bffbd..5fed96897fe8 100644
> > > --- a/drivers/char/ipmi/ipmi_si_intf.c
> > > +++ b/drivers/char/ipmi/ipmi_si_intf.c
> > > @@ -1918,11 +1918,13 @@ int ipmi_si_add_smi(struct si_sm_io *io)
> > >   * held, primarily to keep smi_num consistent, we only one to do these
> > >   * one at a time.
> > >   */
> > > +#define GET_DEVICE_ID_ATTEMPTS 5
> > >  static int try_smi_init(struct smi_info *new_smi)
> > >  {
> > >  	int rv = 0;
> > >  	int i;
> > >  	char *init_name = NULL;
> > > +	unsigned long sleep_rm;
> > >
> > >  	pr_info(PFX "Trying %s-specified %s state machine at %s address 0x%lx, slave address 0x%x, irq %d\n",
> > >  		ipmi_addr_src_to_str(new_smi->io.addr_source),
> > > @@ -2003,7 +2005,26 @@ static int try_smi_init(struct smi_info *new_smi)
> > >  	 * Attempt a get device id command. If it fails, we probably
> > >  	 * don't have a BMC here.
> > >  	 */
> > > -	rv = try_get_dev_id(new_smi);
> > > +	for (i = 0; i < GET_DEVICE_ID_ATTEMPTS; i++) {
> > > +		pr_info(PFX "Attempting to read BMC device ID\n");
> > > +		rv = try_get_dev_id(new_smi);
> > > +		/* If it succeeded, stop trying */
> > > +		if (!rv)
> > > +			break;
> > > +
> > > +		/* Sleep for ~0.25s before trying again instead of hammering
> > > +		 * the BMC.
> > > +		 */
> > > +		sleep_rm = msleep_interruptible(250);
> > > +		if (sleep_rm != 0) {
> > > +			pr_info(PFX "Find BMC interrupted\n");
> > > +			rv = -EINTR;
> > > +			goto out_err;
> > > +		}
> > > +	}
> > > +
> > > +	/* If we exited the loop above and rv is non-zero we ran out of tries.
> > > +	 */
> > >  	if (rv) {
> > >  		if (new_smi->io.addr_source)
> > >  			dev_err(new_smi->io.dev,
> > >
>
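
For anyone skimming the thread, here is a rough stand-alone sketch of the bounded-retry pattern the quoted patch uses. It is userspace C, not kernel code: probe_with_retries(), probe_fn and flaky_probe() are names made up for this illustration, and nanosleep() stands in for msleep_interruptible(). Unlike the patch, the sketch skips the sleep after the last failed attempt.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical probe callback: returns 0 on success, non-zero on failure. */
typedef int (*probe_fn)(void *ctx);

/*
 * Call probe() up to max_attempts times, sleeping delay_ms between failed
 * attempts.  Returns 0 on the first success, a negative errno (for example
 * -EINTR) if the sleep fails, or the last probe error once the attempts
 * are used up.
 */
static int probe_with_retries(probe_fn probe, void *ctx,
			      int max_attempts, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int rv = -EINVAL;
	int i;

	assert(probe != NULL);

	for (i = 0; i < max_attempts; i++) {
		rv = probe(ctx);
		/* If it succeeded, stop trying. */
		if (!rv)
			break;

		/* Back off before the next attempt instead of hammering. */
		if (i + 1 < max_attempts && nanosleep(&delay, NULL) != 0)
			return -errno;
	}

	return rv;
}

/* Example probe: fails until the counter behind ctx reaches zero. */
static int flaky_probe(void *ctx)
{
	int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return -ENODEV;
	}
	return 0;
}
```

With 5 attempts, a device that fails twice and then answers is detected, while one that keeps failing is given up on after the fifth try, which is the trade-off discussed above: a fixed, small cost at probe time rather than retrying forever.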