From: Dan Williams
Date: Thu, 9 Jul 2020 09:10:06 -0700
Subject: Re: [PATCH v2 11/12] PM, libnvdimm: Add 'mem-quiet' state and callback for firmware activation
To: Jason Gunthorpe
Cc: Christoph Hellwig, linux-nvdimm, Greg Kroah-Hartman, "Rafael J. Wysocki", Doug Ledford, Pavel Machek, Len Brown, Linux ACPI, Linux Kernel Mailing List
References: <159408711335.2385045.2567600405906448375.stgit@dwillia2-desk3.amr.corp.intel.com> <159408717289.2385045.14094866475168644020.stgit@dwillia2-desk3.amr.corp.intel.com> <20200709150051.GA17342@infradead.org> <20200709153854.GY23821@mellanox.com>
In-Reply-To: <20200709153854.GY23821@mellanox.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 9, 2020 at 8:39 AM Jason Gunthorpe wrote:
>
> On Thu, Jul 09, 2020 at 04:00:51PM +0100, Christoph Hellwig wrote:
> > On Mon, Jul 06, 2020 at 06:59:32PM -0700, Dan Williams wrote:
> > > The runtime firmware activation capability of Intel NVDIMM devices
> > > requires memory transactions to be disabled for 100s of microseconds.
> > > This timeout is large enough to cause in-flight DMA to fail and other
> > > application detectable timeouts. Arrange for firmware activation to be
> > > executed while the system is "quiesced", all processes and device-DMA
> > > frozen.
> > >
> > > It is already required that invoking device ->freeze() callbacks is
> > > sufficient to cease DMA. A device that continues memory writes outside
> > > of user-direction violates expectations of the PM core to be to
> > > establish a coherent hibernation image.
> > >
> > > That said, RDMA devices are an example of a device that access memory
> > > outside of user process direction.
>
> Are you saying freeze doesn't work for some RDMA drivers? That would
> be a driver bug, I think.

Right, it's more my hunch than a known bug at this point, but in my
experience with testing server-class hardware, when I've reported
power management bugs I've sometimes got the incredulous response "who
suspends / hibernates servers!?". I can drop that comment.

Are there protocol timeouts that might need to be adjusted for a blip
of 100s of microseconds in memory controller response?

> The consequences of doing freeze are pretty serious, but it should
> still stop DMA.

Ok, and there is still the option to race the quiesce if the effects
of the freeze are worse than a potential timeout from the quiesce.
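
The ordering discussed in this thread (freeze every device, activate firmware only once all DMA has stopped, then thaw) can be modeled in a few lines of user-space Python. This is a toy sketch, not kernel code and not the patch's implementation: `ModelDevice`, `LeakyDevice`, and `activate_quiesced` are hypothetical names invented for illustration, and the "leaky" device only models the suspected, unconfirmed case of a driver whose freeze callback returns while DMA continues.

```python
# Toy user-space model; every name here is hypothetical, none is a kernel API.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ModelDevice:
    name: str
    dma_active: bool = True

    def freeze(self) -> None:
        # A well-behaved freeze callback stops all DMA before returning.
        self.dma_active = False

    def thaw(self) -> None:
        # Resume normal operation, DMA included.
        self.dma_active = True


class LeakyDevice(ModelDevice):
    def freeze(self) -> None:
        # Models the suspected driver bug: freeze returns but DMA continues.
        pass


def activate_quiesced(devices: List[ModelDevice],
                      activate: Callable[[], None]) -> bool:
    """Freeze all devices, activate only if DMA is quiet, thaw in reverse."""
    frozen: List[ModelDevice] = []
    try:
        for dev in devices:
            dev.freeze()
            frozen.append(dev)
        if any(dev.dma_active for dev in devices):
            return False  # some device ignored freeze; skip activation
        activate()
        return True
    finally:
        # Thaw in reverse order of freezing, whether or not we activated.
        for dev in reversed(frozen):
            dev.thaw()
```

In this model, activation is refused when any device still reports active DMA after its freeze callback. The alternative raised above, racing the quiesce and accepting a possible timeout, would correspond to calling `activate()` without the freeze loop.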