Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E57D8C61DA4 for ; Tue, 14 Mar 2023 01:39:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229901AbjCNBj2 (ORCPT ); Mon, 13 Mar 2023 21:39:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229473AbjCNBj0 (ORCPT ); Mon, 13 Mar 2023 21:39:26 -0400 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC2FF6A2EC; Mon, 13 Mar 2023 18:39:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678757964; x=1710293964; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=3vPLOFWsJOBvuuturyqTl+saM1091xVHkhHfac1DvsU=; b=jGue/zc4bk1qU/YRUUtwt77lDxwjYpux3gonxKcBgiAOjAMncE9PmeKL v0bsYvjHAhXdkZJFsrMWJVQlDIE1v9hb5DeZkVK9hMKgagJNwBIRW0GC1 gU8s4mbqLNN2YrkcY922Tp6carB+3alDAvnuxUwCvLrCz1ZkB/dy44jVE VnqhWt5vHbwCywyz0NFJIUU0+IsX0kEFiAldmO4VU9jALq/YcZW9f+2jF dRFa0FRBe+j9Ffsh9H0MnN5BcNI2sir3Dy4Otn883ZJgzwqvN0/N5zASN /+4ZzIQRd3zn0e8CtjrKJAD5V0Zax0PRTftoBhsIGL8U+hnoLX2mhbZZY w==; X-IronPort-AV: E=McAfee;i="6500,9779,10648"; a="316948996" X-IronPort-AV: E=Sophos;i="5.98,258,1673942400"; d="scan'208";a="316948996" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 18:39:24 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10648"; a="711326209" X-IronPort-AV: E=Sophos;i="5.98,258,1673942400"; d="scan'208";a="711326209" Received: from aschofie-mobl2.amr.corp.intel.com (HELO aschofie-mobl2) ([10.209.44.112]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Mar 2023 18:39:23 -0700 Date: Mon, 13 Mar 2023 18:39:22 -0700 From: Alison Schofield To: Ira Weiny Cc: Dan Williams , Vishal Verma , Dave Jiang , Ben Widawsky , Steven Rostedt , linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v8 4/6] cxl/region: Provide region info to the cxl_poison trace event Message-ID: References: <640fa8f32013d_22c07f294e8@iweiny-mobl.notmuch> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <640fa8f32013d_22c07f294e8@iweiny-mobl.notmuch> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 13, 2023 at 03:51:31PM -0700, Ira Weiny wrote: > alison.schofield@ wrote: > > From: Alison Schofield > > > > User space may need to know which region, if any, maps the poison > > address(es) logged in a cxl_poison trace event. Since the mapping > > of DPAs (device physical addresses) to a region can change, the > > kernel must provide this information at the time the poison list > > is read. The event informs user space that at event > > this mapped to this , which is poisoned. > > > > The cxl_poison trace event is already wired up to log the region > > name and uuid if it receives param 'struct cxl_region'. > > > > In order to provide that cxl_region, add another method for gathering > > poison - by committed endpoint decoder mappings. This method is only > > available with CONFIG_CXL_REGION and is only used if a region actually > > maps the memdev where poison is being read. After the poison list is > > read for all the mapped resources, poison is read for the unmapped > > resources, and those events are logged without the region info. > > > > Mixed mode decoders are not currently supported in Linux. Add a debug > > message to the poison request path. That will serve as an alert that > > poison list retrieval needs to add support for mixed mode. > > > > The default method remains: read the poison by memdev resource. > > > > Signed-off-by: Alison Schofield > > [...] > > > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c > > index ea996057815e..5e65818d2171 100644 > > --- a/drivers/cxl/core/memdev.c > > +++ b/drivers/cxl/core/memdev.c > > @@ -139,14 +139,29 @@ static ssize_t trigger_poison_list_store(struct device *dev, > > const char *buf, size_t len) > > { > > struct cxl_memdev *cxlmd = to_cxl_memdev(dev); > > + struct cxl_port *port; > > bool trigger; > > int rc; > > > > if (kstrtobool(buf, &trigger) || !trigger) > > return -EINVAL; > > > > + port = dev_get_drvdata(&cxlmd->dev); > > + if (!port || !is_cxl_endpoint(port)) > > + return -EINVAL; > > + > > down_read(&cxl_dpa_rwsem); > > - rc = cxl_get_poison_by_memdev(cxlmd); > > + if (port->commit_end == -1) { > > + /* No regions mapped to this memdev */ > > + rc = cxl_get_poison_by_memdev(cxlmd); > > + } else { > > + /* Regions mapped, collect poison by endpoint */ > > + rc = device_for_each_child(&port->dev, port, > > + cxl_get_poison_by_endpoint); > > + if (rc == 1) > > + rc = 0; > > + } > > + > > up_read(&cxl_dpa_rwsem); > > > > return rc ? rc : len; > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > index f29028148806..1a558adfe32d 100644 > > --- a/drivers/cxl/core/region.c > > +++ b/drivers/cxl/core/region.c > > @@ -2213,6 +2213,102 @@ struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev) > > } > > EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, CXL); > > > > +int cxl_get_poison_by_endpoint(struct device *dev, void *data) > > +{ > > + struct cxl_endpoint_decoder *cxled; > > + struct cxl_port *port = data; > > + struct cxl_dev_state *cxlds; > > + struct cxl_memdev *cxlmd; > > + u64 offset, length; > > + int rc = 0; > > + > > + down_read(&cxl_dpa_rwsem); > > Oh just realized aren't we already under the read lock when this is called? > > Replacing this with a lockdep_assert() could make the logic here easier. > > Ira Yes! Thanks for the catch. That will allow removal of the goto out's! > > [...]