Date: Fri, 15 Mar 2019 17:42:36 +0100
From: Sebastian Andrzej Siewior
To: "Liu, Yongxin"
Cc: "linux-kernel@vger.kernel.org", "linux-rt-users@vger.kernel.org",
    "tglx@linutronix.de", "rostedt@goodmis.org",
    "dan.j.williams@intel.com", "pagupta@redhat.com",
    "Gortmaker, Paul", "linux-nvdimm@lists.01.org"
Subject: Re: [PATCH RT] nvdimm: make lane acquirement RT aware
Message-ID: <20190315164236.rzbwe7reeprjv3um@linutronix.de>
References: <20190306095709.23138-1-yongxin.liu@windriver.com>
 <20190307143344.ytsnbmot5tjzjhip@linutronix.de>
 <597B109EC20B76429F71A8A97770610D12A52669@ALA-MBD.corp.ad.wrs.com>
 <20190308094131.ge4wbsvz4p6xikdf@linutronix.de>
 <597B109EC20B76429F71A8A97770610D12A5643B@ALA-MBD.corp.ad.wrs.com>
In-Reply-To: <597B109EC20B76429F71A8A97770610D12A5643B@ALA-MBD.corp.ad.wrs.com>
User-Agent: NeoMutt/20180716

On 2019-03-11 00:44:58 [+0000], Liu, Yongxin wrote:
> > but you still have the ndl_lock->lock which protects the resource. So in
> > the unlikely (but possible) event that you switch CPUs after obtaining
> > the CPU number, you block on the lock. No harm is done, right?
>
> The resource "lane" can be acquired recursively, so "ndl_lock->lock" is a
> conditional lock.
>
> ndl_count->count is per CPU.
> ndl_lock->lock is per lane.
>
> Here is an example:
>
> Thread A on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> gets
> "ndl_lock->lock" --> nd_region_acquire_lane --> lane# 5 --> bypasses
> "ndl_lock->lock" due to "ndl_count->count++".
>
> Thread B on CPU 5 --> nd_region_acquire_lane --> lane# 5 --> bypasses
> "ndl_lock->lock" ("ndl_count->count" was changed by Thread A).
>
> If we use raw_smp_processor_id(), then no matter which CPU the thread is
> migrated to, if another thread is running on the old CPU, there is a race
> condition on the per-CPU variable "ndl_count->count".
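To make the quoted scenario concrete, here is a minimal user-space model of
the count-based bypass being described. It is a sketch only: pthread mutexes
stand in for spinlock_t, and NR_CPUS, NUM_LANES and the helper names are
illustrative stand-ins, not the kernel API.

/*
 * Minimal user-space model of the count-based lane locking quoted
 * above.  Compile with -lpthread.
 */
#include <pthread.h>
#include <stdio.h>

#define NR_CPUS   8
#define NUM_LANES 2		/* the num_lanes < nr_cpu_ids case */

struct nd_percpu_lane {
	int count;		/* recursion depth, indexed by CPU  */
	pthread_mutex_t lock;	/* per-lane lock, indexed by lane   */
};

static struct nd_percpu_lane lanes[NR_CPUS];

/* "cpu" is passed in; in the kernel it comes from get_cpu(). */
static unsigned int acquire_lane(unsigned int cpu)
{
	unsigned int lane = cpu % NUM_LANES;

	/*
	 * Only the first acquisition on this CPU takes the per-lane
	 * lock; nested calls just bump the counter.  If the task could
	 * migrate between reading "cpu" and this increment (as with a
	 * bare raw_smp_processor_id()), another task on the old CPU
	 * would see a non-zero count and bypass a lock it never took.
	 */
	if (lanes[cpu].count++ == 0)
		pthread_mutex_lock(&lanes[lane].lock);
	return lane;
}

static void release_lane(unsigned int cpu, unsigned int lane)
{
	if (--lanes[cpu].count == 0)
		pthread_mutex_unlock(&lanes[lane].lock);
}

int main(void)
{
	unsigned int cpu = 5, lane;
	int i;

	for (i = 0; i < NR_CPUS; i++)
		pthread_mutex_init(&lanes[i].lock, NULL);

	lane = acquire_lane(cpu);	/* Thread A: takes lanes[1].lock */
	lane = acquire_lane(cpu);	/* nested: bypasses via count++  */
	release_lane(cpu, lane);
	release_lane(cpu, lane);	/* count hits 0: lock released   */
	printf("nested acquire/release on lane %u done\n", lane);
	return 0;
}

The point of the model is only that correctness hinges on "cpu" staying
stable between the counter update and the lock operation.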
so I've been looking at it again. The recursive locking could be solved
better, along the lines of what local_lock() does on -RT. Given that you
lock with preempt_disable(), there should be no in-IRQ usage. But in the
"nd_region->num_lanes >= nr_cpu_ids" case the current code takes no locks
at all, which would be a problem with the raw_smp_processor_id() approach.
So what about the completely untested patch below:

diff --git a/drivers/nvdimm/nd.h b/drivers/nvdimm/nd.h
index 379bf4305e615..98c2e9df4b2e4 100644
--- a/drivers/nvdimm/nd.h
+++ b/drivers/nvdimm/nd.h
@@ -109,7 +109,8 @@ unsigned sizeof_namespace_label(struct nvdimm_drvdata *ndd);
 			res; res = next, next = next ? next->sibling : NULL)
 
 struct nd_percpu_lane {
-	int count;
+	struct task_struct *owner;
+	int nestcnt;
 	spinlock_t lock;
 };
 
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e2818f94f2928..8a62f9833513f 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -946,19 +946,17 @@ int nd_blk_region_init(struct nd_region *nd_region)
  */
 unsigned int nd_region_acquire_lane(struct nd_region *nd_region)
 {
+	struct nd_percpu_lane *ndl_lock;
 	unsigned int cpu, lane;
 
-	cpu = get_cpu();
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
-
-		lane = cpu % nd_region->num_lanes;
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (ndl_count->count++ == 0)
-			spin_lock(&ndl_lock->lock);
-	} else
-		lane = cpu;
+	cpu = raw_smp_processor_id();
+	lane = cpu % nd_region->num_lanes;
+	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+	if (ndl_lock->owner != current) {
+		spin_lock(&ndl_lock->lock);
+		ndl_lock->owner = current;
+	}
+	ndl_lock->nestcnt++;
 
 	return lane;
 }
@@ -966,17 +964,16 @@ EXPORT_SYMBOL(nd_region_acquire_lane);
 
 void nd_region_release_lane(struct nd_region *nd_region, unsigned int lane)
 {
-	if (nd_region->num_lanes < nr_cpu_ids) {
-		unsigned int cpu = get_cpu();
-		struct nd_percpu_lane *ndl_lock, *ndl_count;
+	struct nd_percpu_lane *ndl_lock;
 
-		ndl_count = per_cpu_ptr(nd_region->lane, cpu);
-		ndl_lock = per_cpu_ptr(nd_region->lane, lane);
-		if (--ndl_count->count == 0)
-			spin_unlock(&ndl_lock->lock);
-		put_cpu();
-	}
-	put_cpu();
+	ndl_lock = per_cpu_ptr(nd_region->lane, lane);
+	WARN_ON(ndl_lock->nestcnt == 0);
+	WARN_ON(ndl_lock->owner != current);
+	if (--ndl_lock->nestcnt)
+		return;
+
+	ndl_lock->owner = NULL;
+	spin_unlock(&ndl_lock->lock);
 }
 EXPORT_SYMBOL(nd_region_release_lane);
 
@@ -1042,7 +1039,8 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,
 		ndl = per_cpu_ptr(nd_region->lane, i);
 		spin_lock_init(&ndl->lock);
-		ndl->count = 0;
+		ndl->owner = NULL;
+		ndl->nestcnt = 0;
 	}
 
 	for (i = 0; i < ndr_desc->num_mappings; i++) {

> Thanks,
> Yongxin

Sebastian
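For reference, the caller-side pattern this locking serves looks roughly
like the sketch below. example_rw() and its body are hypothetical;
nd_region_acquire_lane() and nd_region_release_lane() are the real entry
points exercised by the nvdimm I/O paths.

/*
 * Hypothetical caller, sketching how the lane pair is used around an
 * I/O operation.  With the owner/nestcnt patch above, a task that
 * re-enters nd_region_acquire_lane() for the same lane nests instead
 * of deadlocking, and a second task contending for that lane blocks
 * on ndl_lock->lock rather than racing on a per-CPU counter.
 */
static int example_rw(struct nd_region *nd_region, void *buf)
{
	unsigned int lane;

	lane = nd_region_acquire_lane(nd_region);
	/* ... exclusive use of this lane's resources ... */
	nd_region_release_lane(nd_region, lane);
	return 0;
}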