Date: Wed, 2 May 2018 19:21:05 +0100 (BST)
From: James Simmons
To: "Dilger, Andreas"
cc: NeilBrown, "Drokin, Oleg", Greg Kroah-Hartman, Linux Kernel Mailing List, Lustre Development List
Subject: Re: [lustre-devel] [PATCH 04/10] staging: lustre: lu_object: move retry logic inside htable_lookup
References: <152514658325.17843.11455067361317157487.stgit@noble> <152514675897.17843.15112214060540196720.stgit@noble>

> On Apr 30, 2018, at 21:52, NeilBrown wrote:
> >
> > The current retry logic, to wait when a 'dying' object is found,
> > spans multiple functions. The process is attached to a waitqueue
> > and set TASK_UNINTERRUPTIBLE in htable_lookup, and this status
> > is passed back through lu_object_find_try() to lu_object_find_at()
> > where schedule() is called and the process is removed from the queue.
> >
> > This can be simplified by moving all the logic (including
> > hashtable locking) inside htable_lookup(), which now never returns
> > EAGAIN.
> >
> > Note that htable_lookup() is called with the hash bucket lock
> > held, and will drop and retake it if it needs to schedule.
> >
> > I made this a 'goto' loop rather than a 'while(1)' loop as the
> > diff is easier to read.
> >
> > Signed-off-by: NeilBrown
> > ---
> >  drivers/staging/lustre/lustre/obdclass/lu_object.c | 73 +++++++-------------
> >  1 file changed, 27 insertions(+), 46 deletions(-)
> >
> > diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
> > index 2bf089817157..93daa52e2535 100644
> > --- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
> > +++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
> > @@ -586,16 +586,21 @@ EXPORT_SYMBOL(lu_object_print);
> >  static struct lu_object *htable_lookup(struct lu_site *s,
>
> It's probably a good idea to add a comment for this function that it may
> drop and re-acquire the hash bucket lock internally.
>
> >  				       struct cfs_hash_bd *bd,
> >  				       const struct lu_fid *f,
> > -				       wait_queue_entry_t *waiter,
> >  				       __u64 *version)
> >  {
> > +	struct cfs_hash *hs = s->ls_obj_hash;
> >  	struct lu_site_bkt_data *bkt;
> >  	struct lu_object_header *h;
> >  	struct hlist_node *hnode;
> > -	__u64 ver = cfs_hash_bd_version_get(bd);
> > +	__u64 ver;
> > +	wait_queue_entry_t waiter;
> >
> > -	if (*version == ver)
> > +retry:
> > +	ver = cfs_hash_bd_version_get(bd);
> > +
> > +	if (*version == ver) {
> >  		return ERR_PTR(-ENOENT);
> > +	}
>
> (style) we don't need the {} around a single-line if statement

I hate to be that guy, but could you run checkpatch on your patches?

> >  	*version = ver;
> >  	bkt = cfs_hash_bd_extra_get(s->ls_obj_hash, bd);
> > @@ -625,11 +630,15 @@ static struct lu_object *htable_lookup(struct lu_site *s,
> >  	 * drained), and moreover, lookup has to wait until object is freed.
> >  	 */
> >
> > -	init_waitqueue_entry(waiter, current);
> > -	add_wait_queue(&bkt->lsb_marche_funebre, waiter);
> > +	init_waitqueue_entry(&waiter, current);
> > +	add_wait_queue(&bkt->lsb_marche_funebre, &waiter);
> >  	set_current_state(TASK_UNINTERRUPTIBLE);
> >  	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_DEATH_RACE);
> > -	return ERR_PTR(-EAGAIN);
> > +	cfs_hash_bd_unlock(hs, bd, 1);
>
> This looks like it isn't unlocking and locking the hash bucket in the same
> manner that it was done in the caller. Here excl = 1, but in the caller
> you changed it to excl = 0?

This is very much like the work done by Lai. The difference is that Lai
removed the wait queue handling completely in htable_lookup(). You can see
the details at https://jira.hpdd.intel.com/browse/LU-9049. I will push the
missing lu_object fixes, including LU-9049, on top of your patch set so you
can see the approach Lai took. From there we can figure out how to merge the
lu_object work and fix the issues Andreas and I pointed out.

> > +	schedule();
> > +	remove_wait_queue(&bkt->lsb_marche_funebre, &waiter);
>
> Is it worthwhile to use your new helper function here to get the wq from "s"?
>
> > +	cfs_hash_bd_lock(hs, bd, 1);
> > +	goto retry;
> >  }
> >
> >  /**
> > @@ -693,13 +702,14 @@ static struct lu_object *lu_object_new(const struct lu_env *env,
> >  }
> >
> >  /**
> > - * Core logic of lu_object_find*() functions.
> > + * Much like lu_object_find(), but top level device of object is specifically
> > + * \a dev rather than top level device of the site. This interface allows
> > + * objects of different "stacking" to be created within the same site.
> >   */
> > -static struct lu_object *lu_object_find_try(const struct lu_env *env,
> > -					    struct lu_device *dev,
> > -					    const struct lu_fid *f,
> > -					    const struct lu_object_conf *conf,
> > -					    wait_queue_entry_t *waiter)
> > +struct lu_object *lu_object_find_at(const struct lu_env *env,
> > +				    struct lu_device *dev,
> > +				    const struct lu_fid *f,
> > +				    const struct lu_object_conf *conf)
> >  {
> >  	struct lu_object *o;
> >  	struct lu_object *shadow;
> > @@ -725,17 +735,16 @@ static struct lu_object *lu_object_find_try(const struct lu_env *env,
> >  	 * It is unnecessary to perform lookup-alloc-lookup-insert, instead,
> >  	 * just alloc and insert directly.
> >  	 *
> > -	 * If dying object is found during index search, add @waiter to the
> > -	 * site wait-queue and return ERR_PTR(-EAGAIN).
> >  	 */
> >  	if (conf && conf->loc_flags & LOC_F_NEW)
> >  		return lu_object_new(env, dev, f, conf);
> >
> >  	s = dev->ld_site;
> >  	hs = s->ls_obj_hash;
> > -	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1);
> > -	o = htable_lookup(s, &bd, f, waiter, &version);
> > -	cfs_hash_bd_unlock(hs, &bd, 1);
> > +	cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 0);
> > +	o = htable_lookup(s, &bd, f, &version);
> > +	cfs_hash_bd_unlock(hs, &bd, 0);
>
> Here you changed the locking to a non-exclusive (read) lock instead of an
> exclusive (write) lock? Why.

I have the same question.

>
> > +
> >  	if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
> >  		return o;
> >
> > @@ -751,7 +760,7 @@ static struct lu_object *lu_object_find_try(const struct lu_env *env,
> >
> >  	cfs_hash_bd_lock(hs, &bd, 1);
> >
> > -	shadow = htable_lookup(s, &bd, f, waiter, &version);
> > +	shadow = htable_lookup(s, &bd, f, &version);
> >  	if (likely(PTR_ERR(shadow) == -ENOENT)) {
> >  		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
> >  		cfs_hash_bd_unlock(hs, &bd, 1);
> > @@ -766,34 +775,6 @@ static struct lu_object *lu_object_find_try(const struct lu_env *env,
> >  	lu_object_free(env, o);
> >  	return shadow;
> >  }
> > -
> > -/**
> > - * Much like lu_object_find(), but top level device of object is specifically
> > - * \a dev rather than top level device of the site. This interface allows
> > - * objects of different "stacking" to be created within the same site.
> > - */
> > -struct lu_object *lu_object_find_at(const struct lu_env *env,
> > -				    struct lu_device *dev,
> > -				    const struct lu_fid *f,
> > -				    const struct lu_object_conf *conf)
> > -{
> > -	wait_queue_head_t *wq;
> > -	struct lu_object *obj;
> > -	wait_queue_entry_t wait;
> > -
> > -	while (1) {
> > -		obj = lu_object_find_try(env, dev, f, conf, &wait);
> > -		if (obj != ERR_PTR(-EAGAIN))
> > -			return obj;
> > -		/*
> > -		 * lu_object_find_try() already added waiter into the
> > -		 * wait queue.
> > -		 */
> > -		schedule();
> > -		wq = lu_site_wq_from_fid(dev->ld_site, (void *)f);
> > -		remove_wait_queue(wq, &wait);
> > -	}
> > -}
> >  EXPORT_SYMBOL(lu_object_find_at);
> >
> >  /**
> >
> >
> > _______________________________________________
> > lustre-devel mailing list
> > lustre-devel@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
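
For anyone following the thread without the Lustre tree at hand, here is a rough userspace sketch of the pattern under discussion: a lookup that is entered with the bucket lock held, and that drops and retakes that lock internally while it waits for a dying object to go away. It is only an illustration. All the demo_* names are hypothetical, the one-slot "bucket" is invented for brevity, and a pthread condition variable stands in for the kernel wait queue, so the unlock/sleep/relock step is a single pthread_cond_wait() call rather than the add_wait_queue()/schedule()/remove_wait_queue() sequence in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Lustre structures; not the real API. */
struct demo_obj {
	int key;
	bool present;	/* still linked into the bucket */
	bool dying;	/* being torn down; lookups must wait */
};

struct demo_bucket {
	pthread_mutex_t lock;		/* stands in for the hash bucket lock */
	pthread_cond_t obj_freed;	/* stands in for lsb_marche_funebre */
	struct demo_obj slot;		/* one-entry "hash chain" for brevity */
	unsigned int waits;		/* times a lookup had to sleep */
};

/*
 * Must be called with b->lock held. Returns the live object or NULL.
 * May drop and re-acquire b->lock while waiting for a dying object.
 */
static struct demo_obj *demo_lookup(struct demo_bucket *b, int key)
{
retry:
	if (!b->slot.present || b->slot.key != key)
		return NULL;
	if (!b->slot.dying)
		return &b->slot;

	b->waits++;
	/* Atomically unlocks, sleeps, and relocks before returning. */
	pthread_cond_wait(&b->obj_freed, &b->lock);
	goto retry;
}

/* Frees the dying object and wakes any waiting lookups. */
static void *demo_reaper(void *arg)
{
	struct demo_bucket *b = arg;

	pthread_mutex_lock(&b->lock);
	b->slot.present = false;
	pthread_cond_broadcast(&b->obj_freed);
	pthread_mutex_unlock(&b->lock);
	return NULL;
}

/* Runs one lookup against a dying object; returns how often it slept. */
static unsigned int demo_run(void)
{
	struct demo_bucket b = {
		.slot = { .key = 7, .present = true, .dying = true },
	};
	struct demo_obj *found;
	unsigned int waits;
	pthread_t reaper;

	pthread_mutex_init(&b.lock, NULL);
	pthread_cond_init(&b.obj_freed, NULL);

	pthread_mutex_lock(&b.lock);
	/* The reaper blocks on b.lock until the lookup goes to sleep. */
	if (pthread_create(&reaper, NULL, demo_reaper, &b) != 0) {
		pthread_mutex_unlock(&b.lock);
		return 0;
	}
	found = demo_lookup(&b, 7);
	pthread_mutex_unlock(&b.lock);

	pthread_join(reaper, NULL);
	assert(found == NULL);	/* the object was freed while we waited */
	waits = b.waits;

	pthread_cond_destroy(&b.obj_freed);
	pthread_mutex_destroy(&b.lock);
	return waits;
}
```

Calling demo_run() should report at least one sleep, since the reaper cannot take the lock until the lookup gives it up. The point relevant to the review comments is the contract in the comment above demo_lookup(): because the lock may be dropped inside the call, the caller cannot assume that anything it read under the lock beforehand is still valid afterwards, and the lock has to be retaken in the same mode it was held in on entry.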