From: James Simmons
To: Greg Kroah-Hartman, devel@driverdev.osuosl.org, Andreas Dilger,
    Oleg Drokin, Lai Siyao, Jinshan Xiong, NeilBrown
Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons
Subject: [PATCH 4/4] staging: lustre: obdclass: change object lookup to no wait mode
Date: Wed, 2 May 2018 14:21:48 -0400
Message-Id: <1525285308-15347-5-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1525285308-15347-1-git-send-email-jsimmons@infradead.org>
References: <1525285308-15347-1-git-send-email-jsimmons@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Lai Siyao

Currently we set LU_OBJECT_HEARD_BANSHEE on an object when we want to
remove it from the cache, but this may lead to a deadlock: when another
process looks up such an object, it has to wait until the object is
released (done at the last refcount put), while that process may already
hold an LDLM lock.

Now that the current code can handle dying objects correctly, we can
just return such an object from lookup, which avoids the above deadlock.

Signed-off-by: Lai Siyao
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-9049
Reviewed-on: https://review.whamcloud.com/26965
Reviewed-by: Alex Zhuravlev
Tested-by: Cliff White
Reviewed-by: Fan Yong
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 drivers/staging/lustre/lustre/include/lu_object.h  |  2 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c | 82 +++++++++-------------
 2 files changed, 36 insertions(+), 48 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lu_object.h b/drivers/staging/lustre/lustre/include/lu_object.h
index f29bbca..232063a 100644
--- a/drivers/staging/lustre/lustre/include/lu_object.h
+++ b/drivers/staging/lustre/lustre/include/lu_object.h
@@ -673,7 +673,7 @@ static inline void lu_object_get(struct lu_object *o)
 }
 
 /**
- * Return true of object will not be cached after last reference to it is
+ * Return true if object will not be cached after last reference to it is
  * released.
  */
 static inline int lu_object_is_dying(const struct lu_object_header *h)
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 8b507f1..9311703 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -589,19 +589,13 @@ static struct lu_object *htable_lookup(struct lu_site *s,
 				       const struct lu_fid *f,
 				       __u64 *version)
 {
-	struct cfs_hash *hs = s->ls_obj_hash;
 	struct lu_site_bkt_data *bkt;
 	struct lu_object_header *h;
 	struct hlist_node *hnode;
-	__u64 ver;
-	wait_queue_entry_t waiter;
+	u64 ver = cfs_hash_bd_version_get(bd);
 
-retry:
-	ver = cfs_hash_bd_version_get(bd);
-
-	if (*version == ver) {
+	if (*version == ver)
 		return ERR_PTR(-ENOENT);
-	}
 
 	*version = ver;
 	bkt = cfs_hash_bd_extra_get(s->ls_obj_hash, bd);
@@ -615,31 +609,13 @@ static struct lu_object *htable_lookup(struct lu_site *s,
 	}
 
 	h = container_of(hnode, struct lu_object_header, loh_hash);
-	if (likely(!lu_object_is_dying(h))) {
-		cfs_hash_get(s->ls_obj_hash, hnode);
-		lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT);
-		if (!list_empty(&h->loh_lru)) {
-			list_del_init(&h->loh_lru);
-			percpu_counter_dec(&s->ls_lru_len_counter);
-		}
-		return lu_object_top(h);
+	cfs_hash_get(s->ls_obj_hash, hnode);
+	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT);
+	if (!list_empty(&h->loh_lru)) {
+		list_del_init(&h->loh_lru);
+		percpu_counter_dec(&s->ls_lru_len_counter);
 	}
-
-	/*
-	 * Lookup found an object being destroyed this object cannot be
-	 * returned (to assure that references to dying objects are eventually
-	 * drained), and moreover, lookup has to wait until object is freed.
-	 */
-
-	init_waitqueue_entry(&waiter, current);
-	add_wait_queue(&bkt->lsb_marche_funebre, &waiter);
-	set_current_state(TASK_UNINTERRUPTIBLE);
-	lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_DEATH_RACE);
-	cfs_hash_bd_unlock(hs, bd, 1);
-	schedule();
-	remove_wait_queue(&bkt->lsb_marche_funebre, &waiter);
-	cfs_hash_bd_lock(hs, bd, 1);
-	goto retry;
+	return lu_object_top(h);
 }
 
 /**
@@ -680,6 +656,8 @@ static void lu_object_limit(const struct lu_env *env, struct lu_device *dev)
 }
 
 /**
+ * Core logic of lu_object_find*() functions.
+ *
  * Much like lu_object_find(), but top level device of object is specifically
  * \a dev rather than top level device of the site. This interface allows
  * objects of different "stacking" to be created within the same site.
@@ -713,36 +691,46 @@ struct lu_object *lu_object_find_at(const struct lu_env *env,
 	 * It is unnecessary to perform lookup-alloc-lookup-insert, instead,
 	 * just alloc and insert directly.
 	 *
+	 * If dying object is found during index search, add @waiter to the
+	 * site wait-queue and return ERR_PTR(-EAGAIN).
 	 */
-	s = dev->ld_site;
-	hs = s->ls_obj_hash;
+	if (conf && conf->loc_flags & LOC_F_NEW) {
+		o = lu_object_alloc(env, dev, f, conf);
+		if (unlikely(IS_ERR(o)))
+			return o;
+
+		hs = dev->ld_site->ls_obj_hash;
+		cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1);
+		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
+		cfs_hash_bd_unlock(hs, &bd, 1);
 
-	cfs_hash_bd_get(hs, f, &bd);
-	if (!(conf && conf->loc_flags & LOC_F_NEW)) {
-		cfs_hash_bd_lock(hs, &bd, 0);
-		o = htable_lookup(s, &bd, f, &version);
-		cfs_hash_bd_unlock(hs, &bd, 0);
+		lu_object_limit(env, dev);
 
-		if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
-			return o;
+		return o;
 	}
+
+	s = dev->ld_site;
+	hs = s->ls_obj_hash;
+	cfs_hash_bd_get_and_lock(hs, f, &bd, 1);
+	o = htable_lookup(s, &bd, f, &version);
+	cfs_hash_bd_unlock(hs, &bd, 0);
+	if (!IS_ERR(o) || PTR_ERR(o) != -ENOENT)
+		return o;
+
 	/*
 	 * Allocate new object. This may result in rather complicated
 	 * operations, including fld queries, inode loading, etc.
 	 */
 	o = lu_object_alloc(env, dev, f, conf);
-	if (IS_ERR(o))
+	if (unlikely(IS_ERR(o)))
 		return o;
 
 	LASSERT(lu_fid_eq(lu_object_fid(o), f));
 
 	cfs_hash_bd_lock(hs, &bd, 1);
 
-	if (conf && conf->loc_flags & LOC_F_NEW)
-		shadow = ERR_PTR(-ENOENT);
-	else
-		shadow = htable_lookup(s, &bd, f, &version);
-	if (likely(PTR_ERR(shadow) == -ENOENT)) {
+	shadow = htable_lookup(s, &bd, f, &version);
+	if (likely(IS_ERR(shadow) && PTR_ERR(shadow) == -ENOENT)) {
 		cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash);
 		cfs_hash_bd_unlock(hs, &bd, 1);
 
-- 
1.8.3.1