Received: by 10.192.165.148 with SMTP id m20csp216182imm; Thu, 3 May 2018 18:31:10 -0700 (PDT) X-Google-Smtp-Source: AB8JxZrdixBLVxKpqtXHeOc+0Lb4BxpmYAtcdXPpEytfzED4qqCuMM4x+YvwGOiHcxPHARVmf53/ X-Received: by 2002:a63:9e12:: with SMTP id s18-v6mr10425573pgd.207.1525397470413; Thu, 03 May 2018 18:31:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1525397470; cv=none; d=google.com; s=arc-20160816; b=QWzGxG9AM427H+EYM9GtdFfRd115uNzE8Eyv3u7nUVQ76PUGlWY6XHOr1wziKIhGHe Z703ENtMVgVmDeRMjCBt9twJLk7TP5usV+i8nOiyNkNS9FGilw0ghS/KDl6z0JeAqy7r s7yVttCSY1IvzP3yxfojrggsZYhthzMvkkVwgKAR+Sl8bSR12n7sjra5wtD1XJB0OwMP 8ixZ+zGOcMGeGfD2oRnyyF3SVgofWlFACU9pMrVm02jHewPzgThmbgl3DCPqVF+G0KO0 XE6VqJop8nifAYqUNMWQoD7hMUCv5XEtjz/x53xuwv1aM9hnGa3u2JPAwzv2Mzkm658Z 7hmw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:references :in-reply-to:subject:cc:date:to:from:arc-authentication-results; bh=6f3fU5K81CkvIZicX7qIkdvglPL6I1FsqS6KZnqWWt8=; b=Hg/KGWFLVv/jsrc9huQs2accQGyctY/CdKAErBT0rdSqtJd51WBqETB4f+yj+nePLH bIAs+86B5OLXwf0aNa9+F486jv80HgfN3NI/uhl0KRnLmIJonSYBky38kPe0L2YjxWmq lgV+j4ow6VgqtKGgxdCX+PUPQ9oVxVb2llUwnmpo+VjJrmS7X6n86NEkpq/d1xVktQ3h /a+3xaI2/0KK7YTzB6OFWDY0A/wuiAvaodOUwpqn1xgBqSHuKT88ACK3llMe7+qZ9oRV rOo9NurB/QAeWgOBm49yoIQ6ua8nTNAYZutGGv9UDtF4iNkNXuOTd9x+JIk+UEJfgyGX 6K3A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c196-v6si12457420pga.494.2018.05.03.18.30.55; Thu, 03 May 2018 18:31:10 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751360AbeEDBag (ORCPT + 99 others); Thu, 3 May 2018 21:30:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:37914 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751036AbeEDBaf (ORCPT ); Thu, 3 May 2018 21:30:35 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 9AE08AFBC; Fri, 4 May 2018 01:30:34 +0000 (UTC) From: NeilBrown To: "Dilger\, Andreas" Date: Fri, 04 May 2018 11:30:25 +1000 Cc: "Drokin\, Oleg" , Greg Kroah-Hartman , James Simmons , "Linux Kernel Mailing List" , Lustre Development List Subject: Re: [lustre-devel] [PATCH 04/10] staging: lustre: lu_object: move retry logic inside htable_lookup In-Reply-To: References: <152514658325.17843.11455067361317157487.stgit@noble> <152514675897.17843.15112214060540196720.stgit@noble> Message-ID: <874ljofbri.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Tue, May 01 2018, Dilger, Andreas wrote: > On Apr 30, 2018, at 21:52, NeilBrown wrote: >>=20 >> The current retry logic, to wait when a 'dying' object is found, >> spans multiple functions. The process is attached to a waitqueue >> and set TASK_UNINTERRUPTIBLE in htable_lookup, and this status >> is passed back through lu_object_find_try() to lu_object_find_at() >> where schedule() is called and the process is removed from the queue. >>=20 >> This can be simplified by moving all the logic (including >> hashtable locking) inside htable_lookup(), which now never returns >> EAGAIN. >>=20 >> Note that htable_lookup() is called with the hash bucket lock >> held, and will drop and retake it if it needs to schedule. >>=20 >> I made this a 'goto' loop rather than a 'while(1)' loop as the >> diff is easier to read. >>=20 >> Signed-off-by: NeilBrown >> --- >> drivers/staging/lustre/lustre/obdclass/lu_object.c | 73 +++++++-------= ------ >> 1 file changed, 27 insertions(+), 46 deletions(-) >>=20 >> diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/driver= s/staging/lustre/lustre/obdclass/lu_object.c >> index 2bf089817157..93daa52e2535 100644 >> --- a/drivers/staging/lustre/lustre/obdclass/lu_object.c >> +++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c >> @@ -586,16 +586,21 @@ EXPORT_SYMBOL(lu_object_print); >> static struct lu_object *htable_lookup(struct lu_site *s, > > It's probably a good idea to add a comment for this function that it may > drop and re-acquire the hash bucket lock internally. Added - thanks. > >> struct cfs_hash_bd *bd, >> const struct lu_fid *f, >> - wait_queue_entry_t *waiter, >> __u64 *version) >> { >> + struct cfs_hash *hs =3D s->ls_obj_hash; >> struct lu_site_bkt_data *bkt; >> struct lu_object_header *h; >> struct hlist_node *hnode; >> - __u64 ver =3D cfs_hash_bd_version_get(bd); >> + __u64 ver; >> + wait_queue_entry_t waiter; >>=20 >> - if (*version =3D=3D ver) >> +retry: >> + ver =3D cfs_hash_bd_version_get(bd); >> + >> + if (*version =3D=3D ver) { >> return ERR_PTR(-ENOENT); >> + } > > (style) we don't need the {} around a single-line if statement Fixed. > >> *version =3D ver; >> bkt =3D cfs_hash_bd_extra_get(s->ls_obj_hash, bd); >> @@ -625,11 +630,15 @@ static struct lu_object *htable_lookup(struct lu_s= ite *s, >> * drained), and moreover, lookup has to wait until object is freed. >> */ >>=20 >> - init_waitqueue_entry(waiter, current); >> - add_wait_queue(&bkt->lsb_marche_funebre, waiter); >> + init_waitqueue_entry(&waiter, current); >> + add_wait_queue(&bkt->lsb_marche_funebre, &waiter); >> set_current_state(TASK_UNINTERRUPTIBLE); >> lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_DEATH_RACE); >> - return ERR_PTR(-EAGAIN); >> + cfs_hash_bd_unlock(hs, bd, 1); > > This looks like it isn't unlocking and locking the hash bucket in the same > manner that it was done in the caller. Here excl =3D 1, but in the caller > you changed it to excl =3D 0? Don't know what happened there... though as the tables is created with CFS_HASH_SPIN_BLKLOCK it doesn't make any behavioral difference. I've put it back to use '1' uniformly. > >> + schedule(); >> + remove_wait_queue(&bkt->lsb_marche_funebre, &waiter); > > Is it worthwhile to use your new helper function here to get the wq from = "s"? I don't think so. We already have the 'bkt' and it seems pointless to compute the hash a second time and use it to find the bucket and then the queue, just to use a nice wrapper function. > >> + cfs_hash_bd_lock(hs, bd, 1); >> + goto retry; >> } >>=20 >> /** >> @@ -693,13 +702,14 @@ static struct lu_object *lu_object_new(const struc= t lu_env *env, >> } >>=20 >> /** >> - * Core logic of lu_object_find*() functions. >> + * Much like lu_object_find(), but top level device of object is specif= ically >> + * \a dev rather than top level device of the site. This interface allo= ws >> + * objects of different "stacking" to be created within the same site. >> */ >> -static struct lu_object *lu_object_find_try(const struct lu_env *env, >> - struct lu_device *dev, >> - const struct lu_fid *f, >> - const struct lu_object_conf *conf, >> - wait_queue_entry_t *waiter) >> +struct lu_object *lu_object_find_at(const struct lu_env *env, >> + struct lu_device *dev, >> + const struct lu_fid *f, >> + const struct lu_object_conf *conf) >> { >> struct lu_object *o; >> struct lu_object *shadow; >> @@ -725,17 +735,16 @@ static struct lu_object *lu_object_find_try(const = struct lu_env *env, >> * It is unnecessary to perform lookup-alloc-lookup-insert, instead, >> * just alloc and insert directly. >> * >> - * If dying object is found during index search, add @waiter to the >> - * site wait-queue and return ERR_PTR(-EAGAIN). >> */ >> if (conf && conf->loc_flags & LOC_F_NEW) >> return lu_object_new(env, dev, f, conf); >>=20 >> s =3D dev->ld_site; >> hs =3D s->ls_obj_hash; >> - cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 1); >> - o =3D htable_lookup(s, &bd, f, waiter, &version); >> - cfs_hash_bd_unlock(hs, &bd, 1); >> + cfs_hash_bd_get_and_lock(hs, (void *)f, &bd, 0); >> + o =3D htable_lookup(s, &bd, f, &version); >> + cfs_hash_bd_unlock(hs, &bd, 0); > > Here you changed the locking to a non-exclusive (read) lock instead of an > exclusive (write) lock? Why. Carelessness is all. And the fact that my thinking was focused on rhashtable which doesn't make that distinction. Fixed. Thanks, NeilBrown --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAlrrt7IACgkQOeye3VZi gbkL8A//Ya3S/RrfiF55FlkafVHhlW9Qv3rNRMQ0N7IKhLMImnybNi3V+X29bjVq /z6OQ9BnBzxZ7OutM/0tgyepJPzJuRC6wRi06F5e3TnnAaMgD68aUSPz0r6+yGLz x82uEG8CovNN6v0Aq0G3ECcWcWF9t8A7gqyy12oF/upK1Eu64Ro17VbHTyax6syO iv1iI9unwPTtUwsK+icVzLpbITrewnzPsf6TElJMKsVvpv+ZqrkAcoCPprp+s2rR ytYvxAMFOipuEd69GkvlZa9sUIOikR2arQ78LyMsI6UiN7iEKfTyQUUB8d3MhNZI 4Jn+03YETwGOmrmg3CUrt34DfS7iWhi7ypyWukLSyvrhlkvM3utYFpDshutBSnZF BTRYM3HdEHufIZaa1yeI3y3vzrXJac4zKGz/+0Vr0YbjH3qaJT3JGy4Rh1KrNAMY k5/41/GVs52zx1vx9ElkkEmOIui3sP8V/zvPMeNlE2f/qo0LbSCqndDVvs1T+ehk 0R3RUyOrMC1PsRDHU15m5SizmBOyfHTxgJ2+YRqxDMFIxuMbnCQgmE/+sEtyuHkr ZHqfR8hXOMF27AO+QpqnVnGynWpb9BKna3q3B84FDlYDa89610+X17OZ0+Y1SDI4 VJN9Po3TQodaY4YY9uQcZgQFwj9ULPFetNgxUwrY4gZJlIN9S8Y= =ZdaX -----END PGP SIGNATURE----- --=-=-=--