From: NeilBrown
To: Thomas Graf, Herbert Xu, David Miller
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Tue, 24 Apr 2018 08:29:13 +1000
Subject: [PATCH 4/4] rhashtable: improve rhashtable_walk stability when stop/start used.
Message-ID: <152452255351.1456.12384285355497513812.stgit@noble>
In-Reply-To: <152452244405.1456.8175298512483573078.stgit@noble>
References: <152452244405.1456.8175298512483573078.stgit@noble>

When a walk of an rhashtable is interrupted with rhashtable_walk_stop()
and then rhashtable_walk_start(), the location to restart from is based
on a 'skip' count in the current hash chain, and this can be incorrect
if insertions or deletions have happened.
This does not happen when the walk is not stopped and started, as
iter->p is a placeholder which is safe to use while holding the RCU
read lock.

In rhashtable_walk_start() we can revalidate that 'p' is still in the
same hash chain.  If it isn't then the current method is still used.

With this patch, if a rhashtable walker ensures that the current
object remains in the table over a stop/start period (possibly by
elevating the reference count if that is sufficient), it can be sure
that a walk will not miss objects that were in the hashtable for the
whole time of the walk.

rhashtable_walk_start() may not find the object even though it is
still in the hashtable if a rehash has moved it to a new table.  In
this case it will (eventually) get -EAGAIN and will need to proceed
through the whole table again to be sure to see everything at least
once.

Acked-by: Herbert Xu
Signed-off-by: NeilBrown
---
 lib/rhashtable.c | 44 +++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/lib/rhashtable.c b/lib/rhashtable.c
index 81edf1ab38ab..9427b5766134 100644
--- a/lib/rhashtable.c
+++ b/lib/rhashtable.c
@@ -727,6 +727,7 @@ int rhashtable_walk_start_check(struct rhashtable_iter *iter)
 	__acquires(RCU)
 {
 	struct rhashtable *ht = iter->ht;
+	bool rhlist = ht->rhlist;
 
 	rcu_read_lock();
 
@@ -735,13 +736,52 @@ int rhashtable_walk_start_check(struct rhashtable_iter *iter)
 		list_del(&iter->walker.list);
 	spin_unlock(&ht->lock);
 
-	if (!iter->walker.tbl && !iter->end_of_table) {
+	if (iter->end_of_table)
+		return 0;
+	if (!iter->walker.tbl) {
 		iter->walker.tbl = rht_dereference_rcu(ht->tbl, ht);
 		iter->slot = 0;
 		iter->skip = 0;
 		return -EAGAIN;
 	}
 
+	if (iter->p && !rhlist) {
+		/*
+		 * We need to validate that 'p' is still in the table, and
+		 * if so, update 'skip'
+		 */
+		struct rhash_head *p;
+		int skip = 0;
+		rht_for_each_rcu(p, iter->walker.tbl, iter->slot) {
+			skip++;
+			if (p == iter->p) {
+				iter->skip = skip;
+				goto found;
+			}
+		}
+		iter->p = NULL;
+	} else if (iter->p && rhlist) {
+		/* Need to validate that 'list' is still in the table, and
+		 * if so, update 'skip' and 'p'.
+		 */
+		struct rhash_head *p;
+		struct rhlist_head *list;
+		int skip = 0;
+		rht_for_each_rcu(p, iter->walker.tbl, iter->slot) {
+			for (list = container_of(p, struct rhlist_head, rhead);
+			     list;
+			     list = rcu_dereference(list->next)) {
+				skip++;
+				if (list == iter->list) {
+					iter->p = p;
+					iter->skip = skip;
+					goto found;
+				}
+			}
+		}
+		iter->p = NULL;
+	}
+found:
 	return 0;
 }
 EXPORT_SYMBOL_GPL(rhashtable_walk_start_check);
@@ -917,8 +957,6 @@ void rhashtable_walk_stop(struct rhashtable_iter *iter)
 		iter->walker.tbl = NULL;
 	spin_unlock(&ht->lock);
 
-	iter->p = NULL;
-
 out:
 	rcu_read_unlock();
 }
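
For illustration only, and not part of the patch: the effect of
revalidating 'p' can be sketched in user space with a plain singly
linked chain.  Everything below (struct node, revalidate_skip(),
resume_at(), demo()) is a hypothetical stand-in invented for this
sketch; it uses no RCU and none of the kernel's rhashtable types.  It
only shows why a stale 'skip' count can miss an entry after a deletion,
and how recomputing the position of the remembered entry avoids that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one hash chain; not the kernel's rhash_head. */
struct node {
	struct node *next;
};

/*
 * Return the 1-based position of 'p' in the chain starting at 'head',
 * or 0 if 'p' is no longer linked there.
 */
static int revalidate_skip(const struct node *head, const struct node *p)
{
	int skip = 0;
	const struct node *n;

	for (n = head; n; n = n->next) {
		skip++;
		if (n == p)
			return skip;
	}
	return 0;
}

/* Return the entry reached after stepping over 'skip' entries. */
static const struct node *resume_at(const struct node *head, int skip)
{
	while (head && skip-- > 0)
		head = head->next;
	return head;
}

/*
 * Chain a -> b -> c, walker stopped after returning 'b' (skip == 2).
 * While the walker is stopped, 'a' is removed, leaving b -> c.
 */
static void demo(void)
{
	struct node c = { NULL }, b = { &c }, a = { &b };
	const struct node *head = &a;
	int skip = revalidate_skip(head, &b);

	assert(skip == 2);
	head = &b;                              /* 'a' deleted */
	assert(resume_at(head, skip) == NULL);  /* stale count: 'c' is missed */
	skip = revalidate_skip(head, &b);
	assert(skip == 1);
	assert(resume_at(head, skip) == &c);    /* revalidated: 'c' is seen */
}
```

A return value of 0 from revalidate_skip() corresponds to the case in
the patch where 'p' is no longer found in the chain and the existing
'skip' count is used unchanged.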