Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp1394688ybf; Thu, 27 Feb 2020 10:07:31 -0800 (PST) X-Google-Smtp-Source: APXvYqwp5vZrFSrLDgEXsS72SPevT5C2wSBLPM/vs8UiF3VvPspD/I2kQoVkcHUcXsHAy/aXsUXC X-Received: by 2002:a9d:74c4:: with SMTP id a4mr113211otl.119.1582826850840; Thu, 27 Feb 2020 10:07:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582826850; cv=none; d=google.com; s=arc-20160816; b=Uxnr3UxZTck2g45azhTYwDkkkkdK5je54iWwYBndOEaAd00HwnS9nw5aYKqiR7VshY +T32vvEHdoxfzkYmIAP7FIMNfKK4HWE3zLtOO7vkwks6GubvLmUY+PtdKbPcb4xlLmwE sGB5V1GLBGt0VyQYEX4JC07jWgnFpdJwMUJ5C7n5O4yrahRhoZElhaLWH56/3KeW66V5 9TxOy78McrjkVsHDpllUEhc0cfGtiTBsnzItqdDruCOdSlIZfERryg5F0yzvomAjyYeQ w422SNbVtrbozAit36qiPEOIOrvsLbM3FYx+tr0vaHb3hJ35Xfckoe3CMCQUvVNLGsFP uQqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:content-transfer-encoding:mime-version :message-id:date:subject:cc:from:dkim-signature; bh=DXpYA4uWFjCld7w/2i2JWerrfYMLcOkZ4f6tlpHJ5Zg=; b=EnRQC2KGXneDxZECTdU2J7K9dRqs000/EvvmJuDayMtlC3JbBx9zZZvBiGYQARKgkZ GlMYqks6ADeVfbpFj/zXAz8/mhnf2i4cylUb7osiWqlnACMwvzmjUZGQvCDqZnSE6pbV jKFtApYP12KtyHfrTyEeF89D++R+gkOl5Fs1b7gAy3HyHW08YbJAJx6IVSVsQV3niIfa B8aeOkw0KBldLxjpI2sSMhS+D2HSs6zQZo5fkETWS1RKLxRgII5ibVd1eoHvQE/at6bn WkqJQFW2sD7USJrGAUULdB5RcQaEbCk3mKjW6zEGz1dYa5GjeYu0G55keDmd1Su3kYlG UgZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=EGXEeyU2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t1si277916oic.140.2020.02.27.10.07.18; Thu, 27 Feb 2020 10:07:30 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=EGXEeyU2; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729580AbgB0SHH (ORCPT + 99 others); Thu, 27 Feb 2020 13:07:07 -0500 Received: from mail-pl1-f194.google.com ([209.85.214.194]:34391 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726805AbgB0SHH (ORCPT ); Thu, 27 Feb 2020 13:07:07 -0500 Received: by mail-pl1-f194.google.com with SMTP id j7so119702plt.1; Thu, 27 Feb 2020 10:07:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=DXpYA4uWFjCld7w/2i2JWerrfYMLcOkZ4f6tlpHJ5Zg=; b=EGXEeyU2so7TeL8mrOoWfqXxbEaWvAOrVAKbPgfRMHeVnlVNe4ZCkzT0OkpOO3uJtk VaaJOetGJP9NPwO/z475KPLFeZ08j7x7ed5iOP/WKOegVg9uEqGDSHG31Sa1fYFcowfu T6HiuXmM4RRWnpKlElNKTxfCyPs1lya5EX4V28vJoAMYQ1PubwjMhkzfpHHGFdrKbryb A0aq2yZZYREvLvrRaDSB7SfnknBh0ozwgV91z+UNu9Ni4aSGDy+zxiMukjio0DLsWoUF PHmELLp71WCwJw4nKfrpNLPBqlssNccqbu0GGatCQ7ZECGnB2CXffvW0tcToWkJt+qd0 KaKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=DXpYA4uWFjCld7w/2i2JWerrfYMLcOkZ4f6tlpHJ5Zg=; b=RGtvo491xGRs2oYsSEotlFhke4N+9Pu99r3VVvSXk1sb3D7ujFLueeINLx0TH8Lhin LnDuo9iwG2wE2kaiYeOeggVP+49BbIbhBB+2mV/dwZJzSuafNmtOpJRfHdVmITta6xKK CMvRcdoGFaTM0sZMiheN7nUmHL+qAp4vPLLobGPwU0S1gkyUtVC0sezMQSWXYd5gRWkO Q0EKWQqvjpA8tezyjjsfZGGTNS1OspzAyxwxRTnppT01XB0OdLdKGUP541gVOe0T5WIV 3kZA+GNJ2QBw9HIksPItsZye/QdaaT/Ot0YRA+J2PfojyOtY+Fafx6oFO98ND52uNIbl tV3Q== X-Gm-Message-State: APjAAAU4eCiYoHGsCZuz4jbpIp30GsnZBVh2+TlHCgNGFb/2umEcffE8 Eoz65SCaJDQa6btqQgvXuso= X-Received: by 2002:a17:902:462:: with SMTP id 89mr20554ple.270.1582826824450; Thu, 27 Feb 2020 10:07:04 -0800 (PST) Received: from localhost.localdomain ([103.46.201.58]) by smtp.gmail.com with ESMTPSA id d1sm7207096pgj.79.2020.02.27.10.06.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Feb 2020 10:07:03 -0800 (PST) From: Aman Sharma Cc: linux-kernel-mentees@lists.linuxfoundation.org, amanharitsh123@gmail.com, "Paul E. McKenney" , Josh Triplett , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Joel Fernandes , Jonathan Corbet , rcu@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [Linux-kernel-mentees] [PATCH v3] doc: Convert rculist_nulls.txt to rculist_nulls.rst Date: Thu, 27 Feb 2020 23:36:53 +0530 Message-Id: <20200227180656.15187-1-amanharitsh123@gmail.com> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit To: unlisted-recipients:; (no To-header on input) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch converts rculist_nulls from .txt to .rst format, and also adds it to the index.rst file. Major changes includes: 1) Addition of section headers and subsection headers. 2) Addition of literal blocks which contains all the codes. 3) Making enumerated list item by changing X) to X. where X is number like 1,2,3 etc. Signed-off-by: Aman Sharma --- Documentation/RCU/index.rst | 1 + Documentation/RCU/rculist_nulls.rst | 179 ++++++++++++++++++++++++++++ Documentation/RCU/rculist_nulls.txt | 172 -------------------------- 3 files changed, 180 insertions(+), 172 deletions(-) create mode 100644 Documentation/RCU/rculist_nulls.rst delete mode 100644 Documentation/RCU/rculist_nulls.txt diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst index d60eb4ba2cd0..2cf55bd141b3 100644 --- a/Documentation/RCU/index.rst +++ b/Documentation/RCU/index.rst @@ -10,6 +10,7 @@ RCU concepts arrayRCU rcubarrier rcu_dereference + rculist_nulls checklist whatisRCU rcu diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst new file mode 100644 index 000000000000..8e04de44fe3d --- /dev/null +++ b/Documentation/RCU/rculist_nulls.rst @@ -0,0 +1,179 @@ +.. _rculist_nulls_doc: + +Using hlist_nulls to protect read-mostly linked lists and objects using SLAB_TYPESAFE_BY_RCU allocations. +========================================================================================================= + +Please read the basics in Documentation/RCU/listRCU.rst + +Using special makers (called 'nulls') is a convenient way +to solve following problem : + +A typical RCU linked list managing objects which are +allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can +use following algos : + +1. Lookup algo +-------------- +:: + + rcu_read_lock() + begin: + obj = lockless_lookup(key); + if (obj) { + if (!try_get_ref(obj)) // might fail for free objects + goto begin; + /* + * Because a writer could delete object, and a writer could + * reuse these object before the RCU grace period, we + * must check key after getting the reference on object + */ + if (obj->key != key) { // not the object we expected + put_ref(obj); + goto begin; + } + } + rcu_read_unlock(); + +Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu() +but a version with an additional memory barrier (smp_rmb()) :: + + lockless_lookup(key) + { + struct hlist_node *node, *next; + for (pos = rcu_dereference((head)->first); + pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); + pos = rcu_dereference(next)) + if (obj->key == key) + return obj; + return NULL; + +And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb() :: + + struct hlist_node *node; + for (pos = rcu_dereference((head)->first); + pos && ({ prefetch(pos->next); 1; }) && + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); + pos = rcu_dereference(pos->next)) + if (obj->key == key) + return obj; + return NULL; +} + +Quoting Corey Minyard : + +"If the object is moved from one list to another list in-between the + time the hash is calculated and the next field is accessed, and the + object has moved to the end of a new list, the traversal will not + complete properly on the list it should have, since the object will + be on the end of the new list and there's not a way to tell it's on a + new list and restart the list traversal. I think that this can be + solved by pre-fetching the "next" field (with proper barriers) before + checking the key." + +2. Insert algo +---------------- + +We need to make sure a reader cannot read the new 'obj->obj_next' value +and previous value of 'obj->key'. Or else, an item could be deleted +from a chain, and inserted into another chain. If new chain was empty +before the move, 'next' pointer is NULL, and lockless reader can +not detect it missed following items in original chain :: + + /* + * Please note that new inserts are done at the head of list, + * not in the middle or end. + */ + obj = kmem_cache_alloc(...); + lock_chain(); // typically a spin_lock() + obj->key = key; + /* + * we need to make sure obj->key is updated before obj->next + * or obj->refcnt + */ + smp_wmb(); + atomic_set(&obj->refcnt, 1); + hlist_add_head_rcu(&obj->obj_node, list); + unlock_chain(); // typically a spin_unlock() + + +3. Remove algo +-------------- +Nothing special here, we can use a standard RCU hlist deletion. +But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused +very very fast (before the end of RCU grace period) :: + + if (put_last_reference_on(obj) { + lock_chain(); // typically a spin_lock() + hlist_del_init_rcu(&obj->obj_node); + unlock_chain(); // typically a spin_unlock() + kmem_cache_free(cachep, obj); + } + + + +With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup() +and extra smp_wmb() in insert function. + +For example, if we choose to store the slot number as the 'nulls' +end-of-list marker for each slot of the hash table, we can detect +a race (some writer did a delete and/or a move of an object +to another chain) checking the final 'nulls' value if +the lookup met the end of chain. If final 'nulls' value +is not the slot number, then we must restart the lookup at +the beginning. If the object was moved to the same chain, +then the reader doesn't care : It might eventually +scan the list again without harm. + + +1. lookup algo +-------------- +:: + + head = &table[slot]; + rcu_read_lock(); + begin: + hlist_nulls_for_each_entry_rcu(obj, node, head, member) { + if (obj->key == key) { + if (!try_get_ref(obj)) // might fail for free objects + goto begin; + if (obj->key != key) { // not the object we expected + put_ref(obj); + goto begin; + } + goto out; + } + /* + * if the nulls value we got at the end of this lookup is + * not the expected one, we must restart lookup. + * We probably met an item that was moved to another chain. + */ + if (get_nulls_value(node) != slot) + goto begin; + obj = NULL; + + out: + rcu_read_unlock(); + +2. Insert function +------------------ + +:: + + /* + * Please note that new inserts are done at the head of list, + * not in the middle or end. + */ + obj = kmem_cache_alloc(cachep); + lock_chain(); // typically a spin_lock() + obj->key = key; + /* + * changes to obj->key must be visible before refcnt one + */ + smp_wmb(); + atomic_set(&obj->refcnt, 1); + /* + * insert obj in RCU way (readers might be traversing chain) + */ + hlist_nulls_add_head_rcu(&obj->obj_node, list); + unlock_chain(); // typically a spin_unlock() diff --git a/Documentation/RCU/rculist_nulls.txt b/Documentation/RCU/rculist_nulls.txt deleted file mode 100644 index 23f115dc87cf..000000000000 --- a/Documentation/RCU/rculist_nulls.txt +++ /dev/null @@ -1,172 +0,0 @@ -Using hlist_nulls to protect read-mostly linked lists and -objects using SLAB_TYPESAFE_BY_RCU allocations. - -Please read the basics in Documentation/RCU/listRCU.rst - -Using special makers (called 'nulls') is a convenient way -to solve following problem : - -A typical RCU linked list managing objects which are -allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can -use following algos : - -1) Lookup algo --------------- -rcu_read_lock() -begin: -obj = lockless_lookup(key); -if (obj) { - if (!try_get_ref(obj)) // might fail for free objects - goto begin; - /* - * Because a writer could delete object, and a writer could - * reuse these object before the RCU grace period, we - * must check key after getting the reference on object - */ - if (obj->key != key) { // not the object we expected - put_ref(obj); - goto begin; - } -} -rcu_read_unlock(); - -Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu() -but a version with an additional memory barrier (smp_rmb()) - -lockless_lookup(key) -{ - struct hlist_node *node, *next; - for (pos = rcu_dereference((head)->first); - pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); - pos = rcu_dereference(next)) - if (obj->key == key) - return obj; - return NULL; - -And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb() : - - struct hlist_node *node; - for (pos = rcu_dereference((head)->first); - pos && ({ prefetch(pos->next); 1; }) && - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); - pos = rcu_dereference(pos->next)) - if (obj->key == key) - return obj; - return NULL; -} - -Quoting Corey Minyard : - -"If the object is moved from one list to another list in-between the - time the hash is calculated and the next field is accessed, and the - object has moved to the end of a new list, the traversal will not - complete properly on the list it should have, since the object will - be on the end of the new list and there's not a way to tell it's on a - new list and restart the list traversal. I think that this can be - solved by pre-fetching the "next" field (with proper barriers) before - checking the key." - -2) Insert algo : ----------------- - -We need to make sure a reader cannot read the new 'obj->obj_next' value -and previous value of 'obj->key'. Or else, an item could be deleted -from a chain, and inserted into another chain. If new chain was empty -before the move, 'next' pointer is NULL, and lockless reader can -not detect it missed following items in original chain. - -/* - * Please note that new inserts are done at the head of list, - * not in the middle or end. - */ -obj = kmem_cache_alloc(...); -lock_chain(); // typically a spin_lock() -obj->key = key; -/* - * we need to make sure obj->key is updated before obj->next - * or obj->refcnt - */ -smp_wmb(); -atomic_set(&obj->refcnt, 1); -hlist_add_head_rcu(&obj->obj_node, list); -unlock_chain(); // typically a spin_unlock() - - -3) Remove algo --------------- -Nothing special here, we can use a standard RCU hlist deletion. -But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused -very very fast (before the end of RCU grace period) - -if (put_last_reference_on(obj) { - lock_chain(); // typically a spin_lock() - hlist_del_init_rcu(&obj->obj_node); - unlock_chain(); // typically a spin_unlock() - kmem_cache_free(cachep, obj); -} - - - --------------------------------------------------------------------------- -With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup() -and extra smp_wmb() in insert function. - -For example, if we choose to store the slot number as the 'nulls' -end-of-list marker for each slot of the hash table, we can detect -a race (some writer did a delete and/or a move of an object -to another chain) checking the final 'nulls' value if -the lookup met the end of chain. If final 'nulls' value -is not the slot number, then we must restart the lookup at -the beginning. If the object was moved to the same chain, -then the reader doesn't care : It might eventually -scan the list again without harm. - - -1) lookup algo - - head = &table[slot]; - rcu_read_lock(); -begin: - hlist_nulls_for_each_entry_rcu(obj, node, head, member) { - if (obj->key == key) { - if (!try_get_ref(obj)) // might fail for free objects - goto begin; - if (obj->key != key) { // not the object we expected - put_ref(obj); - goto begin; - } - goto out; - } -/* - * if the nulls value we got at the end of this lookup is - * not the expected one, we must restart lookup. - * We probably met an item that was moved to another chain. - */ - if (get_nulls_value(node) != slot) - goto begin; - obj = NULL; - -out: - rcu_read_unlock(); - -2) Insert function : --------------------- - -/* - * Please note that new inserts are done at the head of list, - * not in the middle or end. - */ -obj = kmem_cache_alloc(cachep); -lock_chain(); // typically a spin_lock() -obj->key = key; -/* - * changes to obj->key must be visible before refcnt one - */ -smp_wmb(); -atomic_set(&obj->refcnt, 1); -/* - * insert obj in RCU way (readers might be traversing chain) - */ -hlist_nulls_add_head_rcu(&obj->obj_node, list); -unlock_chain(); // typically a spin_unlock() -- 2.20.1