Subject: [PATCH 2/3] mm: Make list_lru_node::memcg_lrus RCU protected
From: Kirill Tkhai
To: apolyakov@beget.ru, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    ktkhai@virtuozzo.com, vdavydov.dev@gmail.com, aryabinin@virtuozzo.com,
    akpm@linux-foundation.org
Date: Tue, 22 Aug 2017 15:29:26 +0300
Message-ID: <150340496641.3845.291357513974178821.stgit@localhost.localdomain>
In-Reply-To: <150340381428.3845.6099251634440472539.stgit@localhost.localdomain>
References: <150340381428.3845.6099251634440472539.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

The array list_lru_node::memcg_lrus::list_lru_one[] only grows; it never
shrinks. The growth happens in memcg_update_list_lru_node(), and the old
array's members remain the same after it.

So access to the array's members can be made RCU protected, and it becomes
possible to avoid taking list_lru_node::lock to dereference it. This will
be used in the next patch to get a list's nr_items locklessly.

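For readers unfamiliar with the pattern, the reasoning above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: C11 release/acquire atomics stand in for rcu_assign_pointer()/rcu_dereference(), a "retired" chain freed at teardown stands in for kfree_rcu(), and every name (item_list, list_table, node, node_grow, node_slot, node_destroy) is hypothetical. Real RCU additionally needs read-side critical sections and a grace period before the old array is freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct list_lru_one. */
struct item_list {
	atomic_long nr_items;
};

/* Hypothetical stand-in for struct list_lru_memcg. */
struct list_table {
	struct list_table *retired_next;	/* chain of tables awaiting free */
	int size;
	struct item_list *slot[];		/* grow-only array of pointers */
};

struct node {
	_Atomic(struct list_table *) table;	/* pointer published to readers */
	struct list_table *retired;		/* old tables, freed at teardown */
};

static void node_init(struct node *n)
{
	atomic_init(&n->table, NULL);
	n->retired = NULL;
}

/* Writer: grow the table. Writers must be serialised by the caller. */
static int node_grow(struct node *n, int new_size)
{
	struct list_table *old, *new;
	int old_size, i;

	old = atomic_load_explicit(&n->table, memory_order_relaxed);
	old_size = old ? old->size : 0;
	assert(old_size <= new_size);	/* the array never shrinks */

	new = malloc(sizeof(*new) + (size_t)new_size * sizeof(new->slot[0]));
	if (!new)
		return -1;
	new->retired_next = NULL;
	new->size = new_size;
	/* Old members keep their addresses: only the pointers are copied. */
	if (old)
		memcpy(new->slot, old->slot,
		       (size_t)old_size * sizeof(new->slot[0]));
	for (i = old_size; i < new_size; i++) {
		new->slot[i] = malloc(sizeof(*new->slot[i]));
		if (!new->slot[i]) {
			while (--i >= old_size)
				free(new->slot[i]);
			free(new);
			return -1;
		}
		atomic_init(&new->slot[i]->nr_items, 0);
	}

	/* Publish; pairs with the acquire load in node_slot(). */
	atomic_store_explicit(&n->table, new, memory_order_release);

	/* Defer freeing the old table (the patch uses kfree_rcu() here). */
	if (old) {
		old->retired_next = n->retired;
		n->retired = old;
	}
	return 0;
}

/* Reader: lockless lookup of one member. */
static struct item_list *node_slot(struct node *n, int idx)
{
	struct list_table *t;

	t = atomic_load_explicit(&n->table, memory_order_acquire);
	if (!t || idx < 0 || idx >= t->size)
		return NULL;
	return t->slot[idx];
}

/* Teardown: no readers may remain, so everything is freed directly. */
static void node_destroy(struct node *n)
{
	struct list_table *t;
	int i;

	t = atomic_load_explicit(&n->table, memory_order_relaxed);
	for (i = 0; t && i < t->size; i++)
		free(t->slot[i]);
	free(t);
	while (n->retired) {
		t = n->retired;
		n->retired = t->retired_next;
		free(t);
	}
	atomic_store_explicit(&n->table, NULL, memory_order_relaxed);
}
```

A caller would run node_init(), grow the table with node_grow(), and read counters through node_slot() without taking a lock. A pointer obtained before a grow stays valid afterwards because members are never moved, which is the property the changelog relies on.
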
Signed-off-by: Kirill Tkhai
---
 include/linux/list_lru.h |    2 +
 mm/list_lru.c            |   70 +++++++++++++++++++++++++++++++---------------
 2 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index b65505b32a3d..a55258100e40 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -43,7 +43,7 @@ struct list_lru_node {
 	struct list_lru_one	lru;
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
-	struct list_lru_memcg	*memcg_lrus;
+	struct list_lru_memcg	__rcu *memcg_lrus;
 #endif
 	long nr_items;
 } ____cacheline_aligned_in_smp;
diff --git a/mm/list_lru.c b/mm/list_lru.c
index a726e321bf3e..2db3cdadb577 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -42,24 +42,30 @@ static void list_lru_unregister(struct list_lru *lru)
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 static inline bool list_lru_memcg_aware(struct list_lru *lru)
 {
+	struct list_lru_memcg *memcg_lrus;
 	/*
 	 * This needs node 0 to be always present, even
 	 * in the systems supporting sparse numa ids.
+	 *
+	 * Here we only check that the pointer is not NULL,
+	 * so the RCU lock is not needed.
 	 */
-	return !!lru->node[0].memcg_lrus;
+	memcg_lrus = rcu_dereference_check(lru->node[0].memcg_lrus, true);
+	return !!memcg_lrus;
 }
 
 static inline struct list_lru_one *
 list_lru_from_memcg_idx(struct list_lru_node *nlru, int idx)
 {
+	struct list_lru_memcg *memcg_lrus;
 	/*
-	 * The lock protects the array of per cgroup lists from relocation
-	 * (see memcg_update_list_lru_node).
+	 * Either the lock or RCU protects the array of per cgroup lists
+	 * from relocation (see memcg_update_list_lru_node).
 	 */
-	lockdep_assert_held(&nlru->lock);
-	if (nlru->memcg_lrus && idx >= 0)
-		return nlru->memcg_lrus->lru[idx];
-
+	memcg_lrus = rcu_dereference_check(nlru->memcg_lrus,
+					   lockdep_is_held(&nlru->lock));
+	if (memcg_lrus && idx >= 0)
+		return memcg_lrus->lru[idx];
 	return &nlru->lru;
 }
 
@@ -76,9 +82,12 @@ static __always_inline struct mem_cgroup *mem_cgroup_from_kmem(void *ptr)
 static inline struct list_lru_one *
 list_lru_from_kmem(struct list_lru_node *nlru, void *ptr)
 {
+	struct list_lru_memcg *memcg_lrus;
 	struct mem_cgroup *memcg;
 
-	if (!nlru->memcg_lrus)
+	/* Only a NULL check is done here, so the RCU lock isn't needed */
+	memcg_lrus = rcu_dereference_check(nlru->memcg_lrus, true);
+	if (!memcg_lrus)
 		return &nlru->lru;
 
 	memcg = mem_cgroup_from_kmem(ptr);
@@ -323,25 +332,33 @@ static int __memcg_init_list_lru_node(struct list_lru_memcg *memcg_lrus,
 
 static int memcg_init_list_lru_node(struct list_lru_node *nlru)
 {
+	struct list_lru_memcg *memcg_lrus;
 	int size = memcg_nr_cache_ids;
 
-	nlru->memcg_lrus = kmalloc(sizeof(struct list_lru_memcg) +
-				   size * sizeof(void *), GFP_KERNEL);
-	if (!nlru->memcg_lrus)
+	memcg_lrus = kmalloc(sizeof(*memcg_lrus) +
+			     size * sizeof(void *), GFP_KERNEL);
+	if (!memcg_lrus)
 		return -ENOMEM;
 
-	if (__memcg_init_list_lru_node(nlru->memcg_lrus, 0, size)) {
-		kfree(nlru->memcg_lrus);
+	if (__memcg_init_list_lru_node(memcg_lrus, 0, size)) {
+		kfree(memcg_lrus);
 		return -ENOMEM;
 	}
+	rcu_assign_pointer(nlru->memcg_lrus, memcg_lrus);
 
 	return 0;
 }
 
 static void memcg_destroy_list_lru_node(struct list_lru_node *nlru)
 {
-	__memcg_destroy_list_lru_node(nlru->memcg_lrus, 0, memcg_nr_cache_ids);
-	kfree(nlru->memcg_lrus);
+	struct list_lru_memcg *memcg_lrus;
+	/*
+	 * This is called when shrinker has already been unregistered,
+	 * and nobody can use it. So there is no need to use kfree_rcu().
+	 */
+	memcg_lrus = rcu_dereference_check(nlru->memcg_lrus, true);
+	__memcg_destroy_list_lru_node(memcg_lrus, 0, memcg_nr_cache_ids);
+	kfree(memcg_lrus);
 }
 
 static int memcg_update_list_lru_node(struct list_lru_node *nlru,
@@ -350,8 +367,10 @@ static int memcg_update_list_lru_node(struct list_lru_node *nlru,
 	struct list_lru_memcg *old, *new;
 
 	BUG_ON(old_size > new_size);
+	lockdep_assert_held(&list_lrus_mutex);
 
-	old = nlru->memcg_lrus;
+	/* list_lrus_mutex is held, nobody can change memcg_lrus. Silence RCU */
+	old = rcu_dereference_check(nlru->memcg_lrus, true);
 	new = kmalloc(sizeof(*new) + new_size * sizeof(void *), GFP_KERNEL);
 	if (!new)
 		return -ENOMEM;
@@ -364,26 +383,31 @@ static int memcg_update_list_lru_node(struct list_lru_node *nlru,
 	memcpy(&new->lru, &old->lru, old_size * sizeof(void *));
 
 	/*
-	 * The lock guarantees that we won't race with a reader
-	 * (see list_lru_from_memcg_idx).
+	 * The locking below allows the readers, that already take nlru->lock,
+	 * not to use additional rcu_read_lock()/rcu_read_unlock() pair.
 	 *
 	 * Since list_lru_{add,del} may be called under an IRQ-safe lock,
 	 * we have to use IRQ-safe primitives here to avoid deadlock.
 	 */
 	spin_lock_irq(&nlru->lock);
-	nlru->memcg_lrus = new;
+	rcu_assign_pointer(nlru->memcg_lrus, new);
 	spin_unlock_irq(&nlru->lock);
 
-	kfree(old);
+	kfree_rcu(old, rcu);
 	return 0;
 }
 
 static void memcg_cancel_update_list_lru_node(struct list_lru_node *nlru,
 					      int old_size, int new_size)
 {
+	struct list_lru_memcg *memcg_lrus;
+
+	lockdep_assert_held(&list_lrus_mutex);
+	memcg_lrus = rcu_dereference_check(nlru->memcg_lrus, true);
+
 	/* do not bother shrinking the array back to the old size, because we
 	 * cannot handle allocation failures here */
-	__memcg_destroy_list_lru_node(nlru->memcg_lrus, old_size, new_size);
+	__memcg_destroy_list_lru_node(memcg_lrus, old_size, new_size);
 }
 
 static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
@@ -400,7 +424,7 @@ static int memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 	return 0;
 fail:
 	for (i = i - 1; i >= 0; i--) {
-		if (!lru->node[i].memcg_lrus)
+		if (!rcu_dereference_check(lru->node[i].memcg_lrus, true))
 			continue;
 		memcg_destroy_list_lru_node(&lru->node[i]);
 	}
@@ -434,7 +458,7 @@ static int memcg_update_list_lru(struct list_lru *lru,
 	return 0;
 fail:
 	for (i = i - 1; i >= 0; i--) {
-		if (!lru->node[i].memcg_lrus)
+		if (!rcu_dereference_check(lru->node[i].memcg_lrus, true))
 			continue;
 		memcg_cancel_update_list_lru_node(&lru->node[i],