Date: Tue, 10 Apr 2018 20:19:31 +0900
From: Minchan Kim
To: Jan Kara
Cc: Michal Hocko, Andrew Morton, linux-mm, LKML, Johannes Weiner, Chris Fries
Subject: Re: [PATCH] mm: workingset: fix NULL ptr dereference
Message-ID: <20180410111931.GA5113@rodete-laptop-imager.corp.google.com>
References: <20180409015815.235943-1-minchan@kernel.org> <20180410082243.GW21835@dhcp22.suse.cz> <20180410085531.m2xvzi7nenbrgbve@quack2.suse.cz> <20180410093241.GA21835@dhcp22.suse.cz> <20180410102845.3ixg2lbnumqn2o6z@quack2.suse.cz>
In-Reply-To: <20180410102845.3ixg2lbnumqn2o6z@quack2.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 10, 2018 at 12:28:45PM +0200, Jan Kara wrote:
> On Tue 10-04-18 11:32:41, Michal Hocko wrote:
> > On Tue 10-04-18 10:55:31, Jan Kara wrote:
> > > On Tue 10-04-18 10:22:43, Michal Hocko wrote:
> > > > On Mon 09-04-18 10:58:15, Minchan Kim wrote:
> > > > > Recently, I got a report like below.
> > > > >
> > > > > [ 7858.792946] [] __list_del_entry+0x30/0xd0
> > > > > [ 7858.792951] [] list_lru_del+0xac/0x1ac
> > > > > [ 7858.792957] [] page_cache_tree_insert+0xd8/0x110
> > > > > [ 7858.792962] [] __add_to_page_cache_locked+0xf8/0x4e0
> > > > > [ 7858.792967] [] add_to_page_cache_lru+0x50/0x1ac
> > > > > [ 7858.792972] [] pagecache_get_page+0x468/0x57c
> > > > > [ 7858.792979] [] __get_node_page+0x84/0x764
> > > > > [ 7858.792986] [] f2fs_iget+0x264/0xdc8
> > > > > [ 7858.792991] [] f2fs_lookup+0x3b4/0x660
> > > > > [ 7858.792998] [] lookup_slow+0x1e4/0x348
> > > > > [ 7858.793003] [] walk_component+0x21c/0x320
> > > > > [ 7858.793008] [] path_lookupat+0x90/0x1bc
> > > > > [ 7858.793013] [] filename_lookup+0x8c/0x1a0
> > > > > [ 7858.793018] [] vfs_fstatat+0x84/0x10c
> > > > > [ 7858.793023] [] SyS_newfstatat+0x28/0x64
> > > > >
> > > > > The v4.9 kernel already has d3798ae8c6f3 ("mm: filemap: don't
> > > > > plant shadow entries without radix tree node"), so I thought
> > > > > it should be okay. When I was googling, I found others reporting
> > > > > the same problem, and I think the current kernel still has it.
> > > > >
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1431567
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1420335
> > > > >
> > > > > The shadow entry handling of the radix tree relies on the init
> > > > > state: an allocated node->private_list should be in the list_empty
> > > > > state. Currently, it's initialized in the SLAB constructor, which
> > > > > means a radix tree node is initialized only when *slub allocates
> > > > > a new page*, not *a new object*. So, if some FS or subsystem passes
> > > > > __GFP_ZERO in gfp_mask, the slub allocator will do a memset blindly.
> > > > > That means an allocated node can have !list_empty(node->private_list).
> > > > > It ends up with a NULL dereference in workingset_update_node because
> > > > > the list_empty check fails.
> > > > >
> > > > > This patch should fix it.
> > > > >
> > > > > Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
> > > > > Reported-by: Chris Fries
> > > > > Cc: Johannes Weiner
> > > > > Cc: Jan Kara
> > > > > Signed-off-by: Minchan Kim
> > > >
> > > > Regardless of whether it makes sense to use __GFP_ZERO from the upper
> > > > layer or not, it is subtle as hell to rely on the pre-existing state
> > > > for a newly allocated object. So yes this makes perfect sense.
> > > >
> > > > Do we want CC: stable?
> > > > Acked-by: Michal Hocko

Thanks, Michal.

> > >
> > > Well, for hot allocations we do rely on previous state a lot. After all
> > > that's what slab constructor was created for. Whether radix tree node
> > > allocation is such a hot path is a question for debate, I agree.
> >
> > I really doubt that LIST_INIT is something to notice for the radix tree
> > allocation.
>
> I agree with that.

I totally agree with Michal's opinion. I don't want to play semantic
games here, since we can make the API work with a simple one-line
change without any performance loss.

As I stated in the description, there were other reports hitting this
bug, and I believe we have left it unfixed for a long time. Out-of-tree
filesystems and other out-of-tree radix tree users could be affected by
this bug once they use __GFP_ZERO, intentionally or by chance. MM didn't
give them any guidance. I hope we can keep it simple unless we lose
something big.

> > So I would rather have safe code than rely on the previous state which is
> > really subtle.
>
> And I agree on subtlety part here as well. But even with LIST_INIT we'll be
> relying on some fields being 0 / NULL so you cannot really say that with
> LIST_INIT we won't be relying on previous state. And fully memsetting
> radix_tree_node on allocation *would* IMO have effect on the performance.

It also does memset in radix_tree_node_rcu_free.
I think if we really want to benefit from the slab constructor, the
object should already be back in its init state at the time it is
freed, so that the next allocation doesn't need to do anything. From
that perspective, I think radix_tree_node's constructor is pointless.

> So I'm not convinced LIST_INIT buys us much. It deals with __GFP_ZERO
> problem but not much else.

Jan, so, what is your stance on this patch? If you're okay with it,
I'd really like to go with my original patch, which Michal has already
Acked.

Thanks.
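
For readers following the thread, here is a minimal userspace sketch of the failure mode discussed above. It is not kernel code. `struct demo_node`, `demo_node_ctor()`, `demo_alloc()` and `demo_check()` are hypothetical names made up for illustration, and the one-object "cache" only mimics the behaviour described in the patch description: the constructor runs once when the backing memory is set up, and a zeroing allocation then wipes that state. The `init_list` flag models re-initializing the list head on every allocation, which is how the LIST_INIT idea in this thread is read here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same semantics as the kernel helper: an empty list points at itself. */
static inline void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static inline bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical, trimmed-down node; the real radix_tree_node has more fields. */
struct demo_node {
	unsigned int count;
	struct list_head private_list;
};

/* Plays the role of the slab constructor: run once, not per allocation. */
static void demo_node_ctor(struct demo_node *node)
{
	memset(node, 0, sizeof(*node));
	INIT_LIST_HEAD(&node->private_list);
}

/* A one-object "cache", enough to show the ordering problem. */
static struct demo_node demo_slot;
static bool demo_slot_constructed;

/*
 * zero:      models an allocation with __GFP_ZERO (blind memset).
 * init_list: models initializing private_list on every allocation.
 */
struct demo_node *demo_alloc(bool zero, bool init_list)
{
	if (!demo_slot_constructed) {
		demo_node_ctor(&demo_slot);
		demo_slot_constructed = true;
	}
	if (zero)
		memset(&demo_slot, 0, sizeof(demo_slot));
	if (init_list)
		INIT_LIST_HEAD(&demo_slot.private_list);
	return &demo_slot;
}

/* Walks through the three cases; returns 0 if they behave as described. */
int demo_check(void)
{
	/* Constructor state intact: the list reads as empty. */
	assert(list_empty(&demo_alloc(false, false)->private_list));
	/* Zeroed object: next == NULL != head, so the list looks non-empty. */
	assert(!list_empty(&demo_alloc(true, false)->private_list));
	/* Per-allocation init restores the expected state. */
	assert(list_empty(&demo_alloc(true, true)->private_list));
	return 0;
}
```

Calling `demo_check()` from a small `main()` exercises all three cases. The second case is the interesting one: after the memset, `list_empty()` returns false even though nothing was ever added, which is the state the patch description says leads to the NULL dereference once the code tries to delete the entry.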