Received: by 10.213.65.68 with SMTP id h4csp94302imn; Tue, 3 Apr 2018 16:15:41 -0700 (PDT) X-Google-Smtp-Source: AIpwx49AXUg5+ktZsOuBU2hSQlTRv+So/RfPurij9NZEvmxMjE5fDTHVieTi3CIKArMRb0Y+VtVn X-Received: by 2002:a17:902:8d97:: with SMTP id v23-v6mr15772507plo.285.1522797341429; Tue, 03 Apr 2018 16:15:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522797341; cv=none; d=google.com; s=arc-20160816; b=k85Jlr32EWvGIrB7L7ZtvQy/5ongQgvhJCIh5LAN2UOA4TIn8brYfeUvTIt1aZHv6U RFlW2iH/8r/tdQpXra5FTsVT88pG6X9/TJ/2PVho3NCx9UNNb611jUQ8ZOnjl2nwbvD5 x+TPr4FqYXJYnCbDMZYOfWh+MlFFOdrTZxXhAlfIbQTQ7f1kjXcRUocPipaeqhFhBk4O BV2XMxDsLpGvbI/Q8CVnz3dmcaWYE/kCGTdsop8Ww163fMo5i8CT3pGga61ELTEazjEo Dx+By559MuYno9xq5Z2l7MEaSbTB4JTRqwxAUdZy9atkG23h13hzhrhPiLR02tR0XgC9 LSaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature :arc-authentication-results; bh=3vrX2cLC0WYxPIHeQZ05EQlCDEqvHisx/3b24ODSelc=; b=DqWC1hT+xIxY9OBnAZ+ZoXRK01HzKkOrp8QBtP3fVOoNrX3K67E27TzfheJb6s9uTX OtYHoWfCXNUwApjczcxDYV81EwXq3tp7DiccA0MJjwSj9FJfUP/arfLy0I+gCVGWLd5S C0rKVaGgjzZ6+pmRmqLDluJmLW2z+FFZ5DvyonbLaC1Q8gorlAEbQ+iFmMuJHbqX5AY3 TkqWGdC/FBilciVkkVNEqLL0p7l3a8CRHrWocJZpgYOmt8VnfCo6zCDNt55ag6hyipYe VkUVBItRpiTBAThPdMbuAm67ML+91eJN/ZSy1wkRYyCcCTbuqX5TycvOZEs/yRUzhWDN vkuA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=CVo1xu2A; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id t9-v6si1706653plo.62.2018.04.03.16.15.27; Tue, 03 Apr 2018 16:15:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=CVo1xu2A; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755544AbeDCXMs (ORCPT + 99 others); Tue, 3 Apr 2018 19:12:48 -0400 Received: from mail-yb0-f196.google.com ([209.85.213.196]:37746 "EHLO mail-yb0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754252AbeDCXMr (ORCPT ); Tue, 3 Apr 2018 19:12:47 -0400 Received: by mail-yb0-f196.google.com with SMTP id u5-v6so7115785ybf.4 for ; Tue, 03 Apr 2018 16:12:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=3vrX2cLC0WYxPIHeQZ05EQlCDEqvHisx/3b24ODSelc=; b=CVo1xu2ADxrqYYzI4TvhbKRmuR4JALobREsoZ4Itfcm5kdwJivgEQienir9naVHVb4 skEXiwa3Ulpl9TcbD+30hq7siuTUuhRbI/OnzyTpbxIS/suYj/rkUXWGebDmJ4YDDYoH CvWxYcteZtLKrADl3/VBKWmS8lL+sReLYJ7eOIBtf0yM0Q6gGQxZezVSE4/AEuZYJxep 87Brb5Tn4i9TRymh7V5huiXNqUcx6VZEL3Jywa1lfFj5ryCVAhAxHLLV//TEYAesgGYC O/eWuEQ2z43aqUvUXOcc+lV8EL3IS7K5CQG6+3da9j7cgjZWeONFi4+fERr19hBAMx86 upPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=3vrX2cLC0WYxPIHeQZ05EQlCDEqvHisx/3b24ODSelc=; b=knnVXiR5SLqXiBpr85bQMKjsAyLGFF7o4f4+RDsssJexYISY+KD4NfB1Yx5KdiDY6a Buzo4Hs47YpHjHP4y2BWQyRXGk1DUxFDnSpueF7veAXsvQMwUmL+X0j0w5ER3O+UybX5 v2WwzHkAtlMnwmnYWKhQ/dE1uLfcktohBRKBgO04+XaM4vwxq9apf5PSpFydhGJ6n4n7 9uOruDh3HwtRuysd/GvNfXQs8zZ4Gbp/0LqD2T4es8Kktg5Nwyf5Lj+GT9Bghm4pPTYG Qm1abxv/GL2BwPHvvpRA5PjQ7AqM7le+FXG2wyS1ikoCPUj6V0KAZY9eMEr3PRq9PEPt BZzw== X-Gm-Message-State: ALQs6tAX371Zqj4Jnpwy/IIDjYj/DfV6rW47QFcqGtPhSEU8SCWwRrvR OGJGhaMyObb4q2edfptijilSdntzdPswUYFQz+Zr4g== X-Received: by 10.13.197.65 with SMTP id h62mr823998ywd.464.1522797166112; Tue, 03 Apr 2018 16:12:46 -0700 (PDT) MIME-Version: 1.0 References: <157ed606-4a61-508b-d26a-2f5d638f39bb@meituan.com> <20180403120312.GS5501@dhcp22.suse.cz> In-Reply-To: <20180403120312.GS5501@dhcp22.suse.cz> From: Greg Thelen Date: Tue, 03 Apr 2018 23:12:34 +0000 Message-ID: Subject: Re: [RFC] Is it correctly that the usage for spin_{lock|unlock}_irq in clear_page_dirty_for_io To: Michal Hocko Cc: wanglong19@meituan.com, Tejun Heo , Johannes Weiner , akpm@osdl.org, LKML , npiggin@gmail.com Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 3, 2018 at 5:03 AM Michal Hocko wrote: > On Mon 02-04-18 19:50:50, Wang Long wrote: > > > > Hi, Johannes Weiner and Tejun Heo > > > > I use linux-4.4.y to test the new cgroup controller io and the current > > stable kernel linux-4.4.y has the follow logic > > > > > > int clear_page_dirty_for_io(struct page *page){ > > ... > > ... > > memcg = mem_cgroup_begin_page_stat(page); ----------(a) > > wb = unlocked_inode_to_wb_begin(inode, &locked); ---------(b) > > if (TestClearPageDirty(page)) { > > mem_cgroup_dec_page_stat(memcg, MEM_CGROUP_STAT_DIRTY); > > dec_zone_page_state(page, NR_FILE_DIRTY); > > dec_wb_stat(wb, WB_RECLAIMABLE); > > ret =1; > > } > > unlocked_inode_to_wb_end(inode, locked); -----------(c) > > mem_cgroup_end_page_stat(memcg); -------------(d) > > return ret; > > ... > > ... > > } > > > > > > when memcg is moving, and I_WB_SWITCH flags for inode is set. the logic > > is the following: > > > > > > spin_lock_irqsave(&memcg->move_lock, flags); -------------(a) > > spin_lock_irq(&inode->i_mapping->tree_lock); ------------(b) > > spin_unlock_irq(&inode->i_mapping->tree_lock); -----------(c) > > spin_unlock_irqrestore(&memcg->move_lock, flags); -----------(d) > > > > > > after (c) , the local irq is enabled. I think it is not correct. > > > > We get a deadlock backtrace after (c), the cpu get an softirq and in the > > irq it also call mem_cgroup_begin_page_stat to lock the same > > memcg->move_lock. > > > > Since the conditions are too harsh, this scenario is difficult to > > reproduce. But it really exists. > > > > So how about change (b) (c) to spin_lock_irqsave/spin_lock_irqrestore? > Yes, it seems we really need this even for the current tree. Please note > that At least clear_page_dirty_for_io doesn't lock memcg anymore. > __cancel_dirty_page still uses lock_page_memcg though (former > mem_cgroup_begin_page_stat). > -- > Michal Hocko > SUSE Labs I agree the issue looks real in 4.4 stable and upstream. It seems like unlocked_inode_to_wb_begin/_end should use spin_lock_irqsave/restore. I'm testing a little patch now.