From: Tetsuo Handa
To: Michal Hocko
Cc: mm-commits@vger.kernel.org,
    syzbot+2d2aeadc6ce1e1f11d45@syzkaller.appspotmail.com,
    shakeelb@google.com, roman.gushchin@linux.dev, hannes@cmpxchg.org,
    akpm@linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: + mm-memcontrol-fix-potential-oom_lock-recursion-deadlock.patch added to mm-unstable branch
Date: Tue, 26 Jul 2022 20:31:17 +0900
References: <20220725220032.B4C30C341C8@smtp.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/07/26 17:14, Michal Hocko wrote:
> As we have concluded there are two issues possible here which would be
> great to have reflected in the changelog.
>
> On Mon 25-07-22 15:00:32, Andrew Morton wrote:
>> From: Tetsuo Handa
>> Subject: mm: memcontrol: fix potential oom_lock recursion deadlock
>> Date: Fri, 22 Jul 2022 19:45:39 +0900
>>
>> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
>> reporting memcg OOM [1]. Such allocation request might deadlock the
>> system, for __alloc_pages_may_oom() cannot invoke global OOM killer due to
>> oom_lock being already held by the caller.
>
> I would phrase it like this:

This report is difficult to explain correctly.

> syzbot is reporting GFP_KERNEL allocation with oom_lock held when
> reporting memcg OOM [1].

Correct. But

> This is problematic because this creates a
> dependency between GFP_NOFS and GFP_KERNEL over oom_lock which could
> dead lock the system.

oom_lock is irrelevant when trying GFP_KERNEL allocation from GFP_NOFS
context. Therefore, something like:

----------
syzbot is reporting GFP_KERNEL allocation with oom_lock held when
reporting memcg OOM [1].
If this allocation triggers a global OOM situation, the system can
livelock: the GFP_KERNEL allocation made with oom_lock held cannot invoke
the global OOM killer, because __alloc_pages_may_oom() fails to take
oom_lock.

Fix this problem by removing the allocation from memory_stat_format()
completely, and pass a static buffer when calling it from the memcg OOM
path.

Note that the caller holding a filesystem lock was the trigger for syzbot
to report this locking dependency. Doing a GFP_KERNEL allocation with a
filesystem lock held can deadlock the system even without involving an OOM
situation.
----------
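
To make the two halves of the proposed changelog concrete, here is a minimal userspace sketch in C. It is not the kernel code. Every name in it (fake_oom_lock, fake_nested_alloc_can_oom_kill(), format_stats(), the buffer size and the sample counters) is hypothetical, and an atomic_flag stands in for oom_lock on the assumption that the nested allocation only ever try-locks it. The first function shows why an allocation made by the lock holder can never reach the OOM killer; the second shows the shape of the fix, where the formatter writes into a caller-supplied static buffer and allocates nothing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for oom_lock: flag set means "held". */
static atomic_flag fake_oom_lock = ATOMIC_FLAG_INIT;

static bool fake_oom_trylock(void)
{
	/* test_and_set returns the previous value; false means we got it. */
	return !atomic_flag_test_and_set(&fake_oom_lock);
}

static void fake_oom_unlock(void)
{
	atomic_flag_clear(&fake_oom_lock);
}

/*
 * Models the try-lock step of an allocation that hits global OOM.
 * Returns true only if the (imaginary) global OOM killer could run.
 */
static bool fake_nested_alloc_can_oom_kill(void)
{
	if (!fake_oom_trylock())
		return false;	/* lock busy: the caller can only retry */
	fake_oom_unlock();
	return true;
}

/* Problem shape: the report path allocates while it holds the lock. */
static bool report_with_allocation(void)
{
	bool could_kill;

	if (!fake_oom_trylock())
		return false;
	/* We are the lock holder, so the nested try-lock must fail. */
	could_kill = fake_nested_alloc_can_oom_kill();
	fake_oom_unlock();
	return could_kill;
}

/* Fix shape: format into a buffer the caller owns; no allocation. */
static int format_stats(char *buf, size_t size,
			unsigned long anon, unsigned long file)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size, "anon %lu\nfile %lu\n", anon, file);
}

/* The static buffer is safe here only because the lock serializes us. */
static int report_with_static_buffer(void)
{
	static char buf[128];	/* size is an arbitrary choice for the sketch */
	int len;

	if (!fake_oom_trylock())
		return -1;
	len = format_stats(buf, sizeof(buf), 4096, 8192);
	fake_oom_unlock();
	return len;
}
```

Calling report_with_allocation() always reports that the OOM killer was unreachable, which is the livelock condition described above, while report_with_static_buffer() completes without any nested attempt on the lock. The static buffer trades reentrancy for freedom from allocation, which is acceptable only where the lock already guarantees a single caller at a time.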