Date: Wed, 26 Aug 2020 14:48:10 +0200
From: Michal Hocko
To: xunlei
Cc: Johannes Weiner, Andrew Morton, Vladimir Davydov, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: memcg: Fix memcg reclaim soft lockup
Message-ID: <20200826124810.GQ22869@dhcp22.suse.cz>
References: <1598426822-93737-1-git-send-email-xlpang@linux.alibaba.com>
 <20200826081102.GM22869@dhcp22.suse.cz>
 <99efed0e-050a-e313-46ab-8fe6228839d5@linux.alibaba.com>
 <20200826110015.GO22869@dhcp22.suse.cz>
 <20200826120740.GP22869@dhcp22.suse.cz>
 <19eb48db-7d5e-0f55-5dfc-6a71274fd896@linux.alibaba.com>
In-Reply-To: <19eb48db-7d5e-0f55-5dfc-6a71274fd896@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 26-08-20 20:21:39, xunlei wrote:
> On 2020/8/26 8:07 PM, Michal Hocko wrote:
> > On Wed 26-08-20 20:00:47, xunlei wrote:
> >> On 2020/8/26 7:00 PM, Michal Hocko wrote:
> >>> On Wed 26-08-20 18:41:18, xunlei wrote:
> >>>> On 2020/8/26 4:11 PM, Michal Hocko wrote:
> >>>>> On Wed 26-08-20 15:27:02, Xunlei Pang wrote:
> >>>>>> We've met softlockup with "CONFIG_PREEMPT_NONE=y", when
> >>>>>> the target memcg doesn't have any reclaimable memory.
> >>>>>
> >>>>> Do you have any scenario when this happens or is this some sort of a
> >>>>> test case?
> >>>>
> >>>> It can happen on tiny guest scenarios.
> >>>
> >>> OK, you made me more curious. If this is a tiny guest and this is a hard
> >>> limit reclaim path then we should trigger an oom killer which should
> >>> kill the offender and that in turn bail out from the try_charge loop
> >>> (see should_force_charge). So how come this repeats enough in your setup
> >>> that it causes soft lockups?
> >>>
> >>
> >> should_force_charge() is false, the current trapped in endless loop is
> >> not the oom victim.
> >
> > How is that possible? If the oom killer kills a task and that doesn't
> > resolve the oom situation then it would go after another one until all
> > tasks are killed. Or is your task living outside of the memcg it tries
> > to charge?
> >
> All tasks are in memcgs. Looks like the first oom victim is not finished
> (unable to schedule), later mem_cgroup_oom()->...->oom_evaluate_task()
> will set oc->chosen to -1 and abort.

This shouldn't be possible for too long, because the oom_reaper would
make the victim invisible to the oom killer, so it should proceed. Also,
mem_cgroup_out_of_memory takes a mutex, and that is already an implicit
scheduling point.

Which kernel version is this?

And just for clarification: I am not against the additional
cond_resched. That sounds like a good thing in general, because we do
want predictable scheduling during reclaim that is independent of
reclaimability as much as possible. But I would like to drill down to
why you are seeing the lockup, because those shouldn't really happen.

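To make the scenario under discussion concrete, here is a minimal user-space sketch of the loop as described in the quoted text. It is not kernel code: Memcg, victim_runs, try_charge_model and the yield_cpu flag are hypothetical names invented for this illustration, and the model assumes a non-preemptive setup where the pending oom victim can only exit if the charging task gives up the CPU. It deliberately leaves out the oom_reaper and the mutex in mem_cgroup_out_of_memory, which are the reasons given above for why the real loop should not spin for long.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memcg:
    """Toy stand-in for a memory cgroup with a hard limit (all values in pages)."""
    limit: int
    usage: int
    victim_exiting: bool = False  # a victim was chosen but has not exited yet


def victim_runs(memcg: Memcg, victim_pages: int) -> None:
    """The pending victim gets CPU time, exits, and its pages are uncharged."""
    if memcg.victim_exiting:
        memcg.usage -= victim_pages
        memcg.victim_exiting = False


def try_charge_model(memcg: Memcg, pages: int, victim_pages: int,
                     yield_cpu: bool, max_retries: int = 1000) -> Optional[int]:
    """Return the number of retries before the charge fit, or None if it never did."""
    for retries in range(max_retries):
        if memcg.usage + pages <= memcg.limit:
            memcg.usage += pages
            return retries
        # Assumption: reclaim finds nothing, the memcg has no reclaimable memory.
        # OOM step: choose a victim unless one is already pending, in which
        # case selection aborts and the loop simply retries.
        if not memcg.victim_exiting:
            memcg.victim_exiting = True
        if yield_cpu:
            # Stand-in for an explicit scheduling point such as cond_resched().
            victim_runs(memcg, victim_pages)
    return None  # never made progress: the model's analogue of a soft lockup
```

Calling try_charge_model with yield_cpu=False on a memcg that is already at its limit exhausts max_retries and returns None, while yield_cpu=True lets the victim exit after the first failed attempt so the retry succeeds. This only shows why a scheduling point matters under the stated assumptions; it says nothing about why the lockup occurs on the reporter's kernel.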
-- 
Michal Hocko
SUSE Labs