Date: Mon, 9 Nov 2020 08:50:58 +0100
From: Michal Hocko
To: NeilBrown
Cc: Tejun Heo, Lai Jiangshan, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Trond Myklebust, linux-kernel@vger.kernel.org
Subject: Re: [PATCH rfc] workqueue: honour cond_resched() more effectively.
Message-ID: <20201109075058.GC12240@dhcp22.suse.cz>
References: <87v9efp7cs.fsf@notabene.neil.brown.name>
In-Reply-To: <87v9efp7cs.fsf@notabene.neil.brown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon 09-11-20 13:54:59, Neil Brown wrote:
>
> If a worker task for a normal (bound, non-CPU-intensive) calls
> cond_resched(), this allows other non-workqueue processes to run, but
> does *not* allow other workqueue workers to run.  This is because
> workqueue will only attempt to run one task at a time on each CPU,
> unless the current task actually sleeps.
>
> This is already a problem for should_reclaim_retry() in mm/page_alloc.c,
> which inserts a small sleep just to convince workqueue to allow other
> workers to run.
>
> It can also be a problem for NFS when closing a very large file (e.g.
> 100 million pages in memory).  NFS can call the final iput() from a
> workqueue, which can then take long enough to trigger a workqueue-lockup
> warning, and long enough for performance problems to be observed.
>
> I added a WARN_ON_ONCE() in cond_resched() to report when it is run from
> a workqueue, and got about 20 hits during boot, many of system_wq (aka
> "events") which suggests there is a real chance that worker are being
> delayed unnecessarily be tasks which are written to politely relinquish
> the CPU.
>
> I think that once a worker calls cond_resched(), it should be treated as
> though it was run from a WQ_CPU_INTENSIVE queue, because only cpu-intensive
> tasks need to call cond_resched().  This would allow other workers to be
> scheduled.
>
> The following patch achieves this I believe.

I cannot really judge the implementation because my understanding of the
WQ concurrency control is very superficial but I echo that the existing
behavior is really nonintuitive.
It certainly burnt me in the OOM situations where the page allocator
cannot make much progress reclaiming memory and has to retry really
hard. Having to handle worker context explicitly/differently is error
prone and, as your example of the final iput() in NFS shows, the
allocator is not the only path affected, so having a general solution
is better.

That being said, I would really love to see cond_resched() work
transparently.

Thanks!
-- 
Michal Hocko
SUSE Labs