From: Dmitry Vyukov
Date: Wed, 10 Oct 2018 09:55:57 +0200
Subject: Re: INFO: rcu detected stall in shmem_fault
To: David Rientjes
Cc: Tetsuo Handa, syzbot, Johannes Weiner, Michal Hocko, Andrew Morton, guro@fb.com, "Kirill A. Shutemov", LKML, Linux-MM, syzkaller-bugs, Yang Shi
References: <000000000000dc48d40577d4a587@google.com> <201810100012.w9A0Cjtn047782@www262.sakura.ne.jp>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 10, 2018 at 6:11 AM, 'David Rientjes' via syzkaller-bugs wrote:
> On Wed, 10 Oct 2018, Tetsuo Handa wrote:
>
>> syzbot is hitting RCU stall due to memcg-OOM event.
>> https://syzkaller.appspot.com/bug?id=4ae3fff7fcf4c33a47c1192d2d62d2e03efffa64
>>
>> What should we do if memcg-OOM found no killable task because the allocating task
>> was oom_score_adj == -1000 ? Flooding printk() until RCU stall watchdog fires
>> (which seems to be caused by commit 3100dab2aa09dc6e ("mm: memcontrol: print proper
>> OOM header when no eligible victim left") because syzbot was terminating the test
>> upon WARN(1) removed by that commit) is not a good behavior.

Are you saying that most of the recent hangs and stalls are actually caused by
our attempt to sandbox test processes with a memory cgroup?
The process with oom_score_adj == -1000 is not supposed to consume any
significant memory; we have another (test) process with oom_score_adj == 0
that's actually consuming memory.
But should we refrain from using -1000? Perhaps it would be better to use
-500/500 for the control/test processes, or -999/1000?

> Not printing anything would be the obvious solution but the ideal solution
> would probably involve
>
>  - adding feedback to the memcg oom killer that there are no killable
>    processes,
>
>  - adding complete coverage for memcg_oom_recover() in all uncharge paths
>    where the oom memcg's page_counter is decremented, and
>
>  - having all processes stall until memcg_oom_recover() is called so
>    looping back into try_charge() has a reasonable expectation to succeed.
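
For reference, the -500/500 split floated above maps onto the per-process
/proc/<pid>/oom_score_adj file, which accepts values from -1000 to 1000, with
-1000 exempting the task from the OOM killer entirely. Below is a minimal
sketch in Python of how a test harness might apply such a split. The helper
names (set_oom_score_adj, is_oom_exempt), the CONTROL_ADJ/TEST_ADJ constants
and the proc_root parameter are hypothetical and only for illustration; this
is not how syzkaller actually configures its processes.

```python
from pathlib import Path

OOM_SCORE_ADJ_MIN = -1000  # a task with this value is never chosen as a victim
OOM_SCORE_ADJ_MAX = 1000

# Assumed values, taken from the "-500/500" suggestion in this thread.
CONTROL_ADJ = -500  # control process: unlikely victim, but still killable
TEST_ADJ = 500      # test process: preferred victim


def set_oom_score_adj(value, pid="self", proc_root="/proc"):
    """Write `value` to <proc_root>/<pid>/oom_score_adj and return the path.

    `proc_root` is a parameter only so the helper can be exercised against
    a scratch directory instead of the real /proc.
    """
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj out of range: {value}")
    path = Path(proc_root) / str(pid) / "oom_score_adj"
    path.write_text(f"{value}\n")
    return path


def is_oom_exempt(value):
    """Return True if this oom_score_adj makes the task unkillable by OOM."""
    return value == OOM_SCORE_ADJ_MIN
```

With these values neither process is OOM-exempt (is_oom_exempt(CONTROL_ADJ)
is False), so a memcg OOM always has at least one eligible victim. Note that
lowering the value on a real system may require CAP_SYS_RESOURCE.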