From: Dmitry Vyukov
Date: Wed, 10 Oct 2018 14:36:29 +0200
Subject: Re: INFO: rcu detected stall in shmem_fault
To: Michal Hocko
Cc: Sergey Senozhatsky, Tetsuo Handa, syzbot, Johannes Weiner,
 Andrew Morton, guro@fb.com, "Kirill A. Shutemov", LKML, Linux-MM,
 David Rientjes, syzkaller-bugs, Yang Shi, Sergey Senozhatsky,
 Petr Mladek
References: <000000000000dc48d40577d4a587@google.com>
 <201810100012.w9A0Cjtn047782@www262.sakura.ne.jp>
 <20181010085945.GC5873@dhcp22.suse.cz>
 <20181010113500.GH5873@dhcp22.suse.cz>
 <20181010114833.GB3949@tigerII.localdomain>
 <20181010122539.GI5873@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 10, 2018 at 2:29 PM, Dmitry Vyukov wrote:
> On Wed, Oct 10, 2018 at 2:25 PM, Michal Hocko wrote:
>> On Wed 10-10-18 20:48:33, Sergey Senozhatsky wrote:
>>> On (10/10/18 13:35), Michal Hocko wrote:
>>> > > Just flooding out of memory messages can trigger RCU stall problems.
>>> > > For example, a severe skbuff_head_cache or kmalloc-512 leak bug is causing
>>> >
>>> > [...]
>>> >
>>> > Quite some of them, indeed! I guess we want to rate limit the output.
>>> > What about the following?
>>>
>>> A bit unrelated, but while we are at it:
>>>
>>> I like it when we rate-limit printk-s that lock up the system.
>>> But it seems that default rate-limit values are not always good enough,
>>> DEFAULT_RATELIMIT_INTERVAL / DEFAULT_RATELIMIT_BURST can still be too
>>> verbose. For instance, when we have a very slow IPMI emulated serial
>>> console -- e.g. baud rate at 57600. DEFAULT_RATELIMIT_INTERVAL and
>>> DEFAULT_RATELIMIT_BURST can add new OOM headers and backtraces faster
>>> than we evict them.
>>>
>>> Does it sound reasonable enough to use larger than default rate-limits
>>> for printk-s in OOM print-outs? OOM reports tend to be somewhat large
>>> and the reported numbers are not always *very* unique.
>>>
>>> What do you think?
>>
>> I do not really care about the current interval/burst values. This change
>> should be done separately and ideally with some numbers.
>
> I think Sergey meant that this place may need to use
> larger-than-default values because it prints lots of output per
> instance (whereas the default limit is more tuned for cases that print
> just 1 line).
>
> I've found at least 1 place that uses DEFAULT_RATELIMIT_INTERVAL*10:
> https://elixir.bootlin.com/linux/latest/source/fs/btrfs/extent-tree.c#L8365
> Probably we need something similar here.

In parallel with the kernel changes I've also made a change to
syzkaller that (1) makes it not use oom_score_adj=-1000, since this
hard limit on OOM killing looks quite risky, and (2) increases the
memcg size beyond the expected KASAN quarantine size:
https://github.com/google/syzkaller/commit/adedaf77a18f3d03d695723c86fc083c3551ff5b
If this stops the flow of hang/stall reports, then we can just
close all old reports as invalid.
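
To put rough numbers on the rate-limit point quoted above, here is a
back-of-the-envelope sketch in Python (not kernel code). It assumes the
kernel defaults of a 5 second interval and a burst of 10, 8N1 serial
framing (10 bits on the wire per byte), and that the rate-limit check is
made once per OOM report. The 4096-byte report size and all function
names are hypothetical, chosen only for illustration.

```python
# Rough model: can rate-limited multi-line reports still outrun a slow
# serial console? All sizes below are illustrative assumptions.

DEFAULT_RATELIMIT_INTERVAL_S = 5   # kernel default is 5 * HZ, i.e. 5 seconds
DEFAULT_RATELIMIT_BURST = 10       # kernel default: 10 events per interval


def console_bytes_per_sec(baud, bits_per_byte=10):
    """Serial console throughput; 8N1 framing costs 10 bits per byte."""
    return baud / bits_per_byte


def allowed_bytes_per_sec(report_bytes, interval_s, burst):
    """Worst-case output rate the rate limit lets through, assuming
    one rate-limit check per report of report_bytes bytes."""
    return report_bytes * burst / interval_s


def keeps_up(report_bytes, baud, interval_s, burst):
    """True if the console can drain output as fast as it may be produced."""
    return (allowed_bytes_per_sec(report_bytes, interval_s, burst)
            <= console_bytes_per_sec(baud))


if __name__ == "__main__":
    baud = 57600
    oom_report = 4096  # hypothetical size of one OOM report, in bytes
    for interval in (DEFAULT_RATELIMIT_INTERVAL_S,
                     DEFAULT_RATELIMIT_INTERVAL_S * 10):
        rate = allowed_bytes_per_sec(oom_report, interval,
                                     DEFAULT_RATELIMIT_BURST)
        print(interval, rate, console_bytes_per_sec(baud),
              keeps_up(oom_report, baud, interval, DEFAULT_RATELIMIT_BURST))
```

Under these assumptions the default limit admits more bytes per second
than a 57600 baud console can drain, while a 10x longer interval stays
well below it. A single short line stays within the console's capacity
with the defaults, which matches the remark that the defaults are tuned
for one-line messages.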