Date: Sat, 16 Jul 2022 01:27:31 +0000
Message-Id: <20220716012731.2zz7hpg3qbhwgeqd@google.com>
References: <20220629165542.da7fc8a2a5dbd53cf99572aa@linux-foundation.org> <20220629192435.df27c0dbb07ef72165e1de5e@linux-foundation.org>
Subject: Re: [RESEND RFC PATCH] epoll: autoremove wakers even more aggressively
From: Shakeel Butt
To: Andrew Morton
Cc: Benjamin Segall, Alexander Viro, linux-fsdevel, LKML, Linus Torvalds, Eric Dumazet, Roman Penyaev, Jason Baron, Khazhismel Kumykov, Heiher
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 30, 2022 at 07:59:05AM -0700, Shakeel Butt wrote:
> On Wed, Jun 29, 2022 at 7:24 PM Andrew Morton wrote:
> >
> > On Wed, 29 Jun 2022 18:12:46 -0700 Shakeel Butt wrote:
> >
> > > On Wed, Jun 29, 2022 at 4:55 PM Andrew Morton wrote:
> > > >
> > > > On Wed, 15 Jun 2022 14:24:23 -0700 Benjamin Segall wrote:
> > > >
> > > > > If a process is killed or otherwise exits while having active network
> > > > > connections and many threads waiting on epoll_wait, the threads will all
> > > > > be woken immediately, but not removed from ep->wq. Then when network
> > > > > traffic scans ep->wq in wake_up, every wakeup attempt will fail, and
> > > > > will not remove the entries from the list.
> > > > >
> > > > > This means that the cost of the wakeup attempt is far higher than usual,
> > > > > does not decrease, and this also competes with the dying threads trying
> > > > > to actually make progress and remove themselves from the wq.
> > > > >
> > > > > Handle this by removing visited epoll wq entries unconditionally, rather
> > > > > than only when the wakeup succeeds - the structure of ep_poll means that
> > > > > the only potential loss is the timed_out->eavail heuristic, which now
> > > > > can race and result in a redundant ep_send_events attempt. (But only
> > > > > when incoming data and a timeout actually race, not on every timeout)
> > > > >
> > > >
> > > > Thanks. I added people from 412895f03cbf96 ("epoll: atomically remove
> > > > wait entry on wake up") to cc. Hopefully someone there can help review
> > > > and maybe test this.
> > > >
> > > >
> > >
> > > Thanks Andrew. Just wanted to add that we are seeing this issue in
> > > production with real workloads and it has caused hard lockups.
> > > Particularly network heavy workloads with a lot of threads in
> > > epoll_wait() can easily trigger this issue if they get killed
> > > (oom-killed in our case).
> >
> > Hard lockups are undesirable. Is a cc:stable justified here?
>
> Not for now as I don't know if we can blame a patch which might be the
> source of this behavior.

I am able to repro the epoll hard lockup on next-20220715 with Ben's
patch reverted. The repro is a simple TCP server and tens of clients
communicating over loopback. To cause the hard lockup, though, I have to
create a couple thousand threads in epoll_wait() in the server and also
reduce kernel.watchdog_thresh. With Ben's patch applied, the repro does
not cause the hard lockup even with kernel.watchdog_thresh=1.

Please add:

Tested-by: Shakeel Butt
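
For readers following the thread, the effect described in the quoted patch description can be illustrated with a small userspace toy model. This is only a sketch, not kernel code: `Waiter`, `wake_up`, and `scan_costs` are hypothetical names invented here, a plain Python list stands in for ep->wq, and stopping after the first successful wake is a simplifying assumption. It counts how many queue entries each wakeup scan visits when every waiter has already been woken (for example by a kill), comparing removal only on success with unconditional removal.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    """Hypothetical stand-in for one thread's wait-queue entry."""
    name: str
    already_woken: bool = False  # True once something (e.g. a kill) woke the thread

    def try_wake(self) -> bool:
        """Return True only if this call is what woke the waiter."""
        if self.already_woken:
            return False
        self.already_woken = True
        return True


def wake_up(wq, remove_unconditionally):
    """Scan the queue once and return the number of entries visited."""
    visited = 0
    for waiter in list(wq):  # iterate over a copy so removal is safe
        visited += 1
        woke = waiter.try_wake()
        if woke or remove_unconditionally:
            wq.remove(waiter)
        if woke:
            break  # assumption: one successful wake ends the scan
    return visited


def scan_costs(n_waiters, n_events, remove_unconditionally):
    """Per-event scan cost when all waiters were already woken by a kill."""
    wq = [Waiter(f"t{i}", already_woken=True) for i in range(n_waiters)]
    return [wake_up(wq, remove_unconditionally) for _ in range(n_events)]


if __name__ == "__main__":
    print("remove on success only:", scan_costs(1000, 3, False))
    print("remove unconditionally:", scan_costs(1000, 3, True))
```

In this model, removal on success only leaves every later event scanning all 1000 stale entries, while unconditional removal pays that cost once and then finds an empty queue. It says nothing about lock contention or timing in the real kernel; it only shows why the per-wakeup cost "does not decrease" in the first case.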