Date: Fri, 15 Jul 2022 21:55:28 -0700
From: Andrew Morton
To: Shakeel Butt
Cc: Benjamin Segall, Alexander Viro, linux-fsdevel, LKML, Linus Torvalds,
    Eric Dumazet, Roman Penyaev, Jason Baron, Khazhismel Kumykov, Heiher
Subject: Re: [RESEND RFC PATCH] epoll: autoremove wakers even more aggressively
Message-Id: <20220715215528.213e9340e62df36320e89b22@linux-foundation.org>
In-Reply-To: <20220716012731.2zz7hpg3qbhwgeqd@google.com>
References: <20220629165542.da7fc8a2a5dbd53cf99572aa@linux-foundation.org>
    <20220629192435.df27c0dbb07ef72165e1de5e@linux-foundation.org>
    <20220716012731.2zz7hpg3qbhwgeqd@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 16 Jul 2022 01:27:31 +0000 Shakeel Butt wrote:

>
> ...
>
> > > > production with real workloads and it has caused hard lockups.
> > > > Particularly network heavy workloads with a lot of threads in
> > > > epoll_wait() can easily trigger this issue if they get killed
> > > > (oom-killed in our case).
> > >
> > > Hard lockups are undesirable. Is a cc:stable justified here?
> >
> > Not for now as I don't know if we can blame a patch which might be the
> > source of this behavior.
>
> I am able to repro the epoll hard lockup on next-20220715 with Ben's
> patch reverted. The repro is a simple TCP server and tens of clients
> communicating over loopback. Though to cause the hard lockup I have to
> create a couple thousand threads in epoll_wait() in server and also
> reduce the kernel.watchdog_thresh. With Ben's patch the repro does not
> cause the hard lockup even with kernel.watchdog.thresh=1.
>
> Please add:
>
> Tested-by: Shakeel Butt

OK, thanks. I added the cc:stable. No Fixes:, as it has presumably
been there for a long time, perhaps for all time.