Date: Mon, 18 Oct 2021 10:31:45 +0000
From: Jordan Glover
To: Yu Zhao
Cc: Rune Kleveland, "Eric W. Biederman", Alexey Gladkov, LKML, Linux-MM, containers@lists.linux-foundation.org
Reply-To: Jordan Glover
Subject: Re: [CFT][PATCH] ucounts: Fix signal ucount refcounting
X-Mailing-List: linux-kernel@vger.kernel.org

On Monday, October 18th, 2021 at 6:25 AM, Yu Zhao wrote:

> On Sun, Oct 17, 2021 at 10:47 AM Rune Kleveland
> rune.kleveland@infomedia.dk wrote:
>
> > Hi!
> >
> > After applying the below patch, the 5 most problematic servers have run
> > without any issues for 23 hours. That never happened before the patch on
> > 5.14, so the patch seems to have fixed the issue for me.
>
> Confirmed. I couldn't reproduce the problem on 5.14 either.

I'm also unable to reproduce the crash as of now. Thanks for the patch.

Jordan

> > On Monday there will be more load on the servers, which caused them to
> > crash faster without the patch. I will let you know if it happens again.
> >
> > Best regards,
> >
> > Rune
> >
> > On 16/10/2021 00:10, Eric W. Biederman wrote:
> >
> > > In commit fda31c50292a ("signal: avoid double atomic counter
> > > increments for user accounting") Linus made a clever optimization to
> > > how rlimits and the struct user_struct are handled. Unfortunately that
> > > optimization does not work in the obvious way when moved to nested
> > > rlimits. The problem is that the last decrement of the per user
> > > namespace per user sigpending counter might also be the last decrement
> > > of the sigpending counter in the parent user namespace as well, which
> > > means that simply freeing the leaf ucount in __free_sigqueue is not
> > > enough.
> > >
> > > Maintain the optimization and handle the tricky cases by introducing
> > > inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.
> > >
> > > By moving the entire optimization into functions that perform all of
> > > the work it becomes possible to ensure that every level is handled
> > > properly.
> > >
> > > I wish we had a single user across all of the threads whose rlimit
> > > could be charged so we did not need this complexity.
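
For readers following the thread, here is a rough user-space sketch of the idea described in the quoted commit message: charge a counter at every nesting level, take a reference at each level where the count goes from 0 to 1, and drop it at each level where the count returns to 0. This is not the kernel code. The names struct level, inc_get and dec_put are hypothetical stand-ins, and the model assumes a single thread and leaves out atomics, rlimit checks and error unwinding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one nesting level; not the kernel's struct ucounts. */
struct level {
    struct level *parent; /* NULL at the outermost level */
    long pending;         /* charged count at this level */
    long refs;            /* references keeping this level alive */
};

/*
 * Charge one unit at the leaf and at every ancestor. A level whose
 * count goes from 0 to 1 gets one extra reference.
 */
static void inc_get(struct level *leaf)
{
    for (struct level *it = leaf; it != NULL; it = it->parent) {
        if (++it->pending == 1)
            it->refs++;
    }
}

/*
 * Uncharge one unit at the leaf and at every ancestor. Each level is
 * checked on its own, because the last decrement at the leaf can also
 * be the last decrement at a parent.
 */
static void dec_put(struct level *leaf)
{
    struct level *next;

    for (struct level *it = leaf; it != NULL; it = next) {
        /* Read the parent first: a real put could free this level. */
        next = it->parent;
        assert(it->pending > 0);
        if (--it->pending == 0)
            it->refs--;
    }
}
```

With a root and one child that each start with a single reference, two inc_get calls on the child followed by two dec_put calls leave both levels back at one reference. Dropping only the leaf's reference on the final decrement would leave the root's extra reference behind, which is the case the commit message calls out.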