Date: Thu, 21 Jan 2021 13:04:27 +0100
From: Alexey Gladkov
To: "Eric W. Biederman"
Cc: Linus Torvalds, LKML, io-uring, Kernel Hardening, Linux Containers,
 Linux-MM, Andrew Morton, Christian Brauner, Jann Horn, Jens Axboe,
 Kees Cook, Oleg Nesterov
Subject: Re: [RFC PATCH v3 1/8] Use refcount_t for ucounts reference counting
Message-ID: <20210121120427.iiggfmw3tpsmyzeb@example.org>
In-Reply-To: <87eeig74kv.fsf@x220.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 19, 2021 at 07:57:36PM -0600, Eric W. Biederman wrote:
> Alexey Gladkov writes:
>
> > On Mon, Jan 18, 2021 at 12:34:29PM -0800, Linus Torvalds wrote:
> >> On Mon, Jan 18, 2021 at 11:46 AM Alexey Gladkov wrote:
> >> >
> >> > Sorry about that. I thought that this code is not needed when switching
> >> > from int to refcount_t. I was wrong.
> >>
> >> Well, you _may_ be right. I personally didn't check how the return
> >> value is used.
> >>
> >> I only reacted to "it certainly _may_ be used, and there is absolutely
> >> no comment anywhere about why it wouldn't matter".
> >
> > I have not found examples where the overflow is checked after calling
> > refcount_inc/refcount_add.
> >
> > For example, in kernel/fork.c:2298:
> >
> >     current->signal->nr_threads++;
> >     atomic_inc(&current->signal->live);
> >     refcount_inc(&current->signal->sigcnt);
> >
> > $ semind search signal_struct.sigcnt
> > def include/linux/sched/signal.h:83  refcount_t sigcnt;
> > m-- kernel/fork.c:723   put_signal_struct  if (refcount_dec_and_test(&sig->sigcnt))
> > m-- kernel/fork.c:1571  copy_signal        refcount_set(&sig->sigcnt, 1);
> > m-- kernel/fork.c:2298  copy_process       refcount_inc(&current->signal->sigcnt);
> >
> > It seems to me that the only way is to use __refcount_inc and then
> > compare the old value with REFCOUNT_MAX.
> >
> > Since I have not seen examples of such checks, I thought that this was
> > acceptable. Sorry once again. I have not tried to hide these changes.
>
> The current ucount code does check for overflow and fails the increment
> in every case.
>
> So arguably it will be a regression and inferior error handling behavior
> if the code switches to the ``better'' refcount_t data structure.
>
> I originally didn't use refcount_t because silently saturating and not
> bothering to handle the error makes me uncomfortable.
>
> Not having to acquire the ucounts_lock every time seems nice. Perhaps
> the path forward would be to start with stupid/correct code that always
> takes the ucounts_lock for every increment of ucounts->count, that is
> later replaced with something more optimal.
>
> Not impacting performance in the non-namespace cases and having good
> performance in the other cases is a fundamental requirement of merging
> code like this.

Did I understand your suggestion correctly: are you suggesting to take a
spin_lock around the atomic_read and atomic_inc? If so, we are already
incrementing the counter under ucounts_lock:

	...
	if (atomic_read(&ucounts->count) == INT_MAX)
		ucounts = NULL;
	else
		atomic_inc(&ucounts->count);
	spin_unlock_irq(&ucounts_lock);

	return ucounts;

Something like this?

-- 
Rgrds, legion
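[For readers following the thread: the trade-off under discussion — refcount_t's
silent saturate-and-leak behavior versus the explicit fail-the-increment check in
the snippet above — can be sketched in userspace C11 atomics. This is a simplified
illustration only; the helper names inc_saturating/try_inc_checked are made up for
this sketch, and the real kernel refcount_t pins the counter at an internal
REFCOUNT_SATURATED value and emits a warning rather than pinning at INT_MAX.]

	#include <limits.h>
	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stdio.h>

	/* refcount_t style: on overflow the counter saturates and the object
	 * is intentionally leaked; the caller gets no error to handle. */
	static void inc_saturating(atomic_int *cnt)
	{
		int old = atomic_fetch_add(cnt, 1);
		if (old == INT_MAX || old < 0)	/* overflowed: pin the counter */
			atomic_store(cnt, INT_MAX);
	}

	/* current ucount style: detect the limit up front and fail the
	 * increment, so the caller can propagate an error (in the kernel
	 * this check runs under ucounts_lock, as in the snippet above). */
	static bool try_inc_checked(atomic_int *cnt)
	{
		if (atomic_load(cnt) == INT_MAX)
			return false;		/* refuse; caller sees failure */
		atomic_fetch_add(cnt, 1);
		return true;
	}

	int main(void)
	{
		atomic_int a = INT_MAX;
		inc_saturating(&a);		/* silently saturates */
		printf("saturating: %d\n", atomic_load(&a));

		atomic_int b = INT_MAX;
		bool ok = try_inc_checked(&b);	/* increment is refused */
		printf("checked: ok=%d count=%d\n", ok, atomic_load(&b));
		return 0;
	}

[The difference is exactly the one Eric raises: the checked variant gives the
caller an error path, the saturating variant trades that away for a lock-free
fast path.]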