From: Wedson Almeida Filho
Date: Fri, 14 Apr 2023 06:00:57 -0300
Subject: Re: [PATCH v4 08/13] rust: introduce `ARef`
To: Benno Lossin
Cc: rust-for-linux@vger.kernel.org, Miguel Ojeda, Alex Gaynor, Boqun Feng,
    Gary Guo, Björn Roy Baron, linux-kernel@vger.kernel.org,
    Wedson Almeida Filho, Martin Rodriguez Reboredo
References: <20230411054543.21278-1-wedsonaf@gmail.com>
    <20230411054543.21278-8-wedsonaf@gmail.com>
    <9619d06c-d631-1edb-cf92-3a998e7b98f2@proton.me>
In-Reply-To: <9619d06c-d631-1edb-cf92-3a998e7b98f2@proton.me>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 13 Apr 2023 at 19:30, Benno Lossin
wrote:
>
> On 13.04.23 19:06, Wedson Almeida Filho wrote:
> > On Thu, 13 Apr 2023 at 06:19, Benno Lossin wrote:
> >>
> >> On 11.04.23 07:45, Wedson Almeida Filho wrote:
> >>> From: Wedson Almeida Filho
> >>>
> >>> This is an owned reference to an object that is always ref-counted. This
> >>> is meant to be used in wrappers for C types that have their own ref
> >>> counting functions, for example, tasks, files, inodes, dentries, etc.
> >>>
> >>> Reviewed-by: Martin Rodriguez Reboredo
> >>> Signed-off-by: Wedson Almeida Filho
> >>> ---
> >>> v1 -> v2: No changes
> >>> v2 -> v3: No changes
> >>> v3 -> v4: No changes
> >>>
> >>>  rust/kernel/types.rs | 107 +++++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 107 insertions(+)
> >>>
> >>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> >>> index a4b1e3778da7..29db59d6119a 100644
> >>> --- a/rust/kernel/types.rs
> >>> +++ b/rust/kernel/types.rs
> >>> @@ -6,8 +6,10 @@ use crate::init::{self, PinInit};
> >>>  use alloc::boxed::Box;
> >>>  use core::{
> >>>      cell::UnsafeCell,
> >>> +    marker::PhantomData,
> >>>      mem::MaybeUninit,
> >>>      ops::{Deref, DerefMut},
> >>> +    ptr::NonNull,
> >>>  };
> >>>
> >>>  /// Used to transfer ownership to and from foreign (non-Rust) languages.
> >>> @@ -268,6 +270,111 @@ impl<T> Opaque<T> {
> >>>      }
> >>>  }
> >>>
> >>> +/// Types that are _always_ reference counted.
> >>> +///
> >>> +/// It allows such types to define their own custom ref increment and decrement functions.
> >>> +/// Additionally, it allows users to convert from a shared reference `&T` to an owned reference
> >>> +/// [`ARef`].
> >>> +///
> >>> +/// This is usually implemented by wrappers to existing structures on the C side of the code. For
> >>> +/// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
> >>> +/// instances of a type.
> >>> +///
> >>> +/// # Safety
> >>> +///
> >>> +/// Implementers must ensure that increments to the reference count keep the object alive in memory
> >>> +/// at least until matching decrements are performed.
> >>> +///
> >>> +/// Implementers must also ensure that all instances are reference-counted. (Otherwise they
> >>> +/// won't be able to honour the requirement that [`AlwaysRefCounted::inc_ref`] keep the object
> >>> +/// alive.)
> >>
> >> `dec_ref` states below that it 'Frees the object when the count reaches
> >> zero.', this should also be stated here, since implementers should adhere
> >> to that when implementing `dec_ref`.
> >
> > This section is for safety requirements. Freeing the object doesn't
> > fall into this category.
>
> It still needs to be upheld by the implementer, since it is guaranteed by
> the documentation on the `dec_ref` function. Even non-safety requirements
> are listed on the `unsafe` traits, if users should be able to rely on them.
> If users should not rely on this, then maybe change the docs of `dec_ref`
> to "when the refcount reaches zero, the object might be freed.".

I disagree that non-safety requirements should be listed under the
Safety section. This section is meant for rules that implementers must
adhere to to ensure their implementations are safe. So it's usually
read before writing a "SAFETY:" comment for their "unsafe impl" blocks
-- adding extraneous information is counterproductive.

> >>> +pub unsafe trait AlwaysRefCounted {
> >>> +    /// Increments the reference count on the object.
> >>> +    fn inc_ref(&self);
> >>
> >>
> >>
> >>> +
> >>> +    /// Decrements the reference count on the object.
> >>> +    ///
> >>> +    /// Frees the object when the count reaches zero.
> >>> +    ///
> >>> +    /// # Safety
> >>> +    ///
> >>> +    /// Callers must ensure that there was a previous matching increment to the reference count,
> >>> +    /// and that the object is no longer used after its reference count is decremented (as it may
> >>> +    /// result in the object being freed), unless the caller owns another increment on the refcount
> >>> +    /// (e.g., it calls [`AlwaysRefCounted::inc_ref`] twice, then calls
> >>> +    /// [`AlwaysRefCounted::dec_ref`] once).
> >>> +    unsafe fn dec_ref(obj: NonNull<Self>);
> >>> +}
> >>> +
> >>> +/// An owned reference to an always-reference-counted object.
> >>> +///
> >>> +/// The object's reference count is automatically decremented when an instance of [`ARef`] is
> >>> +/// dropped. It is also automatically incremented when a new instance is created via
> >>> +/// [`ARef::clone`].
> >>> +///
> >>> +/// # Invariants
> >>> +///
> >>> +/// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> >>> +/// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> >>> +pub struct ARef<T: AlwaysRefCounted> {
> >>> +    ptr: NonNull<T>,
> >>> +    _p: PhantomData<T>,
> >>> +}
> >>> +
> >>> +impl<T: AlwaysRefCounted> ARef<T> {
> >>> +    /// Creates a new instance of [`ARef`].
> >>> +    ///
> >>> +    /// It takes over an increment of the reference count on the underlying object.
> >>> +    ///
> >>> +    /// # Safety
> >>> +    ///
> >>> +    /// Callers must ensure that the reference count was incremented at least once, and that they
> >>> +    /// are properly relinquishing one increment. That is, if there is only one increment, callers
> >>> +    /// must not use the underlying object anymore -- it is only safe to do so via the newly
> >>> +    /// created [`ARef`].
> >>> +    pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
> >>> +        // INVARIANT: The safety requirements guarantee that the new instance now owns the
> >>> +        // increment on the refcount.
> >>> +        Self {
> >>> +            ptr,
> >>> +            _p: PhantomData,
> >>> +        }
> >>> +    }
> >>> +}
> >>> +
> >>> +impl<T: AlwaysRefCounted> Clone for ARef<T> {
> >>> +    fn clone(&self) -> Self {
> >>> +        self.inc_ref();
> >>> +        // SAFETY: We just incremented the refcount above.
> >>> +        unsafe { Self::from_raw(self.ptr) }
> >>> +    }
> >>> +}
> >>> +
> >>> +impl<T: AlwaysRefCounted> Deref for ARef<T> {
> >>> +    type Target = T;
> >>> +
> >>> +    fn deref(&self) -> &Self::Target {
> >>> +        // SAFETY: The type invariants guarantee that the object is valid.
> >>> +        unsafe { self.ptr.as_ref() }
> >>> +    }
> >>> +}
> >>> +
> >>> +impl<T: AlwaysRefCounted> From<&T> for ARef<T> {
> >>> +    fn from(b: &T) -> Self {
> >>> +        b.inc_ref();
> >>> +        // SAFETY: We just incremented the refcount above.
> >>> +        unsafe { Self::from_raw(NonNull::from(b)) }
> >>> +    }
> >>> +}
> >>
> >> This impl seems unsound to me, as we can do this:
> >>
> >>     struct MyStruct {
> >>         raw: Opaque, // This has a `refcount_t` inside.
> >>     }
> >>
> >>     impl MyStruct {
> >>         fn new() -> Self { ... }
> >>     }
> >>
> >>     unsafe impl AlwaysRefCounted for MyStruct { ... } // Implemented correctly.
> >>
> >>     fn evil() -> ARef<MyStruct> {
> >>         let my_struct = MyStruct::new();
> >>         ARef::from(&my_struct) // We return a pointer to the stack!
> >>     }
> >>
> >> similarly, this can also be done with a `Box`:
> >>
> >>     fn evil2() -> ARef<MyStruct> {
> >>         let my_struct = Box::new(MyStruct::new());
> >>         ARef::from(&*my_struct)
> >>         // Box is freed here, even just dropping the `ARef` will result in
> >>         // a UAF.
> >>     }
> >
> > This implementation of `AlwaysRefCounted` is in violation of the
> > safety requirements of the trait, namely:
> >
> > /// Implementers must ensure that increments to the reference count keep the object alive in memory
> > /// at least until matching decrements are performed.
> > ///
> > /// Implementers must also ensure that all instances are reference-counted. (Otherwise they
> > /// won't be able to honour the requirement that [`AlwaysRefCounted::inc_ref`] keep the object
> > /// alive.)
> >
> > It boils down to `MyStruct::new` in your example. It's not refcounted.
> >
> >> Additionally, I think that `AlwaysRefCounted::inc_ref` should not be safe,
> >> as the caller must not deallocate the memory until the refcount is zero.
> >
> > The existence of an `&T` is evidence that the refcount is non-zero, so
> > it is safe to increment it. The caller cannot free the object without
> > violating the safety requirements.
> >
> >> Another pitfall of `ARef`: it does not deallocate the memory when the
> >> refcount reaches zero. People might expect that this code would not leak
> >> memory:
> >>
> >>     let foo = Box::try_new(Foo::new())?;
> >>     let foo = Box::leak(foo); // Leak the box, such that we do not
> >>                               // deallocate the memory too early.
> >>     let foo = ARef::from(foo);
> >>     drop(foo); // refcount is now zero, but the memory is never deallocated.
> >
> > This is also in violation of the safety requirements of `AlwaysRefCounted`.
>
> It seems I have misunderstood the term "always reference counted".
> We should document this in more detail, since this places a lot of
> constraints on the implementers:
>
> Implementing `AlwaysRefCounted` for `T` places the following constraint on shared references `&T`:
> - Every `&T` points to memory that is not deallocated until the reference count reaches zero.
> - The existence of `&T` proves that the reference count is at least 1.

This is implied by the existing safety rules.

> This has some important consequences:
> - Exposing a safe way to get `T` is not allowed, since stack allocations are freed when the scope
>   ends even though the reference count is non-zero.

Stack allocations are ok, as long as they wait for the refcount to
drop to zero before the variable goes out of scope.

> - Similarly giving safe access to `Box` or other smart pointers is not allowed, since a `Box` can
>   be freed independent from the reference count.

`ARef` is a smart pointer and it is definitely allowed.
Similarly to the stack allocations I mentioned above, a `Box`
implementation is conceivable as long as it ensures that the allocation
is freed only once the refcount reaches zero, for example, by having a
drop implementation that performs such a wait. (IOW, when `Box` goes
out of scope, it always calls `drop` on `T` before actually freeing the
memory, so this implementation could block until it is safe to do so,
i.e., until the refcount reaches zero.)

> This type is intended to be implemented for C types that embed a `refcount_t` and that are both
> created and destroyed by C. Static references also work with this type, since they stay live
> indefinitely.

Embedding a `refcount_t` is not a requirement. I already mention in
the documentation that this is usually used for C structs and that
Rust code should use `Arc`.

> Implementers must also ensure that they never give out `&mut T`, since
> - it can be reborrowed as `&T`,
> - converted to `ARef`,
> - which can yield a `&T` that is alive at the same time as the `&mut T`.

I agree with this. And I don't think this is a direct consequence of
the safety requirements, so I think it makes sense to add something
that covers this.

> >>> +
> >>> +impl<T: AlwaysRefCounted> Drop for ARef<T> {
> >>> +    fn drop(&mut self) {
> >>> +        // SAFETY: The type invariants guarantee that the `ARef` owns the reference we're about to
> >>> +        // decrement.
> >>> +        unsafe { T::dec_ref(self.ptr) };
> >>> +    }
> >>> +}
> >>> +
> >>>  /// A sum type that always holds either a value of type `L` or `R`.
> >>>  pub enum Either<L, R> {
> >>>      /// Constructs an instance of [`Either`] containing a value of type `L`.
> >>> --
> >>> 2.34.1
> >>>
> >>
>
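
For readers following the thread, the contract argued over above can be
sketched in userspace Rust. This is a minimal, hypothetical model and not
the kernel code: `Counted`, `live_objects`, the `LIVE` counter and the
`Cell`-based refcount are made up for illustration (kernel wrappers keep
the C struct inside `Opaque` and rely on the C side's own refcounting).
It tries to show the point made about `MyStruct::new`: the only
constructor hands out an `ARef`, so a `&Counted` can never point at a
stack value or a `Box`-owned value, and `dec_ref` frees the heap
allocation once the count reaches zero.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;
use std::sync::atomic::{AtomicUsize, Ordering};

// Same shape as the trait in the patch; the safety contract is the one quoted in the thread.
pub unsafe trait AlwaysRefCounted {
    fn inc_ref(&self);
    unsafe fn dec_ref(obj: NonNull<Self>);
}

pub struct ARef<T: AlwaysRefCounted> {
    ptr: NonNull<T>,
    _p: PhantomData<T>,
}

impl<T: AlwaysRefCounted> ARef<T> {
    pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
        Self { ptr, _p: PhantomData }
    }
}

impl<T: AlwaysRefCounted> Clone for ARef<T> {
    fn clone(&self) -> Self {
        self.inc_ref();
        // SAFETY: We just incremented the refcount above.
        unsafe { Self::from_raw(self.ptr) }
    }
}

impl<T: AlwaysRefCounted> Deref for ARef<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: The `ARef` owns an increment, so the object is alive.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T: AlwaysRefCounted> From<&T> for ARef<T> {
    fn from(b: &T) -> Self {
        b.inc_ref();
        // SAFETY: We just incremented the refcount above.
        unsafe { Self::from_raw(NonNull::from(b)) }
    }
}

impl<T: AlwaysRefCounted> Drop for ARef<T> {
    fn drop(&mut self) {
        // SAFETY: The `ARef` owns the increment it is about to give up.
        unsafe { T::dec_ref(self.ptr) };
    }
}

// Hypothetical bookkeeping so the demo can observe when an object is freed.
static LIVE: AtomicUsize = AtomicUsize::new(0);

pub fn live_objects() -> usize {
    LIVE.load(Ordering::Relaxed)
}

// Hypothetical refcounted type (single-threaded). All fields sit in `Cell`s,
// loosely mirroring how kernel wrappers keep the C struct inside `Opaque`.
pub struct Counted {
    refs: Cell<usize>,
    value: Cell<u32>,
}

impl Counted {
    // The only constructor: heap-allocates and returns an `ARef`, never a
    // `Counted` by value, so every instance really is reference-counted.
    pub fn new(value: u32) -> ARef<Counted> {
        let obj = Box::new(Counted { refs: Cell::new(1), value: Cell::new(value) });
        LIVE.fetch_add(1, Ordering::Relaxed);
        // SAFETY: The count starts at 1 and that increment is handed to the `ARef`.
        unsafe { ARef::from_raw(NonNull::from(Box::leak(obj))) }
    }

    pub fn refs(&self) -> usize {
        self.refs.get()
    }

    pub fn value(&self) -> u32 {
        self.value.get()
    }
}

unsafe impl AlwaysRefCounted for Counted {
    fn inc_ref(&self) {
        self.refs.set(self.refs.get() + 1);
    }

    unsafe fn dec_ref(obj: NonNull<Self>) {
        // SAFETY: The caller owns an increment, so the object is still alive.
        let left = unsafe {
            let refs = &obj.as_ref().refs;
            refs.set(refs.get() - 1);
            refs.get()
        };
        if left == 0 {
            LIVE.fetch_sub(1, Ordering::Relaxed);
            // SAFETY: The allocation came from `Box` in `new`, and no increments remain.
            drop(unsafe { Box::from_raw(obj.as_ptr()) });
        }
    }
}
```

Under these assumptions `ARef::from(&*a)` is sound for any `a` obtained
from `Counted::new`, because the existence of a `&Counted` implies a
live heap allocation with a non-zero count. A type with a safe
`fn new() -> Self`, like `MyStruct` in the thread, cannot make that
promise.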