Received: by 2002:ab2:6857:0:b0:1ef:ffd0:ce49 with SMTP id l23csp472037lqp; Thu, 21 Mar 2024 06:56:45 -0700 (PDT) X-Forwarded-Encrypted: i=2; AJvYcCUUwTeJjwflHmN863KNnz/z22CjBjMP5JI1z1/Mln79yReXKaeNdY9YAV20NsOEGzvFdXsexdZXxSY6WrWtPBTNVYRNTBkcdnMhEtY40w== X-Google-Smtp-Source: AGHT+IGUOZkHv6RspxchfMvOqj9Jpxnrb3eBmYigmPRnHed292zNYT9bqmZATfFWvmeY37PdkplF X-Received: by 2002:a05:622a:1786:b0:430:f76c:3c4a with SMTP id s6-20020a05622a178600b00430f76c3c4amr6056175qtk.4.1711029405144; Thu, 21 Mar 2024 06:56:45 -0700 (PDT) Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id g7-20020ac85d47000000b00430e2e878f7si7087722qtx.660.2024.03.21.06.56.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Mar 2024 06:56:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-110115-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@proton.me header.s=protonmail header.b=CwO5JKKg; arc=fail (body hash mismatch); spf=pass (google.com: domain of linux-kernel+bounces-110115-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-110115-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=QUARANTINE) header.from=proton.me Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id C75591C21460 for ; Thu, 21 Mar 2024 13:56:44 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 3619E84FAB; Thu, 21 Mar 2024 13:56:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=proton.me header.i=@proton.me header.b="CwO5JKKg" Received: from mail-0201.mail-europe.com (mail-0201.mail-europe.com [51.77.79.158]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA32184A54 for ; Thu, 21 Mar 2024 13:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=51.77.79.158 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711029397; cv=none; b=fe1WdVqL4+qPer2xHC8D3VEZIeFRn/AYRF3hkGB6H8IwOLiBrapQf2P6m+lkiVYAAzm1EEnuX8CM2NIqEzdkn5xEwlxe/p08lNK4BrAFiZ4GnIraocafcLaiC+1OLWdK0r65rEiKwffDVNdvRPXY0JNf1R141e2wR1JE57/G8C0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711029397; c=relaxed/simple; bh=qXTQzXE42jIqU86dY8JNyjmzbaiF6mqxNVesECoZkAk=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cMNfsAWw/lEdCJBfF8fv1DofDDdVdHl5zpBwyy7tsaDGphpGljz8Ls224fCxJrmY1blNmXbOPT5UGdmO/Uy8TBcVCimtE2gkpe9KhlpA0vdMmvvB67VWEaH78C3hJ8QxC0EdcWMI/tzzHPnlauVKjTo38DKp7HgFQo4gKJyIHQ4= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=proton.me; spf=pass smtp.mailfrom=proton.me; dkim=pass (2048-bit key) header.d=proton.me header.i=@proton.me header.b=CwO5JKKg; arc=none smtp.client-ip=51.77.79.158 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=proton.me Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=proton.me DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=proton.me; s=protonmail; t=1711029385; x=1711288585; bh=+DZVzwGA3DNwg1DykuRDpqsopt/Gmedd14krbU6mhdE=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector; b=CwO5JKKgXVn+JoGs/oxfHnLuOyuxpDFkQiKIHtgtT472HlueJhv0iSK590fc+QdMr 4ZZvdyJfubOA8rbBKo2cbbYcbxRANSBUeL3kxYYKUMv1Rk/BeIsHh1JwT4ZstA6VRF czmYQzARvgK0c39fgn0Dfq27/7/VF+zelB8iskbVAoCPdmaNc7zJlgYCpl1S8hBUir /dq64fhB4jQaqrCMX+tH/8OPGLRcHV5qfhVLhl90sh9WXwfSFDWPARDMrXdFqb0CQR adGovePRsjpcc92pFMbrvFVfk7tKdAkzrmDzUX/9LAFBzAkYwRGSZE3gfvKg6rA/XI /kG4xT5Y5ceOQ== Date: Thu, 21 Mar 2024 13:56:11 +0000 To: Alice Ryhl From: Benno Lossin Cc: a.hindborg@samsung.com, akpm@linux-foundation.org, alex.gaynor@gmail.com, arnd@arndb.de, arve@android.com, bjorn3_gh@protonmail.com, boqun.feng@gmail.com, brauner@kernel.org, cmllamas@google.com, gary@garyguo.net, gregkh@linuxfoundation.org, joel@joelfernandes.org, keescook@chromium.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, maco@android.com, ojeda@kernel.org, rust-for-linux@vger.kernel.org, surenb@google.com, tkjos@android.com, viro@zeniv.linux.org.uk, wedsonaf@gmail.com, willy@infradead.org Subject: Re: [PATCH v3 4/4] rust: add abstraction for `struct page` Message-ID: In-Reply-To: References: <20240320084630.2727355-1-aliceryhl@google.com> <9ed03148-5ceb-40f2-9c2d-31e2b8918888@proton.me> Feedback-ID: 71780778:user:proton Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 3/21/24 14:42, Alice Ryhl wrote: > On Thu, Mar 21, 2024 at 2:16=E2=80=AFPM Benno Lossin wrote: >> >> On 3/20/24 09:46, Alice Ryhl wrote: >>>> On 3/11/24 11:47, Alice Ryhl wrote: >>>>> +/// A pointer to a page that owns the page allocation. >>>>> +/// >>>>> +/// # Invariants >>>>> +/// >>>>> +/// The pointer points at a page, and has ownership over the page. >>>> >>>> Why not "`page` is valid"? >>>> Do you mean by ownership of the page that `page` has ownership of the >>>> allocation, or does that entail any other property/privilege? >>> >>> I can add "at a valid page". >> >> I don't think that helps, what you need as an invariant is that the >> pointer is valid. >=20 > To me "points at a page" implies that the pointer is valid. I mean, if > it was dangling, it would not point at a page? >=20 > But I can reword to something else if you have a preferred phrasing. I would just say "`page` is valid" or "`self.page` is valid". >>>>> + /// Runs a piece of code with this page mapped to an address. >>>>> + /// >>>>> + /// The page is unmapped when this call returns. >>>>> + /// >>>>> + /// It is up to the caller to use the provided raw pointer corre= ctly. >>>> >>>> This says nothing about what 'correctly' means. What I gathered from t= he >>>> implementation is that the supplied pointer is valid for the execution >>>> of `f` for `PAGE_SIZE` bytes. >>>> What other things are you allowed to rely upon? >>>> >>>> Is it really OK for this function to be called from multiple threads? >>>> Could that not result in the same page being mapped multiple times? If >>>> that is fine, what about potential data races when two threads write t= o >>>> the pointer given to `f`? >>>> >>>>> + pub fn with_page_mapped(&self, f: impl FnOnce(*mut u8) -> T) = -> T { >>> >>> I will say: >>> >>> /// It is up to the caller to use the provided raw pointer correctly. >>> /// The pointer is valid for `PAGE_SIZE` bytes and for the duration in >>> /// which the closure is called. Depending on the gfp flags and kernel >>> /// configuration, the pointer may only be mapped on the current thread= , >>> /// and in those cases, dereferencing it on other threads is UB. Other >>> /// than that, the usual rules for dereferencing a raw pointer apply. >>> /// (E.g., don't cause data races, the memory may be uninitialized, and >>> /// so on.) >> >> I would simplify and drop "depending on the gfp flags and kernel..." and >> just say that the pointer is only valid on the current thread. >=20 > Sure, that works for me. >=20 >> Also would it make sense to make the pointer type *mut [u8; PAGE_SIZE]? >=20 > I think it's a trade-off. That makes the code more error-prone, since > `pointer::add` now doesn't move by a number of bytes, but a number of > pages. Yeah. As long as you document that the pointer is valid for r/w with offsets in `0..PAGE_SIZE` bytes, leaving the type as is, is fine by me. >>> It's okay to map it multiple times from different threads. >> >> Do you still need to take care of data races? >> So would it be fine to execute this code on two threads in parallel? >> >> static PAGE: Page =3D ...; // assume we have a page accessible by = both threads >> >> PAGE.with_page_mapped(|ptr| { >> loop { >> unsafe { ptr.write(0) }; >> pr_info!("{}", unsafe { ptr.read() }); >> } >> }); >=20 > Like I said, the usual pointer rules apply. Two threads can access it > in parallel as long as one of the following are satisfied: >=20 > * Both accesses are reads. > * Both accesses are atomic. > * They access disjoint byte ranges. >=20 > Other than the fact that it uses a thread-local mapping on machines > that can't address all of their memory at the same time, it's > completely normal memory. It's literally just a PAGE_SIZE-aligned > allocation of PAGE_SIZE bytes. Thanks for the info, what do you think of this?: /// It is up to the caller to use the provided raw pointer correctly. The p= ointer is valid for reads /// and writes for `PAGE_SIZE` bytes and for the duration in which the clos= ure is called. The /// pointer must only be used on the current thread. The caller must also e= nsure that no data races /// occur: when mapping the same page on two threads accesses to memory wit= h the same offset must be /// synchronized. >=20 >> If this is not allowed, I don't really like the API. As a raw version it >> would be fine, but I think we should have a safer version (eg by taking >> `&mut self`). >=20 > I don't understand what you mean. It is the *most* raw API that `Page` > has. I can make them private if you want me to. The API cannot take > `&mut self` because I need to be able to unsafely perform concurrent > writes to disjoint byte ranges. If you don't need these functions to be public, I think we should definitely make them private. Also we could add a `raw` suffix to the functions to make it clear that it is a primitive API. If you think that it is highly unlikely that we get a safer version, then I don't think there is value in adding the suffix. --=20 Cheers, Benno