Date: Wed, 03 May 2023 12:31:43 +0000
To: Andreas Hindborg
From: Benno Lossin
Cc: Jens Axboe, Christoph Hellwig, Keith Busch,
    Damien Le Moal, Hannes Reinecke, lsf-pc@lists.linux-foundation.org,
    rust-for-linux@vger.kernel.org, linux-block@vger.kernel.org,
    Andreas Hindborg, Matthew Wilcox, Miguel Ojeda, Alex Gaynor,
    Wedson Almeida Filho, Boqun Feng, Gary Guo, Björn Roy Baron,
    linux-kernel@vger.kernel.org, gost.dev@samsung.com
Subject: Re: [RFC PATCH 02/11] rust: add `pages` module for handling page allocation
In-Reply-To: <20230503090708.2524310-3-nmi@metaspace.dk>
References: <20230503090708.2524310-1-nmi@metaspace.dk> <20230503090708.2524310-3-nmi@metaspace.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03.05.23 11:06, Andreas Hindborg wrote:
> From: Andreas Hindborg
>
> This patch adds support for working with pages of order 0. Support for pages
> with higher order is deferred. Page allocation flags are fixed in this patch.
> Future work might allow the user to specify allocation flags.
>
> This patch is a heavily modified version of code available in the rust tree [1],
> primarily adding support for multiple page mapping strategies.
>
> [1] https://github.com/rust-for-Linux/linux/tree/bc22545f38d74473cfef3e9fd65432733435b79f/rust/kernel/pages.rs
>
> Signed-off-by: Andreas Hindborg
> ---
>  rust/helpers.c       |  31 +++++
>  rust/kernel/lib.rs   |   6 +
>  rust/kernel/pages.rs | 284 +++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 321 insertions(+)
>  create mode 100644 rust/kernel/pages.rs
>
> diff --git a/rust/helpers.c b/rust/helpers.c
> index 5dd5e325b7cc..9bd9d95da951 100644
> --- a/rust/helpers.c
> +++ b/rust/helpers.c
> @@ -27,6 +27,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  __noreturn void rust_helper_BUG(void)
>  {
> @@ -150,6 +151,36 @@ void **rust_helper_radix_tree_next_slot(void **slot,
>  }
>  EXPORT_SYMBOL_GPL(rust_helper_radix_tree_next_slot);
>
> +void *rust_helper_kmap(struct page *page)
> +{
> +	return kmap(page);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kmap);
> +
> +void rust_helper_kunmap(struct page *page)
> +{
> +	return kunmap(page);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kunmap);
> +
> +void *rust_helper_kmap_atomic(struct page *page)
> +{
> +	return kmap_atomic(page);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kmap_atomic);
> +
> +void rust_helper_kunmap_atomic(void *address)
> +{
> +	kunmap_atomic(address);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_kunmap_atomic);
> +
> +struct page *rust_helper_alloc_pages(gfp_t gfp_mask, unsigned int order)
> +{
> +	return alloc_pages(gfp_mask, order);
> +}
> +EXPORT_SYMBOL_GPL(rust_helper_alloc_pages);
> +
>  /*
>   * We use `bindgen`'s `--size_t-is-usize` option to bind the C `size_t` type
>   * as the Rust `usize` type, so we can use it in contexts where Rust
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index a85cb6aae8d6..8bef6686504b 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -38,6 +38,7 @@ mod build_assert;
>  pub mod error;
>  pub mod init;
>  pub mod ioctl;
> +pub mod pages;
>  pub mod prelude;
>  pub mod print;
>  pub mod radix_tree;
> @@ -57,6 +58,11 @@ pub use uapi;
>  #[doc(hidden)]
>  pub use build_error::build_error;
>
> +/// Page size defined in terms of the `PAGE_SHIFT` macro from C.
> #@
> #@ `PAGE_SHIFT` is not using a doc-link.
> #@
> +///
> +/// [`PAGE_SHIFT`]: ../../../include/asm-generic/page.h
> +pub const PAGE_SIZE: u32 = 1 << bindings::PAGE_SHIFT;
> #@
> #@ This should be of type `usize`.
> #@
> +
>  /// Prefix to appear before log messages printed from within the `kernel` crate.
>  const __LOG_PREFIX: &[u8] = b"rust_kernel\0";
>
> diff --git a/rust/kernel/pages.rs b/rust/kernel/pages.rs
> new file mode 100644
> index 000000000000..ed51b053dd5d
> --- /dev/null
> +++ b/rust/kernel/pages.rs
> @@ -0,0 +1,284 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Kernel page allocation and management.
> +//!
> +//! This module currently provides limited support. It supports pages of order 0
> +//! for most operations. Page allocation flags are fixed.
> +
> +use crate::{bindings, error::code::*, error::Result, PAGE_SIZE};
> +use core::{marker::PhantomData, ptr};
> +
> +/// A set of physical pages.
> +///
> +/// `Pages` holds a reference to a set of pages of order `ORDER`. Having the order as a generic
> +/// const allows the struct to have the same size as a pointer.
> #@
> #@ I would remove the 'Having the order as a...' sentence, since that is
> #@ just an implementation detail.
> #@
> +///
> +/// # Invariants
> +///
> +/// The pointer `Pages::pages` is valid and points to 2^ORDER pages.
> #@
> #@ `Pages::pages` -> `pages`.
> #@
> +pub struct Pages<const ORDER: u32> {
> +    pub(crate) pages: *mut bindings::page,
> +}
> +
> +impl<const ORDER: u32> Pages<ORDER> {
> +    /// Allocates a new set of contiguous pages.
> +    pub fn new() -> Result<Self> {
> +        let pages = unsafe {
> +            bindings::alloc_pages(
> +                bindings::GFP_KERNEL | bindings::__GFP_ZERO | bindings::___GFP_HIGHMEM,
> +                ORDER,
> +            )
> +        };
> #@
> #@ Missing `SAFETY` comment.
> #@
> +        if pages.is_null() {
> +            return Err(ENOMEM);
> +        }
> +        // INVARIANTS: We checked that the allocation above succeeded.
> +        // SAFETY: We allocated pages above
> +        Ok(unsafe { Self::from_raw(pages) })
> +    }
> +
> +    /// Create a `Pages` from a raw `struct page` pointer
> +    ///
> +    /// # Safety
> +    ///
> +    /// Caller must own the pages pointed to by `ptr` as these will be freed
> +    /// when the returned `Pages` is dropped.
> +    pub unsafe fn from_raw(ptr: *mut bindings::page) -> Self {
> +        Self { pages: ptr }
> +    }
> +}
> +
> +impl Pages<0> {
> +    #[inline(always)]
> #@
> #@ Is this really needed? I think this function should be inlined
> #@ automatically.
> #@
> +    fn check_offset_and_map<I: MappingInfo>(
> +        &self,
> +        offset: usize,
> +        len: usize,
> +    ) -> Result<PageMapping<'_, I>>
> +    where
> +        Pages<0>: MappingActions<I>,
> #@
> #@ Why not use `Self: MappingActions<I>`?
> #@
> +    {
> +        let end = offset.checked_add(len).ok_or(EINVAL)?;
> +        if end as u32 > PAGE_SIZE {
> #@
> #@ Remove the `as u32`, since `PAGE_SIZE` should be of type `usize`.
> #@
> +            return Err(EINVAL);
> #@
> #@ I think it would make sense to create a more descriptive Rust error with
> #@ a `From` impl to turn it into an `Error`. It is always better to know from
> #@ the signature what exactly can go wrong when calling a function.
> #@
> +        }
> +
> +        let mapping = <Self as MappingActions<I>>::map(self);
> +
> +        Ok(mapping)
> #@
> #@ I would merge these lines.
> #@
> +    }
> +
> +    #[inline(always)]
> +    unsafe fn read_internal<I: MappingInfo>(
> #@
> #@ Missing `# Safety` section.
> #@
> +        &self,
> +        dest: *mut u8,
> +        offset: usize,
> +        len: usize,
> +    ) -> Result
> +    where
> +        Pages<0>: MappingActions<I>,
> +    {
> +        let mapping = self.check_offset_and_map::<I>(offset, len)?;
> +
> +        unsafe { ptr::copy_nonoverlapping((mapping.ptr as *mut u8).add(offset), dest, len) };
> #@
> #@ Missing `SAFETY` comment. Replace `as *mut u8` with `.cast::<u8>()`.
> #@
> +        Ok(())
> +    }
> +
> +    /// Maps the pages and reads from them into the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the destination buffer is valid for the given
> +    /// length. Additionally, if the raw buffer is intended to be recast, they
> +    /// must ensure that the data can be safely cast;
> +    /// [`crate::io_buffer::ReadableFromBytes`] has more details about it.
> +    /// `dest` may not point to the source page.
> #@
> #@ - `dest` is valid for writes for `len`.
> #@ - What is meant by 'the raw buffer is intended to be recast'?
> #@ - `io_buffer` does not yet exist in `rust-next`.
> #@
> +    #[inline(always)]
> +    pub unsafe fn read(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
> +        unsafe { self.read_internal::<NormalMappingInfo>(dest, offset, len) }
> #@
> #@ Missing `SAFETY` comment.
> #@
> +    }
> +
> +    /// Maps the pages and reads from them into the given buffer. The page is
> +    /// mapped atomically.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the destination buffer is valid for the given
> +    /// length. Additionally, if the raw buffer is intended to be recast, they
> +    /// must ensure that the data can be safely cast;
> +    /// [`crate::io_buffer::ReadableFromBytes`] has more details about it.
> +    /// `dest` may not point to the source page.
> +    #[inline(always)]
> +    pub unsafe fn read_atomic(&self, dest: *mut u8, offset: usize, len: usize) -> Result {
> +        unsafe { self.read_internal::<AtomicMappingInfo>(dest, offset, len) }
> #@
> #@ Missing `SAFETY` comment.
> #@
> +    }
> +
> +    #[inline(always)]
> +    unsafe fn write_internal<I: MappingInfo>(
> #@
> #@ Missing `# Safety` section.
> #@
> +        &self,
> +        src: *const u8,
> +        offset: usize,
> +        len: usize,
> +    ) -> Result
> +    where
> +        Pages<0>: MappingActions<I>,
> +    {
> +        let mapping = self.check_offset_and_map::<I>(offset, len)?;
> +
> +        unsafe { ptr::copy_nonoverlapping(src, (mapping.ptr as *mut u8).add(offset), len) };
> #@
> #@ Missing `SAFETY` comment.
> #@
> +        Ok(())
> +    }
> +
> +    /// Maps the pages and writes into them from the given buffer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the buffer is valid for the given length.
> +    /// Additionally, if the page is (or will be) mapped by userspace, they must
> +    /// ensure that no kernel data is leaked through padding if it was cast from
> +    /// another type; [`crate::io_buffer::WritableToBytes`] has more details
> +    /// about it. `src` must not point to the destination page.
> #@
> #@ `src` is valid for reads for `len`.
> #@
> +    #[inline(always)]
> +    pub unsafe fn write(&self, src: *const u8, offset: usize, len: usize) -> Result {
> +        unsafe { self.write_internal::<NormalMappingInfo>(src, offset, len) }
> +    }
> +
> +    /// Maps the pages and writes into them from the given buffer. The page is
> +    /// mapped atomically.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the buffer is valid for the given length.
> +    /// Additionally, if the page is (or will be) mapped by userspace, they must
> +    /// ensure that no kernel data is leaked through padding if it was cast from
> +    /// another type; [`crate::io_buffer::WritableToBytes`] has more details
> +    /// about it. `src` must not point to the destination page.
> +    #[inline(always)]
> +    pub unsafe fn write_atomic(&self, src: *const u8, offset: usize, len: usize) -> Result {
> +        unsafe { self.write_internal::<AtomicMappingInfo>(src, offset, len) }
> +    }
> +
> +    /// Maps the page at index 0.
> +    #[inline(always)]
> +    pub fn kmap(&self) -> PageMapping<'_, NormalMappingInfo> {
> +        let ptr = unsafe { bindings::kmap(self.pages) };
> #@
> #@ Missing `SAFETY` comment.
> #@
> +
> +        PageMapping {
> +            page: self.pages,
> +            ptr,
> +            _phantom: PhantomData,
> +            _phantom2: PhantomData,
> +        }
> +    }
> +
> +    /// Atomically Maps the page at index 0.
> +    #[inline(always)]
> +    pub fn kmap_atomic(&self) -> PageMapping<'_, AtomicMappingInfo> {
> +        let ptr = unsafe { bindings::kmap_atomic(self.pages) };
> #@
> #@ Missing `SAFETY` comment.
> #@
> +
> +        PageMapping {
> +            page: self.pages,
> +            ptr,
> +            _phantom: PhantomData,
> +            _phantom2: PhantomData,
> +        }
> +    }
> +}
> +
> +impl<const ORDER: u32> Drop for Pages<ORDER> {
> +    fn drop(&mut self) {
> +        // SAFETY: By the type invariants, we know the pages are allocated with the given order.
> +        unsafe { bindings::__free_pages(self.pages, ORDER) };
> +    }
> +}
> +
> +/// Specifies the type of page mapping
> +pub trait MappingInfo {}
> +
> +/// Encapsulates methods to map and unmap pages
> +pub trait MappingActions<I: MappingInfo>
> +where
> +    Pages<0>: MappingActions<I>,
> +{
> +    /// Map a page into the kernel address scpace
> #@
> #@ Typo: `scpace` -> `space`.
> #@
> +    fn map(pages: &Pages<0>) -> PageMapping<'_, I>;
> +
> +    /// Unmap a page specified by `mapping`
> +    ///
> +    /// # Safety
> +    ///
> +    /// Must only be called by `PageMapping::drop()`.
> +    unsafe fn unmap(mapping: &PageMapping<'_, I>);
> +}
> +
> +/// A type state indicating that pages were mapped atomically
> +pub struct AtomicMappingInfo;
> +impl MappingInfo for AtomicMappingInfo {}
> +
> +/// A type state indicating that pages were not mapped atomically
> +pub struct NormalMappingInfo;
> +impl MappingInfo for NormalMappingInfo {}
> +
> +impl MappingActions<AtomicMappingInfo> for Pages<0> {
> +    #[inline(always)]
> +    fn map(pages: &Pages<0>) -> PageMapping<'_, AtomicMappingInfo> {
> +        pages.kmap_atomic()
> +    }
> +
> +    #[inline(always)]
> +    unsafe fn unmap(mapping: &PageMapping<'_, AtomicMappingInfo>) {
> +        // SAFETY: An instance of `PageMapping` is created only when `kmap` succeeded for the given
> +        // page, so it is safe to unmap it here.
> +        unsafe { bindings::kunmap_atomic(mapping.ptr) };
> +    }
> +}
> +
> +impl MappingActions<NormalMappingInfo> for Pages<0> {
> +    #[inline(always)]
> +    fn map(pages: &Pages<0>) -> PageMapping<'_, NormalMappingInfo> {
> +        pages.kmap()
> +    }
> +
> +    #[inline(always)]
> +    unsafe fn unmap(mapping: &PageMapping<'_, NormalMappingInfo>) {
> +        // SAFETY: An instance of `PageMapping` is created only when `kmap` succeeded for the given
> +        // page, so it is safe to unmap it here.
> +        unsafe { bindings::kunmap(mapping.page) };
> +    }
> +}
> #@
> #@ I am not sure if this is the best implementation. Why do the `kmap` and
> #@ `kmap_atomic` functions exist? Would it not make sense to implement
> #@ them entirely in `MappingActions::map`?
> #@
> +
> +/// An owned page mapping. When this struct is dropped, the page is unmapped.
> +pub struct PageMapping<'a, I: MappingInfo>
> +where
> +    Pages<0>: MappingActions<I>,
> +{
> +    page: *mut bindings::page,
> +    ptr: *mut core::ffi::c_void,
> +    _phantom: PhantomData<&'a i32>,
> +    _phantom2: PhantomData<I>,
> +}
> +
> +impl<'a, I: MappingInfo> PageMapping<'a, I>
> +where
> +    Pages<0>: MappingActions<I>,
> +{
> +    /// Return a pointer to the wrapped `struct page`
> +    #[inline(always)]
> +    pub fn get_ptr(&self) -> *mut core::ffi::c_void {
> +        self.ptr
> +    }
> +}
> +
> +// Because we do not have Drop specialization, we have to do this dance. Life
> +// would be much more simple if we could have `impl Drop for PageMapping<'_,
> +// Atomic>` and `impl Drop for PageMapping<'_, NotAtomic>`
> +impl<I: MappingInfo> Drop for PageMapping<'_, I>
> +where
> +    Pages<0>: MappingActions<I>,
> +{
> +    #[inline(always)]
> +    fn drop(&mut self) {
> +        // SAFETY: We are OK to call this because we are `PageMapping::drop()`
> +        unsafe { <Pages<0> as MappingActions<I>>::unmap(self) }
> +    }
> +}
> --
> 2.40.0

Here are some more general things:
- I think we could use this as an opportunity to add more docs about how
  paging works, or at least add some links to the C documentation.
- Can we improve the paging API?
  I have not given it any thought yet, but the current API looks very primitive.
- Documentation comments should form complete sentences (so end with `.`).

--
Cheers,
Benno
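
For illustration, here is a minimal standalone sketch of two suggestions from the review above: making `PAGE_SIZE` a `usize` so the bounds check needs no cast, and returning a descriptive error type with a `From` impl into the generic error. Everything in it is hypothetical and not part of the patch or the `kernel` crate: `PageRangeError`, `check_range`, `check_range_errno`, the local `Error` and `EINVAL` stand-ins, and the `PAGE_SHIFT` value of 12 are assumptions chosen so the example compiles outside the kernel.

```rust
// Standalone sketch; none of these items come from the `kernel` crate.

/// Hypothetical stand-in for the C `PAGE_SHIFT` macro (12 means 4 KiB pages; an assumption).
const PAGE_SHIFT: usize = 12;

/// Page size as `usize`, so comparisons against offsets need no `as` cast.
const PAGE_SIZE: usize = 1 << PAGE_SHIFT;

/// Hypothetical stand-in for the kernel's generic `Error`, wrapping an errno value.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(i32);

/// `EINVAL` as a negative errno (22 on Linux).
pub const EINVAL: Error = Error(-22);

/// Hypothetical descriptive error for an invalid in-page range.
#[derive(Debug, PartialEq, Eq)]
pub enum PageRangeError {
    /// `offset + len` overflowed `usize`.
    Overflow,
    /// The range ends past the end of the page; `end` is the offending end offset.
    OutOfBounds { end: usize },
}

impl From<PageRangeError> for Error {
    /// Both variants collapse to `EINVAL`, matching what the patch returns today.
    fn from(_: PageRangeError) -> Self {
        EINVAL
    }
}

/// Checks that `offset..offset + len` lies within one page and returns the end offset.
pub fn check_range(offset: usize, len: usize) -> Result<usize, PageRangeError> {
    let end = offset.checked_add(len).ok_or(PageRangeError::Overflow)?;
    if end > PAGE_SIZE {
        return Err(PageRangeError::OutOfBounds { end });
    }
    Ok(end)
}

/// A caller that only deals in the generic `Error`; `?` applies the `From` impl.
pub fn check_range_errno(offset: usize, len: usize) -> Result<usize, Error> {
    Ok(check_range(offset, len)?)
}
```

With this shape, the signature of `check_range` says exactly what can go wrong, while callers such as `check_range_errno` that only care about an errno still get `EINVAL` through `?`.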