From: Ira Weiny <[email protected]>
get_kernel_pages() only had one caller [shm_get_kernel_pages()] which did not
need the functionality it provided. Furthermore, it called kmap_to_page()
which we are looking to removed.
Alter shm_get_kernel_pages() to no longer call get_kernel_pages() and remove
get_kernel_pages(). Along the way it was noted that shm_get_kernel_pages()
does not have any need to support vmalloc'ed addresses either. Remove that
functionality to clean up the logic.
This series also fixes and uses is_kmap_addr().
Ira Weiny (4):
highmem: Enhance is_kmap_addr() to check kmap_local_page() mappings
tee: Remove vmalloc page support
tee: Remove call to get_kernel_pages()
mm: Remove get_kernel_pages()
drivers/tee/tee_shm.c | 41 ++++++++++++--------------------
include/linux/highmem-internal.h | 5 +++-
include/linux/mm.h | 2 --
mm/swap.c | 30 -----------------------
4 files changed, 19 insertions(+), 59 deletions(-)
base-commit: 274d7803837da78dfc911bcda0d593412676fc20
--
2.37.2
From: Ira Weiny <[email protected]>
The kernel pages used by shm_get_kernel_pages() are allocated using
GFP_KERNEL through the following call stack:
trusted_instantiate()
trusted_payload_alloc() -> GFP_KERNEL
<trusted key op>
tee_shm_register_kernel_buf()
register_shm_helper()
shm_get_kernel_pages()
Where <trusted key op> is one of:
trusted_key_unseal()
trusted_key_get_random()
trusted_key_seal()
Remove the vmalloc page support from shm_get_kernel_pages(). Replace
with a warn on once.
Cc: Jens Wiklander <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Fabio M. De Francesco" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
---
Jens I went with the suggestion from Linus and Christoph and rejected
vmalloc addresses. I did not hear back from you regarding Linus'
question if the vmalloc page support was required by an up coming patch
set or not. So I assumed it was something out of tree.
---
drivers/tee/tee_shm.c | 36 ++++++++++++------------------------
1 file changed, 12 insertions(+), 24 deletions(-)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index 27295bda3e0b..527a6eabc03e 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -24,37 +24,25 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count)
static int shm_get_kernel_pages(unsigned long start, size_t page_count,
struct page **pages)
{
+ struct kvec *kiov;
size_t n;
int rc;
- if (is_vmalloc_addr((void *)start)) {
- struct page *page;
-
- for (n = 0; n < page_count; n++) {
- page = vmalloc_to_page((void *)(start + PAGE_SIZE * n));
- if (!page)
- return -ENOMEM;
-
- get_page(page);
- pages[n] = page;
- }
- rc = page_count;
- } else {
- struct kvec *kiov;
-
- kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
- if (!kiov)
- return -ENOMEM;
+ if (WARN_ON_ONCE(is_vmalloc_addr((void *)start)))
+ return -EINVAL;
- for (n = 0; n < page_count; n++) {
- kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
- kiov[n].iov_len = PAGE_SIZE;
- }
+ kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
+ if (!kiov)
+ return -ENOMEM;
- rc = get_kernel_pages(kiov, page_count, 0, pages);
- kfree(kiov);
+ for (n = 0; n < page_count; n++) {
+ kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
+ kiov[n].iov_len = PAGE_SIZE;
}
+ rc = get_kernel_pages(kiov, page_count, 0, pages);
+ kfree(kiov);
+
return rc;
}
--
2.37.2
From: Ira Weiny <[email protected]>
The only caller to get_kernel_pages() [shm_get_kernel_pages()] has been
updated to not need it.
Remove get_kernel_pages().
Cc: Mel Gorman <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Fabio M. De Francesco" <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
---
include/linux/mm.h | 2 --
mm/swap.c | 30 ------------------------------
2 files changed, 32 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bbcccbc5565..9a06df4f057c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1969,8 +1969,6 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
struct task_struct *task, bool bypass_rlim);
struct kvec;
-int get_kernel_pages(const struct kvec *iov, int nr_pages, int write,
- struct page **pages);
struct page *get_dump_page(unsigned long addr);
bool folio_mark_dirty(struct folio *folio);
diff --git a/mm/swap.c b/mm/swap.c
index 955930f41d20..a9aa648eb0d0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -157,36 +157,6 @@ void put_pages_list(struct list_head *pages)
}
EXPORT_SYMBOL(put_pages_list);
-/*
- * get_kernel_pages() - pin kernel pages in memory
- * @kiov: An array of struct kvec structures
- * @nr_segs: number of segments to pin
- * @write: pinning for read/write, currently ignored
- * @pages: array that receives pointers to the pages pinned.
- * Should be at least nr_segs long.
- *
- * Returns number of pages pinned. This may be fewer than the number requested.
- * If nr_segs is 0 or negative, returns 0. If no pages were pinned, returns 0.
- * Each page returned must be released with a put_page() call when it is
- * finished with.
- */
-int get_kernel_pages(const struct kvec *kiov, int nr_segs, int write,
- struct page **pages)
-{
- int seg;
-
- for (seg = 0; seg < nr_segs; seg++) {
- if (WARN_ON(kiov[seg].iov_len != PAGE_SIZE))
- return seg;
-
- pages[seg] = kmap_to_page(kiov[seg].iov_base);
- get_page(pages[seg]);
- }
-
- return seg;
-}
-EXPORT_SYMBOL_GPL(get_kernel_pages);
-
typedef void (*move_fn_t)(struct lruvec *lruvec, struct folio *folio);
static void lru_add_fn(struct lruvec *lruvec, struct folio *folio)
--
2.37.2
From: Ira Weiny <[email protected]>
is_kmap_addr() is only looking at the kmap() address range which may
cause check_heap_object() to miss checking an overflow on a
kmap_local_page() page.
Add a check for the kmap_local_page() address range to is_kmap_addr().
Cc: Matthew Wilcox <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Fabio M. De Francesco" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Andrew Morton <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
---
include/linux/highmem-internal.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h
index 034b1106d022..5fd5cb58f109 100644
--- a/include/linux/highmem-internal.h
+++ b/include/linux/highmem-internal.h
@@ -152,7 +152,10 @@ static inline void totalhigh_pages_add(long count)
static inline bool is_kmap_addr(const void *x)
{
unsigned long addr = (unsigned long)x;
- return addr >= PKMAP_ADDR(0) && addr < PKMAP_ADDR(LAST_PKMAP);
+
+ return (addr >= PKMAP_ADDR(0) && addr < PKMAP_ADDR(LAST_PKMAP)) ||
+ (addr >= __fix_to_virt(FIX_KMAP_END) &&
+ addr < __fix_to_virt(FIX_KMAP_BEGIN));
}
#else /* CONFIG_HIGHMEM */
--
2.37.2
+ Phil
Hi Ira,
On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
>
> From: Ira Weiny <[email protected]>
>
> The kernel pages used by shm_get_kernel_pages() are allocated using
> GFP_KERNEL through the following call stack:
>
> trusted_instantiate()
> trusted_payload_alloc() -> GFP_KERNEL
> <trusted key op>
> tee_shm_register_kernel_buf()
> register_shm_helper()
> shm_get_kernel_pages()
>
> Where <trusted key op> is one of:
>
> trusted_key_unseal()
> trusted_key_get_random()
> trusted_key_seal()
>
> Remove the vmalloc page support from shm_get_kernel_pages(). Replace
> with a warn on once.
>
> Cc: Jens Wiklander <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: "Fabio M. De Francesco" <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
>
> ---
> Jens I went with the suggestion from Linus and Christoph and rejected
> vmalloc addresses. I did not hear back from you regarding Linus'
> question if the vmalloc page support was required by an up coming patch
> set or not. So I assumed it was something out of tree.
It looks like I wasn't CC'd to that conversation. IIRC, support for
vmalloc addresses was added recently by Phil here [1]. So I would like
to give him a chance if he is planning to post a corresponding kernel
driver upstream.
[1] https://lists.trustedfirmware.org/archives/list/[email protected]/thread/M7HI3P2M66V27SK35CGQRICZ7DJZ5J2W/
-Sumit
> ---
> drivers/tee/tee_shm.c | 36 ++++++++++++------------------------
> 1 file changed, 12 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
> index 27295bda3e0b..527a6eabc03e 100644
> --- a/drivers/tee/tee_shm.c
> +++ b/drivers/tee/tee_shm.c
> @@ -24,37 +24,25 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count)
> static int shm_get_kernel_pages(unsigned long start, size_t page_count,
> struct page **pages)
> {
> + struct kvec *kiov;
> size_t n;
> int rc;
>
> - if (is_vmalloc_addr((void *)start)) {
> - struct page *page;
> -
> - for (n = 0; n < page_count; n++) {
> - page = vmalloc_to_page((void *)(start + PAGE_SIZE * n));
> - if (!page)
> - return -ENOMEM;
> -
> - get_page(page);
> - pages[n] = page;
> - }
> - rc = page_count;
> - } else {
> - struct kvec *kiov;
> -
> - kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> - if (!kiov)
> - return -ENOMEM;
> + if (WARN_ON_ONCE(is_vmalloc_addr((void *)start)))
> + return -EINVAL;
>
> - for (n = 0; n < page_count; n++) {
> - kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> - kiov[n].iov_len = PAGE_SIZE;
> - }
> + kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> + if (!kiov)
> + return -ENOMEM;
>
> - rc = get_kernel_pages(kiov, page_count, 0, pages);
> - kfree(kiov);
> + for (n = 0; n < page_count; n++) {
> + kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> + kiov[n].iov_len = PAGE_SIZE;
> }
>
> + rc = get_kernel_pages(kiov, page_count, 0, pages);
> + kfree(kiov);
> +
> return rc;
> }
>
> --
> 2.37.2
>
Hi Ira,
On Sun, Oct 2, 2022 at 2:23 AM <[email protected]> wrote:
>
> From: Ira Weiny <[email protected]>
>
> The kernel pages used by shm_get_kernel_pages() are allocated using
> GFP_KERNEL through the following call stack:
>
> trusted_instantiate()
> trusted_payload_alloc() -> GFP_KERNEL
> <trusted key op>
> tee_shm_register_kernel_buf()
> register_shm_helper()
> shm_get_kernel_pages()
>
> Where <trusted key op> is one of:
>
> trusted_key_unseal()
> trusted_key_get_random()
> trusted_key_seal()
>
> Remove the vmalloc page support from shm_get_kernel_pages(). Replace
> with a warn on once.
>
> Cc: Jens Wiklander <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: "Fabio M. De Francesco" <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
>
> ---
> Jens I went with the suggestion from Linus and Christoph and rejected
> vmalloc addresses. I did not hear back from you regarding Linus'
> question if the vmalloc page support was required by an up coming patch
> set or not. So I assumed it was something out of tree.
Yes, that's correctly assumed, sorry for not confirming that earlier.
Reviewed-by: Jens Wiklander <[email protected]>
Thanks,
Jens
> ---
> drivers/tee/tee_shm.c | 36 ++++++++++++------------------------
> 1 file changed, 12 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
> index 27295bda3e0b..527a6eabc03e 100644
> --- a/drivers/tee/tee_shm.c
> +++ b/drivers/tee/tee_shm.c
> @@ -24,37 +24,25 @@ static void shm_put_kernel_pages(struct page **pages, size_t page_count)
> static int shm_get_kernel_pages(unsigned long start, size_t page_count,
> struct page **pages)
> {
> + struct kvec *kiov;
> size_t n;
> int rc;
>
> - if (is_vmalloc_addr((void *)start)) {
> - struct page *page;
> -
> - for (n = 0; n < page_count; n++) {
> - page = vmalloc_to_page((void *)(start + PAGE_SIZE * n));
> - if (!page)
> - return -ENOMEM;
> -
> - get_page(page);
> - pages[n] = page;
> - }
> - rc = page_count;
> - } else {
> - struct kvec *kiov;
> -
> - kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> - if (!kiov)
> - return -ENOMEM;
> + if (WARN_ON_ONCE(is_vmalloc_addr((void *)start)))
> + return -EINVAL;
>
> - for (n = 0; n < page_count; n++) {
> - kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> - kiov[n].iov_len = PAGE_SIZE;
> - }
> + kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> + if (!kiov)
> + return -ENOMEM;
>
> - rc = get_kernel_pages(kiov, page_count, 0, pages);
> - kfree(kiov);
> + for (n = 0; n < page_count; n++) {
> + kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> + kiov[n].iov_len = PAGE_SIZE;
> }
>
> + rc = get_kernel_pages(kiov, page_count, 0, pages);
> + kfree(kiov);
> +
> return rc;
> }
>
> --
> 2.37.2
>
Hi Ira,
On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
>
> From: Ira Weiny <[email protected]>
>
> get_kernel_pages() only had one caller [shm_get_kernel_pages()] which did not
> need the functionality it provided. Furthermore, it called kmap_to_page()
> which we are looking to removed.
>
> Alter shm_get_kernel_pages() to no longer call get_kernel_pages() and remove
> get_kernel_pages(). Along the way it was noted that shm_get_kernel_pages()
> does not have any need to support vmalloc'ed addresses either. Remove that
> functionality to clean up the logic.
>
> This series also fixes and uses is_kmap_addr().
From the above description, I am failing to see the motivation behind
this change. Can you elaborate here?
Also, since you are targeting to remove kmap_to_page(), is there any
alternative way to support highmem for future TEE bus drivers? As I
can see higmem being enabled for multiple Arm defconfigs [1] which can
also support TEE (an example which already does it:
arch/arm/configs/imx_v6_v7_defconfig).
[1] git grep CONFIG_HIGHMEM arch/arm/
-Sumit
>
> Ira Weiny (4):
> highmem: Enhance is_kmap_addr() to check kmap_local_page() mappings
> tee: Remove vmalloc page support
> tee: Remove call to get_kernel_pages()
> mm: Remove get_kernel_pages()
>
> drivers/tee/tee_shm.c | 41 ++++++++++++--------------------
> include/linux/highmem-internal.h | 5 +++-
> include/linux/mm.h | 2 --
> mm/swap.c | 30 -----------------------
> 4 files changed, 19 insertions(+), 59 deletions(-)
>
>
> base-commit: 274d7803837da78dfc911bcda0d593412676fc20
> --
> 2.37.2
>
On Mon, Oct 03, 2022 at 02:55:02PM +0530, Sumit Garg wrote:
> Hi Ira,
>
> On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
> >
> > From: Ira Weiny <[email protected]>
> >
> > get_kernel_pages() only had one caller [shm_get_kernel_pages()] which did not
> > need the functionality it provided. Furthermore, it called kmap_to_page()
> > which we are looking to removed.
> >
> > Alter shm_get_kernel_pages() to no longer call get_kernel_pages() and remove
> > get_kernel_pages(). Along the way it was noted that shm_get_kernel_pages()
> > does not have any need to support vmalloc'ed addresses either. Remove that
> > functionality to clean up the logic.
> >
> > This series also fixes and uses is_kmap_addr().
>
> From the above description, I am failing to see the motivation behind
> this change. Can you elaborate here?
Al Viro found[1] that kmap_to_page() is broken. But not only was it broken but
it presents confusion over how highmem should be used because kmap() and
friends should not be used for 'long term' mappings.
[1] https://lore.kernel.org/lkml/YzSSl1ItVlARDvG3@ZenIV
>
> Also, since you are targeting to remove kmap_to_page(), is there any
> alternative way to support highmem for future TEE bus drivers? As I
> can see higmem being enabled for multiple Arm defconfigs [1] which can
> also support TEE (an example which already does it:
> arch/arm/configs/imx_v6_v7_defconfig).
With TEE how are the highmem pages used? Right now the code does not seem to
use any user pages. So I can't really say how this should work. Why does the
kernel need a mapping of those pages?
Ira
>
> [1] git grep CONFIG_HIGHMEM arch/arm/
>
> -Sumit
>
> >
> > Ira Weiny (4):
> > highmem: Enhance is_kmap_addr() to check kmap_local_page() mappings
> > tee: Remove vmalloc page support
> > tee: Remove call to get_kernel_pages()
> > mm: Remove get_kernel_pages()
> >
> > drivers/tee/tee_shm.c | 41 ++++++++++++--------------------
> > include/linux/highmem-internal.h | 5 +++-
> > include/linux/mm.h | 2 --
> > mm/swap.c | 30 -----------------------
> > 4 files changed, 19 insertions(+), 59 deletions(-)
> >
> >
> > base-commit: 274d7803837da78dfc911bcda0d593412676fc20
> > --
> > 2.37.2
> >
On 10/1/22 17:23, [email protected] wrote:
> From: Ira Weiny <[email protected]>
>
> The only caller to get_kernel_pages() [shm_get_kernel_pages()] has been
> updated to not need it.
>
> Remove get_kernel_pages().
>
> Cc: Mel Gorman <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: "Fabio M. De Francesco" <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> include/linux/mm.h | 2 --
> mm/swap.c | 30 ------------------------------
> 2 files changed, 32 deletions(-)
>
Good to see this removed (including the EXPORT), even if the
functionality still remains in a less obvious form, over in shm.
The fewer "all your page are pinned" calls we need, the simpler things
get. :)
Acked-by: John Hubbard <[email protected]>
thanks,
--
John Hubbard
NVIDIA
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 8bbcccbc5565..9a06df4f057c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1969,8 +1969,6 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
> struct task_struct *task, bool bypass_rlim);
>
> struct kvec;
> -int get_kernel_pages(const struct kvec *iov, int nr_pages, int write,
> - struct page **pages);
> struct page *get_dump_page(unsigned long addr);
>
> bool folio_mark_dirty(struct folio *folio);
> diff --git a/mm/swap.c b/mm/swap.c
> index 955930f41d20..a9aa648eb0d0 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -157,36 +157,6 @@ void put_pages_list(struct list_head *pages)
> }
> EXPORT_SYMBOL(put_pages_list);
>
> -/*
> - * get_kernel_pages() - pin kernel pages in memory
> - * @kiov: An array of struct kvec structures
> - * @nr_segs: number of segments to pin
> - * @write: pinning for read/write, currently ignored
> - * @pages: array that receives pointers to the pages pinned.
> - * Should be at least nr_segs long.
> - *
> - * Returns number of pages pinned. This may be fewer than the number requested.
> - * If nr_segs is 0 or negative, returns 0. If no pages were pinned, returns 0.
> - * Each page returned must be released with a put_page() call when it is
> - * finished with.
> - */
> -int get_kernel_pages(const struct kvec *kiov, int nr_segs, int write,
> - struct page **pages)
> -{
> - int seg;
> -
> - for (seg = 0; seg < nr_segs; seg++) {
> - if (WARN_ON(kiov[seg].iov_len != PAGE_SIZE))
> - return seg;
> -
> - pages[seg] = kmap_to_page(kiov[seg].iov_base);
> - get_page(pages[seg]);
> - }
> -
> - return seg;
> -}
> -EXPORT_SYMBOL_GPL(get_kernel_pages);
> -
> typedef void (*move_fn_t)(struct lruvec *lruvec, struct folio *folio);
>
> static void lru_add_fn(struct lruvec *lruvec, struct folio *folio)
On Mon, 3 Oct 2022 at 20:52, Ira Weiny <[email protected]> wrote:
>
> On Mon, Oct 03, 2022 at 02:55:02PM +0530, Sumit Garg wrote:
> > Hi Ira,
> >
> > On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
> > >
> > > From: Ira Weiny <[email protected]>
> > >
> > > get_kernel_pages() only had one caller [shm_get_kernel_pages()] which did not
> > > need the functionality it provided. Furthermore, it called kmap_to_page()
> > > which we are looking to removed.
> > >
> > > Alter shm_get_kernel_pages() to no longer call get_kernel_pages() and remove
> > > get_kernel_pages(). Along the way it was noted that shm_get_kernel_pages()
> > > does not have any need to support vmalloc'ed addresses either. Remove that
> > > functionality to clean up the logic.
> > >
> > > This series also fixes and uses is_kmap_addr().
> >
> > From the above description, I am failing to see the motivation behind
> > this change. Can you elaborate here?
>
> Al Viro found[1] that kmap_to_page() is broken. But not only was it broken but
> it presents confusion over how highmem should be used because kmap() and
> friends should not be used for 'long term' mappings.
>
> [1] https://lore.kernel.org/lkml/YzSSl1ItVlARDvG3@ZenIV
>
Thanks for the background info. This should be part of the cover letter.
> >
> > Also, since you are targeting to remove kmap_to_page(), is there any
> > alternative way to support highmem for future TEE bus drivers? As I
> > can see higmem being enabled for multiple Arm defconfigs [1] which can
> > also support TEE (an example which already does it:
> > arch/arm/configs/imx_v6_v7_defconfig).
>
> With TEE how are the highmem pages used? Right now the code does not seem to
> use any user pages. So I can't really say how this should work. Why does the
> kernel need a mapping of those pages?
Fair enough, I don't have a real kernel driver use-case for highmem
which is required to be registered with TEE.
-Sumit
>
> Ira
>
> >
> > [1] git grep CONFIG_HIGHMEM arch/arm/
> >
> > -Sumit
> >
> > >
> > > Ira Weiny (4):
> > > highmem: Enhance is_kmap_addr() to check kmap_local_page() mappings
> > > tee: Remove vmalloc page support
> > > tee: Remove call to get_kernel_pages()
> > > mm: Remove get_kernel_pages()
> > >
> > > drivers/tee/tee_shm.c | 41 ++++++++++++--------------------
> > > include/linux/highmem-internal.h | 5 +++-
> > > include/linux/mm.h | 2 --
> > > mm/swap.c | 30 -----------------------
> > > 4 files changed, 19 insertions(+), 59 deletions(-)
> > >
> > >
> > > base-commit: 274d7803837da78dfc911bcda0d593412676fc20
> > > --
> > > 2.37.2
> > >
Hi Sumit
Thanks for mentioning that, in fact, our product is low memory devices, and continuous pages are extremely valuable.
Although our driver is not upstream yet but highly dependent on tee shm vmalloc support,
some scenarios are driver alloc high order pages but system memory is fragmentation so that alloc failed.
In this situation, vmalloc support is important and gives flexible usage to user.
-----Original Message-----
From: Sumit Garg <[email protected]>
Sent: Monday, October 3, 2022 2:57 PM
To: [email protected]
Cc: Jens Wiklander <[email protected]>; Andrew Morton <[email protected]>; Al Viro <[email protected]>; Fabio M. De Francesco <[email protected]>; Christoph Hellwig <[email protected]>; Linus Torvalds <[email protected]>; [email protected]; [email protected]; [email protected]; Phil Chang (張世勳) <[email protected]>
Subject: Re: [PATCH 2/4] tee: Remove vmalloc page support
+ Phil
Hi Ira,
On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
>
> From: Ira Weiny <[email protected]>
>
> The kernel pages used by shm_get_kernel_pages() are allocated using
> GFP_KERNEL through the following call stack:
>
> trusted_instantiate()
> trusted_payload_alloc() -> GFP_KERNEL
> <trusted key op>
> tee_shm_register_kernel_buf()
> register_shm_helper()
> shm_get_kernel_pages()
>
> Where <trusted key op> is one of:
>
> trusted_key_unseal()
> trusted_key_get_random()
> trusted_key_seal()
>
> Remove the vmalloc page support from shm_get_kernel_pages(). Replace
> with a warn on once.
>
> Cc: Jens Wiklander <[email protected]>
> Cc: Al Viro <[email protected]>
> Cc: "Fabio M. De Francesco" <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Signed-off-by: Ira Weiny <[email protected]>
>
> ---
> Jens I went with the suggestion from Linus and Christoph and rejected
> vmalloc addresses. I did not hear back from you regarding Linus'
> question if the vmalloc page support was required by an up coming
> patch set or not. So I assumed it was something out of tree.
It looks like I wasn't CC'd to that conversation. IIRC, support for vmalloc addresses was added recently by Phil here [1]. So I would like to give him a chance if he is planning to post a corresponding kernel driver upstream.
[1] https://urldefense.com/v3/__https://lists.trustedfirmware.org/archives/list/[email protected]/thread/M7HI3P2M66V27SK35CGQRICZ7DJZ5J2W/__;!!CTRNKA9wMg0ARbw!wGOKR9k3khZJlPt1K_xBCXX4EBM5ZCfWKuruFgSP45H8wTvJrx4_St3Fb5ZrljD5QQ$
-Sumit
> ---
> drivers/tee/tee_shm.c | 36 ++++++++++++------------------------
> 1 file changed, 12 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index
> 27295bda3e0b..527a6eabc03e 100644
> --- a/drivers/tee/tee_shm.c
> +++ b/drivers/tee/tee_shm.c
> @@ -24,37 +24,25 @@ static void shm_put_kernel_pages(struct page
> **pages, size_t page_count) static int shm_get_kernel_pages(unsigned long start, size_t page_count,
> struct page **pages) {
> + struct kvec *kiov;
> size_t n;
> int rc;
>
> - if (is_vmalloc_addr((void *)start)) {
> - struct page *page;
> -
> - for (n = 0; n < page_count; n++) {
> - page = vmalloc_to_page((void *)(start + PAGE_SIZE * n));
> - if (!page)
> - return -ENOMEM;
> -
> - get_page(page);
> - pages[n] = page;
> - }
> - rc = page_count;
> - } else {
> - struct kvec *kiov;
> -
> - kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> - if (!kiov)
> - return -ENOMEM;
> + if (WARN_ON_ONCE(is_vmalloc_addr((void *)start)))
> + return -EINVAL;
>
> - for (n = 0; n < page_count; n++) {
> - kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> - kiov[n].iov_len = PAGE_SIZE;
> - }
> + kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> + if (!kiov)
> + return -ENOMEM;
>
> - rc = get_kernel_pages(kiov, page_count, 0, pages);
> - kfree(kiov);
> + for (n = 0; n < page_count; n++) {
> + kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> + kiov[n].iov_len = PAGE_SIZE;
> }
>
> + rc = get_kernel_pages(kiov, page_count, 0, pages);
> + kfree(kiov);
> +
> return rc;
> }
>
> --
> 2.37.2
>
Hi Phil,
Please don't top-post in the OSS mailing list.
On Wed, 5 Oct 2022 at 08:59, Phil Chang (張世勳) <[email protected]> wrote:
>
> Hi Sumit
>
> Thanks for mentioning that, in fact, our product is low memory devices, and continuous pages are extremely valuable.
> Although our driver is not upstream yet but highly dependent on tee shm vmalloc support,
Sorry but you need to get your driver mainline in order to support
vmalloc interface. As otherwise it's a maintenance nightmare to
support interfaces in the mainline for out-of-tree drivers.
-Sumit
> some scenarios are driver alloc high order pages but system memory is fragmentation so that alloc failed.
> In this situation, vmalloc support is important and gives flexible usage to user.
>
>
> -----Original Message-----
> From: Sumit Garg <[email protected]>
> Sent: Monday, October 3, 2022 2:57 PM
> To: [email protected]
> Cc: Jens Wiklander <[email protected]>; Andrew Morton <[email protected]>; Al Viro <[email protected]>; Fabio M. De Francesco <[email protected]>; Christoph Hellwig <[email protected]>; Linus Torvalds <[email protected]>; [email protected]; [email protected]; [email protected]; Phil Chang (張世勳) <[email protected]>
> Subject: Re: [PATCH 2/4] tee: Remove vmalloc page support
>
> + Phil
>
> Hi Ira,
>
> On Sun, 2 Oct 2022 at 05:53, <[email protected]> wrote:
> >
> > From: Ira Weiny <[email protected]>
> >
> > The kernel pages used by shm_get_kernel_pages() are allocated using
> > GFP_KERNEL through the following call stack:
> >
> > trusted_instantiate()
> > trusted_payload_alloc() -> GFP_KERNEL
> > <trusted key op>
> > tee_shm_register_kernel_buf()
> > register_shm_helper()
> > shm_get_kernel_pages()
> >
> > Where <trusted key op> is one of:
> >
> > trusted_key_unseal()
> > trusted_key_get_random()
> > trusted_key_seal()
> >
> > Remove the vmalloc page support from shm_get_kernel_pages(). Replace
> > with a warn on once.
> >
> > Cc: Jens Wiklander <[email protected]>
> > Cc: Al Viro <[email protected]>
> > Cc: "Fabio M. De Francesco" <[email protected]>
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Linus Torvalds <[email protected]>
> > Signed-off-by: Ira Weiny <[email protected]>
> >
> > ---
> > Jens I went with the suggestion from Linus and Christoph and rejected
> > vmalloc addresses. I did not hear back from you regarding Linus'
> > question if the vmalloc page support was required by an up coming
> > patch set or not. So I assumed it was something out of tree.
>
> It looks like I wasn't CC'd to that conversation. IIRC, support for vmalloc addresses was added recently by Phil here [1]. So I would like to give him a chance if he is planning to post a corresponding kernel driver upstream.
>
> [1] https://urldefense.com/v3/__https://lists.trustedfirmware.org/archives/list/[email protected]/thread/M7HI3P2M66V27SK35CGQRICZ7DJZ5J2W/__;!!CTRNKA9wMg0ARbw!wGOKR9k3khZJlPt1K_xBCXX4EBM5ZCfWKuruFgSP45H8wTvJrx4_St3Fb5ZrljD5QQ$
>
> -Sumit
>
> > ---
> > drivers/tee/tee_shm.c | 36 ++++++++++++------------------------
> > 1 file changed, 12 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index
> > 27295bda3e0b..527a6eabc03e 100644
> > --- a/drivers/tee/tee_shm.c
> > +++ b/drivers/tee/tee_shm.c
> > @@ -24,37 +24,25 @@ static void shm_put_kernel_pages(struct page
> > **pages, size_t page_count) static int shm_get_kernel_pages(unsigned long start, size_t page_count,
> > struct page **pages) {
> > + struct kvec *kiov;
> > size_t n;
> > int rc;
> >
> > - if (is_vmalloc_addr((void *)start)) {
> > - struct page *page;
> > -
> > - for (n = 0; n < page_count; n++) {
> > - page = vmalloc_to_page((void *)(start + PAGE_SIZE * n));
> > - if (!page)
> > - return -ENOMEM;
> > -
> > - get_page(page);
> > - pages[n] = page;
> > - }
> > - rc = page_count;
> > - } else {
> > - struct kvec *kiov;
> > -
> > - kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> > - if (!kiov)
> > - return -ENOMEM;
> > + if (WARN_ON_ONCE(is_vmalloc_addr((void *)start)))
> > + return -EINVAL;
> >
> > - for (n = 0; n < page_count; n++) {
> > - kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> > - kiov[n].iov_len = PAGE_SIZE;
> > - }
> > + kiov = kcalloc(page_count, sizeof(*kiov), GFP_KERNEL);
> > + if (!kiov)
> > + return -ENOMEM;
> >
> > - rc = get_kernel_pages(kiov, page_count, 0, pages);
> > - kfree(kiov);
> > + for (n = 0; n < page_count; n++) {
> > + kiov[n].iov_base = (void *)(start + n * PAGE_SIZE);
> > + kiov[n].iov_len = PAGE_SIZE;
> > }
> >
> > + rc = get_kernel_pages(kiov, page_count, 0, pages);
> > + kfree(kiov);
> > +
> > return rc;
> > }
> >
> > --
> > 2.37.2
> >
On Thu, Oct 06, 2022 at 11:53:55AM +0530, Sumit Garg wrote:
> Hi Phil,
>
> Please don't top-post in the OSS mailing list.
>
> On Wed, 5 Oct 2022 at 08:59, Phil Chang (張世勳) <[email protected]> wrote:
> >
> > Hi Sumit
> >
> > Thanks for mentioning that, in fact, our product is low memory devices, and continuous pages are extremely valuable.
> > Although our driver is not upstream yet but highly dependent on tee shm vmalloc support,
>
> Sorry but you need to get your driver mainline in order to support
> vmalloc interface. As otherwise it's a maintenance nightmare to
> support interfaces in the mainline for out-of-tree drivers.
I agree. It sounds like this driver is not going to be submitted any time
soon. So I'll respin this series as is. But I do have a couple of other
things to deal with first. So if I'm wrong let me know soon.
Thanks,
Ira
On Wed, Oct 5, 2022 at 11:24 PM Sumit Garg <[email protected]> wrote:
>
> Sorry but you need to get your driver mainline in order to support
> vmalloc interface.
Actually, I think even then we shouldn't support vmalloc - and
register_shm_helper() just needs to be changed to pass in an array of
actual page pointers instead.
At that point TEE_SHM_USER_MAPPED should also go away, because then
it's the caller that should just do either the user space page
pinning, or pass in the kernel page pointer.
JensW, is there some reason that wouldn't work?
Linus
On Thu, Oct 6, 2022 at 8:20 PM Linus Torvalds
<[email protected]> wrote:
>
> On Wed, Oct 5, 2022 at 11:24 PM Sumit Garg <[email protected]> wrote:
> >
> > Sorry but you need to get your driver mainline in order to support
> > vmalloc interface.
>
> Actually, I think even then we shouldn't support vmalloc - and
> register_shm_helper() just needs to be changed to pass in an array of
> actual page pointers instead.
register_shm_helper() is an internal function, I suppose it's what's
passed to tee_shm_register_user_buf() and especially
tee_shm_register_kernel_buf() in this case.
So the gain is that in the kernel it becomes the caller's
responsibility to provide the array of page pointers and the TEE
subsystem doesn't need to care about what kind of kernel memory it is
any longer. Yes, that should avoid eventual complexities with
vmalloc() etc.
>
> At that point TEE_SHM_USER_MAPPED should also go away, because then
> it's the caller that should just do either the user space page
> pinning, or pass in the kernel page pointer.
>
> JensW, is there some reason that wouldn't work?
We still need to know if it's kernel or user pages in
release_registered_pages().
The struct tee_shm:s acquired with syscalls from user space are
reference counted. As are the kernel tee_shm:s, but I believe we could
separate them to avoid reference counting tee_shm:s used by kernel
clients if needed. I'll need to look closer at this if we're going to
use that approach.
Without reference counting the caller of tee_shm_free() can be certain
that the secure world is done with the memory so we could delegate the
kernel pages part of release_registered_pages() to the caller instead.
Cheers,
Jens
>
> Linus
On Thu, Oct 06, 2022 at 11:20:16AM -0700, Linus Torvalds wrote:
> Actually, I think even then we shouldn't support vmalloc - and
> register_shm_helper() just needs to be changed to pass in an array of
> actual page pointers instead.
>
> At that point TEE_SHM_USER_MAPPED should also go away, because then
> it's the caller that should just do either the user space page
> pinning, or pass in the kernel page pointer.
>
> JensW, is there some reason that wouldn't work?
I suspect the best long term option would be to just pass an iov_iter..
On Mon, Oct 10, 2022 at 12:42 AM Christoph Hellwig <[email protected]> wrote:
>
> I suspect the best long term option would be to just pass an iov_iter..
Hmm. Yeah, that sounds like a workable model, and solves the problem
JensW pointed out with my simplistic "just pass a page array" approach
where you also need to keep track of how to release things.
Linus
On Mon, Oct 10, 2022 at 10:20:15AM -0700, Linus Torvalds wrote:
> On Mon, Oct 10, 2022 at 12:42 AM Christoph Hellwig <[email protected]> wrote:
> >
> > I suspect the best long term option would be to just pass an iov_iter..
>
> Hmm. Yeah, that sounds like a workable model, and solves the problem
> JensW pointed out with my simplistic "just pass a page array" approach
> where you also need to keep track of how to release things.
Except that then you need to get iov_iter_get_pages analogue that would
work for ITER_KVEC, which is exact same problem right back.
On Fri, Oct 07, 2022 at 10:12:57AM +0200, Jens Wiklander wrote:
> On Thu, Oct 6, 2022 at 8:20 PM Linus Torvalds
> <[email protected]> wrote:
> >
> > On Wed, Oct 5, 2022 at 11:24 PM Sumit Garg <[email protected]> wrote:
> > >
> > > Sorry but you need to get your driver mainline in order to support
> > > vmalloc interface.
> >
> > Actually, I think even then we shouldn't support vmalloc - and
> > register_shm_helper() just needs to be changed to pass in an array of
> > actual page pointers instead.
>
> register_shm_helper() is an internal function, I suppose it's what's
> passed to tee_shm_register_user_buf() and especially
> tee_shm_register_kernel_buf() in this case.
>
> So the gain is that in the kernel it becomes the caller's
> responsibility to provide the array of page pointers and the TEE
> subsystem doesn't need to care about what kind of kernel memory it is
> any longer. Yes, that should avoid eventual complexities with
> vmalloc() etc.
I finally spent some time digging into this again.
Overall I'm not opposed to trying to clean up the code more but I feel like the
removal of TEE_SHM_USER_MAPPED is too complex for the main goal; to remove a
caller of kmap_to_page().
Not only is that flag used in release_registered_pages() but it is also used in
tee_shm_fop_mmap(). I'm not following exactly why. I think this is to allow
mmap of the tee_shm's allocated by kernel users? Which I _think_ is
orthogonal to the callers of tee_shm_register_kernel_buf()?
>
> >
> > At that point TEE_SHM_USER_MAPPED should also go away, because then
> > it's the caller that should just do either the user space page
> > pinning, or pass in the kernel page pointer.
> >
> > JensW, is there some reason that wouldn't work?
>
> We still need to know if it's kernel or user pages in
> release_registered_pages().
Yes.
As I dug into this it seemed ok to define a tee_shm_kernel_free(). Pull out
the allocation of the page array from register_shm_helper() such that it could
be handled by tee_shm_register_kernel_buf() and this new tee_shm_kernel_free().
This seems reasonable because the only callers of tee_shm_register_kernel_buf()
are in trusted_tee.c and they all call tee_shm_free() on the registered memory
prior to returning.
Other callers[*] of tee_shm_free() obtained tee_shm from
tee_shm_alloc_kernel_buf() which AFAICT avoids all this nonsense.
[*] such as .../drivers/firmware/broadcom/tee_bnxt_fw.c.
>
> The struct tee_shm:s acquired with syscalls from user space are
> reference counted. As are the kernel tee_shm:s, but I believe we could
> separate them to avoid reference counting tee_shm:s used by kernel
> clients if needed. I'll need to look closer at this if we're going to
> use that approach.
>
> Without reference counting the caller of tee_shm_free() can be certain
> that the secure world is done with the memory so we could delegate the
> kernel pages part of release_registered_pages() to the caller instead.
>
I'm not sure I follow you here. Would this be along the lines of creating a
tee_shm_free_kernel() to be used in trusted_tee.c for those specific kernel
data?
Overall I feel like submitting this series again with Christoph and Al's
comments addressed is the best way forward to get rid of kmap_to_page(). I
would really like to get moving on that to avoid any further issues with the
kmap conversions.
But if folks feel strongly enough about removing that flag I can certainly try
to do so.
Ira
> Cheers,
> Jens
>
> >
> > Linus
On Fri, 16 Dec 2022 at 06:11, Ira Weiny <[email protected]> wrote:
>
> On Fri, Oct 07, 2022 at 10:12:57AM +0200, Jens Wiklander wrote:
> > On Thu, Oct 6, 2022 at 8:20 PM Linus Torvalds
> > <[email protected]> wrote:
> > >
> > > On Wed, Oct 5, 2022 at 11:24 PM Sumit Garg <[email protected]> wrote:
> > > >
> > > > Sorry but you need to get your driver mainline in order to support
> > > > vmalloc interface.
> > >
> > > Actually, I think even then we shouldn't support vmalloc - and
> > > register_shm_helper() just needs to be changed to pass in an array of
> > > actual page pointers instead.
> >
> > register_shm_helper() is an internal function, I suppose it's what's
> > passed to tee_shm_register_user_buf() and especially
> > tee_shm_register_kernel_buf() in this case.
> >
> > So the gain is that in the kernel it becomes the caller's
> > responsibility to provide the array of page pointers and the TEE
> > subsystem doesn't need to care about what kind of kernel memory it is
> > any longer. Yes, that should avoid eventual complexities with
> > vmalloc() etc.
>
> I finally spent some time digging into this again.
>
> Overall I'm not opposed to trying to clean up the code more but I feel like the
> removal of TEE_SHM_USER_MAPPED is too complex for the main goal; to remove a
> caller of kmap_to_page().
>
> Not only is that flag used in release_registered_pages() but it is also used in
> tee_shm_fop_mmap(). I'm not following exactly why. I think this is to allow
> mmap of the tee_shm's allocated by kernel users?
No, its rather to allow mmap of tee_shm allocated via
tee_shm_alloc_user_buf(). Have a look at its user-space usage here
[1]. So overall I agree here that we can't get rid of
TEE_SHM_USER_MAPPED completely as it has a valid mmap use-case.
[1] https://github.com/OP-TEE/optee_client/blob/master/libteec/src/tee_client_api.c#L907
> Which I _think_ is
> orthogonal to the callers of tee_shm_register_kernel_buf()?
>
> >
> > >
> > > At that point TEE_SHM_USER_MAPPED should also go away, because then
> > > it's the caller that should just do either the user space page
> > > pinning, or pass in the kernel page pointer.
> > >
> > > JensW, is there some reason that wouldn't work?
> >
> > We still need to know if it's kernel or user pages in
> > release_registered_pages().
>
> Yes.
>
> As I dug into this it seemed ok to define a tee_shm_kernel_free(). Pull out
> the allocation of the page array from register_shm_helper() such that it could
> be handled by tee_shm_register_kernel_buf() and this new tee_shm_kernel_free().
>
+1
> This seems reasonable because the only callers of tee_shm_register_kernel_buf()
> are in trusted_tee.c and they all call tee_shm_free() on the registered memory
> prior to returning.
>
> Other callers[*] of tee_shm_free() obtained tee_shm from
> tee_shm_alloc_kernel_buf() which AFAICT avoids all this nonsense.
>
> [*] such as .../drivers/firmware/broadcom/tee_bnxt_fw.c.
>
> >
> > The struct tee_shm:s acquired with syscalls from user space are
> > reference counted. As are the kernel tee_shm:s, but I believe we could
> > separate them to avoid reference counting tee_shm:s used by kernel
> > clients if needed. I'll need to look closer at this if we're going to
> > use that approach.
> >
> > Without reference counting the caller of tee_shm_free() can be certain
> > that the secure world is done with the memory so we could delegate the
> > kernel pages part of release_registered_pages() to the caller instead.
> >
>
> I'm not sure I follow you here. Would this be along the lines of creating a
> tee_shm_free_kernel() to be used in trusted_tee.c for those specific kernel
> data?
I can't find a reason/use-case for the TEE subsystem to keep a
refcount of memory registered by other kernel drivers like
trusted_tee.c. So yes I think it should be safe to directly free that
shm via tee_shm_free_kernel(). Also with that we can simplify shm
registration by kernel clients via directly passing page pointers to
tee_shm_register_kernel_buf() (I would rather rename this API as
tee_shm_register_kernel_pages()).
-Sumit
>
> Overall I feel like submitting this series again with Christoph and Al's
> comments addressed is the best way forward to get rid of kmap_to_page(). I
> would really like to get moving on that to avoid any further issues with the
> kmap conversions.
>
> But if folks feel strongly enough about removing that flag I can certainly try
> to do so.
>
> Ira
>
> > Cheers,
> > Jens
> >
> > >
> > > Linus
On Thu, Dec 15, 2022 at 04:41:04PM -0800, Ira Weiny wrote:
> Overall I feel like submitting this series again with Christoph and Al's
> comments addressed is the best way forward to get rid of kmap_to_page(). I
> would really like to get moving on that to avoid any further issues with the
> kmap conversions.
Yes. While the flag is really odd, killing kmap_to_page should be the
priority.
On Fri, 16 Dec 2022 at 10:39, Sumit Garg <[email protected]> wrote:
>
> On Fri, 16 Dec 2022 at 06:11, Ira Weiny <[email protected]> wrote:
> >
> > On Fri, Oct 07, 2022 at 10:12:57AM +0200, Jens Wiklander wrote:
> > > On Thu, Oct 6, 2022 at 8:20 PM Linus Torvalds
> > > <[email protected]> wrote:
> > > >
> > > > On Wed, Oct 5, 2022 at 11:24 PM Sumit Garg <[email protected]> wrote:
> > > > >
> > > > > Sorry but you need to get your driver mainline in order to support
> > > > > vmalloc interface.
> > > >
> > > > Actually, I think even then we shouldn't support vmalloc - and
> > > > register_shm_helper() just needs to be changed to pass in an array of
> > > > actual page pointers instead.
> > >
> > > register_shm_helper() is an internal function, I suppose it's what's
> > > passed to tee_shm_register_user_buf() and especially
> > > tee_shm_register_kernel_buf() in this case.
> > >
> > > So the gain is that in the kernel it becomes the caller's
> > > responsibility to provide the array of page pointers and the TEE
> > > subsystem doesn't need to care about what kind of kernel memory it is
> > > any longer. Yes, that should avoid eventual complexities with
> > > vmalloc() etc.
> >
> > I finally spent some time digging into this again.
> >
> > Overall I'm not opposed to trying to clean up the code more but I feel like the
> > removal of TEE_SHM_USER_MAPPED is too complex for the main goal; to remove a
> > caller of kmap_to_page().
> >
> > Not only is that flag used in release_registered_pages() but it is also used in
> > tee_shm_fop_mmap(). I'm not following exactly why. I think this is to allow
> > mmap of the tee_shm's allocated by kernel users?
>
> No, its rather to allow mmap of tee_shm allocated via
> tee_shm_alloc_user_buf(). Have a look at its user-space usage here
> [1]. So overall I agree here that we can't get rid of
> TEE_SHM_USER_MAPPED completely as it has a valid mmap use-case.
>
> [1] https://github.com/OP-TEE/optee_client/blob/master/libteec/src/tee_client_api.c#L907
>
> > Which I _think_ is
> > orthogonal to the callers of tee_shm_register_kernel_buf()?
> >
> > >
> > > >
> > > > At that point TEE_SHM_USER_MAPPED should also go away, because then
> > > > it's the caller that should just do either the user space page
> > > > pinning, or pass in the kernel page pointer.
> > > >
> > > > JensW, is there some reason that wouldn't work?
> > >
> > > We still need to know if it's kernel or user pages in
> > > release_registered_pages().
> >
> > Yes.
> >
> > As I dug into this it seemed ok to define a tee_shm_kernel_free(). Pull out
> > the allocation of the page array from register_shm_helper() such that it could
> > be handled by tee_shm_register_kernel_buf() and this new tee_shm_kernel_free().
> >
>
> +1
>
> > This seems reasonable because the only callers of tee_shm_register_kernel_buf()
> > are in trusted_tee.c and they all call tee_shm_free() on the registered memory
> > prior to returning.
> >
> > Other callers[*] of tee_shm_free() obtained tee_shm from
> > tee_shm_alloc_kernel_buf() which AFAICT avoids all this nonsense.
> >
> > [*] such as .../drivers/firmware/broadcom/tee_bnxt_fw.c.
> >
> > >
> > > The struct tee_shm:s acquired with syscalls from user space are
> > > reference counted. As are the kernel tee_shm:s, but I believe we could
> > > separate them to avoid reference counting tee_shm:s used by kernel
> > > clients if needed. I'll need to look closer at this if we're going to
> > > use that approach.
> > >
> > > Without reference counting the caller of tee_shm_free() can be certain
> > > that the secure world is done with the memory so we could delegate the
> > > kernel pages part of release_registered_pages() to the caller instead.
> > >
> >
> > I'm not sure I follow you here. Would this be along the lines of creating a
> > tee_shm_free_kernel() to be used in trusted_tee.c for those specific kernel
> > data?
>
> I can't find a reason/use-case for the TEE subsystem to keep a
> refcount of memory registered by other kernel drivers like
> trusted_tee.c. So yes I think it should be safe to directly free that
> shm via tee_shm_free_kernel(). Also with that we can simplify shm
> registration by kernel clients via directly passing page pointers to
> tee_shm_register_kernel_buf() (I would rather rename this API as
> tee_shm_register_kernel_pages()).
Okay, so I will take up this work and get rid of kmap_to_page
invocation from the TEE subsystem.
Ira,
You can then rebase your patchset over my work.
-Sumit
>
> >
> > Overall I feel like submitting this series again with Christoph and Al's
> > comments addressed is the best way forward to get rid of kmap_to_page(). I
> > would really like to get moving on that to avoid any further issues with the
> > kmap conversions.
> >
> > But if folks feel strongly enough about removing that flag I can certainly try
> > to do so.
> >
> > Ira
> >
> > > Cheers,
> > > Jens
> > >
> > > >
> > > > Linus