2023-01-19 16:09:30

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH v3 0/4] fs/sysv: Replace kmap() with kmap_local_page()

kmap() is deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

kmap_local_page() in fs/sysv does not violate any of the strict rules of
its use, therefore it should be preferred.

Therefore, replace kmap() with kmap_local_page() in fs/sysv. kunmap_local()
requires the mapping address, so return that address from dir_get_page()
to be used in dir_put_page().

I had submitted a patch with the same purpose but it has been replaced
by this series.[1] This is based on a long series of very appreciated
comments and suggestions kindly provided by Al Viro (again thanks!).[2][3][4]

Changes from v1:[5]
1/4 - No changes.
2/4 - Delete an unnecessary assignment (thanks to Dan Carpenter).
3/4 - No changes.
4/4 - No changes.

Changes from v2:[6]
1/4 - No changes.
2/4 - Remove a redundant assignment in sysv_dotdot() and add a
comment (thanks to Al Viro for both suggestions).
3/4 - No changes.
4/4 - No changes.

[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/Y4E++JERgUMoqfjG@ZenIV/#t
[3] https://lore.kernel.org/lkml/Y4FG0O7VWTTng5yh@ZenIV/#t
[4] https://lore.kernel.org/lkml/Y4ONIFJatIGsVNpf@ZenIV/#t
[5] https://lore.kernel.org/lkml/[email protected]/
[6] https://lore.kernel.org/lkml/[email protected]/

Cc: Al Viro <[email protected]>
Cc: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>

Fabio M. De Francesco (4):
fs/sysv: Use the offset_in_page() helper
fs/sysv: Change the signature of dir_get_page()
fs/sysv: Use dir_put_page() in sysv_rename()
fs/sysv: Replace kmap() with kmap_local_page()

fs/sysv/dir.c | 120 +++++++++++++++++++++++++++---------------------
fs/sysv/namei.c | 9 ++--
fs/sysv/sysv.h | 1 +
3 files changed, 71 insertions(+), 59 deletions(-)

--
2.39.0


2023-01-19 16:13:44

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since kmap_local_page() would not break the strict rules of local mappings
(i.e., the thread locality and the stack based nesting), this function can
be easily and safely replace the deprecated API.

Therefore, replace kmap() with kmap_local_page() in fs/sysv. kunmap_local()
requires the mapping address, so return that address from dir_get_page()
to be used in dir_put_page().

Suggested-by: Al Viro <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
fs/sysv/dir.c | 62 +++++++++++++++++++++++++++++++++----------------
fs/sysv/namei.c | 4 ++--
fs/sysv/sysv.h | 2 +-
3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c
index 2e35b95d3efb..0a83a01862f3 100644
--- a/fs/sysv/dir.c
+++ b/fs/sysv/dir.c
@@ -28,9 +28,9 @@ const struct file_operations sysv_dir_operations = {
.fsync = generic_file_fsync,
};

-inline void dir_put_page(struct page *page)
+inline void dir_put_page(struct page *page, void *page_addr)
{
- kunmap(page);
+ kunmap_local(page_addr);
put_page(page);
}

@@ -52,15 +52,21 @@ static int dir_commit_chunk(struct page *page, loff_t pos, unsigned len)
return err;
}

+/*
+ * Calls to dir_get_page()/dir_put_page() must be nested according to the
+ * rules documented in mm/highmem.rst.
+ *
+ * NOTE: sysv_find_entry() and sysv_dotdot() act as calls to dir_get_page()
+ * and must be treated accordingly for nesting purposes.
+ */
static void *dir_get_page(struct inode *dir, unsigned long n, struct page **p)
{
struct address_space *mapping = dir->i_mapping;
struct page *page = read_mapping_page(mapping, n, NULL);
if (IS_ERR(page))
return ERR_CAST(page);
- kmap(page);
*p = page;
- return page_address(page);
+ return kmap_local_page(page);
}

static int sysv_readdir(struct file *file, struct dir_context *ctx)
@@ -98,11 +104,11 @@ static int sysv_readdir(struct file *file, struct dir_context *ctx)
if (!dir_emit(ctx, name, strnlen(name,SYSV_NAMELEN),
fs16_to_cpu(SYSV_SB(sb), de->inode),
DT_UNKNOWN)) {
- dir_put_page(page);
+ dir_put_page(page, kaddr);
return 0;
}
}
- dir_put_page(page);
+ dir_put_page(page, kaddr);
}
return 0;
}
@@ -125,6 +131,11 @@ static inline int namecompare(int len, int maxlen,
* returns the cache buffer in which the entry was found, and the entry
* itself (as a parameter - res_dir). It does NOT read the inode of the
* entry - you'll have to do that yourself if you want to.
+ *
+ * On Success dir_put_page() should be called on *res_page.
+ *
+ * sysv_find_entry() acts as a call to dir_get_page() and must be treated
+ * accordingly for nesting purposes.
*/
struct sysv_dir_entry *sysv_find_entry(struct dentry *dentry, struct page **res_page)
{
@@ -156,7 +167,7 @@ struct sysv_dir_entry *sysv_find_entry(struct dentry *dentry, struct page **res_
name, de->name))
goto found;
}
- dir_put_page(page);
+ dir_put_page(page, kaddr);
}

if (++n >= npages)
@@ -199,7 +210,7 @@ int sysv_add_link(struct dentry *dentry, struct inode *inode)
goto out_page;
de++;
}
- dir_put_page(page);
+ dir_put_page(page, kaddr);
}
BUG();
return -EINVAL;
@@ -217,7 +228,7 @@ int sysv_add_link(struct dentry *dentry, struct inode *inode)
dir->i_mtime = dir->i_ctime = current_time(dir);
mark_inode_dirty(dir);
out_page:
- dir_put_page(page);
+ dir_put_page(page, kaddr);
return err;
out_unlock:
unlock_page(page);
@@ -228,6 +239,12 @@ int sysv_delete_entry(struct sysv_dir_entry *de, struct page *page)
{
struct inode *inode = page->mapping->host;
loff_t pos = page_offset(page) + offset_in_page(de);
+ /*
+ * The "de" dentry points somewhere in the same page whose we need the
+ * address of; therefore, we can simply get the base address "kaddr" by
+ * masking the previous with PAGE_MASK.
+ */
+ char *kaddr = (char *)((unsigned long)de & PAGE_MASK);
int err;

lock_page(page);
@@ -235,7 +252,7 @@ int sysv_delete_entry(struct sysv_dir_entry *de, struct page *page)
BUG_ON(err);
de->inode = 0;
err = dir_commit_chunk(page, pos, SYSV_DIRSIZE);
- dir_put_page(page);
+ dir_put_page(page, kaddr);
inode->i_ctime = inode->i_mtime = current_time(inode);
mark_inode_dirty(inode);
return err;
@@ -255,9 +272,7 @@ int sysv_make_empty(struct inode *inode, struct inode *dir)
unlock_page(page);
goto fail;
}
- kmap(page);
-
- base = (char*)page_address(page);
+ base = kmap_local_page(page);
memset(base, 0, PAGE_SIZE);

de = (struct sysv_dir_entry *) base;
@@ -267,7 +282,7 @@ int sysv_make_empty(struct inode *inode, struct inode *dir)
de->inode = cpu_to_fs16(SYSV_SB(inode->i_sb), dir->i_ino);
strcpy(de->name,"..");

- kunmap(page);
+ kunmap_local(base);
err = dir_commit_chunk(page, 0, 2 * SYSV_DIRSIZE);
fail:
put_page(page);
@@ -282,10 +297,10 @@ int sysv_empty_dir(struct inode * inode)
struct super_block *sb = inode->i_sb;
struct page *page = NULL;
unsigned long i, npages = dir_pages(inode);
+ char *kaddr;

for (i = 0; i < npages; i++) {
- char *kaddr;
- struct sysv_dir_entry * de;
+ struct sysv_dir_entry *de;

kaddr = dir_get_page(inode, i, &page);
if (IS_ERR(kaddr))
@@ -309,12 +324,12 @@ int sysv_empty_dir(struct inode * inode)
if (de->name[1] != '.' || de->name[2])
goto not_empty;
}
- dir_put_page(page);
+ dir_put_page(page, kaddr);
}
return 1;

not_empty:
- dir_put_page(page);
+ dir_put_page(page, kaddr);
return 0;
}

@@ -331,11 +346,18 @@ void sysv_set_link(struct sysv_dir_entry *de, struct page *page,
BUG_ON(err);
de->inode = cpu_to_fs16(SYSV_SB(inode->i_sb), inode->i_ino);
err = dir_commit_chunk(page, pos, SYSV_DIRSIZE);
- dir_put_page(page);
+ dir_put_page(page, de);
dir->i_mtime = dir->i_ctime = current_time(dir);
mark_inode_dirty(dir);
}

+/*
+ * Calls to dir_get_page()/dir_put_page() must be nested according to the
+ * rules documented in mm/highmem.rst.
+ *
+ * sysv_dotdot() acts as a call to dir_get_page() and must be treated
+ * accordingly for nesting purposes.
+ */
struct sysv_dir_entry *sysv_dotdot(struct inode *dir, struct page **p)
{
struct sysv_dir_entry *de = dir_get_page(dir, 0, p);
@@ -354,7 +376,7 @@ ino_t sysv_inode_by_name(struct dentry *dentry)

if (de) {
res = fs16_to_cpu(SYSV_SB(dentry->d_sb), de->inode);
- dir_put_page(page);
+ dir_put_page(page, de);
}
return res;
}
diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c
index 981c1d76f342..371cf9012052 100644
--- a/fs/sysv/namei.c
+++ b/fs/sysv/namei.c
@@ -251,9 +251,9 @@ static int sysv_rename(struct user_namespace *mnt_userns, struct inode *old_dir,

out_dir:
if (dir_de)
- dir_put_page(dir_page);
+ dir_put_page(dir_page, dir_de);
out_old:
- dir_put_page(old_page);
+ dir_put_page(old_page, old_de);
out:
return err;
}
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index b250ac1dd348..50f19bfd8d10 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -148,7 +148,7 @@ extern void sysv_destroy_icache(void);


/* dir.c */
-extern void dir_put_page(struct page *page);
+extern void dir_put_page(struct page *page, void *vaddr);
extern struct sysv_dir_entry *sysv_find_entry(struct dentry *, struct page **);
extern int sysv_add_link(struct dentry *, struct inode *);
extern int sysv_delete_entry(struct sysv_dir_entry *, struct page *);
--
2.39.0

2023-01-19 16:28:38

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH v3 3/4] fs/sysv: Use dir_put_page() in sysv_rename()

Use the dir_put_page() helper in sysv_rename() instead of open-coding two
kunmap() + put_page().

Cc: Al Viro <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
fs/sysv/dir.c | 2 +-
fs/sysv/namei.c | 9 +++------
fs/sysv/sysv.h | 1 +
3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/sysv/dir.c b/fs/sysv/dir.c
index 8d14c6c02476..2e35b95d3efb 100644
--- a/fs/sysv/dir.c
+++ b/fs/sysv/dir.c
@@ -28,7 +28,7 @@ const struct file_operations sysv_dir_operations = {
.fsync = generic_file_fsync,
};

-static inline void dir_put_page(struct page *page)
+inline void dir_put_page(struct page *page)
{
kunmap(page);
put_page(page);
diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c
index b2e6abc06a2d..981c1d76f342 100644
--- a/fs/sysv/namei.c
+++ b/fs/sysv/namei.c
@@ -250,13 +250,10 @@ static int sysv_rename(struct user_namespace *mnt_userns, struct inode *old_dir,
return 0;

out_dir:
- if (dir_de) {
- kunmap(dir_page);
- put_page(dir_page);
- }
+ if (dir_de)
+ dir_put_page(dir_page);
out_old:
- kunmap(old_page);
- put_page(old_page);
+ dir_put_page(old_page);
out:
return err;
}
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index 99ddf033da4f..b250ac1dd348 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -148,6 +148,7 @@ extern void sysv_destroy_icache(void);


/* dir.c */
+extern void dir_put_page(struct page *page);
extern struct sysv_dir_entry *sysv_find_entry(struct dentry *, struct page **);
extern int sysv_add_link(struct dentry *, struct inode *);
extern int sysv_delete_entry(struct sysv_dir_entry *, struct page *);
--
2.39.0

2023-01-20 01:50:29

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
> @@ -228,6 +239,12 @@ int sysv_delete_entry(struct sysv_dir_entry *de, struct page *page)
> {
> struct inode *inode = page->mapping->host;
> loff_t pos = page_offset(page) + offset_in_page(de);
> + /*
> + * The "de" dentry points somewhere in the same page whose we need the
> + * address of; therefore, we can simply get the base address "kaddr" by
> + * masking the previous with PAGE_MASK.
> + */
> + char *kaddr = (char *)((unsigned long)de & PAGE_MASK);

er... ITYM "therefore we can pass de to dir_put_page() and get rid of that kaddr
thing"...

Anyway, with that change the series rebased and applied on top of Christoph's sysv
patch.

2023-01-20 04:44:53

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Fri, Jan 20, 2023 at 04:21:06AM +0000, Al Viro wrote:
> On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
>
> > -inline void dir_put_page(struct page *page)
> > +inline void dir_put_page(struct page *page, void *page_addr)
> > {
> > - kunmap(page);
> > + kunmap_local(page_addr);
>
> ... and that needed to be fixed - at some point "round down to beginning of
> page" got lost in rebasing...

You don't need to round down in kunmap(). See:

void kunmap_local_indexed(const void *vaddr)
{
unsigned long addr = (unsigned long) vaddr & PAGE_MASK;

2023-01-20 04:46:16

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:

> -inline void dir_put_page(struct page *page)
> +inline void dir_put_page(struct page *page, void *page_addr)
> {
> - kunmap(page);
> + kunmap_local(page_addr);

... and that needed to be fixed - at some point "round down to beginning of
page" got lost in rebasing...

2023-01-20 05:51:44

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Fri, Jan 20, 2023 at 04:28:12AM +0000, Matthew Wilcox wrote:
> On Fri, Jan 20, 2023 at 04:21:06AM +0000, Al Viro wrote:
> > On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
> >
> > > -inline void dir_put_page(struct page *page)
> > > +inline void dir_put_page(struct page *page, void *page_addr)
> > > {
> > > - kunmap(page);
> > > + kunmap_local(page_addr);
> >
> > ... and that needed to be fixed - at some point "round down to beginning of
> > page" got lost in rebasing...
>
> You don't need to round down in kunmap(). See:
>
> void kunmap_local_indexed(const void *vaddr)
> {
> unsigned long addr = (unsigned long) vaddr & PAGE_MASK;
>

Sure, but... there's also this:

static inline void __kunmap_local(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
kunmap_flush_on_unmap(addr);
#endif
}

Are you sure that the guts of that thing will be happy with address that is not
page-aligned? I've looked there at some point, got scared of parisc (IIRC)
MMU details and decided not to rely upon that...

2023-01-20 05:52:43

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Fri, Jan 20, 2023 at 04:45:17AM +0000, Al Viro wrote:
> On Fri, Jan 20, 2023 at 04:28:12AM +0000, Matthew Wilcox wrote:
> > On Fri, Jan 20, 2023 at 04:21:06AM +0000, Al Viro wrote:
> > > On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
> > >
> > > > -inline void dir_put_page(struct page *page)
> > > > +inline void dir_put_page(struct page *page, void *page_addr)
> > > > {
> > > > - kunmap(page);
> > > > + kunmap_local(page_addr);
> > >
> > > ... and that needed to be fixed - at some point "round down to beginning of
> > > page" got lost in rebasing...
> >
> > You don't need to round down in kunmap(). See:
> >
> > void kunmap_local_indexed(const void *vaddr)
> > {
> > unsigned long addr = (unsigned long) vaddr & PAGE_MASK;
> >
>
> Sure, but... there's also this:
>
> static inline void __kunmap_local(const void *addr)
> {
> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> kunmap_flush_on_unmap(addr);
> #endif
> }
>
> Are you sure that the guts of that thing will be happy with address that is not
> page-aligned? I've looked there at some point, got scared of parisc (IIRC)
> MMU details and decided not to rely upon that...

Ugh, PA-RISC (the only implementor) definitely will flush the wrong
addresses. I think we should do this, as having bugs that only manifest
on one not-well-tested architecture seems Bad.

static inline void __kunmap_local(const void *addr)
{
#ifdef ARCH_HAS_FLUSH_ON_KUNMAP
- kunmap_flush_on_unmap(addr);
+ kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
#endif
}

Thoughts?

2023-01-20 05:55:38

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:

> > Sure, but... there's also this:
> >
> > static inline void __kunmap_local(const void *addr)
> > {
> > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > kunmap_flush_on_unmap(addr);
> > #endif
> > }
> >
> > Are you sure that the guts of that thing will be happy with address that is not
> > page-aligned? I've looked there at some point, got scared of parisc (IIRC)
> > MMU details and decided not to rely upon that...
>
> Ugh, PA-RISC (the only implementor) definitely will flush the wrong
> addresses. I think we should do this, as having bugs that only manifest
> on one not-well-tested architecture seems Bad.
>
> static inline void __kunmap_local(const void *addr)
> {
> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> - kunmap_flush_on_unmap(addr);
> + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
> #endif
> }

PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?

2023-01-20 06:15:27

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
> On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
>
> > > Sure, but... there's also this:
> > >
> > > static inline void __kunmap_local(const void *addr)
> > > {
> > > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > > kunmap_flush_on_unmap(addr);
> > > #endif
> > > }
> > >
> > > Are you sure that the guts of that thing will be happy with address that is not
> > > page-aligned? I've looked there at some point, got scared of parisc (IIRC)
> > > MMU details and decided not to rely upon that...
> >
> > Ugh, PA-RISC (the only implementor) definitely will flush the wrong
> > addresses. I think we should do this, as having bugs that only manifest
> > on one not-well-tested architecture seems Bad.
> >
> > static inline void __kunmap_local(const void *addr)
> > {
> > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > - kunmap_flush_on_unmap(addr);
> > + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
> > #endif
> > }
>
> PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?

Anyway, that's a question to parisc folks; I _think_ pdtlb
quietly ignores the lower bits of address, so that part seems
to be safe, but I wouldn't bet upon that. And when I got to
flush_kernel_dcache_page_asm I gave up - it's been a long time
since I've dealt with parisc assembler.

2023-01-20 07:34:26

by Helge Deller

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On 1/20/23 06:56, Al Viro wrote:
> On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
>> On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
>>
>>>> Sure, but... there's also this:
>>>>
>>>> static inline void __kunmap_local(const void *addr)
>>>> {
>>>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
>>>> kunmap_flush_on_unmap(addr);
>>>> #endif
>>>> }
>>>>
>>>> Are you sure that the guts of that thing will be happy with address that is not
>>>> page-aligned? I've looked there at some point, got scared of parisc (IIRC)
>>>> MMU details and decided not to rely upon that...
>>>
>>> Ugh, PA-RISC (the only implementor) definitely will flush the wrong
>>> addresses. I think we should do this, as having bugs that only manifest
>>> on one not-well-tested architecture seems Bad.
>>>
>>> static inline void __kunmap_local(const void *addr)
>>> {
>>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
>>> - kunmap_flush_on_unmap(addr);
>>> + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
>>> #endif
>>> }
>>
>> PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?
>
> Anyway, that's a question to parisc folks; I _think_ pdtlb
> quietly ignores the lower bits of address, so that part seems
> to be safe, but I wouldn't bet upon that.

No, on PA2.0 (64bit) CPUs the lower bits of the address of pdtlb
encodes the amount of memory (page size) to be flushed, see:
http://ftp.parisc-linux.org/docs/arch/parisc2.0.pdf (page 7-106)
So, the proposed page alignment with e.g. PTR_ALIGN_DON() is needed.

Helge

2023-01-21 08:33:39

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

Helge Deller wrote:
> On 1/20/23 06:56, Al Viro wrote:
> > On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
> >> On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
> >>
> >>>> Sure, but... there's also this:
> >>>>
> >>>> static inline void __kunmap_local(const void *addr)
> >>>> {
> >>>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> >>>> kunmap_flush_on_unmap(addr);
> >>>> #endif
> >>>> }
> >>>>
> >>>> Are you sure that the guts of that thing will be happy with address that is not
> >>>> page-aligned? I've looked there at some point, got scared of parisc (IIRC)
> >>>> MMU details and decided not to rely upon that...
> >>>
> >>> Ugh, PA-RISC (the only implementor) definitely will flush the wrong
> >>> addresses. I think we should do this, as having bugs that only manifest
> >>> on one not-well-tested architecture seems Bad.
> >>>
> >>> static inline void __kunmap_local(const void *addr)
> >>> {
> >>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> >>> - kunmap_flush_on_unmap(addr);
> >>> + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
> >>> #endif
> >>> }
> >>
> >> PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?
> >
> > Anyway, that's a question to parisc folks; I _think_ pdtlb
> > quietly ignores the lower bits of address, so that part seems
> > to be safe, but I wouldn't bet upon that.
>
> No, on PA2.0 (64bit) CPUs the lower bits of the address of pdtlb
> encodes the amount of memory (page size) to be flushed, see:
> http://ftp.parisc-linux.org/docs/arch/parisc2.0.pdf (page 7-106)
> So, the proposed page alignment with e.g. PTR_ALIGN_DON() is needed.
>
> Helge

I'm not sure I completely understand.

First, arn't PAGE_ALIGN_DOWN(addr) and PTR_ALIGN_DOWN(addr, PAGE_SIZE) the
same?

align.h
#define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))

mm.h:
#define PAGE_ALIGN_DOWN(addr) ALIGN_DOWN(addr, PAGE_SIZE)

Did parisc redefine it somewhere I'm not seeing?

Second, if the lower bits encode the amount of memory to be flushed is it
required to return the original value returned from page_address()?

Ira

2023-01-21 11:25:17

by Helge Deller

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On 1/21/23 09:05, Ira Weiny wrote:
> Helge Deller wrote:
>> On 1/20/23 06:56, Al Viro wrote:
>>> On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
>>>> On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
>>>>
>>>>>> Sure, but... there's also this:
>>>>>>
>>>>>> static inline void __kunmap_local(const void *addr)
>>>>>> {
>>>>>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
>>>>>> kunmap_flush_on_unmap(addr);
>>>>>> #endif
>>>>>> }
>>>>>>
>>>>>> Are you sure that the guts of that thing will be happy with address that is not
>>>>>> page-aligned? I've looked there at some point, got scared of parisc (IIRC)
>>>>>> MMU details and decided not to rely upon that...
>>>>>
>>>>> Ugh, PA-RISC (the only implementor) definitely will flush the wrong
>>>>> addresses. I think we should do this, as having bugs that only manifest
>>>>> on one not-well-tested architecture seems Bad.
>>>>>
>>>>> static inline void __kunmap_local(const void *addr)
>>>>> {
>>>>> #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
>>>>> - kunmap_flush_on_unmap(addr);
>>>>> + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
>>>>> #endif
>>>>> }
>>>>
>>>> PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?
>>>
>>> Anyway, that's a question to parisc folks; I _think_ pdtlb
>>> quietly ignores the lower bits of address, so that part seems
>>> to be safe, but I wouldn't bet upon that.
>>
>> No, on PA2.0 (64bit) CPUs the lower bits of the address of pdtlb
>> encodes the amount of memory (page size) to be flushed, see:
>> http://ftp.parisc-linux.org/docs/arch/parisc2.0.pdf (page 7-106)
>> So, the proposed page alignment with e.g. PTR_ALIGN_DON() is needed.
>>
>
> I'm not sure I completely understand.
>
> First, arn't PAGE_ALIGN_DOWN(addr) and PTR_ALIGN_DOWN(addr, PAGE_SIZE) the
> same?

Yes, they are.

> align.h
> #define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))
>
> mm.h:
> #define PAGE_ALIGN_DOWN(addr) ALIGN_DOWN(addr, PAGE_SIZE)
>
> Did parisc redefine it somewhere I'm not seeing?

No, there is nothing special in this regard on parisc.

> Second, if the lower bits encode the amount of memory to be flushed is it
> required to return the original value returned from page_address()?

No.
If the lower bits are zero, then the default page size (4k) is used for the tlb purge.
So, if you simply strip the lower bits (by using PAGE_ALIGN_DOWN() or ALIGN_DOWN())
you are fine.

Helge

2023-01-21 20:29:43

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Sat, Jan 21, 2023 at 12:05:58AM -0800, Ira Weiny wrote:

> First, arn't PAGE_ALIGN_DOWN(addr) and PTR_ALIGN_DOWN(addr, PAGE_SIZE) the
> same?
>
> align.h
> #define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))
>
> mm.h:
> #define PAGE_ALIGN_DOWN(addr) ALIGN_DOWN(addr, PAGE_SIZE)

... and ALIGN_DOWN ends up with doing bitwise and on the first argument.
Which doesn't work for pointers, thus the separate variant for those
and typecast to unsigned long in it...

2023-01-23 17:03:21

by Fabio M. De Francesco

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
> @@ -228,6 +239,12 @@ int sysv_delete_entry(struct sysv_dir_entry *de, struct
page *page)
> {
> struct inode *inode = page->mapping->host;
> loff_t pos = page_offset(page) + offset_in_page(de);
> + /*
> + * The "de" dentry points somewhere in the same page whose we need
the
> + * address of; therefore, we can simply get the base address "kaddr"
by
> + * masking the previous with PAGE_MASK.
> + */
> + char *kaddr = (char *)((unsigned long)de & PAGE_MASK);

er... ITYM "therefore we can pass de to dir_put_page() and get rid of that
kaddr
thing"...

Anyway, with that change the series rebased and applied on top of Christoph's
sysv
patch.

On venerd? 20 gennaio 2023 05:21:06 CET Al Viro wrote:
> On Thu, Jan 19, 2023 at 04:32:32PM +0100, Fabio M. De Francesco wrote:
> > -inline void dir_put_page(struct page *page)
> > +inline void dir_put_page(struct page *page, void *page_addr)
> >
> > {
> >
> > - kunmap(page);
> > + kunmap_local(page_addr);
>
> ... and that needed to be fixed - at some point "round down to beginning of
> page" got lost in rebasing...

Sorry for this late reply.
Unfortunately, at the moment, I have too little free time for kernel
development.

@Al...

I merged the two messages from you because they are related to changes to the
same patch. I would like to know if I am interpreting both correctly or not.

From your posts it sounds like you're saying you made the necessary changes
and then applied this series on top of Christoph's latest patch to fs/sysv, so
don't expect me to push a new version.

Did I understand correctly or am I missing something?
Anyway, if my interpretation is correct, thank you very much.

Can you please confirm that I understood correctly or not?

Thank you,

Fabio



2023-01-23 17:14:29

by Fabio M. De Francesco

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

On venerd? 20 gennaio 2023 06:56:01 CET Al Viro wrote:
> On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
> > On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
> > > > Sure, but... there's also this:
> > > >
> > > > static inline void __kunmap_local(const void *addr)
> > > > {
> > > > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > > >
> > > > kunmap_flush_on_unmap(addr);
> > > >
> > > > #endif
> > > > }
> > > >
> > > > Are you sure that the guts of that thing will be happy with address
that
> > > > is not page-aligned? I've looked there at some point, got scared of
> > > > parisc (IIRC) MMU details and decided not to rely upon that...
> > >
> > > Ugh, PA-RISC (the only implementer) definitely will flush the wrong
> > > addresses. I think we should do this, as having bugs that only manifest
> > > on one not-well-tested architecture seems Bad.
> > >
> > > static inline void __kunmap_local(const void *addr)
> > > {
> > > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > >
> > > - kunmap_flush_on_unmap(addr);
> > > + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
> > >
> > > #endif
> > > }
> >
> > PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?
>
> Anyway, that's a question to parisc folks; I _think_ pdtlb
> quietly ignores the lower bits of address, so that part seems
> to be safe, but I wouldn't bet upon that. And when I got to
> flush_kernel_dcache_page_asm I gave up - it's been a long time
> since I've dealt with parisc assembler.

There seems to be consensus that __kunmap_local() needs to be fixed for the
parisc case (ARCH_HAS_FLUSH_ON_KUNMAP).

Is anyone doing this task?

If you agree, I could make this change and give proper credits for the tip.

Thank you,

Fabio




2023-01-24 20:17:11

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH v3 4/4] fs/sysv: Replace kmap() with kmap_local_page()

Fabio M. De Francesco wrote:
> On venerd? 20 gennaio 2023 06:56:01 CET Al Viro wrote:
> > On Fri, Jan 20, 2023 at 05:07:48AM +0000, Al Viro wrote:
> > > On Fri, Jan 20, 2023 at 04:54:51AM +0000, Matthew Wilcox wrote:
> > > > > Sure, but... there's also this:
> > > > >
> > > > > static inline void __kunmap_local(const void *addr)
> > > > > {
> > > > > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > > > >
> > > > > kunmap_flush_on_unmap(addr);
> > > > >
> > > > > #endif
> > > > > }
> > > > >
> > > > > Are you sure that the guts of that thing will be happy with address
> that
> > > > > is not page-aligned? I've looked there at some point, got scared of
> > > > > parisc (IIRC) MMU details and decided not to rely upon that...
> > > >
> > > > Ugh, PA-RISC (the only implementer) definitely will flush the wrong
> > > > addresses. I think we should do this, as having bugs that only manifest
> > > > on one not-well-tested architecture seems Bad.
> > > >
> > > > static inline void __kunmap_local(const void *addr)
> > > > {
> > > > #ifdef ARCH_HAS_FLUSH_ON_KUNMAP
> > > >
> > > > - kunmap_flush_on_unmap(addr);
> > > > + kunmap_flush_on_unmap(PAGE_ALIGN_DOWN(addr));
> > > >
> > > > #endif
> > > > }
> > >
> > > PTR_ALIGN_DOWN(addr, PAGE_SIZE), perhaps?
> >
> > Anyway, that's a question to parisc folks; I _think_ pdtlb
> > quietly ignores the lower bits of address, so that part seems
> > to be safe, but I wouldn't bet upon that. And when I got to
> > flush_kernel_dcache_page_asm I gave up - it's been a long time
> > since I've dealt with parisc assembler.
>
> There seems to be consensus that __kunmap_local() needs to be fixed for the
> parisc case (ARCH_HAS_FLUSH_ON_KUNMAP).
>
> Is anyone doing this task?

I'm not looking at it.

>
> If you agree, I could make this change and give proper credits for the tip.

I think that would be great.

Thanks!
Ira

>
> Thank you,
>
> Fabio
>
>
>