2022-12-21 17:44:24

by Fabio M. De Francesco

Subject: [PATCH v4 0/3] fs/ufs: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in fs/ufs is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in fs/ufs. kunmap_local()
requires the mapping address, so return that address from ufs_get_page()
to be used in ufs_put_page().
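
As a rough userspace illustration of why the address has to travel from the
"get" helper to the "put" helper, the sketch below models the pairing. All
demo_* names are hypothetical stand-ins and not kernel APIs: the map helper
hands back an address, and the unmap helper receives only that address, as
kunmap_local() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; 'data' is deliberately the first member. */
struct demo_page {
	char data[4096];
	int refcount;	/* models the page reference */
	int mapped;	/* models an active local mapping */
};

/* Models kmap_local_page(): the caller must keep the returned address. */
static void *demo_map_local(struct demo_page *pg)
{
	pg->mapped++;
	return pg->data;
}

/* Models kunmap_local(): it is given an address, not a page. */
static void demo_unmap_local(void *addr)
{
	/* Valid only because 'data' sits at offset 0 of struct demo_page. */
	struct demo_page *pg = (struct demo_page *)addr;

	pg->mapped--;
}

/* Shape of the new "get" helper: address as return value, page via *p. */
static void *demo_get_page(struct demo_page *src, struct demo_page **p)
{
	src->refcount++;	/* models the reference held on the page */
	*p = src;
	return demo_map_local(src);
}

/* Shape of the new "put" helper: it needs both the page and the address. */
static void demo_put_page(struct demo_page *page, void *page_addr)
{
	demo_unmap_local(page_addr);
	page->refcount--;	/* models put_page() */
}
```

A caller keeps the returned address next to the page pointer and hands both
back to demo_put_page() when it is done with the page.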

This series might never have been written, since nothing prevented the
previous patch from working properly. However, Al Viro made a long series
of much appreciated comments about how many unnecessary and redundant
lines of code I could have removed. He could see things I was entirely
unable to notice. Furthermore, he also provided solutions and details
about how I could decompose a single patch into a small series of three
independent units.[1][2][3]

I want to thank him so much for his patience and kindness, and for the
time he decided to spend providing that analysis and writing three
messages full of interesting insights.[1][2][3]

Changes from v1:
1/3: No changes.
2/3: Restore the return of "err" that was mistakenly deleted
     together with the removal of the "out" label in
     ufs_add_link(). Thanks to Al Viro.[4]
     Return the address of the kmap()'ed page instead of a
     pointer to a pointer to the mapped page; a page_address()
     had been overlooked in ufs_get_page(). Thanks to Al
     Viro.[5]
3/3: Return the kernel virtual address obtained from the call to
     kmap_local_page() after conversion from kmap(). Again
     thanks to Al Viro.[6]

Changes from v2:
1/3: No changes.
2/3: Rework ufs_get_page() because the previous version had two
     errors: (1) It could return an invalid page with the out
     argument "page" and (2) it could return "page_address(page)"
     also in cases where read_mapping_page() returned an error
     and the page is never kmap()'ed. Thanks to Al Viro.[7]
3/3: Rework ufs_get_page() after conversion to
     kmap_local_page(), in accordance with the last changes in 2/3.

Changes from v3:
1/3: No changes.
2/3: No changes.
3/3: Replace kunmap() with kunmap_local().

[1] https://lore.kernel.org/lkml/Y4E++JERgUMoqfjG@ZenIV/
[2] https://lore.kernel.org/lkml/Y4FG0O7VWTTng5yh@ZenIV/
[3] https://lore.kernel.org/lkml/Y4ONIFJatIGsVNpf@ZenIV/
[4] https://lore.kernel.org/lkml/Y5Zc0qZ3+zsI74OZ@ZenIV/
[5] https://lore.kernel.org/lkml/Y5ZZy23FFAnQDR3C@ZenIV/
[6] https://lore.kernel.org/lkml/Y5ZcMPzPG9h6C9eh@ZenIV/
[7] https://lore.kernel.org/lkml/Y5glgpD7fFifC4Fi@ZenIV/#t

The cover letter of the v1 series is at
https://lore.kernel.org/lkml/[email protected]/
The cover letter of the v2 series is at
https://lore.kernel.org/lkml/[email protected]/
The cover letter of the v3 series is at
https://lore.kernel.org/lkml/[email protected]/

Fabio M. De Francesco (3):
fs/ufs: Use the offset_in_page() helper
fs/ufs: Change the signature of ufs_get_page()
fs/ufs: Replace kmap() with kmap_local_page()

fs/ufs/dir.c | 134 +++++++++++++++++++++++++++------------------------
1 file changed, 71 insertions(+), 63 deletions(-)

--
2.39.0


2022-12-21 17:45:03

by Fabio M. De Francesco

Subject: [PATCH v4 3/3] fs/ufs: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
the mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in fs/ufs is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in fs/ufs. kunmap_local()
requires the mapping address, so return that address from ufs_get_page()
to be used in ufs_put_page().

Suggested-by: Al Viro <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
fs/ufs/dir.c | 75 ++++++++++++++++++++++++++++++++--------------------
1 file changed, 46 insertions(+), 29 deletions(-)

diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
index 9fa86614d2d1..ed3568da29a8 100644
--- a/fs/ufs/dir.c
+++ b/fs/ufs/dir.c
@@ -61,9 +61,9 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len)
return err;
}

-static inline void ufs_put_page(struct page *page)
+static inline void ufs_put_page(struct page *page, void *page_addr)
{
- kunmap(page);
+ kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));
put_page(page);
}

@@ -76,7 +76,7 @@ ino_t ufs_inode_by_name(struct inode *dir, const struct qstr *qstr)
de = ufs_find_entry(dir, qstr, &page);
if (de) {
res = fs32_to_cpu(dir->i_sb, de->d_ino);
- ufs_put_page(page);
+ ufs_put_page(page, de);
}
return res;
}
@@ -99,18 +99,17 @@ void ufs_set_link(struct inode *dir, struct ufs_dir_entry *de,
ufs_set_de_type(dir->i_sb, de, inode->i_mode);

err = ufs_commit_chunk(page, pos, len);
- ufs_put_page(page);
+ ufs_put_page(page, de);
if (update_times)
dir->i_mtime = dir->i_ctime = current_time(dir);
mark_inode_dirty(dir);
}


-static bool ufs_check_page(struct page *page)
+static bool ufs_check_page(struct page *page, char *kaddr)
{
struct inode *dir = page->mapping->host;
struct super_block *sb = dir->i_sb;
- char *kaddr = page_address(page);
unsigned offs, rec_len;
unsigned limit = PAGE_SIZE;
const unsigned chunk_mask = UFS_SB(sb)->s_uspi->s_dirblksize - 1;
@@ -185,23 +184,32 @@ static bool ufs_check_page(struct page *page)
return false;
}

+/*
+ * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
+ * rules documented in kmap_local_page()/kunmap_local().
+ *
+ * NOTE: ufs_find_entry() and ufs_dotdot() act as calls to ufs_get_page()
+ * and must be treated accordingly for nesting purposes.
+ */
static void *ufs_get_page(struct inode *dir, unsigned long n, struct page **p)
{
+ char *kaddr;
+
struct address_space *mapping = dir->i_mapping;
struct page *page = read_mapping_page(mapping, n, NULL);
if (!IS_ERR(page)) {
- kmap(page);
+ kaddr = kmap_local_page(page);
if (unlikely(!PageChecked(page))) {
- if (!ufs_check_page(page))
+ if (!ufs_check_page(page, kaddr))
goto fail;
}
*p = page;
- return page_address(page);
+ return kaddr;
}
return ERR_CAST(page);

fail:
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
return ERR_PTR(-EIO);
}

@@ -227,6 +235,13 @@ ufs_next_entry(struct super_block *sb, struct ufs_dir_entry *p)
fs16_to_cpu(sb, p->d_reclen));
}

+/*
+ * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
+ * rules documented in kmap_local_page()/kunmap_local().
+ *
+ * ufs_dotdot() acts as a call to ufs_get_page() and must be treated
+ * accordingly for nesting purposes.
+ */
struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
{
struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);
@@ -238,12 +253,15 @@ struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
}

/*
- * ufs_find_entry()
+ * Finds an entry in the specified directory with the wanted name. It returns a
+ * pointer to the directory's entry. The page in which the entry was found is
+ * in the res_page out parameter. The page is returned mapped and unlocked.
+ * The entry is guaranteed to be valid.
*
- * finds an entry in the specified directory with the wanted name. It
- * returns the page in which the entry was found, and the entry itself
- * (as a parameter - res_dir). Page is returned mapped and unlocked.
- * Entry is guaranteed to be valid.
+ * On Success ufs_put_page() should be called on *res_page.
+ *
+ * ufs_find_entry() acts as a call to ufs_get_page() and must be treated
+ * accordingly for nesting purposes.
*/
struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
struct page **res_page)
@@ -282,7 +300,7 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
goto found;
de = ufs_next_entry(sb, de);
}
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
}
if (++n >= npages)
n = 0;
@@ -360,7 +378,7 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
de = (struct ufs_dir_entry *) ((char *) de + rec_len);
}
unlock_page(page);
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
}
BUG();
return -EINVAL;
@@ -390,7 +408,7 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
mark_inode_dirty(dir);
/* OFFSET_CACHE */
out_put:
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
return err;
out_unlock:
unlock_page(page);
@@ -468,13 +486,13 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
ufs_get_de_namlen(sb, de),
fs32_to_cpu(sb, de->d_ino),
d_type)) {
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
return 0;
}
}
ctx->pos += fs16_to_cpu(sb, de->d_reclen);
}
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
}
return 0;
}
@@ -485,10 +503,10 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
* previous entry.
*/
int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
- struct page * page)
+ struct page *page)
{
struct super_block *sb = inode->i_sb;
- char *kaddr = page_address(page);
+ char *kaddr = (char *)((unsigned long)dir & PAGE_MASK);
unsigned int from = offset_in_page(dir) & ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
unsigned int to = offset_in_page(dir) + fs16_to_cpu(sb, dir->d_reclen);
loff_t pos;
@@ -527,7 +545,7 @@ int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
inode->i_ctime = inode->i_mtime = current_time(inode);
mark_inode_dirty(inode);
out:
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
UFSD("EXIT\n");
return err;
}
@@ -551,8 +569,7 @@ int ufs_make_empty(struct inode * inode, struct inode *dir)
goto fail;
}

- kmap(page);
- base = (char*)page_address(page);
+ base = kmap_local_page(page);
memset(base, 0, PAGE_SIZE);

de = (struct ufs_dir_entry *) base;
@@ -569,7 +586,7 @@ int ufs_make_empty(struct inode * inode, struct inode *dir)
de->d_reclen = cpu_to_fs16(sb, chunk_size - UFS_DIR_REC_LEN(1));
ufs_set_de_namlen(sb, de, 2);
strcpy (de->d_name, "..");
- kunmap(page);
+ kunmap_local(base);

err = ufs_commit_chunk(page, 0, chunk_size);
fail:
@@ -585,9 +602,9 @@ int ufs_empty_dir(struct inode * inode)
struct super_block *sb = inode->i_sb;
struct page *page = NULL;
unsigned long i, npages = dir_pages(inode);
+ char *kaddr;

for (i = 0; i < npages; i++) {
- char *kaddr;
struct ufs_dir_entry *de;

kaddr = ufs_get_page(inode, i, &page);
@@ -620,12 +637,12 @@ int ufs_empty_dir(struct inode * inode)
}
de = ufs_next_entry(sb, de);
}
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
}
return 1;

not_empty:
- ufs_put_page(page);
+ ufs_put_page(page, kaddr);
return 0;
}

--
2.39.0

2022-12-21 17:45:40

by Fabio M. De Francesco

Subject: [PATCH v4 2/3] fs/ufs: Change the signature of ufs_get_page()

Change the signature of ufs_get_page() in order to prepare this function
for the conversion to kmap_local_page(). Also change the call sites so
that their invocations conform to the new signature.
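
With the new signature a single void * return value carries either a usable
address or an error code, which callers test with IS_ERR(). The following is
a simplified userspace re-creation of that error-pointer idiom, offered only
as a sketch. The three helpers are modeled on the kernel's include/linux/err.h,
while struct demo_page and demo_get_page() are hypothetical and exist purely
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified userspace copies of the kernel helpers (assumption: the last
 * MAX_ERRNO values of the address space are reserved for error codes). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in, not the kernel's struct page. */
struct demo_page {
	char data[64];
};

static struct demo_page demo_page_storage;

/* Hypothetical analogue of the new ufs_get_page() shape: the address is the
 * return value and the page comes back through the out parameter, which is
 * written only on success. */
static void *demo_get_page(int fail, struct demo_page **p)
{
	if (fail)
		return ERR_PTR(-EIO);
	*p = &demo_page_storage;
	return demo_page_storage.data;
}
```

A caller checks IS_ERR() on the returned address before touching either the
address or the page, and recovers the errno value with PTR_ERR().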

Cc: Ira Weiny <[email protected]>
Suggested-by: Al Viro <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
fs/ufs/dir.c | 49 +++++++++++++++++++++----------------------------
1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
index 69f78583c9c1..9fa86614d2d1 100644
--- a/fs/ufs/dir.c
+++ b/fs/ufs/dir.c
@@ -185,7 +185,7 @@ static bool ufs_check_page(struct page *page)
return false;
}

-static struct page *ufs_get_page(struct inode *dir, unsigned long n)
+static void *ufs_get_page(struct inode *dir, unsigned long n, struct page **p)
{
struct address_space *mapping = dir->i_mapping;
struct page *page = read_mapping_page(mapping, n, NULL);
@@ -195,8 +195,10 @@ static struct page *ufs_get_page(struct inode *dir, unsigned long n)
if (!ufs_check_page(page))
goto fail;
}
+ *p = page;
+ return page_address(page);
}
- return page;
+ return ERR_CAST(page);

fail:
ufs_put_page(page);
@@ -227,15 +229,12 @@ ufs_next_entry(struct super_block *sb, struct ufs_dir_entry *p)

struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
{
- struct page *page = ufs_get_page(dir, 0);
- struct ufs_dir_entry *de = NULL;
+ struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);

- if (!IS_ERR(page)) {
- de = ufs_next_entry(dir->i_sb,
- (struct ufs_dir_entry *)page_address(page));
- *p = page;
- }
- return de;
+ if (!IS_ERR(de))
+ return ufs_next_entry(dir->i_sb, de);
+ else
+ return NULL;
}

/*
@@ -273,11 +272,10 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
start = 0;
n = start;
do {
- char *kaddr;
- page = ufs_get_page(dir, n);
- if (!IS_ERR(page)) {
- kaddr = page_address(page);
- de = (struct ufs_dir_entry *) kaddr;
+ char *kaddr = ufs_get_page(dir, n, &page);
+
+ if (!IS_ERR(kaddr)) {
+ de = (struct ufs_dir_entry *)kaddr;
kaddr += ufs_last_byte(dir, n) - reclen;
while ((char *) de <= kaddr) {
if (ufs_match(sb, namelen, name, de))
@@ -328,12 +326,10 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
for (n = 0; n <= npages; n++) {
char *dir_end;

- page = ufs_get_page(dir, n);
- err = PTR_ERR(page);
- if (IS_ERR(page))
- goto out;
+ kaddr = ufs_get_page(dir, n, &page);
+ if (IS_ERR(kaddr))
+ return PTR_ERR(kaddr);
lock_page(page);
- kaddr = page_address(page);
dir_end = kaddr + ufs_last_byte(dir, n);
de = (struct ufs_dir_entry *)kaddr;
kaddr += PAGE_SIZE - reclen;
@@ -395,7 +391,6 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
/* OFFSET_CACHE */
out_put:
ufs_put_page(page);
-out:
return err;
out_unlock:
unlock_page(page);
@@ -429,6 +424,7 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
unsigned chunk_mask = ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
bool need_revalidate = !inode_eq_iversion(inode, file->f_version);
unsigned flags = UFS_SB(sb)->s_flags;
+ struct page *page;

UFSD("BEGIN\n");

@@ -439,16 +435,14 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
char *kaddr, *limit;
struct ufs_dir_entry *de;

- struct page *page = ufs_get_page(inode, n);
-
- if (IS_ERR(page)) {
+ kaddr = ufs_get_page(inode, n, &page);
+ if (IS_ERR(kaddr)) {
ufs_error(sb, __func__,
"bad page in #%lu",
inode->i_ino);
ctx->pos += PAGE_SIZE - offset;
return -EIO;
}
- kaddr = page_address(page);
if (unlikely(need_revalidate)) {
if (offset) {
offset = ufs_validate_entry(sb, kaddr, offset, chunk_mask);
@@ -595,12 +589,11 @@ int ufs_empty_dir(struct inode * inode)
for (i = 0; i < npages; i++) {
char *kaddr;
struct ufs_dir_entry *de;
- page = ufs_get_page(inode, i);

- if (IS_ERR(page))
+ kaddr = ufs_get_page(inode, i, &page);
+ if (IS_ERR(kaddr))
continue;

- kaddr = page_address(page);
de = (struct ufs_dir_entry *)kaddr;
kaddr += ufs_last_byte(inode, i) - UFS_DIR_REC_LEN(1);

--
2.39.0

2022-12-22 05:51:00

by Ira Weiny

Subject: Re: [PATCH v4 2/3] fs/ufs: Change the signature of ufs_get_page()

On Wed, Dec 21, 2022 at 06:28:01PM +0100, Fabio M. De Francesco wrote:
> Change the signature of ufs_get_page() in order to prepare this function
> for the conversion to kmap_local_page(). Also change the call sites so
> that their invocations conform to the new signature.
>
> Cc: Ira Weiny <[email protected]>
> Suggested-by: Al Viro <[email protected]>
> Signed-off-by: Fabio M. De Francesco <[email protected]>
> ---
> fs/ufs/dir.c | 49 +++++++++++++++++++++----------------------------
> 1 file changed, 21 insertions(+), 28 deletions(-)
>
> diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
> index 69f78583c9c1..9fa86614d2d1 100644
> --- a/fs/ufs/dir.c
> +++ b/fs/ufs/dir.c
> @@ -185,7 +185,7 @@ static bool ufs_check_page(struct page *page)
> return false;
> }
>
> -static struct page *ufs_get_page(struct inode *dir, unsigned long n)
> +static void *ufs_get_page(struct inode *dir, unsigned long n, struct page **p)
> {
> struct address_space *mapping = dir->i_mapping;
> struct page *page = read_mapping_page(mapping, n, NULL);
> @@ -195,8 +195,10 @@ static struct page *ufs_get_page(struct inode *dir, unsigned long n)
> if (!ufs_check_page(page))
> goto fail;
> }
> + *p = page;
> + return page_address(page);
> }
> - return page;
> + return ERR_CAST(page);
>
> fail:
> ufs_put_page(page);
> @@ -227,15 +229,12 @@ ufs_next_entry(struct super_block *sb, struct ufs_dir_entry *p)
>
> struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> {
> - struct page *page = ufs_get_page(dir, 0);
> - struct ufs_dir_entry *de = NULL;
> + struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);

I don't know why, but ufs_get_page() returning an address reads really odd to
me. But after rolling alternative names around in my head, nothing seems
better than this.

>
> - if (!IS_ERR(page)) {
> - de = ufs_next_entry(dir->i_sb,
> - (struct ufs_dir_entry *)page_address(page));
> - *p = page;
> - }
> - return de;
> + if (!IS_ERR(de))
> + return ufs_next_entry(dir->i_sb, de);
> + else
> + return NULL;
> }
>
> /*
> @@ -273,11 +272,10 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> start = 0;
> n = start;
> do {
> - char *kaddr;
> - page = ufs_get_page(dir, n);
> - if (!IS_ERR(page)) {
> - kaddr = page_address(page);
> - de = (struct ufs_dir_entry *) kaddr;
> + char *kaddr = ufs_get_page(dir, n, &page);
> +
> + if (!IS_ERR(kaddr)) {
> + de = (struct ufs_dir_entry *)kaddr;
> kaddr += ufs_last_byte(dir, n) - reclen;
> while ((char *) de <= kaddr) {
> if (ufs_match(sb, namelen, name, de))
> @@ -328,12 +326,10 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
> for (n = 0; n <= npages; n++) {
> char *dir_end;
>
> - page = ufs_get_page(dir, n);
> - err = PTR_ERR(page);
> - if (IS_ERR(page))
> - goto out;
> + kaddr = ufs_get_page(dir, n, &page);
> + if (IS_ERR(kaddr))
> + return PTR_ERR(kaddr);
> lock_page(page);
> - kaddr = page_address(page);
> dir_end = kaddr + ufs_last_byte(dir, n);
> de = (struct ufs_dir_entry *)kaddr;
> kaddr += PAGE_SIZE - reclen;
> @@ -395,7 +391,6 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
> /* OFFSET_CACHE */
> out_put:
> ufs_put_page(page);
> -out:
> return err;
> out_unlock:
> unlock_page(page);
> @@ -429,6 +424,7 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
> unsigned chunk_mask = ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
> bool need_revalidate = !inode_eq_iversion(inode, file->f_version);
> unsigned flags = UFS_SB(sb)->s_flags;
> + struct page *page;

NIT: Does page now leave the scope of the for loop?

>
> UFSD("BEGIN\n");
>
> @@ -439,16 +435,14 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
> char *kaddr, *limit;
> struct ufs_dir_entry *de;

Couldn't that be declared here?

Regardless I don't think this is broken.

Reviewed-by: Ira Weiny <[email protected]>

>
> - struct page *page = ufs_get_page(inode, n);
> -
> - if (IS_ERR(page)) {
> + kaddr = ufs_get_page(inode, n, &page);
> + if (IS_ERR(kaddr)) {
> ufs_error(sb, __func__,
> "bad page in #%lu",
> inode->i_ino);
> ctx->pos += PAGE_SIZE - offset;
> return -EIO;
> }
> - kaddr = page_address(page);
> if (unlikely(need_revalidate)) {
> if (offset) {
> offset = ufs_validate_entry(sb, kaddr, offset, chunk_mask);
> @@ -595,12 +589,11 @@ int ufs_empty_dir(struct inode * inode)
> for (i = 0; i < npages; i++) {
> char *kaddr;
> struct ufs_dir_entry *de;
> - page = ufs_get_page(inode, i);
>
> - if (IS_ERR(page))
> + kaddr = ufs_get_page(inode, i, &page);
> + if (IS_ERR(kaddr))
> continue;
>
> - kaddr = page_address(page);
> de = (struct ufs_dir_entry *)kaddr;
> kaddr += ufs_last_byte(inode, i) - UFS_DIR_REC_LEN(1);
>
> --
> 2.39.0
>

2022-12-22 06:06:51

by Ira Weiny

Subject: Re: [PATCH v4 3/3] fs/ufs: Replace kmap() with kmap_local_page()

On Wed, Dec 21, 2022 at 06:28:02PM +0100, Fabio M. De Francesco wrote:
> kmap() is being deprecated in favor of kmap_local_page().
>
> There are two main problems with kmap(): (1) It comes with an overhead as
> the mapping space is restricted and protected by a global lock for
> synchronization and (2) it also requires global TLB invalidation when the
> kmap’s pool wraps and it might block when the mapping space is fully
> utilized until a slot becomes available.
>
> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).
> It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> the tasks can be preempted and, when they are scheduled to run again, the
> kernel virtual addresses are restored and still valid.
>
> Since its use in fs/ufs is safe everywhere, it should be preferred.
>
> Therefore, replace kmap() with kmap_local_page() in fs/ufs. kunmap_local()
> requires the mapping address, so return that address from ufs_get_page()
> to be used in ufs_put_page().

I don't see the calls to kunmap() in ufs_rename converted here?

Did I miss them?

I think those calls need to be changed to ufs_put_page() calls in a precursor
patch to this one unless I'm missing something.

>
> Suggested-by: Al Viro <[email protected]>
> Suggested-by: Ira Weiny <[email protected]>
> Signed-off-by: Fabio M. De Francesco <[email protected]>
> ---
> fs/ufs/dir.c | 75 ++++++++++++++++++++++++++++++++--------------------
> 1 file changed, 46 insertions(+), 29 deletions(-)
>
> diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
> index 9fa86614d2d1..ed3568da29a8 100644
> --- a/fs/ufs/dir.c
> +++ b/fs/ufs/dir.c
> @@ -61,9 +61,9 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len)
> return err;
> }
>
> -static inline void ufs_put_page(struct page *page)
> +static inline void ufs_put_page(struct page *page, void *page_addr)
> {
> - kunmap(page);
> + kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));

Any address in the page can be passed to kunmap_local() as this mask is done
internally.
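
A userspace sketch of the arithmetic behind this remark: masking an address
with the page mask yields the page base, and masking again changes nothing,
so the explicit mask is redundant but harmless. The DEMO_* macros and demo_*
helpers are hypothetical names, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096UL			/* assumption: 4 KiB pages */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Base address of the page containing addr. */
static inline uintptr_t demo_page_base(const void *addr)
{
	return (uintptr_t)addr & DEMO_PAGE_MASK;
}

/* Byte offset of addr within its page. */
static inline unsigned long demo_offset_in_page(const void *addr)
{
	return (uintptr_t)addr & ~DEMO_PAGE_MASK;
}
```

For an entry pointer at offset 0x128 inside the page that starts at 0x3000,
demo_page_base() gives 0x3000 and demo_offset_in_page() gives 0x128. The
same reasoning lets a function recover the page base from a pointer to a
directory entry.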

> put_page(page);
> }
>
> @@ -76,7 +76,7 @@ ino_t ufs_inode_by_name(struct inode *dir, const struct qstr *qstr)
> de = ufs_find_entry(dir, qstr, &page);
> if (de) {
> res = fs32_to_cpu(dir->i_sb, de->d_ino);
> - ufs_put_page(page);
> + ufs_put_page(page, de);
> }
> return res;
> }
> @@ -99,18 +99,17 @@ void ufs_set_link(struct inode *dir, struct ufs_dir_entry *de,
> ufs_set_de_type(dir->i_sb, de, inode->i_mode);
>
> err = ufs_commit_chunk(page, pos, len);
> - ufs_put_page(page);
> + ufs_put_page(page, de);
> if (update_times)
> dir->i_mtime = dir->i_ctime = current_time(dir);
> mark_inode_dirty(dir);
> }
>
>
> -static bool ufs_check_page(struct page *page)
> +static bool ufs_check_page(struct page *page, char *kaddr)
> {
> struct inode *dir = page->mapping->host;
> struct super_block *sb = dir->i_sb;
> - char *kaddr = page_address(page);
> unsigned offs, rec_len;
> unsigned limit = PAGE_SIZE;
> const unsigned chunk_mask = UFS_SB(sb)->s_uspi->s_dirblksize - 1;
> @@ -185,23 +184,32 @@ static bool ufs_check_page(struct page *page)
> return false;
> }
>
> +/*
> + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> + * rules documented in kmap_local_page()/kunmap_local().
> + *
> + * NOTE: ufs_find_entry() and ufs_dotdot() act as calls to ufs_get_page()
> + * and must be treated accordingly for nesting purposes.
> + */
> static void *ufs_get_page(struct inode *dir, unsigned long n, struct page **p)
> {
> + char *kaddr;
> +
> struct address_space *mapping = dir->i_mapping;
> struct page *page = read_mapping_page(mapping, n, NULL);
> if (!IS_ERR(page)) {
> - kmap(page);
> + kaddr = kmap_local_page(page);
> if (unlikely(!PageChecked(page))) {
> - if (!ufs_check_page(page))
> + if (!ufs_check_page(page, kaddr))
> goto fail;
> }
> *p = page;
> - return page_address(page);
> + return kaddr;
> }
> return ERR_CAST(page);
>
> fail:
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> return ERR_PTR(-EIO);
> }
>
> @@ -227,6 +235,13 @@ ufs_next_entry(struct super_block *sb, struct ufs_dir_entry *p)
> fs16_to_cpu(sb, p->d_reclen));
> }
>
> +/*
> + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> + * rules documented in kmap_local_page()/kunmap_local().
> + *
> + * ufs_dotdot() acts as a call to ufs_get_page() and must be treated
> + * accordingly for nesting purposes.
> + */
> struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> {
> struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);
> @@ -238,12 +253,15 @@ struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> }
>
> /*
> - * ufs_find_entry()
> + * Finds an entry in the specified directory with the wanted name. It returns a
> + * pointer to the directory's entry. The page in which the entry was found is
> + * in the res_page out parameter. The page is returned mapped and unlocked.
> + * The entry is guaranteed to be valid.
> *
> - * finds an entry in the specified directory with the wanted name. It
> - * returns the page in which the entry was found, and the entry itself
> - * (as a parameter - res_dir). Page is returned mapped and unlocked.
> - * Entry is guaranteed to be valid.

I don't follow why this comment needed changing for this patch. It probably
warrants its own patch.

> + * On Success ufs_put_page() should be called on *res_page.
> + *
> + * ufs_find_entry() acts as a call to ufs_get_page() and must be treated
> + * accordingly for nesting purposes.
> */
> struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> struct page **res_page)
> @@ -282,7 +300,7 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> goto found;
> de = ufs_next_entry(sb, de);
> }
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> }
> if (++n >= npages)
> n = 0;
> @@ -360,7 +378,7 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
> de = (struct ufs_dir_entry *) ((char *) de + rec_len);
> }
> unlock_page(page);
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> }
> BUG();
> return -EINVAL;
> @@ -390,7 +408,7 @@ int ufs_add_link(struct dentry *dentry, struct inode *inode)
> mark_inode_dirty(dir);
> /* OFFSET_CACHE */
> out_put:
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> return err;
> out_unlock:
> unlock_page(page);
> @@ -468,13 +486,13 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
> ufs_get_de_namlen(sb, de),
> fs32_to_cpu(sb, de->d_ino),
> d_type)) {
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> return 0;
> }
> }
> ctx->pos += fs16_to_cpu(sb, de->d_reclen);
> }
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> }
> return 0;
> }
> @@ -485,10 +503,10 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
> * previous entry.
> */
> int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
> - struct page * page)
> + struct page *page)
> {
> struct super_block *sb = inode->i_sb;
> - char *kaddr = page_address(page);
> + char *kaddr = (char *)((unsigned long)dir & PAGE_MASK);

I feel like this deserves a comment to clarify that dir points somewhere in the
page we need the base address of.

> unsigned int from = offset_in_page(dir) & ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
> unsigned int to = offset_in_page(dir) + fs16_to_cpu(sb, dir->d_reclen);
> loff_t pos;
> @@ -527,7 +545,7 @@ int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
> inode->i_ctime = inode->i_mtime = current_time(inode);
> mark_inode_dirty(inode);
> out:
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> UFSD("EXIT\n");
> return err;
> }
> @@ -551,8 +569,7 @@ int ufs_make_empty(struct inode * inode, struct inode *dir)
> goto fail;
> }
>
> - kmap(page);
> - base = (char*)page_address(page);
> + base = kmap_local_page(page);

NIT: I'd make this conversion a separate patch.

Ira

> memset(base, 0, PAGE_SIZE);
>
> de = (struct ufs_dir_entry *) base;
> @@ -569,7 +586,7 @@ int ufs_make_empty(struct inode * inode, struct inode *dir)
> de->d_reclen = cpu_to_fs16(sb, chunk_size - UFS_DIR_REC_LEN(1));
> ufs_set_de_namlen(sb, de, 2);
> strcpy (de->d_name, "..");
> - kunmap(page);
> + kunmap_local(base);
>
> err = ufs_commit_chunk(page, 0, chunk_size);
> fail:
> @@ -585,9 +602,9 @@ int ufs_empty_dir(struct inode * inode)
> struct super_block *sb = inode->i_sb;
> struct page *page = NULL;
> unsigned long i, npages = dir_pages(inode);
> + char *kaddr;
>
> for (i = 0; i < npages; i++) {
> - char *kaddr;
> struct ufs_dir_entry *de;
>
> kaddr = ufs_get_page(inode, i, &page);
> @@ -620,12 +637,12 @@ int ufs_empty_dir(struct inode * inode)
> }
> de = ufs_next_entry(sb, de);
> }
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> }
> return 1;
>
> not_empty:
> - ufs_put_page(page);
> + ufs_put_page(page, kaddr);
> return 0;
> }
>
> --
> 2.39.0
>

2022-12-22 14:57:35

by Fabio M. De Francesco

Subject: Re: [PATCH v4 3/3] fs/ufs: Replace kmap() with kmap_local_page()

On Thursday, 22 December 2022 06:41:01 CET Ira Weiny wrote:
> On Wed, Dec 21, 2022 at 06:28:02PM +0100, Fabio M. De Francesco wrote:
> > kmap() is being deprecated in favor of kmap_local_page().
> >
> > There are two main problems with kmap(): (1) It comes with an overhead as
> > the mapping space is restricted and protected by a global lock for
> > synchronization and (2) it also requires global TLB invalidation when the
> > kmap’s pool wraps and it might block when the mapping space is fully
> > utilized until a slot becomes available.
> >
> > With kmap_local_page() the mappings are per thread, CPU local, can take
> > page faults, and can be called from any context (including interrupts).
> > It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> > the tasks can be preempted and, when they are scheduled to run again, the
> > kernel virtual addresses are restored and still valid.
> >
> > Since its use in fs/ufs is safe everywhere, it should be preferred.
> >
> > Therefore, replace kmap() with kmap_local_page() in fs/ufs. kunmap_local()
> > requires the mapping address, so return that address from ufs_get_page()
> > to be used in ufs_put_page().
>
> I don't see the calls to kunmap() in ufs_rename converted here?
>
> Did I miss them?
>

No, it's my fault.
I should have run "grep" on all files in fs/ufs, but I forgot to :-(

While at it... I'm wondering whether we could benefit from a WARNING about
the use of kunmap(). I'm talking about adding this to checkpatch.pl too,
exactly as we already have one for catching the deprecated use of kmap().

>
> I think those calls need to be changed to ufs_put_page() calls in a
> precursor patch to this one unless I'm missing something.
>

Again I think that you are not missing anything and that your suggestion
sounds good.

I'll replace the three kunmap() + put_page() pairs with three calls to
ufs_put_page() in ufs_rename(). I'll make these changes in a new patch 3/4,
and the current 3/3 patch will move and become 4/4.

>
> > Suggested-by: Al Viro <[email protected]>
> > Suggested-by: Ira Weiny <[email protected]>
> > Signed-off-by: Fabio M. De Francesco <[email protected]>
> > ---
> >
> > fs/ufs/dir.c | 75 ++++++++++++++++++++++++++++++++--------------------
> > 1 file changed, 46 insertions(+), 29 deletions(-)
> >
> > diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
> > index 9fa86614d2d1..ed3568da29a8 100644
> > --- a/fs/ufs/dir.c
> > +++ b/fs/ufs/dir.c
> > @@ -61,9 +61,9 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len)
> > return err;
> >
> > }
> >
> > -static inline void ufs_put_page(struct page *page)
> > +static inline void ufs_put_page(struct page *page, void *page_addr)
> >
> > {
> >
> > - kunmap(page);
> > + kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));
>
> Any address in the page can be passed to kunmap_local() as this mask is done
> internally.
>

I know that any address can be passed and that the bitwise AND is performed
internally in kunmap_local_indexed(). This is why I've never done something
like this in any of my previous conversions.
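
To illustrate the point about masking, here is a minimal userspace sketch
(not kernel code). It assumes 4 KiB pages, and the SKETCH_* macros and
sketch_* helpers are hypothetical names invented for this example. It only
shows that rounding an address down to its page base is idempotent, which is
why masking in the caller is redundant when the callee masks again:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 4 KiB pages (the kernel's PAGE_SIZE is arch-dependent). */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* The mask trick only works when the page size is a power of two. */
static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
	      "page size must be a power of two");

/* Round any address inside a page down to that page's base address. */
static inline uintptr_t sketch_page_base(uintptr_t addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Byte offset of an address within its page (analogous to offset_in_page()). */
static inline uintptr_t sketch_offset_in_page(uintptr_t addr)
{
	return addr & ~SKETCH_PAGE_MASK;
}

/* Two addresses refer to the same page if their page bases are equal. */
static inline int sketch_same_page(uintptr_t a, uintptr_t b)
{
	return sketch_page_base(a) == sketch_page_base(b);
}
```

With these helpers, sketch_page_base(sketch_page_base(x)) equals
sketch_page_base(x) for every x, so a pointer to a directory entry and the
base address of its page identify the same page.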

However, I thought that Al must have had reasons for suggesting that
kunmap_local() be called this way. Copy-pasted from one of his messages
(https://lore.kernel.org/lkml/Y4E++JERgUMoqfjG@ZenIV/), where he commented on
the old single-patch conversion:

--- begin ---

-static inline void ufs_put_page(struct page *page)
> +inline void ufs_put_page(struct page *page, void *page_addr)
> {
> - kunmap(page);
> + kunmap_local(page_addr);

Make that
kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));
and things become much easier.

> put_page(page);
> }

--- end ---

Did I misinterpret his words?
However, it's my fault again because I should have asked why :-(

> > put_page(page);
> >
> > }
> >
> > @@ -76,7 +76,7 @@ ino_t ufs_inode_by_name(struct inode *dir, const struct
> > qstr *qstr)>
> > de = ufs_find_entry(dir, qstr, &page);
> > if (de) {
> >
> > res = fs32_to_cpu(dir->i_sb, de->d_ino);
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, de);
> >
> > }
> > return res;
> >
> > }
> >
> > @@ -99,18 +99,17 @@ void ufs_set_link(struct inode *dir, struct
> > ufs_dir_entry *de,>
> > ufs_set_de_type(dir->i_sb, de, inode->i_mode);
> >
> > err = ufs_commit_chunk(page, pos, len);
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, de);
> >
> > if (update_times)
> >
> > dir->i_mtime = dir->i_ctime = current_time(dir);
> >
> > mark_inode_dirty(dir);
> >
> > }
> >
> > -static bool ufs_check_page(struct page *page)
> > +static bool ufs_check_page(struct page *page, char *kaddr)
> >
> > {
> >
> > struct inode *dir = page->mapping->host;
> > struct super_block *sb = dir->i_sb;
> >
> > - char *kaddr = page_address(page);
> >
> > unsigned offs, rec_len;
> > unsigned limit = PAGE_SIZE;
> > const unsigned chunk_mask = UFS_SB(sb)->s_uspi->s_dirblksize - 1;
> >
> > @@ -185,23 +184,32 @@ static bool ufs_check_page(struct page *page)
> >
> > return false;
> >
> > }
> >
> > +/*
> > + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> > + * rules documented in kmap_local_page()/kunmap_local().
> > + *
> > + * NOTE: ufs_find_entry() and ufs_dotdot() act as calls to ufs_get_page()
> > + * and must be treated accordingly for nesting purposes.
> > + */
> >
> > static void *ufs_get_page(struct inode *dir, unsigned long n, struct page
> > **p) {
> >
> > + char *kaddr;
> > +
> >
> > struct address_space *mapping = dir->i_mapping;
> > struct page *page = read_mapping_page(mapping, n, NULL);
> > if (!IS_ERR(page)) {
> >
> > - kmap(page);
> > + kaddr = kmap_local_page(page);
> >
> > if (unlikely(!PageChecked(page))) {
> >
> > - if (!ufs_check_page(page))
> > + if (!ufs_check_page(page, kaddr))
> >
> > goto fail;
> >
> > }
> > *p = page;
> >
> > - return page_address(page);
> > + return kaddr;
> >
> > }
> > return ERR_CAST(page);
> >
> > fail:
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > return ERR_PTR(-EIO);
> >
> > }
> >
> > @@ -227,6 +235,13 @@ ufs_next_entry(struct super_block *sb, struct
> > ufs_dir_entry *p)>
> > fs16_to_cpu(sb, p->d_reclen));
> >
> > }
> >
> > +/*
> > + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> > + * rules documented in kmap_local_page()/kunmap_local().
> > + *
> > + * ufs_dotdot() acts as a call to ufs_get_page() and must be treated
> > + * accordingly for nesting purposes.
> > + */
> >
> > struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> > {
> >
> > struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);
> >
> > @@ -238,12 +253,15 @@ struct ufs_dir_entry *ufs_dotdot(struct inode *dir,
> > struct page **p)>
> > }
> >
> > /*
> >
> > - * ufs_find_entry()
> > + * Finds an entry in the specified directory with the wanted name. It returns a
> > + * pointer to the directory's entry. The page in which the entry was found is
> > + * in the res_page out parameter. The page is returned mapped and unlocked.
> > + * The entry is guaranteed to be valid.
> >
> > *
> >
> > - * finds an entry in the specified directory with the wanted name. It
> > - * returns the page in which the entry was found, and the entry itself
> > - * (as a parameter - res_dir). Page is returned mapped and unlocked.
> > - * Entry is guaranteed to be valid.
>
> I don't follow why this comment needed changing for this patch. It probably
> warrants its own patch.
>

Sure, the removal of the function's name is a different logical change, so
I'll probably leave it as it was.

> > + * On Success ufs_put_page() should be called on *res_page.
> > + *
> > + * ufs_find_entry() acts as a call to ufs_get_page() and must be treated
> > + * accordingly for nesting purposes.
> >
> > */

But this last part should still be added. Am I wrong?

> > struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> > struct page **res_page)
> >
> > @@ -282,7 +300,7 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> > goto found;
> >
> > de = ufs_next_entry(sb, de);
> >
> > }
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > }
> > if (++n >= npages)
> >
> > n = 0;
> >
> > @@ -360,7 +378,7 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > *inode)>
> > de = (struct ufs_dir_entry *) ((char *) de + rec_len);
> >
> > }
> > unlock_page(page);
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > }
> > BUG();
> > return -EINVAL;
> >
> > @@ -390,7 +408,7 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > *inode)>
> > mark_inode_dirty(dir);
> > /* OFFSET_CACHE */
> >
> > out_put:
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > return err;
> >
> > out_unlock:
> > unlock_page(page);
> >
> > @@ -468,13 +486,13 @@ ufs_readdir(struct file *file, struct dir_context
> > *ctx)
> >
> > ufs_get_de_namlen(sb, de),
> > fs32_to_cpu(sb, de->d_ino),
> > d_type)) {
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > return 0;
> >
> > }
> >
> > }
> > ctx->pos += fs16_to_cpu(sb, de->d_reclen);
> >
> > }
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > }
> > return 0;
> >
> > }
> >
> > @@ -485,10 +503,10 @@ ufs_readdir(struct file *file, struct dir_context
> > *ctx)
> >
> > * previous entry.
> > */
> >
> > int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
> >
> > - struct page * page)
> > + struct page *page)
> >
> > {
> >
> > struct super_block *sb = inode->i_sb;
> >
> > - char *kaddr = page_address(page);
> > + char *kaddr = (char *)((unsigned long)dir & PAGE_MASK);
>
> I feel like this deserves a comment to clarify that dir points somewhere in
> the page we need the base address of.

OK, it sounds reasonable.

> > unsigned int from = offset_in_page(dir) & ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
> > unsigned int to = offset_in_page(dir) + fs16_to_cpu(sb, dir->d_reclen);
> > loff_t pos;
> >
> > @@ -527,7 +545,7 @@ int ufs_delete_entry(struct inode *inode, struct
> > ufs_dir_entry *dir,>
> > inode->i_ctime = inode->i_mtime = current_time(inode);
> > mark_inode_dirty(inode);
> >
> > out:
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > UFSD("EXIT\n");
> > return err;
> >
> > }
> >
> > @@ -551,8 +569,7 @@ int ufs_make_empty(struct inode * inode, struct inode
> > *dir)>
> > goto fail;
> >
> > }
> >
> > - kmap(page);
> > - base = (char*)page_address(page);
> > + base = kmap_local_page(page);
>
> NIT: I'd make this conversion a separate patch.
>
> Ira
>

We've always done multiple conversions at the same time if they were in the
same file, even if they were unrelated.

I don't understand why we want to change the usual procedure. Can you please
elaborate a bit more on this topic?

Thanks so much for finding the missing conversions and for your other comments
and advice on this patch.

Fabio

> > memset(base, 0, PAGE_SIZE);
> >
> > de = (struct ufs_dir_entry *) base;
> >
> > @@ -569,7 +586,7 @@ int ufs_make_empty(struct inode * inode, struct inode
> > *dir)>
> > de->d_reclen = cpu_to_fs16(sb, chunk_size - UFS_DIR_REC_LEN(1));
> > ufs_set_de_namlen(sb, de, 2);
> > strcpy (de->d_name, "..");
> >
> > - kunmap(page);
> > + kunmap_local(base);
> >
> > err = ufs_commit_chunk(page, 0, chunk_size);
> >
> > fail:
> > @@ -585,9 +602,9 @@ int ufs_empty_dir(struct inode * inode)
> >
> > struct super_block *sb = inode->i_sb;
> > struct page *page = NULL;
> > unsigned long i, npages = dir_pages(inode);
> >
> > + char *kaddr;
> >
> > for (i = 0; i < npages; i++) {
> >
> > - char *kaddr;
> >
> > struct ufs_dir_entry *de;
> >
> > kaddr = ufs_get_page(inode, i, &page);
> >
> > @@ -620,12 +637,12 @@ int ufs_empty_dir(struct inode * inode)
> >
> > }
> > de = ufs_next_entry(sb, de);
> >
> > }
> >
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > }
> > return 1;
> >
> > not_empty:
> > - ufs_put_page(page);
> > + ufs_put_page(page, kaddr);
> >
> > return 0;
> >
> > }
> >
> > --
> > 2.39.0




2022-12-22 15:08:10

by Fabio M. De Francesco

Subject: Re: [PATCH v4 2/3] fs/ufs: Change the signature of ufs_get_page()

On Thursday, 22 December 2022 06:13:43 CET Ira Weiny wrote:
> On Wed, Dec 21, 2022 at 06:28:01PM +0100, Fabio M. De Francesco wrote:
> > Change the signature of ufs_get_page() in order to prepare this function
> > to the conversion to the use of kmap_local_page(). Change also those call
> > sites which are required to conform its invocations to the new
> > signature.
> >
> > Cc: Ira Weiny <[email protected]>
> > Suggested-by: Al Viro <[email protected]>
> > Signed-off-by: Fabio M. De Francesco <[email protected]>
> > ---
> >
> > fs/ufs/dir.c | 49 +++++++++++++++++++++----------------------------
> > 1 file changed, 21 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
> > index 69f78583c9c1..9fa86614d2d1 100644
> > --- a/fs/ufs/dir.c
> > +++ b/fs/ufs/dir.c
> > @@ -185,7 +185,7 @@ static bool ufs_check_page(struct page *page)
> >
> > return false;
> >
> > }
> >
> > -static struct page *ufs_get_page(struct inode *dir, unsigned long n)
> > +static void *ufs_get_page(struct inode *dir, unsigned long n, struct page
> > **p)>
> > {
> >
> > struct address_space *mapping = dir->i_mapping;
> > struct page *page = read_mapping_page(mapping, n, NULL);
> >
> > @@ -195,8 +195,10 @@ static struct page *ufs_get_page(struct inode *dir,
> > unsigned long n)>
> > if (!ufs_check_page(page))
> >
> > goto fail;
> >
> > }
> >
> > + *p = page;
> > + return page_address(page);
> >
> > }
> >
> > - return page;
> > + return ERR_CAST(page);
> >
> > fail:
> > ufs_put_page(page);
> >
> > @@ -227,15 +229,12 @@ ufs_next_entry(struct super_block *sb, struct
> > ufs_dir_entry *p)>
> > struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> > {
> >
> > - struct page *page = ufs_get_page(dir, 0);
> > - struct ufs_dir_entry *de = NULL;
> > + struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);
>
> I don't know why, but ufs_get_page() returning an address reads really odd to
> me. But after rolling alternative names around in my head, nothing seems
> better than this.

ufs_get_kaddr()?
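
Whatever the name ends up being, the convention in this patch is that the
function returns the mapped address (or an encoded error) and hands the page
back through the out parameter. A minimal userspace sketch of that convention
follows. It assumes the usual idea that small negative errno values are
encoded in the top 4095 addresses. All sketch_* names are hypothetical
stand-ins, not the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() implementation:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: error codes occupy the last 4095 values of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for encoded errors. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical stand-in for a page: just a small data buffer. */
struct sketch_page {
	char data[64];
};

/*
 * Return the data address of page n and store the page itself in *p.
 * On failure, return an encoded -EIO and leave *p untouched.
 */
static void *sketch_get_page(struct sketch_page *pages, size_t npages,
			     size_t n, struct sketch_page **p)
{
	assert(p != NULL);
	if (n >= npages)
		return sketch_err_ptr(-EIO);
	*p = &pages[n];
	return pages[n].data;
}
```

A caller checks the returned address with sketch_is_err() before using either
the address or *p, which mirrors the `if (IS_ERR(kaddr))` checks in the diff.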

> > - if (!IS_ERR(page)) {
> > - de = ufs_next_entry(dir->i_sb,
> > - (struct ufs_dir_entry *)page_address(page));
> > - *p = page;
> > - }
> > - return de;
> > + if (!IS_ERR(de))
> > + return ufs_next_entry(dir->i_sb, de);
> > + else
> > + return NULL;
> >
> > }
> >
> > /*
> >
> > @@ -273,11 +272,10 @@ struct ufs_dir_entry *ufs_find_entry(struct inode
> > *dir, const struct qstr *qstr,>
> > start = 0;
> >
> > n = start;
> > do {
> >
> > - char *kaddr;
> > - page = ufs_get_page(dir, n);
> > - if (!IS_ERR(page)) {
> > - kaddr = page_address(page);
> > - de = (struct ufs_dir_entry *) kaddr;
> > + char *kaddr = ufs_get_page(dir, n, &page);
> > +
> > + if (!IS_ERR(kaddr)) {
> > + de = (struct ufs_dir_entry *)kaddr;
> >
> > kaddr += ufs_last_byte(dir, n) - reclen;
> > while ((char *) de <= kaddr) {
> >
> > if (ufs_match(sb, namelen, name, de))
> >
> > @@ -328,12 +326,10 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > *inode)>
> > for (n = 0; n <= npages; n++) {
> >
> > char *dir_end;
> >
> > - page = ufs_get_page(dir, n);
> > - err = PTR_ERR(page);
> > - if (IS_ERR(page))
> > - goto out;
> > + kaddr = ufs_get_page(dir, n, &page);
> > + if (IS_ERR(kaddr))
> > + return PTR_ERR(kaddr);
> >
> > lock_page(page);
> >
> > - kaddr = page_address(page);
> >
> > dir_end = kaddr + ufs_last_byte(dir, n);
> > de = (struct ufs_dir_entry *)kaddr;
> > kaddr += PAGE_SIZE - reclen;
> >
> > @@ -395,7 +391,6 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > *inode)>
> > /* OFFSET_CACHE */
> >
> > out_put:
> > ufs_put_page(page);
> >
> > -out:
> > return err;
> >
> > out_unlock:
> > unlock_page(page);
> >
> > @@ -429,6 +424,7 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
> >
> > unsigned chunk_mask = ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
> > bool need_revalidate = !inode_eq_iversion(inode, file->f_version);
> > unsigned flags = UFS_SB(sb)->s_flags;
> >
> > + struct page *page;
>
> NIT: Does page now leave the scope of the for loop?
>

Strange...
I can't say why I did so.

> > UFSD("BEGIN\n");
> >
> > @@ -439,16 +435,14 @@ ufs_readdir(struct file *file, struct dir_context
> > *ctx)
> >
> > char *kaddr, *limit;
> > struct ufs_dir_entry *de;
>
> Couldn't that be declared here?

Yes, it could :-)

> Regardless I don't think this is broken.

Since I have to submit a new version of this series, there's no problem moving
the declaration of "page" back into the loop.

> Reviewed-by: Ira Weiny <[email protected]>

Thanks,

Fabio
>
> > - struct page *page = ufs_get_page(inode, n);
> > -
> > - if (IS_ERR(page)) {
> > + kaddr = ufs_get_page(inode, n, &page);
> > + if (IS_ERR(kaddr)) {
> >
> > ufs_error(sb, __func__,
> >
> > "bad page in #%lu",
> > inode->i_ino);
> >
> > ctx->pos += PAGE_SIZE - offset;
> > return -EIO;
> >
> > }
> >
> > - kaddr = page_address(page);
> >
> > if (unlikely(need_revalidate)) {
> >
> > if (offset) {
> >
> > offset = ufs_validate_entry(sb, kaddr, offset, chunk_mask);
> >
> > @@ -595,12 +589,11 @@ int ufs_empty_dir(struct inode * inode)
> >
> > for (i = 0; i < npages; i++) {
> >
> > char *kaddr;
> > struct ufs_dir_entry *de;
> >
> > - page = ufs_get_page(inode, i);
> >
> > - if (IS_ERR(page))
> > + kaddr = ufs_get_page(inode, i, &page);
> > + if (IS_ERR(kaddr))
> >
> > continue;
> >
> > - kaddr = page_address(page);
> >
> > de = (struct ufs_dir_entry *)kaddr;
> > kaddr += ufs_last_byte(inode, i) - UFS_DIR_REC_LEN(1);
> >
> > --
> > 2.39.0




2022-12-22 23:06:43

by Ira Weiny

Subject: Re: [PATCH v4 3/3] fs/ufs: Replace kmap() with kmap_local_page()

On Thu, Dec 22, 2022 at 03:27:08PM +0100, Fabio M. De Francesco wrote:
> On Thursday, 22 December 2022 06:41:01 CET Ira Weiny wrote:
> > On Wed, Dec 21, 2022 at 06:28:02PM +0100, Fabio M. De Francesco wrote:
> > > kmap() is being deprecated in favor of kmap_local_page().
> > >
> > > There are two main problems with kmap(): (1) It comes with an overhead as
> > > the mapping space is restricted and protected by a global lock for
> > > synchronization and (2) it also requires global TLB invalidation when the
> > > kmap’s pool wraps and it might block when the mapping space is fully
> > > utilized until a slot becomes available.
> > >
> > > With kmap_local_page() the mappings are per thread, CPU local, can take
> > > page faults, and can be called from any context (including interrupts).
> > > It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> > > the tasks can be preempted and, when they are scheduled to run again, the
> > > kernel virtual addresses are restored and still valid.
> > >
> > > Since its use in fs/ufs is safe everywhere, it should be preferred.
> > >
> > > Therefore, replace kmap() with kmap_local_page() in fs/ufs. kunmap_local()
> > > requires the mapping address, so return that address from ufs_get_page()
> > > to be used in ufs_put_page().
> >
> > I don't see the calls to kunmap() in ufs_rename converted here?
> >
> > Did I miss them?
> >
>
> No, you didn't miss them. It's my fault.
> I should have run "grep" on all files in fs/ufs, but I forgot to do it :-(
>
> While at it... I'm wondering whether we could benefit from a WARNING about
> the use of kunmap(). I'm talking about adding it to checkpatch.pl too,
> exactly as we already do for catching the deprecated use of kmap().

That would not have caught this issue. Any addition of kunmap() in a patch
would have to come with a call to kmap(). (Unless they are fixing some kmap
bug I suppose.) I'm not sure how the checkpatch.pl maintainers would feel
about this. You can always submit a patch and find out but I would not worry
about it.

>
> >
> > I think those calls need to be changed to ufs_put_page() calls in a precursor
> > patch to this one unless I'm missing something.
> >
>
> Again, I think that you are not missing anything, and your suggestion
> sounds good.
>
> I'll replace the three kunmap() + put_page() pairs in ufs_rename() with three
> calls to ufs_put_page(). I'll make these changes in a new patch 3/4. The
> current 3/3 patch will then move down and become 4/4.

Sounds good.

>
> >
> > > Suggested-by: Al Viro <[email protected]>
> > > Suggested-by: Ira Weiny <[email protected]>
> > > Signed-off-by: Fabio M. De Francesco <[email protected]>
> > > ---
> > >
> > > fs/ufs/dir.c | 75 ++++++++++++++++++++++++++++++++--------------------
> > > 1 file changed, 46 insertions(+), 29 deletions(-)
> > >
> > > diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
> > > index 9fa86614d2d1..ed3568da29a8 100644
> > > --- a/fs/ufs/dir.c
> > > +++ b/fs/ufs/dir.c
> > > @@ -61,9 +61,9 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len)
> > > return err;
> > >
> > > }
> > >
> > > -static inline void ufs_put_page(struct page *page)
> > > +static inline void ufs_put_page(struct page *page, void *page_addr)
> > >
> > > {
> > >
> > > - kunmap(page);
> > > + kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));
> >
> > Any address in the page can be passed to kunmap_local() as this mask is done
> > internally.
> >
>
> I know that any address can be passed and that the bitwise AND is performed
> internally in kunmap_local_indexed(). This is why I've never done something
> like this in any of my previous conversions.
>
> However, I thought that Al must have had reasons for suggesting that
> kunmap_local() be called this way. Copy-pasted from one of his messages
> (https://lore.kernel.org/lkml/Y4E++JERgUMoqfjG@ZenIV/), where he commented on
> the old single-patch conversion:
>
> --- begin ---
>
> -static inline void ufs_put_page(struct page *page)
> > +inline void ufs_put_page(struct page *page, void *page_addr)
> > {
> > - kunmap(page);
> > + kunmap_local(page_addr);
>
> Make that
> kunmap_local((void *)((unsigned long)page_addr & PAGE_MASK));
> and things become much easier.
>
> > put_page(page);
> > }
>
> --- end ---
>
> Did I misinterpret his words?
> However, it's my fault again because I should have asked why :-(

Perhaps Al did not know that kunmap_local() would take care of this for you?

>
> > > put_page(page);
> > >
> > > }
> > >
> > > @@ -76,7 +76,7 @@ ino_t ufs_inode_by_name(struct inode *dir, const struct
> > > qstr *qstr)>
> > > de = ufs_find_entry(dir, qstr, &page);
> > > if (de) {
> > >
> > > res = fs32_to_cpu(dir->i_sb, de->d_ino);
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, de);
> > >
> > > }
> > > return res;
> > >
> > > }
> > >
> > > @@ -99,18 +99,17 @@ void ufs_set_link(struct inode *dir, struct
> > > ufs_dir_entry *de,>
> > > ufs_set_de_type(dir->i_sb, de, inode->i_mode);
> > >
> > > err = ufs_commit_chunk(page, pos, len);
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, de);
> > >
> > > if (update_times)
> > >
> > > dir->i_mtime = dir->i_ctime = current_time(dir);
> > >
> > > mark_inode_dirty(dir);
> > >
> > > }
> > >
> > > -static bool ufs_check_page(struct page *page)
> > > +static bool ufs_check_page(struct page *page, char *kaddr)
> > >
> > > {
> > >
> > > struct inode *dir = page->mapping->host;
> > > struct super_block *sb = dir->i_sb;
> > >
> > > - char *kaddr = page_address(page);
> > >
> > > unsigned offs, rec_len;
> > > unsigned limit = PAGE_SIZE;
> > > const unsigned chunk_mask = UFS_SB(sb)->s_uspi->s_dirblksize - 1;
> > >
> > > @@ -185,23 +184,32 @@ static bool ufs_check_page(struct page *page)
> > >
> > > return false;
> > >
> > > }
> > >
> > > +/*
> > > + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> > > + * rules documented in kmap_local_page()/kunmap_local().
> > > + *
> > > + * NOTE: ufs_find_entry() and ufs_dotdot() act as calls to ufs_get_page()
> > > + * and must be treated accordingly for nesting purposes.
> > > + */
> > >
> > > static void *ufs_get_page(struct inode *dir, unsigned long n, struct page
> > > **p) {
> > >
> > > + char *kaddr;
> > > +
> > >
> > > struct address_space *mapping = dir->i_mapping;
> > > struct page *page = read_mapping_page(mapping, n, NULL);
> > > if (!IS_ERR(page)) {
> > >
> > > - kmap(page);
> > > + kaddr = kmap_local_page(page);
> > >
> > > if (unlikely(!PageChecked(page))) {
> > >
> > > - if (!ufs_check_page(page))
> > > + if (!ufs_check_page(page, kaddr))
> > >
> > > goto fail;
> > >
> > > }
> > > *p = page;
> > >
> > > - return page_address(page);
> > > + return kaddr;
> > >
> > > }
> > > return ERR_CAST(page);
> > >
> > > fail:
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > return ERR_PTR(-EIO);
> > >
> > > }
> > >
> > > @@ -227,6 +235,13 @@ ufs_next_entry(struct super_block *sb, struct
> > > ufs_dir_entry *p)>
> > > fs16_to_cpu(sb, p->d_reclen));
> > >
> > > }
> > >
> > > +/*
> > > + * Calls to ufs_get_page()/ufs_put_page() must be nested according to the
> > > + * rules documented in kmap_local_page()/kunmap_local().
> > > + *
> > > + * ufs_dotdot() acts as a call to ufs_get_page() and must be treated
> > > + * accordingly for nesting purposes.
> > > + */
> > >
> > > struct ufs_dir_entry *ufs_dotdot(struct inode *dir, struct page **p)
> > > {
> > >
> > > struct ufs_dir_entry *de = ufs_get_page(dir, 0, p);
> > >
> > > @@ -238,12 +253,15 @@ struct ufs_dir_entry *ufs_dotdot(struct inode *dir,
> > > struct page **p)>
> > > }
> > >
> > > /*
> > >
> > > - * ufs_find_entry()
> > > + * Finds an entry in the specified directory with the wanted name. It returns a
> > > + * pointer to the directory's entry. The page in which the entry was found is
> > > + * in the res_page out parameter. The page is returned mapped and unlocked.
> > > + * The entry is guaranteed to be valid.
> > >
> > > *
> > >
> > > - * finds an entry in the specified directory with the wanted name. It
> > > - * returns the page in which the entry was found, and the entry itself
> > > - * (as a parameter - res_dir). Page is returned mapped and unlocked.
> > > - * Entry is guaranteed to be valid.
> >
> > I don't follow why this comment needed changing for this patch. It probably
> > warrants its own patch.
> >
>
> Sure, the removal of the function's name is a different logical change, so
> I'll probably leave it as it was.
>
> > > + * On Success ufs_put_page() should be called on *res_page.
> > > + *
> > > + * ufs_find_entry() acts as a call to ufs_get_page() and must be treated
> > > + * accordingly for nesting purposes.
> > >
> > > */
>
> But this last part should still be added. Am I wrong?

You are not wrong. Adding this is appropriate. Just not the rest, which seemed
like very minor changes anyway.
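
The nesting rule that the new comment points to is that local mappings behave
like a stack: the most recently mapped page must be unmapped first. A minimal
userspace sketch of that ordering check follows. The sketch_* names and the
fixed depth of 16 are assumptions made for this example, and it models only
the ordering, not the actual mapping machinery:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration: a small fixed nesting limit. */
#define SKETCH_MAX_DEPTH 16

/* Hypothetical per-thread stack of currently active local mappings. */
struct sketch_kmap_stack {
	const void *slot[SKETCH_MAX_DEPTH];
	size_t depth;
};

/* Record a new mapping on top of the stack and return its address. */
static const void *sketch_map(struct sketch_kmap_stack *s, const void *addr)
{
	assert(s->depth < SKETCH_MAX_DEPTH);
	s->slot[s->depth++] = addr;
	return addr;
}

/*
 * Drop a mapping. Returns 0 on success, or -1 if addr is not the most
 * recent mapping, i.e. the caller violated the last-in, first-out order.
 */
static int sketch_unmap(struct sketch_kmap_stack *s, const void *addr)
{
	if (s->depth == 0 || s->slot[s->depth - 1] != addr)
		return -1;
	s->depth--;
	return 0;
}
```

In these terms, a caller that obtains one page through ufs_find_entry() and
then a second one through ufs_dotdot() has to put the second page before the
first, which is what "must be treated accordingly for nesting purposes" asks
for.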

>
> > > struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> > > struct page **res_page)
> > >
> > > @@ -282,7 +300,7 @@ struct ufs_dir_entry *ufs_find_entry(struct inode *dir, const struct qstr *qstr,
> > > goto found;
> > >
> > > de = ufs_next_entry(sb, de);
> > >
> > > }
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > }
> > > if (++n >= npages)
> > >
> > > n = 0;
> > >
> > > @@ -360,7 +378,7 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > > *inode)>
> > > de = (struct ufs_dir_entry *) ((char *) de + rec_len);
> > >
> > > }
> > > unlock_page(page);
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > }
> > > BUG();
> > > return -EINVAL;
> > >
> > > @@ -390,7 +408,7 @@ int ufs_add_link(struct dentry *dentry, struct inode
> > > *inode)>
> > > mark_inode_dirty(dir);
> > > /* OFFSET_CACHE */
> > >
> > > out_put:
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > return err;
> > >
> > > out_unlock:
> > > unlock_page(page);
> > >
> > > @@ -468,13 +486,13 @@ ufs_readdir(struct file *file, struct dir_context
> > > *ctx)
> > >
> > > ufs_get_de_namlen(sb, de),
> > > fs32_to_cpu(sb, de->d_ino),
> > > d_type)) {
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > return 0;
> > >
> > > }
> > >
> > > }
> > > ctx->pos += fs16_to_cpu(sb, de->d_reclen);
> > >
> > > }
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > }
> > > return 0;
> > >
> > > }
> > >
> > > @@ -485,10 +503,10 @@ ufs_readdir(struct file *file, struct dir_context
> > > *ctx)
> > >
> > > * previous entry.
> > > */
> > >
> > > int ufs_delete_entry(struct inode *inode, struct ufs_dir_entry *dir,
> > >
> > > - struct page * page)
> > > + struct page *page)
> > >
> > > {
> > >
> > > struct super_block *sb = inode->i_sb;
> > >
> > > - char *kaddr = page_address(page);
> > > + char *kaddr = (char *)((unsigned long)dir & PAGE_MASK);
> >
> > I feel like this deserves a comment to clarify that dir points somewhere in
> > the page we need the base address of.
>
> OK, it sounds reasonable.
>
> > > unsigned int from = offset_in_page(dir) & ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
> > > unsigned int to = offset_in_page(dir) + fs16_to_cpu(sb, dir->d_reclen);
> > > loff_t pos;
> > >
> > > @@ -527,7 +545,7 @@ int ufs_delete_entry(struct inode *inode, struct
> > > ufs_dir_entry *dir,>
> > > inode->i_ctime = inode->i_mtime = current_time(inode);
> > > mark_inode_dirty(inode);
> > >
> > > out:
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > UFSD("EXIT\n");
> > > return err;
> > >
> > > }
> > >
> > > @@ -551,8 +569,7 @@ int ufs_make_empty(struct inode * inode, struct inode
> > > *dir)>
> > > goto fail;
> > >
> > > }
> > >
> > > - kmap(page);
> > > - base = (char*)page_address(page);
> > > + base = kmap_local_page(page);
> >
> > NIT: I'd make this conversion a separate patch.
> >
> > Ira
> >
>
> We've always done multiple conversions at the same time if they were in the
> same file, even if they were unrelated.
>
> I don't understand why we want to change the usual procedure. Can you please
> elaborate a bit more on this topic?

The difference here is we are making a lot of changes to the
ufs_{get,put}_page() calls and all their callers and this is not part of those
changes. So reviewing those was hard enough without looking at this more
mundane change. But like I said it is a nit so feel free to leave it as the
change looks fine.

>
> Thanks so much for finding the missing conversions and for your other comments
> and advice on this patch.

NP!

Ira

>
> Fabio
>
> > > memset(base, 0, PAGE_SIZE);
> > >
> > > de = (struct ufs_dir_entry *) base;
> > >
> > > @@ -569,7 +586,7 @@ int ufs_make_empty(struct inode * inode, struct inode
> > > *dir)>
> > > de->d_reclen = cpu_to_fs16(sb, chunk_size - UFS_DIR_REC_LEN(1));
> > > ufs_set_de_namlen(sb, de, 2);
> > > strcpy (de->d_name, "..");
> > >
> > > - kunmap(page);
> > > + kunmap_local(base);
> > >
> > > err = ufs_commit_chunk(page, 0, chunk_size);
> > >
> > > fail:
> > > @@ -585,9 +602,9 @@ int ufs_empty_dir(struct inode * inode)
> > >
> > > struct super_block *sb = inode->i_sb;
> > > struct page *page = NULL;
> > > unsigned long i, npages = dir_pages(inode);
> > >
> > > + char *kaddr;
> > >
> > > for (i = 0; i < npages; i++) {
> > >
> > > - char *kaddr;
> > >
> > > struct ufs_dir_entry *de;
> > >
> > > kaddr = ufs_get_page(inode, i, &page);
> > >
> > > @@ -620,12 +637,12 @@ int ufs_empty_dir(struct inode * inode)
> > >
> > > }
> > > de = ufs_next_entry(sb, de);
> > >
> > > }
> > >
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > }
> > > return 1;
> > >
> > > not_empty:
> > > - ufs_put_page(page);
> > > + ufs_put_page(page, kaddr);
> > >
> > > return 0;
> > >
> > > }
> > >
> > > --
> > > 2.39.0
>
>
>
>