2023-12-11 14:07:49

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 00/18] block: don't access bd_inode directly from other modules

From: Yu Kuai <[email protected]>

Changes in v2:
- remove some bdev apis that is not necessary;
- pass in offset for bdev_read_folio() and __bdev_get_folio();
- remove bdev_gfp_constraint() and add a new helper in fs/buffer.c to
prevent access bd_indoe() directly from mapping_gfp_constraint() in
ext4.(patch 15, 16);
- remove block_device_ejected() from ext4.

Noted that following is not changed yet since v1:
- Chirstoph suggested to remove invalidate_inode_pages2() from
xen_update_blkif_status(), however, this sync_bdev() + invalidate_bdev()
is used from many modules, and I'll leave this for later if we want to
kill all of them.
- Matthew suggested that pass in valid file_ra_state for cramfs,
however, I don't see an easy way to do this for cramfs_lookup() and
cramfs_read_super().

Patch 1 add some bdev apis, then follow up patches will use these apis
to avoid access bd_inode directly, and hopefully the field bd_inode can
be removed eventually(after figure out a way for fs/buffer.c).

Yu Kuai (18):
block: add some bdev apis
xen/blkback: use bdev api in xen_update_blkif_status()
bcache: use bdev api in read_super()
mtd: block2mtd: use bdev apis
s390/dasd: use bdev api in dasd_format()
scsicam: use bdev api in scsi_bios_ptable()
bcachefs: remove dead function bdev_sectors()
bio: export bio_add_folio_nofail()
btrfs: use bdev apis
cramfs: use bdev apis in cramfs_blkdev_read()
erofs: use bdev api
gfs2: use bdev api
nilfs2: use bdev api in nilfs_attach_log_writer()
jbd2: use bdev apis
buffer: add a new helper to read sb block
ext4: use new helper to read sb block
ext4: remove block_device_ejected()
ext4: use bdev apis

block/bdev.c | 70 ++++++++++++++++++++++++++
block/bio.c | 1 +
block/blk.h | 2 -
drivers/block/xen-blkback/xenbus.c | 3 +-
drivers/md/bcache/super.c | 11 ++--
drivers/mtd/devices/block2mtd.c | 81 +++++++++++++-----------------
drivers/s390/block/dasd_ioctl.c | 5 +-
drivers/scsi/scsicam.c | 4 +-
fs/bcachefs/util.h | 5 --
fs/btrfs/disk-io.c | 71 ++++++++++++--------------
fs/btrfs/volumes.c | 17 +++----
fs/btrfs/zoned.c | 15 +++---
fs/buffer.c | 68 +++++++++++++++++--------
fs/cramfs/inode.c | 36 +++++--------
fs/erofs/data.c | 18 ++++---
fs/erofs/internal.h | 2 +
fs/ext4/dir.c | 6 +--
fs/ext4/ext4.h | 13 -----
fs/ext4/ext4_jbd2.c | 6 +--
fs/ext4/inode.c | 8 +--
fs/ext4/super.c | 66 ++++--------------------
fs/ext4/symlink.c | 2 +-
fs/gfs2/glock.c | 2 +-
fs/gfs2/ops_fstype.c | 2 +-
fs/jbd2/journal.c | 3 +-
fs/jbd2/recovery.c | 6 +--
fs/nilfs2/segment.c | 2 +-
include/linux/blkdev.h | 17 +++++++
include/linux/buffer_head.h | 18 ++++++-
29 files changed, 301 insertions(+), 259 deletions(-)

--
2.39.2



2023-12-11 14:08:09

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 02/18] xen/blkback: use bdev api in xen_update_blkif_status()

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/xen-blkback/xenbus.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index e34219ea2b05..e645afa4af57 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -104,8 +104,7 @@ static void xen_update_blkif_status(struct xen_blkif *blkif)
xenbus_dev_error(blkif->be->dev, err, "block flush");
return;
}
- invalidate_inode_pages2(
- blkif->vbd.bdev_handle->bdev->bd_inode->i_mapping);
+ invalidate_bdev(blkif->vbd.bdev_handle->bdev);

for (i = 0; i < blkif->nr_rings; i++) {
ring = &blkif->rings[i];
--
2.39.2


2023-12-11 14:08:31

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 01/18] block: add some bdev apis

From: Yu Kuai <[email protected]>

Those apis will be used for other modules, so that bd_inode won't be
accessed directly from other modules.

Signed-off-by: Yu Kuai <[email protected]>
---
block/bdev.c | 70 ++++++++++++++++++++++++++++++++++++++++++
block/blk.h | 2 --
include/linux/blkdev.h | 17 ++++++++++
3 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 750aec178b6a..9a469753eb4b 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -89,6 +89,13 @@ void invalidate_bdev(struct block_device *bdev)
}
EXPORT_SYMBOL(invalidate_bdev);

+void invalidate_bdev_range(struct block_device *bdev, pgoff_t start,
+ pgoff_t end)
+{
+ invalidate_mapping_pages(bdev->bd_inode->i_mapping, start, end);
+}
+EXPORT_SYMBOL_GPL(invalidate_bdev_range);
+
/*
* Drop all buffers & page cache for given bdev range. This function bails
* with error if bdev has other exclusive owner (such as filesystem).
@@ -121,6 +128,7 @@ int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
lstart >> PAGE_SHIFT,
lend >> PAGE_SHIFT);
}
+EXPORT_SYMBOL_GPL(truncate_bdev_range);

static void set_init_blocksize(struct block_device *bdev)
{
@@ -1102,3 +1110,65 @@ void bdev_statx_dioalign(struct inode *inode, struct kstat *stat)

blkdev_put_no_open(bdev);
}
+
+struct folio *bdev_read_folio(struct block_device *bdev, loff_t pos)
+{
+ return mapping_read_folio_gfp(bdev->bd_inode->i_mapping,
+ pos >> PAGE_SHIFT, GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(bdev_read_folio);
+
+struct folio *__bdev_get_folio(struct block_device *bdev, loff_t pos,
+ fgf_t fgp_flags, gfp_t gfp)
+{
+ return __filemap_get_folio(bdev->bd_inode->i_mapping, pos >> PAGE_SHIFT,
+ fgp_flags, gfp);
+}
+EXPORT_SYMBOL_GPL(__bdev_get_folio);
+
+int bdev_wb_err_check(struct block_device *bdev, errseq_t since)
+{
+ return errseq_check(&bdev->bd_inode->i_mapping->wb_err, since);
+}
+EXPORT_SYMBOL_GPL(bdev_wb_err_check);
+
+int bdev_wb_err_check_and_advance(struct block_device *bdev, errseq_t *since)
+{
+ return errseq_check_and_advance(&bdev->bd_inode->i_mapping->wb_err,
+ since);
+}
+EXPORT_SYMBOL_GPL(bdev_wb_err_check_and_advance);
+
+void bdev_balance_dirty_pages_ratelimited(struct block_device *bdev)
+{
+ return balance_dirty_pages_ratelimited(bdev->bd_inode->i_mapping);
+}
+EXPORT_SYMBOL_GPL(bdev_balance_dirty_pages_ratelimited);
+
+void bdev_sync_readahead(struct block_device *bdev, struct file_ra_state *ra,
+ struct file *file, pgoff_t index,
+ unsigned long req_count)
+{
+ struct file_ra_state tmp_ra = {};
+
+ if (!ra) {
+ ra = &tmp_ra;
+ file_ra_state_init(ra, bdev->bd_inode->i_mapping);
+ }
+ page_cache_sync_readahead(bdev->bd_inode->i_mapping, ra, file, index,
+ req_count);
+}
+EXPORT_SYMBOL_GPL(bdev_sync_readahead);
+
+void bdev_attach_wb(struct block_device *bdev)
+{
+ inode_attach_wb(bdev->bd_inode, NULL);
+}
+EXPORT_SYMBOL_GPL(bdev_attach_wb);
+
+void bdev_associated_mapping(struct block_device *bdev,
+ struct address_space *mapping)
+{
+ mapping->host = bdev->bd_inode;
+}
+EXPORT_SYMBOL_GPL(bdev_associated_mapping);
diff --git a/block/blk.h b/block/blk.h
index 08a358bc0919..da4becd4f7e9 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -467,8 +467,6 @@ extern struct device_attribute dev_attr_events_poll_msecs;
extern struct attribute_group blk_trace_attr_group;

blk_mode_t file_to_blk_mode(struct file *file);
-int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
- loff_t lstart, loff_t lend);
long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 17c0a7d0d319..d2453424a9eb 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -24,6 +24,7 @@
#include <linux/sbitmap.h>
#include <linux/uuid.h>
#include <linux/xarray.h>
+#include <linux/pagemap.h>

struct module;
struct request_queue;
@@ -1502,6 +1503,22 @@ struct block_device *blkdev_get_no_open(dev_t dev);
void blkdev_put_no_open(struct block_device *bdev);

struct block_device *I_BDEV(struct inode *inode);
+void invalidate_bdev_range(struct block_device *bdev, pgoff_t start,
+ pgoff_t end);
+int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
+ loff_t lstart, loff_t lend);
+struct folio *bdev_read_folio(struct block_device *bdev, loff_t pos);
+struct folio *__bdev_get_folio(struct block_device *bdev, loff_t pos,
+ fgf_t fgp_flags, gfp_t gfp);
+int bdev_wb_err_check(struct block_device *bdev, errseq_t since);
+int bdev_wb_err_check_and_advance(struct block_device *bdev, errseq_t *since);
+void bdev_balance_dirty_pages_ratelimited(struct block_device *bdev);
+void bdev_sync_readahead(struct block_device *bdev, struct file_ra_state *ra,
+ struct file *file, pgoff_t index,
+ unsigned long req_count);
+void bdev_attach_wb(struct block_device *bdev);
+void bdev_associated_mapping(struct block_device *bdev,
+ struct address_space *mapping);

#ifdef CONFIG_BLOCK
void invalidate_bdev(struct block_device *bdev);
--
2.39.2


2023-12-11 14:08:52

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 03/18] bcache: use bdev api in read_super()

From: Yu Kuai <[email protected]>

On the one hand covert to use folio while reading bdev inode, on the
other hand prevent to access bd_inode directly.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/md/bcache/super.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index bfe1685dbae5..23892b32c582 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -168,14 +168,13 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
{
const char *err;
struct cache_sb_disk *s;
- struct page *page;
+ struct folio *folio;
unsigned int i;

- page = read_cache_page_gfp(bdev->bd_inode->i_mapping,
- SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
- if (IS_ERR(page))
+ folio = bdev_read_folio(bdev, SB_OFFSET);
+ if (IS_ERR(folio))
return "IO error";
- s = page_address(page) + offset_in_page(SB_OFFSET);
+ s = folio_address(folio) + offset_in_folio(folio, SB_OFFSET);

sb->offset = le64_to_cpu(s->offset);
sb->version = le64_to_cpu(s->version);
@@ -272,7 +271,7 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
*res = s;
return NULL;
err:
- put_page(page);
+ folio_put(folio);
return err;
}

--
2.39.2


2023-12-11 14:09:35

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 05/18] s390/dasd: use bdev api in dasd_format()

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/s390/block/dasd_ioctl.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index 61b9675e2a67..bbfb958237e6 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -221,8 +221,9 @@ dasd_format(struct dasd_block *block, struct format_data_t *fdata)
* enabling the device later.
*/
if (fdata->start_unit == 0) {
- block->gdp->part0->bd_inode->i_blkbits =
- blksize_bits(fdata->blksize);
+ rc = set_blocksize(block->gdp->part0, fdata->blksize);
+ if (rc)
+ return rc;
}

rc = base->discipline->format_device(base, fdata, 1);
--
2.39.2


2023-12-11 14:09:53

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 06/18] scsicam: use bdev api in scsi_bios_ptable()

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/scsi/scsicam.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/scsi/scsicam.c b/drivers/scsi/scsicam.c
index e2c7d8ef205f..9617d70c0ed1 100644
--- a/drivers/scsi/scsicam.c
+++ b/drivers/scsi/scsicam.c
@@ -32,11 +32,9 @@
*/
unsigned char *scsi_bios_ptable(struct block_device *dev)
{
- struct address_space *mapping = bdev_whole(dev)->bd_inode->i_mapping;
unsigned char *res = NULL;
- struct folio *folio;
+ struct folio *folio = bdev_read_folio(bdev_whole(dev), 0);

- folio = read_mapping_folio(mapping, 0, NULL);
if (IS_ERR(folio))
return NULL;

--
2.39.2


2023-12-11 14:10:15

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 04/18] mtd: block2mtd: use bdev apis

From: Yu Kuai <[email protected]>

On the one hand covert to use folio while reading bdev inode, on the
other hand prevent to access bd_inode directly.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/mtd/devices/block2mtd.c | 81 +++++++++++++++------------------
1 file changed, 36 insertions(+), 45 deletions(-)

diff --git a/drivers/mtd/devices/block2mtd.c b/drivers/mtd/devices/block2mtd.c
index aa44a23ec045..cf201bf73184 100644
--- a/drivers/mtd/devices/block2mtd.c
+++ b/drivers/mtd/devices/block2mtd.c
@@ -46,40 +46,34 @@ struct block2mtd_dev {
/* Static info about the MTD, used in cleanup_module */
static LIST_HEAD(blkmtd_device_list);

-
-static struct page *page_read(struct address_space *mapping, pgoff_t index)
-{
- return read_mapping_page(mapping, index, NULL);
-}
-
/* erase a specified part of the device */
static int _block2mtd_erase(struct block2mtd_dev *dev, loff_t to, size_t len)
{
- struct address_space *mapping =
- dev->bdev_handle->bdev->bd_inode->i_mapping;
- struct page *page;
+ struct block_device *bdev = dev->bdev_handle->bdev;
+ struct folio *folio;
pgoff_t index = to >> PAGE_SHIFT; // page index
int pages = len >> PAGE_SHIFT;
u_long *p;
u_long *max;

while (pages) {
- page = page_read(mapping, index);
- if (IS_ERR(page))
- return PTR_ERR(page);
+ folio = bdev_read_folio(bdev, index << PAGE_SHIFT);
+ if (IS_ERR(folio))
+ return PTR_ERR(folio);

- max = page_address(page) + PAGE_SIZE;
- for (p=page_address(page); p<max; p++)
+ max = folio_address(folio) + folio_size(folio);
+ for (p = folio_address(folio); p < max; p++)
if (*p != -1UL) {
- lock_page(page);
- memset(page_address(page), 0xff, PAGE_SIZE);
- set_page_dirty(page);
- unlock_page(page);
- balance_dirty_pages_ratelimited(mapping);
+ folio_lock(folio);
+ memset(folio_address(folio), 0xff,
+ folio_size(folio));
+ folio_mark_dirty(folio);
+ folio_unlock(folio);
+ bdev_balance_dirty_pages_ratelimited(bdev);
break;
}

- put_page(page);
+ folio_put(folio);
pages--;
index++;
}
@@ -106,9 +100,7 @@ static int block2mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
size_t *retlen, u_char *buf)
{
struct block2mtd_dev *dev = mtd->priv;
- struct address_space *mapping =
- dev->bdev_handle->bdev->bd_inode->i_mapping;
- struct page *page;
+ struct folio *folio;
pgoff_t index = from >> PAGE_SHIFT;
int offset = from & (PAGE_SIZE-1);
int cpylen;
@@ -120,12 +112,13 @@ static int block2mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
cpylen = len; // this page
len = len - cpylen;

- page = page_read(mapping, index);
- if (IS_ERR(page))
- return PTR_ERR(page);
+ folio = bdev_read_folio(dev->bdev_handle->bdev,
+ index << PAGE_SHIFT);
+ if (IS_ERR(folio))
+ return PTR_ERR(folio);

- memcpy(buf, page_address(page) + offset, cpylen);
- put_page(page);
+ memcpy(buf, folio_address(folio) + offset, cpylen);
+ folio_put(folio);

if (retlen)
*retlen += cpylen;
@@ -141,9 +134,8 @@ static int block2mtd_read(struct mtd_info *mtd, loff_t from, size_t len,
static int _block2mtd_write(struct block2mtd_dev *dev, const u_char *buf,
loff_t to, size_t len, size_t *retlen)
{
- struct page *page;
- struct address_space *mapping =
- dev->bdev_handle->bdev->bd_inode->i_mapping;
+ struct block_device *bdev = dev->bdev_handle->bdev;
+ struct folio *folio;
pgoff_t index = to >> PAGE_SHIFT; // page index
int offset = to & ~PAGE_MASK; // page offset
int cpylen;
@@ -155,18 +147,18 @@ static int _block2mtd_write(struct block2mtd_dev *dev, const u_char *buf,
cpylen = len; // this page
len = len - cpylen;

- page = page_read(mapping, index);
- if (IS_ERR(page))
- return PTR_ERR(page);
+ folio = bdev_read_folio(bdev, index << PAGE_SHIFT);
+ if (IS_ERR(folio))
+ return PTR_ERR(folio);

- if (memcmp(page_address(page)+offset, buf, cpylen)) {
- lock_page(page);
- memcpy(page_address(page) + offset, buf, cpylen);
- set_page_dirty(page);
- unlock_page(page);
- balance_dirty_pages_ratelimited(mapping);
+ if (memcmp(folio_address(folio) + offset, buf, cpylen)) {
+ folio_lock(folio);
+ memcpy(folio_address(folio) + offset, buf, cpylen);
+ folio_mark_dirty(folio);
+ folio_unlock(folio);
+ bdev_balance_dirty_pages_ratelimited(bdev);
}
- put_page(page);
+ folio_put(folio);

if (retlen)
*retlen += cpylen;
@@ -211,8 +203,7 @@ static void block2mtd_free_device(struct block2mtd_dev *dev)
kfree(dev->mtd.name);

if (dev->bdev_handle) {
- invalidate_mapping_pages(
- dev->bdev_handle->bdev->bd_inode->i_mapping, 0, -1);
+ invalidate_bdev(dev->bdev_handle->bdev);
bdev_release(dev->bdev_handle);
}

@@ -295,7 +286,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,
goto err_free_block2mtd;
}

- if ((long)bdev->bd_inode->i_size % erase_size) {
+ if (bdev_nr_bytes(bdev) % erase_size) {
pr_err("erasesize must be a divisor of device size\n");
goto err_free_block2mtd;
}
@@ -313,7 +304,7 @@ static struct block2mtd_dev *add_device(char *devname, int erase_size,

dev->mtd.name = name;

- dev->mtd.size = bdev->bd_inode->i_size & PAGE_MASK;
+ dev->mtd.size = bdev_nr_bytes(bdev) & PAGE_MASK;
dev->mtd.erasesize = erase_size;
dev->mtd.writesize = 1;
dev->mtd.writebufsize = PAGE_SIZE;
--
2.39.2


2023-12-11 14:10:21

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 07/18] bcachefs: remove dead function bdev_sectors()

From: Yu Kuai <[email protected]>

bdev_sectors() is not used hence remove it.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/bcachefs/util.h | 5 -----
1 file changed, 5 deletions(-)

diff --git a/fs/bcachefs/util.h b/fs/bcachefs/util.h
index 2984b57b2958..22a0acc1704f 100644
--- a/fs/bcachefs/util.h
+++ b/fs/bcachefs/util.h
@@ -516,11 +516,6 @@ static inline unsigned fract_exp_two(unsigned x, unsigned fract_bits)
void bch2_bio_map(struct bio *bio, void *base, size_t);
int bch2_bio_alloc_pages(struct bio *, size_t, gfp_t);

-static inline sector_t bdev_sectors(struct block_device *bdev)
-{
- return bdev->bd_inode->i_size >> 9;
-}
-
#define closure_bio_submit(bio, cl) \
do { \
closure_get(cl); \
--
2.39.2


2023-12-11 14:10:29

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 08/18] bio: export bio_add_folio_nofail()

From: Yu Kuai <[email protected]>

Currently btrfs is using __bio_add_page() in write_dev_supers(). In order
to convert to use folio for bdev in btrfs, export bio_add_folio_nofail()
so that it can replace __bio_add_page().

Signed-off-by: Yu Kuai <[email protected]>
---
block/bio.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/block/bio.c b/block/bio.c
index 5eba53ca953b..a7b2bbb210ee 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1119,6 +1119,7 @@ void bio_add_folio_nofail(struct bio *bio, struct folio *folio, size_t len,
WARN_ON_ONCE(off > UINT_MAX);
__bio_add_page(bio, &folio->page, len, off);
}
+EXPORT_SYMBOL_GPL(bio_add_folio_nofail);

/**
* bio_add_folio - Attempt to add part of a folio to a bio.
--
2.39.2


2023-12-11 14:10:54

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 09/18] btrfs: use bdev apis

From: Yu Kuai <[email protected]>

On the one hand covert to use folio while reading bdev inode, on the
other hand prevent to access bd_inode directly.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/btrfs/disk-io.c | 71 +++++++++++++++++++++-------------------------
fs/btrfs/volumes.c | 17 ++++++-----
fs/btrfs/zoned.c | 15 +++++-----
3 files changed, 48 insertions(+), 55 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 401ea09ae4b8..c373806dc793 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3620,28 +3620,24 @@ ALLOW_ERROR_INJECTION(open_ctree, ERRNO);
static void btrfs_end_super_write(struct bio *bio)
{
struct btrfs_device *device = bio->bi_private;
- struct bio_vec *bvec;
- struct bvec_iter_all iter_all;
- struct page *page;
-
- bio_for_each_segment_all(bvec, bio, iter_all) {
- page = bvec->bv_page;
+ struct folio_iter fi;

+ bio_for_each_folio_all(fi, bio) {
if (bio->bi_status) {
btrfs_warn_rl_in_rcu(device->fs_info,
"lost page write due to IO error on %s (%d)",
btrfs_dev_name(device),
blk_status_to_errno(bio->bi_status));
- ClearPageUptodate(page);
- SetPageError(page);
+ folio_clear_uptodate(fi.folio);
+ folio_set_error(fi.folio);
btrfs_dev_stat_inc_and_print(device,
BTRFS_DEV_STAT_WRITE_ERRS);
} else {
- SetPageUptodate(page);
+ folio_mark_uptodate(fi.folio);
}

- put_page(page);
- unlock_page(page);
+ folio_put(fi.folio);
+ folio_unlock(fi.folio);
}

bio_put(bio);
@@ -3651,9 +3647,9 @@ struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
int copy_num, bool drop_cache)
{
struct btrfs_super_block *super;
- struct page *page;
+ struct folio *folio;
u64 bytenr, bytenr_orig;
- struct address_space *mapping = bdev->bd_inode->i_mapping;
+ unsigned int nofs_flag;
int ret;

bytenr_orig = btrfs_sb_offset(copy_num);
@@ -3674,16 +3670,17 @@ struct btrfs_super_block *btrfs_read_dev_one_super(struct block_device *bdev,
* Drop the page of the primary superblock, so later read will
* always read from the device.
*/
- invalidate_inode_pages2_range(mapping,
- bytenr >> PAGE_SHIFT,
+ invalidate_bdev_range(bdev, bytenr >> PAGE_SHIFT,
(bytenr + BTRFS_SUPER_INFO_SIZE) >> PAGE_SHIFT);
}

- page = read_cache_page_gfp(mapping, bytenr >> PAGE_SHIFT, GFP_NOFS);
- if (IS_ERR(page))
- return ERR_CAST(page);
+ nofs_flag = memalloc_nofs_save();
+ folio = bdev_read_folio(bdev, bytenr);
+ memalloc_nofs_restore(nofs_flag);
+ if (IS_ERR(folio))
+ return ERR_CAST(folio);

- super = page_address(page);
+ super = folio_address(folio);
if (btrfs_super_magic(super) != BTRFS_MAGIC) {
btrfs_release_disk_super(super);
return ERR_PTR(-ENODATA);
@@ -3740,7 +3737,6 @@ static int write_dev_supers(struct btrfs_device *device,
struct btrfs_super_block *sb, int max_mirrors)
{
struct btrfs_fs_info *fs_info = device->fs_info;
- struct address_space *mapping = device->bdev->bd_inode->i_mapping;
SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
int i;
int errors = 0;
@@ -3753,7 +3749,7 @@ static int write_dev_supers(struct btrfs_device *device,
shash->tfm = fs_info->csum_shash;

for (i = 0; i < max_mirrors; i++) {
- struct page *page;
+ struct folio *folio;
struct bio *bio;
struct btrfs_super_block *disk_super;

@@ -3778,9 +3774,10 @@ static int write_dev_supers(struct btrfs_device *device,
BTRFS_SUPER_INFO_SIZE - BTRFS_CSUM_SIZE,
sb->csum);

- page = find_or_create_page(mapping, bytenr >> PAGE_SHIFT,
- GFP_NOFS);
- if (!page) {
+ folio = __bdev_get_folio(device->bdev, bytenr,
+ FGP_LOCK|FGP_ACCESSED|FGP_CREAT,
+ GFP_NOFS);
+ if (IS_ERR(folio)) {
btrfs_err(device->fs_info,
"couldn't get super block page for bytenr %llu",
bytenr);
@@ -3789,9 +3786,9 @@ static int write_dev_supers(struct btrfs_device *device,
}

/* Bump the refcount for wait_dev_supers() */
- get_page(page);
+ folio_get(folio);

- disk_super = page_address(page);
+ disk_super = folio_address(folio);
memcpy(disk_super, sb, BTRFS_SUPER_INFO_SIZE);

/*
@@ -3805,8 +3802,8 @@ static int write_dev_supers(struct btrfs_device *device,
bio->bi_iter.bi_sector = bytenr >> SECTOR_SHIFT;
bio->bi_private = device;
bio->bi_end_io = btrfs_end_super_write;
- __bio_add_page(bio, page, BTRFS_SUPER_INFO_SIZE,
- offset_in_page(bytenr));
+ bio_add_folio_nofail(bio, folio, BTRFS_SUPER_INFO_SIZE,
+ offset_in_folio(folio, bytenr));

/*
* We FUA only the first super block. The others we allow to
@@ -3842,7 +3839,7 @@ static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
max_mirrors = BTRFS_SUPER_MIRROR_MAX;

for (i = 0; i < max_mirrors; i++) {
- struct page *page;
+ struct folio *folio;

ret = btrfs_sb_log_location(device, i, READ, &bytenr);
if (ret == -ENOENT) {
@@ -3857,27 +3854,23 @@ static int wait_dev_supers(struct btrfs_device *device, int max_mirrors)
device->commit_total_bytes)
break;

- page = find_get_page(device->bdev->bd_inode->i_mapping,
- bytenr >> PAGE_SHIFT);
- if (!page) {
+ folio = __bdev_get_folio(device->bdev, bytenr, 0, 0);
+ if (!IS_ERR(folio)) {
errors++;
if (i == 0)
primary_failed = true;
continue;
}
/* Page is submitted locked and unlocked once the IO completes */
- wait_on_page_locked(page);
- if (PageError(page)) {
+ folio_wait_locked(folio);
+ if (folio_test_error(folio)) {
errors++;
if (i == 0)
primary_failed = true;
}

- /* Drop our reference */
- put_page(page);
-
- /* Drop the reference from the writing run */
- put_page(page);
+ /* Drop our reference and the reference from the writing run */
+ folio_put_refs(folio, 2);
}

/* log error, force error return */
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c6f16625af51..b3538777683c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1230,16 +1230,16 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,

void btrfs_release_disk_super(struct btrfs_super_block *super)
{
- struct page *page = virt_to_page(super);
+ struct folio *folio = virt_to_folio(super);

- put_page(page);
+ folio_put(folio);
}

static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev,
u64 bytenr, u64 bytenr_orig)
{
struct btrfs_super_block *disk_super;
- struct page *page;
+ struct folio *folio;
void *p;
pgoff_t index;

@@ -1257,15 +1257,14 @@ static struct btrfs_super_block *btrfs_read_disk_super(struct block_device *bdev
return ERR_PTR(-EINVAL);

/* pull in the page with our super */
- page = read_cache_page_gfp(bdev->bd_inode->i_mapping, index, GFP_KERNEL);
+ folio = bdev_read_folio(bdev, index);
+ if (IS_ERR(folio))
+ return ERR_CAST(folio);

- if (IS_ERR(page))
- return ERR_CAST(page);
-
- p = page_address(page);
+ p = folio_address(folio);

/* align our pointer to the offset of the super block */
- disk_super = p + offset_in_page(bytenr);
+ disk_super = p + offset_in_folio(folio, bytenr);

if (btrfs_super_bytenr(disk_super) != bytenr_orig ||
btrfs_super_magic(disk_super) != BTRFS_MAGIC) {
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index 188378ca19c7..aea0a433b3e5 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -120,8 +120,6 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
return -ENOENT;
} else if (full[0] && full[1]) {
/* Compare two super blocks */
- struct address_space *mapping = bdev->bd_inode->i_mapping;
- struct page *page[BTRFS_NR_SB_LOG_ZONES];
struct btrfs_super_block *super[BTRFS_NR_SB_LOG_ZONES];
int i;

@@ -129,15 +127,18 @@ static int sb_write_pointer(struct block_device *bdev, struct blk_zone *zones,
u64 zone_end = (zones[i].start + zones[i].capacity) << SECTOR_SHIFT;
u64 bytenr = ALIGN_DOWN(zone_end, BTRFS_SUPER_INFO_SIZE) -
BTRFS_SUPER_INFO_SIZE;
+ struct folio *folio;
+ unsigned int nofs_flag;

- page[i] = read_cache_page_gfp(mapping,
- bytenr >> PAGE_SHIFT, GFP_NOFS);
- if (IS_ERR(page[i])) {
+ nofs_flag = memalloc_nofs_save();
+ folio = bdev_read_folio(bdev, bytenr);
+ memalloc_nofs_restore(nofs_flag);
+ if (IS_ERR(folio)) {
if (i == 1)
btrfs_release_disk_super(super[0]);
- return PTR_ERR(page[i]);
+ return PTR_ERR(folio);
}
- super[i] = page_address(page[i]);
+ super[i] = folio_address(folio);
}

if (btrfs_super_generation(super[0]) >
--
2.39.2


2023-12-11 14:11:15

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 10/18] cramfs: use bdev apis in cramfs_blkdev_read()

From: Yu Kuai <[email protected]>

On the one hand covert to use folio while reading bdev inode, on the
other hand prevent to access bd_inode directly.

Also do some cleanup that there is no need for two for loop, and remove
local array pages.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/cramfs/inode.c | 36 +++++++++++++-----------------------
1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index 60dbfa0f8805..fad95d683d97 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -183,9 +183,6 @@ static int next_buffer;
static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
unsigned int len)
{
- struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
- struct file_ra_state ra = {};
- struct page *pages[BLKS_PER_BUF];
unsigned i, blocknr, buffer;
unsigned long devsize;
char *data;
@@ -214,37 +211,30 @@ static void *cramfs_blkdev_read(struct super_block *sb, unsigned int offset,
devsize = bdev_nr_bytes(sb->s_bdev) >> PAGE_SHIFT;

/* Ok, read in BLKS_PER_BUF pages completely first. */
- file_ra_state_init(&ra, mapping);
- page_cache_sync_readahead(mapping, &ra, NULL, blocknr, BLKS_PER_BUF);
-
- for (i = 0; i < BLKS_PER_BUF; i++) {
- struct page *page = NULL;
-
- if (blocknr + i < devsize) {
- page = read_mapping_page(mapping, blocknr + i, NULL);
- /* synchronous error? */
- if (IS_ERR(page))
- page = NULL;
- }
- pages[i] = page;
- }
+ bdev_sync_readahead(sb->s_bdev, NULL, NULL, blocknr, BLKS_PER_BUF);

buffer = next_buffer;
next_buffer = NEXT_BUFFER(buffer);
buffer_blocknr[buffer] = blocknr;
buffer_dev[buffer] = sb;
-
data = read_buffers[buffer];
+
for (i = 0; i < BLKS_PER_BUF; i++) {
- struct page *page = pages[i];
+ struct folio *folio = NULL;
+
+ if (blocknr + i < devsize)
+ folio = bdev_read_folio(sb->s_bdev,
+ (blocknr + i) << PAGE_SHIFT);

- if (page) {
- memcpy_from_page(data, page, 0, PAGE_SIZE);
- put_page(page);
- } else
+ if (IS_ERR_OR_NULL(folio)) {
memset(data, 0, PAGE_SIZE);
+ } else {
+ memcpy_from_folio(data, folio, 0, PAGE_SIZE);
+ folio_put(folio);
+ }
data += PAGE_SIZE;
}
+
return read_buffers[buffer] + offset;
}

--
2.39.2


2023-12-11 14:13:34

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 11/18] erofs: use bdev api

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/erofs/data.c | 18 ++++++++++++------
fs/erofs/internal.h | 2 ++
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index c98aeda8abb2..8cf3618190ab 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -32,8 +32,7 @@ void erofs_put_metabuf(struct erofs_buf *buf)
void *erofs_bread(struct erofs_buf *buf, erofs_blk_t blkaddr,
enum erofs_kmap_type type)
{
- struct inode *inode = buf->inode;
- erofs_off_t offset = (erofs_off_t)blkaddr << inode->i_blkbits;
+ erofs_off_t offset = (erofs_off_t)blkaddr << buf->blkszbits;
pgoff_t index = offset >> PAGE_SHIFT;
struct page *page = buf->page;
struct folio *folio;
@@ -43,7 +42,9 @@ void *erofs_bread(struct erofs_buf *buf, erofs_blk_t blkaddr,
erofs_put_metabuf(buf);

nofs_flag = memalloc_nofs_save();
- folio = read_cache_folio(inode->i_mapping, index, NULL, NULL);
+ folio = buf->inode ?
+ read_mapping_folio(buf->inode->i_mapping, index, NULL) :
+ bdev_read_folio(buf->bdev, offset);
memalloc_nofs_restore(nofs_flag);
if (IS_ERR(folio))
return folio;
@@ -67,10 +68,15 @@ void *erofs_bread(struct erofs_buf *buf, erofs_blk_t blkaddr,

void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
{
- if (erofs_is_fscache_mode(sb))
+ if (erofs_is_fscache_mode(sb)) {
buf->inode = EROFS_SB(sb)->s_fscache->inode;
- else
- buf->inode = sb->s_bdev->bd_inode;
+ buf->bdev = NULL;
+ buf->blkszbits = buf->inode->i_blkbits;
+ } else {
+ buf->inode = NULL;
+ buf->bdev = sb->s_bdev;
+ buf->blkszbits = EROFS_SB(sb)->blkszbits;
+ }
}

void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index b0409badb017..c9206351b485 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -224,8 +224,10 @@ enum erofs_kmap_type {

struct erofs_buf {
struct inode *inode;
+ struct block_device *bdev;
struct page *page;
void *base;
+ u8 blkszbits;
enum erofs_kmap_type kmap_type;
};
#define __EROFS_BUF_INITIALIZER ((struct erofs_buf){ .page = NULL })
--
2.39.2


2023-12-11 14:13:59

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 12/18] gfs2: use bdev api

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/gfs2/glock.c | 2 +-
fs/gfs2/ops_fstype.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index d6bf1f8c25dc..4128a5bc4bb6 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1210,7 +1210,7 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
mapping = gfs2_glock2aspace(gl);
if (mapping) {
mapping->a_ops = &gfs2_meta_aops;
- mapping->host = s->s_bdev->bd_inode;
+ bdev_associated_mapping(s->s_bdev, mapping);
mapping->flags = 0;
mapping_set_gfp_mask(mapping, GFP_NOFS);
mapping->private_data = NULL;
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index b108c5d26839..56ae8959b55a 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -114,7 +114,7 @@ static struct gfs2_sbd *init_sbd(struct super_block *sb)

address_space_init_once(mapping);
mapping->a_ops = &gfs2_rgrp_aops;
- mapping->host = sb->s_bdev->bd_inode;
+ bdev_associated_mapping(sb->s_bdev, mapping);
mapping->flags = 0;
mapping_set_gfp_mask(mapping, GFP_NOFS);
mapping->private_data = NULL;
--
2.39.2


2023-12-11 14:14:18

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 13/18] nilfs2: use bdev api in nilfs_attach_log_writer()

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/nilfs2/segment.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 55e31cc903d1..a1130e384937 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2823,7 +2823,7 @@ int nilfs_attach_log_writer(struct super_block *sb, struct nilfs_root *root)
if (!nilfs->ns_writer)
return -ENOMEM;

- inode_attach_wb(nilfs->ns_bdev->bd_inode, NULL);
+ bdev_attach_wb(nilfs->ns_bdev);

err = nilfs_segctor_start_thread(nilfs->ns_writer);
if (unlikely(err))
--
2.39.2


2023-12-11 14:14:38

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 14/18] jbd2: use bdev apis

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/jbd2/journal.c | 3 +--
fs/jbd2/recovery.c | 6 ++----
2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index ed53188472f9..f1b5ffeaf02a 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2003,8 +2003,7 @@ static int __jbd2_journal_erase(journal_t *journal, unsigned int flags)
byte_count = (block_stop - block_start + 1) *
journal->j_blocksize;

- truncate_inode_pages_range(journal->j_dev->bd_inode->i_mapping,
- byte_start, byte_stop);
+ truncate_bdev_range(journal->j_dev, 0, byte_start, byte_stop);

if (flags & JBD2_JOURNAL_FLUSH_DISCARD) {
err = blkdev_issue_discard(journal->j_dev,
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 01f744cb97a4..6b6a2c4585fa 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -290,7 +290,6 @@ int jbd2_journal_recover(journal_t *journal)

struct recovery_info info;
errseq_t wb_err;
- struct address_space *mapping;

memset(&info, 0, sizeof(info));
sb = journal->j_superblock;
@@ -309,8 +308,7 @@ int jbd2_journal_recover(journal_t *journal)
}

wb_err = 0;
- mapping = journal->j_fs_dev->bd_inode->i_mapping;
- errseq_check_and_advance(&mapping->wb_err, &wb_err);
+ bdev_wb_err_check_and_advance(journal->j_fs_dev, &wb_err);
err = do_one_pass(journal, &info, PASS_SCAN);
if (!err)
err = do_one_pass(journal, &info, PASS_REVOKE);
@@ -334,7 +332,7 @@ int jbd2_journal_recover(journal_t *journal)
err2 = sync_blockdev(journal->j_fs_dev);
if (!err)
err = err2;
- err2 = errseq_check_and_advance(&mapping->wb_err, &wb_err);
+ err2 = bdev_wb_err_check_and_advance(journal->j_fs_dev, &wb_err);
if (!err)
err = err2;
/* Make sure all replayed data is on permanent storage */
--
2.39.2


2023-12-11 14:15:03

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 15/18] buffer: add a new helper to read sb block

From: Yu Kuai <[email protected]>

Unlike __bread_gfp(), ext4 has special handing while reading sb block:

1) __GFP_NOFAIL is not set, and memory allocation can fail;
2) If buffer write failed before, set buffer uptodate and don't read
block from disk;
3) REQ_META is set for all IO, and REQ_PRIO is set for reading xattr;
4) If failed, return error ptr instead of NULL;

This patch add a new helper __bread_gfp2() that will match above 2 and 3(
1 will be used, and 4 will still be encapsulated by ext4), and prepare to
prevent calling mapping_gfp_constraint() directly on bd_inode->i_mapping
in ext4.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/buffer.c | 68 ++++++++++++++++++++++++++-----------
include/linux/buffer_head.h | 18 +++++++++-
2 files changed, 65 insertions(+), 21 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 967f34b70aa8..188bd36c9fea 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1255,16 +1255,19 @@ void __bforget(struct buffer_head *bh)
}
EXPORT_SYMBOL(__bforget);

-static struct buffer_head *__bread_slow(struct buffer_head *bh)
+static struct buffer_head *__bread_slow(struct buffer_head *bh,
+ blk_opf_t op_flags,
+ bool check_write_error)
{
lock_buffer(bh);
- if (buffer_uptodate(bh)) {
+ if (buffer_uptodate(bh) ||
+ (check_write_error && buffer_uptodate_or_error(bh))) {
unlock_buffer(bh);
return bh;
} else {
get_bh(bh);
bh->b_end_io = end_buffer_read_sync;
- submit_bh(REQ_OP_READ, bh);
+ submit_bh(REQ_OP_READ | op_flags, bh);
wait_on_buffer(bh);
if (buffer_uptodate(bh))
return bh;
@@ -1445,6 +1448,31 @@ void __breadahead(struct block_device *bdev, sector_t block, unsigned size)
}
EXPORT_SYMBOL(__breadahead);

+static struct buffer_head *
+bread_gfp(struct block_device *bdev, sector_t block, unsigned int size,
+ blk_opf_t op_flags, gfp_t gfp, bool check_write_error)
+{
+ struct buffer_head *bh;
+
+ gfp |= mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
+
+ /*
+ * Prefer looping in the allocator rather than here, at least that
+ * code knows what it's doing.
+ */
+ gfp |= __GFP_NOFAIL;
+
+ bh = bdev_getblk(bdev, block, size, gfp);
+ if (unlikely(!bh))
+ return NULL;
+
+ if (buffer_uptodate(bh) ||
+ (check_write_error && buffer_uptodate_or_error(bh)))
+ return bh;
+
+ return __bread_slow(bh, op_flags, check_write_error);
+}
+
/**
* __bread_gfp() - reads a specified block and returns the bh
* @bdev: the block_device to read from
@@ -1458,27 +1486,27 @@ EXPORT_SYMBOL(__breadahead);
* It returns NULL if the block was unreadable.
*/
struct buffer_head *
-__bread_gfp(struct block_device *bdev, sector_t block,
- unsigned size, gfp_t gfp)
+__bread_gfp(struct block_device *bdev, sector_t block, unsigned int size,
+ gfp_t gfp)
{
- struct buffer_head *bh;
-
- gfp |= mapping_gfp_constraint(bdev->bd_inode->i_mapping, ~__GFP_FS);
-
- /*
- * Prefer looping in the allocator rather than here, at least that
- * code knows what it's doing.
- */
- gfp |= __GFP_NOFAIL;
-
- bh = bdev_getblk(bdev, block, size, gfp);
-
- if (likely(bh) && !buffer_uptodate(bh))
- bh = __bread_slow(bh);
- return bh;
+ return bread_gfp(bdev, block, size, 0, gfp, false);
}
EXPORT_SYMBOL(__bread_gfp);

+/*
+ * This works like __bread_gfp() except:
+ * 1) If buffer write failed before, set buffer uptodate and don't read
+ * block from disk;
+ * 2) Caller can pass in additional op_flags like REQ_META;
+ */
+struct buffer_head *
+__bread_gfp2(struct block_device *bdev, sector_t block, unsigned int size,
+ blk_opf_t op_flags, gfp_t gfp)
+{
+ return bread_gfp(bdev, block, size, op_flags, gfp, true);
+}
+EXPORT_SYMBOL(__bread_gfp2);
+
static void __invalidate_bh_lrus(struct bh_lru *b)
{
int i;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 5f23ee599889..751b2744b4ae 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -171,6 +171,18 @@ static __always_inline int buffer_uptodate(const struct buffer_head *bh)
return test_bit_acquire(BH_Uptodate, &bh->b_state);
}

+static __always_inline int buffer_uptodate_or_error(struct buffer_head *bh)
+{
+ /*
+ * If the buffer has the write error flag, data was failed to write
+ * out in the block. In this case, set buffer uptodate to prevent
+ * reading old data.
+ */
+ if (buffer_write_io_error(bh))
+ set_buffer_uptodate(bh);
+ return buffer_uptodate(bh);
+}
+
static inline unsigned long bh_offset(const struct buffer_head *bh)
{
return (unsigned long)(bh)->b_data & (page_size(bh->b_page) - 1);
@@ -231,7 +243,11 @@ void __brelse(struct buffer_head *);
void __bforget(struct buffer_head *);
void __breadahead(struct block_device *, sector_t block, unsigned int size);
struct buffer_head *__bread_gfp(struct block_device *,
- sector_t block, unsigned size, gfp_t gfp);
+ sector_t block, unsigned int size, gfp_t gfp);
+struct buffer_head *__bread_gfp2(struct block_device *bdev, sector_t block,
+ unsigned int size, blk_opf_t op_flags,
+ gfp_t gfp);
+
struct buffer_head *alloc_buffer_head(gfp_t gfp_flags);
void free_buffer_head(struct buffer_head * bh);
void unlock_buffer(struct buffer_head *bh);
--
2.39.2


2023-12-11 14:15:23

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 16/18] ext4: use new helper to read sb block

From: Yu Kuai <[email protected]>

Remove __ext4_sb_bread_gfp() and ext4_buffer_uptodate() that is defined
by ext4, and convert to use common helper __bread_gfp2() and
buffer_uptodate_or_error().

Signed-off-by: Yu Kuai <[email protected]>
---
fs/ext4/ext4.h | 13 -------------
fs/ext4/inode.c | 8 ++++----
fs/ext4/super.c | 45 ++++++++++-----------------------------------
fs/ext4/symlink.c | 2 +-
4 files changed, 15 insertions(+), 53 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index a5d784872303..8377f6c5264f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3824,19 +3824,6 @@ extern const struct iomap_ops ext4_iomap_ops;
extern const struct iomap_ops ext4_iomap_overwrite_ops;
extern const struct iomap_ops ext4_iomap_report_ops;

-static inline int ext4_buffer_uptodate(struct buffer_head *bh)
-{
- /*
- * If the buffer has the write error flag, we have failed
- * to write out data in the block. In this case, we don't
- * have to read the block because we may read the old data
- * successfully.
- */
- if (buffer_write_io_error(bh))
- set_buffer_uptodate(bh);
- return buffer_uptodate(bh);
-}
-
#endif /* __KERNEL__ */

#define EFSBADCRC EBADMSG /* Bad CRC detected */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 61277f7f8722..efb0af6f02f7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -887,7 +887,7 @@ struct buffer_head *ext4_bread(handle_t *handle, struct inode *inode,
bh = ext4_getblk(handle, inode, block, map_flags);
if (IS_ERR(bh))
return bh;
- if (!bh || ext4_buffer_uptodate(bh))
+ if (!bh || buffer_uptodate_or_error(bh))
return bh;

ret = ext4_read_bh_lock(bh, REQ_META | REQ_PRIO, true);
@@ -915,7 +915,7 @@ int ext4_bread_batch(struct inode *inode, ext4_lblk_t block, int bh_count,

for (i = 0; i < bh_count; i++)
/* Note that NULL bhs[i] is valid because of holes. */
- if (bhs[i] && !ext4_buffer_uptodate(bhs[i]))
+ if (bhs[i] && !buffer_uptodate_or_error(bhs[i]))
ext4_read_bh_lock(bhs[i], REQ_META | REQ_PRIO, false);

if (!wait)
@@ -4392,11 +4392,11 @@ static int __ext4_get_inode_loc(struct super_block *sb, unsigned long ino,
bh = sb_getblk(sb, block);
if (unlikely(!bh))
return -ENOMEM;
- if (ext4_buffer_uptodate(bh))
+ if (buffer_uptodate_or_error(bh))
goto has_buffer;

lock_buffer(bh);
- if (ext4_buffer_uptodate(bh)) {
+ if (buffer_uptodate_or_error(bh)) {
/* Someone brought it uptodate while we waited */
unlock_buffer(bh);
goto has_buffer;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index c5fcf377ab1f..ae41204f52d4 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -180,7 +180,7 @@ void ext4_read_bh_nowait(struct buffer_head *bh, blk_opf_t op_flags,
{
BUG_ON(!buffer_locked(bh));

- if (ext4_buffer_uptodate(bh)) {
+ if (buffer_uptodate_or_error(bh)) {
unlock_buffer(bh);
return;
}
@@ -191,7 +191,7 @@ int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io
{
BUG_ON(!buffer_locked(bh));

- if (ext4_buffer_uptodate(bh)) {
+ if (buffer_uptodate_or_error(bh)) {
unlock_buffer(bh);
return 0;
}
@@ -214,49 +214,24 @@ int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
return ext4_read_bh(bh, op_flags, NULL);
}

-/*
- * This works like __bread_gfp() except it uses ERR_PTR for error
- * returns. Currently with sb_bread it's impossible to distinguish
- * between ENOMEM and EIO situations (since both result in a NULL
- * return.
- */
-static struct buffer_head *__ext4_sb_bread_gfp(struct super_block *sb,
- sector_t block,
- blk_opf_t op_flags, gfp_t gfp)
-{
- struct buffer_head *bh;
- int ret;
-
- bh = sb_getblk_gfp(sb, block, gfp);
- if (bh == NULL)
- return ERR_PTR(-ENOMEM);
- if (ext4_buffer_uptodate(bh))
- return bh;
-
- ret = ext4_read_bh_lock(bh, REQ_META | op_flags, true);
- if (ret) {
- put_bh(bh);
- return ERR_PTR(ret);
- }
- return bh;
-}
-
struct buffer_head *ext4_sb_bread(struct super_block *sb, sector_t block,
blk_opf_t op_flags)
{
- gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_inode->i_mapping,
- ~__GFP_FS) | __GFP_MOVABLE;
+ struct buffer_head *bh = __bread_gfp2(sb->s_bdev, block,
+ sb->s_blocksize,
+ REQ_META | op_flags,
+ __GFP_MOVABLE);

- return __ext4_sb_bread_gfp(sb, block, op_flags, gfp);
+ return bh ? bh : ERR_PTR(-EIO);
}

struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
sector_t block)
{
- gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_inode->i_mapping,
- ~__GFP_FS);
+ struct buffer_head *bh = __bread_gfp2(sb->s_bdev, block,
+ sb->s_blocksize, 0, 0);

- return __ext4_sb_bread_gfp(sb, block, 0, gfp);
+ return bh ? bh : ERR_PTR(-EIO);
}

void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
diff --git a/fs/ext4/symlink.c b/fs/ext4/symlink.c
index 75bf1f88843c..49e918221aac 100644
--- a/fs/ext4/symlink.c
+++ b/fs/ext4/symlink.c
@@ -94,7 +94,7 @@ static const char *ext4_get_link(struct dentry *dentry, struct inode *inode,
bh = ext4_getblk(NULL, inode, 0, EXT4_GET_BLOCKS_CACHED_NOWAIT);
if (IS_ERR(bh))
return ERR_CAST(bh);
- if (!bh || !ext4_buffer_uptodate(bh))
+ if (!bh || !buffer_uptodate_or_error(bh))
return ERR_PTR(-ECHILD);
} else {
bh = ext4_bread(NULL, inode, 0, 0);
--
2.39.2


2023-12-11 14:15:46

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 17/18] ext4: remove block_device_ejected()

From: Yu Kuai <[email protected]>

block_device_ejected() is added by commit bdfe0cbd746a ("Revert
"ext4: remove block_device_ejected"") in 2015. At that time 'bdi->wb'
is destroyed synchronized from del_gendisk(), hence if ext4 is still
mounted, and then mark_buffer_dirty() will reference destroyed 'wb'.
However, such problem doesn't exist anymore:

- commit d03f6cdc1fc4 ("block: Dynamically allocate and refcount
backing_dev_info") switch bdi to use refcounting;
- commit 13eec2363ef0 ("fs: Get proper reference for s_bdi"), will grab
additional reference of bdi while mounting, so that 'bdi->wb' will not
be destroyed until generic_shutdown_super().

Hence remove this dead function block_device_ejected().

Signed-off-by: Yu Kuai <[email protected]>
---
fs/ext4/super.c | 18 ------------------
1 file changed, 18 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ae41204f52d4..3b5e2b557488 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -467,22 +467,6 @@ static void ext4_maybe_update_superblock(struct super_block *sb)
schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
}

-/*
- * The del_gendisk() function uninitializes the disk-specific data
- * structures, including the bdi structure, without telling anyone
- * else. Once this happens, any attempt to call mark_buffer_dirty()
- * (for example, by ext4_commit_super), will cause a kernel OOPS.
- * This is a kludge to prevent these oops until we can put in a proper
- * hook in del_gendisk() to inform the VFS and file system layers.
- */
-static int block_device_ejected(struct super_block *sb)
-{
- struct inode *bd_inode = sb->s_bdev->bd_inode;
- struct backing_dev_info *bdi = inode_to_bdi(bd_inode);
-
- return bdi->dev == NULL;
-}
-
static void ext4_journal_commit_callback(journal_t *journal, transaction_t *txn)
{
struct super_block *sb = journal->j_private;
@@ -6162,8 +6146,6 @@ static int ext4_commit_super(struct super_block *sb)

if (!sbh)
return -EINVAL;
- if (block_device_ejected(sb))
- return -ENODEV;

ext4_update_super(sb);

--
2.39.2


2023-12-11 14:16:06

by Yu Kuai

[permalink] [raw]
Subject: [PATCH RFC v2 for-6.8/block 18/18] ext4: use bdev apis

From: Yu Kuai <[email protected]>

Avoid to access bd_inode directly, prepare to remove bd_inode from
block_devcie.

Signed-off-by: Yu Kuai <[email protected]>
---
fs/ext4/dir.c | 6 ++----
fs/ext4/ext4_jbd2.c | 6 +++---
fs/ext4/super.c | 3 +--
3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 3985f8c33f95..64e35eb6a324 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -191,10 +191,8 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
pgoff_t index = map.m_pblk >>
(PAGE_SHIFT - inode->i_blkbits);
if (!ra_has_index(&file->f_ra, index))
- page_cache_sync_readahead(
- sb->s_bdev->bd_inode->i_mapping,
- &file->f_ra, file,
- index, 1);
+ bdev_sync_readahead(sb->s_bdev, &file->f_ra,
+ file, index, 1);
file->f_ra.prev_pos = (loff_t)index << PAGE_SHIFT;
bh = ext4_bread(NULL, inode, map.m_lblk, 0);
if (IS_ERR(bh)) {
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index d1a2e6624401..c1bf3a00fad9 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -206,7 +206,6 @@ static void ext4_journal_abort_handle(const char *caller, unsigned int line,

static void ext4_check_bdev_write_error(struct super_block *sb)
{
- struct address_space *mapping = sb->s_bdev->bd_inode->i_mapping;
struct ext4_sb_info *sbi = EXT4_SB(sb);
int err;

@@ -216,9 +215,10 @@ static void ext4_check_bdev_write_error(struct super_block *sb)
* we could read old data from disk and write it out again, which
* may lead to on-disk filesystem inconsistency.
*/
- if (errseq_check(&mapping->wb_err, READ_ONCE(sbi->s_bdev_wb_err))) {
+ if (bdev_wb_err_check(sb->s_bdev, READ_ONCE(sbi->s_bdev_wb_err))) {
spin_lock(&sbi->s_bdev_wb_lock);
- err = errseq_check_and_advance(&mapping->wb_err, &sbi->s_bdev_wb_err);
+ err = bdev_wb_err_check_and_advance(sb->s_bdev,
+ &sbi->s_bdev_wb_err);
spin_unlock(&sbi->s_bdev_wb_lock);
if (err)
ext4_error_err(sb, -err,
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 3b5e2b557488..96724cae622a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5544,8 +5544,7 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
* used to detect the metadata async write error.
*/
spin_lock_init(&sbi->s_bdev_wb_lock);
- errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
- &sbi->s_bdev_wb_err);
+ bdev_wb_err_check_and_advance(sb->s_bdev, &sbi->s_bdev_wb_err);
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
--
2.39.2


2023-12-11 17:02:25

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 18/18] ext4: use bdev apis

On Mon 11-12-23 22:08:39, Yu Kuai wrote:
> From: Yu Kuai <[email protected]>
>
> Avoid to access bd_inode directly, prepare to remove bd_inode from
> block_devcie.
^^^ device

> Signed-off-by: Yu Kuai <[email protected]>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-12-12 01:25:49

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 01/18] block: add some bdev apis

Hi,

?? 2023/12/12 0:52, Jan Kara д??:
> On Mon 11-12-23 22:05:35, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Those apis will be used for other modules, so that bd_inode won't be
>> accessed directly from other modules.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>
> ...
>
>> +void bdev_associated_mapping(struct block_device *bdev,
>> + struct address_space *mapping)
>> +{
>> + mapping->host = bdev->bd_inode;
>> +}
>
> Here I'm not sure - is the helper really a win? It seems a bit obscure to
> me. This initialization of another mapping for a bdev looks really special.

Yes, I don't like this helper at all, but gfs2 is used this way, and I
need this helper to remove 'bd_inode' from block_devcie later. I'm not
familiar with gfs2 at all but perhaps it worth to dig deeper and figure
out a proper way for gfs2.

Thanks,
Kuai
>
> Honza
>


2023-12-12 01:32:38

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 15/18] buffer: add a new helper to read sb block

Hi,

?? 2023/12/12 1:27, Jan Kara д??:
> On Mon 11-12-23 22:07:53, Yu Kuai wrote:
>> From: Yu Kuai <[email protected]>
>>
>> Unlike __bread_gfp(), ext4 has special handing while reading sb block:
>>
>> 1) __GFP_NOFAIL is not set, and memory allocation can fail;
>> 2) If buffer write failed before, set buffer uptodate and don't read
>> block from disk;
>> 3) REQ_META is set for all IO, and REQ_PRIO is set for reading xattr;
>> 4) If failed, return error ptr instead of NULL;
>>
>> This patch add a new helper __bread_gfp2() that will match above 2 and 3(
>> 1 will be used, and 4 will still be encapsulated by ext4), and prepare to
>> prevent calling mapping_gfp_constraint() directly on bd_inode->i_mapping
>> in ext4.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
> ...
>> +/*
>> + * This works like __bread_gfp() except:
>> + * 1) If buffer write failed before, set buffer uptodate and don't read
>> + * block from disk;
>> + * 2) Caller can pass in additional op_flags like REQ_META;
>> + */
>> +struct buffer_head *
>> +__bread_gfp2(struct block_device *bdev, sector_t block, unsigned int size,
>> + blk_opf_t op_flags, gfp_t gfp)
>> +{
>> + return bread_gfp(bdev, block, size, op_flags, gfp, true);
>> +}
>> +EXPORT_SYMBOL(__bread_gfp2);
>
> __bread_gfp2() is not a great name, why not just using bread_gfp()
> directly? I'm not a huge fan of boolean arguments but three different flags
> arguments would be too much for my taste ;) so I guess I can live with
> that.

I agree that __bread_gfp2 is not a greate name, if possible, I'll try to
figure out a better name for v3.

Thanks for reviewing this patchset!
Kuai
>
> Honza
>


2023-12-12 13:15:24

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 01/18] block: add some bdev apis

On Mon, Dec 11, 2023 at 05:52:17PM +0100, Jan Kara wrote:
> > +void bdev_associated_mapping(struct block_device *bdev,
> > + struct address_space *mapping)
> > +{
> > + mapping->host = bdev->bd_inode;
> > +}
>
> Here I'm not sure - is the helper really a win? It seems a bit obscure to
> me. This initialization of another mapping for a bdev looks really special.

If we want to hide bd_inode we'll something like this helper even if
I don't particularly like it either.

But it might be a good idea to move out of this series and into the
follow on removing bd_inode, as it's rather pointless without that
context.

2023-12-12 13:17:38

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 01/18] block: add some bdev apis

> +void invalidate_bdev_range(struct block_device *bdev, pgoff_t start,
> + pgoff_t end)
> +{
> + invalidate_mapping_pages(bdev->bd_inode->i_mapping, start, end);
> +}
> +EXPORT_SYMBOL_GPL(invalidate_bdev_range);

Can we have kerneldoc comments for the new helpers please?

> +struct folio *__bdev_get_folio(struct block_device *bdev, loff_t pos,
> + fgf_t fgp_flags, gfp_t gfp)
> +{
> + return __filemap_get_folio(bdev->bd_inode->i_mapping, pos >> PAGE_SHIFT,
> + fgp_flags, gfp);
> +}
> +EXPORT_SYMBOL_GPL(__bdev_get_folio);

It's a bit silly to have a __-prefixed API without a version that
doesn't have the prefix, so I'd prefer to drop it. Unless willy has
a good argument for keeping it the same as the filemap API.


2023-12-12 13:26:15

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 15/18] buffer: add a new helper to read sb block

On Mon, Dec 11, 2023 at 10:07:53PM +0800, Yu Kuai wrote:
> +static __always_inline int buffer_uptodate_or_error(struct buffer_head *bh)
> +{
> + /*
> + * If the buffer has the write error flag, data was failed to write
> + * out in the block. In this case, set buffer uptodate to prevent
> + * reading old data.
> + */
> + if (buffer_write_io_error(bh))
> + set_buffer_uptodate(bh);
> + return buffer_uptodate(bh);
> +}

So - risking this blows up into a lot of nasty work: Why do we even
clear the uptodate flag on write errors? Doing so makes not sense to
me as the data isn't any less uptodate just because we failed to write
it..


2023-12-12 14:11:34

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 15/18] buffer: add a new helper to read sb block

On Tue 12-12-23 05:25:25, Christoph Hellwig wrote:
> On Mon, Dec 11, 2023 at 10:07:53PM +0800, Yu Kuai wrote:
> > +static __always_inline int buffer_uptodate_or_error(struct buffer_head *bh)
> > +{
> > + /*
> > + * If the buffer has the write error flag, data was failed to write
> > + * out in the block. In this case, set buffer uptodate to prevent
> > + * reading old data.
> > + */
> > + if (buffer_write_io_error(bh))
> > + set_buffer_uptodate(bh);
> > + return buffer_uptodate(bh);
> > +}
>
> So - risking this blows up into a lot of nasty work: Why do we even
> clear the uptodate flag on write errors? Doing so makes not sense to
> me as the data isn't any less uptodate just because we failed to write
> it..

Historic reasons I'd say (buffer_write_io_error isn't *that* old - from
2003 it seems). And yes, it would make a lot of sense to keep uptodate flag
set and just rely on buffer_write_io_error() but it also means going
through all buffer_uptodate() checks in filesystems and determining which
need changing to buffer_write_io_error() which is something nobody is keen
on doing ;)

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-12-13 01:10:30

by Yu Kuai

[permalink] [raw]
Subject: Re: [PATCH RFC v2 for-6.8/block 01/18] block: add some bdev apis

Hi,

?? 2023/12/12 21:14, Christoph Hellwig д??:
> On Mon, Dec 11, 2023 at 05:52:17PM +0100, Jan Kara wrote:
>>> +void bdev_associated_mapping(struct block_device *bdev,
>>> + struct address_space *mapping)
>>> +{
>>> + mapping->host = bdev->bd_inode;
>>> +}
>>
>> Here I'm not sure - is the helper really a win? It seems a bit obscure to
>> me. This initialization of another mapping for a bdev looks really special.
>
> If we want to hide bd_inode we'll something like this helper even if
> I don't particularly like it either.
>
> But it might be a good idea to move out of this series and into the
> follow on removing bd_inode, as it's rather pointless without that
> context.

Yes, this sounds good, I'll remove this from v3.

Thanks,
Kuai

> .
>