2010-07-28 18:16:23

by Vladislav Bolkhovitin

[permalink] [raw]
Subject: RAID/block regression starting from 2.6.32, bisected

Hello,

In recent kernels we are experiencing a problem that in our setup using SCST BLOCKIO backend some BIOs are finished, i.e. the finish callback called for them, with error -EIO. It happens quite often, much more often than one would expect to have an actual IO error. (BLOCKIO backend just converts all incoming SCSI commands to the corresponding block requests.)

After some investigation, we figured out, that, most likely, raid5.c::make_request() for some reason sometimes calls bio_endio() with not BIO_UPTODATE bios.

We bisected it to commit:

commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
Author: Tejun Heo <[email protected]>
Date: Fri Jul 3 17:48:16 2009 +0900

block: use the same failfast bits for bio and request

bio and request use the same set of failfast bits. This patch makes
the following changes to simplify things.

* enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_*
bits coincide with __REQ_FAILFAST_* bits.

* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
but the matching is useless anyway. init_request_from_bio() is
responsible for setting FAILFAST bits on FS requests and non-FS
requests never use BIO_RW_AHEAD. Drop the code and comment from
blk_rq_bio_prep().

* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
simplify FAILFAST flags handling in init_request_from_bio().

Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

After looking at it I can't see how it can lead to the effect we are experiencing. Could anybody comment on this, please? Is it a known problem?

The error can be only reproduced when running RAID 5. The general layout is:

Disks --> RAID5 --> LVM --> BLOCKIO VDISK

The problem is easy to reproduce by forcing the RAID 5 array to re-sync its members, eg just fail out one member and add it back into the array and then generate some IO using dd. In fact, just writing out to the partition table on the exported block device is usually enough to provoke the error.

The complete thread about the topic you can find in http://sourceforge.net/mailarchive/forum.php?thread_name=20100727220110.GF31152%40atlantis.cc.ndsu.nodak.edu&forum_name=scst-devel

If any additional information is needed we would be glad to provide it.

Thanks,
Vlad


2010-07-30 10:30:05

by Tejun Heo

[permalink] [raw]
Subject: Re: RAID/block regression starting from 2.6.32, bisected

Hello,

On 07/28/2010 08:16 PM, Vladislav Bolkhovitin wrote:
> In recent kernels we are experiencing a problem that in our setup
> using SCST BLOCKIO backend some BIOs are finished, i.e. the finish
> callback called for them, with error -EIO. It happens quite often,
> much more often than one would expect to have an actual IO
> error. (BLOCKIO backend just converts all incoming SCSI commands to
> the corresponding block requests.)
>
> After some investigation, we figured out, that, most likely,
> raid5.c::make_request() for some reason sometimes calls bio_endio()
> with not BIO_UPTODATE bios.
>
> We bisected it to commit:
>
> commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
> Author: Tejun Heo <[email protected]>
> Date: Fri Jul 3 17:48:16 2009 +0900
>
> block: use the same failfast bits for bio and request

That commit doesn't (or at least isn't supposed to) make any behavior
difference. It's just repositioning flag bits. If the commit is
actually causing the problem, I think one possibility is that whatever
code could be using hard coded constants which now are mapped to
different flags. The mixed merge changes have been in mainline for
quite some time and shipping in all major distros too and this is the
first time this is reported, so I don't think it could be a widespread
problem.

Thanks.

--
tejun

2010-08-02 00:42:41

by NeilBrown

[permalink] [raw]
Subject: Re: RAID/block regression starting from 2.6.32, bisected

On Fri, 30 Jul 2010 12:29:30 +0200
Tejun Heo <[email protected]> wrote:

> Hello,
>
> On 07/28/2010 08:16 PM, Vladislav Bolkhovitin wrote:
> > In recent kernels we are experiencing a problem that in our setup
> > using SCST BLOCKIO backend some BIOs are finished, i.e. the finish
> > callback called for them, with error -EIO. It happens quite often,
> > much more often than one would expect to have an actual IO
> > error. (BLOCKIO backend just converts all incoming SCSI commands to
> > the corresponding block requests.)
> >
> > After some investigation, we figured out, that, most likely,
> > raid5.c::make_request() for some reason sometimes calls bio_endio()
> > with not BIO_UPTODATE bios.
> >
> > We bisected it to commit:
> >
> > commit a82afdfcb8c0df09776b6458af6b68fc58b2e87b
> > Author: Tejun Heo <[email protected]>
> > Date: Fri Jul 3 17:48:16 2009 +0900
> >
> > block: use the same failfast bits for bio and request
>
> That commit doesn't (or at least isn't supposed to) make any behavior
> difference. It's just repositioning flag bits. If the commit is
> actually causing the problem, I think one possibility is that whatever
> code could be using hard coded constants which now are mapped to
> different flags. The mixed merge changes have been in mainline for
> quite some time and shipping in all major distros too and this is the
> first time this is reported, so I don't think it could be a widespread
> problem.
>
> Thanks.
>

The problem is that md/raid5 tests bio->bi_rw against RWA_MASK, which used to
align with BIO_RW_AHEAD, and now doesn't.
However the definition of bio_rw() in fs.h seems to justify that RWA_MASK
should align with BIO_RW_AHEAD, as does the definition of READA.

Given the current definitions, any WRITE request with BIO_RW_FAILFAST_DEV
set is going to confused a number of drives which test
bio_rw(bio) == WRITE

I guess RWA_MASK needs to be changed to (1<<BIO_RW_AHEAD), and READA need to
be change to that value too.

Can I leave that to you Tejun?

Thanks,
NeilBrown

2010-08-02 14:12:53

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again. A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <[email protected]>
Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
Root-caused-by: Neil Brown <[email protected]>
Cc: Jens Axobe <[email protected]>
Cc: [email protected]
---
Aieee... thanks for root causing it Neil. That was a stupid bug. I
knew that READ/WRITE were hardcoded but forgot about READA. :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream. This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases. The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

include/linux/fs.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
#define RWA_MASK 2
#define READ 0
#define WRITE 1
-#define READA 2 /* read-ahead - don't block if no resources */
-#define SWRITE 3 /* for ll_rw_block() - wait for buffer lock */
+#define READA 16 /* read-ahead - don't block if no resources */
+#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

2010-08-02 14:13:33

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags. This is fragile and caused breakage during BIO_RW_* flag
rearrangement. The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jens Axobe <[email protected]>
---
include/linux/bio.h | 153 +-----------------------------------------
include/linux/bio_types.h | 164 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 17 ++--
3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
@@ -28,6 +28,9 @@

#include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
#define BIO_DEBUG

#ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
#define BIO_MAX_SECTORS (BIO_MAX_SIZE >> 9)

/*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
- struct page *bv_page;
- unsigned int bv_len;
- unsigned int bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
- sector_t bi_sector; /* device address in 512 byte
- sectors */
- struct bio *bi_next; /* request queue link */
- struct block_device *bi_bdev;
- unsigned long bi_flags; /* status, command, etc */
- unsigned long bi_rw; /* bottom bits READ/WRITE,
- * top bits priority
- */
-
- unsigned short bi_vcnt; /* how many bio_vec's */
- unsigned short bi_idx; /* current index into bvl_vec */
-
- /* Number of segments in this BIO after
- * physical address coalescing is performed.
- */
- unsigned int bi_phys_segments;
-
- unsigned int bi_size; /* residual I/O count */
-
- /*
- * To keep track of the max segment size, we account for the
- * sizes of the first and last mergeable segments in this bio.
- */
- unsigned int bi_seg_front_size;
- unsigned int bi_seg_back_size;
-
- unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
-
- unsigned int bi_comp_cpu; /* completion CPU */
-
- atomic_t bi_cnt; /* pin count */
-
- struct bio_vec *bi_io_vec; /* the actual vec list */
-
- bio_end_io_t *bi_end_io;
-
- void *bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
- struct bio_integrity_payload *bi_integrity; /* data integrity */
-#endif
-
- bio_destructor_t *bi_destructor; /* destructor */
-
- /*
- * We can inline a number of vecs at the end of the bio, to avoid
- * double allocations for a small number of bio_vecs. This member
- * MUST obviously be kept at the very end of the bio.
- */
- struct bio_vec bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE 0 /* ok after I/O completion */
-#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
-#define BIO_EOF 2 /* out-out-bounds error */
-#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
-#define BIO_CLONED 4 /* doesn't own data */
-#define BIO_BOUNCED 5 /* bio is a bounce bio */
-#define BIO_USER_MAPPED 6 /* contains user pages */
-#define BIO_EOPNOTSUPP 7 /* not supported */
-#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
-#define BIO_QUIET 11 /* Make BIO Quiet */
-#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS (4)
-#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- * If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- * Insert a serialization point in the IO queue, forcing previously
- * submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- * Used for tracing to differentiate metadata and data IO. May also
- * get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- * Informs the lower level device that this range of sectors is no longer
- * used by the file system and may thus be freed by the device. Used
- * for flash based storage.
- * Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
- one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
- BIO_RW,
- BIO_RW_FAILFAST_DEV,
- BIO_RW_FAILFAST_TRANSPORT,
- BIO_RW_FAILFAST_DRIVER,
- /* above flags must match REQ_* */
- BIO_RW_AHEAD,
- BIO_RW_BARRIER,
- BIO_RW_SYNCIO,
- BIO_RW_UNPLUG,
- BIO_RW_META,
- BIO_RW_DISCARD,
- BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK 0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
- return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
* upper 16 bits of bi_rw define the io priority of this bio
*/
#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants. Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+ struct page *bv_page;
+ unsigned int bv_len;
+ unsigned int bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+ sector_t bi_sector; /* device address in 512 byte
+ sectors */
+ struct bio *bi_next; /* request queue link */
+ struct block_device *bi_bdev;
+ unsigned long bi_flags; /* status, command, etc */
+ unsigned long bi_rw; /* bottom bits READ/WRITE,
+ * top bits priority
+ */
+
+ unsigned short bi_vcnt; /* how many bio_vec's */
+ unsigned short bi_idx; /* current index into bvl_vec */
+
+ /* Number of segments in this BIO after
+ * physical address coalescing is performed.
+ */
+ unsigned int bi_phys_segments;
+
+ unsigned int bi_size; /* residual I/O count */
+
+ /*
+ * To keep track of the max segment size, we account for the
+ * sizes of the first and last mergeable segments in this bio.
+ */
+ unsigned int bi_seg_front_size;
+ unsigned int bi_seg_back_size;
+
+ unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
+
+ unsigned int bi_comp_cpu; /* completion CPU */
+
+ atomic_t bi_cnt; /* pin count */
+
+ struct bio_vec *bi_io_vec; /* the actual vec list */
+
+ bio_end_io_t *bi_end_io;
+
+ void *bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ struct bio_integrity_payload *bi_integrity; /* data integrity */
+#endif
+
+ bio_destructor_t *bi_destructor; /* destructor */
+
+ /*
+ * We can inline a number of vecs at the end of the bio, to avoid
+ * double allocations for a small number of bio_vecs. This member
+ * MUST obviously be kept at the very end of the bio.
+ */
+ struct bio_vec bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE 0 /* ok after I/O completion */
+#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
+#define BIO_EOF 2 /* out-out-bounds error */
+#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
+#define BIO_CLONED 4 /* doesn't own data */
+#define BIO_BOUNCED 5 /* bio is a bounce bio */
+#define BIO_USER_MAPPED 6 /* contains user pages */
+#define BIO_EOPNOTSUPP 7 /* not supported */
+#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
+#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS (4)
+#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ * If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ * Insert a serialization point in the IO queue, forcing previously
+ * submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ * Used for tracing to differentiate metadata and data IO. May also
+ * get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ * Informs the lower level device that this range of sectors is no longer
+ * used by the file system and may thus be freed by the device. Used
+ * for flash based storage.
+ * Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+ one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+ BIO_RW,
+ BIO_RW_FAILFAST_DEV,
+ BIO_RW_FAILFAST_TRANSPORT,
+ BIO_RW_FAILFAST_DRIVER,
+ /* above flags must match REQ_* */
+ BIO_RW_AHEAD,
+ BIO_RW_BARRIER,
+ BIO_RW_SYNCIO,
+ BIO_RW_UNPLUG,
+ BIO_RW_META,
+ BIO_RW_DISCARD,
+ BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK 0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+ return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

#include <linux/limits.h>
#include <linux/ioctl.h>
+#include <linux/bio_types.h>

/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
* immediately wait on this read without caring about
* unplugging.
* READA Used for read-ahead operations. Lower priority, and the
- * block layer could (in theory) choose to ignore this
+ * block layer could (in theory) choose to ignore this
* request if it runs into resource problems.
* WRITE A normal async write. Device will be plugged.
* SWRITE Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
* of this IO.
*
*/
-#define RW_MASK 1
-#define RWA_MASK 2
-#define READ 0
-#define WRITE 1
-#define READA 16 /* read-ahead - don't block if no resources */
-#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK (1 << BIO_RW)
+#define RWA_MASK (1 << BIO_RW_AHEAD)
+#define READ 0
+#define WRITE RW_MASK
+#define READA RWA_MASK
+#define SWRITE (WRITE | READA)
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
extern void file_move(struct file *f, struct list_head *list);
extern void file_kill(struct file *f);
#ifdef CONFIG_BLOCK
-struct bio;
extern void submit_bio(int, struct bio *);
extern int bdev_read_only(struct block_device *);
#endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
#endif

#ifdef CONFIG_BLOCK
-struct bio;
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
loff_t file_offset);
void dio_end_io(struct bio *bio, int error);

2010-08-02 14:15:50

by Tejun Heo

[permalink] [raw]
Subject: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again. A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <[email protected]>
Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
Root-caused-by: Neil Brown <[email protected]>
Cc: Jens Axobe <[email protected]>
Cc: [email protected]
---
Aieee... thanks for root causing it Neil. That was a stupid bug. I
knew that READ/WRITE were hardcoded but forgot about READA. :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream. This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases. The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

(resending w/ Jens' new address)

include/linux/fs.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
#define RWA_MASK 2
#define READ 0
#define WRITE 1
-#define READA 2 /* read-ahead - don't block if no resources */
-#define SWRITE 3 /* for ll_rw_block() - wait for buffer lock */
+#define READA 16 /* read-ahead - don't block if no resources */
+#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

2010-08-02 14:16:10

by Tejun Heo

[permalink] [raw]
Subject: [PATCH RESEND 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags. This is fragile and caused breakage during BIO_RW_* flag
rearrangement. The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jens Axobe <[email protected]>
---
include/linux/bio.h | 153 +-----------------------------------------
include/linux/bio_types.h | 164 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 17 ++--
3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
@@ -28,6 +28,9 @@

#include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
#define BIO_DEBUG

#ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
#define BIO_MAX_SECTORS (BIO_MAX_SIZE >> 9)

/*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
- struct page *bv_page;
- unsigned int bv_len;
- unsigned int bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
- sector_t bi_sector; /* device address in 512 byte
- sectors */
- struct bio *bi_next; /* request queue link */
- struct block_device *bi_bdev;
- unsigned long bi_flags; /* status, command, etc */
- unsigned long bi_rw; /* bottom bits READ/WRITE,
- * top bits priority
- */
-
- unsigned short bi_vcnt; /* how many bio_vec's */
- unsigned short bi_idx; /* current index into bvl_vec */
-
- /* Number of segments in this BIO after
- * physical address coalescing is performed.
- */
- unsigned int bi_phys_segments;
-
- unsigned int bi_size; /* residual I/O count */
-
- /*
- * To keep track of the max segment size, we account for the
- * sizes of the first and last mergeable segments in this bio.
- */
- unsigned int bi_seg_front_size;
- unsigned int bi_seg_back_size;
-
- unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
-
- unsigned int bi_comp_cpu; /* completion CPU */
-
- atomic_t bi_cnt; /* pin count */
-
- struct bio_vec *bi_io_vec; /* the actual vec list */
-
- bio_end_io_t *bi_end_io;
-
- void *bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
- struct bio_integrity_payload *bi_integrity; /* data integrity */
-#endif
-
- bio_destructor_t *bi_destructor; /* destructor */
-
- /*
- * We can inline a number of vecs at the end of the bio, to avoid
- * double allocations for a small number of bio_vecs. This member
- * MUST obviously be kept at the very end of the bio.
- */
- struct bio_vec bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE 0 /* ok after I/O completion */
-#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
-#define BIO_EOF 2 /* out-out-bounds error */
-#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
-#define BIO_CLONED 4 /* doesn't own data */
-#define BIO_BOUNCED 5 /* bio is a bounce bio */
-#define BIO_USER_MAPPED 6 /* contains user pages */
-#define BIO_EOPNOTSUPP 7 /* not supported */
-#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
-#define BIO_QUIET 11 /* Make BIO Quiet */
-#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS (4)
-#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- * If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- * Insert a serialization point in the IO queue, forcing previously
- * submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- * Used for tracing to differentiate metadata and data IO. May also
- * get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- * Informs the lower level device that this range of sectors is no longer
- * used by the file system and may thus be freed by the device. Used
- * for flash based storage.
- * Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
- one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
- BIO_RW,
- BIO_RW_FAILFAST_DEV,
- BIO_RW_FAILFAST_TRANSPORT,
- BIO_RW_FAILFAST_DRIVER,
- /* above flags must match REQ_* */
- BIO_RW_AHEAD,
- BIO_RW_BARRIER,
- BIO_RW_SYNCIO,
- BIO_RW_UNPLUG,
- BIO_RW_META,
- BIO_RW_DISCARD,
- BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK 0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
- return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
* upper 16 bits of bi_rw define the io priority of this bio
*/
#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants. Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+ struct page *bv_page;
+ unsigned int bv_len;
+ unsigned int bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+ sector_t bi_sector; /* device address in 512 byte
+ sectors */
+ struct bio *bi_next; /* request queue link */
+ struct block_device *bi_bdev;
+ unsigned long bi_flags; /* status, command, etc */
+ unsigned long bi_rw; /* bottom bits READ/WRITE,
+ * top bits priority
+ */
+
+ unsigned short bi_vcnt; /* how many bio_vec's */
+ unsigned short bi_idx; /* current index into bvl_vec */
+
+ /* Number of segments in this BIO after
+ * physical address coalescing is performed.
+ */
+ unsigned int bi_phys_segments;
+
+ unsigned int bi_size; /* residual I/O count */
+
+ /*
+ * To keep track of the max segment size, we account for the
+ * sizes of the first and last mergeable segments in this bio.
+ */
+ unsigned int bi_seg_front_size;
+ unsigned int bi_seg_back_size;
+
+ unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
+
+ unsigned int bi_comp_cpu; /* completion CPU */
+
+ atomic_t bi_cnt; /* pin count */
+
+ struct bio_vec *bi_io_vec; /* the actual vec list */
+
+ bio_end_io_t *bi_end_io;
+
+ void *bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ struct bio_integrity_payload *bi_integrity; /* data integrity */
+#endif
+
+ bio_destructor_t *bi_destructor; /* destructor */
+
+ /*
+ * We can inline a number of vecs at the end of the bio, to avoid
+ * double allocations for a small number of bio_vecs. This member
+ * MUST obviously be kept at the very end of the bio.
+ */
+ struct bio_vec bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE 0 /* ok after I/O completion */
+#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
+#define BIO_EOF 2 /* out-out-bounds error */
+#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
+#define BIO_CLONED 4 /* doesn't own data */
+#define BIO_BOUNCED 5 /* bio is a bounce bio */
+#define BIO_USER_MAPPED 6 /* contains user pages */
+#define BIO_EOPNOTSUPP 7 /* not supported */
+#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
+#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS (4)
+#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ * If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ * Insert a serialization point in the IO queue, forcing previously
+ * submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ * Used for tracing to differentiate metadata and data IO. May also
+ * get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ * Informs the lower level device that this range of sectors is no longer
+ * used by the file system and may thus be freed by the device. Used
+ * for flash based storage.
+ * Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+ one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+ BIO_RW,
+ BIO_RW_FAILFAST_DEV,
+ BIO_RW_FAILFAST_TRANSPORT,
+ BIO_RW_FAILFAST_DRIVER,
+ /* above flags must match REQ_* */
+ BIO_RW_AHEAD,
+ BIO_RW_BARRIER,
+ BIO_RW_SYNCIO,
+ BIO_RW_UNPLUG,
+ BIO_RW_META,
+ BIO_RW_DISCARD,
+ BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK 0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+ return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

#include <linux/limits.h>
#include <linux/ioctl.h>
+#include <linux/bio_types.h>

/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
* immediately wait on this read without caring about
* unplugging.
* READA Used for read-ahead operations. Lower priority, and the
- * block layer could (in theory) choose to ignore this
+ * block layer could (in theory) choose to ignore this
* request if it runs into resource problems.
* WRITE A normal async write. Device will be plugged.
* SWRITE Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
* of this IO.
*
*/
-#define RW_MASK 1
-#define RWA_MASK 2
-#define READ 0
-#define WRITE 1
-#define READA 16 /* read-ahead - don't block if no resources */
-#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK (1 << BIO_RW)
+#define RWA_MASK (1 << BIO_RW_AHEAD)
+#define READ 0
+#define WRITE RW_MASK
+#define READA RWA_MASK
+#define SWRITE (WRITE | READA)
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
extern void file_move(struct file *f, struct list_head *list);
extern void file_kill(struct file *f);
#ifdef CONFIG_BLOCK
-struct bio;
extern void submit_bio(int, struct bio *);
extern int bdev_read_only(struct block_device *);
#endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
#endif

#ifdef CONFIG_BLOCK
-struct bio;
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
loff_t file_offset);
void dio_end_io(struct bio *bio, int error);

2010-08-02 14:18:10

by Tejun Heo

[permalink] [raw]
Subject: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again. A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <[email protected]>
Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
Root-caused-by: Neil Brown <[email protected]>
Cc: Jens Axobe <[email protected]>
Cc: [email protected]
---
Aieee... thanks for root causing it Neil. That was a stupid bug. I
knew that READ/WRITE were hardcoded but forgot about READA. :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusions downstream. This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases. The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.

(resending w/ Jens' new address)

include/linux/fs.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -148,8 +148,8 @@ struct inodes_stat_t {
#define RWA_MASK 2
#define READ 0
#define WRITE 1
-#define READA 2 /* read-ahead - don't block if no resources */
-#define SWRITE 3 /* for ll_rw_block() - wait for buffer lock */
+#define READA 16 /* read-ahead - don't block if no resources */
+#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

2010-08-02 14:18:19

by Tejun Heo

[permalink] [raw]
Subject: [PATCH RESEND 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags. This is fragile and caused breakage during BIO_RW_* flag
rearrangement. The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jens Axobe <[email protected]>
---
include/linux/bio.h | 153 +-----------------------------------------
include/linux/bio_types.h | 164 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 17 ++--
3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
@@ -28,6 +28,9 @@

#include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in bio_types.h */
+#include <linux/bio_types.h>
+
#define BIO_DEBUG

#ifdef BIO_DEBUG
@@ -41,154 +44,6 @@
#define BIO_MAX_SECTORS (BIO_MAX_SIZE >> 9)

/*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
- struct page *bv_page;
- unsigned int bv_len;
- unsigned int bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
- sector_t bi_sector; /* device address in 512 byte
- sectors */
- struct bio *bi_next; /* request queue link */
- struct block_device *bi_bdev;
- unsigned long bi_flags; /* status, command, etc */
- unsigned long bi_rw; /* bottom bits READ/WRITE,
- * top bits priority
- */
-
- unsigned short bi_vcnt; /* how many bio_vec's */
- unsigned short bi_idx; /* current index into bvl_vec */
-
- /* Number of segments in this BIO after
- * physical address coalescing is performed.
- */
- unsigned int bi_phys_segments;
-
- unsigned int bi_size; /* residual I/O count */
-
- /*
- * To keep track of the max segment size, we account for the
- * sizes of the first and last mergeable segments in this bio.
- */
- unsigned int bi_seg_front_size;
- unsigned int bi_seg_back_size;
-
- unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
-
- unsigned int bi_comp_cpu; /* completion CPU */
-
- atomic_t bi_cnt; /* pin count */
-
- struct bio_vec *bi_io_vec; /* the actual vec list */
-
- bio_end_io_t *bi_end_io;
-
- void *bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
- struct bio_integrity_payload *bi_integrity; /* data integrity */
-#endif
-
- bio_destructor_t *bi_destructor; /* destructor */
-
- /*
- * We can inline a number of vecs at the end of the bio, to avoid
- * double allocations for a small number of bio_vecs. This member
- * MUST obviously be kept at the very end of the bio.
- */
- struct bio_vec bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE 0 /* ok after I/O completion */
-#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
-#define BIO_EOF 2 /* out-out-bounds error */
-#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
-#define BIO_CLONED 4 /* doesn't own data */
-#define BIO_BOUNCED 5 /* bio is a bounce bio */
-#define BIO_USER_MAPPED 6 /* contains user pages */
-#define BIO_EOPNOTSUPP 7 /* not supported */
-#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
-#define BIO_QUIET 11 /* Make BIO Quiet */
-#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS (4)
-#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
-
-/*
- * bio bi_rw flags
- *
- * bit 0 -- data direction
- * If not set, bio is a read from device. If set, it's a write to device.
- * bit 1 -- fail fast device errors
- * bit 2 -- fail fast transport errors
- * bit 3 -- fail fast driver errors
- * bit 4 -- rw-ahead when set
- * bit 5 -- barrier
- * Insert a serialization point in the IO queue, forcing previously
- * submitted IO to be completed before this one is issued.
- * bit 6 -- synchronous I/O hint.
- * bit 7 -- Unplug the device immediately after submitting this bio.
- * bit 8 -- metadata request
- * Used for tracing to differentiate metadata and data IO. May also
- * get some preferential treatment in the IO scheduler
- * bit 9 -- discard sectors
- * Informs the lower level device that this range of sectors is no longer
- * used by the file system and may thus be freed by the device. Used
- * for flash based storage.
- * Don't want driver retries for any fast fail whatever the reason.
- * bit 10 -- Tell the IO scheduler not to wait for more requests after this
- one has been submitted, even if it is a SYNC request.
- */
-enum bio_rw_flags {
- BIO_RW,
- BIO_RW_FAILFAST_DEV,
- BIO_RW_FAILFAST_TRANSPORT,
- BIO_RW_FAILFAST_DRIVER,
- /* above flags must match REQ_* */
- BIO_RW_AHEAD,
- BIO_RW_BARRIER,
- BIO_RW_SYNCIO,
- BIO_RW_UNPLUG,
- BIO_RW_META,
- BIO_RW_DISCARD,
- BIO_RW_NOIDLE,
-};
-
-/*
- * First four bits must match between bio->bi_rw and rq->cmd_flags, make
- * that explicit here.
- */
-#define BIO_RW_RQ_MASK 0xf
-
-static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
-{
- return (bio->bi_rw & (1 << flag)) != 0;
-}
-
-/*
* upper 16 bits of bi_rw define the io priority of this bio
*/
#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/bio_types.h
===================================================================
--- /dev/null
+++ work/include/linux/bio_types.h
@@ -0,0 +1,164 @@
+/*
+ * BIO data types and constants. Include linux/bio.h for usual cases.
+ * Directly include this file only to break include dependency loop.
+ */
+#ifndef __LINUX_BIO_TYPES_H
+#define __LINUX_BIO_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+ struct page *bv_page;
+ unsigned int bv_len;
+ unsigned int bv_offset;
+};
+
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+ sector_t bi_sector; /* device address in 512 byte
+ sectors */
+ struct bio *bi_next; /* request queue link */
+ struct block_device *bi_bdev;
+ unsigned long bi_flags; /* status, command, etc */
+ unsigned long bi_rw; /* bottom bits READ/WRITE,
+ * top bits priority
+ */
+
+ unsigned short bi_vcnt; /* how many bio_vec's */
+ unsigned short bi_idx; /* current index into bvl_vec */
+
+ /* Number of segments in this BIO after
+ * physical address coalescing is performed.
+ */
+ unsigned int bi_phys_segments;
+
+ unsigned int bi_size; /* residual I/O count */
+
+ /*
+ * To keep track of the max segment size, we account for the
+ * sizes of the first and last mergeable segments in this bio.
+ */
+ unsigned int bi_seg_front_size;
+ unsigned int bi_seg_back_size;
+
+ unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
+
+ unsigned int bi_comp_cpu; /* completion CPU */
+
+ atomic_t bi_cnt; /* pin count */
+
+ struct bio_vec *bi_io_vec; /* the actual vec list */
+
+ bio_end_io_t *bi_end_io;
+
+ void *bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ struct bio_integrity_payload *bi_integrity; /* data integrity */
+#endif
+
+ bio_destructor_t *bi_destructor; /* destructor */
+
+ /*
+ * We can inline a number of vecs at the end of the bio, to avoid
+ * double allocations for a small number of bio_vecs. This member
+ * MUST obviously be kept at the very end of the bio.
+ */
+ struct bio_vec bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE 0 /* ok after I/O completion */
+#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
+#define BIO_EOF 2 /* out-out-bounds error */
+#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
+#define BIO_CLONED 4 /* doesn't own data */
+#define BIO_BOUNCED 5 /* bio is a bounce bio */
+#define BIO_USER_MAPPED 6 /* contains user pages */
+#define BIO_EOPNOTSUPP 7 /* not supported */
+#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
+#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS (4)
+#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * bio bi_rw flags
+ *
+ * bit 0 -- data direction
+ * If not set, bio is a read from device. If set, it's a write to device.
+ * bit 1 -- fail fast device errors
+ * bit 2 -- fail fast transport errors
+ * bit 3 -- fail fast driver errors
+ * bit 4 -- rw-ahead when set
+ * bit 5 -- barrier
+ * Insert a serialization point in the IO queue, forcing previously
+ * submitted IO to be completed before this one is issued.
+ * bit 6 -- synchronous I/O hint.
+ * bit 7 -- Unplug the device immediately after submitting this bio.
+ * bit 8 -- metadata request
+ * Used for tracing to differentiate metadata and data IO. May also
+ * get some preferential treatment in the IO scheduler
+ * bit 9 -- discard sectors
+ * Informs the lower level device that this range of sectors is no longer
+ * used by the file system and may thus be freed by the device. Used
+ * for flash based storage.
+ * Don't want driver retries for any fast fail whatever the reason.
+ * bit 10 -- Tell the IO scheduler not to wait for more requests after this
+ one has been submitted, even if it is a SYNC request.
+ */
+enum bio_rw_flags {
+ BIO_RW,
+ BIO_RW_FAILFAST_DEV,
+ BIO_RW_FAILFAST_TRANSPORT,
+ BIO_RW_FAILFAST_DRIVER,
+ /* above flags must match REQ_* */
+ BIO_RW_AHEAD,
+ BIO_RW_BARRIER,
+ BIO_RW_SYNCIO,
+ BIO_RW_UNPLUG,
+ BIO_RW_META,
+ BIO_RW_DISCARD,
+ BIO_RW_NOIDLE,
+};
+
+/*
+ * First four bits must match between bio->bi_rw and rq->cmd_flags, make
+ * that explicit here.
+ */
+#define BIO_RW_RQ_MASK 0xf
+
+static inline bool bio_rw_flagged(struct bio *bio, enum bio_rw_flags flag)
+{
+ return (bio->bi_rw & (1 << flag)) != 0;
+}
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BIO_TYPES_H */
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

#include <linux/limits.h>
#include <linux/ioctl.h>
+#include <linux/bio_types.h>

/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
* immediately wait on this read without caring about
* unplugging.
* READA Used for read-ahead operations. Lower priority, and the
- * block layer could (in theory) choose to ignore this
+ * block layer could (in theory) choose to ignore this
* request if it runs into resource problems.
* WRITE A normal async write. Device will be plugged.
* SWRITE Like WRITE, but a special case for ll_rw_block() that
@@ -144,12 +145,12 @@ struct inodes_stat_t {
* of this IO.
*
*/
-#define RW_MASK 1
-#define RWA_MASK 2
-#define READ 0
-#define WRITE 1
-#define READA 16 /* read-ahead - don't block if no resources */
-#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
+#define RW_MASK (1 << BIO_RW)
+#define RWA_MASK (1 << BIO_RW_AHEAD)
+#define READ 0
+#define WRITE RW_MASK
+#define READA RWA_MASK
+#define SWRITE (WRITE | READA)
#define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
#define READ_META (READ | (1 << BIO_RW_META))
#define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))
@@ -2198,7 +2199,6 @@ static inline void insert_inode_hash(str
extern void file_move(struct file *f, struct list_head *list);
extern void file_kill(struct file *f);
#ifdef CONFIG_BLOCK
-struct bio;
extern void submit_bio(int, struct bio *);
extern int bdev_read_only(struct block_device *);
#endif
@@ -2265,7 +2265,6 @@ static inline int xip_truncate_page(stru
#endif

#ifdef CONFIG_BLOCK
-struct bio;
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
loff_t file_offset);
void dio_end_io(struct bio *bio, int error);

2010-08-02 20:04:58

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

On 08/02/2010 04:17 PM, Tejun Heo wrote:
> Aieee... thanks for root causing it Neil. That was a stupid bug. I
> knew that READ/WRITE were hardcoded but forgot about READA. :-(
> Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
> afraid that would cause more confusions downstream. This patch
> updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
> in -stable releases. The next patch will create bio_types.h and
> define all constants in terms of BIO_RW_*.

Tejun, care to resend these against for-2.6.36? We can reference
these for the stable backport (at least the first one should go in).

--
Jens Axboe

2010-08-02 21:52:23

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

On Mon, 02 Aug 2010 16:15:38 +0200
Tejun Heo <[email protected]> wrote:

> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
> 2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
> breaking READA and SWRITE.
>
> This patch updates READA and SWRITE such that they match the BIO_RW_*
> bits again. A follow up patch will update the definitions to directly
> use BIO_RW_* bits so that this kind of breakage won't happen again.
>
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
> Root-caused-by: Neil Brown <[email protected]>
> Cc: Jens Axobe <[email protected]>
> Cc: [email protected]
> ---
> Aieee... thanks for root causing it Neil. That was a stupid bug. I
> knew that READ/WRITE were hardcoded but forgot about READA. :-(
> Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
> afraid that would cause more confusions downstream. This patch
> updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
> in -stable releases. The next patch will create bio_types.h and
> define all constants in terms of BIO_RW_*.
>
> Thanks.
>
> (resending w/ Jens' new address)
>
> include/linux/fs.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: work/include/linux/fs.h
> ===================================================================
> --- work.orig/include/linux/fs.h
> +++ work/include/linux/fs.h
> @@ -148,8 +148,8 @@ struct inodes_stat_t {
> #define RWA_MASK 2

Close, but not quite there - RWA_MASK must be 16 too !!

Thanks,
NeilBrown


> #define READ 0
> #define WRITE 1
> -#define READA 2 /* read-ahead - don't block if no resources */
> -#define SWRITE 3 /* for ll_rw_block() - wait for buffer lock */
> +#define READA 16 /* read-ahead - don't block if no resources */
> +#define SWRITE 17 /* for ll_rw_block() - wait for buffer lock */
> #define READ_SYNC (READ | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_UNPLUG))
> #define READ_META (READ | (1 << BIO_RW_META))
> #define WRITE_SYNC_PLUG (WRITE | (1 << BIO_RW_SYNCIO) | (1 << BIO_RW_NOIDLE))

2010-08-03 09:27:53

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

On 08/02/2010 11:52 PM, Neil Brown wrote:
> Close, but not quite there - RWA_MASK must be 16 too !!

Indeed. Updated.

Thanks.

--
tejun

2010-08-03 09:53:45

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and SWRITE to match the corresponding BIO_RW_* bits

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
bits. READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
instead of bit 1, breaking RWA_MASK, READA and SWRITE.

This patch updates RWA_MASK, READA and SWRITE such that they match the
BIO_RW_* bits again. A follow up patch will update the definitions to
directly use BIO_RW_* bits so that this kind of breakage won't happen
again.

Neil also spotted missing RWA_MASK conversion.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <[email protected]>
Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
Root-caused-by: Neil Brown <[email protected]>
Cc: Jens Axobe <[email protected]>
Cc: [email protected]
---
Here's the regenerated version also w/ the missing RWA_MASK conversion
Neil spotted.

Thanks.

include/linux/fs.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -145,12 +145,12 @@ struct inodes_stat_t {
*
*/
#define RW_MASK 1
-#define RWA_MASK 2
+#define RWA_MASK 16

#define READ 0
#define WRITE 1
-#define READA 2 /* readahead - don't block if no resources */
-#define SWRITE 3 /* for ll_rw_block() - wait for buffer lock */
+#define READA 16 /* readahead - don't block if no resources */
+#define SWRITE 17 /* for ll_rw_block(), wait for buffer lock */

#define READ_SYNC (READ | REQ_SYNC | REQ_UNPLUG)
#define READ_META (READ | REQ_META)

2010-08-03 09:54:55

by Tejun Heo

[permalink] [raw]
Subject: [PATCH 2/2 block#for-2.6.36] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags. This is fragile and caused breakage during BIO_RW_* flag
rearrangement. The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contatins definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Jens Axobe <[email protected]>
---
I renamed the file to blk_types.h instead of bio_types.h as it now
contains the REQ bits too.

include/linux/bio.h | 183 -------------------------------------------
include/linux/blk_types.h | 193 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/fs.h | 15 +--
3 files changed, 204 insertions(+), 187 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
@@ -28,6 +28,9 @@

#include <asm/io.h>

+/* struct bio, bio_vec and BIO_* flags are defined in blk_types.h */
+#include <linux/blk_types.h>
+
#define BIO_DEBUG

#ifdef BIO_DEBUG
@@ -41,184 +44,6 @@
#define BIO_MAX_SECTORS (BIO_MAX_SIZE >> 9)

/*
- * was unsigned short, but we might as well be ready for > 64kB I/O pages
- */
-struct bio_vec {
- struct page *bv_page;
- unsigned int bv_len;
- unsigned int bv_offset;
-};
-
-struct bio_set;
-struct bio;
-struct bio_integrity_payload;
-typedef void (bio_end_io_t) (struct bio *, int);
-typedef void (bio_destructor_t) (struct bio *);
-
-/*
- * main unit of I/O for the block layer and lower layers (ie drivers and
- * stacking drivers)
- */
-struct bio {
- sector_t bi_sector; /* device address in 512 byte
- sectors */
- struct bio *bi_next; /* request queue link */
- struct block_device *bi_bdev;
- unsigned long bi_flags; /* status, command, etc */
- unsigned long bi_rw; /* bottom bits READ/WRITE,
- * top bits priority
- */
-
- unsigned short bi_vcnt; /* how many bio_vec's */
- unsigned short bi_idx; /* current index into bvl_vec */
-
- /* Number of segments in this BIO after
- * physical address coalescing is performed.
- */
- unsigned int bi_phys_segments;
-
- unsigned int bi_size; /* residual I/O count */
-
- /*
- * To keep track of the max segment size, we account for the
- * sizes of the first and last mergeable segments in this bio.
- */
- unsigned int bi_seg_front_size;
- unsigned int bi_seg_back_size;
-
- unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
-
- unsigned int bi_comp_cpu; /* completion CPU */
-
- atomic_t bi_cnt; /* pin count */
-
- struct bio_vec *bi_io_vec; /* the actual vec list */
-
- bio_end_io_t *bi_end_io;
-
- void *bi_private;
-#if defined(CONFIG_BLK_DEV_INTEGRITY)
- struct bio_integrity_payload *bi_integrity; /* data integrity */
-#endif
-
- bio_destructor_t *bi_destructor; /* destructor */
-
- /*
- * We can inline a number of vecs at the end of the bio, to avoid
- * double allocations for a small number of bio_vecs. This member
- * MUST obviously be kept at the very end of the bio.
- */
- struct bio_vec bi_inline_vecs[0];
-};
-
-/*
- * bio flags
- */
-#define BIO_UPTODATE 0 /* ok after I/O completion */
-#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
-#define BIO_EOF 2 /* out-out-bounds error */
-#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
-#define BIO_CLONED 4 /* doesn't own data */
-#define BIO_BOUNCED 5 /* bio is a bounce bio */
-#define BIO_USER_MAPPED 6 /* contains user pages */
-#define BIO_EOPNOTSUPP 7 /* not supported */
-#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
-#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
-#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
-#define BIO_QUIET 11 /* Make BIO Quiet */
-#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
-
-/*
- * top 4 bits of bio flags indicate the pool this bio came from
- */
-#define BIO_POOL_BITS (4)
-#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
-#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
-#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
-#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
-
-/*
- * Request flags. For use in the cmd_flags field of struct request, and in
- * bi_rw of struct bio. Note that some flags are only valid in either one.
- */
-enum rq_flag_bits {
- /* common flags */
- __REQ_WRITE, /* not set, read. set, write */
- __REQ_FAILFAST_DEV, /* no driver retries of device errors */
- __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
- __REQ_FAILFAST_DRIVER, /* no driver retries of driver errors */
-
- __REQ_HARDBARRIER, /* may not be passed by drive either */
- __REQ_SYNC, /* request is sync (sync write or read) */
- __REQ_META, /* metadata io request */
- __REQ_DISCARD, /* request to discard sectors */
- __REQ_NOIDLE, /* don't anticipate more IO after this one */
-
- /* bio only flags */
- __REQ_UNPLUG, /* unplug the immediately after submission */
- __REQ_RAHEAD, /* read ahead, can fail anytime */
-
- /* request only flags */
- __REQ_SORTED, /* elevator knows about this request */
- __REQ_SOFTBARRIER, /* may not be passed by ioscheduler */
- __REQ_FUA, /* forced unit access */
- __REQ_NOMERGE, /* don't touch this for merging */
- __REQ_STARTED, /* drive already may have started this one */
- __REQ_DONTPREP, /* don't call prep for this one */
- __REQ_QUEUED, /* uses queueing */
- __REQ_ELVPRIV, /* elevator private data attached */
- __REQ_FAILED, /* set if the request failed */
- __REQ_QUIET, /* don't worry about errors */
- __REQ_PREEMPT, /* set for "ide_preempt" requests */
- __REQ_ORDERED_COLOR, /* is before or after barrier */
- __REQ_ALLOCED, /* request came from our alloc pool */
- __REQ_COPY_USER, /* contains copies of user pages */
- __REQ_INTEGRITY, /* integrity metadata has been remapped */
- __REQ_FLUSH, /* request for cache flush */
- __REQ_IO_STAT, /* account I/O stat */
- __REQ_MIXED_MERGE, /* merge of different types, fail separately */
- __REQ_NR_BITS, /* stops here */
-};
-
-#define REQ_WRITE (1 << __REQ_WRITE)
-#define REQ_FAILFAST_DEV (1 << __REQ_FAILFAST_DEV)
-#define REQ_FAILFAST_TRANSPORT (1 << __REQ_FAILFAST_TRANSPORT)
-#define REQ_FAILFAST_DRIVER (1 << __REQ_FAILFAST_DRIVER)
-#define REQ_HARDBARRIER (1 << __REQ_HARDBARRIER)
-#define REQ_SYNC (1 << __REQ_SYNC)
-#define REQ_META (1 << __REQ_META)
-#define REQ_DISCARD (1 << __REQ_DISCARD)
-#define REQ_NOIDLE (1 << __REQ_NOIDLE)
-
-#define REQ_FAILFAST_MASK \
- (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
-#define REQ_COMMON_MASK \
- (REQ_WRITE | REQ_FAILFAST_MASK | REQ_HARDBARRIER | REQ_SYNC | \
- REQ_META| REQ_DISCARD | REQ_NOIDLE)
-
-#define REQ_UNPLUG (1 << __REQ_UNPLUG)
-#define REQ_RAHEAD (1 << __REQ_RAHEAD)
-
-#define REQ_SORTED (1 << __REQ_SORTED)
-#define REQ_SOFTBARRIER (1 << __REQ_SOFTBARRIER)
-#define REQ_FUA (1 << __REQ_FUA)
-#define REQ_NOMERGE (1 << __REQ_NOMERGE)
-#define REQ_STARTED (1 << __REQ_STARTED)
-#define REQ_DONTPREP (1 << __REQ_DONTPREP)
-#define REQ_QUEUED (1 << __REQ_QUEUED)
-#define REQ_ELVPRIV (1 << __REQ_ELVPRIV)
-#define REQ_FAILED (1 << __REQ_FAILED)
-#define REQ_QUIET (1 << __REQ_QUIET)
-#define REQ_PREEMPT (1 << __REQ_PREEMPT)
-#define REQ_ORDERED_COLOR (1 << __REQ_ORDERED_COLOR)
-#define REQ_ALLOCED (1 << __REQ_ALLOCED)
-#define REQ_COPY_USER (1 << __REQ_COPY_USER)
-#define REQ_INTEGRITY (1 << __REQ_INTEGRITY)
-#define REQ_FLUSH (1 << __REQ_FLUSH)
-#define REQ_IO_STAT (1 << __REQ_IO_STAT)
-#define REQ_MIXED_MERGE (1 << __REQ_MIXED_MERGE)
-
-/*
* upper 16 bits of bi_rw define the io priority of this bio
*/
#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - IOPRIO_BITS)
Index: work/include/linux/fs.h
===================================================================
--- work.orig/include/linux/fs.h
+++ work/include/linux/fs.h
@@ -8,6 +8,7 @@

#include <linux/limits.h>
#include <linux/ioctl.h>
+#include <linux/blk_types.h>

/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -117,7 +118,7 @@ struct inodes_stat_t {
* immediately wait on this read without caring about
* unplugging.
* READA Used for read-ahead operations. Lower priority, and the
- * block layer could (in theory) choose to ignore this
+ * block layer could (in theory) choose to ignore this
* request if it runs into resource problems.
* WRITE A normal async write. Device will be plugged.
* SWRITE Like WRITE, but a special case for ll_rw_block() that
@@ -144,13 +145,13 @@ struct inodes_stat_t {
* of this IO.
*
*/
-#define RW_MASK 1
-#define RWA_MASK 16
+#define RW_MASK REQ_WRITE
+#define RWA_MASK REQ_RAHEAD

#define READ 0
-#define WRITE 1
-#define READA 16 /* readahead - don't block if no resources */
-#define SWRITE 17 /* for ll_rw_block(), wait for buffer lock */
+#define WRITE RW_MASK
+#define READA RWA_MASK
+#define SWRITE (WRITE | READA)

#define READ_SYNC (READ | REQ_SYNC | REQ_UNPLUG)
#define READ_META (READ | REQ_META)
@@ -2200,7 +2201,6 @@ static inline void insert_inode_hash(str
extern void file_move(struct file *f, struct list_head *list);
extern void file_kill(struct file *f);
#ifdef CONFIG_BLOCK
-struct bio;
extern void submit_bio(int, struct bio *);
extern int bdev_read_only(struct block_device *);
#endif
@@ -2267,7 +2267,6 @@ static inline int xip_truncate_page(stru
#endif

#ifdef CONFIG_BLOCK
-struct bio;
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
loff_t file_offset);
void dio_end_io(struct bio *bio, int error);
Index: work/include/linux/blk_types.h
===================================================================
--- /dev/null
+++ work/include/linux/blk_types.h
@@ -0,0 +1,193 @@
+/*
+ * Block data types and constants. Directly include this file only to
+ * break include dependency loop.
+ */
+#ifndef __LINUX_BLK_TYPES_H
+#define __LINUX_BLK_TYPES_H
+
+#ifdef CONFIG_BLOCK
+
+#include <linux/types.h>
+
+struct bio_set;
+struct bio;
+struct bio_integrity_payload;
+struct page;
+struct block_device;
+typedef void (bio_end_io_t) (struct bio *, int);
+typedef void (bio_destructor_t) (struct bio *);
+
+/*
+ * was unsigned short, but we might as well be ready for > 64kB I/O pages
+ */
+struct bio_vec {
+ struct page *bv_page;
+ unsigned int bv_len;
+ unsigned int bv_offset;
+};
+
+/*
+ * main unit of I/O for the block layer and lower layers (ie drivers and
+ * stacking drivers)
+ */
+struct bio {
+ sector_t bi_sector; /* device address in 512 byte
+ sectors */
+ struct bio *bi_next; /* request queue link */
+ struct block_device *bi_bdev;
+ unsigned long bi_flags; /* status, command, etc */
+ unsigned long bi_rw; /* bottom bits READ/WRITE,
+ * top bits priority
+ */
+
+ unsigned short bi_vcnt; /* how many bio_vec's */
+ unsigned short bi_idx; /* current index into bvl_vec */
+
+ /* Number of segments in this BIO after
+ * physical address coalescing is performed.
+ */
+ unsigned int bi_phys_segments;
+
+ unsigned int bi_size; /* residual I/O count */
+
+ /*
+ * To keep track of the max segment size, we account for the
+ * sizes of the first and last mergeable segments in this bio.
+ */
+ unsigned int bi_seg_front_size;
+ unsigned int bi_seg_back_size;
+
+ unsigned int bi_max_vecs; /* max bvl_vecs we can hold */
+
+ unsigned int bi_comp_cpu; /* completion CPU */
+
+ atomic_t bi_cnt; /* pin count */
+
+ struct bio_vec *bi_io_vec; /* the actual vec list */
+
+ bio_end_io_t *bi_end_io;
+
+ void *bi_private;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ struct bio_integrity_payload *bi_integrity; /* data integrity */
+#endif
+
+ bio_destructor_t *bi_destructor; /* destructor */
+
+ /*
+ * We can inline a number of vecs at the end of the bio, to avoid
+ * double allocations for a small number of bio_vecs. This member
+ * MUST obviously be kept at the very end of the bio.
+ */
+ struct bio_vec bi_inline_vecs[0];
+};
+
+/*
+ * bio flags
+ */
+#define BIO_UPTODATE 0 /* ok after I/O completion */
+#define BIO_RW_BLOCK 1 /* RW_AHEAD set, and read/write would block */
+#define BIO_EOF 2 /* out-out-bounds error */
+#define BIO_SEG_VALID 3 /* bi_phys_segments valid */
+#define BIO_CLONED 4 /* doesn't own data */
+#define BIO_BOUNCED 5 /* bio is a bounce bio */
+#define BIO_USER_MAPPED 6 /* contains user pages */
+#define BIO_EOPNOTSUPP 7 /* not supported */
+#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
+#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
+#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
+#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))
+
+/*
+ * top 4 bits of bio flags indicate the pool this bio came from
+ */
+#define BIO_POOL_BITS (4)
+#define BIO_POOL_NONE ((1UL << BIO_POOL_BITS) - 1)
+#define BIO_POOL_OFFSET (BITS_PER_LONG - BIO_POOL_BITS)
+#define BIO_POOL_MASK (1UL << BIO_POOL_OFFSET)
+#define BIO_POOL_IDX(bio) ((bio)->bi_flags >> BIO_POOL_OFFSET)
+
+/*
+ * Request flags. For use in the cmd_flags field of struct request, and in
+ * bi_rw of struct bio. Note that some flags are only valid in either one.
+ */
+enum rq_flag_bits {
+ /* common flags */
+ __REQ_WRITE, /* not set, read. set, write */
+ __REQ_FAILFAST_DEV, /* no driver retries of device errors */
+ __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
+ __REQ_FAILFAST_DRIVER, /* no driver retries of driver errors */
+
+ __REQ_HARDBARRIER, /* may not be passed by drive either */
+ __REQ_SYNC, /* request is sync (sync write or read) */
+ __REQ_META, /* metadata io request */
+ __REQ_DISCARD, /* request to discard sectors */
+ __REQ_NOIDLE, /* don't anticipate more IO after this one */
+
+ /* bio only flags */
+ __REQ_UNPLUG, /* unplug the immediately after submission */
+ __REQ_RAHEAD, /* read ahead, can fail anytime */
+
+ /* request only flags */
+ __REQ_SORTED, /* elevator knows about this request */
+ __REQ_SOFTBARRIER, /* may not be passed by ioscheduler */
+ __REQ_FUA, /* forced unit access */
+ __REQ_NOMERGE, /* don't touch this for merging */
+ __REQ_STARTED, /* drive already may have started this one */
+ __REQ_DONTPREP, /* don't call prep for this one */
+ __REQ_QUEUED, /* uses queueing */
+ __REQ_ELVPRIV, /* elevator private data attached */
+ __REQ_FAILED, /* set if the request failed */
+ __REQ_QUIET, /* don't worry about errors */
+ __REQ_PREEMPT, /* set for "ide_preempt" requests */
+ __REQ_ORDERED_COLOR, /* is before or after barrier */
+ __REQ_ALLOCED, /* request came from our alloc pool */
+ __REQ_COPY_USER, /* contains copies of user pages */
+ __REQ_INTEGRITY, /* integrity metadata has been remapped */
+ __REQ_FLUSH, /* request for cache flush */
+ __REQ_IO_STAT, /* account I/O stat */
+ __REQ_MIXED_MERGE, /* merge of different types, fail separately */
+ __REQ_NR_BITS, /* stops here */
+};
+
+#define REQ_WRITE (1 << __REQ_WRITE)
+#define REQ_FAILFAST_DEV (1 << __REQ_FAILFAST_DEV)
+#define REQ_FAILFAST_TRANSPORT (1 << __REQ_FAILFAST_TRANSPORT)
+#define REQ_FAILFAST_DRIVER (1 << __REQ_FAILFAST_DRIVER)
+#define REQ_HARDBARRIER (1 << __REQ_HARDBARRIER)
+#define REQ_SYNC (1 << __REQ_SYNC)
+#define REQ_META (1 << __REQ_META)
+#define REQ_DISCARD (1 << __REQ_DISCARD)
+#define REQ_NOIDLE (1 << __REQ_NOIDLE)
+
+#define REQ_FAILFAST_MASK \
+ (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
+#define REQ_COMMON_MASK \
+ (REQ_WRITE | REQ_FAILFAST_MASK | REQ_HARDBARRIER | REQ_SYNC | \
+ REQ_META| REQ_DISCARD | REQ_NOIDLE)
+
+#define REQ_UNPLUG (1 << __REQ_UNPLUG)
+#define REQ_RAHEAD (1 << __REQ_RAHEAD)
+
+#define REQ_SORTED (1 << __REQ_SORTED)
+#define REQ_SOFTBARRIER (1 << __REQ_SOFTBARRIER)
+#define REQ_FUA (1 << __REQ_FUA)
+#define REQ_NOMERGE (1 << __REQ_NOMERGE)
+#define REQ_STARTED (1 << __REQ_STARTED)
+#define REQ_DONTPREP (1 << __REQ_DONTPREP)
+#define REQ_QUEUED (1 << __REQ_QUEUED)
+#define REQ_ELVPRIV (1 << __REQ_ELVPRIV)
+#define REQ_FAILED (1 << __REQ_FAILED)
+#define REQ_QUIET (1 << __REQ_QUIET)
+#define REQ_PREEMPT (1 << __REQ_PREEMPT)
+#define REQ_ORDERED_COLOR (1 << __REQ_ORDERED_COLOR)
+#define REQ_ALLOCED (1 << __REQ_ALLOCED)
+#define REQ_COPY_USER (1 << __REQ_COPY_USER)
+#define REQ_INTEGRITY (1 << __REQ_INTEGRITY)
+#define REQ_FLUSH (1 << __REQ_FLUSH)
+#define REQ_IO_STAT (1 << __REQ_IO_STAT)
+#define REQ_MIXED_MERGE (1 << __REQ_MIXED_MERGE)
+
+#endif /* CONFIG_BLOCK */
+#endif /* __LINUX_BLK_TYPES_H */

2010-08-03 11:15:36

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and SWRITE to match the corresponding BIO_RW_* bits

On 2010-08-03 11:53, Tejun Heo wrote:
> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
> and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
> bits. READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
> instead of bit 1, breaking RWA_MASK, READA and SWRITE.
>
> This patch updates RWA_MASK, READA and SWRITE such that they match the
> BIO_RW_* bits again. A follow up patch will update the definitions to
> directly use BIO_RW_* bits so that this kind of breakage won't happen
> again.
>
> Neil also spotted missing RWA_MASK conversion.
>
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
> Root-caused-by: Neil Brown <[email protected]>
> Cc: Jens Axobe <[email protected]>
^^^^^

(Too) common typo :-)

Anyway, applied to for-2.6.36, thanks a lot.

--
Jens Axboe

2010-08-03 11:21:23

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and SWRITE to match the corresponding BIO_RW_* bits

On 2010-08-03 13:15, Jens Axboe wrote:
> On 2010-08-03 11:53, Tejun Heo wrote:
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded RW_MASK, RWA_MASK, READ, WRITE, READA
>> and SWRITE as 0, 1, 2 and 3, and expected them to match with BIO_RW_*
>> bits. READ/WRITE didn't change but BIO_RW_AHEAD was moved to bit 4
>> instead of bit 1, breaking RWA_MASK, READA and SWRITE.
>>
>> This patch updates RWA_MASK, READA and SWRITE such that they match the
>> BIO_RW_* bits again. A follow up patch will update the definitions to
>> directly use BIO_RW_* bits so that this kind of breakage won't happen
>> again.
>>
>> Neil also spotted missing RWA_MASK conversion.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
>>
>> Signed-off-by: Tejun Heo <[email protected]>
>> Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
>> Root-caused-by: Neil Brown <[email protected]>
>> Cc: Jens Axobe <[email protected]>
> ^^^^^
>
> (Too) common typo :-)
>
> Anyway, applied to for-2.6.36, thanks a lot.

Irk, we have an issue:

In file included from fs/coda/psdev.c:48:
include/linux/coda_psdev.h:91:1: warning: "REQ_WRITE" redefined
In file included from include/linux/fs.h:11,
from include/linux/proc_fs.h:5,
from fs/coda/psdev.c:31:
include/linux/blk_types.h:154:1: warning: this is the location of the previous definition

And from include/linux/coda_psdev.h:

#define REQ_ASYNC 0x1
#define REQ_READ 0x2
#define REQ_WRITE 0x4
#define REQ_ABORT 0x8

which unfortunately seem to not be under __KERNEL__ protection, but
there are things like wait_queue_head_t structs there as well so should
be safe to change.

--
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.

2010-08-03 15:52:40

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 1/2 block#for-2.6.36] bio, fs: update RWA_MASK, READA and SWRITE to match the corresponding BIO_RW_* bits

(cc'ing CODA people)
Hello,

On 08/03/2010 01:21 PM, Jens Axboe wrote:
>>> Signed-off-by: Tejun Heo <[email protected]>
>>> Reported-and-bisected-by: Vladislav Bolkhovitin <[email protected]>
>>> Root-caused-by: Neil Brown <[email protected]>
>>> Cc: Jens Axobe <[email protected]>
>> ^^^^^
>>
>> (Too) common typo :-)
>>
>> Anyway, applied to for-2.6.36, thanks a lot.

Oops, I'm sorry. Heh, it reminds me of misspelling Linus's last name
as Tolvards in a number of patches. :-)

> Irk, we have an issue:
>
> In file included from fs/coda/psdev.c:48:
> include/linux/coda_psdev.h:91:1: warning: "REQ_WRITE" redefined
> In file included from include/linux/fs.h:11,
> from include/linux/proc_fs.h:5,
> from fs/coda/psdev.c:31:
> include/linux/blk_types.h:154:1: warning: this is the location of the previous definition
>
> And from include/linux/coda_psdev.h:
>
> #define REQ_ASYNC 0x1
> #define REQ_READ 0x2
> #define REQ_WRITE 0x4
> #define REQ_ABORT 0x8
>
> which unfortunately seem to not be under __KERNEL__ protection, but
> there are things like wait_queue_head_t structs there as well so should
> be safe to change.

Yeah, I hate it when symbols in non-core code doesn't have proper
prefix. Prefixing CODA_ in front of those macros should do it. Jan
Harkes, would that work for CODA?

Thanks.

--
tejun

2010-08-03 16:03:14

by Tejun Heo

[permalink] [raw]
Subject: [PATCH] coda: rename REQ_* to CODA_REQ_*

REQ_* constants are used for block layer requests causing inconsistent
duplicate definitions of REQ_WRITE. Rename REQ_* used by coda to
CODA_REQ_*.

Signed-off-by: Tejun Heo <[email protected]>
---
So, something like this. Build tested only.

Thanks.

fs/coda/psdev.c | 14 +++++++-------
fs/coda/upcall.c | 12 ++++++------
include/linux/coda_psdev.h | 8 ++++----
3 files changed, 17 insertions(+), 17 deletions(-)

Index: work/fs/coda/psdev.c
===================================================================
--- work.orig/fs/coda/psdev.c
+++ work/fs/coda/psdev.c
@@ -177,7 +177,7 @@ static ssize_t coda_psdev_write(struct f
nbytes = req->uc_outSize; /* don't have more space! */
}
if (copy_from_user(req->uc_data, buf, nbytes)) {
- req->uc_flags |= REQ_ABORT;
+ req->uc_flags |= CODA_REQ_ABORT;
wake_up(&req->uc_sleep);
retval = -EFAULT;
goto out;
@@ -185,7 +185,7 @@ static ssize_t coda_psdev_write(struct f

/* adjust outsize. is this useful ?? */
req->uc_outSize = nbytes;
- req->uc_flags |= REQ_WRITE;
+ req->uc_flags |= CODA_REQ_WRITE;
count = nbytes;

/* Convert filedescriptor into a file handle */
@@ -254,8 +254,8 @@ static ssize_t coda_psdev_read(struct fi
retval = -EFAULT;

/* If request was not a signal, enqueue and don't free */
- if (!(req->uc_flags & REQ_ASYNC)) {
- req->uc_flags |= REQ_READ;
+ if (!(req->uc_flags & CODA_REQ_ASYNC)) {
+ req->uc_flags |= CODA_REQ_READ;
list_add_tail(&(req->uc_chain), &vcp->vc_processing);
goto out;
}
@@ -315,19 +315,19 @@ static int coda_psdev_release(struct ino
list_del(&req->uc_chain);

/* Async requests need to be freed here */
- if (req->uc_flags & REQ_ASYNC) {
+ if (req->uc_flags & CODA_REQ_ASYNC) {
CODA_FREE(req->uc_data, sizeof(struct coda_in_hdr));
kfree(req);
continue;
}
- req->uc_flags |= REQ_ABORT;
+ req->uc_flags |= CODA_REQ_ABORT;
wake_up(&req->uc_sleep);
}

list_for_each_entry_safe(req, tmp, &vcp->vc_processing, uc_chain) {
list_del(&req->uc_chain);

- req->uc_flags |= REQ_ABORT;
+ req->uc_flags |= CODA_REQ_ABORT;
wake_up(&req->uc_sleep);
}

Index: work/fs/coda/upcall.c
===================================================================
--- work.orig/fs/coda/upcall.c
+++ work/fs/coda/upcall.c
@@ -604,7 +604,7 @@ static void coda_unblock_signals(sigset_
(((r)->uc_opcode != CODA_CLOSE && \
(r)->uc_opcode != CODA_STORE && \
(r)->uc_opcode != CODA_RELEASE) || \
- (r)->uc_flags & REQ_READ))
+ (r)->uc_flags & CODA_REQ_READ))

static inline void coda_waitfor_upcall(struct upc_req *req)
{
@@ -624,7 +624,7 @@ static inline void coda_waitfor_upcall(s
set_current_state(TASK_UNINTERRUPTIBLE);

/* got a reply */
- if (req->uc_flags & (REQ_WRITE | REQ_ABORT))
+ if (req->uc_flags & (CODA_REQ_WRITE | CODA_REQ_ABORT))
break;

if (blocked && time_after(jiffies, timeout) &&
@@ -708,7 +708,7 @@ static int coda_upcall(struct venus_comm
coda_waitfor_upcall(req);

/* Op went through, interrupt or not... */
- if (req->uc_flags & REQ_WRITE) {
+ if (req->uc_flags & CODA_REQ_WRITE) {
out = (union outputArgs *)req->uc_data;
/* here we map positive Venus errors to kernel errors */
error = -out->oh.result;
@@ -717,13 +717,13 @@ static int coda_upcall(struct venus_comm
}

error = -EINTR;
- if ((req->uc_flags & REQ_ABORT) || !signal_pending(current)) {
+ if ((req->uc_flags & CODA_REQ_ABORT) || !signal_pending(current)) {
printk(KERN_WARNING "coda: Unexpected interruption.\n");
goto exit;
}

/* Interrupted before venus read it. */
- if (!(req->uc_flags & REQ_READ))
+ if (!(req->uc_flags & CODA_REQ_READ))
goto exit;

/* Venus saw the upcall, make sure we can send interrupt signal */
@@ -747,7 +747,7 @@ static int coda_upcall(struct venus_comm
sig_inputArgs->ih.opcode = CODA_SIGNAL;
sig_inputArgs->ih.unique = req->uc_unique;

- sig_req->uc_flags = REQ_ASYNC;
+ sig_req->uc_flags = CODA_REQ_ASYNC;
sig_req->uc_opcode = sig_inputArgs->ih.opcode;
sig_req->uc_unique = sig_inputArgs->ih.unique;
sig_req->uc_inSize = sizeof(struct coda_in_hdr);
Index: work/include/linux/coda_psdev.h
===================================================================
--- work.orig/include/linux/coda_psdev.h
+++ work/include/linux/coda_psdev.h
@@ -86,9 +86,9 @@ struct upc_req {
wait_queue_head_t uc_sleep; /* process' wait queue */
};

-#define REQ_ASYNC 0x1
-#define REQ_READ 0x2
-#define REQ_WRITE 0x4
-#define REQ_ABORT 0x8
+#define CODA_REQ_ASYNC 0x1
+#define CODA_REQ_READ 0x2
+#define CODA_REQ_WRITE 0x4
+#define CODA_REQ_ABORT 0x8

#endif

2010-08-03 16:49:58

by Jan Harkes

[permalink] [raw]
Subject: Re: [PATCH] coda: rename REQ_* to CODA_REQ_*

On Tue, Aug 03, 2010 at 06:02:53PM +0200, Tejun Heo wrote:
> REQ_* constants are used for block layer requests causing inconsistent
> duplicate definitions of REQ_WRITE. Rename REQ_* used by coda to
> CODA_REQ_*.
>
> Signed-off-by: Tejun Heo <[email protected]>

Looks good to me.

Acked-by: Jan Harkes <[email protected]>

2010-08-03 17:31:55

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] coda: rename REQ_* to CODA_REQ_*

On 2010-08-03 18:02, Tejun Heo wrote:
> REQ_* constants are used for block layer requests causing inconsistent
> duplicate definitions of REQ_WRITE. Rename REQ_* used by coda to
> CODA_REQ_*.

Should have been clear, I committed the fixup already:

http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=ceb1fde0d5c7611fdb9004176ac34140d27e745a

Looks byte-for-byte identical :-)

--
Jens Axboe


Confidentiality Notice: This e-mail message, its contents and any attachments to it are confidential to the intended recipient, and may contain information that is privileged and/or exempt from disclosure under applicable law. If you are not the intended recipient, please immediately notify the sender and destroy the original e-mail message and any attachments (and any copies that may have been made) from your system or otherwise. Any unauthorized use, copying, disclosure or distribution of this information is strictly prohibited.

2010-08-05 18:46:29

by Jeff Moyer

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Tejun Heo <[email protected]> writes:

> Commit a82afdf (block: use the same failfast bits for bio and request)
> moved BIO_RW_* bits around such that they match up with REQ_* bits.
> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
> 2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
> breaking READA and SWRITE.
>
> This patch updates READA and SWRITE such that they match the BIO_RW_*
> bits again. A follow up patch will update the definitions to directly
> use BIO_RW_* bits so that this kind of breakage won't happen again.
>
> Stable: The offending commit a82afdf was released with v2.6.32, so
> this patch should be applied to all kernels since then but it must
> _NOT_ be applied to kernels earlier than that.

Would someone be so kind as to remind me how this problem manifests
itself? I know I read this recently, but my memory and googling skills
are both failing me. :(

Cheers,
Jeff

2010-08-05 18:57:33

by Vladislav Bolkhovitin

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits


Jeff Moyer, on 08/05/2010 10:45 PM wrote:
> Tejun Heo<[email protected]> writes:
>
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>> 2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>> breaking READA and SWRITE.
>>
>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>> bits again. A follow up patch will update the definitions to directly
>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
>
> Would someone be so kind as to remind me how this problem manifests
> itself? I know I read this recently, but my memory and googling skills
> are both failing me. :(

See http://lkml.org/lkml/2010/7/28/315 and link inside.

Vlad

2010-08-05 19:30:55

by Jeff Moyer

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Vladislav Bolkhovitin <[email protected]> writes:

> Jeff Moyer, on 08/05/2010 10:45 PM wrote:
>> Tejun Heo<[email protected]> writes:
>>
>>> Commit a82afdf (block: use the same failfast bits for bio and request)
>>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>>> 2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
>>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>>> breaking READA and SWRITE.
>>>
>>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>>> bits again. A follow up patch will update the definitions to directly
>>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>>
>>> Stable: The offending commit a82afdf was released with v2.6.32, so
>>> this patch should be applied to all kernels since then but it must
>>> _NOT_ be applied to kernels earlier than that.
>>
>> Would someone be so kind as to remind me how this problem manifests
>> itself? I know I read this recently, but my memory and googling skills
>> are both failing me. :(
>
> See http://lkml.org/lkml/2010/7/28/315 and link inside.

Thanks a ton!

-Jeff

2010-08-06 04:57:18

by Tao Ma

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Hi Tejun,

On 08/06/2010 02:45 AM, Jeff Moyer wrote:
> Tejun Heo<[email protected]> writes:
>
>> Commit a82afdf (block: use the same failfast bits for bio and request)
>> moved BIO_RW_* bits around such that they match up with REQ_* bits.
>> Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
>> 2 and 3, and expected them to match with BIO_RW_* bits. READ/WRITE
>> didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
>> breaking READA and SWRITE.
>>
>> This patch updates READA and SWRITE such that they match the BIO_RW_*
>> bits again. A follow up patch will update the definitions to directly
>> use BIO_RW_* bits so that this kind of breakage won't happen again.
>>
>> Stable: The offending commit a82afdf was released with v2.6.32, so
>> this patch should be applied to all kernels since then but it must
>> _NOT_ be applied to kernels earlier than that.
If only this patch goes into stable, do we need to change RWA_MASK in
this patch also(I have seen you change it the 2/2)?

-#define RWA_MASK 2
+#define RWA_MASK 16

Regards,
Tao

2010-08-06 06:30:22

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Hello,

On 08/06/2010 06:04 AM, Tao Ma wrote:
> If only this patch goes into stable, do we need to change RWA_MASK in this patch also(I have seen you change it the 2/2)?
>
> -#define RWA_MASK 2
> +#define RWA_MASK 16

Yeah, Neil already pointed it out and updated patch has been posted
and merged.

Thanks.

--
tejun

2010-08-19 08:39:03

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH RESEND 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

On Fri, Aug 06, 2010 at 12:04:43PM +0800, Tao Ma wrote:
> >>Stable: The offending commit a82afdf was released with v2.6.32, so
> >>this patch should be applied to all kernels since then but it must
> >>_NOT_ be applied to kernels earlier than that.
> If only this patch goes into stable, do we need to change RWA_MASK
> in this patch also(I have seen you change it the 2/2)?
>
> -#define RWA_MASK 2
> +#define RWA_MASK 16

I think RW_MASK and RWA_MASK should simply go away, they don't server
any use over just masking the correct REQ_* flags.