dio_bio_complete turns all errors into -EIO. This is historical,
since you used to only get 1 bit precision for errors (BIO_UPTODATE).
Now that we get actual error codes, we can return the appropriate
code to userspace. File systems seem to only propagate either EIO
or ENOSPC, so I've followed suit in this patch.
This fixes an issue where -ENOSPC was being turned into -EIO when
testing dm-thin.
Reported-by: Carlos Maiolino <[email protected]>
Tested-by: Mike Snitzer <[email protected]>
Signed-off-by: Jeff Moyer <[email protected]>
diff --git a/fs/direct-io.c b/fs/direct-io.c
index d6a9012..990e0aa 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -466,13 +466,15 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
{
struct bio_vec *bvec;
unsigned i;
- int err;
- if (bio->bi_error)
+ /* Only EIO and ENOSPC should be returned to userspace. */
+ if (bio->bi_error == 0 ||
+ bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
+ dio->io_error = bio->bi_error;
+ else
dio->io_error = -EIO;
if (dio->is_async && dio->rw == READ && dio->should_dirty) {
- err = bio->bi_error;
bio_check_pages_dirty(bio); /* transfers ownership */
} else {
bio_for_each_segment_all(bvec, bio, i) {
@@ -483,10 +485,9 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
set_page_dirty_lock(page);
page_cache_release(page);
}
- err = bio->bi_error;
bio_put(bio);
}
- return err;
+ return dio->io_error;
}
/*
This looks good to me.
Reviewed-by: Carlos Maiolino <[email protected]>
On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO. This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace. File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.
>
> This fixes an issue where -ENOSPC was being turned into -EIO when
> testing dm-thin.
>
> Reported-by: Carlos Maiolino <[email protected]>
> Tested-by: Mike Snitzer <[email protected]>
> Signed-off-by: Jeff Moyer <[email protected]>
>
> diff --git a/fs/direct-io.c b/fs/direct-io.c
> index d6a9012..990e0aa 100644
> --- a/fs/direct-io.c
> +++ b/fs/direct-io.c
> @@ -466,13 +466,15 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
> {
> struct bio_vec *bvec;
> unsigned i;
> - int err;
>
> - if (bio->bi_error)
> + /* Only EIO and ENOSPC should be returned to userspace. */
> + if (bio->bi_error == 0 ||
> + bio->bi_error == -ENOSPC || bio->bi_error == -EIO)
> + dio->io_error = bio->bi_error;
> + else
> dio->io_error = -EIO;
>
> if (dio->is_async && dio->rw == READ && dio->should_dirty) {
> - err = bio->bi_error;
> bio_check_pages_dirty(bio); /* transfers ownership */
> } else {
> bio_for_each_segment_all(bvec, bio, i) {
> @@ -483,10 +485,9 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio)
> set_page_dirty_lock(page);
> page_cache_release(page);
> }
> - err = bio->bi_error;
> bio_put(bio);
> }
> - return err;
> + return dio->io_error;
> }
>
> /*
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Carlos
On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
> dio_bio_complete turns all errors into -EIO. This is historical,
> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
> Now that we get actual error codes, we can return the appropriate
> code to userspace. File systems seem to only propagate either EIO
> or ENOSPC, so I've followed suit in this patch.
Do we? Just propagating some errors defintively seems odd. Even
if we do we should have a central helper doing that mapping instead
of opencoding it in various places.
Hi, Christoph,
Christoph Hellwig <[email protected]> writes:
> On Mon, Mar 14, 2016 at 05:10:00PM -0400, Jeff Moyer wrote:
>> dio_bio_complete turns all errors into -EIO. This is historical,
>> since you used to only get 1 bit precision for errors (BIO_UPTODATE).
>> Now that we get actual error codes, we can return the appropriate
>> code to userspace. File systems seem to only propagate either EIO
>> or ENOSPC, so I've followed suit in this patch.
>
> Do we?
Sometimes! :) I was referring to this:
static inline void mapping_set_error(struct address_space *mapping, int error)
{
if (unlikely(error)) {
if (error == -ENOSPC)
set_bit(AS_ENOSPC, &mapping->flags);
else
set_bit(AS_EIO, &mapping->flags);
}
}
> Just propagating some errors defintively seems odd.
Not really. read, write, etc only expect a subset of errnos to be
returned. The goal was not to leak kernel-internal or unexpected error
numbers to userspace, and I didn't think I would be able to successfully
audit all code paths that lead here. So, I opted for a more
conservative patch that just allows one more errno through.
> Even if we do we should have a central helper doing that mapping
> instead of opencoding it in various places.
OK. I would argue that the right place to filter out errno's would be
up in the vfs. I do wonder if I'm just being overly paranoid, though.
Al, do you have any opinions, here?
Cheers,
Jeff
On 03/21/2016 04:22 PM, Jeff Moyer wrote:
>> Just propagating some errors defintively seems odd.
>
> Not really. read, write, etc only expect a subset of errnos to be
> returned. The goal was not to leak kernel-internal or unexpected error
> numbers to userspace, and I didn't think I would be able to successfully
> audit all code paths that lead here. So, I opted for a more
> conservative patch that just allows one more errno through.
We have a use case which makes heavy use of dm to map plain files into
block devs. Propagating ENOSPC is useful here, as a sparse backing file
might run into this when extending, and the downstream app would be able
to see something other than EIO.
I'm ambivalent on whether or not to allow all errnos through, or just a
peephole like ENOSPC here. However, there's another errno which is
effectively the same cause as ENOSPC but triggered by a logical rather
than physical limit: EDQUOT.
If we're looking at a peephole for ENOSPC to be let through, I'd suggest
also allowing EDQUOT through. To your concerns about confusing
applications with too many possible errnos, EDQUOT could be translated
to ENOSPC before being let through to the block dev layer. (The net
effect is the same in any case, as the backing store has denied a write
due to some kind of space limit.)
--
-- Todd Vierling <[email protected]> <[email protected]>