Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751810AbdF1VLm (ORCPT ); Wed, 28 Jun 2017 17:11:42 -0400 Received: from mail.kernel.org ([198.145.29.99]:59716 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751722AbdF1VLe (ORCPT ); Wed, 28 Jun 2017 17:11:34 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0009F22B5A Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=shli@kernel.org Date: Wed, 28 Jun 2017 14:11:31 -0700 From: Shaohua Li To: Jens Axboe Cc: Christoph Hellwig , Keith Busch , wenxiong@linux.vnet.ibm.com, linux-kernel@vger.kernel.org, bjking@linux.vnet.ibm.com Subject: Re: [PATCH] fs: System memory leak when running HTX with T10 DIF enabled Message-ID: <20170628211131.la2n3gi7xbjlkzvj@kernel.org> References: <1498667571-14275-1-git-send-email-wenxiong@linux.vnet.ibm.com> <20170628171031.GC2650@localhost.localdomain> <20170628183141.GA12722@lst.de> <0fa90019-5b07-7afb-8915-17c54a22709b@fb.com> <20170628183818.GA12860@lst.de> <20170628185235.GA13126@lst.de> <9308965a-f9c2-54b3-ae0c-65dcdd7514fb@fb.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <9308965a-f9c2-54b3-ae0c-65dcdd7514fb@fb.com> User-Agent: NeoMutt/20170113 (1.7.2) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4131 Lines: 127 On Wed, Jun 28, 2017 at 12:57:50PM -0600, Jens Axboe wrote: > On 06/28/2017 12:52 PM, Christoph Hellwig wrote: > > On Wed, Jun 28, 2017 at 12:44:00PM -0600, Jens Axboe wrote: > >> On 06/28/2017 12:38 PM, Christoph Hellwig wrote: > >>> On Wed, Jun 28, 2017 at 12:34:15PM -0600, Jens Axboe wrote: > >>>> That's what I sent out. > >>> > >>> Where? Didn't see that anywhere.. > >> > >> Looks like you weren't CC'ed on the original thread. About an hour ago. > >> > >>>> Here it is again. We should get this into 4.12, > >>>> so would be great with a review or two. > >>> > >>> Can we rename __bio_free to bio_uninit and add a comment to bio_init > >>> that it must be paried with bio_uninit? > >> > >> Let's keep it small for 4.12. We can do a cleanup on top of this for > >> 4.13. > > > > The rename is two additional lines for the patch, it's not going to > > make a difference.. > > Sure, why not... > > >>> Except for that this looks fine, although there are a lot more callers > >>> that should get this treatment.. > >> > >> Should only be an issue for on-stack bio's, since we don't go through > >> the put/free path. Did a quick grep, looks like this is one of 3. One > >> is floppy, which probably neither has DIF or uses blk-throttle. Then > >> there's one in dm-bufio, didn't look too closely at that. Last one is > >> this one. > > > > Well, it's really all callers but bio_alloc_bioset itself that > > will need this handling, as only bios that come from bio_alloc_bioset > > will be freed through bio_free. Most of them probably don't > > support DIF, but they'll also miss the bio_disassociate_task call > > this way, and will leak I/O context and css references if block > > cgroup support is enabled. > > I guess local allocs too will be affected. > > I'm baffled we've had this issue for so long, and nobody's seen it. > I knew DIF was never used (basically), but blk-throttle must have some > users at least. I fixed the same issue in the blktrace patches. It did the free in bio_endio. > > > diff --git a/block/bio.c b/block/bio.c > index 888e7801c638..f877d5e6d17f 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -240,7 +240,7 @@ struct bio_vec *bvec_alloc(gfp_t gfp_mask, int nr, unsigned long *idx, > return bvl; > } > > -static void __bio_free(struct bio *bio) > +void bio_uninit(struct bio *bio) > { > bio_disassociate_task(bio); > > @@ -253,7 +253,7 @@ static void bio_free(struct bio *bio) > struct bio_set *bs = bio->bi_pool; > void *p; > > - __bio_free(bio); > + bio_uninit(bio); > > if (bs) { > bvec_free(bs->bvec_pool, bio->bi_io_vec, BVEC_POOL_IDX(bio)); > @@ -271,6 +271,11 @@ static void bio_free(struct bio *bio) > } > } > > +/* > + * Users of this function have their own bio allocation. Subsequently, > + * they must remember to pair any call to bio_init() with bio_uninit() > + * when IO has completed, or when the bio is released. > + */ > void bio_init(struct bio *bio, struct bio_vec *table, > unsigned short max_vecs) > { > @@ -297,7 +302,7 @@ void bio_reset(struct bio *bio) > { > unsigned long flags = bio->bi_flags & (~0UL << BIO_RESET_BITS); > > - __bio_free(bio); > + bio_uninit(bio); > > memset(bio, 0, BIO_RESET_BYTES); > bio->bi_flags = flags; > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 519599dddd36..0a7404ef9335 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -263,7 +263,10 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter, > kfree(vecs); > > if (unlikely(bio.bi_error)) > - return bio.bi_error; > + ret = bio.bi_error; > + > + bio_uninit(&bio); > + > return ret; > } > > diff --git a/include/linux/bio.h b/include/linux/bio.h > index d1b04b0e99cf..a7e29fa0981f 100644 > --- a/include/linux/bio.h > +++ b/include/linux/bio.h > @@ -426,6 +426,7 @@ extern void bio_advance(struct bio *, unsigned); > > extern void bio_init(struct bio *bio, struct bio_vec *table, > unsigned short max_vecs); > +extern void bio_uninit(struct bio *); > extern void bio_reset(struct bio *); > void bio_chain(struct bio *, struct bio *); > > > -- > Jens Axboe >