Date: Mon, 21 Dec 2015 04:41:07 +0500
From: "Artem S. Tashkinov"
To: Kent Overstreet
Cc: Christoph Hellwig, Linus Torvalds, Ming Lin, Jens Axboe, "Artem S. Tashkinov", Steven Whitehouse, Tejun Heo, IDE-ML, Linux Kernel Mailing List
Subject: Re: IO errors after "block: remove bio_get_nr_vecs()"
In-Reply-To: <20151220184404.GA18035@kmo-pixel>
References: <20151220181801.GA12402@lst.de> <20151220184404.GA18035@kmo-pixel>
Message-ID: <1503ce1112aae1a881177bb103838e83@lycos.com>

On 2015-12-20 23:44, Kent Overstreet wrote:
> On Sun, Dec 20, 2015 at 07:18:01PM +0100, Christoph Hellwig wrote:
>> On Sun, Dec 20, 2015 at 09:51:14AM -0800, Linus Torvalds wrote:
>> > Kent, Jens, Christoph et al,
>> >   please see this bugzilla:
>> >
>> >     https://bugzilla.kernel.org/show_bug.cgi?id=109661
>> >
>> > where Artem Tashkinov bisected his problems with 4.3 down to commit
>> > b54ffb73cadc ("block: remove bio_get_nr_vecs()") that you've all
>> > signed off on.
>>
>> Artem,
>>
>> can you re-check the commits around this series again?  I would be
>> extremely surprised if it's really this particular commit and not
>> one just before it causing the problem - it just allocates bios
>> to the biggest possible instead of only allocating up to what
>> bio_add_page would accept.
>
> pretty sure it's something with how blk_bio_segment_split() decides what
> segments are mergable and not. bio_get_nr_vecs() was just returning
> nr_pages == queue_max_segments (ignoring sectors for the moment) - so
> wait, wtf? that's basically assuming no segment merging can ever happen,
> if it does then this was causing us to send smaller requests to the
> device than we could have been.
>
> so actually two possibilities I can see:
>  - in blk_bio_segment_split(), something's screwed up with how it
>    decides what segments are going to be mergable or not. but I don't
>    think that's likely since it's doing the exact same thing the rest
>    of the segment merging code does.
>  - or, the driver was lying in its queue limits, using
>    queue_max_segments for "the maximum number of pages I can possibly
>    take", and that bug lurked undiscovered because of the
>    screwed-upness in bio_get_nr_vecs().
>
> Offhand I don't know where to start digging in the driver code to look
> into the second theory though. Tejun, you got any ideas?

Here's an actual bisect log which Linus was missing:

git bisect start
# bad: [6a13feb9c82803e2b815eca72fa7a9f5561d7861] Linux 4.3
git bisect bad 6a13feb9c82803e2b815eca72fa7a9f5561d7861
# good: [64291f7db5bd8150a74ad2036f1037e6a0428df2] Linux 4.2
git bisect good 64291f7db5bd8150a74ad2036f1037e6a0428df2
# bad: [807249d3ada1ff28a47c4054ca4edd479421b671] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
git bisect bad 807249d3ada1ff28a47c4054ca4edd479421b671
# good: [102178108e2246cb4b329d3fb7872cd3d7120205] Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good 102178108e2246cb4b329d3fb7872cd3d7120205
# good: [62da98656b62a5ca57f22263705175af8ded5aa1] netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
git bisect good 62da98656b62a5ca57f22263705175af8ded5aa1
# good: [f1a3c0b933e7ff856223d6fcd7456d403e54e4e5] Merge tag 'devicetree-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
git bisect good f1a3c0b933e7ff856223d6fcd7456d403e54e4e5
# bad: [9cbf22b37ae0592dea809cb8d424990774c21786] Merge tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
git bisect bad 9cbf22b37ae0592dea809cb8d424990774c21786
# good: [8bdc69b764013a9b5ebeef7df8f314f1066c5d79] Merge branch 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
git bisect good 8bdc69b764013a9b5ebeef7df8f314f1066c5d79
# good: [df910390e2db07a76c87f258475f6c96253cee6c] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good df910390e2db07a76c87f258475f6c96253cee6c
# bad: [d975f309a8b250e67b66eabeb56be6989c783629] Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block
git bisect bad d975f309a8b250e67b66eabeb56be6989c783629
# bad: [89e2a8404e4415da1edbac6ca4f7332b4a74fae2] crypto/omap-sham: remove an open coded access to ->page_link
git bisect bad 89e2a8404e4415da1edbac6ca4f7332b4a74fae2
# good: [0e28997ec476bad4c7dbe0a08775290051325f53] btrfs: remove bio splitting and merge_bvec_fn() calls
git bisect good 0e28997ec476bad4c7dbe0a08775290051325f53
# bad: [2ec3182f9c20a9eef0dacc0512cf2ca2df7be5ad] Documentation: update notes in biovecs about arbitrarily sized bios
git bisect bad 2ec3182f9c20a9eef0dacc0512cf2ca2df7be5ad
# good: [7140aafce2fc14c5af02fdb7859b6bea0108be3d] md/raid5: get rid of bio_fits_rdev()
git bisect good 7140aafce2fc14c5af02fdb7859b6bea0108be3d
# good: [6cf66b4caf9c71f64a5486cadbd71ab58d0d4307] fs: use helper bio_add_page() instead of open coding on bi_io_vec
git bisect good 6cf66b4caf9c71f64a5486cadbd71ab58d0d4307
# bad: [b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c] block: remove bio_get_nr_vecs()
git bisect bad b54ffb73cadcdcff9cc1ae0e11f502407e3e2e4c

And like he said, since the step before the last one was good and the very
last one was bad, there was no way I could have made a mistake.
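
For readers following Kent's point that capping a bio at nr_pages ==
queue_max_segments assumes no segment merging, here is a rough toy model
of the idea. It is only a sketch and not the kernel's
blk_bio_segment_split(): the function name count_segments, the 4096-byte
page size, and the 64 KiB max segment size are all assumptions made up
for illustration. Pages are represented by page frame numbers, and two
neighbouring pages merge into one segment when they are physically
adjacent and the merged segment stays within the size limit.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def count_segments(page_frames, max_segment_size=65536):
    """Count physical segments for a sequence of page frame numbers.

    A page joins the current segment when it directly follows the
    previous page in physical memory and the segment would not grow
    past max_segment_size. Otherwise it starts a new segment.
    """
    segments = 0
    seg_bytes = 0
    prev = None
    for pfn in page_frames:
        adjacent = prev is not None and pfn == prev + 1
        if adjacent and seg_bytes + PAGE_SIZE <= max_segment_size:
            seg_bytes += PAGE_SIZE      # merge into the current segment
        else:
            segments += 1               # start a new segment
            seg_bytes = PAGE_SIZE
        prev = pfn
    return segments


contiguous = list(range(100, 108))      # 8 physically adjacent pages
scattered = list(range(100, 116, 2))    # 8 pages, none adjacent

print(count_segments(contiguous))       # 1
print(count_segments(scattered))        # 8
```

In this model, eight adjacent pages cost a single segment, so a limit
expressed in pages is far stricter than the same limit expressed in
segments. That matches the first half of Kent's reasoning: the old cap
could produce smaller requests than the device would accept. It says
nothing about which of his two theories is the actual cause here.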