Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759673AbcDFBLP (ORCPT ); Tue, 5 Apr 2016 21:11:15 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:49096 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754166AbcDFBLN (ORCPT ); Tue, 5 Apr 2016 21:11:13 -0400 MIME-Version: 1.0 In-Reply-To: <20160406010433.GB3129698@devbig084.prn1.facebook.com> References: <1459878246-9249-1-git-send-email-ming.lei@canonical.com> <20160405182720.GA2375685@devbig084.prn1.facebook.com> <20160406010433.GB3129698@devbig084.prn1.facebook.com> Date: Wed, 6 Apr 2016 09:11:10 +0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH] block: make sure big bio is splitted into at most 256 bvecs From: Ming Lei To: Shaohua Li Cc: Jens Axboe , Linux Kernel Mailing List , linux-block@vger.kernel.org, Kent Overstreet , Christoph Hellwig , Eric Wheeler , Sebastian Roesner , "4.2+" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3456 Lines: 77 On Wed, Apr 6, 2016 at 9:04 AM, Shaohua Li wrote: > On Wed, Apr 06, 2016 at 08:47:56AM +0800, Ming Lei wrote: >> On Wed, Apr 6, 2016 at 2:27 AM, Shaohua Li wrote: >> > On Wed, Apr 06, 2016 at 01:44:06AM +0800, Ming Lei wrote: >> >> After arbitrary bio size is supported, the incoming bio may >> >> be very big. We have to split the bio into small bios so that >> >> each holds at most BIO_MAX_PAGES bvecs for safety reason, such >> >> as bio_clone(). >> >> >> >> This patch fixes the following kernel crash: >> >> >> >> > [ 172.660142] BUG: unable to handle kernel NULL pointer dereference at >> >> > 0000000000000028 >> >> > [ 172.660229] IP: [] bio_trim+0xf/0x2a >> >> > [ 172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0 >> >> > [ 172.660399] Oops: 0000 [#1] SMP >> >> > [...] >> >> > [ 172.664780] Call Trace: >> >> > [ 172.664813] [] ? raid1_make_request+0x2e8/0xad7 [raid1] >> >> > [ 172.664846] [] ? blk_queue_split+0x377/0x3d4 >> >> > [ 172.664880] [] ? md_make_request+0xf6/0x1e9 [md_mod] >> >> > [ 172.664912] [] ? generic_make_request+0xb5/0x155 >> >> > [ 172.664947] [] ? prio_io+0x85/0x95 [bcache] >> >> > [ 172.664981] [] ? register_cache_set+0x355/0x8d0 [bcache] >> >> > [ 172.665016] [] ? register_bcache+0x1006/0x1174 [bcache] >> >> >> >> Fixes: 54efd50(block: make generic_make_request handle arbitrarily sized bios) >> > >> > this bug is introduced by d2be537c3ba >> >> Reported-by: Sebastian Roesner >> >> Reported-by: Eric Wheeler >> >> Cc: stable@vger.kernel.org (4.2+) >> >> Cc: Shaohua Li >> >> Signed-off-by: Ming Lei >> >> --- >> >> I can reproduce the issue and verify the fix by the following approach: >> >> - create one raid1 over two virtio-blk >> >> - build bcache device over the above raid1 and another cache device. >> >> - set cache mode as writeback >> >> - run random write over ext4 on the bcache device >> >> - then the crash can be triggered >> > >> > can you explain why this is better than my original patch? >> >> Shaohua, your patch is wrong, please see the following link: >> >> https://lkml.org/lkml/2016/3/30/893 > > I don't see why, really, except it declares you are right :) Never mind, I post it again, and maybe cause my poor english, :-) blk_rq_get_max_sectors() which uses max sectors limit is used for merging bios/reqs, and that means limits.max_sectors is for limitting max sectors in one request or transfer. One request may include lots of bios. Now this patch decreases the limit just for single bio's 256 bvec's limitation. Is that correct? That is the reason why I suggest to change get_max_io_size() for bio's 256 bvecs limit. On the contrary, the default max sectors should have been increased since hardware is becoming quicker, and we should send more to drive in one request, IMO. > > why it's 2560 instead of 2048? I don't know the exact reason why Jeff takes 2560, but I feel it can be bigger because the hardware is becoming quicker. Thanks, Ming > -- > To unsubscribe from this list: send the line "unsubscribe linux-block" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html