Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1753191AbdFRWnr (ORCPT ); Sun, 18 Jun 2017 18:43:47 -0400
Received: from mail-qt0-f169.google.com ([209.85.216.169]:34063 "EHLO
    mail-qt0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1752752AbdFRWnp (ORCPT );
    Sun, 18 Jun 2017 18:43:45 -0400
Message-ID: <1497825821.21567.6.camel@redhat.com>
Subject: Re: attempting to format brd device results in OOM kills
From: Jeff Layton
To: Jens Axboe, Christoph Hellwig
Cc: Hannes Reinecke, LKML
Date: Sun, 18 Jun 2017 18:43:41 -0400
In-Reply-To:
References: <1497803428.21567.4.camel@redhat.com>
    <7e8cdef3-7062-0a11-63c1-e19fabcd117c@kernel.dk>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.24.2 (3.24.2-1.fc26)
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5193
Lines: 101

On Sun, 2017-06-18 at 16:27 -0600, Jens Axboe wrote:
> On 06/18/2017 04:21 PM, Jens Axboe wrote:
> > On 06/18/2017 10:30 AM, Jeff Layton wrote:
> > > I've run across a regression from v4.11. If I boot a v4.12-rc1 or later
> > > kernel, make a large brd device and try to format it, it quickly slows
> > > down to a crawl and then the OOM killer kicks in.
> > >
> > > I ran a bisect and it landed here:
> > >
> > > commit f09a06a193d942a12c1a33c153388b3962222006 (HEAD, refs/bisect/bad)
> > > Author: Christoph Hellwig
> > > Date:   Wed Apr 5 19:21:16 2017 +0200
> > >
> > >     brd: remove discard support
> > >
> > >     It's just a in-driver reimplementation of writing zeroes to the pages,
> > >     which fails if the discards aren't page aligned.
> > >
> > >     Signed-off-by: Christoph Hellwig
> > >     Reviewed-by: Hannes Reinecke
> > >     Signed-off-by: Jens Axboe
> > >
> > >
> > > I've been reproducing it in a VM with ~8G allocated to it:
> > >
> > > I have a modprobe.d file with this in it:
> > >
> > >     options brd rd_nr=1 rd_size=1073741824
> > >
> > > I then just:
> > >
> > >     # modprobe brd
> > >     # mkfs -t ext2 /dev/ram0
> > >
> > > It keels over pretty quickly after that.
> >
> > Just checked, and creating a 1TB ram disk and then running mkfs.ext2 on it
> > writes 16851MiB of data. I can't say I'm surprised you OOM, if you run that
> > in a 8G VM, as you're about 8G short.
> >
> > I'm puzzled as to why the discard change would make any difference, however.
>
> Reverted the patch, and I see identical behavior. The only difference is that
> the whole device is trimmed first, as expected. But it still writes ~16G
> afterwards.
>
> Are you sure this commit is what broke things for you? Honestly, I don't see
> how it could ever work with 1TB ram disk, 8G of RAM, and 16G of data written.
>

My mistake! My brd rd_size parameter was too large by a factor of 1024
(I missed that it was in kbytes and not bytes). With it sanely sized to
1G (as I had actually intended), it works fine.

It's interesting that the older kernel survives this and the newer one
doesn't, but since it's such a pathological setup I'm not too worried
about it.

As far as that commit...no, I'm not sure that's what "broke" it for me.
That's where the bisect landed (and I think I did it right), but I
didn't independently verify whether reverting it helps or not.
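
For anyone hitting the same thing, the rd_size unit mix-up described above can be sketched as shell arithmetic. This is only an illustration: the helper names gib_to_rd_size and rd_size_to_gib are made up for this example and are not part of brd or any other tool. The only fact relied on is that rd_size is given in kbytes (KiB), not bytes.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).
# Assumption: brd's rd_size parameter is in KiB, as noted in this thread.

# Convert a desired device size in GiB to the KiB value rd_size expects.
gib_to_rd_size() {
    echo $(( $1 * 1024 * 1024 ))
}

# Convert an rd_size value (KiB) back to the resulting device size in GiB.
rd_size_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# What a 1 GiB device needs:
gib_to_rd_size 1

# What rd_size=1073741824 actually asks for, in GiB:
rd_size_to_gib 1073741824
```

The first call prints 1048576, so a 1G device would use "options brd rd_nr=1 rd_size=1048576" in the modprobe.d file. The second prints 1024, i.e. the value from the original report requests a 1 TiB ram disk, which matches what Jens saw.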

Anyway here's the bisect log if you're interested:

$ git bisect log
# bad: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
# good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
git bisect start 'v4.12-rc1' 'v4.11'
# bad: [221656e7c4ce342b99c31eca96c1cbb6d1dce45f] Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 221656e7c4ce342b99c31eca96c1cbb6d1dce45f
# bad: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
# good: [cec381919818a9a0cb85600b3c82404bdd38cf36] Merge tag 'mac80211-next-for-davem-2017-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
git bisect good cec381919818a9a0cb85600b3c82404bdd38cf36
# bad: [6dc2cce9321198172cd96f955a5fc798a4cc35a6] Merge branch 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 6dc2cce9321198172cd96f955a5fc798a4cc35a6
# bad: [477d7caeede0e3a933368440fc877b12c25dbb6d] Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
git bisect bad 477d7caeede0e3a933368440fc877b12c25dbb6d
# bad: [a5695a79088653c73c92ae8d48658cbc49f31884] coda: Convert to separately allocated bdi
git bisect bad a5695a79088653c73c92ae8d48658cbc49f31884
# good: [ee056f98126170ca8b16b9a4a6e20aae7c5c184e] blk-mq-sched: provide hooks for initializing hardware queue data
git bisect good ee056f98126170ca8b16b9a4a6e20aae7c5c184e
# bad: [2a79efd833dd51c4362af655b9b011393c423f18] lightnvm: fix some WARN() messages
git bisect bad 2a79efd833dd51c4362af655b9b011393c423f18
# bad: [48920ff2a5a940cd07d12cc79e4a2c75f1185aee] block: remove the discard_zeroes_data flag
git bisect bad 48920ff2a5a940cd07d12cc79e4a2c75f1185aee
# good: [ee472d835c264a4cb77f8cf878603e1e40f3559e] block: add a flags argument to (__)blkdev_issue_zeroout
git bisect good ee472d835c264a4cb77f8cf878603e1e40f3559e
# good: [19372e2769179ddd154a0d6fbbdb719eb5d0af12] loop: implement REQ_OP_WRITE_ZEROES
git bisect good 19372e2769179ddd154a0d6fbbdb719eb5d0af12
# bad: [5d1429fead5beacce6df052c31b28a97a11e250b] mmc: remove the discard_zeroes_data flag
git bisect bad 5d1429fead5beacce6df052c31b28a97a11e250b
# bad: [93c1defedcae701512957c279b850659d1dae78f] rbd: remove the discard_zeroes_data flag
git bisect bad 93c1defedcae701512957c279b850659d1dae78f
# bad: [f09a06a193d942a12c1a33c153388b3962222006] brd: remove discard support
git bisect bad f09a06a193d942a12c1a33c153388b3962222006
# first bad commit: [f09a06a193d942a12c1a33c153388b3962222006] brd: remove discard support

Anyway, sorry for the noise!
--
Jeff Layton