2013-06-07 02:32:08

by Fengguang Wu

Subject: [drbd?] Kernel panic - not syncing: Out of memory and no killable processes...

Greetings,

My "kvm -m 256" reliably goes Out Of Memory after this commit. It may
not be the only one that eats up the memory, however I wonder how much
memory consumption this commit added? Thanks!

commit 23361cf32b58efdf09945a64e1d8d41fa6117157
Author: Lars Ellenberg <[email protected]>
Date: Thu Mar 31 16:36:43 2011 +0200

drbd: get rid of bio_split, allow bios of "arbitrary" size

Where "arbitrary" size is currently 1 MiB, which is the BIO_MAX_SIZE
for architectures with 4k PAGE_CACHE_SIZE (most).

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>

[ 9.458297] osst :I: Tape driver with OnStream support version 0.99.4
[ 9.458297] osst :I: $Id: osst.c,v 1.73 2005/01/01 21:13:34 wriede Exp $
[ 9.497666] swapper invoked oom-killer: gfp_mask=0x2d2, order=0, oom_score_adj=0
[ 9.508104] CPU: 0 PID: 1 Comm: swapper Not tainted 3.10.0-rc4-00279-g81d9042 #7
[ 9.515344] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 9.520406] ffff88000d8b0560 ffff88000d8abb98 ffffffff81c6877d ffff88000d8abc38
[ 9.528736] ffffffff81c493f8 0000000000000036 000000000000bd44 ffff88000d8abbc8
[ 9.540233] ffffffff810b95b3 ffff88000d8abbf8 ffffffff810b969d ffff880004e73da8
[ 9.548608] Call Trace:
[ 9.551717] [<ffffffff81c6877d>] dump_stack+0x19/0x1b
[ 9.556395] [<ffffffff81c493f8>] dump_header.isra.12+0x6d/0x23a
[ 9.561364] [<ffffffff810b95b3>] ? put_lock_stats.isra.18+0xe/0x28
[ 9.565767] [<ffffffff810b969d>] ? lock_release_holdtime.part.19+0xd0/0xd8
[ 9.570484] [<ffffffff810fa72e>] out_of_memory+0x25f/0x2cd
[ 9.577443] [<ffffffff810fa5fa>] ? out_of_memory+0x12b/0x2cd
[ 9.582370] [<ffffffff81c75ee2>] ? _raw_spin_unlock+0x58/0x65
[ 9.587474] [<ffffffff810fe35f>] __alloc_pages_nodemask+0x6da/0x8b9
[ 9.592914] [<ffffffff81122229>] __vmalloc_node_range+0x119/0x1cd
[ 9.598151] [<ffffffff826f8c43>] ? scsi_debug_init+0x162/0x7f9
[ 9.603408] [<ffffffff81122312>] __vmalloc_node+0x35/0x37
[ 9.608376] [<ffffffff826f8c43>] ? scsi_debug_init+0x162/0x7f9
[ 9.613614] [<ffffffff81122360>] vmalloc+0x2a/0x2c
[ 9.618125] [<ffffffff826f8c43>] scsi_debug_init+0x162/0x7f9
[ 9.627300] [<ffffffff81791d62>] ? scsi_register_driver+0x16/0x18
[ 9.632654] [<ffffffff826f8ae1>] ? ses_init+0x3c/0x3c
[ 9.637981] [<ffffffff826acdec>] do_one_initcall+0xe9/0x1aa
[ 9.647683] [<ffffffff826acfa1>] kernel_init_freeable+0xf4/0x183
[ 9.658099] [<ffffffff826ac6ba>] ? do_early_param+0x8c/0x8c
[ 9.663348] [<ffffffff81c41ba3>] ? rest_init+0xc7/0xc7
[ 9.671711] [<ffffffff81c41bb1>] kernel_init+0xe/0xd1
[ 9.678330] [<ffffffff81c7741a>] ret_from_fork+0x7a/0xb0
[ 9.684293] [<ffffffff81c41ba3>] ? rest_init+0xc7/0xc7
[ 9.689178] Mem-Info:
[ 9.692235] DMA per-cpu:

git bisect start v3.8 v3.7 --
git bisect good dadfab4873256d2145640c0ce468fcbfb48977fe # 10 2013-06-06 20:03:02 Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
git bisect bad 992956189de58cae9f2be40585bc25105cd7c5ad # 0 2013-06-06 20:12:04 efi: Fix the build with user namespaces enabled.
git bisect good 2b8318881ddbcb67c5e8d2178b42284749442222 # 10 2013-06-06 20:18:06 Merge tag 'fbdev-for-3.8' of git://gitorious.org/linux-omap-dss2/linux
git bisect good 3c2e81ef344a90bb0a39d84af6878b4aeff568a2 # 10 2013-06-06 20:28:58 Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
git bisect bad 009ba89db5ae836949009f97a00abb96feba69f4 # 0 2013-06-06 20:32:47 drbd: fix schedule in atomic
git bisect good 85f75dd7630436b0aa46a6393099c0f23121f5f0 # 10 2013-06-06 20:38:02 drbd: introduce in-kernel "down" command
git bisect good 49ba9b1bb3295fa690ae8f5091b093e61acf3ada # 10 2013-06-06 20:43:45 drbd: Remove useless error messages
git bisect bad 0db55363cb1e6cfe2bedecb7e47c05f8992c612e # 0 2013-06-06 20:47:27 drbd: Rename drbd_alloc_ee() to drbd_alloc_peer_req()
git bisect bad 8e0af25fa85c9efe393128b0a0dd874981edb22f # 0 2013-06-06 20:51:41 drbd: Moved susp, susp_nod and susp_fen to the connection object
git bisect good 2bf896213d4faa7289316663f5e8e0bc35d80abf # 10 2013-06-06 21:10:13 drbd: drbd_connect(): Initialize struct drbd_socket before sending anything
git bisect good 181286ad22bf9bfb85de625e8501285de5261b35 # 10 2013-06-06 21:33:27 drbd: preparation commit, pass drbd_interval to drbd_al_begin/complete_io
git bisect bad e15766e9c94f7fa3396eff4ffbbf30dea8c0e22a # 0 2013-06-06 21:47:39 drbd: improvements to activate/deactivate multiple activity log extents
git bisect bad 23361cf32b58efdf09945a64e1d8d41fa6117157 # 0 2013-06-06 22:09:21 drbd: get rid of bio_split, allow bios of "arbitrary" size
git bisect good 7726547e67a1fda0d12e1de5ec917a2e5d4b8186 # 10 2013-06-06 22:20:16 drbd: prepare to activate two activity log extents at once
git bisect good 7726547e67a1fda0d12e1de5ec917a2e5d4b8186 # 30 2013-06-06 22:24:49 drbd: prepare to activate two activity log extents at once
git bisect bad 81d904298000f4c82977575165b72af3d68e49b3 # 0 2013-06-06 22:25:03 Merge remote-tracking branch 'drm-intel/drm-intel-nightly' into devel-xian-x86_64-201306061718
git bisect bad 4e1e7059d375482daeeda395bba2939679b1ee14 # 0 2013-06-06 22:30:29 Add linux-next specific files for 20130606

Thanks,
Fengguang


Attachments:
(No filename) (5.34 kB)
dmesg-kvm-xian-51270-20130606184251-3.10.0-rc4-00279-g81d9042-7 (39.38 kB)
81d904298000f4c82977575165b72af3d68e49b3-bisect.log (16.80 kB)
.config-bisect (77.71 kB)

2013-06-11 15:33:33

by Lars Ellenberg

Subject: Re: [drbd?] Kernel panic - not syncing: Out of memory and no killable processes...

On Fri, Jun 07, 2013 at 10:31:54AM +0800, Fengguang Wu wrote:
> Greetings,
>
> My "kvm -m 256" reliably goes Out Of Memory after this commit. It may
> not be the only one that eats up the memory, however I wonder how much
> memory consumption this commit added? Thanks!
>

Out of curiosity, what exactly is it you are doing there?
What project or appliance or behaviour or product or paper is the goal?


We scale certain mempools and reserves with
DRBD_MAX_BIO_SIZE/PAGE_SIZE * minor_count.

DRBD_MAX_BIO_SIZE has been increased by this patch,
resulting in more memory allocated to those reserved pools.

Please just scale down the "minor_count" parameter.
You can use the module parameter (e.g. modprobe drbd minor_count=8),
or, compiled in, use the kernel command line parameter drbd.minor_count=8.

Though "minor_count" at some point used to be the hard limit for the number of
minor devices (allocation of an array of corresponding size), that has
long since changed, and now it is really only used as a scaling factor for
these mempools.
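
To put rough numbers on that scaling, here is a minimal sketch of the
arithmetic. It is not taken from the DRBD source: the function names are
hypothetical, the 4 KiB page size and the 1 MiB DRBD_MAX_BIO_SIZE follow the
commit message quoted above, and the minor_count of 32 is an assumed default
that may differ on a given build.

```python
# Rough estimate of one reserve scaled as
# DRBD_MAX_BIO_SIZE / PAGE_SIZE * minor_count.
# All constants are assumptions for illustration, not values read from DRBD.

PAGE_SIZE = 4 * 1024            # assumed 4 KiB pages
MAX_BIO_SIZE = 1024 * 1024      # 1 MiB, per the commit message


def reserved_pages(max_bio_size, minor_count, page_size=PAGE_SIZE):
    """Pages reserved by one pool that scales with minor_count."""
    return (max_bio_size // page_size) * minor_count


def reserved_mib(max_bio_size, minor_count, page_size=PAGE_SIZE):
    """Same reserve expressed in MiB."""
    pages = reserved_pages(max_bio_size, minor_count, page_size)
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    for minor_count in (32, 8):  # 32 is an assumed default; 8 is the suggested value
        print(minor_count,
              reserved_pages(MAX_BIO_SIZE, minor_count),
              reserved_mib(MAX_BIO_SIZE, minor_count))
```

Under these assumptions a single such pool comes to 8192 pages (32 MiB) at
minor_count=32 and 2048 pages (8 MiB) at minor_count=8. How many pools are
scaled this way is not stated here, so the total saving may be larger.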

Lars


--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

2013-06-12 10:11:49

by Fengguang Wu

Subject: Re: [drbd?] Kernel panic - not syncing: Out of memory and no killable processes...

On Tue, Jun 11, 2013 at 05:33:27PM +0200, Lars Ellenberg wrote:
> On Fri, Jun 07, 2013 at 10:31:54AM +0800, Fengguang Wu wrote:
> > Greetings,
> >
> > My "kvm -m 256" reliably goes Out Of Memory after this commit. It may
> > not be the only one that eats up the memory, however I wonder how much
> > memory consumption this commit added? Thanks!
> >
>
> Out of curiosity, what exactly is it you are doing there?
> What project or appliance or behaviour or product or paper is the goal?

Philipp, this is the 0day kernel testing project from Intel OTC.

We are running regular build/boot tests for 300+ kernel git trees and
aim to find and report problems ASAP. We test 30000+ kernel boots
every day (mainly in KVM).

> We scale certain mempools and reserves with
> DRBD_MAX_BIO_SIZE/PAGE_SIZE * minor_count.
>
> DRBD_MAX_BIO_SIZE has been increased by this patch,
> resulting in more memory allocated to those reserved pools.
>
> Please just scale down the "minor_count" parameter.
> You can use the module parameter (e.g. modprobe drbd minor_count=8),
> or, compiled in, use the kernel command line parameter drbd.minor_count=8.
>
> Though "minor_count" at some point used to be the hard limit for the number of
> minor devices (allocation of an array of corresponding size), that has
> long since changed, and now it is really only used as a scaling factor for
> these mempools.

Got it, thank you very much for the helpful tips and explanations!
I'll add the drbd.minor_count=8 option.

Thanks,
Fengguang