2012-10-03 21:23:28

by Nico Schottelius

Subject: Re: Out of memory on 3.5 kernels

Hello,

do any of you have a clue yet what may be causing the huge
slab usage?

I've just found an interesting detail: unmounting and running
cryptsetup luksClose frees up the used memory (I'm not sure which
of the two steps freed it).
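
One way to tell the two steps apart next time would be to watch the
slab counter in between; a rough sketch (the mount point and mapping
name are placeholders for my actual setup):

% grep Slab /proc/meminfo             # slab usage while the backup fs is mounted
% umount /mnt/backup
% grep Slab /proc/meminfo             # did the umount alone free it?
% cryptsetup luksClose backup_crypt
% grep Slab /proc/meminfo             # ...or did closing the LUKS mapping?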

Attached are dmesg output, plus slabtop output captured during the backup and after unmounting.

Cheers,

Nico


--
PGP key: 7ED9 F7D3 6B10 81D7 0EC5 5C09 D7DC C8E4 3187 7DF0


Attachments:
dmesg (480.37 kB)
slabtop (1.96 kB)
slabtop-post-umount (1.96 kB)

2012-10-05 15:50:55

by Valdis Klētnieks

Subject: Re: Out of memory on 3.5 kernels

On Wed, 03 Oct 2012 23:23:11 +0200, Nico Schottelius said:

> do any of you have a clue yet what may be causing the huge
> slab usage?
>
> I've just found an interesting detail: unmounting and running
> cryptsetup luksClose frees up the used memory (I'm not sure which
> of the two steps freed it)

For what it's worth, I'm seeing a similar problem in linux-next on my laptop -
trying to run a backup to an external hard drive that has a LUKS partition on
it will OOM. (For some reason having the external LUKS partition is much more
problematic than the LVM-on-LUKS on the internal drive)

I've started bisecting, and gotten this far:

% git bisect log
git bisect start
# bad: [1aa44772a621e8547dc4db41b47c747469fe0ea3] Add linux-next specific files for 20121001
git bisect bad 1aa44772a621e8547dc4db41b47c747469fe0ea3
# good: [fea7a08acb13524b47711625eebea40a0ede69a0] Linux 3.6-rc3
git bisect good fea7a08acb13524b47711625eebea40a0ede69a0
# good: [526c4d73327f56f83da8b8088fd0b3c7be38c7ae] Merge remote-tracking branch 'regulator/for-next'
git bisect good 526c4d73327f56f83da8b8088fd0b3c7be38c7ae
# good: [961d70d88557405c5b7302c7d87752566468f035] Merge remote-tracking branch 'tty/tty-next'
git bisect good 961d70d88557405c5b7302c7d87752566468f035

(That's as far as I've gotten that I trust; the next bisect step hits a
different problem where connecting the hard drive and *starting* the LUKS
mapping with cryptsetup luksOpen hangs the machine hard, so I'll have to
finish bisecting that problem first, then return to bisecting this one.)
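
(If the two problem ranges turn out to overlap, one way to route around
the untestable commits is git bisect skip; a sketch, where the range is
only illustrative:

% git bisect skip                       # current commit hangs, can't be tested
% git bisect skip v3.6-rc4..v3.6-rc5    # or skip a whole suspect range

git will then propose nearby commits that are still testable.)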

I admit I'm not sure why you see it on a 3.5 kernel; I only see it on
kernels after 3.6-rc3.



2012-10-05 17:49:53

by Nico Schottelius

Subject: Re: Out of memory on 3.5 kernels

Hey Valdis,

[email protected] [Fri, Oct 05, 2012 at 11:48:04AM -0400]:
> On Wed, 03 Oct 2012 23:23:11 +0200, Nico Schottelius said:
>
> > do any of you have a clue yet what may be causing the huge
> > slab usage?
> >
> > I've just found an interesting detail: unmounting and running
> > cryptsetup luksClose frees up the used memory (I'm not sure which
> > of the two steps freed it)
>
> For what it's worth, I'm seeing a similar problem in linux-next on my laptop -
> trying to run a backup to an external hard drive that has a LUKS partition on
> it will OOM. (For some reason having the external LUKS partition is much more
> problematic than the LVM-on-LUKS on the internal drive)

Indeed, my internal drive is also encrypted with LUKS and also has jfs
on it, but the problem only occurs on the external drive.

Is it possible that the JFS filesystem on the external drive got into a
state that causes JFS to behave unexpectedly? If so, that would explain
why it happens only with the external drive and also across kernel
versions.
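
If that's plausible, a read-only check of the external filesystem might
confirm it; roughly (the device and mapping name are placeholders, and
if I read the fsck.jfs man page right, -n reports problems without
repairing anything):

% cryptsetup luksOpen /dev/sdb1 backup_crypt
% fsck.jfs -n /dev/mapper/backup_crypt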

But "good" to hear I am not the only one affected anymore.

Cheers,

Nico


--
PGP key: 7ED9 F7D3 6B10 81D7 0EC5 5C09 D7DC C8E4 3187 7DF0



2012-10-30 22:09:10

by Nico Schottelius

Subject: Re: Out of memory on 3.5 kernels

Good morning,

An update: this problem still exists on 3.6.2-1-ARCH, and it has gotten worse:

I reformatted the external disk to use xfs, but as my root
filesystem is still jfs, the problem still appears:

Active / Total Objects (% used) : 642732 / 692268 (92.8%)
Active / Total Slabs (% used) : 24801 / 24801 (100.0%)
Active / Total Caches (% used) : 79 / 111 (71.2%)
Active / Total Size (% used) : 603522.30K / 622612.05K (96.9%)
Minimum / Average / Maximum Object : 0.01K / 0.90K / 15.25K

OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
475548 467649 98% 1.21K 18722 26 599104K jfs_ip
25670 19143 74% 0.05K 302 85 1208K shared_policy_node
24612 16861 68% 0.19K 1172 21 4688K dentry
24426 19524 79% 0.17K 1062 23 4248K vm_area_struct
21636 21180 97% 0.11K 601 36 2404K sysfs_dir_cache
12352 9812 79% 0.06K 193 64 772K kmalloc-64
11684 9145 78% 0.09K 254 46 1016K anon_vma
9855 8734 88% 0.58K 365 27 5840K inode_cache
9728 9281 95% 0.01K 19 512 76K kmalloc-8
8932 4411 49% 0.55K 319 28 5104K radix_tree_node
6336 5760 90% 0.25K 198 32 1584K kmalloc-256
5632 5632 100% 0.02K 22 256 88K kmalloc-16
4998 2627 52% 0.09K 119 42 476K kmalloc-96
4998 3893 77% 0.04K 49 102 196K Acpi-Namespace
4736 3887 82% 0.03K 37 128 148K kmalloc-32
4144 4144 100% 0.07K 74 56 296K Acpi-ParseExt
3740 3740 100% 0.02K 22 170 88K numa_policy
3486 3023 86% 0.19K 166 21 664K kmalloc-192
3200 2047 63% 0.12K 100 32 400K kmalloc-128
2304 2074 90% 0.50K 72 32 1152K kmalloc-512
2136 2019 94% 0.64K 89 24 1424K proc_inode_cache
2080 2080 100% 0.12K 65 32 260K jfs_mp
2024 1890 93% 0.70K 88 23 1408K shmem_inode_cache
1632 1556 95% 1.00K 51 32 1632K kmalloc-1024
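
For what it's worth, a quick way to test whether those jfs_ip objects
are reclaimable at all is to ask the kernel to drop reclaimable slab
objects (as root) and re-check; this is safe and only costs cache
warm-up:

% sync
% echo 2 > /proc/sys/vm/drop_caches   # 2 = free reclaimable slab objects (incl. dentries and inodes)
% slabtop -o | head -n 15             # one-shot output, top caches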


I am wondering whether anyone feels responsible for this bug, or whether
the mid-term solution is to move away from jfs.

Cheers,

Nico

--
PGP key: 7ED9 F7D3 6B10 81D7 0EC5 5C09 D7DC C8E4 3187 7DF0

2012-11-01 18:11:49

by Tino Reichardt

[permalink] [raw]
Subject: Re: [Jfs-discussion] Out of memory on 3.5 kernels

* Nico Schottelius <[email protected]> wrote:
> Good morning,
>
> An update: this problem still exists on 3.6.2-1-ARCH, and it has gotten worse:
>
> I reformatted the external disk to use xfs, but as my root
> filesystem is still jfs, the problem still appears:
>
> Active / Total Objects (% used) : 642732 / 692268 (92.8%)
> Active / Total Slabs (% used) : 24801 / 24801 (100.0%)
> Active / Total Caches (% used) : 79 / 111 (71.2%)
> Active / Total Size (% used) : 603522.30K / 622612.05K (96.9%)
> Minimum / Average / Maximum Object : 0.01K / 0.90K / 15.25K
>
> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> 475548 467649 98% 1.21K 18722 26 599104K jfs_ip

...

>
> I am wondering whether anyone feels responsible for this bug, or whether
> the mid-term solution is to move away from jfs.

I also ran some tests when this bug was first reported, but I couldn't
reproduce it. Currently I have no idea what is wrong there.

I think moving to ext4 or xfs is the best option for now... :(


--
regards, TR

2012-11-22 19:41:44

by Dave Kleikamp

Subject: Re: Out of memory on 3.5 kernels

On 10/30/2012 05:35 AM, Nico Schottelius wrote:
> Good morning,
>
> An update: this problem still exists on 3.6.2-1-ARCH, and it has gotten worse:
>
> I reformatted the external disk to use xfs, but as my root
> filesystem is still jfs, the problem still appears:
>
> Active / Total Objects (% used) : 642732 / 692268 (92.8%)
> Active / Total Slabs (% used) : 24801 / 24801 (100.0%)
> Active / Total Caches (% used) : 79 / 111 (71.2%)
> Active / Total Size (% used) : 603522.30K / 622612.05K (96.9%)
> Minimum / Average / Maximum Object : 0.01K / 0.90K / 15.25K
>
> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> 475548 467649 98% 1.21K 18722 26 599104K jfs_ip

...

>
> I am wondering whether anyone feels responsible for this bug, or whether
> the mid-term solution is to move away from jfs.

Sorry, I haven't taken too close a look at this, but I did notice
another conversation that may be related:

https://lkml.org/lkml/2012/11/17/26

The commit in question first showed up in 3.5-rc1, which coincides with
your problem.


2012-11-27 15:57:21

by Dave Kleikamp

Subject: Re: [Jfs-discussion] Out of memory on 3.5 kernels

On 11/21/2012 04:37 PM, Dave Kleikamp wrote:
> On 10/30/2012 05:35 AM, Nico Schottelius wrote:
>> Good morning,
>>
>> An update: this problem still exists on 3.6.2-1-ARCH, and it has gotten worse:
>>
>> I reformatted the external disk to use xfs, but as my root
>> filesystem is still jfs, the problem still appears:
>>
>> Active / Total Objects (% used) : 642732 / 692268 (92.8%)
>> Active / Total Slabs (% used) : 24801 / 24801 (100.0%)
>> Active / Total Caches (% used) : 79 / 111 (71.2%)
>> Active / Total Size (% used) : 603522.30K / 622612.05K (96.9%)
>> Minimum / Average / Maximum Object : 0.01K / 0.90K / 15.25K
>>
>> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
>> 475548 467649 98% 1.21K 18722 26 599104K jfs_ip

...

>>
>> I am wondering whether anyone feels responsible for this bug, or whether
>> the mid-term solution is to move away from jfs.
>
> Sorry, I haven't taken too close a look at this, but I did notice
> another conversation that may be related:
>
> https://lkml.org/lkml/2012/11/17/26
>
> The commit in question first showed up in 3.5-rc1, which coincides with
> your problem.

I believe this commit will fix the problem:
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4eff96d

It is targeted for the stable kernels.
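
Until it shows up in a stable release, a sketch of how to test it on
top of a local tree (assuming a checkout that carries the v3.6.2 tag;
the branch name is arbitrary):

% git remote add torvalds git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
% git fetch torvalds
% git checkout -b writeback-fix v3.6.2
% git cherry-pick 4eff96d             # the commit referenced above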

2012-11-27 16:11:53

by Nico Schottelius

Subject: Re: [Jfs-discussion] Out of memory on 3.5 kernels

Hey Dave,

Dave Kleikamp [Tue, Nov 27, 2012 at 09:56:58AM -0600]:
> [...]
> I believe this commit will fix the problem:
> http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=4eff96d
>
> It is targeted for the stable kernels.

Thanks, I'll give it a try. I've already compiled v3.7-rc7-25-g2844a48
and will give the system a reboot this evening.

Cheers,

Nico

--
PGP key: 7ED9 F7D3 6B10 81D7 0EC5 5C09 D7DC C8E4 3187 7DF0