2024-03-26 17:14:00

by Benjamin Coddington

Subject: Re: [External] : nfsd: memory leak when client does many file operations

On 26 Mar 2024, at 13:04, Jan Schunk wrote:

> Before I start doing this on my own build I tried it with unmodified linux-image-6.6.13+bpo-amd64 from Debian 12.
> I installed systemtap, linux-headers-6.6.13+bpo-amd64 and linux-image-6.6.13+bpo-amd64-dbg and tried to run stap:
>
> user@deb:~$ sudo stap -v --all-modules kmem_alloc.stp nfsd_file
> WARNING: Kernel function symbol table missing [man warning::symbols]
> Pass 1: parsed user script and 484 library scripts using 110120virt/96896res/7168shr/89800data kb, in 1360usr/1080sys/4963real ms.
> WARNING: cannot find module kernel debuginfo: No DWARF information found [man warning::debuginfo]
> semantic error: resolution failed in DWARF builder
>
> semantic error: while resolving probe point: identifier 'kernel' at kmem_alloc.stp:5:7
> source: probe kernel.function("kmem_cache_alloc") {
> ^
>
> semantic error: no match
>
> Pass 2: analyzed script: 1 probe, 5 functions, 1 embed, 3 globals using 112132virt/100352res/8704shr/91792data kb, in 30usr/30sys/167real ms.
> Pass 2: analysis failed. [man error::pass2]
> Tip: /usr/share/doc/systemtap/README.Debian should help you get started.
> user@deb:~$
>
> user@deb:~$ grep -E 'CONFIG_DEBUG_INFO|CONFIG_KPROBES|CONFIG_DEBUG_FS|CONFIG_RELAY' /boot/config-6.6.13+bpo-amd64
> CONFIG_RELAY=y
> CONFIG_KPROBES=y
> CONFIG_KPROBES_ON_FTRACE=y
> CONFIG_DEBUG_INFO=y
> # CONFIG_DEBUG_INFO_NONE is not set
> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> # CONFIG_DEBUG_INFO_DWARF4 is not set
> # CONFIG_DEBUG_INFO_DWARF5 is not set
> # CONFIG_DEBUG_INFO_REDUCED is not set
> CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
> # CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
> # CONFIG_DEBUG_INFO_SPLIT is not set
> CONFIG_DEBUG_INFO_BTF=y
> CONFIG_DEBUG_INFO_BTF_MODULES=y
> CONFIG_DEBUG_FS=y
> CONFIG_DEBUG_FS_ALLOW_ALL=y
> # CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
> # CONFIG_DEBUG_FS_ALLOW_NONE is not set
> user@deb:~$
>
> Do I need to enable other options?

You should just need DEBUG_INFO.. maybe stap can't find it? You can try to add: -r /path/to/the/kernel/build

.. but I usually only use this option for cross-compiles. I don't usually have to work without the debuginfo packages either; when I don't have them, I annotate the kernel directly.

Maybe just watching what's happening in /proc/slabinfo would be enough.

Ben
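
For the /proc/slabinfo suggestion above, here is a minimal sketch of pulling one cache's counters out of slabinfo text. It assumes the slabinfo 2.x column order shown later in this thread (name, active_objs, num_objs, objsize, objperslab, pagesperslab). The names SlabStat, parse_slabinfo and read_slab are made up for illustration and are not part of any existing tool.

```python
from typing import NamedTuple, Optional


class SlabStat(NamedTuple):
    # Hypothetical container for the first six slabinfo columns.
    name: str
    active_objs: int
    num_objs: int
    objsize: int
    objperslab: int
    pagesperslab: int


def parse_slabinfo(text: str, cache: str) -> Optional[SlabStat]:
    """Return the counters for one cache from slabinfo text, or None."""
    for line in text.splitlines():
        fields = line.split()
        # Skip the version banner, the '# name ...' header and other caches.
        if len(fields) < 6 or fields[0] != cache:
            continue
        return SlabStat(fields[0], *(int(f) for f in fields[1:6]))
    return None


def read_slab(cache: str, path: str = "/proc/slabinfo") -> Optional[SlabStat]:
    # /proc/slabinfo is normally readable by root only.
    with open(path) as f:
        return parse_slabinfo(f.read(), cache)
```

Calling read_slab("nfsd_file") as root at a fixed interval and logging the result with a timestamp would give a growth curve for the cache without needing systemtap or debuginfo.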



2024-03-26 17:15:53

by Benjamin Coddington

Subject: Re: [External] : nfsd: memory leak when client does many file operations

On 26 Mar 2024, at 13:13, Benjamin Coddington wrote:

> You should just need DEBUG_INFO.. maybe stap can't find it? You can try to add: -r /path/to/the/kernel/build

Oh, never mind - you're using a packaged kernel. I'm not familiar with the packaging requirements for systemtap on Debian.

Ben


2024-03-26 19:10:41

by Jan Schunk

Subject: Re: Re: [External] : nfsd: memory leak when client does many file operations

Thanks, yes this was a packaged kernel, I will try it with my own build later.

During an earlier test run I saved /proc/slabinfo to a file from time to time. On kernel 6.6.x I can see the nfsd_file <active_objs> and <num_objs> counts grow from 72 to 324 within 14 hours. I cannot compare this to older kernels, because there is no nfsd_file entry in their slabinfo.

top - 00:49:49 up 3 min, 1 user, load average: 0,21, 0,19, 0,09
Tasks: 111 total, 1 running, 110 sleeping, 0 stopped, 0 zombie
%CPU(s): 0,2 us, 0,3 sy, 0,0 ni, 99,5 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
MiB Mem: 467,0 total, 302,3 free, 89,3 used, 88,1 buff/cache
MiB Swap: 975,0 total, 975,0 free, 0,0 used. 377,7 avail Mem

slabinfo
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfsd_file 72 72 112 36 1 : tunables 0 0 0 : slabdata 2 2 0

top - 15:05:39 up 14:19, 1 user, load average: 1,87, 1,72, 1,65
Tasks: 104 total, 1 running, 103 sleeping, 0 stopped, 0 zombie
%CPU(s): 0,2 us, 4,9 sy, 0,0 ni, 53,3 id, 39,0 wa, 0,0 hi, 2,6 si, 0,0 st
MiB Mem: 467,0 total, 21,2 free, 147,1 used, 310,9 buff/cache
MiB Swap: 975,0 total, 952,9 free, 22,1 used. 319,9 avail Mem

slabinfo
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfsd_file 324 324 112 36 1 : tunables 0 0 0 : slabdata 9 9 0
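
To put the two samples above into numbers, here is a rough sketch of the arithmetic. It takes the two nfsd_file lines and the uptimes shown by top (3 min and 14:19) as inputs. The helper names parse_slab_line and growth are made up for illustration, and the byte figure counts only the nfsd_file objects themselves, not anything they may reference.

```python
def parse_slab_line(line):
    # Columns: name, active_objs, num_objs, objsize, ... (slabinfo 2.x layout)
    f = line.split()
    return {
        "name": f[0],
        "active_objs": int(f[1]),
        "num_objs": int(f[2]),
        "objsize": int(f[3]),
    }


def growth(first, last, elapsed_hours):
    # Difference in live objects between the two samples.
    delta = last["active_objs"] - first["active_objs"]
    return {
        "delta_objs": delta,
        "objs_per_hour": delta / elapsed_hours,
        "delta_bytes": delta * last["objsize"],
    }


first = parse_slab_line("nfsd_file 72 72 112 36 1 : tunables 0 0 0 : slabdata 2 2 0")
last = parse_slab_line("nfsd_file 324 324 112 36 1 : tunables 0 0 0 : slabdata 9 9 0")

# Uptime went from 0:03 to 14:19 between the two samples.
elapsed_hours = (14 * 60 + 19 - 3) / 60

result = growth(first, last, elapsed_hours)
print(result)
```

With these inputs the cache gained 252 objects, roughly 17.7 per hour, which is 28224 bytes of nfsd_file objects at 112 bytes each. The slab itself is therefore small in these samples; the steady increase in the object count is the more interesting signal.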


> Sent: Tuesday, 26 March 2024 at 18:15
> From: "Benjamin Coddington" <[email protected]>
> To: "Jan Schunk" <[email protected]>
> Cc: "Chuck Lever III" <[email protected]>, "Jeff Layton" <[email protected]>, "Neil Brown" <[email protected]>, "Olga Kornievskaia" <[email protected]>, "Dai Ngo" <[email protected]>, "Tom Talpey" <[email protected]>, "Linux NFS Mailing List" <[email protected]>, [email protected]
> Subject: Re: [External] : nfsd: memory leak when client does many file operations
>
> On 26 Mar 2024, at 13:13, Benjamin Coddington wrote:
>
> > You should just need DEBUG_INFO.. maybe stap can't find it? You can try to add: -r /path/to/the/kernel/build
>
> Oh, never mind - you're using a packaged kernel. I'm not familiar with the packaging requirements for systemtap on Debian.
>
> Ben
>