2001-12-08 15:40:02

by Leigh Orf

[permalink] [raw]
Subject: 2.4.16 memory badness (reproducible)


I've been having confounding out-of-memory problems with 2.4.16 on my
1.4 GHz Athlon with 1 GB of memory (2 GB of swap). I just caught it in
the act and I think it relates to some of the weirdness others have been
reporting.

I'm running Red Hat 7.2. After bootup, it runs a program called updatedb
(slocate -u), which does a lot of file I/O as it indexes all the files on
my hard drives. Afterwards, my machine is left in a state where many
applications fail with "cannot allocate memory" errors. It seems the
kernel is not freeing up buffered or cached memory, and even more
troubling, it isn't using any of my swap space.

Here is the state of the machine after updatedb runs:

home[1006]:/home/orf% free
             total       used       free     shared    buffers     cached
Mem:       1029820    1021252       8568          0     471036      90664
-/+ buffers/cache:     459552     570268
Swap:      2064344          0    2064344

home[1003]:/home/orf% cat /proc/meminfo
        total:       used:      free:   shared:   buffers:    cached:
Mem:  1054535680  1045901312   8634368         0  480497664   93954048
Swap: 2113888256           0  2113888256
MemTotal: 1029820 kB
MemFree: 8432 kB
MemShared: 0 kB
Buffers: 469236 kB
Cached: 91752 kB
SwapCached: 0 kB
Active: 383812 kB
Inactive: 229016 kB
HighTotal: 130992 kB
HighFree: 2044 kB
LowTotal: 898828 kB
LowFree: 6388 kB
SwapTotal: 2064344 kB
SwapFree: 2064344 kB

home[1005]:/home/orf% cat /proc/slabinfo
slabinfo - version: 1.1
kmem_cache 65 68 112 2 2 1
ip_conntrack 9 50 384 4 5 1
nfs_write_data 0 0 384 0 0 1
nfs_read_data 0 0 384 0 0 1
nfs_page 0 0 128 0 0 1
ip_fib_hash 10 112 32 1 1 1
urb_priv 0 0 64 0 0 1
clip_arp_cache 0 0 128 0 0 1
ip_mrt_cache 0 0 128 0 0 1
tcp_tw_bucket 0 0 128 0 0 1
tcp_bind_bucket 8 112 32 1 1 1
tcp_open_request 0 0 128 0 0 1
inet_peer_cache 4 59 64 1 1 1
ip_dst_cache 27 40 192 2 2 1
arp_cache 3 30 128 1 1 1
blkdev_requests 640 660 128 22 22 1
journal_head 0 0 48 0 0 1
revoke_table 0 0 12 0 0 1
revoke_record 0 0 32 0 0 1
dnotify cache 0 0 20 0 0 1
file lock cache 2 42 92 1 1 1
fasync cache 2 202 16 1 1 1
uid_cache 5 112 32 1 1 1
skbuff_head_cache 327 340 192 17 17 1
sock 188 198 1280 66 66 1
sigqueue 2 29 132 1 1 1
cdev_cache 2313 2360 64 40 40 1
bdev_cache 8 59 64 1 1 1
mnt_cache 19 59 64 1 1 1
inode_cache 439584 439586 512 62798 62798 1
dentry_cache 454136 454200 128 15140 15140 1
dquot 0 0 128 0 0 1
filp 1471 1500 128 50 50 1
names_cache 0 2 4096 0 2 1
buffer_head 144413 173280 128 5776 5776 1
mm_struct 57 80 192 4 4 1
vm_area_struct 2325 2760 128 92 92 1
fs_cache 56 118 64 2 2 1
files_cache 56 72 448 8 8 1
signal_act 64 72 1344 24 24 1
size-131072(DMA) 0 0 131072 0 0 32
size-131072 0 0 131072 0 0 32
size-65536(DMA) 0 0 65536 0 0 16
size-65536 1 1 65536 1 1 16
size-32768(DMA) 0 0 32768 0 0 8
size-32768 1 1 32768 1 1 8
size-16384(DMA) 0 0 16384 0 0 4
size-16384 1 1 16384 1 1 4
size-8192(DMA) 0 0 8192 0 0 2
size-8192 4 4 8192 4 4 2
size-4096(DMA) 0 0 4096 0 0 1
size-4096 64 68 4096 64 68 1
size-2048(DMA) 0 0 2048 0 0 1
size-2048 52 66 2048 27 33 1
size-1024(DMA) 0 0 1024 0 0 1
size-1024 11042 11048 1024 2762 2762 1
size-512(DMA) 0 0 512 0 0 1
size-512 12004 12016 512 1501 1502 1
size-256(DMA) 0 0 256 0 0 1
size-256 1678 1695 256 113 113 1
size-128(DMA) 2 30 128 1 1 1
size-128 29398 29430 128 980 981 1
size-64(DMA) 0 0 64 0 0 1
size-64 7954 7965 64 135 135 1
size-32(DMA) 34 59 64 1 1 1
size-32 66711 66729 64 1131 1131 1

Now, I try to run a common application:

home[1031]:/home/orf% xmms
Memory fault

Strace on xmms shows:

home[1008]:/home/orf/memfuck% cat xmms.strace
[snip]
modify_ldt(0x1, 0xbffff1fc, 0x10) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

Also, from my syslog (I have an NTFS partition):

Dec 8 09:55:01 orp kernel: NTFS: ntfs_insert_run: ntfs_vmalloc(new_size = 0x1000) failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_process_runs: ntfs_insert_run failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_getdir_unsorted(): Read failed. Returning error code -95.
Dec 8 09:55:01 orp kernel: NTFS: ntfs_insert_run: ntfs_vmalloc(new_size = 0x1000) failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_process_runs: ntfs_insert_run failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_getdir_unsorted(): Read failed. Returning error code -95.
Dec 8 09:55:01 orp kernel: NTFS: ntfs_insert_run: ntfs_vmalloc(new_size = 0x1000) failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_process_runs: ntfs_insert_run failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_insert_run: ntfs_vmalloc(new_size = 0x1000) failed
Dec 8 09:55:01 orp kernel: NTFS: ntfs_process_runs: ntfs_insert_run failed

The program nautilus, which is part of the GNOME desktop, also
complains that it can't allocate memory if I log in at the console after
updatedb has run (that's what clued me in to this problem in the first
place).

The only way I can find to make the system usable is to run an
application which aggressively reclaims some of this buffered/cached
memory, and then quit it. One easy way to do this:

home[1014]:/home/orf% lmdd opat=1 count=1 bs=900m

After I do this, much free memory is available.
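
For anyone without lmbench installed (lmdd is part of lmbench), a minimal
C sketch along these lines should force the same reclaim; the 900 MB figure
simply mirrors the lmdd command above and is otherwise an assumption:

/* hog.c - hypothetical stand-in for the lmdd trick above: allocate
 * ~900 MB of anonymous memory and touch every page, forcing the VM to
 * reclaim buffer/cache pages (and possibly push something to swap).
 * Build: gcc -O2 -o hog hog.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HOG_BYTES (900UL * 1024 * 1024)	/* assumed size, tune to taste */

int main(void)
{
	char *p = malloc(HOG_BYTES);

	if (!p) {
		perror("malloc");
		return 1;
	}
	memset(p, 1, HOG_BYTES);	/* touch every page so it is really backed by RAM */
	printf("touched %lu MB; the memory is freed again on exit\n",
	       HOG_BYTES >> 20);
	free(p);
	return 0;
}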

Some applications are able to "reclaim" the buffered/cached memory,
while others aren't. Netscape, for instance, has no problem running
after updatedb.

This is a pretty serious problem. Interestingly enough, it does NOT
occur on my other machine, running the same kernel and RH 7.2, with
256 MB of memory and 512 MB of swap.

Leigh Orf


2001-12-08 15:56:38

by Ken Brownfield

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

This parallels what I'm seeing -- perhaps inode/dentry cache bloat is
causing the memory issue (which mimics if not _is_ a memory leak) _and_
my kswapd thrashing? It fits both the situation you report and what I'm
seeing with I/O across a large number of files (inodes) -- updatedb,
smb, NFS, etc.

I think Andrea was on to this issue, so I'm hoping his work will help.
Have you tried an -aa kernel, or an -aa patch on top of 2.4.17-pre4, to
see how the kernel's behavior changes?

--
Ken.
[email protected]

On Sat, Dec 08, 2001 at 10:39:14AM -0500, Leigh Orf wrote:
|
| I've been having confounding out-of-memory problems with 2.4.16 on my
| 1.4 GHz Athlon with 1 GB of memory (2 GB of swap). I just caught it in
| the act and I think it relates to some of the weirdness others have been
| reporting.
|
| [snip - full report (/proc dumps and straces) shown in the message above]

2001-12-08 18:54:50

by Leigh Orf

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)



Ken Brownfield wrote:

| This parallels what I'm seeing -- perhaps inode/dentry cache
| bloat is causing the memory issue (which mimics if not _is_
| a memory leak) _and_ my kswapd thrashing? It fits both the
| situation you report and what I'm seeing with I/O across a
| large number of files (inodes) -- updatedb, smb, NFS, etc.
|
| I think Andrea was on to this issue, so I'm hoping his work
| will help. Have you tried an -aa kernel or an aa patch onto
| a 2.4.17-pre4 to see how the kernel's behavior changes?
|
| --
| Ken.
| [email protected]

I get the exact same behavior with 2.4.17-pre4-aa1 - many applications
abort with ENOMEM after updatedb runs (filling the buffer and page
caches). Is there another kernel/patch I should try?

Leigh Orf

2001-12-08 19:42:22

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

Leigh Orf wrote:
>
> Ken Brownfield wrote:
>
> | This parallels what I'm seeing -- perhaps inode/dentry cache
> | bloat is causing the memory issue (which mimics if not _is_
> | a memory leak) _and_ my kswapd thrashing? It fits both the
> | situation you report and what I'm seeing with I/O across a
> | large number of files (inodes) -- updatedb, smb, NFS, etc.
> |
> | I think Andrea was on to this issue, so I'm hoping his work
> | will help. Have you tried an -aa kernel or an aa patch onto
> | a 2.4.17-pre4 to see how the kernel's behavior changes?
> |
> | --
> | Ken.
> | [email protected]
>
> I get the exact same behavior with 2.4.17-pre4-aa1 - many applications
> abort with ENOMEM after updatedb (filling the buffer and cache). Is
> there another kernel/patch I should try?
>

Just for interest's sake:

--- linux-2.4.17-pre6/mm/memory.c	Fri Dec  7 15:39:52 2001
+++ linux-akpm/mm/memory.c	Sat Dec  8 11:13:30 2001
@@ -1184,6 +1184,7 @@ static int do_anonymous_page(struct mm_s
 		flush_page_to_ram(page);
 		entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot)));
 		lru_cache_add(page);
+		activate_page(page);
 	}
 
 	set_pte(page_table, entry);

2001-12-08 20:05:15

by Leigh Orf

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)


No change - identical behavior.

Leigh Orf

Andrew Morton wrote:

| Leigh Orf wrote:
| >
| > [snip]
| >
| > I get the exact same behavior with 2.4.17-pre4-aa1 - many applications
| > abort with ENOMEM after updatedb (filling the buffer and cache). Is
| > there another kernel/patch I should try?
| >
|
| Just for interest's sake:
|
| --- linux-2.4.17-pre6/mm/memory.c	Fri Dec  7 15:39:52 2001
| +++ linux-akpm/mm/memory.c	Sat Dec  8 11:13:30 2001
| @@ -1184,6 +1184,7 @@ static int do_anonymous_page(struct mm_s
|  		flush_page_to_ram(page);
|  		entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot)));
|  		lru_cache_add(page);
| +		activate_page(page);
|  	}
|
|  	set_pte(page_table, entry);

2001-12-08 21:42:32

by Leigh Orf

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)


I've noticed a couple more things about the memory allocation problem
when the buffer and page caches are large. Some applications will fail
with ENOMEM *even if* there is a considerable amount (say, 62 MB as
below) of "truly" free memory.

The second thing I've noticed is that the apps that die with ENOMEM all
have pretty much the same strace output towards the end. What is strange
is that "display *.tif" dies while "ee *.tif" and "gimp *.tif" do not.
Piping the strace output of the commands that *don't* die through "grep
modify_ldt" shows that modify_ldt is never called by the apps that
survive.

So I don't know if it's a symptom or a cause, but modify_ldt seems to be
triggering the problem. Not being a kernel hacker, I leave the analysis
of this to those who are.

Leigh Orf

home[1029]:/home/orf% free
             total       used       free     shared    buffers     cached
Mem:       1029772     967096      62676          0     443988      98312
-/+ buffers/cache:     424796     604976
Swap:      2064344          0    2064344

home[1026]:/home/orf% strace xmms 2>&1 | tail
old_mmap(NULL, 1291080, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40316000
mprotect(0x40448000, 37704, PROT_NONE) = 0
old_mmap(0x40448000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x131000) = 0x40448000
old_mmap(0x4044e000, 13128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4044e000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40452000
munmap(0x40018000, 72129) = 0
modify_ldt(0x1, 0xbffff1fc, 0x10) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

home[1027]:/home/orf% strace nautilus 2>&1 | tail
old_mmap(NULL, 1291080, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40958000
mprotect(0x40a8a000, 37704, PROT_NONE) = 0
old_mmap(0x40a8a000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x131000) = 0x40a8a000
old_mmap(0x40a90000, 13128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40a90000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40a94000
munmap(0x40018000, 72129) = 0
modify_ldt(0x1, 0xbffff1fc, 0x10) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++

home[1028]:/home/orf% strace display *.tif 2>&1 | tail
old_mmap(NULL, 1291080, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x404ff000
mprotect(0x40631000, 37704, PROT_NONE) = 0
old_mmap(0x40631000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x131000) = 0x40631000
old_mmap(0x40637000, 13128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40637000
close(3) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4063b000
munmap(0x401a8000, 72129) = 0
modify_ldt(0x1, 0xbfffefac, 0x10) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++



Leigh Orf wrote:

|
| No change - identical behavior.
|
| Leigh Orf
|
| Andrew Morton wrote:
|
| | [snip - the activate_page patch and earlier quotes, above]

2001-12-08 22:24:43

by Leigh Orf

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)


More clues...

The only way I can seem to bring the machine back to being totally
normal after the buffer/cache fills up is to force some swap to be
written, such as by doing

lmdd opat=1 count=1 bs=900m

If I do

lmdd opat=1 count=1 bs=500m

about 500 MB of memory is freed but no swap is written, and modify_ldt
still returns ENOMEM when I run xmms, display, etc.

It looks like the problem is somewhere in vmalloc, since that's what
returns the null pointer where ENOMEM gets set in arch/i386/kernel/ldt.c.
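
For reference, the path being described looks roughly like this; the
snippet below is a paraphrase from memory of the 2.4-era code, not an
exact copy of the source:

/* Sketch of write_ldt() in arch/i386/kernel/ldt.c (2.4 era, paraphrased).
 * The per-mm LDT is allocated with vmalloc() on first use, so when the
 * vmalloc address space is exhausted, modify_ldt() returns -ENOMEM even
 * though plenty of physical memory is free. */
static int write_ldt(void *ptr, unsigned long bytecount, int oldmode)
{
	struct mm_struct *mm = current->mm;
	int error = -ENOMEM;

	/* ... argument validation elided ... */

	if (!mm->context.segments) {
		mm->context.segments =
			vmalloc(LDT_ENTRIES * LDT_ENTRY_SIZE);
		if (!mm->context.segments)
			return error;	/* the ENOMEM seen in the straces */
		/* ... zero the table and load the LDT descriptor ... */
	}
	/* ... copy the new descriptor into the table ... */
	return 0;
}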

BTW I have been running kernels with

CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y

I am compiling a kernel with

CONFIG_NOHIGHMEM=y

and will see if the bad memory behavior continues.

Leigh Orf

Leigh Orf wrote:

| I've noticed a couple more things about the memory allocation
| problem when the buffer and page caches are large. [snip]


2001-12-11 19:07:51

by Hugh Dickins

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

On Sat, 8 Dec 2001, Leigh Orf wrote:
>
> So I don't know if it's a symptom or a cause, but modify_ldt seems to be
> triggering the problem. Not being a kernel hacker, I leave the analysis
> of this to those who are.
>
> home[1029]:/home/orf% free
>              total       used       free     shared    buffers     cached
> Mem:       1029772     967096      62676          0     443988      98312
> -/+ buffers/cache:     424796     604976
> Swap:      2064344          0    2064344
>
> modify_ldt(0x1, 0xbffff1fc, 0x10) = -1 ENOMEM (Cannot allocate memory)

I believe this error comes, not from a (genuine or mistaken) shortage
of free memory, but from shortage or fragmentation of vmalloc's virtual
address space. Does the patch below (to 2.4.17-pre4-aa1, since I think
that's what you tried last; easily adaptable to other trees), doubling
vmalloc's address space (on your 1GB machine or larger), make any difference?
Perhaps there's a vmalloc leak and this will only delay the error.

Hugh

--- 1704aa1/arch/i386/kernel/setup.c	Tue Dec 11 15:22:53 2001
+++ linux/arch/i386/kernel/setup.c	Tue Dec 11 19:01:37 2001
@@ -835,7 +835,7 @@
 /*
  * 128MB for vmalloc and initrd
  */
-#define VMALLOC_RESERVE	(unsigned long)(128 << 20)
+#define VMALLOC_RESERVE	(unsigned long)(256 << 20)
 #define MAXMEM	(unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
 #ifdef CONFIG_HIGHMEM_EMULATION
 #define ORDER_DOWN(x)	((x >> (MAX_ORDER-1)) << (MAX_ORDER-1))
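
To make the arithmetic behind this explicit (assuming the standard i386
PAGE_OFFSET of 0xC0000000, i.e. the usual 3G/1G user/kernel split):

/* Hand evaluation of the constants in the patch above: */
#define PAGE_OFFSET	0xC0000000UL
#define VMALLOC_RESERVE	(128UL << 20)	/* 0x08000000 = 128 MB */
#define MAXMEM		(-PAGE_OFFSET - VMALLOC_RESERVE)
/*
 * -PAGE_OFFSET wraps to 0x40000000 (the kernel's 1 GB of virtual space),
 * so MAXMEM = 0x40000000 - 0x08000000 = 0x38000000 = 896 MB: directly
 * mapped low memory is capped at 896 MB, and every vmalloc() allocation
 * must fit into roughly the 128 MB window above it.  With less physical
 * RAM the direct mapping is smaller and the vmalloc window correspondingly
 * larger, which is consistent with the reports later in this thread that
 * machines with 896 MB or less don't hit the problem.
 */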

2001-12-11 20:05:21

by Stephan von Krawczynski

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

On Tue, 11 Dec 2001 19:07:41 +0000 (GMT)
Hugh Dickins <[email protected]> wrote:

> I believe this error comes, not from a (genuine or mistaken) shortage
> of free memory,

Me, too.

> but from shortage or fragmentation of vmalloc's virtual
> address space. Does the patch below (to 2.4.17-pre4-aa1, since I think
> that's what you tried last; easily adaptable to other trees), doubling
> vmalloc's address space (on your 1GB machine or larger), make any difference?
> Perhaps there's a vmalloc leak and this will only delay the error.

At least I think this direction of searching for the bug looks a lot more
promising than a general memory shortage problem. After reviewing
modify_ldt, this looked like the only usable idea for Leigh's problem.

Regards,
Stephan


2001-12-11 22:14:30

by H. Peter Anvin

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

Followup to: <[email protected]>
By author: Hugh Dickins <[email protected]>
In newsgroup: linux.dev.kernel
>
> I believe this error comes, not from a (genuine or mistaken) shortage
> of free memory, but from shortage or fragmentation of vmalloc's virtual
> address space. Does patch below (to 2.4.17-pre4-aa1 since I think that's
> what you tried last; easily adaptible to other trees) doubling vmalloc's
> address space (on your 1GB machine or larger) make any difference?
> Perhaps there's a vmalloc leak and this will only delay the error.
>
> Hugh
>
> --- 1704aa1/arch/i386/kernel/setup.c	Tue Dec 11 15:22:53 2001
> +++ linux/arch/i386/kernel/setup.c	Tue Dec 11 19:01:37 2001
> @@ -835,7 +835,7 @@
>  /*
>   * 128MB for vmalloc and initrd
>   */
> -#define VMALLOC_RESERVE	(unsigned long)(128 << 20)
> +#define VMALLOC_RESERVE	(unsigned long)(256 << 20)
>  #define MAXMEM	(unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
>  #ifdef CONFIG_HIGHMEM_EMULATION
>  #define ORDER_DOWN(x)	((x >> (MAX_ORDER-1)) << (MAX_ORDER-1))
>

Well, for one thing it will screw over just about every Linux boot
loader in existence if you are using an initrd. You need my boot
protocol 2.03 patch *plus* a 2.03-compliant boot loader (e.g. SYSLINUX
1.65-pre2 or later) if you apply this change to a 1 GB or larger
machine.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-12-11 23:00:41

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

On Tue, Dec 11, 2001 at 07:07:41PM +0000, Hugh Dickins wrote:
> On Sat, 8 Dec 2001, Leigh Orf wrote:
> >
> > So I don't know if it's a symptom or a cause, but modify_ldt seems to be
> > triggering the problem. Not being a kernel hacker, I leave the analysis
> > of this to those who are.
> >
> > home[1029]:/home/orf% free
> >              total       used       free     shared    buffers     cached
> > Mem:       1029772     967096      62676          0     443988      98312
> > -/+ buffers/cache:     424796     604976
> > Swap:      2064344          0    2064344
> >
> > modify_ldt(0x1, 0xbffff1fc, 0x10) = -1 ENOMEM (Cannot allocate memory)
>
> I believe this error comes, not from a (genuine or mistaken) shortage
> of free memory, but from shortage or fragmentation of vmalloc's virtual

Definitely agreed. This is the same thing I was wondering about just now
while reading his report.

He always gets vmalloc failures, which is way too suspicious. If VM
memory balancing were the culprit, he should get failures from all the
other allocations too. So it has to be a problem with a shortage of the
address space available to vmalloc, not a problem with the page
allocator.

> address space. Does the patch below (to 2.4.17-pre4-aa1, since I think
> that's what you tried last; easily adaptable to other trees), doubling
> vmalloc's address space (on your 1GB machine or larger), make any difference?
> Perhaps there's a vmalloc leak and this will only delay the error.
>
> Hugh
>
> --- 1704aa1/arch/i386/kernel/setup.c	Tue Dec 11 15:22:53 2001
> +++ linux/arch/i386/kernel/setup.c	Tue Dec 11 19:01:37 2001
> @@ -835,7 +835,7 @@
>  /*
>   * 128MB for vmalloc and initrd
>   */
> -#define VMALLOC_RESERVE	(unsigned long)(128 << 20)
> +#define VMALLOC_RESERVE	(unsigned long)(256 << 20)
>  #define MAXMEM	(unsigned long)(-PAGE_OFFSET-VMALLOC_RESERVE)
>  #ifdef CONFIG_HIGHMEM_EMULATION
>  #define ORDER_DOWN(x)	((x >> (MAX_ORDER-1)) << (MAX_ORDER-1))

Yes, this will tend to hide it.

Even better would be to change fs/ntfs/* to avoid using vmalloc for tons
of little pieces. It's not only a matter of wasting direct-mapped
address space; it's also a matter of running fast, mainly on SMP, with
the IPIs for the TLB flushes...

attr.c:233:     new = ntfs_vmalloc(new_size);
attr.c:235:     ntfs_error("ntfs_insert_run: ntfs_vmalloc(new_size = "
attr.c:458:     rlt = ntfs_vmalloc(rl_size);
inode.c:1297:   rl = ntfs_vmalloc(rlen << sizeof(ntfs_runlist));
inode.c:1638:   rlt = ntfs_vmalloc(rl_size);
inode.c:1942:   rl2 = ntfs_vmalloc(rl2_size);
inode.c:2006:   rlt = ntfs_vmalloc(rl_size);
super.c:810:    rlt = ntfs_vmalloc(rlsize);
super.c:1335:   buf = ntfs_vmalloc(buf_size);
support.h:29:   #include <linux/vmalloc.h>
support.h:35:   #define ntfs_vmalloc(size) vmalloc_32(size)


In short there are three solutions available:

1) don't use ntfs
2) fix ntfs (a sketch of one possible approach follows below)
3) enlarge the vmalloc address space with the above patch, but this won't
   be a final solution because you'll overflow the vmalloc address space
   again once the number of files in your fs doubles
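
A minimal sketch of what option 2 might look like; this is a hypothetical
illustration, not the actual NTFS fix that was eventually made, and the
helper names here are invented:

/* Hypothetical replacement for ntfs_vmalloc(): route small allocations
 * (the common case for runlists) through kmalloc(), which uses the
 * direct mapping and consumes no vmalloc address space, and keep
 * vmalloc_32() only for genuinely large buffers.  The caller must pass
 * the size to the free routine so it can pick the right allocator. */
#include <linux/slab.h>
#include <linux/vmalloc.h>

static inline void *ntfs_malloc(size_t size)
{
	if (size <= PAGE_SIZE)
		return kmalloc(size, GFP_KERNEL);
	return vmalloc_32(size);
}

static inline void ntfs_free(void *addr, size_t size)
{
	if (size <= PAGE_SIZE)
		kfree(addr);
	else
		vfree(addr);
}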

So I'd redirect this report to Anton Altaparmakov <[email protected]>,
and I still have no VM bug report pending on my side.

thanks,

Andrea

2001-12-12 14:51:27

by Leigh Orf

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)


Andrea,

I disabled ntfs and, as you suspected, my problem went away. This was
the case for both 2.4.16 and 2.4.17-pre4-aa1.

Thanks a lot,

Leigh Orf

Andrea Arcangeli wrote:

| Even better would be to change fs/ntfs/* to avoid using vmalloc for tons
| of little pieces. [snip]
|
| In short there are three solutions available:
|
| 1) don't use ntfs
| 2) fix ntfs
| 3) enlarge the vmalloc address space with the above patch, but this won't
|    be a final solution because you'll overflow the vmalloc address space
|    again once the number of files in your fs doubles
|
| [snip]

2001-12-18 14:27:56

by Holger Lubitz

[permalink] [raw]
Subject: Re: 2.4.16 memory badness (reproducible)

Andrea Arcangeli proclaimed:

> He always gets vmalloc failures, which is way too suspicious. If VM
> memory balancing were the culprit, he should get failures from all the
> other allocations too. So it has to be a problem with a shortage of the
> address space available to vmalloc, not a problem with the page
> allocator.

Leigh pointed me to your post in reply to another thread (modify_ldt
failing on highmem machine).

Is there any special vmalloc handling on highmem kernels? I only run
into the problem if I am using high memory support in the kernel. I
haven't been able to reproduce the problem with 896M or less, which
strikes me as slightly odd. Why does _more_ memory trigger "no memory"
failures?

The problem is indeed not VM-specific. The last -ac kernel shows the
problem too (and that one still has the old VM, doesn't it?)

Holger