2005-01-30 17:07:43

by Nix

Subject: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

/proc/meminfo on my UltraSPARC IIi:

MemTotal:       512816 kB
MemFree:         14208 kB
Buffers:         51328 kB
Cached:         163056 kB
SwapCached:          0 kB
Active:         142160 kB
Inactive:       304712 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       512816 kB
LowFree:         14208 kB
SwapTotal:     1557264 kB
SwapFree:      1557176 kB
Dirty:            5256 kB
Writeback:           0 kB
Mapped:       18446744073687883208 kB
Slab:            43928 kB
CommitLimit:   1813672 kB
Committed_AS:   342712 kB
PageTables:       1728 kB
VmallocTotal:  3145728 kB
VmallocUsed:       456 kB
VmallocChunk:  3145272 kB

That Mapped figure looks somewhat inaccurate, being about negative
21GB. The other figures are pretty much right, as far as I can tell.
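
To see how far below zero the Mapped counter has gone, the printed value can be reinterpreted as a signed 64-bit number. The following is only an illustrative sketch: the helper names as_signed64 and kb_to_gib are made up for this example and are not kernel functions, and it assumes the counter is a 64-bit quantity printed as unsigned.

```c
#include <stdint.h>

/*
 * Reinterpret an unsigned 64-bit value as the signed two's-complement
 * number with the same bit pattern, without relying on an
 * implementation-defined narrowing conversion.
 * (Hypothetical helper, for illustration only.)
 */
int64_t as_signed64(uint64_t v)
{
	if (v <= (uint64_t)INT64_MAX)
		return (int64_t)v;
	/* ~v equals 2^64 - 1 - v, which fits in int64_t on this branch. */
	return -(int64_t)(~v) - 1;
}

/* Convert a kB count, as printed in /proc/meminfo, to GiB. */
double kb_to_gib(int64_t kb)
{
	return (double)kb / (1024.0 * 1024.0);
}
```

Passing 18446744073687883208 to as_signed64 gives -21668408, and kb_to_gib turns that into a little under -20.6 GiB.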


(This kernel is compiled with GCC-3.4.3, which might be relevant.)

--
`Blish is clearly in love with language. Unfortunately,
language dislikes him intensely.' --- Russ Allbery


2005-01-31 13:12:27

by Hugh Dickins

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Sun, 30 Jan 2005, Nix wrote:
> /proc/meminfo on my UltraSPARC IIi:
> Mapped: 18446744073687883208 kB
>
> (This kernel is compiled with GCC-3.4.3, which might be relevant.)

Indeed: sparc64 gcc-3.4 seems to be having trouble with that
since 2.6.9: we've been pursuing it offlist, I'll factor you in.

Hugh

2005-01-31 15:32:03

by Nix

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Mon, 31 Jan 2005, Hugh Dickins suggested tentatively:
> On Sun, 30 Jan 2005, Nix wrote:
>> /proc/meminfo on my UltraSPARC IIi:
>> Mapped: 18446744073687883208 kB
>>
>> (This kernel is compiled with GCC-3.4.3, which might be relevant.)
>
> Indeed: sparc64 gcc-3.4 seems to be having trouble with that
> since 2.6.9: we've been pursuing it offlist, I'll factor you in.

Excellent; thank you!

(2.6.10 seems to *run* perfectly well on that box, for what it's worth;
unless this is a symptom of some underlying dark and terrible failure,
it looks like a not-very-important cosmetic bug.)

--
`Blish is clearly in love with language. Unfortunately,
language dislikes him intensely.' --- Russ Allbery

2005-01-31 16:01:45

by Hugh Dickins

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Mon, 31 Jan 2005, Nix wrote:
> On Mon, 31 Jan 2005, Hugh Dickins suggested tentatively:
> > On Sun, 30 Jan 2005, Nix wrote:
> >> /proc/meminfo on my UltraSPARC IIi:
> >> Mapped: 18446744073687883208 kB
> >>
> >> (This kernel is compiled with GCC-3.4.3, which might be relevant.)
> >
> > Indeed: sparc64 gcc-3.4 seems to be having trouble with that
> > since 2.6.9: we've been pursuing it offlist, I'll factor you in.
>
> Excellent; thank you!
>
> (2.6.10 seems to *run* perfectly well on that box, for what it's worth;
> unless this is a symptom of some underlying dark and terrible failure,
> it looks like a not-very-important cosmetic bug.)

A lot of the time you're right and it is just cosmetic. But if memory
gets tight and it should be using swap, it mistakenly fails to do so,
so you may end up getting OOM-killed. Patch below is a temporary hack
workaround against that. The Mapped count also affects when dirty file
writeback kicks in, but the effect there appears to be less serious.

More worrying is, what else might sparc64 gcc-3.4 be getting wrong?
(if it really is to blame: looks to be so but not yet proven)

Hugh

--- 2.6.10/mm/vmscan.c 2004-12-24 21:36:18.000000000 +0000
+++ linux/mm/vmscan.c 2005-01-31 12:44:56.006629152 +0000
@@ -690,6 +690,8 @@ refill_inactive_zone(struct zone *zone,
 	 * is mapped.
 	 */
 	mapped_ratio = (sc->nr_mapped * 100) / total_memory;
+	if (mapped_ratio < 0)
+		mapped_ratio = 78;

 	/*
 	 * Now decide how much we really want to unmap some pages. The mapped
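
For readers wondering why a negative mapped_ratio stops swapping, here is a simplified model of the decision that follows this line in refill_inactive_zone(). It is a sketch from memory of 2.6-era mm/vmscan.c, not a copy of it: the kernel adds mapped_ratio / 2, a "distress" value in the range 0..100, and the vm_swappiness sysctl, and starts reclaiming mapped pages once the sum reaches 100. The function name should_reclaim_mapped is hypothetical, and the swappiness of 60 used below is the usual default, assumed rather than taken from this thread.

```c
#include <stdbool.h>

/*
 * Simplified model (an assumption, to be checked against the real tree):
 *   swap_tendency = mapped_ratio / 2 + distress + swappiness
 * Mapped pages are reclaimed, i.e. swap is used, only when
 * swap_tendency reaches 100.
 */
bool should_reclaim_mapped(long mapped_ratio, long distress, long swappiness)
{
	long swap_tendency = mapped_ratio / 2 + distress + swappiness;

	return swap_tendency >= 100;
}
```

In this model a large negative mapped_ratio keeps the sum below 100 even at maximum distress, which would match the "fails to use swap" symptom. With mapped_ratio forced to 78 and swappiness at 60, the sum is 99 with no distress and reaches 100 as soon as distress is nonzero. Whether that is the reasoning behind the value 78 is not stated in the thread.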

2005-01-31 16:14:02

by Nix

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Mon, 31 Jan 2005, Hugh Dickins said:
> On Mon, 31 Jan 2005, Nix wrote:
>> (2.6.10 seems to *run* perfectly well on that box, for what it's worth;
>> unless this is a symptom of some underlying dark and terrible failure,
>> it looks like a not-very-important cosmetic bug.)
>
> A lot of the time you're right and it is just cosmetic. But if memory
> gets tight and it should be using swap, it mistakenly fails to do so,
> so you may end up getting OOM-killed. Patch below is a temporary hack
> workaround against that.

Odd: this machine seems to be using swap, albeit not very much (and I've
got the swap priorities upside down, as well; whoops, that's probably
been harming performance for, well, years):

Filename   Type       Size    Used   Priority
/dev/sda2  partition  523016  0      1
/dev/sda4  partition  511232  57648  2
/dev/sdb2  partition  523016  0      1

Is the problem that the higher-priority kicking out to swap which should
happen when memory is tight, won't?

> The Mapped count also affects when dirty file
> writeback kicks in, but the effect there appears to be less serious.

... since it kicks in eventually anyway.

> More worrying is, what else might sparc64 gcc-3.4 be getting wrong?

That's a question that's very hard to answer until we know the cause of
this failure and can fix it: then we can go through the RTL or assembler
dumps of a kernel compilation and comb more potential problems out
(or not: it's probably a long and thankless task).

I'll build rmap.c with GCC-3.3 later tonight (if I can find a copy on my
old backups), compare the generated code, and see if anything leaps out
at me.

> --- 2.6.10/mm/vmscan.c 2004-12-24 21:36:18.000000000 +0000
> +++ linux/mm/vmscan.c 2005-01-31 12:44:56.006629152 +0000
> @@ -690,6 +690,8 @@ refill_inactive_zone(struct zone *zone,
>  	 * is mapped.
>  	 */
>  	mapped_ratio = (sc->nr_mapped * 100) / total_memory;
> +	if (mapped_ratio < 0)
> +		mapped_ratio = 78;
>
>  	/*
>  	 * Now decide how much we really want to unmap some pages. The mapped

`78'? A hack indeed! :)

--
`Blish is clearly in love with language. Unfortunately,
language dislikes him intensely.' --- Russ Allbery

2005-01-31 17:09:52

by Hugh Dickins

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Mon, 31 Jan 2005, Nix wrote:
>
> Odd: this machine seems to be using swap, albeit not very much (and I've
> got the swap priorities upside down, as well; whoops, that's probably
> been harming performance for, well, years):
>
> Filename   Type       Size    Used   Priority
> /dev/sda2  partition  523016  0      1
> /dev/sda4  partition  511232  57648  2
> /dev/sdb2  partition  523016  0      1
>
> Is the problem that the higher-priority kicking out to swap which should
> happen when memory is tight, won't?

I had thought that it was any kicking out to swap - apart from kicking
tmpfs/shmem pages to swap, which should happen independently of Mapped.

If you're not using tmpfs or shmem, then I'm surprised by that figure.
There was 88 kB out to swap in your original /proc/meminfo, which we
may suppose was before Mapped went negative; but above shows more since.

> I'll build rmap.c with GCC-3.3 later tonight (if I can find a copy on my
> old backups), compare the generated code, and see if anything leaps out
> at me.

Worth doing, thank you. Rene has sent us the GCC-3.4 output,
but I've not spotted anything the matter with it yet.

Hugh

2005-01-31 17:32:25

by Nix

Subject: Re: 2.6.10: SPARC64 mapped figure goes unsignedly negative...

On Mon, 31 Jan 2005, Hugh Dickins uttered the following:
> On Mon, 31 Jan 2005, Nix wrote:
>> Filename   Type       Size    Used   Priority
>> /dev/sda2  partition  523016  0      1
>> /dev/sda4  partition  511232  57648  2
>> /dev/sdb2  partition  523016  0      1
>>
>> Is the problem that the higher-priority kicking out to swap which should
>> happen when memory is tight, won't?
>
> I had thought that it was any kicking out to swap - apart from kicking
> tmpfs/shmem pages to swap, which should happen independently of Mapped.
>
> If you're not using tmpfs or shmem, then I'm surprised by that figure.

Oh. Yes, tmpfs might just about explain it:

58320 /tmp

So it looks like I have a swap-free box for a time. I guess I'd better
be careful... :)

> There was 88 kB out to swap in your original /proc/meminfo, which we
> may suppose was before Mapped went negative; but above shows more since.

Yes, I expect so. It must've gone negative really rather early: and note
that it's some distance below 2^64 by now, so it's still falling. If I
wait for a billion years or so it might wrap around. :)

--
`Blish is clearly in love with language. Unfortunately,
language dislikes him intensely.' --- Russ Allbery