2009-04-02 08:05:36

by Michael Ellerman

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

On Wed, 2009-04-01 at 17:18 -0600, Chris Friesen wrote:
> Resending due to lack of response to original post.

Hi Chris,

You'll probably get a more useful response on lkml. You CC'ed
linux-kernel-owner originally :)

> I was validating some code dealing with /proc/<pid>/maps on 2.6.29 and
> was surprised when it failed. It turns out that at least on my ppc64 G5
> machine the offset value for the last entry is strange--it shows up as a
> 64-bit value even though the process itself is only 32-bit.
>
> This behaviour also shows up in 2.6.25, but doesn't in 2.6.14. I
> haven't yet tested anything else in between.
>
> [cfriesen@localhost cfriesen]$ cat /proc/self/maps
> 00100000-00103000 r-xp 00100000 00:00 0 [vdso]
> 0fe70000-0ffbf000 r-xp 00000000 08:03 4312393 /lib/tls/libc-2.3.3.so
> 0ffbf000-0ffc0000 ---p 0014f000 08:03 4312393 /lib/tls/libc-2.3.3.so
> 0ffc0000-0ffc2000 r--p 00150000 08:03 4312393 /lib/tls/libc-2.3.3.so
> 0ffc2000-0ffc6000 rwxp 00152000 08:03 4312393 /lib/tls/libc-2.3.3.so
> 0ffc6000-0ffc8000 rwxp 0ffc6000 00:00 0
> 0ffd0000-0ffec000 r-xp 00000000 08:03 4309011 /lib/ld-2.3.3.so
> 0fff0000-0fff1000 r--p 00020000 08:03 4309011 /lib/ld-2.3.3.so
> 0fff1000-0fff2000 rwxp 00021000 08:03 4309011 /lib/ld-2.3.3.so
> 10000000-10004000 r-xp 00000000 08:03 917536 /bin/cat
> 10013000-10015000 rwxp 00003000 08:03 917536 /bin/cat
> 10015000-10036000 rwxp 10015000 00:00 0 [heap]
> f7deb000-f7feb000 r--p 00000000 08:03 2560322 /usr/lib/locale/locale-archive
> f7feb000-f7fec000 rw-p f7feb000 00:00 0
> ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]
>
> I'm at a loss to explain what's going on here. Anyone got any ideas?

It looks like for vmas that don't have a vm_file (like the stack),
vm_pgoff is basically "internal use" - and so can be > 32 bit.

cheers

--
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person



2009-04-02 12:46:10

by Hugh Dickins

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

On Thu, 2 Apr 2009, Michael Ellerman wrote:
> On Wed, 2009-04-01 at 17:18 -0600, Chris Friesen wrote:
> > Resending due to lack of response to original post.
>
> Hi Chris,
>
> You'll probably get a more useful response on lkml. You CC'ed
> linux-kernel-owner originally :)

Thanks.

>
> > I was validating some code dealing with /proc/<pid>/maps on 2.6.29 and
> > was surprised when it failed. It turns out that at least on my ppc64 G5
> > machine the offset value for the last entry is strange--it shows up as a
> > 64-bit value even though the process itself is only 32-bit.
> >
> > This behaviour also shows up in 2.6.25, but doesn't in 2.6.14. I
> > haven't yet tested anything else in between.
> >
> > [cfriesen@localhost cfriesen]$ cat /proc/self/maps
> > 00100000-00103000 r-xp 00100000 00:00 0 [vdso]
> > 0fe70000-0ffbf000 r-xp 00000000 08:03 4312393 /lib/tls/libc-2.3.3.so
> > 0ffbf000-0ffc0000 ---p 0014f000 08:03 4312393 /lib/tls/libc-2.3.3.so
> > 0ffc0000-0ffc2000 r--p 00150000 08:03 4312393 /lib/tls/libc-2.3.3.so
> > 0ffc2000-0ffc6000 rwxp 00152000 08:03 4312393 /lib/tls/libc-2.3.3.so
> > 0ffc6000-0ffc8000 rwxp 0ffc6000 00:00 0
> > 0ffd0000-0ffec000 r-xp 00000000 08:03 4309011 /lib/ld-2.3.3.so
> > 0fff0000-0fff1000 r--p 00020000 08:03 4309011 /lib/ld-2.3.3.so
> > 0fff1000-0fff2000 rwxp 00021000 08:03 4309011 /lib/ld-2.3.3.so
> > 10000000-10004000 r-xp 00000000 08:03 917536 /bin/cat
> > 10013000-10015000 rwxp 00003000 08:03 917536 /bin/cat
> > 10015000-10036000 rwxp 10015000 00:00 0 [heap]
> > f7deb000-f7feb000 r--p 00000000 08:03 2560322 /usr/lib/locale/locale-archive
> > f7feb000-f7fec000 rw-p f7feb000 00:00 0
> > ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]
> >
> > I'm at a loss to explain what's going on here. Anyone got any ideas?
>
> It looks like for vmas that don't have a vm_file (like the stack),
> vm_pgoff is basically "internal use" - and so can be > 32 bit.

Yes, it's just a cosmetic blemish, which comes from how the args on
stack are initially prepared in a 64-bit space, then moved into place
for the 32-bit task - the anon vm_pgoff still reflects the original
location, precisely in order to track pages despite movements.

(2.6.14 had the same use of anon vm_pgoff, but args on stack
were limited, and inserted directly into the 32-bit space.)

Chris isn't the first to be concerned by that: there's a patch in
-mm which just shows 0 instead of anon vm_pgoff in /proc/<pid>/maps
output. That patch is on akpm's list for 2.6.30 merge, but I think
hasn't gone to Linus yet: expect it in a later batch.

Hugh
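
To make the arithmetic behind the odd [stack] line concrete, here is a minimal Python sketch. It assumes 4 KiB pages (consistent with the 4 KiB-aligned addresses in the listing) and that the offset column is vm_pgoff shifted left by the page shift. The variable names are invented for this illustration and are not taken from any kernel source.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages on the reporting machine

# The [stack] line from the report quoted above.
line = "ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]"

addr_range, perms, offset_hex, dev, inode = line.split()[:5]
start, end = (int(x, 16) for x in addr_range.split("-"))
offset = int(offset_hex, 16)

# The offset column is a byte offset; shift it back to get a page index.
pgoff = offset >> PAGE_SHIFT

print("page index       :", hex(pgoff))
print("bits in offset   :", offset.bit_length())
print("fits in 32 bits  :", offset <= 0xFFFFFFFF)
print("size of mapping  :", hex(end - start))
```

The printed offset needs more than 32 bits even though both ends of the mapping fit in 32 bits, which matches the explanation that the value records where the stack pages were first set up rather than where the vma ended up.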

2009-04-02 17:53:40

by Chris Friesen

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

Hugh Dickins wrote:
> On Thu, 2 Apr 2009, Michael Ellerman wrote:
>> On Wed, 2009-04-01 at 17:18 -0600, Chris Friesen wrote:
>>> Resending due to lack of response to original post.
>> Hi Chris,
>>
>> You'll probably get a more useful response on lkml. You CC'ed
>> linux-kernel-owner originally :)
>
> Thanks.

Thanks from me too. (Oops.)

>>> I was validating some code dealing with /proc/<pid>/maps on 2.6.29 and
>>> was surprised when it failed. It turns out that at least on my ppc64 G5
>>> machine the offset value for the last entry is strange--it shows up as a
>>> 64-bit value even though the process itself is only 32-bit.
>>>
>>> This behaviour also shows up in 2.6.25, but doesn't in 2.6.14. I
>>> haven't yet tested anything else in between.
>>>
>>> [cfriesen@localhost cfriesen]$ cat /proc/self/maps
>>> 00100000-00103000 r-xp 00100000 00:00 0 [vdso]
>>> 0fe70000-0ffbf000 r-xp 00000000 08:03 4312393 /lib/tls/libc-2.3.3.so
>>> 0ffbf000-0ffc0000 ---p 0014f000 08:03 4312393 /lib/tls/libc-2.3.3.so
>>> 0ffc0000-0ffc2000 r--p 00150000 08:03 4312393 /lib/tls/libc-2.3.3.so
>>> 0ffc2000-0ffc6000 rwxp 00152000 08:03 4312393 /lib/tls/libc-2.3.3.so
>>> 0ffc6000-0ffc8000 rwxp 0ffc6000 00:00 0
>>> 0ffd0000-0ffec000 r-xp 00000000 08:03 4309011 /lib/ld-2.3.3.so
>>> 0fff0000-0fff1000 r--p 00020000 08:03 4309011 /lib/ld-2.3.3.so
>>> 0fff1000-0fff2000 rwxp 00021000 08:03 4309011 /lib/ld-2.3.3.so
>>> 10000000-10004000 r-xp 00000000 08:03 917536 /bin/cat
>>> 10013000-10015000 rwxp 00003000 08:03 917536 /bin/cat
>>> 10015000-10036000 rwxp 10015000 00:00 0 [heap]
>>> f7deb000-f7feb000 r--p 00000000 08:03 2560322 /usr/lib/locale/locale-archive
>>> f7feb000-f7fec000 rw-p f7feb000 00:00 0
>>> ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]

> Chris isn't the first to be concerned by that: there's a patch in
> -mm which just shows 0 instead of anon vm_pgoff in /proc/<pid>/maps
> output. That patch is on akpm's list for 2.6.30 merge, but I think
> hasn't gone to Linus yet: expect it in a later batch.

Alternately, what about just making the offset for the stack match the
starting address of the VMA? That way it would look the same as other
anonymous areas, and as a bonus would look the same as previous
releases. Arguably, /proc/<pid>/maps should count as userspace-visible API.

Chris

2009-04-02 19:12:08

by Hugh Dickins

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

On Thu, 2 Apr 2009, Chris Friesen wrote:
> Hugh Dickins wrote:
> > > > f7feb000-f7fec000 rw-p f7feb000 00:00 0
> > > > ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]
>
> > Chris isn't the first to be concerned by that: there's a patch in
> > -mm which just shows 0 instead of anon vm_pgoff in /proc/<pid>/maps
> > output. That patch is on akpm's list for 2.6.30 merge, but I think
> > hasn't gone to Linus yet: expect it in a later batch.
>
> Alternately, what about just making the offset for the stack match the
> starting address of the VMA?

The rmap code for locating anonymous pages, even after the vma has
been moved meanwhile, depends on vma->vm_pgoff. There is no point
in making that more complicated for this.

For display purposes only? Well, yes, we could have done that,
but why bother? It wouldn't be adding any information, and
might raise a question of identifying "the" stack to do that to.

> That way it would look the same as other anonymous areas,

The stack will be looking the same as other anonymous areas:
they'll all be showing 00000000 there.

This is a cosmetic matter, not worth more than a couple of lines of
code: I suggested masking off the high bits in the display, but when
KAMEZAWA-san suggested just showing 0, it was hard to argue against
his brutal simplicity.

> and as a bonus would look the same as previous releases.
> Arguably, /proc/<pid>/maps should count as userspace-visible API.

Consider this change a fix: it used to show 00000000 before 2.6.7.

See http://lkml.org/lkml/2009/1/13/331 for one of the threads
on the subject - but you've not tempted me to reopen it!

Hugh
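
The display options discussed in this message can be compared side by side with a small Python sketch. It only models what the offset column would look like and is not the kernel code or the -mm patch. The function name, the policy names, the 32-bit mask, and the 4 KiB page shift are all assumptions made for illustration.

```python
def shown_offset(start, pgoff, has_file, policy, page_shift=12):
    """Return the offset column as hex text under a given display policy.

    start      -- start address of the mapping
    pgoff      -- page index stored for the mapping
    has_file   -- True for file-backed mappings, which always show the real offset
    policy     -- hypothetical name for how anonymous mappings are displayed
    """
    raw = pgoff << page_shift
    if has_file or policy == "raw":
        value = raw                    # behaviour seen in the report
    elif policy == "mask32":
        value = raw & 0xFFFFFFFF       # mask off the high bits (assumed 32-bit mask)
    elif policy == "zero":
        value = 0                      # always show 0 for anonymous areas
    elif policy == "vma_start":
        value = start                  # show the start address of the mapping
    else:
        raise ValueError("unknown policy: " + policy)
    return format(value, "08x")


# The [stack] mapping from the report, under each policy.
for policy in ("raw", "mask32", "zero", "vma_start"):
    print(policy, shown_offset(0xffe6d000, 0xffffffeb, False, policy))
```

Only the "zero" policy is independent of both the stored page index and the address, which is one way to read the "brutal simplicity" remark above.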

2009-04-02 20:46:20

by Chris Friesen

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

Hugh Dickins wrote:

> This is a cosmetic matter, not worth more than a couple of lines of
> code: I suggested masking off the high bits in the display, but when
> KAMEZAWA-san suggested just showing 0, it was hard to argue against
> his brutal simplicity.

<snip>

> Consider this change a fix: it used to show 00000000 before 2.6.7.
>
> See http://lkml.org/lkml/2009/1/13/331 for one of the threads
> on the subject - but you've not tempted me to reopen it!

Okay, fair enough. I'll change my code to deal with it. Thanks for the
explanation.

Chris
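
One way userspace code might "deal with it" is to parse the offset as an arbitrary-width hex number and ignore it for mappings without a backing file. The Python sketch below is an illustration only, not Chris's actual code. It assumes that an inode of 0 marks a mapping with no backing file, which is a heuristic, and the names MapEntry and parse_maps_line are hypothetical.

```python
from typing import NamedTuple, Optional


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: Optional[int]  # None when the mapping has no backing file
    dev: str
    inode: int
    path: str


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a MapEntry."""
    fields = line.split(None, 5)
    if len(fields) < 5:
        raise ValueError("not a maps line: %r" % line)
    start_s, end_s = fields[0].split("-")
    inode = int(fields[4])
    path = fields[5].strip() if len(fields) > 5 else ""
    # Assumption: inode 0 means no backing file, so the offset is kernel-internal.
    offset = int(fields[2], 16) if inode != 0 else None
    return MapEntry(int(start_s, 16), int(end_s, 16), fields[1],
                    offset, fields[3], inode, path)


stack = parse_maps_line("ffe6d000-ffe82000 rw-p ffffffeb000 00:00 0 [stack]")
cat = parse_maps_line("10013000-10015000 rwxp 00003000 08:03 917536 /bin/cat")
print(stack.path, stack.offset)
print(cat.path, hex(cat.offset))
```

Written this way, the parser gives the same result whether the kernel prints the raw value, a masked value, or 0 for anonymous areas.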

2009-04-05 18:52:29

by Hugh Dickins

Subject: Re: Resend: /proc/<pid>/maps offset output broken in 2.6.29

On Thu, 2 Apr 2009, Chris Friesen wrote:
> Hugh Dickins wrote:
>
> > This is a cosmetic matter, not worth more than a couple of lines of
> > code: I suggested masking off the high bits in the display, but when
> > KAMEZAWA-san suggested just showing 0, it was hard to argue against
> > his brutal simplicity.
>
> <snip>
>
> > Consider this change a fix: it used to show 00000000 before 2.6.7.
> >
> > See http://lkml.org/lkml/2009/1/13/331 for one of the threads
> > on the subject - but you've not tempted me to reopen it!
>
> Okay, fair enough. I'll change my code to deal with it. Thanks for the
> explanation.

Oh, I thought you were arguing hypotheticals. This is more serious,
that you have some userspace code which depended on the 2.6.8-2.6.29
way of filling that field, and now we're about to break it.

Please, would you share what you were doing with the vm_pgoff of an
anonymous area? I won't pretend: I am indeed hoping to show that
what you were doing before was already broken, so that we can safely
go ahead and break it some more!

Hugh