2004-06-25 15:42:20

by Makhlis, Lev

[permalink] [raw]
Subject: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

I see that shm_tot (the total number of pages in shm segments) in
ipc/shm.c is defined as int, even though its max value (shmall) is size_t.

Admittedly, it only matters for systems with >8TB memory, but shouldn't
shm_tot also be size_t? The attached patch makes it so.


Attachments:
shmtot64.patch (428.00 B)

2004-06-25 16:40:22

by Andries Brouwer

[permalink] [raw]
Subject: Re: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

On Fri, Jun 25, 2004 at 10:41:13AM -0500, Makhlis, Lev wrote:

> I see that shm_tot (the total number of pages in shm segments) in
> ipc/shm.c is defined as int, even though its max value (shmall) is size_t.
>
> Admittedly, it only matters for systems with >8TB memory, but shouldn't
> shm_tot also be size_t? The attached patch makes it so.

> -static int shm_tot; /* total number of shared memory pages */
> +static size_t shm_tot; /* total number of shared memory pages */

First, please avoid attachments.

Secondly, this makes shm_tot unsigned. Have you checked all places
where it occurs in an inequality to see whether the semantics did
change? (It looks OK.)

Thirdly, shm_tot is transmitted to userspace (via the SHM_INFO ioctl)
as an unsigned long. If it is necessary to make it larger, then we
must do something with this ioctl. For example, return -1 there
in case the actual value does not fit in an unsigned long.


2004-06-25 17:43:02

by Makhlis, Lev

[permalink] [raw]
Subject: RE: [PATCH] [SYSVIPC] Change shm_tot from int to size_t


Andries Brouwer wrote:
> On Fri, Jun 25, 2004 at 10:41:13AM -0500, Makhlis, Lev wrote:
>
> > I see that shm_tot (the total number of pages in shm segments) in
> > ipc/shm.c is defined as int, even though its max value
> (shmall) is size_t.
> >
> > Admittedly, it only matters for systems with >8TB memory,
> but shouldn't
> > shm_tot also be size_t? The attached patch makes it so.
>
> > -static int shm_tot; /* total number of shared memory pages */
> > +static size_t shm_tot; /* total number of shared memory pages */
>
> First, please avoid attachments.
>
> Secondly, this makes shm_tot unsigned. Have you checked all places
> where it occurs in an inequality to see whether the semantics did
> change? (It looks OK.)

Yes -- it looked to me like the semantics would be more correct, since
currently, "if (shm_tot + numpages >= shm_ctlall)" will break when
shm_tot wraps around MAX_INT and becomes negative.

>
> Thirdly, shm_tot is transmitted to userspace (via the SHM_INFO ioctl)
> as an unsigned long. If it is necessary to make it larger, then we
> must do something with this ioctl. For example, return -1 there
> in case the actual value does not fit in an unsigned long.

The SHM_INFO shmctl is actually how I found it in the first place.
But we have the same situation with many other values. For example,
shm_ctlmax, shm_ctlall and shm_segsz can all potentially be 64-bit wide
in the kernel and are exported into potentially 32-bit userspace values.
We don't return -1 for any of those if they don't fit. Is there a
special reason to do it in this case?

2004-06-25 20:39:27

by Andries Brouwer

[permalink] [raw]
Subject: Re: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

On Fri, Jun 25, 2004 at 12:42:26PM -0500, Makhlis, Lev wrote:

> > Thirdly, shm_tot is transmitted to userspace (via the SHM_INFO ioctl)
> > as an unsigned long. If it is necessary to make it larger, then we
> > must do something with this ioctl. For example, return -1 there
> > in case the actual value does not fit in an unsigned long.
>
> The SHM_INFO shmctl is actually how I found it in the first place.
> But we have the same situation with many other values. For example,
> shm_ctlmax, shm_ctlall and shm_segsz can all potentially be 64-bit wide
> in the kernel and are exported into potentially 32-bit userspace values.
> We don't return -1 for any of those if they don't fit. Is there a
> special reason to do it in this case?

There is a good reason to do it always.
If one truncates, then the result is always unreliable.
If one replaces a too large value by -1, then any other value is reliable.

2004-06-25 21:56:59

by Makhlis, Lev

[permalink] [raw]
Subject: RE: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

Andries Brouwer wrote:
> On Fri, Jun 25, 2004 at 12:42:26PM -0500, Makhlis, Lev wrote:
>
> > > Thirdly, shm_tot is transmitted to userspace (via the
> SHM_INFO ioctl)
> > > as an unsigned long. If it is necessary to make it larger, then we
> > > must do something with this ioctl. For example, return -1 there
> > > in case the actual value does not fit in an unsigned long.
> >
> > The SHM_INFO shmctl is actually how I found it in the first place.
> > But we have the same situation with many other values. For example,
> > shm_ctlmax, shm_ctlall and shm_segsz can all potentially be
> 64-bit wide
> > in the kernel and are exported into potentially 32-bit
> userspace values.
> > We don't return -1 for any of those if they don't fit. Is there a
> > special reason to do it in this case?
>
> There is a good reason to do it always.
> If one truncates, then the result is always unreliable.
> If one replaces a too large value by -1, then any other value
> is reliable.
>

I've looked around and couldn't find anything that returns -1 as a value
(as opposed to returning an error) when the value won't fit in 32 bits.
Here's what I've found (assuming a 32-bit app in all cases):
* sys_statfs, which fails with EOVERFLOW if any of the values won't fit
* sys_sysinfo, which scales the values down up to a PAGE_SIZE factor,
and silently truncates them if they are still too large
* BLKGETSIZE ioctl, which fails with EFBIG if the kernel itself is 32-bit,
but silently truncates if the kernel is 64-bit
* BLKGETSIZE64 ioctl, which always silently truncates
* HDIO_GETGEO ioctl (.start is ulong), which always silently truncates
I'm sure there are some other examples...

What do you think about following the example of sys_statfs in sys_shmctl?

2004-06-26 00:14:59

by Andries Brouwer

[permalink] [raw]
Subject: Re: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

On Fri, Jun 25, 2004 at 04:56:32PM -0500, Makhlis, Lev wrote:

> What do you think about following the example of sys_statfs in sys_shmctl?

I don't mind. It is fairly unimportant - just the informative stuff
printed by

% ipcs -m -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 32768
max total shared memory (kbytes) = 8388608
min seg size (bytes) = 1

(but it seems a pity to deny user space this information if only
one field is large).

2004-06-28 16:23:24

by Makhlis, Lev

[permalink] [raw]
Subject: RE: [PATCH] [SYSVIPC] Change shm_tot from int to size_t

Andries Brouwer wrote:
> On Fri, Jun 25, 2004 at 04:56:32PM -0500, Makhlis, Lev wrote:
>
> > What do you think about following the example of sys_statfs
> in sys_shmctl?
>
> I don't mind. It is fairly unimportant - just the informative stuff
> printed by
>
> % ipcs -m -l
>
> ------ Shared Memory Limits --------
> max number of segments = 4096
> max seg size (kbytes) = 32768
> max total shared memory (kbytes) = 8388608
> min seg size (bytes) = 1
>
> (but it seems a pity to deny user space this information if only
> one field is large).
>

Yes, I thought it might be a little heavy-handed, but I figured, if it's
good enough for statfs... Also, the values in "ipcs -m -l"
(1) come from IPC_INFO shmctl (separate from SHM_INFO) and
(2) are also available in /proc/sys/kernel (except for shmmin, which is
constant anyway).

On a tangential issue, I think it would be nice if an app could get the
actual values it wants, rather than just know that they are unreliable.
With statfs, that's possible thanks to the existence of statfs64, which
is 64-bit wide even in 32-bit world.
With {shm,sem,msg}ctl, we do have struct shminfo64 etc., but I'm not
sure why they exist at all, as they seem always to have the same bitness
as the default structures.
On the other hand, sys_shmctl() has forever had a comment about replacing
it with a /proc interface. This has been done for {SHM,SEM,MSG}_STAT calls
(/proc/sysvipc/{shm,sem,msg}), but not for {IPC,SHM,SEM,MSG}_INFO calls.
Some of the *_INFO values are exposed in /proc, but only the sysctl-able
ones.
So I'm writing a patch to create /proc/sysvipc/{shm,sem,msg}info. This
will allow ipcs and anyone else to access the values using a
bitness-agnostic
interface. (But exactly because these values aren't that important, I'm
afraid no one may care enough to apply it.)