The sysinfo member does not have any "available ram" field and
the bufferram field is not much helpful either, to get a rough
estimate of available ram needed for allocation.
One needs to parse MemAvailable field separately from /proc/meminfo
to get this info instead of directly getting if from sysinfo itself.
Thus, this patch introduce a new field as availram in sysinfo
so that all the info total/free/available can be retrieved from
one place itself.
There are couple of places in kernel as well where this can be improved.
For example:
In fs/proc/meminfo.c:
meminfo_proc_show:
si_meminfo(&i);
available = si_mem_available();
Now with this change the second call be avoided.
Thus, we can directly do:
show_val_kb(m, "MemAvailable: ", i.availram);
Note, this also requires update in procfs for free and other commands.
Like in free command as well we frist call sysinfo then again parse
/proc/meminfo to get available field.
This can be avoided too with higher kernel version.
A sample output with single sysinfo call is shown below:
Total RAM: 248376 kB
Free RAM: 231540 kB
Avail RAM: 230448 kB
Signed-off-by: Pintu Kumar <[email protected]>
Signed-off-by: Pintu Agarwal <[email protected]>
---
include/uapi/linux/sysinfo.h | 1 +
kernel/sys.c | 4 ++++
mm/page_alloc.c | 2 ++
3 files changed, 7 insertions(+)
diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
index 435d5c2..6e77e90 100644
--- a/include/uapi/linux/sysinfo.h
+++ b/include/uapi/linux/sysinfo.h
@@ -12,6 +12,7 @@ struct sysinfo {
__kernel_ulong_t freeram; /* Available memory size */
__kernel_ulong_t sharedram; /* Amount of shared memory */
__kernel_ulong_t bufferram; /* Memory used by buffers */
+ __kernel_ulong_t availram; /* Memory available for allocation */
__kernel_ulong_t totalswap; /* Total swap space size */
__kernel_ulong_t freeswap; /* swap space still available */
__u16 procs; /* Number of current processes */
diff --git a/kernel/sys.c b/kernel/sys.c
index ecc4cf0..7059515 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2671,6 +2671,7 @@ static int do_sysinfo(struct sysinfo *info)
info->freeram <<= bitcount;
info->sharedram <<= bitcount;
info->bufferram <<= bitcount;
+ info->availram <<= bitcount;
info->totalswap <<= bitcount;
info->freeswap <<= bitcount;
info->totalhigh <<= bitcount;
@@ -2700,6 +2701,7 @@ struct compat_sysinfo {
u32 freeram;
u32 sharedram;
u32 bufferram;
+ u32 availram;
u32 totalswap;
u32 freeswap;
u16 procs;
@@ -2732,6 +2734,7 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
s.freeram >>= bitcount;
s.sharedram >>= bitcount;
s.bufferram >>= bitcount;
+ s.availram >>= bitcount;
s.totalswap >>= bitcount;
s.freeswap >>= bitcount;
s.totalhigh >>= bitcount;
@@ -2747,6 +2750,7 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
s_32.freeram = s.freeram;
s_32.sharedram = s.sharedram;
s_32.bufferram = s.bufferram;
+ s_32.availram = s.availram;
s_32.totalswap = s.totalswap;
s_32.freeswap = s.freeswap;
s_32.procs = s.procs;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b5d62e1..5013c75 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5786,6 +5786,7 @@ void si_meminfo(struct sysinfo *val)
val->sharedram = global_node_page_state(NR_SHMEM);
val->freeram = global_zone_page_state(NR_FREE_PAGES);
val->bufferram = nr_blockdev_pages();
+ val->availram = si_mem_available();
val->totalhigh = totalhigh_pages();
val->freehigh = nr_free_highpages();
val->mem_unit = PAGE_SIZE;
@@ -5807,6 +5808,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
val->totalram = managed_pages;
val->sharedram = node_page_state(pgdat, NR_SHMEM);
val->freeram = sum_zone_node_page_state(nid, NR_FREE_PAGES);
+ val->availram = si_mem_available();
#ifdef CONFIG_HIGHMEM
for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
struct zone *zone = &pgdat->node_zones[zone_type];
--
2.7.4
On Thu, Jan 06, 2022 at 09:04:10PM +0530, Pintu Kumar wrote:
> The sysinfo member does not have any "available ram" field and
> the bufferram field is not much helpful either, to get a rough
> estimate of available ram needed for allocation.
>
> One needs to parse MemAvailable field separately from /proc/meminfo
> to get this info instead of directly getting if from sysinfo itself.
>
> Thus, this patch introduce a new field as availram in sysinfo
> so that all the info total/free/available can be retrieved from
> one place itself.
>
> There are couple of places in kernel as well where this can be improved.
> For example:
> In fs/proc/meminfo.c:
> meminfo_proc_show:
> si_meminfo(&i);
> available = si_mem_available();
> Now with this change the second call be avoided.
> Thus, we can directly do:
> show_val_kb(m, "MemAvailable: ", i.availram);
>
> Note, this also requires update in procfs for free and other commands.
> Like in free command as well we frist call sysinfo then again parse
> /proc/meminfo to get available field.
> This can be avoided too with higher kernel version.
>
> A sample output with single sysinfo call is shown below:
> Total RAM: 248376 kB
> Free RAM: 231540 kB
> Avail RAM: 230448 kB
>
> Signed-off-by: Pintu Kumar <[email protected]>
> Signed-off-by: Pintu Agarwal <[email protected]>
> ---
> include/uapi/linux/sysinfo.h | 1 +
> kernel/sys.c | 4 ++++
> mm/page_alloc.c | 2 ++
> 3 files changed, 7 insertions(+)
>
> diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> index 435d5c2..6e77e90 100644
> --- a/include/uapi/linux/sysinfo.h
> +++ b/include/uapi/linux/sysinfo.h
> @@ -12,6 +12,7 @@ struct sysinfo {
> __kernel_ulong_t freeram; /* Available memory size */
> __kernel_ulong_t sharedram; /* Amount of shared memory */
> __kernel_ulong_t bufferram; /* Memory used by buffers */
> + __kernel_ulong_t availram; /* Memory available for allocation */
> __kernel_ulong_t totalswap; /* Total swap space size */
> __kernel_ulong_t freeswap; /* swap space still available */
> __u16 procs; /* Number of current processes */
Hi! Sorry, but I don't understand -- the sysinfo structure seems to
be part of user API, no? Don't we break it up here?
On Thu, 6 Jan 2022 at 21:41, Cyrill Gorcunov <[email protected]> wrote:
>
> On Thu, Jan 06, 2022 at 09:04:10PM +0530, Pintu Kumar wrote:
> > The sysinfo member does not have any "available ram" field and
> > the bufferram field is not much helpful either, to get a rough
> > estimate of available ram needed for allocation.
> >
> > One needs to parse MemAvailable field separately from /proc/meminfo
> > to get this info instead of directly getting if from sysinfo itself.
> >
> > Thus, this patch introduce a new field as availram in sysinfo
> > so that all the info total/free/available can be retrieved from
> > one place itself.
> >
> > There are couple of places in kernel as well where this can be improved.
> > For example:
> > In fs/proc/meminfo.c:
> > meminfo_proc_show:
> > si_meminfo(&i);
> > available = si_mem_available();
> > Now with this change the second call be avoided.
> > Thus, we can directly do:
> > show_val_kb(m, "MemAvailable: ", i.availram);
> >
> > Note, this also requires update in procfs for free and other commands.
> > Like in free command as well we frist call sysinfo then again parse
> > /proc/meminfo to get available field.
> > This can be avoided too with higher kernel version.
> >
> > A sample output with single sysinfo call is shown below:
> > Total RAM: 248376 kB
> > Free RAM: 231540 kB
> > Avail RAM: 230448 kB
> >
> > Signed-off-by: Pintu Kumar <[email protected]>
> > Signed-off-by: Pintu Agarwal <[email protected]>
> > ---
> > include/uapi/linux/sysinfo.h | 1 +
> > kernel/sys.c | 4 ++++
> > mm/page_alloc.c | 2 ++
> > 3 files changed, 7 insertions(+)
> >
> > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > index 435d5c2..6e77e90 100644
> > --- a/include/uapi/linux/sysinfo.h
> > +++ b/include/uapi/linux/sysinfo.h
> > @@ -12,6 +12,7 @@ struct sysinfo {
> > __kernel_ulong_t freeram; /* Available memory size */
> > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > + __kernel_ulong_t availram; /* Memory available for allocation */
> > __kernel_ulong_t totalswap; /* Total swap space size */
> > __kernel_ulong_t freeswap; /* swap space still available */
> > __u16 procs; /* Number of current processes */
>
> Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> be part of user API, no? Don't we break it up here?
Yes, the corresponding user space header /usr/include/linux/sysinfo.h
also needs to be updated.
When we generate the kernel header it will be updated automatically.
On Thu, Jan 06, 2022 at 10:19:55PM +0530, Pintu Agarwal wrote:
> > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > index 435d5c2..6e77e90 100644
> > > --- a/include/uapi/linux/sysinfo.h
> > > +++ b/include/uapi/linux/sysinfo.h
> > > @@ -12,6 +12,7 @@ struct sysinfo {
> > > __kernel_ulong_t freeram; /* Available memory size */
> > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > __kernel_ulong_t totalswap; /* Total swap space size */
> > > __kernel_ulong_t freeswap; /* swap space still available */
> > > __u16 procs; /* Number of current processes */
> >
> > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> > be part of user API, no? Don't we break it up here?
>
> Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> also needs to be updated.
> When we generate the kernel header it will be updated automatically.
Wait. The userspace may pass old structure here, and in result we
return incorrect layout which won't match old one, no? Old binary
code has no clue about this header update.
From: Pintu Agarwal
> Sent: 06 January 2022 16:50
>
> On Thu, 6 Jan 2022 at 21:41, Cyrill Gorcunov <[email protected]> wrote:
> >
> > On Thu, Jan 06, 2022 at 09:04:10PM +0530, Pintu Kumar wrote:
> > > The sysinfo member does not have any "available ram" field and
> > > the bufferram field is not much helpful either, to get a rough
> > > estimate of available ram needed for allocation.
> > >
> > > One needs to parse MemAvailable field separately from /proc/meminfo
> > > to get this info instead of directly getting if from sysinfo itself.
> > >
> > > Thus, this patch introduce a new field as availram in sysinfo
> > > so that all the info total/free/available can be retrieved from
> > > one place itself.
> > >
> > > There are couple of places in kernel as well where this can be improved.
> > > For example:
> > > In fs/proc/meminfo.c:
> > > meminfo_proc_show:
> > > si_meminfo(&i);
> > > available = si_mem_available();
> > > Now with this change the second call be avoided.
> > > Thus, we can directly do:
> > > show_val_kb(m, "MemAvailable: ", i.availram);
> > >
> > > Note, this also requires update in procfs for free and other commands.
> > > Like in free command as well we frist call sysinfo then again parse
> > > /proc/meminfo to get available field.
> > > This can be avoided too with higher kernel version.
> > >
> > > A sample output with single sysinfo call is shown below:
> > > Total RAM: 248376 kB
> > > Free RAM: 231540 kB
> > > Avail RAM: 230448 kB
> > >
> > > Signed-off-by: Pintu Kumar <[email protected]>
> > > Signed-off-by: Pintu Agarwal <[email protected]>
> > > ---
> > > include/uapi/linux/sysinfo.h | 1 +
> > > kernel/sys.c | 4 ++++
> > > mm/page_alloc.c | 2 ++
> > > 3 files changed, 7 insertions(+)
> > >
> > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > index 435d5c2..6e77e90 100644
> > > --- a/include/uapi/linux/sysinfo.h
> > > +++ b/include/uapi/linux/sysinfo.h
> > > @@ -12,6 +12,7 @@ struct sysinfo {
> > > __kernel_ulong_t freeram; /* Available memory size */
> > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > __kernel_ulong_t totalswap; /* Total swap space size */
> > > __kernel_ulong_t freeswap; /* swap space still available */
> > > __u16 procs; /* Number of current processes */
> >
> > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> > be part of user API, no? Don't we break it up here?
>
> Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> also needs to be updated.
> When we generate the kernel header it will be updated automatically.
You can't add a field in the middle of a UAPI structure.
It breaks compatibility for old binaries.
Depending on the interface definition you may be able to add one at the end.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Thu, 6 Jan 2022 at 23:12, David Laight <[email protected]> wrote:
>
> From: Pintu Agarwal
> > Sent: 06 January 2022 16:50
> >
> > On Thu, 6 Jan 2022 at 21:41, Cyrill Gorcunov <[email protected]> wrote:
> > >
> > > On Thu, Jan 06, 2022 at 09:04:10PM +0530, Pintu Kumar wrote:
> > > > The sysinfo member does not have any "available ram" field and
> > > > the bufferram field is not much helpful either, to get a rough
> > > > estimate of available ram needed for allocation.
> > > >
> > > > One needs to parse MemAvailable field separately from /proc/meminfo
> > > > to get this info instead of directly getting if from sysinfo itself.
> > > >
> > > > Thus, this patch introduce a new field as availram in sysinfo
> > > > so that all the info total/free/available can be retrieved from
> > > > one place itself.
> > > >
> > > > There are couple of places in kernel as well where this can be improved.
> > > > For example:
> > > > In fs/proc/meminfo.c:
> > > > meminfo_proc_show:
> > > > si_meminfo(&i);
> > > > available = si_mem_available();
> > > > Now with this change the second call be avoided.
> > > > Thus, we can directly do:
> > > > show_val_kb(m, "MemAvailable: ", i.availram);
> > > >
> > > > Note, this also requires update in procfs for free and other commands.
> > > > Like in free command as well we frist call sysinfo then again parse
> > > > /proc/meminfo to get available field.
> > > > This can be avoided too with higher kernel version.
> > > >
> > > > A sample output with single sysinfo call is shown below:
> > > > Total RAM: 248376 kB
> > > > Free RAM: 231540 kB
> > > > Avail RAM: 230448 kB
> > > >
> > > > Signed-off-by: Pintu Kumar <[email protected]>
> > > > Signed-off-by: Pintu Agarwal <[email protected]>
> > > > ---
> > > > include/uapi/linux/sysinfo.h | 1 +
> > > > kernel/sys.c | 4 ++++
> > > > mm/page_alloc.c | 2 ++
> > > > 3 files changed, 7 insertions(+)
> > > >
> > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > > index 435d5c2..6e77e90 100644
> > > > --- a/include/uapi/linux/sysinfo.h
> > > > +++ b/include/uapi/linux/sysinfo.h
> > > > @@ -12,6 +12,7 @@ struct sysinfo {
> > > > __kernel_ulong_t freeram; /* Available memory size */
> > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > > __kernel_ulong_t totalswap; /* Total swap space size */
> > > > __kernel_ulong_t freeswap; /* swap space still available */
> > > > __u16 procs; /* Number of current processes */
> > >
> > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> > > be part of user API, no? Don't we break it up here?
> >
> > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> > also needs to be updated.
> > When we generate the kernel header it will be updated automatically.
>
> You can't add a field in the middle of a UAPI structure.
> It breaks compatibility for old binaries.
>
> Depending on the interface definition you may be able to add one at the end.
>
oh okay thank you for your feedback. I will move to the end and check again.
But my doubt is, whether I should move before this
char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];
or after this ?
Also, I could not understand what this is for ?
Do we need to update this since sture is changed ?
Pintu Agarwal <[email protected]> writes:
> On Thu, 6 Jan 2022 at 23:12, David Laight <[email protected]> wrote:
>>
>> From: Pintu Agarwal
>> > Sent: 06 January 2022 16:50
>> >
>> > On Thu, 6 Jan 2022 at 21:41, Cyrill Gorcunov <[email protected]> wrote:
>> > >
>> > > On Thu, Jan 06, 2022 at 09:04:10PM +0530, Pintu Kumar wrote:
>> > > > The sysinfo member does not have any "available ram" field and
>> > > > the bufferram field is not much helpful either, to get a rough
>> > > > estimate of available ram needed for allocation.
>> > > >
>> > > > One needs to parse MemAvailable field separately from /proc/meminfo
>> > > > to get this info instead of directly getting if from sysinfo itself.
>> > > >
>> > > > Thus, this patch introduce a new field as availram in sysinfo
>> > > > so that all the info total/free/available can be retrieved from
>> > > > one place itself.
>> > > >
>> > > > There are couple of places in kernel as well where this can be improved.
>> > > > For example:
>> > > > In fs/proc/meminfo.c:
>> > > > meminfo_proc_show:
>> > > > si_meminfo(&i);
>> > > > available = si_mem_available();
>> > > > Now with this change the second call be avoided.
>> > > > Thus, we can directly do:
>> > > > show_val_kb(m, "MemAvailable: ", i.availram);
>> > > >
>> > > > Note, this also requires update in procfs for free and other commands.
>> > > > Like in free command as well we frist call sysinfo then again parse
>> > > > /proc/meminfo to get available field.
>> > > > This can be avoided too with higher kernel version.
>> > > >
>> > > > A sample output with single sysinfo call is shown below:
>> > > > Total RAM: 248376 kB
>> > > > Free RAM: 231540 kB
>> > > > Avail RAM: 230448 kB
>> > > >
>> > > > Signed-off-by: Pintu Kumar <[email protected]>
>> > > > Signed-off-by: Pintu Agarwal <[email protected]>
>> > > > ---
>> > > > include/uapi/linux/sysinfo.h | 1 +
>> > > > kernel/sys.c | 4 ++++
>> > > > mm/page_alloc.c | 2 ++
>> > > > 3 files changed, 7 insertions(+)
>> > > >
>> > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
>> > > > index 435d5c2..6e77e90 100644
>> > > > --- a/include/uapi/linux/sysinfo.h
>> > > > +++ b/include/uapi/linux/sysinfo.h
>> > > > @@ -12,6 +12,7 @@ struct sysinfo {
>> > > > __kernel_ulong_t freeram; /* Available memory size */
>> > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
>> > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
>> > > > + __kernel_ulong_t availram; /* Memory available for allocation */
>> > > > __kernel_ulong_t totalswap; /* Total swap space size */
>> > > > __kernel_ulong_t freeswap; /* swap space still available */
>> > > > __u16 procs; /* Number of current processes */
>> > >
>> > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
>> > > be part of user API, no? Don't we break it up here?
>> >
>> > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
>> > also needs to be updated.
>> > When we generate the kernel header it will be updated automatically.
>>
>> You can't add a field in the middle of a UAPI structure.
>> It breaks compatibility for old binaries.
>>
>> Depending on the interface definition you may be able to add one at the end.
>>
> oh okay thank you for your feedback. I will move to the end and check again.
> But my doubt is, whether I should move before this
> char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];
> or after this ?
Before the padding and you should reduce the size of the padding by the
size of your new field.
> Also, I could not understand what this is for ?
> Do we need to update this since sture is changed ?
In general padding like that is so new fields can be added. The
comment about libc5 makes me a wonder a bit, but I expect libc5 just
added the padding in it's copy of the structure that it exported to
userspace many many years ago so that new fields could be added.
Eric
On Thu, Jan 06, 2022 at 08:27:47PM +0300, Cyrill Gorcunov wrote:
> On Thu, Jan 06, 2022 at 10:19:55PM +0530, Pintu Agarwal wrote:
> > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > > index 435d5c2..6e77e90 100644
> > > > --- a/include/uapi/linux/sysinfo.h
> > > > +++ b/include/uapi/linux/sysinfo.h
> > > > @@ -12,6 +12,7 @@ struct sysinfo {
> > > > __kernel_ulong_t freeram; /* Available memory size */
> > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > > __kernel_ulong_t totalswap; /* Total swap space size */
> > > > __kernel_ulong_t freeswap; /* swap space still available */
> > > > __u16 procs; /* Number of current processes */
> > >
> > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> > > be part of user API, no? Don't we break it up here?
> >
> > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> > also needs to be updated.
> > When we generate the kernel header it will be updated automatically.
>
> Wait. The userspace may pass old structure here, and in result we
> return incorrect layout which won't match old one, no? Old binary
> code has no clue about this header update.
Yes, that won't work as done.
If we want to do this it needs to be done at the end of the struct right
before the padding field and the newly added field substracted from the
padding. (Not the preferred way to do it these days for new structs.)
A new kernel can then pass in the struct with the newly added field and
an old kernel can just fill the struct in as usual. New kernel will
update the field with the correct value.
But there's a catch depending on the type of value.
The problem with these types of extensions is that you'll often need
indicators to and from the kernel whether the extension is supported.
Consider an extension where 0 is a valid value meaning "this resource is
completely used". Since the kernel and userspace always agree on the
size of the struct the kernel will zero the whole struct. So if in your
newly added field 0 is a valid value you can't differentiate between 0
as a valid value indicating that your resource isn't available and 0 as
the kernel not supporting your extension.
Other APIs solve this and similar problems by having a request mask and
a return mask. Userspace fills in what values it wants reported in the
request mask and kernel sets the supported flags in the return mask.
This way you can differentiate between the two (see statx).
If the 0 example is not a concern or acceptable for userspace it's
probably fine. But you need to document that having 0 returned can mean
both things.
Or, you select a value different from 0 (-1?) that you can use to
indicate to userspace that the resource is used up so 0 can just mean
"kernel doesn't support this extension".
On Fri, 7 Jan 2022 at 17:35, Christian Brauner
<[email protected]> wrote:
>
> On Thu, Jan 06, 2022 at 08:27:47PM +0300, Cyrill Gorcunov wrote:
> > On Thu, Jan 06, 2022 at 10:19:55PM +0530, Pintu Agarwal wrote:
> > > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > > > index 435d5c2..6e77e90 100644
> > > > > --- a/include/uapi/linux/sysinfo.h
> > > > > +++ b/include/uapi/linux/sysinfo.h
> > > > > @@ -12,6 +12,7 @@ struct sysinfo {
> > > > > __kernel_ulong_t freeram; /* Available memory size */
> > > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > > > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > > > __kernel_ulong_t totalswap; /* Total swap space size */
> > > > > __kernel_ulong_t freeswap; /* swap space still available */
> > > > > __u16 procs; /* Number of current processes */
> > > >
> > > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> > > > be part of user API, no? Don't we break it up here?
> > >
> > > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> > > also needs to be updated.
> > > When we generate the kernel header it will be updated automatically.
> >
> > Wait. The userspace may pass old structure here, and in result we
> > return incorrect layout which won't match old one, no? Old binary
> > code has no clue about this header update.
>
> Yes, that won't work as done.
>
> If we want to do this it needs to be done at the end of the struct right
> before the padding field and the newly added field substracted from the
> padding. (Not the preferred way to do it these days for new structs.)
>
> A new kernel can then pass in the struct with the newly added field and
> an old kernel can just fill the struct in as usual. New kernel will
> update the field with the correct value.
>
> But there's a catch depending on the type of value.
> The problem with these types of extensions is that you'll often need
> indicators to and from the kernel whether the extension is supported.
>
> Consider an extension where 0 is a valid value meaning "this resource is
> completely used". Since the kernel and userspace always agree on the
> size of the struct the kernel will zero the whole struct. So if in your
> newly added field 0 is a valid value you can't differentiate between 0
> as a valid value indicating that your resource isn't available and 0 as
> the kernel not supporting your extension.
>
> Other APIs solve this and similar problems by having a request mask and
> a return mask. Userspace fills in what values it wants reported in the
> request mask and kernel sets the supported flags in the return mask.
> This way you can differentiate between the two (see statx).
>
> If the 0 example is not a concern or acceptable for userspace it's
> probably fine. But you need to document that having 0 returned can mean
> both things.
>
> Or, you select a value different from 0 (-1?) that you can use to
> indicate to userspace that the resource is used up so 0 can just mean
> "kernel doesn't support this extension".
Thanks all for your inputs.
As Eric suggested in other thread (pasting here for reference):
{
> Before the padding and you should reduce the size of the padding by the
> size of your new field.
>> Also, I could not understand what this is for ?
>> Do we need to update this since sture is changed ?
> In general padding like that is so new fields can be added. The
> comment about libc5 makes me a wonder a bit, but I expect libc5 just
> added the padding in it's copy of the structure that it exported to
> userspace many many years ago so that new fields could be added.
> Eric
}
I made the changes like below and this seems to work even with older user space.
I mean earlier, when I ran "free" command it was giving "stack
smashing..." error,
but now the "free" command (which comes as part of busybox) works fine
even without recompiling with the updated header.
These are the header changes for quick look:
{{{
diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
index 6e77e90..fe84c6a 100644
--- a/include/uapi/linux/sysinfo.h
+++ b/include/uapi/linux/sysinfo.h
@@ -12,7 +12,6 @@ struct sysinfo {
__kernel_ulong_t freeram; /* Available memory size */
__kernel_ulong_t sharedram; /* Amount of shared memory */
__kernel_ulong_t bufferram; /* Memory used by buffers */
- __kernel_ulong_t availram; /* Memory available for allocation */
__kernel_ulong_t totalswap; /* Total swap space size */
__kernel_ulong_t freeswap; /* swap space still available */
__u16 procs; /* Number of current processes */
@@ -20,7 +19,8 @@ struct sysinfo {
__kernel_ulong_t totalhigh; /* Total high memory size */
__kernel_ulong_t freehigh; /* Available high memory size */
__u32 mem_unit; /* Memory unit size in bytes */
- char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
Padding: libc5 uses this.. */
+ __kernel_ulong_t availram; /* Memory available for allocation */
+ char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
Padding: libc5 uses this.. */
};
}}}
If this is fine, I will push the new patch set.
Thanks,
Pintu
CC linux-api
On 1/7/22 14:44, Pintu Agarwal wrote:
> On Fri, 7 Jan 2022 at 17:35, Christian Brauner
> <[email protected]> wrote:
>>
>> On Thu, Jan 06, 2022 at 08:27:47PM +0300, Cyrill Gorcunov wrote:
>> > On Thu, Jan 06, 2022 at 10:19:55PM +0530, Pintu Agarwal wrote:
>> > > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
>> > > > > index 435d5c2..6e77e90 100644
>> > > > > --- a/include/uapi/linux/sysinfo.h
>> > > > > +++ b/include/uapi/linux/sysinfo.h
>> > > > > @@ -12,6 +12,7 @@ struct sysinfo {
>> > > > > __kernel_ulong_t freeram; /* Available memory size */
>> > > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
>> > > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
>> > > > > + __kernel_ulong_t availram; /* Memory available for allocation */
>> > > > > __kernel_ulong_t totalswap; /* Total swap space size */
>> > > > > __kernel_ulong_t freeswap; /* swap space still available */
>> > > > > __u16 procs; /* Number of current processes */
>> > > >
>> > > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
>> > > > be part of user API, no? Don't we break it up here?
>> > >
>> > > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
>> > > also needs to be updated.
>> > > When we generate the kernel header it will be updated automatically.
>> >
>> > Wait. The userspace may pass old structure here, and in result we
>> > return incorrect layout which won't match old one, no? Old binary
>> > code has no clue about this header update.
>>
>> Yes, that won't work as done.
>>
>> If we want to do this it needs to be done at the end of the struct right
>> before the padding field and the newly added field substracted from the
>> padding. (Not the preferred way to do it these days for new structs.)
>>
>> A new kernel can then pass in the struct with the newly added field and
>> an old kernel can just fill the struct in as usual. New kernel will
>> update the field with the correct value.
>>
>> But there's a catch depending on the type of value.
>> The problem with these types of extensions is that you'll often need
>> indicators to and from the kernel whether the extension is supported.
>>
>> Consider an extension where 0 is a valid value meaning "this resource is
>> completely used". Since the kernel and userspace always agree on the
>> size of the struct the kernel will zero the whole struct. So if in your
>> newly added field 0 is a valid value you can't differentiate between 0
>> as a valid value indicating that your resource isn't available and 0 as
>> the kernel not supporting your extension.
>>
>> Other APIs solve this and similar problems by having a request mask and
>> a return mask. Userspace fills in what values it wants reported in the
>> request mask and kernel sets the supported flags in the return mask.
>> This way you can differentiate between the two (see statx).
>>
>> If the 0 example is not a concern or acceptable for userspace it's
>> probably fine. But you need to document that having 0 returned can mean
>> both things.
>>
>> Or, you select a value different from 0 (-1?) that you can use to
>> indicate to userspace that the resource is used up so 0 can just mean
>> "kernel doesn't support this extension".
>
> Thanks all for your inputs.
> As Eric suggested in other thread (pasting here for reference):
> {
>> Before the padding and you should reduce the size of the padding by the
>> size of your new field.
>
>>> Also, I could not understand what this is for ?
>>> Do we need to update this since sture is changed ?
>
>> In general padding like that is so new fields can be added. The
>> comment about libc5 makes me a wonder a bit, but I expect libc5 just
>> added the padding in it's copy of the structure that it exported to
>> userspace many many years ago so that new fields could be added.
>
>> Eric
> }
>
> I made the changes like below and this seems to work even with older user space.
> I mean earlier, when I ran "free" command it was giving "stack
> smashing..." error,
> but now the "free" command (which comes as part of busybox) works fine
> even without recompiling with the updated header.
>
> These are the header changes for quick look:
> {{{
> diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> index 6e77e90..fe84c6a 100644
> --- a/include/uapi/linux/sysinfo.h
> +++ b/include/uapi/linux/sysinfo.h
> @@ -12,7 +12,6 @@ struct sysinfo {
> __kernel_ulong_t freeram; /* Available memory size */
> __kernel_ulong_t sharedram; /* Amount of shared memory */
> __kernel_ulong_t bufferram; /* Memory used by buffers */
> - __kernel_ulong_t availram; /* Memory available for allocation */
> __kernel_ulong_t totalswap; /* Total swap space size */
> __kernel_ulong_t freeswap; /* swap space still available */
> __u16 procs; /* Number of current processes */
> @@ -20,7 +19,8 @@ struct sysinfo {
> __kernel_ulong_t totalhigh; /* Total high memory size */
> __kernel_ulong_t freehigh; /* Available high memory size */
> __u32 mem_unit; /* Memory unit size in bytes */
> - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> Padding: libc5 uses this.. */
> + __kernel_ulong_t availram; /* Memory available for allocation */
> + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> Padding: libc5 uses this.. */
> };
> }}}
>
> If this is fine, I will push the new patch set.
Please CC [email protected] on the new posting.
> Thanks,
> Pintu
>
On Fri, 7 Jan 2022 at 22:28, Vlastimil Babka <[email protected]> wrote:
>
> CC linux-api
>
> On 1/7/22 14:44, Pintu Agarwal wrote:
> > On Fri, 7 Jan 2022 at 17:35, Christian Brauner
> > <[email protected]> wrote:
> >>
> >> On Thu, Jan 06, 2022 at 08:27:47PM +0300, Cyrill Gorcunov wrote:
> >> > On Thu, Jan 06, 2022 at 10:19:55PM +0530, Pintu Agarwal wrote:
> >> > > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> >> > > > > index 435d5c2..6e77e90 100644
> >> > > > > --- a/include/uapi/linux/sysinfo.h
> >> > > > > +++ b/include/uapi/linux/sysinfo.h
> >> > > > > @@ -12,6 +12,7 @@ struct sysinfo {
> >> > > > > __kernel_ulong_t freeram; /* Available memory size */
> >> > > > > __kernel_ulong_t sharedram; /* Amount of shared memory */
> >> > > > > __kernel_ulong_t bufferram; /* Memory used by buffers */
> >> > > > > + __kernel_ulong_t availram; /* Memory available for allocation */
> >> > > > > __kernel_ulong_t totalswap; /* Total swap space size */
> >> > > > > __kernel_ulong_t freeswap; /* swap space still available */
> >> > > > > __u16 procs; /* Number of current processes */
> >> > > >
> >> > > > Hi! Sorry, but I don't understand -- the sysinfo structure seems to
> >> > > > be part of user API, no? Don't we break it up here?
> >> > >
> >> > > Yes, the corresponding user space header /usr/include/linux/sysinfo.h
> >> > > also needs to be updated.
> >> > > When we generate the kernel header it will be updated automatically.
> >> >
> >> > Wait. The userspace may pass old structure here, and in result we
> >> > return incorrect layout which won't match old one, no? Old binary
> >> > code has no clue about this header update.
> >>
> >> Yes, that won't work as done.
> >>
> >> If we want to do this it needs to be done at the end of the struct right
> >> before the padding field and the newly added field substracted from the
> >> padding. (Not the preferred way to do it these days for new structs.)
> >>
> >> A new kernel can then pass in the struct with the newly added field and
> >> an old kernel can just fill the struct in as usual. New kernel will
> >> update the field with the correct value.
> >>
> >> But there's a catch depending on the type of value.
> >> The problem with these types of extensions is that you'll often need
> >> indicators to and from the kernel whether the extension is supported.
> >>
> >> Consider an extension where 0 is a valid value meaning "this resource is
> >> completely used". Since the kernel and userspace always agree on the
> >> size of the struct the kernel will zero the whole struct. So if in your
> >> newly added field 0 is a valid value you can't differentiate between 0
> >> as a valid value indicating that your resource isn't available and 0 as
> >> the kernel not supporting your extension.
> >>
> >> Other APIs solve this and similar problems by having a request mask and
> >> a return mask. Userspace fills in what values it wants reported in the
> >> request mask and kernel sets the supported flags in the return mask.
> >> This way you can differentiate between the two (see statx).
> >>
> >> If the 0 example is not a concern or acceptable for userspace it's
> >> probably fine. But you need to document that having 0 returned can mean
> >> both things.
> >>
> >> Or, you select a value different from 0 (-1?) that you can use to
> >> indicate to userspace that the resource is used up so 0 can just mean
> >> "kernel doesn't support this extension".
> >
> > Thanks all for your inputs.
> > As Eric suggested in other thread (pasting here for reference):
> > {
> >> Before the padding and you should reduce the size of the padding by the
> >> size of your new field.
> >
> >>> Also, I could not understand what this is for ?
> >>> Do we need to update this since sture is changed ?
> >
> >> In general padding like that is so new fields can be added. The
> >> comment about libc5 makes me a wonder a bit, but I expect libc5 just
> >> added the padding in it's copy of the structure that it exported to
> >> userspace many many years ago so that new fields could be added.
> >
> >> Eric
> > }
> >
> > I made the changes like below and this seems to work even with older user space.
> > I mean earlier, when I ran "free" command it was giving "stack
> > smashing..." error,
> > but now the "free" command (which comes as part of busybox) works fine
> > even without recompiling with the updated header.
> >
> > These are the header changes for quick look:
> > {{{
> > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > index 6e77e90..fe84c6a 100644
> > --- a/include/uapi/linux/sysinfo.h
> > +++ b/include/uapi/linux/sysinfo.h
> > @@ -12,7 +12,6 @@ struct sysinfo {
> > __kernel_ulong_t freeram; /* Available memory size */
> > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > - __kernel_ulong_t availram; /* Memory available for allocation */
> > __kernel_ulong_t totalswap; /* Total swap space size */
> > __kernel_ulong_t freeswap; /* swap space still available */
> > __u16 procs; /* Number of current processes */
> > @@ -20,7 +19,8 @@ struct sysinfo {
> > __kernel_ulong_t totalhigh; /* Total high memory size */
> > __kernel_ulong_t freehigh; /* Available high memory size */
> > __u32 mem_unit; /* Memory unit size in bytes */
> > - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> > Padding: libc5 uses this.. */
> > + __kernel_ulong_t availram; /* Memory available for allocation */
> > + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> > Padding: libc5 uses this.. */
> > };
> > }}}
> >
> > If this is fine, I will push the new patch set.
>
> Please CC [email protected] on the new posting.
>
@Christian Brauner,
Regarding 0 case I guess it is fine.
Just to cross check, I used my test program to run with some other
kernel (where there are no changes to sysinfo).
I see that the field returns 0.
# ./test-sysinfo.out
Total RAM: 249320 kB
Free RAM: 233416 kB
Avail RAM: 0 kB
And this is fine and this is also good.
This also indicates 2 things:
a) Either "availram" field is not available in this kernel version
(less than 5.1x)
==> Thus it should fall back to parsing MemAvailable from /proc/meminfo
b) Or, MemAvailable field itself is not available (less than 3.1x)
I will push the new patch set now..
Thanks all!
Pintu
The sysinfo member does not have any "available ram" field and
the bufferram field is not much helpful either, to get a rough
estimate of available ram needed for allocation.
One needs to parse MemAvailable field separately from /proc/meminfo
to get this info instead of directly getting if from sysinfo itself.
Thus, this patch introduce a new field as availram in sysinfo
so that all the info total/free/available can be retrieved from
one place itself.
There are couple of places in kernel as well where this can be improved.
For example:
In fs/proc/meminfo.c:
meminfo_proc_show:
si_meminfo(&i);
available = si_mem_available();
Now with this change the second call be avoided.
Thus, we can directly do:
show_val_kb(m, "MemAvailable: ", i.availram);
Note, this also requires update in procfs for free and other commands.
Like in free command as well we frist call sysinfo then again parse
/proc/meminfo to get available field.
This can be avoided too with higher kernel version.
A sample output with single sysinfo call is shown below:
Total RAM: 248376 kB
Free RAM: 231540 kB
Avail RAM: 230448 kB
Signed-off-by: Pintu Kumar <[email protected]>
Signed-off-by: Pintu Agarwal <[email protected]>
---
include/uapi/linux/sysinfo.h | 3 ++-
kernel/sys.c | 4 ++++
mm/page_alloc.c | 2 ++
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
index 435d5c2..fe84c6a 100644
--- a/include/uapi/linux/sysinfo.h
+++ b/include/uapi/linux/sysinfo.h
@@ -19,7 +19,8 @@ struct sysinfo {
__kernel_ulong_t totalhigh; /* Total high memory size */
__kernel_ulong_t freehigh; /* Available high memory size */
__u32 mem_unit; /* Memory unit size in bytes */
- char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
+ __kernel_ulong_t availram; /* Memory available for allocation */
+ char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
};
#endif /* _LINUX_SYSINFO_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index ecc4cf0..7059515 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2671,6 +2671,7 @@ static int do_sysinfo(struct sysinfo *info)
info->freeram <<= bitcount;
info->sharedram <<= bitcount;
info->bufferram <<= bitcount;
+ info->availram <<= bitcount;
info->totalswap <<= bitcount;
info->freeswap <<= bitcount;
info->totalhigh <<= bitcount;
@@ -2700,6 +2701,7 @@ struct compat_sysinfo {
u32 freeram;
u32 sharedram;
u32 bufferram;
+ u32 availram;
u32 totalswap;
u32 freeswap;
u16 procs;
@@ -2732,6 +2734,7 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
s.freeram >>= bitcount;
s.sharedram >>= bitcount;
s.bufferram >>= bitcount;
+ s.availram >>= bitcount;
s.totalswap >>= bitcount;
s.freeswap >>= bitcount;
s.totalhigh >>= bitcount;
@@ -2747,6 +2750,7 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
s_32.freeram = s.freeram;
s_32.sharedram = s.sharedram;
s_32.bufferram = s.bufferram;
+ s_32.availram = s.availram;
s_32.totalswap = s.totalswap;
s_32.freeswap = s.freeswap;
s_32.procs = s.procs;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b5d62e1..5013c75 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5786,6 +5786,7 @@ void si_meminfo(struct sysinfo *val)
val->sharedram = global_node_page_state(NR_SHMEM);
val->freeram = global_zone_page_state(NR_FREE_PAGES);
val->bufferram = nr_blockdev_pages();
+ val->availram = si_mem_available();
val->totalhigh = totalhigh_pages();
val->freehigh = nr_free_highpages();
val->mem_unit = PAGE_SIZE;
@@ -5807,6 +5808,7 @@ void si_meminfo_node(struct sysinfo *val, int nid)
val->totalram = managed_pages;
val->sharedram = node_page_state(pgdat, NR_SHMEM);
val->freeram = sum_zone_node_page_state(nid, NR_FREE_PAGES);
+ val->availram = si_mem_available();
#ifdef CONFIG_HIGHMEM
for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
struct zone *zone = &pgdat->node_zones[zone_type];
--
2.7.4
On Fri, Jan 07, 2022 at 01:04:51PM +0100, Christian Brauner wrote:
> >
> > Wait. The userspace may pass old structure here, and in result we
> > return incorrect layout which won't match old one, no? Old binary
> > code has no clue about this header update.
>
> Yes, that won't work as done.
>
Yup. When I've been developing struct prctl_mm_map I reserved
PR_SET_MM_MAP_SIZE opcode so userspace would be able to query
current structure size and provide memory slab needed to fit
running structure.
As far as I see we can cut off some space from padding (at the
end of the structure) though one need to make a precise check
that there is no alignment holes appear on different architectures
other than x86. This area is pretty sensitive.
On Fri, Jan 07, 2022 at 11:37:34PM +0530, Pintu Kumar wrote:
> The sysinfo member does not have any "available ram" field and
> the bufferram field is not much helpful either, to get a rough
> estimate of available ram needed for allocation.
>
> One needs to parse MemAvailable field separately from /proc/meminfo
> to get this info instead of directly getting if from sysinfo itself.
Who exactly needs this change? Do you have some application for which
parsing /proc/meminfo is a hot path so it needs this information via
sysinfo interface?
Don't get me wrong please but such extension really need a strong
justification because they are part of UAPI and there is not that much
space left in sysinfo structure. We will _have_ to live with this new
field forever so I propose to not introduce anything new here until
we have no other choise or parsing meminfo become a really bottleneck.
> diff --git a/kernel/sys.c b/kernel/sys.c
> index ecc4cf0..7059515 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2671,6 +2671,7 @@ static int do_sysinfo(struct sysinfo *info)
> info->freeram <<= bitcount;
> info->sharedram <<= bitcount;
> info->bufferram <<= bitcount;
> + info->availram <<= bitcount;
> info->totalswap <<= bitcount;
> info->freeswap <<= bitcount;
> info->totalhigh <<= bitcount;
> @@ -2700,6 +2701,7 @@ struct compat_sysinfo {
> u32 freeram;
> u32 sharedram;
> u32 bufferram;
> + u32 availram;
If only I'm not missing something ovious, this is part of UAPI as well.
> > These are the header changes for quick look:
> > {{{
> > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > index 6e77e90..fe84c6a 100644
> > --- a/include/uapi/linux/sysinfo.h
> > +++ b/include/uapi/linux/sysinfo.h
> > @@ -12,7 +12,6 @@ struct sysinfo {
> > __kernel_ulong_t freeram; /* Available memory size */
> > __kernel_ulong_t sharedram; /* Amount of shared memory */
> > __kernel_ulong_t bufferram; /* Memory used by buffers */
> > - __kernel_ulong_t availram; /* Memory available for allocation */
> > __kernel_ulong_t totalswap; /* Total swap space size */
> > __kernel_ulong_t freeswap; /* swap space still available */
> > __u16 procs; /* Number of current processes */
> > @@ -20,7 +19,8 @@ struct sysinfo {
> > __kernel_ulong_t totalhigh; /* Total high memory size */
> > __kernel_ulong_t freehigh; /* Available high memory size */
> > __u32 mem_unit; /* Memory unit size in bytes */
> > - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> > Padding: libc5 uses this.. */
> > + __kernel_ulong_t availram; /* Memory available for allocation */
> > + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
> > Padding: libc5 uses this.. */
> > };
> > }}}
> >
> > If this is fine, I will push the new patch set.
>
> Please CC [email protected] on the new posting.
That is probably still broken.
If __kernel_ulong_t is 64bit there is architecture
dependant padding after mem_unit.
In particular a 32bit x86 app running on a 64bit kernel
will probably see the wrong layout.
You definitely need a compile-time assert on the total structure size.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
From: Pintu Kumar
> Sent: 07 January 2022 18:08
>
> The sysinfo member does not have any "available ram" field and
> the bufferram field is not much helpful either, to get a rough
> estimate of available ram needed for allocation.
>
> One needs to parse MemAvailable field separately from /proc/meminfo
> to get this info instead of directly getting if from sysinfo itself.
>
> Thus, this patch introduce a new field as availram in sysinfo
> so that all the info total/free/available can be retrieved from
> one place itself.
>
...
> diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> index 435d5c2..fe84c6a 100644
> --- a/include/uapi/linux/sysinfo.h
> +++ b/include/uapi/linux/sysinfo.h
> @@ -19,7 +19,8 @@ struct sysinfo {
> __kernel_ulong_t totalhigh; /* Total high memory size */
> __kernel_ulong_t freehigh; /* Available high memory size */
> __u32 mem_unit; /* Memory unit size in bytes */
> - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
There are 4 pad bytes here on most 64bit architectures.
> + __kernel_ulong_t availram; /* Memory available for allocation */
> + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> };
You've not compile-time tested the size of the structure.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Sat, 8 Jan 2022 at 02:31, Cyrill Gorcunov <[email protected]> wrote:
>
> On Fri, Jan 07, 2022 at 11:37:34PM +0530, Pintu Kumar wrote:
> > The sysinfo member does not have any "available ram" field and
> > the bufferram field is not much helpful either, to get a rough
> > estimate of available ram needed for allocation.
> >
> > One needs to parse MemAvailable field separately from /proc/meminfo
> > to get this info instead of directly getting if from sysinfo itself.
>
> Who exactly needs this change? Do you have some application for which
> parsing /proc/meminfo is a hot path so it needs this information via
> sysinfo interface?
>
Thank you so much for your feedback...
I had a need to get total/free/available memory in my application (on
a memory constraint system).
First I tried to parse these from /proc/meminfo but then I saw sysinfo
already provides some information,
however available field was missing. Just to get available field I
need to again do all the file operations.
Then I saw, even the "free" command doing this redundant work.
Use sysinfo system call to get "total" and "free" memory then again
get "available" memory from /proc/meminfo.
Yet again, I see, even in kernel its reading from two different places
while populating the /proc/meminfo.
Also, I am sure there are plenty of other applications where this can
be improved with this.
Moreover, I think with this field there is not much use of other ram
fields in sysinfo.
Thus I felt a need to introduce this field to ease some operations.
> Don't get me wrong please but such extension really need a strong
> justification because they are part of UAPI and there is not that much
> space left in sysinfo structure. We will _have_ to live with this new
> field forever so I propose to not introduce anything new here until
> we have no other choise or parsing meminfo become a really bottleneck.
>
My guess is that this situation might exist in other places as well ?
How do we handle new field addition to existing system calls ?
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index ecc4cf0..7059515 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -2671,6 +2671,7 @@ static int do_sysinfo(struct sysinfo *info)
> > info->freeram <<= bitcount;
> > info->sharedram <<= bitcount;
> > info->bufferram <<= bitcount;
> > + info->availram <<= bitcount;
> > info->totalswap <<= bitcount;
> > info->freeswap <<= bitcount;
> > info->totalhigh <<= bitcount;
> > @@ -2700,6 +2701,7 @@ struct compat_sysinfo {
> > u32 freeram;
> > u32 sharedram;
> > u32 bufferram;
> > + u32 availram;
>
> If only I'm not missing something ovious, this is part of UAPI as well.
Yes, this structure is part of the common UAPI header.
On Sat, 8 Jan 2022 at 03:52, David Laight <[email protected]> wrote:
>
> From: Pintu Kumar
> > Sent: 07 January 2022 18:08
> >
> > The sysinfo member does not have any "available ram" field and
> > the bufferram field is not much helpful either, to get a rough
> > estimate of available ram needed for allocation.
> >
> > One needs to parse MemAvailable field separately from /proc/meminfo
> > to get this info instead of directly getting if from sysinfo itself.
> >
> > Thus, this patch introduce a new field as availram in sysinfo
> > so that all the info total/free/available can be retrieved from
> > one place itself.
> >
> ...
> > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > index 435d5c2..fe84c6a 100644
> > --- a/include/uapi/linux/sysinfo.h
> > +++ b/include/uapi/linux/sysinfo.h
> > @@ -19,7 +19,8 @@ struct sysinfo {
> > __kernel_ulong_t totalhigh; /* Total high memory size */
> > __kernel_ulong_t freehigh; /* Available high memory size */
> > __u32 mem_unit; /* Memory unit size in bytes */
> > - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
>
> There are 4 pad bytes here on most 64bit architectures.
>
> > + __kernel_ulong_t availram; /* Memory available for allocation */
> > + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> > };
>
> You've not compile-time tested the size of the structure.
>
With "32" instead of "20" in padding I get these size of sysinfo:
In x86-64 kernel, with app 64-bit: Size of sysinfo = 128
In x86-64 kernel, with app 32-bit:: Size of sysinfo = 76
In arm-64 kernel, with app 32-bit: Size of sysinfo = 76
Okay the sys robot reported some issue in 64-bit build.
{{{
>> include/uapi/linux/sysinfo.h:23:14: error: size of array '_f' is too large
>> 23 | char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
>> | ^~
}}}
Also, I got the same issue while building for arm64, so I tried to
adjust like this:
char _f[32-3*sizeof(__kernel_ulong_t)-sizeof(__u32)];
With this the build works on both 32/64 but output fails when running
32-bit program on 64-bit kernel.
Also, the free command on 64-bit reports "stack smashing error"..
How do we resolve this issue to make it work on both arch ?
Also, I don't really understand the significance of that number "20"
in padding ?
Thanks,
Pintu
From: Pintu Agarwal
> Sent: 08 January 2022 16:53
>
> On Sat, 8 Jan 2022 at 03:52, David Laight <[email protected]> wrote:
> >
> > From: Pintu Kumar
> > > Sent: 07 January 2022 18:08
> > >
> > > The sysinfo member does not have any "available ram" field and
> > > the bufferram field is not much helpful either, to get a rough
> > > estimate of available ram needed for allocation.
> > >
> > > One needs to parse MemAvailable field separately from /proc/meminfo
> > > to get this info instead of directly getting if from sysinfo itself.
> > >
> > > Thus, this patch introduce a new field as availram in sysinfo
> > > so that all the info total/free/available can be retrieved from
> > > one place itself.
> > >
> > ...
> > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > index 435d5c2..fe84c6a 100644
> > > --- a/include/uapi/linux/sysinfo.h
> > > +++ b/include/uapi/linux/sysinfo.h
> > > @@ -19,7 +19,8 @@ struct sysinfo {
> > > __kernel_ulong_t totalhigh; /* Total high memory size */
> > > __kernel_ulong_t freehigh; /* Available high memory size */
> > > __u32 mem_unit; /* Memory unit size in bytes */
> > > - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> >
> > There are 4 pad bytes here on most 64bit architectures.
> >
> > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> > > };
> >
> > You've not compile-time tested the size of the structure.
> >
> With "32" instead of "20" in padding I get these size of sysinfo:
> In x86-64 kernel, with app 64-bit: Size of sysinfo = 128
> In x86-64 kernel, with app 32-bit:: Size of sysinfo = 76
> In arm-64 kernel, with app 32-bit: Size of sysinfo = 76
You need to compare the sizes before and after your patch
to ensure it doesn't change on any architecture.
> Okay the sys robot reported some issue in 64-bit build.
> {{{
> >> include/uapi/linux/sysinfo.h:23:14: error: size of array '_f' is too large
> >> 23 | char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses
> this.. */
> >> | ^~
> }}}
>
> Also, I got the same issue while building for arm64, so I tried to
> adjust like this:
> char _f[32-3*sizeof(__kernel_ulong_t)-sizeof(__u32)];
>
> With this the build works on both 32/64 but output fails when running
> 32-bit program on 64-bit kernel.
> Also, the free command on 64-bit reports "stack smashing error"..
>
> How do we resolve this issue to make it work on both arch ?
> Also, I don't really understand the significance of that number "20"
> in padding ?
My guess is that someone added a char _f[20] pad to allow for expansion.
Then two __kernel_ulong_t and one __u32 field were added, so the
size of the pad was reduced.
When __kernel_ulong_t is 64bit then it seems to be char _f[0]
- which might generate a compile warning since you are supposed
to use char _f[] to indicate an extensible structure.
There is, however, 4 bytes of pad after the _f[] on most 64bit
architectures.
So actually there isn't enough space to anything useful at all.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Sat, Jan 08, 2022 at 09:54:37PM +0530, Pintu Agarwal wrote:
> Thank you so much for your feedback...
> I had a need to get total/free/available memory in my application (on
> a memory constraint system).
> First I tried to parse these from /proc/meminfo but then I saw sysinfo
> already provides some information,
> however available field was missing. Just to get available field I
> need to again do all the file operations.
>
> Then I saw, even the "free" command doing this redundant work.
> Use sysinfo system call to get "total" and "free" memory then again
> get "available" memory from /proc/meminfo.
> Yet again, I see, even in kernel its reading from two different places
> while populating the /proc/meminfo.
> Also, I am sure there are plenty of other applications where this can
> be improved with this.
> Moreover, I think with this field there is not much use of other ram
> fields in sysinfo.
> Thus I felt a need to introduce this field to ease some operations.
Thanks for explanation.
>
> > Don't get me wrong please but such extension really need a strong
> > justification because they are part of UAPI and there is not that much
> > space left in sysinfo structure. We will _have_ to live with this new
> > field forever so I propose to not introduce anything new here until
> > we have no other choise or parsing meminfo become a really bottleneck.
> >
> My guess is that this situation might exist in other places as well ?
> How do we handle new field addition to existing system calls ?
If there is no space left in uapi structures then we simply can't extend
the syscall, since we're not allowed to break api. an option is to deprecate
old interface and introduce a new one but this is a painfull procedure that's
why i'm not convinced that we should extend sysinfo right now. but final
decision is up to mantainers of course.
On Sun, 9 Jan 2022 at 04:05, David Laight <[email protected]> wrote:
>
> From: Pintu Agarwal
> > Sent: 08 January 2022 16:53
> >
> > On Sat, 8 Jan 2022 at 03:52, David Laight <[email protected]> wrote:
> > >
> > > From: Pintu Kumar
> > > > Sent: 07 January 2022 18:08
> > > >
> > > > The sysinfo member does not have any "available ram" field and
> > > > the bufferram field is not much helpful either, to get a rough
> > > > estimate of available ram needed for allocation.
> > > >
> > > > One needs to parse MemAvailable field separately from /proc/meminfo
> > > > to get this info instead of directly getting if from sysinfo itself.
> > > >
> > > > Thus, this patch introduce a new field as availram in sysinfo
> > > > so that all the info total/free/available can be retrieved from
> > > > one place itself.
> > > >
> > > ...
> > > > diff --git a/include/uapi/linux/sysinfo.h b/include/uapi/linux/sysinfo.h
> > > > index 435d5c2..fe84c6a 100644
> > > > --- a/include/uapi/linux/sysinfo.h
> > > > +++ b/include/uapi/linux/sysinfo.h
> > > > @@ -19,7 +19,8 @@ struct sysinfo {
> > > > __kernel_ulong_t totalhigh; /* Total high memory size */
> > > > __kernel_ulong_t freehigh; /* Available high memory size */
> > > > __u32 mem_unit; /* Memory unit size in bytes */
> > > > - char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> > >
> > > There are 4 pad bytes here on most 64bit architectures.
> > >
> > > > + __kernel_ulong_t availram; /* Memory available for allocation */
> > > > + char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses this.. */
> > > > };
> > >
> > > You've not compile-time tested the size of the structure.
> > >
> > With "32" instead of "20" in padding I get these size of sysinfo:
> > In x86-64 kernel, with app 64-bit: Size of sysinfo = 128
> > In x86-64 kernel, with app 32-bit:: Size of sysinfo = 76
> > In arm-64 kernel, with app 32-bit: Size of sysinfo = 76
>
> You need to compare the sizes before and after your patch
> to ensure it doesn't change on any architecture.
Without the changes:
On 32-bit, the size of structure = 64
On 64-bit, the size of structure = 112
With the addition of my new field (availram) if I try to fix the size
issue on one arch, the other arch gets disturbed.
I could fix the same size issue on 64-bit with below changes:
__kernel_ulong_t freeswap; /* swap space still available */
__u16 procs; /* Number of current processes */
__u16 pad; /* Explicit padding for m68k */
+ __u32 mem_unit; /* Memory unit size in bytes
*/ ============> Move the mem_unit up to adjust the padding
__kernel_ulong_t totalhigh; /* Total high memory size */
__kernel_ulong_t freehigh; /* Available high memory size */
- __u32 mem_unit; /* Memory unit size in bytes */
+ __kernel_ulong_t availram; /* Memory available for
allocation */ ========> Add the new field here
- char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
Padding: libc5 uses this.. */
+ char _f[28-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /*
Padding: libc5 uses this.. */ ====> Increase the size to 28 (thus _f
becomes 0 like original)
+ //char _f[4];
};
Output with 64-bit build:
$ gcc test-sysinfo.c ; ./a.out
Total RAM: 32715804 kB
Free RAM: 1111296 kB
Size of sysinfo = 112
Size of sysinfo uptime = 8
Size of sysinfo loads = 24
Size of sysinfo totalram = 8
Size of sysinfo pad = 2
Size of sysinfo memunit = 4
Size of sysinfo _f = 0
Output with 32-bit build:
$ gcc test-sysinfo.c -m32 ; ./a.out
Total RAM: 7987 kB
Free RAM: 271 kB
Size of sysinfo = 72
Size of sysinfo uptime = 4
Size of sysinfo loads = 12
Size of sysinfo totalram = 4
Size of sysinfo pad = 2
Size of sysinfo memunit = 4
Size of sysinfo _f = 12
Are there any more suggestions/ideas to experiment with padding
changes before we give-up ?
Can we handle it using : __arch64__ check ?
Or, the only option is to add one more, say: sysinfo64 ?
> > Okay the sys robot reported some issue in 64-bit build.
> > {{{
> > >> include/uapi/linux/sysinfo.h:23:14: error: size of array '_f' is too large
> > >> 23 | char _f[20-3*sizeof(__kernel_ulong_t)-sizeof(__u32)]; /* Padding: libc5 uses
> > this.. */
> > >> | ^~
> > }}}
> >
> > Also, I got the same issue while building for arm64, so I tried to
> > adjust like this:
> > char _f[32-3*sizeof(__kernel_ulong_t)-sizeof(__u32)];
> >
> > With this the build works on both 32/64 but output fails when running
> > 32-bit program on 64-bit kernel.
> > Also, the free command on 64-bit reports "stack smashing error"..
> >
> > How do we resolve this issue to make it work on both arch ?
> > Also, I don't really understand the significance of that number "20"
> > in padding ?
>
> My guess is that someone added a char _f[20] pad to allow for expansion.
> Then two __kernel_ulong_t and one __u32 field were added, so the
> size of the pad was reduced.
> When __kernel_ulong_t is 64bit then it seems to be char _f[0]
> - which might generate a compile warning since you are supposed
> to use char _f[] to indicate an extensible structure.
> There is, however, 4 bytes of pad after the _f[] on most 64bit
> architectures.
>
Thanks, yes even I guessed the same.
> So actually there isn't enough space to anything useful at all.
>
Is this problem does not exist in other UAPI structures ?
Seems like nothing can be done to allow future expansion without
breaking existing things and without developing the new one.
Thanks,
Pintu