2020-03-11 03:45:59

by Jaewon Kim

Subject: [RFC PATCH 0/3] meminfo: introduce extra meminfo

/proc/meminfo and show_free_areas do not show the full system-wide memory
usage. There seems to be a large amount of hidden memory, especially on
embedded Android systems, because they usually have some HW IP blocks that
have no internal memory and use common DRAM instead.

On Android systems, most of that hidden memory seems to be vmalloc pages,
ION system heap memory, graphics memory, and memory for DRAM-based
compressed swap storage. These may be shown in other nodes, but it seems
useful for /proc/meminfo to show all of this extra memory information.
show_mem also needs to print the information in OOM situations.

Fortunately, vmalloc pages are already shown since commit 97105f0ab7b8
("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
memory using zsmalloc can be seen through vmstat since commit 91537fee0013
("mm: add NR_ZSMALLOC to vmstat"), but not in /proc/meminfo.

Memory usage varies from driver to driver, so showing it through the
upstream meminfo.c is not easy. To print the extra memory usage of a
driver, introduce the following APIs. Each driver needs to keep its count
in an atomic_long_t.

int register_extra_meminfo(atomic_long_t *val, int shift,
const char *name);
int unregister_extra_meminfo(atomic_long_t *val);

Currently the ION system heap allocator and zsmalloc pages are registered.
This was additionally tested with a local graphics driver.

e.g. cat /proc/meminfo | tail -3
IonSystemHeap:    242620 kB
ZsPages:          203860 kB
GraphicDriver:    196576 kB

e.g. show_mem on OOM
<6>[ 420.856428] Mem-Info:
<6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
<6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0

Jaewon Kim (3):
proc/meminfo: introduce extra meminfo
mm: zsmalloc: include zs page size in proc/meminfo
android: ion: include system heap size in proc/meminfo

drivers/staging/android/ion/ion.c | 2 +
drivers/staging/android/ion/ion.h | 1 +
drivers/staging/android/ion/ion_system_heap.c | 2 +
fs/proc/meminfo.c | 103 ++++++++++++++++++++++++++
include/linux/mm.h | 4 +
lib/show_mem.c | 1 +
mm/zsmalloc.c | 2 +
7 files changed, 115 insertions(+)

--
2.13.7
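
The registration flow described in this cover letter can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions, not the kernel code from the series: model_register(),
model_unregister(), model_total_pages(), struct extra_entry and counter_t
are hypothetical names, C11 atomic_long stands in for the kernel's
atomic_long_t, and no locking is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: C11 atomic_long stands in for the kernel's atomic_long_t. */
typedef atomic_long counter_t;

struct extra_entry {
	struct extra_entry *next;
	counter_t *val;		/* counter owned by the "driver" */
	int shift;		/* right shift that turns the counter into pages */
	char name[16];		/* label, at most 15 characters plus NUL */
};

static struct extra_entry *entries;	/* singly linked list, newest first */

/* Returns 0 on success, -EINVAL for a duplicate counter, -ENOMEM on OOM. */
int model_register(counter_t *val, int shift, const char *name)
{
	struct extra_entry *e;
	size_t n;

	assert(val != NULL && name != NULL);
	for (e = entries; e; e = e->next)
		if (e->val == val)
			return -EINVAL;

	e = calloc(1, sizeof(*e));
	if (!e)
		return -ENOMEM;
	e->val = val;
	e->shift = shift;
	n = strlen(name);
	if (n > sizeof(e->name) - 1)
		n = sizeof(e->name) - 1;
	memcpy(e->name, name, n);	/* calloc() already supplied the NUL */
	e->next = entries;
	entries = e;
	return 0;
}

/* Returns 0 if the counter was registered, -EINVAL otherwise. */
int model_unregister(counter_t *val)
{
	struct extra_entry **pp;

	for (pp = &entries; *pp; pp = &(*pp)->next) {
		if ((*pp)->val == val) {
			struct extra_entry *victim = *pp;

			*pp = victim->next;
			free(victim);	/* free only an entry that matched */
			return 0;
		}
	}
	return -EINVAL;
}

/* Sum of all registered counters, converted to pages. */
long model_total_pages(void)
{
	long total = 0;
	struct extra_entry *e;

	for (e = entries; e; e = e->next)
		total += atomic_load(e->val) >> e->shift;
	return total;
}
```

One design point in the sketch: model_unregister() frees an entry only when
a match was found. The posted unregister_extra_meminfo() calls
kfree(memtemp) after the loop whether or not a match was found, which may
be worth a closer look in review.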


2020-03-11 03:46:10

by Jaewon Kim

Subject: [RFC PATCH 2/3] mm: zsmalloc: include zs page size in proc/meminfo

Most recent Android devices use DRAM-based compressed swap to save free
memory, and the swap device is usually fairly large.

The zsmalloc page count is already shown in vmstat since commit 91537fee0013
("mm: add NR_ZSMALLOC to vmstat"). If the size is also shown in
/proc/meminfo, it becomes easier to see system-wide memory usage at a
glance.

To include the zs page size, use register_extra_meminfo introduced in the
previous patch.

e.g. cat /proc/meminfo | grep ZsPages
ZsPages:          203860 kB

e.g. show_mem on OOM
<6>[ 420.856428] Mem-Info:
<6>[ 420.856433] ZsPages:44114kB

Signed-off-by: Jaewon Kim <[email protected]>
---
mm/zsmalloc.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 22d17ecfe7df..9e45d7e0cd69 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -2566,6 +2566,7 @@ static int __init zs_init(void)

zs_stat_init();

+ register_extra_meminfo(&vm_zone_stat[NR_ZSPAGES], 0, "ZsPages");
return 0;

hp_setup_fail:
@@ -2583,6 +2584,7 @@ static void __exit zs_exit(void)
cpuhp_remove_state(CPUHP_MM_ZS_PREPARE);

zs_stat_exit();
+ unregister_extra_meminfo(&vm_zone_stat[NR_ZSPAGES]);
}

module_init(zs_init);
--
2.13.7

2020-03-11 03:46:46

by Jaewon Kim

Subject: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo

Provide APIs to drivers so that they can show their memory usage in
/proc/meminfo.

int register_extra_meminfo(atomic_long_t *val, int shift,
const char *name);
int unregister_extra_meminfo(atomic_long_t *val);

Signed-off-by: Jaewon Kim <[email protected]>
---
fs/proc/meminfo.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/mm.h | 4 +++
lib/show_mem.c | 1 +
3 files changed, 108 insertions(+)

diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 8c1f1bb1a5ce..12b1f77b091b 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -13,6 +13,7 @@
#include <linux/vmstat.h>
#include <linux/atomic.h>
#include <linux/vmalloc.h>
+#include <linux/slab.h>
#ifdef CONFIG_CMA
#include <linux/cma.h>
#endif
@@ -30,6 +31,106 @@ static void show_val_kb(struct seq_file *m, const char *s, unsigned long num)
seq_write(m, " kB\n", 4);
}

+static LIST_HEAD(meminfo_head);
+static DEFINE_SPINLOCK(meminfo_lock);
+
+#define NAME_SIZE 15
+#define NAME_BUF_SIZE (NAME_SIZE + 2) /* ':' and '\0' */
+
+struct extra_meminfo {
+	struct list_head list;
+	atomic_long_t *val;
+	int shift_for_page;
+	char name[NAME_BUF_SIZE];
+	char name_pad[NAME_BUF_SIZE];
+};
+
+int register_extra_meminfo(atomic_long_t *val, int shift, const char *name)
+{
+	struct extra_meminfo *meminfo, *memtemp;
+	int len;
+	int error = 0;
+
+	meminfo = kzalloc(sizeof(*meminfo), GFP_KERNEL);
+	if (!meminfo) {
+		error = -ENOMEM;
+		goto out;
+	}
+
+	meminfo->val = val;
+	meminfo->shift_for_page = shift;
+	strncpy(meminfo->name, name, NAME_SIZE);
+	len = strlen(meminfo->name);
+	meminfo->name[len] = ':';
+	strncpy(meminfo->name_pad, meminfo->name, NAME_BUF_SIZE);
+	while (++len < NAME_BUF_SIZE - 1)
+		meminfo->name_pad[len] = ' ';
+
+	spin_lock(&meminfo_lock);
+	list_for_each_entry(memtemp, &meminfo_head, list) {
+		if (memtemp->val == val) {
+			error = -EINVAL;
+			break;
+		}
+	}
+	if (!error)
+		list_add_tail(&meminfo->list, &meminfo_head);
+	spin_unlock(&meminfo_lock);
+	if (error)
+		kfree(meminfo);
+out:
+
+	return error;
+}
+EXPORT_SYMBOL(register_extra_meminfo);
+
+int unregister_extra_meminfo(atomic_long_t *val)
+{
+	struct extra_meminfo *memtemp, *memtemp2;
+	int error = -EINVAL;
+
+	spin_lock(&meminfo_lock);
+	list_for_each_entry_safe(memtemp, memtemp2, &meminfo_head, list) {
+		if (memtemp->val == val) {
+			list_del(&memtemp->list);
+			error = 0;
+			break;
+		}
+	}
+	spin_unlock(&meminfo_lock);
+	kfree(memtemp);
+
+	return error;
+}
+EXPORT_SYMBOL(unregister_extra_meminfo);
+
+static void __extra_meminfo(struct seq_file *m)
+{
+	struct extra_meminfo *memtemp;
+	unsigned long nr_page;
+
+	spin_lock(&meminfo_lock);
+	list_for_each_entry(memtemp, &meminfo_head, list) {
+		nr_page = (unsigned long)atomic_long_read(memtemp->val);
+		nr_page = nr_page >> memtemp->shift_for_page;
+		if (m)
+			show_val_kb(m, memtemp->name_pad, nr_page);
+		else
+			pr_cont("%s%lukB ", memtemp->name, nr_page);
+	}
+	spin_unlock(&meminfo_lock);
+}
+
+void extra_meminfo_log(void)
+{
+	__extra_meminfo(NULL);
+}
+
+static void extra_meminfo_proc(struct seq_file *m)
+{
+	__extra_meminfo(m);
+}
+
+
static int meminfo_proc_show(struct seq_file *m, void *v)
{
struct sysinfo i;
@@ -148,6 +249,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)

arch_report_meminfo(m);

+ extra_meminfo_proc(m);
+
return 0;
}

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c54fb96cb1e6..457570ddd17c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2902,6 +2902,10 @@ void __init setup_nr_node_ids(void);
static inline void setup_nr_node_ids(void) {}
#endif

+void extra_meminfo_log(void);
+int register_extra_meminfo(atomic_long_t *val, int shift, const char *name);
+int unregister_extra_meminfo(atomic_long_t *val);
+
extern int memcmp_pages(struct page *page1, struct page *page2);

static inline int pages_identical(struct page *page1, struct page *page2)
diff --git a/lib/show_mem.c b/lib/show_mem.c
index 1c26c14ffbb9..48be5afaca0a 100644
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -14,6 +14,7 @@ void show_mem(unsigned int filter, nodemask_t *nodemask)
unsigned long total = 0, reserved = 0, highmem = 0;

printk("Mem-Info:\n");
+ extra_meminfo_log();
show_free_areas(filter, nodemask);

for_each_online_pgdat(pgdat) {
--
2.13.7
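
The role of the `shift` argument may be easier to see as arithmetic. The
sketch below assumes 4 KiB pages (PAGE_SHIFT == 12); counter_to_pages(),
pages_to_kb() and SKETCH_PAGE_SHIFT are hypothetical names used only for
illustration. The first step mirrors the `>> shift_for_page` in
__extra_meminfo(), and the second mirrors the pages-to-kB conversion that
show_val_kb() performs in fs/proc/meminfo.c.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, i.e. PAGE_SHIFT == 12 on the target. */
#define SKETCH_PAGE_SHIFT 12

/* Step 1: driver counter -> pages, as done with shift_for_page. */
unsigned long counter_to_pages(unsigned long counter, int shift)
{
	assert(shift >= 0);
	return counter >> shift;
}

/* Step 2: pages -> kB, the conversion show_val_kb() applies. */
unsigned long pages_to_kb(unsigned long pages)
{
	return pages << (SKETCH_PAGE_SHIFT - 10);
}
```

Under these assumptions, a driver that counts bytes would register with
shift = PAGE_SHIFT, and one that already counts pages would register with
shift = 0. Note that the posted pr_cont() path prints the shifted value
directly with a "kB" suffix, without step 2, so the log and /proc outputs
may not be in the same unit.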

2020-03-11 03:46:50

by Jaewon Kim

Subject: [RFC PATCH 3/3] android: ion: include system heap size in proc/meminfo

On Android systems the ION system heap is huge, on the order of hundreds of
MB. To know the overall system memory usage, include the ION system heap
size in /proc/meminfo.

To include the heap size, use register_extra_meminfo introduced earlier in
this series.

Before registering, we need to add a statistic that tracks ION heap usage.
Add total_allocated to the ion heap and update it on allocation and freeing.
In an ion heap using ION_HEAP_FLAG_DEFER_FREE, a buffer can be freed by the
user but still live on the deferred free list. Keep it counted until the
buffer is finally freed so that we also cover the case where the deferred
free thread is stuck.

e.g. cat /proc/meminfo | grep IonSystemHeap
IonSystemHeap:    242620 kB

e.g. show_mem on OOM
<6>[ 420.856428] Mem-Info:
<6>[ 420.856433] IonSystemHeap:32813kB

Signed-off-by: Jaewon Kim <[email protected]>
---
drivers/staging/android/ion/ion.c | 2 ++
drivers/staging/android/ion/ion.h | 1 +
drivers/staging/android/ion/ion_system_heap.c | 2 ++
3 files changed, 5 insertions(+)

diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 38b51eace4f9..76db91a9f26a 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -74,6 +74,7 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap,

INIT_LIST_HEAD(&buffer->attachments);
mutex_init(&buffer->lock);
+ atomic_long_add(len, &heap->total_allocated);
return buffer;

err1:
@@ -95,6 +96,7 @@ void ion_buffer_destroy(struct ion_buffer *buffer)
buffer->heap->num_of_buffers--;
buffer->heap->num_of_alloc_bytes -= buffer->size;
spin_unlock(&buffer->heap->stat_lock);
+ atomic_long_sub(buffer->size, &buffer->heap->total_allocated);

kfree(buffer);
}
diff --git a/drivers/staging/android/ion/ion.h b/drivers/staging/android/ion/ion.h
index 74914a266e25..10867a2e5728 100644
--- a/drivers/staging/android/ion/ion.h
+++ b/drivers/staging/android/ion/ion.h
@@ -157,6 +157,7 @@ struct ion_heap {
u64 num_of_buffers;
u64 num_of_alloc_bytes;
u64 alloc_bytes_wm;
+ atomic_long_t total_allocated;

/* protect heap statistics */
spinlock_t stat_lock;
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b83a1d16bd89..2cc568e2bc9c 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -259,6 +259,8 @@ static struct ion_heap *__ion_system_heap_create(void)
if (ion_system_heap_create_pools(heap->pools))
goto free_heap;

+ register_extra_meminfo(&heap->heap.total_allocated, PAGE_SHIFT,
+ "IonSystemHeap");
return &heap->heap;

free_heap:
--
2.13.7

2020-03-11 06:20:16

by Sergey Senozhatsky

Subject: Re: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo

On (20/03/11 12:44), Jaewon Kim wrote:
[..]
> +#define NAME_SIZE 15
> +#define NAME_BUF_SIZE (NAME_SIZE + 2) /* ':' and '\0' */
> +
> +struct extra_meminfo {
> + struct list_head list;
> + atomic_long_t *val;
> + int shift_for_page;
> + char name[NAME_BUF_SIZE];
> + char name_pad[NAME_BUF_SIZE];
> +};
> +
> +int register_extra_meminfo(atomic_long_t *val, int shift, const char *name)
> +{
> + struct extra_meminfo *meminfo, *memtemp;
> + int len;
> + int error = 0;
> +
> + meminfo = kzalloc(sizeof(*meminfo), GFP_KERNEL);
> + if (!meminfo) {
> + error = -ENOMEM;
> + goto out;
> + }
> +
> + meminfo->val = val;
> + meminfo->shift_for_page = shift;
> + strncpy(meminfo->name, name, NAME_SIZE);
> + len = strlen(meminfo->name);
> + meminfo->name[len] = ':';
> + strncpy(meminfo->name_pad, meminfo->name, NAME_BUF_SIZE);

What happens if there is no NULL byte among the first NAME_SIZE bytes
of passed `name'?

[..]

> + spin_lock(&meminfo_lock);

Does this need to be a spinlock?

-ss

2020-03-11 06:26:02

by Sergey Senozhatsky

Subject: Re: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo

On (20/03/11 15:18), Sergey Senozhatsky wrote:
> On (20/03/11 12:44), Jaewon Kim wrote:
> [..]
> > +#define NAME_SIZE 15
> > +#define NAME_BUF_SIZE (NAME_SIZE + 2) /* ':' and '\0' */
> > +
> > +struct extra_meminfo {
> > + struct list_head list;
> > + atomic_long_t *val;
> > + int shift_for_page;
> > + char name[NAME_BUF_SIZE];
> > + char name_pad[NAME_BUF_SIZE];
> > +};
> > +
> > +int register_extra_meminfo(atomic_long_t *val, int shift, const char *name)
> > +{
> > + struct extra_meminfo *meminfo, *memtemp;
> > + int len;
> > + int error = 0;
> > +
> > + meminfo = kzalloc(sizeof(*meminfo), GFP_KERNEL);
> > + if (!meminfo) {
> > + error = -ENOMEM;
> > + goto out;
> > + }
> > +
> > + meminfo->val = val;
> > + meminfo->shift_for_page = shift;
> > + strncpy(meminfo->name, name, NAME_SIZE);
> > + len = strlen(meminfo->name);
> > + meminfo->name[len] = ':';
> > + strncpy(meminfo->name_pad, meminfo->name, NAME_BUF_SIZE);
>
> What happens if there is no NULL byte among the first NAME_SIZE bytes
> of passed `name'?

Ah. The buffer size is NAME_BUF_SIZE, so should be fine.

-ss
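
The bounds question raised in this exchange can be checked with a small
userspace sketch that mirrors the string handling in
register_extra_meminfo(). It is only an illustration: struct name_pair and
format_name() are hypothetical names, and calloc() stands in for kzalloc().
The point it shows is that strncpy() copies at most NAME_SIZE (15) bytes
into a zeroed 17-byte buffer, so strlen() never exceeds 15 and there is
still room for ':' and the terminating NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_SIZE 15
#define NAME_BUF_SIZE (NAME_SIZE + 2)	/* ':' and '\0' */

struct name_pair {
	char name[NAME_BUF_SIZE];	/* "Label:" as used for the log line */
	char name_pad[NAME_BUF_SIZE];	/* "Label:" padded with spaces for /proc */
};

/* Mirrors the string handling of the posted patch; the caller frees. */
struct name_pair *format_name(const char *name)
{
	struct name_pair *p;
	int len;

	assert(name != NULL);
	p = calloc(1, sizeof(*p));	/* zeroed, like kzalloc() */
	if (!p)
		return NULL;

	/* Copies at most 15 bytes; bytes 15 and 16 stay zero from calloc(). */
	strncpy(p->name, name, NAME_SIZE);
	len = strlen(p->name);		/* therefore len <= 15 */
	p->name[len] = ':';		/* index <= 15, NUL remains at <= 16 */
	strncpy(p->name_pad, p->name, NAME_BUF_SIZE);
	while (++len < NAME_BUF_SIZE - 1)
		p->name_pad[len] = ' ';	/* pad the label out to 16 columns */
	return p;
}
```

With an over-long input, the label is silently truncated to 15 characters
rather than overflowing, which matches the conclusion reached above.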

2020-03-11 06:32:16

by Jaewon Kim

Subject: Re: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo



On 2020-03-11 15:25, Sergey Senozhatsky wrote:
> On (20/03/11 15:18), Sergey Senozhatsky wrote:
>> On (20/03/11 12:44), Jaewon Kim wrote:
>> [..]
>>> +#define NAME_SIZE 15
>>> +#define NAME_BUF_SIZE (NAME_SIZE + 2) /* ':' and '\0' */
>>> +
>>> +struct extra_meminfo {
>>> + struct list_head list;
>>> + atomic_long_t *val;
>>> + int shift_for_page;
>>> + char name[NAME_BUF_SIZE];
>>> + char name_pad[NAME_BUF_SIZE];
>>> +};
>>> +
>>> +int register_extra_meminfo(atomic_long_t *val, int shift, const char *name)
>>> +{
>>> + struct extra_meminfo *meminfo, *memtemp;
>>> + int len;
>>> + int error = 0;
>>> +
>>> + meminfo = kzalloc(sizeof(*meminfo), GFP_KERNEL);
>>> + if (!meminfo) {
>>> + error = -ENOMEM;
>>> + goto out;
>>> + }
>>> +
>>> + meminfo->val = val;
>>> + meminfo->shift_for_page = shift;
>>> + strncpy(meminfo->name, name, NAME_SIZE);
>>> + len = strlen(meminfo->name);
>>> + meminfo->name[len] = ':';
>>> + strncpy(meminfo->name_pad, meminfo->name, NAME_BUF_SIZE);
>> What happens if there is no NULL byte among the first NAME_SIZE bytes
>> of passed `name'?
> Ah. The buffer size is NAME_BUF_SIZE, so should be fine.
>
> -ss
Hello, yes, that is correct.

Regarding your comment on the spinlock, it could be changed to another lock such as an rw semaphore.
I think there are just a couple of writers compared to many readers.
Thank you for your comment.
>
>

2020-03-11 07:27:20

by Leon Romanovsky

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Wed, Mar 11, 2020 at 12:44:38PM +0900, Jaewon Kim wrote:
> /proc/meminfo or show_free_areas does not show full system wide memory
> usage status. There seems to be huge hidden memory especially on
> embedded Android system. Because it usually have some HW IP which do not
> have internal memory and use common DRAM memory.
>
> In Android system, most of those hidden memory seems to be vmalloc pages
> , ion system heap memory, graphics memory, and memory for DRAM based
> compressed swap storage. They may be shown in other node but it seems to
> useful if /proc/meminfo shows all those extra memory information. And
> show_mem also need to print the info in oom situation.
>
> Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> memory using zsmalloc can be seen through vmstat by commit 91537fee0013
> ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
>
> Memory usage of specific driver can be various so that showing the usage
> through upstream meminfo.c is not easy. To print the extra memory usage
> of a driver, introduce following APIs. Each driver needs to count as
> atomic_long_t.
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
> const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
>
> Currently register ION system heap allocator and zsmalloc pages.
> Additionally tested on local graphics driver.
>
> i.e) cat /proc/meminfo | tail -3
> IonSystemHeap: 242620 kB
> ZsPages: 203860 kB
> GraphicDriver: 196576 kB
>
> i.e.) show_mem on oom
> <6>[ 420.856428] Mem-Info:
> <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0

The idea is nice and helpful, but I'm sure that the interface will be abused
almost immediately. I expect that every driver will register to such API.

First it will be done by "large" drivers and after that everyone will copy/paste.

Thanks

2020-03-11 17:37:23

by Alexey Dobriyan

Subject: Re: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo

On Wed, Mar 11, 2020 at 12:44:39PM +0900, Jaewon Kim wrote:
> Provide APIs to drivers so that they can show its memory usage on
> /proc/meminfo.
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
> const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);

> + show_val_kb(m, memtemp->name_pad, nr_page);

I have 3 issues.

1) Can this be printed without "KB" piece and without useless whitespace,
like /proc/vmstat does?

I don't know how you parse /proc/meminfo.
Do you search for a specific string or do you use some kind of map[k] = v
interface?

2) zsmalloc can create a top-level symlink and resolve it to the necessary
value. It will take only 1 readlink(2) system call to fetch it.

3) Android can do the same

For simple values there is no need to register stuff and create
mini subsystems.

/proc/alexey
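
For readers wondering what the "search for a specific string" style of
parsing looks like against the current "Key:   value kB" layout, here is a
minimal userspace sketch. parse_meminfo_line() and meminfo_lookup() are
hypothetical helper names, not part of any existing tool, and the sketch
assumes labels shorter than 32 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "Key:   value kB" line. key must hold at least 32 bytes.
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_meminfo_line(const char *line, char *key, unsigned long *kb)
{
	assert(line != NULL && key != NULL && kb != NULL);
	/* %31[^:] reads the label up to the colon; %lu skips the padding. */
	if (sscanf(line, "%31[^:]: %lu", key, kb) != 2)
		return -1;
	return 0;
}

/* Scan a meminfo-style stream for one key; returns 0 if it was found. */
int meminfo_lookup(FILE *f, const char *want, unsigned long *kb)
{
	char line[128], key[32];
	unsigned long v;

	assert(f != NULL && want != NULL && kb != NULL);
	while (fgets(line, sizeof(line), f)) {
		if (parse_meminfo_line(line, key, &v) == 0 &&
		    strcmp(key, want) == 0) {
			*kb = v;
			return 0;
		}
	}
	return -1;
}
```

A parser of this shape tolerates the padding and the trailing unit, which
is one reason the extra whitespace matters less to string-searching tools
than to a strict key/value reader.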

2020-03-13 04:42:28

by Jaewon Kim

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo



On 2020-03-11 16:25, Leon Romanovsky wrote:
> On Wed, Mar 11, 2020 at 12:44:38PM +0900, Jaewon Kim wrote:
>> /proc/meminfo or show_free_areas does not show full system wide memory
>> usage status. There seems to be huge hidden memory especially on
>> embedded Android system. Because it usually have some HW IP which do not
>> have internal memory and use common DRAM memory.
>>
>> In Android system, most of those hidden memory seems to be vmalloc pages
>> , ion system heap memory, graphics memory, and memory for DRAM based
>> compressed swap storage. They may be shown in other node but it seems to
>> useful if /proc/meminfo shows all those extra memory information. And
>> show_mem also need to print the info in oom situation.
>>
>> Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
>> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
>> memory using zsmalloc can be seen through vmstat by commit 91537fee0013
>> ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
>>
>> Memory usage of specific driver can be various so that showing the usage
>> through upstream meminfo.c is not easy. To print the extra memory usage
>> of a driver, introduce following APIs. Each driver needs to count as
>> atomic_long_t.
>>
>> int register_extra_meminfo(atomic_long_t *val, int shift,
>> const char *name);
>> int unregister_extra_meminfo(atomic_long_t *val);
>>
>> Currently register ION system heap allocator and zsmalloc pages.
>> Additionally tested on local graphics driver.
>>
>> i.e) cat /proc/meminfo | tail -3
>> IonSystemHeap: 242620 kB
>> ZsPages: 203860 kB
>> GraphicDriver: 196576 kB
>>
>> i.e.) show_mem on oom
>> <6>[ 420.856428] Mem-Info:
>> <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
>> <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0
> The idea is nice and helpful, but I'm sure that the interface will be abused
> almost immediately. I expect that every driver will register to such API.
>
> First it will be done by "large" drivers and after that everyone will copy/paste.
I thought that using it is up to driver developers.
If it is abused, /proc/meminfo will show too much information for that device.
What about a new node, /proc/meminfo_extra, to gather this information without
cluttering the original /proc/meminfo?

Thank you
> Thanks
>
>

2020-03-13 04:54:06

by Jaewon Kim

Subject: Re: [RFC PATCH 1/3] proc/meminfo: introduce extra meminfo



On 2020-03-12 02:35, Alexey Dobriyan wrote:
> On Wed, Mar 11, 2020 at 12:44:39PM +0900, Jaewon Kim wrote:
>> Provide APIs to drivers so that they can show its memory usage on
>> /proc/meminfo.
>>
>> int register_extra_meminfo(atomic_long_t *val, int shift,
>> const char *name);
>> int unregister_extra_meminfo(atomic_long_t *val);
>> + show_val_kb(m, memtemp->name_pad, nr_page);
> I have 3 issues.
>
> Can this be printed without "KB" piece and without useless whitespace,
> like /proc/vmstat does?
>
> I don't know how do you parse /proc/meminfo.
> Do you search for specific string or do you use some kind of map[k] = v
> interface?
If needed, I can remove the kB, but show_free_areas also seems to print kB for some stats.
I intentionally added the ':' and spaces to match the format of the other entries in /proc/meminfo, e.g.:
show_val_kb(m, "MemTotal:       ", i.totalram);
>
> 2) zsmalloc can create top-level symlink and resolve it to necessary value.
> It will be only 1 readlink(2) system call to fetch it.
Yes, it could be done with the readlink system call from userspace, but I wanted
to see all the large memory stats in one node, /proc/meminfo.
>
> 3) android can do the same
>
> For simple values there is no need to register stuff and create
> mini subsystems.
>
> /proc/alexey
OK. As discussed with Leon Romanovsky, I may be able to move this to another node, /proc/meminfo_extra.
Please let me know if this is still worth reviewing.

Thank you
>
>
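
The readlink(2) approach discussed in this sub-thread would look roughly
like this from userspace: the value is stored as the target text of a
symlink, so one system call fetches it. This is a sketch under
assumptions; read_value_from_symlink() is a hypothetical helper, and no
such symlink is provided by the posted series.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read a decimal number stored as the target of a symlink.
 * Returns 0 on success, -1 on failure.
 */
int read_value_from_symlink(const char *path, unsigned long *out)
{
	char buf[32];
	char *end;
	ssize_t n;

	assert(path != NULL && out != NULL);
	n = readlink(path, buf, sizeof(buf) - 1);	/* no NUL is appended */
	if (n <= 0)
		return -1;
	buf[n] = '\0';

	errno = 0;
	*out = strtoul(buf, &end, 10);
	if (errno != 0 || end == buf || *end != '\0')
		return -1;
	return 0;
}
```

The trade-off raised in the reply still applies: each value needs its own
path and its own call, whereas a single node shows every counter in one
read.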

2020-03-13 07:22:23

by Leon Romanovsky

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Fri, Mar 13, 2020 at 01:39:14PM +0900, Jaewon Kim wrote:
>
>
> On 2020-03-11 16:25, Leon Romanovsky wrote:
> > On Wed, Mar 11, 2020 at 12:44:38PM +0900, Jaewon Kim wrote:
> >> /proc/meminfo or show_free_areas does not show full system wide memory
> >> usage status. There seems to be huge hidden memory especially on
> >> embedded Android system. Because it usually have some HW IP which do not
> >> have internal memory and use common DRAM memory.
> >>
> >> In Android system, most of those hidden memory seems to be vmalloc pages
> >> , ion system heap memory, graphics memory, and memory for DRAM based
> >> compressed swap storage. They may be shown in other node but it seems to
> >> useful if /proc/meminfo shows all those extra memory information. And
> >> show_mem also need to print the info in oom situation.
> >>
> >> Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
> >> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> >> memory using zsmalloc can be seen through vmstat by commit 91537fee0013
> >> ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
> >>
> >> Memory usage of specific driver can be various so that showing the usage
> >> through upstream meminfo.c is not easy. To print the extra memory usage
> >> of a driver, introduce following APIs. Each driver needs to count as
> >> atomic_long_t.
> >>
> >> int register_extra_meminfo(atomic_long_t *val, int shift,
> >> const char *name);
> >> int unregister_extra_meminfo(atomic_long_t *val);
> >>
> >> Currently register ION system heap allocator and zsmalloc pages.
> >> Additionally tested on local graphics driver.
> >>
> >> i.e) cat /proc/meminfo | tail -3
> >> IonSystemHeap: 242620 kB
> >> ZsPages: 203860 kB
> >> GraphicDriver: 196576 kB
> >>
> >> i.e.) show_mem on oom
> >> <6>[ 420.856428] Mem-Info:
> >> <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> >> <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0
> > The idea is nice and helpful, but I'm sure that the interface will be abused
> > almost immediately. I expect that every driver will register to such API.
> >
> > First it will be done by "large" drivers and after that everyone will copy/paste.
> I thought using it is up to driver developers.
> If it is abused, /proc/meminfo will show too much info. for that device.
> What about a new node, /proc/meminfo_extra, to gather those info. and not
> corrupting original /proc/meminfo.

I don't know if it is applicable to all users, but for drivers such
information is better placed in /sys/ as a separate file (for example
/sys/class/net/wlp3s0/*), with driver/core responsible for registering
and unregistering it.

This ensures that all drivers get this information without needing to
register, and without making /proc/meminfo and /proc/meminfo_extra too large.

Thanks

>
> Thank you
> > Thanks
> >
> >
>

2020-03-13 15:20:45

by Vlastimil Babka

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

+CC linux-api, please include in future versions as well

On 3/11/20 4:44 AM, Jaewon Kim wrote:
> /proc/meminfo or show_free_areas does not show full system wide memory
> usage status. There seems to be huge hidden memory especially on
> embedded Android system. Because it usually have some HW IP which do not
> have internal memory and use common DRAM memory.
>
> In Android system, most of those hidden memory seems to be vmalloc pages
> , ion system heap memory, graphics memory, and memory for DRAM based
> compressed swap storage. They may be shown in other node but it seems to
> useful if /proc/meminfo shows all those extra memory information. And
> show_mem also need to print the info in oom situation.
>
> Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> memory using zsmalloc can be seen through vmstat by commit 91537fee0013
> ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
>
> Memory usage of specific driver can be various so that showing the usage
> through upstream meminfo.c is not easy. To print the extra memory usage
> of a driver, introduce following APIs. Each driver needs to count as
> atomic_long_t.
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
> const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
>
> Currently register ION system heap allocator and zsmalloc pages.
> Additionally tested on local graphics driver.
>
> i.e) cat /proc/meminfo | tail -3
> IonSystemHeap: 242620 kB
> ZsPages: 203860 kB
> GraphicDriver: 196576 kB
>
> i.e.) show_mem on oom
> <6>[ 420.856428] Mem-Info:
> <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0

I like the idea and the dynamic nature of this, so that drivers not present
wouldn't add lots of useless zeroes to the output.
It also makes simpler the decisions of "what is important enough to need its own
meminfo entry".

The suggestion for hunting per-driver /sys files would only work if there was a
common name to such files so one can find(1) them easily.
It also doesn't work for the oom/failed alloc warning output.

I think a new meminfo_extra file is a reasonable compromise, as there might be
tools periodically reading /proc/meminfo and thus we would limit the overhead of
that.

> Jaewon Kim (3):
> proc/meminfo: introduce extra meminfo
> mm: zsmalloc: include zs page size in proc/meminfo
> android: ion: include system heap size in proc/meminfo
>
> drivers/staging/android/ion/ion.c | 2 +
> drivers/staging/android/ion/ion.h | 1 +
> drivers/staging/android/ion/ion_system_heap.c | 2 +
> fs/proc/meminfo.c | 103 ++++++++++++++++++++++++++
> include/linux/mm.h | 4 +
> lib/show_mem.c | 1 +
> mm/zsmalloc.c | 2 +
> 7 files changed, 115 insertions(+)
>

2020-03-13 17:49:06

by Leon Romanovsky

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Fri, Mar 13, 2020 at 04:19:36PM +0100, Vlastimil Babka wrote:
> +CC linux-api, please include in future versions as well
>
> On 3/11/20 4:44 AM, Jaewon Kim wrote:
> > /proc/meminfo or show_free_areas does not show full system wide memory
> > usage status. There seems to be huge hidden memory especially on
> > embedded Android system. Because it usually have some HW IP which do not
> > have internal memory and use common DRAM memory.
> >
> > In Android system, most of those hidden memory seems to be vmalloc pages
> > , ion system heap memory, graphics memory, and memory for DRAM based
> > compressed swap storage. They may be shown in other node but it seems to
> > useful if /proc/meminfo shows all those extra memory information. And
> > show_mem also need to print the info in oom situation.
> >
> > Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
> > ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> > memory using zsmalloc can be seen through vmstat by commit 91537fee0013
> > ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
> >
> > Memory usage of specific driver can be various so that showing the usage
> > through upstream meminfo.c is not easy. To print the extra memory usage
> > of a driver, introduce following APIs. Each driver needs to count as
> > atomic_long_t.
> >
> > int register_extra_meminfo(atomic_long_t *val, int shift,
> > const char *name);
> > int unregister_extra_meminfo(atomic_long_t *val);
> >
> > Currently register ION system heap allocator and zsmalloc pages.
> > Additionally tested on local graphics driver.
> >
> > i.e) cat /proc/meminfo | tail -3
> > IonSystemHeap: 242620 kB
> > ZsPages: 203860 kB
> > GraphicDriver: 196576 kB
> >
> > i.e.) show_mem on oom
> > <6>[ 420.856428] Mem-Info:
> > <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> > <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0
>
> I like the idea and the dynamic nature of this, so that drivers not present
> wouldn't add lots of useless zeroes to the output.
> It also makes simpler the decisions of "what is important enough to need its own
> meminfo entry".
>
> The suggestion for hunting per-driver /sys files would only work if there was a
> common name to such files so once can find(1) them easily.
> It also doesn't work for the oom/failed alloc warning output.

Of course there is a need to have a stable name for such output; this
is why driver/core should be responsible for it, and not driver authors.

The use case I had in mind is slightly different from looking into OOMs.

I'm interested in optimizing the memory footprint of our drivers to
allow better scaling in SR-IOV mode, where one device creates many separate
copies of itself. Those copies can easily take gigabytes of RAM due to
the need to optimize for high-performance networking. Sometimes the
amount of memory, and not the HW, is what actually limits the scale factor.

So I would imagine this feature being used as an aid for driver
developers and not for runtime decisions.

My 2 cents.

Thanks

2020-03-16 04:08:00

by Jaewon Kim

Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo



On 2020-03-14 02:48, Leon Romanovsky wrote:
> On Fri, Mar 13, 2020 at 04:19:36PM +0100, Vlastimil Babka wrote:
>> +CC linux-api, please include in future versions as well
>>
>> I like the idea and the dynamic nature of this, so that drivers not present
>> wouldn't add lots of useless zeroes to the output.
>> It also makes simpler the decisions of "what is important enough to need its own
>> meminfo entry".
>>
>> The suggestion for hunting per-driver /sys files would only work if there was a
>> common name to such files so once can find(1) them easily.
>> It also doesn't work for the oom/failed alloc warning output.
> Of course there is a need to have a stable name for such an output, this
> is why driver/core should be responsible for that and not drivers authors.
>
> The use case which I had in mind slightly different than to look after OOM.
>
> I'm interested to optimize our drivers in their memory footprint to
> allow better scale in SR-IOV mode where one device creates many separate
> copies of itself. Those copies easily can take gigabytes of RAM due to
> the need to optimize for high-performance networking. Sometimes the
> amount of memory and not HW is actually limits the scale factor.
>
> So I would imagine this feature being used as an aid for the driver
> developers and not for the runtime decisions.
>
> My 2-cents.
>
> Thanks
>
>
Thank you for your comment.
I think my idea can help each driver developer see their driver's memory usage.
But I'd also like to see the overall memory usage through a single node.

Let me know if you have more comments.
In v2, I am planning to move my logic to a new node, /proc/meminfo_extra.
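
To illustrate what "overall memory usage through a single node" could mean
for a reader of such a file, here is a minimal userspace sketch in C. It
assumes every entry uses the "Name: value kB" layout shown in the cover
letter; the function name sum_meminfo_kb is hypothetical and is not part of
the patch set.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum the "Name: <value> kB" lines of a meminfo-style text buffer.
 * Assumption: entries follow the layout from the cover letter; lines that
 * do not match (or are longer than the local buffer) are skipped.
 */
long sum_meminfo_kb(const char *text)
{
	long total = 0;
	const char *line = text;

	while (*line) {
		char buf[128], name[64];
		long kb;
		size_t len = strcspn(line, "\n");

		if (len < sizeof(buf)) {
			/* Copy one line so sscanf cannot read past it. */
			memcpy(buf, line, len);
			buf[len] = '\0';
			if (sscanf(buf, " %63[^:]: %ld kB", name, &kb) == 2)
				total += kb;
		}
		line += len;
		if (*line == '\n')
			line++;
	}
	return total;
}
```

Fed the three example lines from the cover letter, this returns the sum of
the three kB values.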

Thank you
Jaewon Kim

2020-03-16 08:32:50

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
>
>
> Thank you for your comment.
> My idea, I think, may be able to help each driver developer to see their memory usage.
> But I'd like to see overall memory usage through the one node.

It is more than enough :).

>
> Let me know if you have more comment.
> I am planning to move my logic to be shown on a new node, /proc/meminfo_extra at v2.

Can you please help me understand what that file will look like once
many drivers start to use this interface? Will I see multiple
lines?

Something like:
driver1 ....
driver2 ....
driver3 ....
...
driver1000 ....

How can we extend it to support subsystem core code?

Thanks

>
> Thank you
> Jaewon Kim

2020-03-17 03:30:34

by Jaewon Kim

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Mon, Mar 16, 2020 at 5:32 PM, Leon Romanovsky <[email protected]> wrote:
>
> On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
> >
> >
> > Thank you for your comment.
> > My idea, I think, may be able to help each driver developer to see their memory usage.
> > But I'd like to see overall memory usage through the one node.
>
> It is more than enough :).
>
> >
> > Let me know if you have more comment.
> > I am planning to move my logic to be shown on a new node, /proc/meminfo_extra at v2.
>
> Can you please help me to understand how that file will look like once
> many drivers will start to use this interface? Will I see multiple
> lines?
>
> Something like:
> driver1 ....
> driver2 ....
> driver3 ....
> ...
> driver1000 ....
>
> How can we extend it to support subsystems core code?

I do not have a plan to support subsystem core code.

I just want /proc/meminfo_extra to show the size of memory allocated
through the alloc_pages APIs rather than the slub size. The purpose is to
expose huge hidden memory.
I think most drivers do not need to register their size in
/proc/meminfo_extra, because drivers usually use the slub APIs rather
than the alloc_pages APIs.
/proc/slabinfo already shows the slub size in detail.

As candidates for /proc/meminfo_extra, I expect only a few drivers that use
a huge amount of memory, like over 100 MB, obtained from the alloc_pages APIs.

As you say, if there is a static node in /sys for each driver, it may
be used for all the drivers.
I think the sysfs class approach may be better for showing a categorized sum.
But /proc/meminfo_extra can be another way to show that huge hidden memory.
I mean your idea and my idea are not mutually exclusive.
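
The registration pattern discussed here can be sketched in userspace C as
follows. This is only a mock under stated assumptions, not the code from the
patches: it uses C11 atomic_long in place of the kernel's atomic_long_t, a
fixed-size table, and it assumes that "shift" is a left shift converting the
counter into kB (for example 2 for a counter of 4 kB pages). The helper
show_extra_meminfo is hypothetical; only the two register/unregister
prototypes come from the cover letter.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define EXTRA_MEMINFO_MAX 16	/* arbitrary table size for this sketch */

struct extra_meminfo {
	atomic_long *val;	/* counter owned and updated by the driver */
	int shift;		/* assumption: left shift turning the count into kB */
	const char *name;	/* label printed in the output */
};

static struct extra_meminfo entries[EXTRA_MEMINFO_MAX];

/* Store the counter in the first free slot; -1 if the table is full. */
int register_extra_meminfo(atomic_long *val, int shift, const char *name)
{
	for (size_t i = 0; i < EXTRA_MEMINFO_MAX; i++) {
		if (!entries[i].val) {
			entries[i] = (struct extra_meminfo){ val, shift, name };
			return 0;
		}
	}
	return -1;
}

/* Drop the slot that points at @val; -1 if it was never registered. */
int unregister_extra_meminfo(atomic_long *val)
{
	for (size_t i = 0; i < EXTRA_MEMINFO_MAX; i++) {
		if (entries[i].val == val) {
			entries[i].val = NULL;
			return 0;
		}
	}
	return -1;
}

/* Hypothetical helper: one "Name: <kB> kB" line per registered counter. */
int show_extra_meminfo(char *buf, size_t size)
{
	size_t used = 0;

	if (size)
		buf[0] = '\0';
	for (size_t i = 0; i < EXTRA_MEMINFO_MAX; i++) {
		if (!entries[i].val)
			continue;
		long kb = atomic_load(entries[i].val) << entries[i].shift;
		int n = snprintf(buf + used, size - used, "%s: %ld kB\n",
				 entries[i].name, kb);
		if (n < 0 || (size_t)n >= size - used)
			break;	/* output would not fit; stop here */
		used += (size_t)n;
	}
	return (int)used;
}
```

A driver would bump its counter with atomic_fetch_add() next to each page
allocation and register it once at init time. Drivers that never register
add no lines to the output.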

Thank you
>
> Thanks
>
> >
> > Thank you
> > Jaewon Kim

2020-03-17 14:38:12

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Tue, Mar 17, 2020 at 12:04:46PM +0900, Jaewon Kim wrote:
> On Mon, Mar 16, 2020 at 5:32 PM, Leon Romanovsky <[email protected]> wrote:
> >
> > On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
> > >
> > >
> > > Thank you for your comment.
> > > My idea, I think, may be able to help each driver developer to see their memory usage.
> > > But I'd like to see overall memory usage through the one node.
> >
> > It is more than enough :).
> >
> > >
> > > Let me know if you have more comment.
> > > I am planning to move my logic to be shown on a new node, /proc/meminfo_extra at v2.
> >
> > Can you please help me to understand how that file will look like once
> > many drivers will start to use this interface? Will I see multiple
> > lines?
> >
> > Something like:
> > driver1 ....
> > driver2 ....
> > driver3 ....
> > ...
> > driver1000 ....
> >
> > How can we extend it to support subsystems core code?
>
> I do not have a plan to support subsystem core.

Fair enough.

>
> I just want the /proc/meminfo_extra to show size of alloc_pages APIs
> rather than slub size. It is to show hidden huge memory.
> I think most of drivers do not need to register its size to
> /proc/meminfo_extra because
> drivers usually use slub APIs and rather than alloc_pages APIs.
> /proc/slabinfo helps for slub size in detail.

The problem with this statement is that the drivers that consume a lot of
memory are the ones interested in this interface. I may not be accurate
here, but I think that all RDMA and major NIC drivers will want to get this
information.

On my machine, it is something like 6 devices.

>
> As a candidate of /proc/meminfo_extra, I hope only few drivers using
> huge memory like over 100 MB got from alloc_pages APIs.
>
> As you say, if there is a static node on /sys for each driver, it may
> be used for all the drivers.
> I think sysfs class way may be better to show categorized sum size.
> But /proc/meminfo_extra can be another way to show those hidden huge memory.
> I mean your idea and my idea is not exclusive.

It is just better to have one interface.

>
> Thank you
> >
> > Thanks
> >
> > >
> > > Thank you
> > > Jaewon Kim
>

2020-03-18 08:59:52

by Jaewon Kim

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo



On 2020-03-17 23:37, Leon Romanovsky wrote:
> On Tue, Mar 17, 2020 at 12:04:46PM +0900, Jaewon Kim wrote:
>> On Mon, Mar 16, 2020 at 5:32 PM, Leon Romanovsky <[email protected]> wrote:
>>> On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
>>>>
>>>> Thank you for your comment.
>>>> My idea, I think, may be able to help each driver developer to see their memory usage.
>>>> But I'd like to see overall memory usage through the one node.
>>> It is more than enough :).
>>>
>>>> Let me know if you have more comment.
>>>> I am planning to move my logic to be shown on a new node, /proc/meminfo_extra at v2.
>>> Can you please help me to understand how that file will look like once
>>> many drivers will start to use this interface? Will I see multiple
>>> lines?
>>>
>>> Something like:
>>> driver1 ....
>>> driver2 ....
>>> driver3 ....
>>> ...
>>> driver1000 ....
>>>
>>> How can we extend it to support subsystems core code?
>> I do not have a plan to support subsystem core.
> Fair enough.
>
>> I just want the /proc/meminfo_extra to show size of alloc_pages APIs
>> rather than slub size. It is to show hidden huge memory.
>> I think most of drivers do not need to register its size to
>> /proc/meminfo_extra because
>> drivers usually use slub APIs and rather than alloc_pages APIs.
>> /proc/slabinfo helps for slub size in detail.
> The problem with this statement that the drivers that consuming memory
> are the ones who are interested in this interface. I can be not accurate
> here, but I think that all RDMA and major NICs will want to get this
> information.
>
> On my machine, it is something like 6 devices.
>
>> As a candidate of /proc/meminfo_extra, I hope only few drivers using
>> huge memory like over 100 MB got from alloc_pages APIs.
>>
>> As you say, if there is a static node on /sys for each driver, it may
>> be used for all the drivers.
>> I think sysfs class way may be better to show categorized sum size.
>> But /proc/meminfo_extra can be another way to show those hidden huge memory.
>> I mean your idea and my idea is not exclusive.
> It is just better to have one interface.
Sorry that my proposal does not give you that one interface.

If we need to create a meminfo_extra-like node in sysfs, then
I think further discussion with more people is needed.
If there is no logical problem with creating /proc/meminfo_extra,
I'd like to prepare a v2 patch and get more comments on it.
Please help again with the further discussion.

Thank you
>
>> Thank you
>>> Thanks
>>>
>>>> Thank you
>>>> Jaewon Kim
>

2020-03-18 11:00:08

by Leon Romanovsky

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On Wed, Mar 18, 2020 at 05:58:51PM +0900, Jaewon Kim wrote:
>
>
> On 2020-03-17 23:37, Leon Romanovsky wrote:
> > On Tue, Mar 17, 2020 at 12:04:46PM +0900, Jaewon Kim wrote:
> >> On Mon, Mar 16, 2020 at 5:32 PM, Leon Romanovsky <[email protected]> wrote:
> >>> On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
> >>>>
> >>>> Thank you for your comment.
> >>>> My idea, I think, may be able to help each driver developer to see their memory usage.
> >>>> But I'd like to see overall memory usage through the one node.
> >>> It is more than enough :).
> >>>
> >>>> Let me know if you have more comment.
> >>>> I am planning to move my logic to be shown on a new node, /proc/meminfo_extra at v2.
> >>> Can you please help me to understand how that file will look like once
> >>> many drivers will start to use this interface? Will I see multiple
> >>> lines?
> >>>
> >>> Something like:
> >>> driver1 ....
> >>> driver2 ....
> >>> driver3 ....
> >>> ...
> >>> driver1000 ....
> >>>
> >>> How can we extend it to support subsystems core code?
> >> I do not have a plan to support subsystem core.
> > Fair enough.
> >
> >> I just want the /proc/meminfo_extra to show size of alloc_pages APIs
> >> rather than slub size. It is to show hidden huge memory.
> >> I think most of drivers do not need to register its size to
> >> /proc/meminfo_extra because
> >> drivers usually use slub APIs and rather than alloc_pages APIs.
> >> /proc/slabinfo helps for slub size in detail.
> > The problem with this statement that the drivers that consuming memory
> > are the ones who are interested in this interface. I can be not accurate
> > here, but I think that all RDMA and major NICs will want to get this
> > information.
> >
> > On my machine, it is something like 6 devices.
> >
> >> As a candidate of /proc/meminfo_extra, I hope only few drivers using
> >> huge memory like over 100 MB got from alloc_pages APIs.
> >>
> >> As you say, if there is a static node on /sys for each driver, it may
> >> be used for all the drivers.
> >> I think sysfs class way may be better to show categorized sum size.
> >> But /proc/meminfo_extra can be another way to show those hidden huge memory.
> >> I mean your idea and my idea is not exclusive.
> > It is just better to have one interface.
> Sorry about that one interface.
>
> If we need to create a-meminfo_extra-like node on /sysfs, then
> I think further discussion with more people is needed.
> If there is no logical problem on creating /proc/meminfo_extra,
> I'd like to prepare v2 patch and get more comment on that v2
> patch. Please help again for further discussion.

No problem, but can you please put a summary of this discussion in the
cover letter of v2 and add Greg KH as the driver/core maintainer?

It will save us from going in circles.

Thanks

>
> Thank you
> >
> >> Thank you
> >>> Thanks
> >>>
> >>>> Thank you
> >>>> Jaewon Kim
> >
>

2020-03-20 10:02:25

by Dave Young

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] meminfo: introduce extra meminfo

On 03/11/20 at 12:44pm, Jaewon Kim wrote:
> /proc/meminfo or show_free_areas does not show full system wide memory
> usage status. There seems to be huge hidden memory especially on
> embedded Android system. Because it usually have some HW IP which do not
> have internal memory and use common DRAM memory.
>
> In Android system, most of those hidden memory seems to be vmalloc pages
> , ion system heap memory, graphics memory, and memory for DRAM based
> compressed swap storage. They may be shown in other node but it seems to
> useful if /proc/meminfo shows all those extra memory information. And
> show_mem also need to print the info in oom situation.
>
> Fortunately vmalloc pages is alread shown by commit 97105f0ab7b8
> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> memory using zsmalloc can be seen through vmstat by commit 91537fee0013
> ("mm: add NR_ZSMALLOC to vmstat") but not on /proc/meminfo.
>
> Memory usage of specific driver can be various so that showing the usage
> through upstream meminfo.c is not easy. To print the extra memory usage
> of a driver, introduce following APIs. Each driver needs to count as
> atomic_long_t.
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
> const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
>
> Currently register ION system heap allocator and zsmalloc pages.
> Additionally tested on local graphics driver.
>
> i.e) cat /proc/meminfo | tail -3
> IonSystemHeap: 242620 kB
> ZsPages: 203860 kB
> GraphicDriver: 196576 kB
>
> i.e.) show_mem on oom
> <6>[ 420.856428] Mem-Info:
> <6>[ 420.856433] IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> <6>[ 420.856450] active_anon:957205 inactive_anon:159383 isolated_anon:0

Kdump is also a use case for better memory usage info. It runs
with limited memory, and we see more OOM cases caused by device drivers
than by userspace processes.

I think this might be helpful if drivers can implement and register the
hook. But it would be ideal if we could have some tracing code to trace
memory alloc/free and get the memory usage info automatically.
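
As a rough illustration of getting usage numbers without each driver
counting by hand, here is a userspace C sketch that wraps allocation and
free so a per-owner counter is updated automatically. It is not kernel
tracing; it swaps in simple accounting wrappers around malloc()/free(), and
the names mem_owner, tracked_alloc and tracked_free are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* One accounting bucket per driver (hypothetical type). */
struct mem_owner {
	const char *name;
	atomic_long bytes;	/* bytes currently allocated by this owner */
};

/* Header stored in front of each block so free can find the owner. */
struct tracked_hdr {
	_Alignas(max_align_t) struct mem_owner *owner;
	size_t size;
};

/* Allocate @size bytes and charge them to @owner. */
void *tracked_alloc(struct mem_owner *owner, size_t size)
{
	struct tracked_hdr *hdr = malloc(sizeof(*hdr) + size);

	if (!hdr)
		return NULL;
	hdr->owner = owner;
	hdr->size = size;
	atomic_fetch_add(&owner->bytes, (long)size);
	return hdr + 1;	/* caller sees the memory after the header */
}

/* Free a block from tracked_alloc() and uncharge its owner. */
void tracked_free(void *ptr)
{
	struct tracked_hdr *hdr;

	if (!ptr)
		return;
	hdr = (struct tracked_hdr *)ptr - 1;
	atomic_fetch_sub(&hdr->owner->bytes, (long)hdr->size);
	free(hdr);
}
```

The counter in struct mem_owner is the kind of value a driver could then
hand to a registration hook like the one proposed in this series.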

Anyway, the proposal is better than nothing, thumbs up!

Let me cc Kairui who is working on kdump oom issues.

Thanks
Dave