2023-07-22 21:26:44

by Ajay Kaher

[permalink] [raw]
Subject: [PATCH v5 00/10] tracing: introducing eventfs

Events Tracing infrastructure contains lot of files, directories
(internally in terms of inodes, dentries). And ends up by consuming
memory in MBs. We can have multiple events of Events Tracing, which
further requires more memory.

Instead of creating inodes/dentries, eventfs could keep meta-data and
skip the creation of inodes/dentries. As and when require, eventfs will
create the inodes/dentries only for required files/directories.
Also eventfs would delete the inodes/dentries once no more requires
but preserve the meta data.

Tracing events took ~9MB, with this approach it took ~4.5MB
for ~10K files/dir.

Diff from v4:
Patch 02: moved from v4 08/10
added fs/tracefs/internal.h

Patch 03: moved from v4 02/10
removed fs/tracefs/internal.h

Patch 04: moved from v4 03/10
moved out changes of fs/tracefs/internal.h

Patch 05: moved from v4 04/10
renamed eventfs_add_top_file() -> eventfs_add_events_file()

Patch 06: moved from v4 07/10
implemented create_dentry() helper function
added create_file(), create_dir() stub function

Patch 07: moved from v4 06/10
Patch 08: moved from v4 05/10
improved eventfs remove functionality

Patch 09: removed unwanted if conditions

Patch 10: added available_filter_functions check


Diff from v3:
Patch 3,4,5,7,9:
removed all the eventfs_rwsem code and replaced it with an srcu
lock for the readers, and a mutex to synchronize the writers of
the list.

Patch 2: moved 'tracefs_inode' and 'get_tracefs()' to v4 03/10

Patch 3: moved the struct eventfs_file and eventfs_inode into event_inode.c
as it really should not be exposed to all users.

Patch 5: added a recursion check to eventfs_remove_rec() as it is really
dangerous to have unchecked recursion in the kernel (we do have
a fixed size stack).

have the free use srcu callbacks. After the srcu grace periods
are done, it adds the eventfs_file onto a llist (lockless link
list) and wakes up a work queue. Then the work queue does the
freeing (this needs to be done in task/workqueue context, as
srcu callbacks are done in softirq context).

Patch 6: renamed:
eventfs_create_file() -> create_file()
eventfs_create_dir() -> create_dir()


Diff from v2:
Patch 01: new patch:'Require all trace events to have a TRACE_SYSTEM'
Patch 02: moved from v1 1/9
Patch 03: moved from v1 2/9
As suggested by Zheng Yejian, introduced eventfs_prepare_ef()
helper function to add files or directories to eventfs
fix WARNING reported by kernel test robot in v1 8/9
Patch 04: moved from v1 3/9
used eventfs_prepare_ef() to add files
fix WARNING reported by kernel test robot in v1 8/9
Patch 05: moved from v1 4/9
fix compiling warning reported by kernel test robot in v1 4/9
Patch 06: moved from v1 5/9
Patch 07: moved from v1 6/9
Patch 08: moved from v1 7/9
Patch 09: moved from v1 8/9
rebased because of v3 01/10
Patch 10: moved from v1 9/9


Diff from v1:
Patch 1: add header file
Patch 2: resolved kernel test robot issues
protecting eventfs lists using nested eventfs_rwsem
Patch 3: protecting eventfs lists using nested eventfs_rwsem
Patch 4: improve events cleanup code to fix crashes
Patch 5: resolved kernel test robot issues
removed d_instantiate_anon() calls
Patch 6: resolved kernel test robot issues
fix kprobe test in eventfs_root_lookup()
protecting eventfs lists using nested eventfs_rwsem
Patch 7: remove header file
Patch 8: pass eventfs_rwsem as argument to eventfs functions
called eventfs_remove_events_dir() instead of tracefs_remove()
from event_trace_del_tracer()
Patch 9: new patch to fix kprobe test case

fs/tracefs/Makefile | 1 +
fs/tracefs/event_inode.c | 795 ++++++++++++++++++
fs/tracefs/inode.c | 151 +++-
fs/tracefs/internal.h | 26 +
include/linux/trace_events.h | 1 +
include/linux/tracefs.h | 30 +
kernel/trace/trace.h | 2 +-
kernel/trace/trace_events.c | 76 +-
.../ftrace/test.d/kprobe/kprobe_args_char.tc | 9 +-
.../test.d/kprobe/kprobe_args_string.tc | 9 +-
10 files changed, 1048 insertions(+), 52 deletions(-)
create mode 100644 fs/tracefs/event_inode.c
create mode 100644 fs/tracefs/internal.h

--
2.39.0



2023-07-22 21:27:51

by Ajay Kaher

[permalink] [raw]
Subject: [PATCH v5 02/10] eventfs: Implement tracefs_inode_cache

Create a kmem cache of tracefs_inodes. To be more efficient, as there are
lots of tracefs inodes, create its own cache. This also allows to see how
many tracefs inodes have been created.

Add helper functions:
tracefs_alloc_inode()
tracefs_free_inode()
get_tracefs()

Signed-off-by: Ajay Kaher <[email protected]>
Co-developed-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Tested-by: Ching-lin Yu <[email protected]>
---
fs/tracefs/inode.c | 39 +++++++++++++++++++++++++++++++++++++++
fs/tracefs/internal.h | 19 +++++++++++++++++++
2 files changed, 58 insertions(+)
create mode 100644 fs/tracefs/internal.h

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 57ac8aa4a724..2508944cc4d8 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -21,13 +21,33 @@
#include <linux/parser.h>
#include <linux/magic.h>
#include <linux/slab.h>
+#include "internal.h"

#define TRACEFS_DEFAULT_MODE 0700
+static struct kmem_cache *tracefs_inode_cachep __ro_after_init;

static struct vfsmount *tracefs_mount;
static int tracefs_mount_count;
static bool tracefs_registered;

+static struct inode *tracefs_alloc_inode(struct super_block *sb)
+{
+ struct tracefs_inode *ti;
+
+ ti = kmem_cache_alloc(tracefs_inode_cachep, GFP_KERNEL);
+ if (!ti)
+ return NULL;
+
+ ti->flags = 0;
+
+ return &ti->vfs_inode;
+}
+
+static void tracefs_free_inode(struct inode *inode)
+{
+ kmem_cache_free(tracefs_inode_cachep, get_tracefs(inode));
+}
+
static ssize_t default_read_file(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
@@ -346,6 +366,9 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
}

static const struct super_operations tracefs_super_operations = {
+ .alloc_inode = tracefs_alloc_inode,
+ .free_inode = tracefs_free_inode,
+ .drop_inode = generic_delete_inode,
.statfs = simple_statfs,
.remount_fs = tracefs_remount,
.show_options = tracefs_show_options,
@@ -628,10 +651,26 @@ bool tracefs_initialized(void)
return tracefs_registered;
}

+static void init_once(void *foo)
+{
+ struct tracefs_inode *ti = (struct tracefs_inode *) foo;
+
+ inode_init_once(&ti->vfs_inode);
+}
+
static int __init tracefs_init(void)
{
int retval;

+ tracefs_inode_cachep = kmem_cache_create("tracefs_inode_cache",
+ sizeof(struct tracefs_inode),
+ 0, (SLAB_RECLAIM_ACCOUNT|
+ SLAB_MEM_SPREAD|
+ SLAB_ACCOUNT),
+ init_once);
+ if (!tracefs_inode_cachep)
+ return -ENOMEM;
+
retval = sysfs_create_mount_point(kernel_kobj, "tracing");
if (retval)
return -EINVAL;
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
new file mode 100644
index 000000000000..49b5e8949e1c
--- /dev/null
+++ b/fs/tracefs/internal.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TRACEFS_INTERNAL_H
+#define _TRACEFS_INTERNAL_H
+
+enum {
+ TRACEFS_EVENT_INODE = BIT(1),
+};
+
+struct tracefs_inode {
+ unsigned long flags;
+ void *private;
+ struct inode vfs_inode;
+};
+
+static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
+{
+ return container_of(inode, struct tracefs_inode, vfs_inode);
+}
+#endif /* _TRACEFS_INTERNAL_H */
--
2.39.0


2023-07-25 23:35:21

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v5 02/10] eventfs: Implement tracefs_inode_cache


Some more nits.

On Sun, 23 Jul 2023 01:06:57 +0530
Ajay Kaher <[email protected]> wrote:

> Create a kmem cache of tracefs_inodes. To be more efficient, as there are
> lots of tracefs inodes, create its own cache. This also allows to see how
> many tracefs inodes have been created.
>
> Add helper functions:
> tracefs_alloc_inode()
> tracefs_free_inode()
> get_tracefs()
>
> Signed-off-by: Ajay Kaher <[email protected]>
> Co-developed-by: Steven Rostedt (VMware) <[email protected]>
> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> Tested-by: Ching-lin Yu <[email protected]>
> ---
> fs/tracefs/inode.c | 39 +++++++++++++++++++++++++++++++++++++++
> fs/tracefs/internal.h | 19 +++++++++++++++++++
> 2 files changed, 58 insertions(+)
> create mode 100644 fs/tracefs/internal.h
>
> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
> index 57ac8aa4a724..2508944cc4d8 100644
> --- a/fs/tracefs/inode.c
> +++ b/fs/tracefs/inode.c
> @@ -21,13 +21,33 @@
> #include <linux/parser.h>
> #include <linux/magic.h>
> #include <linux/slab.h>
> +#include "internal.h"
>
> #define TRACEFS_DEFAULT_MODE 0700
> +static struct kmem_cache *tracefs_inode_cachep __ro_after_init;
>
> static struct vfsmount *tracefs_mount;
> static int tracefs_mount_count;
> static bool tracefs_registered;
>
> +static struct inode *tracefs_alloc_inode(struct super_block *sb)
> +{
> + struct tracefs_inode *ti;
> +
> + ti = kmem_cache_alloc(tracefs_inode_cachep, GFP_KERNEL);
> + if (!ti)
> + return NULL;
> +
> + ti->flags = 0;
> +
> + return &ti->vfs_inode;
> +}
> +
> +static void tracefs_free_inode(struct inode *inode)
> +{
> + kmem_cache_free(tracefs_inode_cachep, get_tracefs(inode));
> +}
> +
> static ssize_t default_read_file(struct file *file, char __user *buf,
> size_t count, loff_t *ppos)
> {
> @@ -346,6 +366,9 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
> }
>
> static const struct super_operations tracefs_super_operations = {
> + .alloc_inode = tracefs_alloc_inode,
> + .free_inode = tracefs_free_inode,
> + .drop_inode = generic_delete_inode,
> .statfs = simple_statfs,
> .remount_fs = tracefs_remount,
> .show_options = tracefs_show_options,
> @@ -628,10 +651,26 @@ bool tracefs_initialized(void)
> return tracefs_registered;
> }
>
> +static void init_once(void *foo)
> +{
> + struct tracefs_inode *ti = (struct tracefs_inode *) foo;
> +
> + inode_init_once(&ti->vfs_inode);
> +}
> +
> static int __init tracefs_init(void)
> {
> int retval;
>
> + tracefs_inode_cachep = kmem_cache_create("tracefs_inode_cache",
> + sizeof(struct tracefs_inode),
> + 0, (SLAB_RECLAIM_ACCOUNT|
> + SLAB_MEM_SPREAD|
> + SLAB_ACCOUNT),
> + init_once);
> + if (!tracefs_inode_cachep)
> + return -ENOMEM;
> +
> retval = sysfs_create_mount_point(kernel_kobj, "tracing");
> if (retval)
> return -EINVAL;
> diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
> new file mode 100644
> index 000000000000..49b5e8949e1c
> --- /dev/null
> +++ b/fs/tracefs/internal.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _TRACEFS_INTERNAL_H
> +#define _TRACEFS_INTERNAL_H
> +
> +enum {
> + TRACEFS_EVENT_INODE = BIT(1),
> +};

Let's not introduce the enum until it is used.

-- Steve

> +
> +struct tracefs_inode {
> + unsigned long flags;
> + void *private;
> + struct inode vfs_inode;
> +};
> +
> +static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
> +{
> + return container_of(inode, struct tracefs_inode, vfs_inode);
> +}
> +#endif /* _TRACEFS_INTERNAL_H */


2023-07-26 20:11:23

by Ajay Kaher

[permalink] [raw]
Subject: Re: [PATCH v5 02/10] eventfs: Implement tracefs_inode_cache


>> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
>> index 57ac8aa4a724..2508944cc4d8 100644
>> --- a/fs/tracefs/inode.c
>> +++ b/fs/tracefs/inode.c
>> @@ -21,13 +21,33 @@
>> #include <linux/parser.h>
>> #include <linux/magic.h>
>> #include <linux/slab.h>
>> +#include "internal.h"
>>
>> #define TRACEFS_DEFAULT_MODE 0700
>> +static struct kmem_cache *tracefs_inode_cachep __ro_after_init;
>>
>> static struct vfsmount *tracefs_mount;
>> static int tracefs_mount_count;
>> static bool tracefs_registered;
>>
>> +static struct inode *tracefs_alloc_inode(struct super_block *sb)
>> +{
>> + struct tracefs_inode *ti;
>> +
>> + ti = kmem_cache_alloc(tracefs_inode_cachep, GFP_KERNEL);
>> + if (!ti)
>> + return NULL;
>> +
>> + ti->flags = 0;
>> +
>> + return &ti->vfs_inode;
>> +}
>> +
>> +static void tracefs_free_inode(struct inode *inode)
>> +{
>> + kmem_cache_free(tracefs_inode_cachep, get_tracefs(inode));
>> +}
>> +
>> static ssize_t default_read_file(struct file *file, char __user *buf,
>> size_t count, loff_t *ppos)
>> {
>> @@ -346,6 +366,9 @@ static int tracefs_show_options(struct seq_file *m, struct dentry *root)
>> }
>>
>> static const struct super_operations tracefs_super_operations = {
>> + .alloc_inode = tracefs_alloc_inode,
>> + .free_inode = tracefs_free_inode,
>> + .drop_inode = generic_delete_inode,
>> .statfs = simple_statfs,
>> .remount_fs = tracefs_remount,
>> .show_options = tracefs_show_options,
>> @@ -628,10 +651,26 @@ bool tracefs_initialized(void)
>> return tracefs_registered;
>> }
>>
>> +static void init_once(void *foo)
>> +{
>> + struct tracefs_inode *ti = (struct tracefs_inode *) foo;
>> +
>> + inode_init_once(&ti->vfs_inode);
>> +}
>> +
>> static int __init tracefs_init(void)
>> {
>> int retval;
>>
>> + tracefs_inode_cachep = kmem_cache_create("tracefs_inode_cache",
>> + sizeof(struct tracefs_inode),
>> + 0, (SLAB_RECLAIM_ACCOUNT|
>> + SLAB_MEM_SPREAD|
>> + SLAB_ACCOUNT),
>> + init_once);
>> + if (!tracefs_inode_cachep)
>> + return -ENOMEM;
>> +
>> retval = sysfs_create_mount_point(kernel_kobj, "tracing");
>> if (retval)
>> return -EINVAL;
>> diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
>> new file mode 100644
>> index 000000000000..49b5e8949e1c
>> --- /dev/null
>> +++ b/fs/tracefs/internal.h
>> @@ -0,0 +1,19 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _TRACEFS_INTERNAL_H
>> +#define _TRACEFS_INTERNAL_H
>> +
>> +enum {
>> + TRACEFS_EVENT_INODE = BIT(1),
>> +};
>
> Let's not introduce the enum until it is used.


Ok, I will move this to v6 04/10.
Let me know once you will complete the review of v5.

-Ajay


2023-07-26 21:54:49

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v5 02/10] eventfs: Implement tracefs_inode_cache

On Wed, 26 Jul 2023 18:45:34 +0000
Ajay Kaher <[email protected]> wrote:

> Ok, I will move this to v6 04/10.
> Let me know once you will complete the review of v5.

OK, I finished my review. Mostly more nits. Hopefully v6 will be what I
pull into linux-next.

-- Steve