2016-11-08 21:37:33

by Andrei Vagin

Subject: [PATCH] proc: optimize render_sigset_t()

render_sigset_t() accounts for about 30% of the time spent generating
/proc/pid/status.

- 74.44% sys_read
- 74.40% vfs_read
- 74.01% __vfs_read
- 73.36% seq_read
- 72.97% proc_single_show
- 72.26% proc_pid_status
+ 29.79% render_sigset_t
+ 11.47% task_mem
+ 5.60% render_cap_t
+ 4.95% seq_printf
+ 4.28% cpuset_task_status_allowed

seq_printf() is called for each character of a signal mask. This patch
collects the whole mask in a buffer and prints it with a single call to
seq_puts().

- 65.02% proc_single_show
- 63.75% proc_pid_status
+ 15.73% task_mem
+ 7.42% render_sigset_t
+ 7.39% render_cap_t
+ 6.46% cpuset_task_status_allowed

/proc/pid/status is generated 25% faster with this optimization.

Cc: Andrew Morton <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrei Vagin <[email protected]>
---
fs/proc/array.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 81818ad..0190c3e 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -232,11 +232,13 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
 void render_sigset_t(struct seq_file *m, const char *header,
 				sigset_t *set)
 {
-	int i;
+	char buf[_NSIG / 4 + 2];
+	int i, j;

 	seq_puts(m, header);

 	i = _NSIG;
+	j = 0;
 	do {
 		int x = 0;

@@ -245,10 +247,13 @@ void render_sigset_t(struct seq_file *m, const char *header,
 		if (sigismember(set, i+2)) x |= 2;
 		if (sigismember(set, i+3)) x |= 4;
 		if (sigismember(set, i+4)) x |= 8;
-		seq_printf(m, "%x", x);
+		buf[j++] = hex_asc[x];
 	} while (i >= 4);

-	seq_putc(m, '\n');
+	buf[j++] = '\n';
+	buf[j++] = 0;
+
+	seq_puts(m, buf);
 }

 static void collect_sigign_sigcatch(struct task_struct *p, sigset_t *ign,
--
2.7.4
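
For readers who want to try the idea outside the kernel, here is a minimal
userspace sketch of the buffering approach described in the patch: build all
hex digits of a mask in a local buffer, then hand the buffer over in one go.
It is an illustration only, not the kernel code. The names render_mask() and
hex_digits[] are hypothetical stand-ins (the patch uses the kernel's
hex_asc[] table and sigismember()), and the mask is assumed to fit in 64
bits, with signal n stored in bit n-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's hex_asc[] lookup table. */
static const char hex_digits[] = "0123456789abcdef";

/*
 * Render a 64-bit mask as 16 hex digits followed by '\n' and a NUL.
 * Returns the number of bytes written, not counting the NUL.
 * buf needs room for 64 / 4 + 2 = 18 bytes, mirroring buf[_NSIG / 4 + 2].
 */
static size_t render_mask(uint64_t mask, char *buf, size_t buflen)
{
	size_t j = 0;
	int shift;

	assert(buflen >= 64 / 4 + 2);

	/* Highest nibble first, one table lookup per output character. */
	for (shift = 60; shift >= 0; shift -= 4)
		buf[j++] = hex_digits[(mask >> shift) & 0xf];

	buf[j++] = '\n';
	buf[j] = '\0';
	return j;
}
```

Because the function returns the number of bytes written, a caller can pass
the buffer to a length-based write routine rather than one that has to
rescan the string for its terminator.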


2016-11-09 10:19:07

by Alexey Dobriyan

Subject: Re: [PATCH] proc: optimize render_sigset_t()

On Wed, Nov 9, 2016 at 12:37 AM, Andrei Vagin <[email protected]> wrote:

> @@ -245,10 +247,13 @@ void render_sigset_t(struct seq_file *m, const char *header,
>  		if (sigismember(set, i+2)) x |= 2;
>  		if (sigismember(set, i+3)) x |= 4;
>  		if (sigismember(set, i+4)) x |= 8;
> -		seq_printf(m, "%x", x);
> +		buf[j++] = hex_asc[x];
>  	} while (i >= 4);
>
> -	seq_putc(m, '\n');
> +	buf[j++] = '\n';
> +	buf[j++] = 0;
> +
> +	seq_puts(m, buf);

seq_write() should be used to avoid re-reading the buffer in strlen().
Anyway, I suspect a SIMD-style bulk conversion will still be faster.


Alexey
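
One possible reading of the two suggestions in the reply above, sketched in
userspace C. It is an assumption that "SIMD-style" can be approximated with
SWAR (processing eight nibbles at once inside one 64-bit word); the reply
does not say which technique was meant, and no performance claim is made
here. The names nibbles_to_ascii(), hex64_swar() and emit_mask() are
hypothetical, and fwrite() with an explicit length stands in for the
kernel's seq_write().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Turn the 8 nibbles of v into 8 ASCII hex digits packed in a 64-bit word. */
static uint64_t nibbles_to_ascii(uint32_t v)
{
	uint64_t x = v;
	uint64_t ge10;

	/* Spread one nibble per byte: 0x89abcdef -> 0x08090a0b0c0d0e0f. */
	x = ((x & 0x00000000FFFF0000ULL) << 16) | (x & 0x000000000000FFFFULL);
	x = ((x & 0x0000FF000000FF00ULL) << 8)  | (x & 0x000000FF000000FFULL);
	x = ((x & 0x00F000F000F000F0ULL) << 4)  | (x & 0x000F000F000F000FULL);

	/* Per byte: 1 if the nibble is >= 10 (nibble + 6 sets bit 4). */
	ge10 = ((x + 0x0606060606060606ULL) >> 4) & 0x0101010101010101ULL;

	/* Add '0' to every byte, plus 'a' - '0' - 10 == 39 for a..f. */
	return x + 0x3030303030303030ULL + ge10 * 39;
}

/* Write 16 hex digits of mask to out, most significant digit first. */
static void hex64_swar(uint64_t mask, char *out)
{
	uint64_t hi = nibbles_to_ascii((uint32_t)(mask >> 32));
	uint64_t lo = nibbles_to_ascii((uint32_t)mask);
	int i;

	/* Extract bytes by shifting, so host endianness does not matter. */
	for (i = 0; i < 8; i++) {
		out[i]     = (char)((hi >> (56 - 8 * i)) & 0xff);
		out[8 + i] = (char)((lo >> (56 - 8 * i)) & 0xff);
	}
}

/*
 * Emit the digits and a newline with an explicit length. No NUL
 * terminator is needed and nothing rescans the buffer with strlen().
 */
static size_t emit_mask(FILE *fp, uint64_t mask)
{
	char buf[17];

	assert(fp != NULL);
	hex64_swar(mask, buf);
	buf[16] = '\n';
	return fwrite(buf, 1, sizeof(buf), fp);
}
```

With these definitions, emit_mask(stdout, 0x5) writes 17 bytes: sixteen hex
digits and a newline. Whether this beats a per-nibble table lookup would
have to be measured.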