Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752890AbbDAHGb (ORCPT ); Wed, 1 Apr 2015 03:06:31 -0400 Received: from szxga03-in.huawei.com ([119.145.14.66]:49767 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752849AbbDAHG3 (ORCPT ); Wed, 1 Apr 2015 03:06:29 -0400 Message-ID: <551B98DD.9010608@huawei.com> Date: Wed, 1 Apr 2015 15:06:05 +0800 From: Yunlong Song User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Arnaldo Carvalho de Melo , David Ahern CC: , , , , Subject: Re: [PATCH 3/9] perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max References: <1427809596-29559-1-git-send-email-yunlong.song@huawei.com> <1427809596-29559-4-git-send-email-yunlong.song@huawei.com> <551AB005.8080102@gmail.com> <20150331155655.GL9438@kernel.org> In-Reply-To: <20150331155655.GL9438@kernel.org> Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-Originating-IP: [10.111.74.205] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020202.551B98EC.000D,ss=1,re=0.001,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0, ip=0.0.0.0, so=2013-05-26 15:14:31, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 7045c9d44e562f579c7c319cecc5f659 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6437 Lines: 182 On 2015/3/31 23:56, Arnaldo Carvalho de Melo wrote: > Em Tue, Mar 31, 2015 at 08:32:37AM -0600, David Ahern escreveu: >> On 3/31/15 7:46 AM, Yunlong Song wrote: >>> - BUG_ON(pid >= MAX_PID); >>> + if (sched->pid_to_task == NULL) { >>> + if (sysctl__read_int("kernel/pid_max", &pid_max) < 0) >>> + pid_max = MAX_PID; >>> + BUG_ON((sched->pid_to_task = calloc(pid_max, sizeof(struct task_desc *))) == NULL); >>> + } >>> + BUG_ON(pid >= (unsigned long)pid_max); > >> so why the previous patch bumping the MAX_PID count if you move to dynamic >> here? And shouldn't MAX_PID get dropped here as well? > >> So attached is what i put together last week; just have not had time to send >> it out. > > Yunlong, can you please check/Ack this? > > - Arnaldo > >> >From 159dc732e0ad66d9151e93761bc9c685872e9fa4 Mon Sep 17 00:00:00 2001 >> From: David Ahern >> Date: Tue, 24 Mar 2015 16:57:10 -0400 >> Subject: [PATCH 3/5] perf sched: Remove max pid assumption from perf-sched >> >> 'perf sched replay' currently fails on sparc64: >> $ perf sched replay >> run measurement overhead: 2475 nsecs >> sleep measurement overhead: 56165 nsecs >> the run test took 999705 nsecs >> the sleep test took 1059270 nsecs >> perf: builtin-sched.c:384: register_pid: Assertion `!(pid >= 65536)' failed. >> Aborted >> >> The max pid limitation is removed by converting pid_to_task from a >> pid based array to an intlist (rblist) with the pid as the index >> and task_desc stored in the priv element. >> >> In the process pid is converted from a long int to int. >> >> Signed-off-by: David Ahern >> --- >> tools/perf/builtin-sched.c | 30 ++++++++++++++++++++---------- >> 1 file changed, 20 insertions(+), 10 deletions(-) >> >> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c >> index cc52c993a1fa..858d85396d81 100644 >> --- a/tools/perf/builtin-sched.c >> +++ b/tools/perf/builtin-sched.c >> @@ -33,13 +33,12 @@ >> #define MAX_CPUS 4096 >> #define COMM_LEN 20 >> #define SYM_LEN 129 >> -#define MAX_PID 65536 >> >> struct sched_atom; >> >> struct task_desc { >> unsigned long nr; >> - unsigned long pid; >> + int pid; >> char comm[COMM_LEN]; >> >> unsigned long nr_events; >> @@ -129,7 +128,7 @@ struct perf_sched { >> struct perf_tool tool; >> const char *sort_order; >> unsigned long nr_tasks; >> - struct task_desc *pid_to_task[MAX_PID]; >> + struct intlist *pid_to_task; >> struct task_desc **tasks; >> const struct trace_sched_handler *tp_handler; >> pthread_mutex_t start_work_mutex; >> @@ -377,14 +376,18 @@ static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *ta >> } >> >> static struct task_desc *register_pid(struct perf_sched *sched, >> - unsigned long pid, const char *comm) >> + int pid, const char *comm) >> { >> - struct task_desc *task; >> >> - BUG_ON(pid >= MAX_PID); >> + struct int_node *node = intlist__findnew(sched->pid_to_task, pid); >> + struct task_desc *task; >> >> - task = sched->pid_to_task[pid]; >> + if (node == NULL) { >> + pr_err("Failed to allocate entry for task\n"); >> + return NULL; >> + } >> >> + task = (struct task_desc *) node->priv; >> if (task) >> return task; >> >> @@ -392,20 +395,21 @@ static struct task_desc *register_pid(struct perf_sched *sched, >> task->pid = pid; >> task->nr = sched->nr_tasks; >> strcpy(task->comm, comm); >> + >> /* >> * every task starts in sleeping state - this gets ignored >> * if there's no wakeup pointing to this sleep state: >> */ >> add_sched_event_sleep(sched, task, 0, 0); >> >> - sched->pid_to_task[pid] = task; >> + node->priv = task; >> sched->nr_tasks++; >> sched->tasks = realloc(sched->tasks, sched->nr_tasks * sizeof(struct task_task *)); >> BUG_ON(!sched->tasks); >> sched->tasks[task->nr] = task; >> >> if (verbose) >> - printf("registered task #%ld, PID %ld (%s)\n", sched->nr_tasks, pid, comm); >> + printf("registered task #%ld, PID %d (%s)\n", sched->nr_tasks, pid, comm); >> >> return task; >> } >> @@ -418,7 +422,7 @@ static void print_task_traces(struct perf_sched *sched) >> >> for (i = 0; i < sched->nr_tasks; i++) { >> task = sched->tasks[i]; >> - printf("task %6ld (%20s:%10ld), nr_events: %ld\n", >> + printf("task %6ld (%20s:%10d), nr_events: %ld\n", >> task->nr, task->comm, task->pid, task->nr_events); >> } >> } >> @@ -2981,6 +2985,12 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused) >> }; >> unsigned int i; >> >> + sched.pid_to_task = intlist__new(NULL); >> + if (sched.pid_to_task == NULL) { >> + pr_err("Failed to allocate intlist for tracking tasks\n"); >> + return -ENOMEM; >> + } >> + >> for (i = 0; i < ARRAY_SIZE(sched.curr_pid); i++) >> sched.curr_pid[i] = -1; >> >> -- >> 2.3.0 >> > > > . > I have checked David's patch, the difference with my patch is that David's patch uses a rblist to sort and search the pid's task with a time complexity of O(log n), while my patch uses the array (same as the original design) dynamically created with a time complexity of O(1). For every simple, my patch does not need to traverse a list and has nothing to do with the total number of tasks. The maximum value of pid_max is PID_MAX_LIMIT, which equals to 4*1024*1024 in x86_64. Then for each simple, David's patch has to cost O(log 4*1024*1024) in searching a pid's task, vs O(1) of my patch. So I suggest to use array instead of rblist to solve this problem and save unnecessary waste of time. However, if you finally decide to take David's patch rather than my patch, please ignore the following 3 patches: [PATCH 2/9] perf sched replay: Increase the MAX_PID value to fix assertion failure problem [PATCH 3/9] perf sched replay: Alloc the memory of pid_to_task dynamically to adapt to the unexpected change of pid_max [PATCH 4/9] perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations The other 6 patches in these patch sets still need to be applied to make other improvements and bug fixes. -- Thanks, Yunlong Song -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/