Message-ID: <56553022.8000101@huawei.com>
Date: Wed, 25 Nov 2015 11:50:58 +0800
From: "Wangnan (F)" <wangnan0@huawei.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: Arnaldo Carvalho de Melo <acme@kernel.org>,
        David Ahern <dsahern@gmail.com>
CC: Yunlong Song <yunlong.song@huawei.com>, <a.p.zijlstra@chello.nl>,
        <paulus@samba.org>, <mingo@redhat.com>, <linux-kernel@vger.kernel.org>,
        <namhyung@kernel.org>, <ast@kernel.org>,
        <masami.hiramatsu.pt@hitachi.com>, <kan.liang@intel.com>,
        <adrian.hunter@intel.com>, <jolsa@kernel.org>, <bp@alien8.de>,
        <jean.pihet@linaro.org>, <rric@kernel.org>, <xiakaixu@huawei.com>,
        <hekuang@huawei.com>
Subject: Re: [PATCH] perf record: Add snapshot mode support for perf's regular
 events
References: <1448373632-8806-1-git-send-email-yunlong.song@huawei.com> <1448373632-8806-2-git-send-email-yunlong.song@huawei.com> <56547D01.8020606@gmail.com> <20151124152023.GE18140@kernel.org>
In-Reply-To: <20151124152023.GE18140@kernel.org>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3843
Lines: 99


On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:
>> On 11/24/15 7:00 AM, Yunlong Song wrote:
>>> +static int record__write(struct record *rec, void *bf, size_t size)
>>> +{
>>> +	if (rec->memory.size && memory_enabled) {
>>> +		if (perf_memory__write(&rec->memory, bf, size) < 0) {
>>> +			pr_err("failed to write memory data, error: %m\n");
>>> +			return -1;
>>> +		}
>>> +	} else {
>>> +		if (perf_data_file__write(rec->session->file, bf, size) < 0) {
>>> +			pr_err("failed to write perf data, error: %m\n");
>>> +			return -1;
>>> +		}
>>> +		rec->bytes_written += size;
>>>   	}
>>>
>>> -	rec->bytes_written += size;
>>>   	return 0;
>>>   }
>>>
>>> @@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int idx)
>>>   	if (old == head)
>>>   		return 0;
>>>
>>> +	memory_enabled = 1;
>>> +
>>>   	rec->samples++;
>>>
>>>   	size = head - old;
>>> @@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int idx)
>>>   	md->prev = old;
>>>   	perf_evlist__mmap_consume(rec->evlist, idx);
>>>   out:
>>> +	memory_enabled = 0;
>>>   	return rc;
>>>   }
>>>
>> So you are basically ignoring all samples until SIGUSR2 is received. That
> No, he is not, its just that his code is difficult to follow, has to be
> rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
> will..
>
>> means the resulting data file will have limited history of task events for
> ... have a complete history of task events, since PERF_RECORD_FORK, etc
> are not being ignored.
>
> No?

Actually, we are still discussing this problem.

For such tracking events (PERF_RECORD_FORK...), we have the dummy event,
so it is possible for us to receive tracking events from a separate
channel. Therefore we don't have to parse every event to pick them
out. Instead, we can process tracking events differently, and then
more interesting things can be done. For example, squashing those
tracking events if they take too much memory...
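
A rough sketch of what "a separate channel" could look like at the
perf_event_attr level, assuming a dummy software event is opened only
for the side-band records. The helper name tracking_attr_init() is
made up for illustration and is not part of the patch:

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper (not from the patch): describe a dummy software
 * event that produces no samples and only carries tracking records.
 */
static void tracking_attr_init(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size   = sizeof(*attr);
	attr->type   = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;

	/* Side-band records requested on this event only. */
	attr->task = 1;		/* PERF_RECORD_FORK, PERF_RECORD_EXIT */
	attr->comm = 1;		/* PERF_RECORD_COMM */
	attr->mmap = 1;		/* PERF_RECORD_MMAP */
}
```

An event opened with such an attr gets its own mmap area, so its
records can be handled without touching the sample stream.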

Furthermore, there's another problem being discussed: if the userspace
ringbuffer is byte-based, parsing events is unavoidable. Without parsing
events we are unable to find the new 'head' pointer when overwriting.
Instead, we are thinking about a bucket-based ringbuffer: let perf
maintain a series of buckets, and each time 'poll' returns, perf copies
the new events to the start of a bucket. If all buckets are occupied, we
drop the oldest bucket. A bucket-based ringbuffer wastes some memory but
can avoid event parsing.
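
A minimal sketch of the bucket idea, only to make the description
concrete. All names and sizes here (NR_BUCKETS, BUCKET_SIZE,
bucket_ring_push() ...) are assumptions for illustration, and a batch
larger than one bucket is simply rejected:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_BUCKETS	4	/* assumed; would be tunable in practice */
#define BUCKET_SIZE	4096	/* assumed limit for one poll()'s worth of data */

struct bucket {
	size_t len;			/* bytes used; the tail is wasted */
	unsigned char data[BUCKET_SIZE];
};

struct bucket_ring {
	struct bucket buckets[NR_BUCKETS];
	unsigned int oldest;		/* index of the oldest occupied bucket */
	unsigned int used;		/* number of occupied buckets */
};

static void bucket_ring_init(struct bucket_ring *ring)
{
	memset(ring, 0, sizeof(*ring));
}

/*
 * Copy one batch of events to the start of a fresh bucket. When every
 * bucket is occupied the oldest one is dropped as a whole, so no event
 * has to be parsed to find a new head.
 */
static int bucket_ring_push(struct bucket_ring *ring, const void *buf,
			    size_t len)
{
	unsigned int slot;

	if (len > BUCKET_SIZE)
		return -1;	/* this sketch does not split batches */

	if (ring->used == NR_BUCKETS) {
		ring->oldest = (ring->oldest + 1) % NR_BUCKETS;
		ring->used--;
	}
	assert(ring->used < NR_BUCKETS);

	slot = (ring->oldest + ring->used) % NR_BUCKETS;
	memcpy(ring->buckets[slot].data, buf, len);
	ring->buckets[slot].len = len;
	ring->used++;
	return 0;
}

/* Return the i-th bucket counting from the oldest, or NULL. */
static const struct bucket *bucket_ring_get(const struct bucket_ring *ring,
					    unsigned int i)
{
	if (i >= ring->used)
		return NULL;
	return &ring->buckets[(ring->oldest + i) % NR_BUCKETS];
}
```

The cost is visible in struct bucket: whatever is left after 'len'
bytes in each bucket stays unused until the bucket is recycled.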

And there are many other problems in this patch. For example, when SIGUSR2
is received, we need to do something to make all perf events start dumping.
The current implementation can't ensure that we receive the events from
just before the SIGUSR2 if we do not set 'no-buffer'.
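
For reference, the signal side alone could be as small as the sketch
below (the names are made up, this is not the code in the patch). It
only records that a snapshot was requested. It does nothing about
events still sitting in the kernel buffers, which is the open problem
described above:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t snapshot_requested;

static void sigusr2_handler(int sig)
{
	(void)sig;
	snapshot_requested = 1;
}

/* Install the SIGUSR2 handler; returns 0 on success, -1 on error. */
static int snapshot_signal_setup(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigusr2_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR2, &sa, NULL);
}

/* Main loop helper: returns 1 once per request and clears the flag. */
static int snapshot_take_request(void)
{
	if (!snapshot_requested)
		return 0;
	snapshot_requested = 0;
	return 1;
}
```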

Also, all output events go into one perf.data, which is not user friendly.
Our final goal is to make perf a daemonized monitor, which can run 24x7
in the user's environment. Each time a glitch is detected, a framework
sends a signal to perf to get a perf.data from it. The framework manages
those perf.data files like logrotate and helps developers analyze the
glitches.

We are looking for a route to implementing the final monitor. This patch
is an attempt to let you know what we want and to get your thoughts about
it. It looks like you agree with our basic idea. That's good. We have
decided to start from some small features that support the final goal.
For example, snapshot mode for specific events:

  # perf record -a -e cycles/snapshot/

And when C-c is pressed, for the cycles event, only the data still in
the kernel would be dumped.

Thank you.
