Message-ID: <525756D7.8090703@hitachi.com>
Date: Fri, 11 Oct 2013 10:39:35 +0900
From: Yoshihiro YUNOMAE
To: Steven Rostedt
Cc: Hidehiro Kawai, Masami Hiramatsu, linux-kernel@vger.kernel.org, yrl.pp-manager.tt@hitachi.com, aaronx.j.fabbri@intel.com
Subject: Re: [PATCH V2 0/5] trace-cmd: Support the feature recording trace data of guests on the host
References: <20130913020627.28927.69090.stgit@yunodevel>
In-Reply-To: <20130913020627.28927.69090.stgit@yunodevel>

Hi Steven,

Would you review this patch set?

Thanks,
Yoshihiro YUNOMAE

(2013/09/13 11:06), Yoshihiro YUNOMAE wrote:
> Hi Steven,
>
> This is the v2 patch set implementing part of the "Integrated trace" feature,
> a trace merging system for virtualization environments. Currently, trace-cmd
> does not yet have the following features:
>
> a) Server and client for a virtualization environment
> b) Structured message platform between guests and host
> c) Agent feature of a client
> d) Merge feature of trace data of multiple guests and host in chronological
>    order
>
> This patch set supports features a) and b) above.
>
>          +------------+        +------------+
>   Guest  |   a), c)   |        |   a), c)   |  client/agent
>     ^    +------------+        +------------+
>     |         ^   ^                 ^   ^
>   ============|===|=================|===|===========
>     |         v b)v                 v b)v
>     v    +----------------------------------+
>   Host   |                a)                |  server
>          +----------------------------------+
>              ||output     ||           ||
>              \/           \/           \/
>          /--------+   /--------+   /--------+
>          | 010101 |   | 101010 |   | 100101 |  binary data
>          | 010100 |   | 010100 |   | 110011 |
>          +--------+   +--------+   +--------+
>         \                                     /
>          \-----------------------------------/
>                           ||  d)
>                           \/
>          /-----------------------------------+
>          | (guest1) 123456: sched_switch...  |  text data
>          | (guest2) 123458: kmem_free...     |
>          | (host)   123500: kvm_exit (guest1)|
>          | (host)   123510: kvm_entry(guest1)|
>          | (guest1) 123550: sched_switch...  |
>          +-----------------------------------+
>
> a) Server and client for a virtualization environment
> trace-cmd has a listen mode for the network, but using the network is costly
> because it induces a lot of memory copying. Since kernel 3.6, the
> virtio-console driver supports splice_write and ftrace supports "steal" for
> fops. So, guest clients of trace-cmd can send trace data without copying memory
> by using splice(2). If guest clients use virtio-serial, the server also needs to
> support the virtio-serial I/F.
>
> b) Structured message platform between guests and a host
> Currently, the server (or a client) sends unstructured character strings to
> the clients (or the server), so the receiver must parse unstructured messages.
> Since it is hard to add complex contents to that protocol, the structured binary
> message format trace-msg is introduced as the communication protocol.
>
> c) Agent feature of a client
> The current trace-cmd client can operate only in "record" mode, so the client
> sends trace data to the server immediately. However, when a user tries to
> collect trace data of multiple guests on a host, the user must log in to
> each guest. This is hard to use, I think.
> So, the trace-cmd client should support an agent mode which receives
> messages from the server.
>
> d) Merge feature of trace data of multiple guests and a host in chronological
>    order
> The current trace-cmd has a merge feature for multiple machines whose clocks
> are synchronized by NTP. To use the feature, we execute "trace-cmd record"
> with the --date option on each machine, and then we run "trace-cmd report"
> with the -i option for each file.
> However, there are cases where the clocks of those machines cannot be
> synchronized. For example, although multiple users can run guests in
> virtualization environments (e.g. multi-tenant cloud hosting), there is no
> guarantee that they use the same NTP server. Moreover, even if the clocks are
> synchronized, trace data cannot be merged exactly because the granularity of
> NTP-synchronized time may not be fine enough for sorting guest-host switching
> events.
> So, I'm considering using x86-tsc as the timestamp in order to merge trace
> data. By using x86-tsc, we can merge trace data even if the clocks of those
> machines are not synchronized, as long as the CPU has the invariant TSC
> feature or the constant TSC feature. The precision will also be sufficient for
> understanding the operations of guests and host. However, TSC values on a
> guest are not equal to the values on the host because
>     TSC_guest = TSC_host + TSC_offset.
> This series does not actually support the TSC offset, but I'd like to add such
> a feature to fix the host/guest clock difference in another series. TSC offset
> values can be obtained from the write_tsc_offset trace event since kernel 3.11.
> (see https://lkml.org/lkml/2013/6/12/72)
>
> For a), this patch introduces "virt-server" and "record --virt" modes for
> achieving low-overhead communication of trace data of guests. "virt-server" is a
> server mode for collecting trace data of guests. On the other hand,
> "record --virt" mode is a guest client for sending trace data of the guest.
> Although these functions are similar to the "listen" and "record -N" modes
> respectively, they do not use the network but use virtio-serial for
> low-overhead communication.
>
> For b), this patch series introduces a specific message protocol in order to
> handle communication messages with 8 commands. When we extend any messages,
> using structured messages will be easier than using unstructured messages.
>
>
> 1. Run virt-server on a host
>    # trace-cmd virt-server
>
> 2. Make guest domain directory
>    # mkdir -p /tmp/trace-cmd/virt/
>    # chmod 710 /tmp/trace-cmd/virt/
>    # chgrp qemu /tmp/trace-cmd/virt/
>
> 3. Make FIFO on the host
>    # mkfifo /tmp/trace-cmd/virt//trace-path-cpu{0,1,...,X}.{in,out}
>
> 4. Set up the virtio-serial pipe of a guest on the host
>    Add the following tags to domain XML files.
>    # virsh edit
>
>    ... (cpu1, cpu2, ...)
>
> 5. Boot the guest
>    # virsh start
>
> 6. Execute "record --virt" on the guest
>    # trace-cmd record --virt -e sched*
>
>
> I measured the CPU usage reported by the top command on a guest while the
> client sends trace data. Client means "record -N" (NW) or "record --virt"
> (virtio-serial).
>
>                     NW        virtio-serial(splice)
> client(fedora19)    ~2.9[%]   ~1.7[%]
>
>
> - Add an agent mode based on "record --virt"
> - Add a merging feature of trace data of guests and host to "report"
>
> Changes in V2:
> [1/5] Add a comment in open_udp()
> [2/5] Legacy protocol support in order to keep backward compatibility
>
> Thank you,
>
> ---
>
> Yoshihiro YUNOMAE (5):
>       [CLEANUP] trace-cmd: Split out binding a port and fork reader from open_udp()
>       trace-cmd: Apply the trace-msg protocol for communication between a server and clients
>       trace-cmd: Use poll(2) to wait for a message
>       trace-cmd: Add virt-server mode for a virtualization environment
>       trace-cmd: Add --virt option for record mode
>
>
>  Documentation/trace-cmd-record.1.txt      |   11
>  Documentation/trace-cmd-virt-server.1.txt |   89 +++
>  Makefile                                  |    2
>  trace-cmd.c                               |    3
>  trace-cmd.h                               |   14
>  trace-listen.c                            |  601 ++++++++++++++++----
>  trace-msg.c                               |  874 +++++++++++++++++++++++++++++
>  trace-msg.h                               |   31 +
>  trace-output.c                            |    4
>  trace-record.c                            |  146 ++++-
>  trace-recorder.c                          |   54 +-
>  trace-usage.c                             |   10
>  12 files changed, 1678 insertions(+), 161 deletions(-)
>  create mode 100644 Documentation/trace-cmd-virt-server.1.txt
>  create mode 100644 trace-msg.c
>  create mode 100644 trace-msg.h
>

-- 
Yoshihiro YUNOMAE
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: yoshihiro.yunomae.ez@hitachi.com
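
A minimal sketch of the TSC-based merge described in d), for illustration only. The series itself does not implement the merge or TSC offset handling, so everything here is an assumption: the function names (guest_to_host_tsc, merge_traces) are hypothetical, each trace is assumed to be an already sorted list of (tsc, event) pairs, and each guest's TSC offset is assumed to be known (for example, read from the write_tsc_offset event). It only applies TSC_guest = TSC_host + TSC_offset in reverse and then merges the streams in chronological order; 64-bit wraparound is ignored.

```python
import heapq


def guest_to_host_tsc(tsc_guest, tsc_offset):
    # TSC_guest = TSC_host + TSC_offset, so TSC_host = TSC_guest - TSC_offset.
    return tsc_guest - tsc_offset


def _normalized(source, tsc_offset, events):
    # Yield (host_tsc, source, event) for one trace, preserving its order.
    for tsc, event in events:
        yield (guest_to_host_tsc(tsc, tsc_offset), source, event)


def merge_traces(host_events, guest_traces):
    # host_events: sorted list of (tsc, event) recorded on the host.
    # guest_traces: {guest_name: (tsc_offset, sorted list of (tsc, event))}.
    streams = [_normalized("host", 0, host_events)]
    for name, (tsc_offset, events) in guest_traces.items():
        streams.append(_normalized(name, tsc_offset, events))
    # heapq.merge lazily merges already sorted inputs by the given key.
    return list(heapq.merge(*streams, key=lambda entry: entry[0]))


# Made-up values chosen to reproduce the ordering shown in the diagram.
host = [(123500, "kvm_exit (guest1)"), (123510, "kvm_entry(guest1)")]
guests = {
    "guest1": (-1000, [(122456, "sched_switch"), (122550, "sched_switch")]),
    "guest2": (2000, [(125458, "kmem_free")]),
}

for tsc, source, event in merge_traces(host, guests):
    print(f"({source}) {tsc}: {event}")
```

Converting every guest timestamp to the host's time base first keeps the merge itself a plain sorted merge, which matches the idea of fixing the host/guest clock difference before sorting guest-host switching events.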