Date: Wed, 9 Jan 2019 13:59:38 +0100
From: Peter Zijlstra
To: Song Liu
Cc: lkml, netdev@vger.kernel.org, acme@kernel.org, ast@kernel.org,
    daniel@iogearbox.net, Kernel Team, Andi Kleen
Subject: Re: [PATCH v5 perf, bpf-next 3/7] perf, bpf: introduce PERF_RECORD_BPF_EVENT
Message-ID: <20190109125938.GJ1900@hirez.programming.kicks-ass.net>
In-Reply-To: <924AE46C-B2B9-4E17-A6FC-C678FEADC03B@fb.com>

On Wed, Jan 09, 2019 at 11:32:50AM +0000, Song Liu wrote:

> I was thinking about modifying the text in-place scenario. In this case,
> we can use something like
>
> struct perf_record_text_modify {
>     u64 addr;
>     u_big_enough old_instr;
>     u_big_enough new_instr;

char[15] for x86 ;-)

Also, I don't think we need old, we should already have the old text,
either from a previous event or from the initial kcore snapshot.

>     timestamp ;

that lives in struct sample_id.

> };
>
> It is a fixed size record, and we don't need process it immediately
> in user space. At the end of perf run, a series of these events will
> help us reconstruct exact text at any time.

That works for text_poke users, see also:

  https://lkml.kernel.org/r/20190109103544.GH1900@hirez.programming.kicks-ass.net

But is useless for module / bpf / ftrace dynamic text.
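
For the text_poke case discussed above, the replay idea can be sketched
in userspace C. This is only an illustration under stated assumptions:
struct text_modify_rec, struct text_image and replay_until() are
hypothetical names, not an existing perf ABI. The record carries only
the new bytes (up to 15, as for x86), and its time field stands in for
the timestamp that would really come from struct sample_id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_INSN 15 /* longest x86 instruction, per the char[15] remark */

/* Hypothetical fixed-size record: new bytes only, no old_instr. */
struct text_modify_rec {
	uint64_t time;                /* stand-in for the sample_id timestamp */
	uint64_t addr;                /* address of the patched instruction */
	uint8_t  len;                 /* number of valid bytes in new_instr */
	uint8_t  new_instr[MAX_INSN];
};

/* Copy of a text region, standing in for the initial kcore snapshot. */
struct text_image {
	uint64_t base;                /* address of bytes[0] */
	size_t   size;                /* number of bytes in the copy */
	uint8_t *bytes;
};

/*
 * Apply every record with time <= when to the image, in order.
 * Records are assumed to be sorted by time. Records that do not fit
 * inside the image are skipped. Returns the number of records applied.
 */
static size_t replay_until(struct text_image *img,
			   const struct text_modify_rec *recs, size_t nr,
			   uint64_t when)
{
	size_t applied = 0;

	assert(img && img->bytes && (recs || nr == 0));

	for (size_t i = 0; i < nr && recs[i].time <= when; i++) {
		const struct text_modify_rec *r = &recs[i];

		if (r->len > MAX_INSN || r->len > img->size)
			continue;
		if (r->addr < img->base ||
		    r->addr - img->base > img->size - r->len)
			continue;

		memcpy(img->bytes + (r->addr - img->base), r->new_instr, r->len);
		applied++;
	}
	return applied;
}
```

Starting from a fresh copy of the snapshot and calling replay_until()
with the timestamp of interest yields the text as it was at that time,
which is why the old bytes do not need to be in the record.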
> > All we need is some means of ensuring the symbol is still there by the
> > time we see the event and do the copy.
> >
> > I think we can do this with a new ioctl() on /proc/kcore itself:
> >
> >  - when we have kcore open, we queue all text-free operations on list-1.
> >
> >  - when we close kcore, we drain all (text-free) list-* and perform the
> >    pending frees immediately.
> >
> >  - on ioctl(KCORE_QC) we perform the pending free of list-3 and advance
> >    list-2 to list-3 and list-1 to list-2.
> >
> > Perf would then open kcore at the start of the record, make a complete
> > copy and keep the FD open. At the end of every buffer process, we issue
> > KCORE_QC IFF we observed a ksym unreg in that buffer.
>
> Does this mean we need to scan every buffer before writing it to perf.data
> during perf-record?

Just like the BPF events, yes. Now for PT most of the actual data is not
in the regular buffer, so it shouldn't be too horrible, but just like
the BPF event, it can get its own buffer if it does become a problem.

> Also, if we need ksym unreg here, I guess it is NOT really modifying text
> in-place, but creating new version and swap? Then can we include something
> like this in perf.data:
>
> struct perf_record_text_modify {
>     u64 old_addr;
>     u64 new_addr;
>     u32 old_len; /* up to MAX_SIZE */
>     u32 new_len; /* up to MAX_SIZE */
>     u8 old_text[MAX_SIZE];
>     u8 new_text[MAX_SIZE];
>     timestamp ;
> };
>
> In this way, this record is embedded in perf.data, and doesn't require
> extra processing during perf-record (only at the end of perf-record).
> This would work for text modifying case, as modifying text is simply
> old-text to new-text.
>
> Similar solution would not work for BPF case, as bpf_prog_info is
> getting a lot more members in the near future.
>
> Does this make sense...?

I don't think we actually need old_text here either. We're creating a
new text mapping, there was nothing there before.

But still, perf events are limited to 64k, so that means we cannot
support symbols larger than that (although I suppose that would be
fairly rare).

Something like that could work, but I'm not sure it is actually better.
Some PT person would have to play with things I suppose.
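
The three-list scheme quoted earlier for /proc/kcore can be modelled in
plain userspace C to see how long a freed text region stays readable.
This is a rough sketch, not kernel code: struct kcore_defer and the
functions kcore_open(), kcore_close(), kcore_qc() and text_free() are
hypothetical names, free() stands in for the real text-free operation,
and locking is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One deferred text-free operation (hypothetical). */
struct pending_free {
	struct pending_free *next;
	void *text;
};

/* list[0] is list-1 (newest), list[1] is list-2, list[2] is list-3. */
struct kcore_defer {
	int open;
	struct pending_free *list[3];
};

/* Perform every pending free on one list; returns how many were freed. */
static size_t drain(struct pending_free **head)
{
	size_t n = 0;

	while (*head) {
		struct pending_free *p = *head;

		*head = p->next;
		free(p->text);
		free(p);
		n++;
	}
	return n;
}

static void kcore_open(struct kcore_defer *kd)
{
	kd->open = 1;
}

/* Closing kcore drains all three lists immediately. */
static size_t kcore_close(struct kcore_defer *kd)
{
	size_t n = drain(&kd->list[0]) + drain(&kd->list[1]) + drain(&kd->list[2]);

	kd->open = 0;
	return n;
}

/*
 * Free text now if kcore is not open, otherwise queue it on list-1.
 * Returns 1 if deferred, 0 if freed immediately, -1 if queueing failed
 * (in which case the text is left untouched for the caller).
 */
static int text_free(struct kcore_defer *kd, void *text)
{
	struct pending_free *p;

	if (!kd->open) {
		free(text);
		return 0;
	}
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->text = text;
	p->next = kd->list[0];
	kd->list[0] = p;
	return 1;
}

/* Model of ioctl(KCORE_QC): free list-3, then advance list-2 and list-1. */
static size_t kcore_qc(struct kcore_defer *kd)
{
	size_t n = drain(&kd->list[2]);

	kd->list[2] = kd->list[1];
	kd->list[1] = kd->list[0];
	kd->list[0] = NULL;
	return n;
}
```

In this model a region freed while kcore is open survives two
kcore_qc() calls and is released by the third, which is the window the
reader has to copy the text after seeing the ksym unreg event.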