Date: Wed, 21 Aug 2019 20:31:55 +0200
From: Peter Zijlstra
To: Yonghong Song
Cc: Daniel Xu, "bpf@vger.kernel.org", Song Liu, Andrii Nakryiko, "mingo@redhat.com", "acme@kernel.org", Alexei Starovoitov, "alexander.shishkin@linux.intel.com", "jolsa@redhat.com", "namhyung@kernel.org", "linux-kernel@vger.kernel.org", Kernel Team, Arnaldo Carvalho de Melo
Subject: Re: [PATCH v3 bpf-next 1/4] tracing/probe: Add PERF_EVENT_IOC_QUERY_PROBE ioctl
Message-ID: <20190821183155.GE2349@hirez.programming.kicks-ass.net>
References: <20190820144503.GV2332@hirez.programming.kicks-ass.net> <20190821110856.GB2349@hirez.programming.kicks-ass.net> <62874df3-cae0-36a1-357f-b59484459e52@fb.com>
In-Reply-To: <62874df3-cae0-36a1-357f-b59484459e52@fb.com>

On Wed, Aug 21, 2019 at 04:54:47PM +0000, Yonghong Song wrote:

> Currently, in kernel/trace/bpf_trace.c, we have
>
> unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
> {
>         unsigned int ret;
>
>         if (in_nmi()) /* not supported yet */
>                 return 1;
>
>         preempt_disable();
>
>         if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {

Yes, I'm aware of that.

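For readers following along, the guard quoted above can be modelled in plain userspace C. This is only a sketch under stated assumptions: a single int stands in for the per-CPU bpf_prog_active counter, the in_nmi flag is passed in by the caller, and the names guarded_call, prog_fn, nhit and nmiss are hypothetical. The hit/miss accounting is added purely for illustration; the quoted kernel snippet does not show any.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the quoted guard. In the kernel the counter is
 * per-CPU and preemption is disabled around it; here one plain int
 * stands in for a single CPU's counter (assumption).
 */
static int prog_active;       /* hypothetical stand-in for bpf_prog_active */
static unsigned long nhit;    /* events whose program actually ran */
static unsigned long nmiss;   /* events dropped by the guard */

typedef void (*prog_fn)(void);   /* hypothetical "attached program" */

/* Returns true if the program ran, false if the event was dropped. */
static bool guarded_call(prog_fn prog, bool in_nmi)
{
	bool ran = false;

	if (in_nmi) {                 /* NMI context: not supported, drop */
		nmiss++;
		return false;
	}

	if (++prog_active == 1) {     /* first entry on this "CPU": run it */
		nhit++;
		prog();
		ran = true;
	} else {                      /* recursion: drop the event */
		nmiss++;
	}

	assert(prog_active > 0);
	prog_active--;
	return ran;
}

/* A program that triggers another event while it is running. */
static void recursing_prog(void)
{
	guarded_call(recursing_prog, false);   /* dropped: already active */
}
```

Calling guarded_call(recursing_prog, false) once runs the outer program and drops the nested event, so one hit and one miss are recorded; a call with in_nmi set is dropped outright.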
> In the above, the events with bpf program attached will be missed
> if the context is nmi interrupt, or if some recursion happens even with
> the same or different bpf programs.
> In case of recursion, the events will not be sent to ring buffer.

And while that is significantly worse than what ftrace/perf have, it is
fundamentally the same thing.

perf allows (and iirc ftrace does too) 4 nested contexts per CPU
(task, softirq, irq, nmi), but any recursion within those contexts and
we drop stuff.

The BPF stuff is just more eager to drop things on the floor, but it is
fundamentally the same.

> A lot of bpf-based tracing programs use maps to communicate and
> do not allocate ring buffer at all.

So extending PERF_RECORD_LOST doesn't work. But PERF_FORMAT_LOST might
still work fine; but you get to implement it for all software events.

> Maybe we can still use ioctl based approach which is lightweight
> compared to ring buffer approach? If a fd has bpf attached, nhit/nmisses
> means the kprobe is processed by bpf program or not.

There is nothing kprobe specific here. Kprobes just appear to be the
only one actually accounting the recursion cases, but everyone has them.

> Currently, for debugfs, the nhit/nmisses info is exposed at
> {k|u}probe_profile. Alternatively, we could expose the nhit/nmisses
> in /proc/self/fdinfo/. User can query this interface to
> get numbers.

No, we're not adding stuff to procfs for this.
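
The per-context rule mentioned above (four nesting levels per CPU, with recursion inside any one level dropped) can be sketched roughly as follows. This is a userspace illustration, not the perf or ftrace code: struct cpu_state, ctx_enter, ctx_exit and the lost counter are all hypothetical names, and counting the dropped events is an assumption made to show what a per-event lost count would be tallying.

```c
#include <assert.h>
#include <stdbool.h>

/* The four nesting levels named in the mail. */
enum ctx { CTX_TASK, CTX_SOFTIRQ, CTX_IRQ, CTX_NMI, CTX_MAX };

/* Hypothetical per-CPU state: one recursion flag per context level. */
struct cpu_state {
	bool busy[CTX_MAX];
	unsigned long lost;    /* events dropped due to recursion (assumption) */
};

/* Try to enter a level; returns false and counts a lost event on recursion. */
static bool ctx_enter(struct cpu_state *cpu, enum ctx c)
{
	if (cpu->busy[c]) {
		cpu->lost++;
		return false;
	}
	cpu->busy[c] = true;
	return true;
}

/* Leave a level that was successfully entered. */
static void ctx_exit(struct cpu_state *cpu, enum ctx c)
{
	assert(cpu->busy[c]);
	cpu->busy[c] = false;
}
```

With this model, an irq arriving during task context and an nmi arriving during that irq are both accepted, while a second event raised inside the irq level while it is still busy is dropped and counted. The single-counter guard in trace_call_bpf behaves like this structure collapsed to one level, which is why it drops more.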