Subject: Re: [RFC PATCH] perf: fix panic by mark recursion inside perf_log_throttle
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Namhyung Kim, Alexei Starovoitov,
 Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
 Yonghong Song, John Fastabend, KP Singh,
 "open list:PERFORMANCE EVENTS SUBSYSTEM",
 "open list:PERFORMANCE EVENTS SUBSYSTEM",
 "open list:BPF (Safe dynamic programs and tools)",
 "open list:BPF (Safe dynamic programs and tools)"
References: <20210910153839.GH4323@worktop.programming.kicks-ass.net>
From: 王贇
Message-ID: <6aefb972-3691-2b66-a189-97815df10a12@linux.alibaba.com>
Date: Tue, 14 Sep 2021 10:02:07 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/9/13 6:36 PM, Peter Zijlstra wrote:
> On Mon, Sep 13, 2021 at 12:24:24PM +0200, Peter Zijlstra wrote:
>
> FWIW:
>
>> I'm confused tho; where does the #DF come from? Because taking a #PF
>> from NMI should be perfectly fine.
>>
>> AFAICT that callchain is something like:
>>
>>   NMI
>>     perf_event_nmi_handler()
>>       (part of the chain is missing here)
>>         perf_log_throttle()
>>           perf_output_begin() /* events/ring_buffer.c */
>>             rcu_read_lock()
>>               rcu_lock_acquire()
>>                 lock_acquire()
>>                   trace_lock_acquire() --> perf_trace_foo
>
> This function also calls perf_trace_buf_alloc(), and will have
> incremented the recursion count, such that:
>
>>
>>                     ...
>>                       perf_callchain()
>>                         perf_callchain_user()
>>                           #PF (fully expected during a userspace callchain)
>>                             (some stuff, until the first __fentry)
>>                               perf_trace_function_call
>>                                 perf_trace_buf_alloc()
>>                                   perf_swevent_get_recursion_context()
>>                                     *BOOM*
>
> this one, if it wouldn't mysteriously explode, would find recursion and
> terminate, except that seems to be going side-ways.

Yes, it is supposed to avoid recursion in the same context, but it never
got the chance to do that. The function and the struct should all be
fine. Any idea what could trigger this kind of double fault in such a
situation?

Regards,
Michael Wang

>
>> Now, supposedly we then take another #PF from get_recursion_context() or
>> something, but that doesn't make sense. That should just work...
>>
>> Can you figure out what's going wrong there? going with the RIP, this
>> almost looks like 'swhash->recursion' goes splat, but again that makes
>> no sense, that's a per-cpu variable.
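
For reference, the recursion check this thread keeps coming back to can be
modeled in a few lines of userspace C. This is only a minimal sketch, not the
kernel code: it assumes one counter per execution context (task, softirq,
hardirq, NMI) and a caller that passes the context level in explicitly, whereas
the kernel keeps the counters per CPU and works the level out itself. The names
get_recursion_context() and put_recursion_context() echo the kernel helpers, but
the signatures here, struct recursion_state, enum ctx_level and
nested_attempt_in_nmi() are all hypothetical and exist for illustration only.

```c
#include <assert.h>

/* Hypothetical model: one recursion slot per execution context. */
enum ctx_level { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_MAX };

struct recursion_state {
	int recursion[CTX_MAX];	/* assumption: stands in for per-CPU storage */
};

/*
 * Claim the slot for 'level'. Returns the slot index on success, or -1
 * if an event is already being handled in that same context.
 */
static int get_recursion_context(struct recursion_state *st,
				 enum ctx_level level)
{
	if (st->recursion[level])
		return -1;
	st->recursion[level]++;
	return (int)level;
}

/* Release a slot previously returned by get_recursion_context(). */
static void put_recursion_context(struct recursion_state *st, int rctx)
{
	assert(rctx >= 0 && rctx < CTX_MAX && st->recursion[rctx] > 0);
	st->recursion[rctx]--;
}

/*
 * Models the quoted callchain: an outer tracepoint handler running in
 * NMI context holds the slot, then a nested handler in the same context
 * tries to claim it again. Returns the result of the nested attempt.
 */
static int nested_attempt_in_nmi(struct recursion_state *st)
{
	int outer = get_recursion_context(st, CTX_NMI);
	int inner = get_recursion_context(st, CTX_NMI);

	if (inner >= 0)
		put_recursion_context(st, inner);
	if (outer >= 0)
		put_recursion_context(st, outer);
	return inner;
}
```

In this model the nested attempt gets -1 and simply bails out, which is the
"would find recursion and terminate" behaviour described above. The sketch says
nothing about why the real call ends in a double fault instead.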