Date: Thu, 11 Oct 2018 11:29:13 +0200
From: Peter Zijlstra
To: Song Liu
Cc: Ingo Molnar, lkml, acme@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@redhat.com,
	eranian@google.com, tglx@linutronix.de,
	alexey.budankov@linux.intel.com, mark.rutland@arm.com,
	megha.dey@intel.com, frederic@kernel.org
Subject: Re: [RFC][PATCH] perf: Rewrite core context handling
Message-ID: <20181011092913.GA9848@hirez.programming.kicks-ass.net>
References: <20181010104559.GO5728@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 11, 2018 at 07:50:23AM +0000, Song Liu wrote:
> Hi Peter,
>
> I am trying to understand this. Pardon me if any question is silly.
>
> I am not sure I fully understand the motivation here. I guess we
> see problem when there are two (or more) independent hardware PMUs
> per cpu? Then on a given cpu, there are two (or more)
> perf_cpu_context, but only one task context?

Right.

> If this is correct (I really doubt...), I guess perf_rotate_context()
> is the problem?

No, everything comes apart. Where would you put the events of the second
PMU?

The thing most often proposed is pretending the second PMU is a
'software' PMU and sticking the events on the software PMU context.

But because software PMUs must never fail to schedule an event, that
results in some quite horrible things -- including that we cannot
round-robin (RR) the events.

Similarly the big.little guys have the problem that the PMUs are not the
same between big and little cores, and they fudge something horrible.

By having clear ordering on PMU, that can be cleaned up too.

> And if this is still correct, this patch may not help,
> as we are doing rotation for each perf_cpu_pmu_context? (or rotation
> per perf_event_context is the next step?).

We do indeed do rotation per perf_cpu_pmu_context, however:

 - perf_cpu_pmu_context embeds a cpu scope perf_event_pmu_context,
 - perf_cpu_pmu_context tracks the currently associated task scope
   perf_event_pmu_context.

So it can rotate all current events for a particular PMU.

> Or step back a little... I see two big changes:
>
> 1. struct perf_ctx_context is now per cpu (instead of per pmu per cpu);
> 2. one perf_event_ctxp per task_struct (instead of 2).

Correct, we reduce to 1 cpu context and 1 task context at all times.
This in fact simplifies quite a bit of things.

> I think #1 is a bigger change than #2. Is this correct?

They're the 'same' change. The primary purpose was 2, but having only a
single cpu context is a direct consequence.

> Could you please help me understand it better?

I hope this helps you understand; please feel free to ask more.
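
As a rough illustration of the two bullet points about
perf_cpu_pmu_context above, here is a minimal user-space sketch. It is
not the patch itself. Only the struct names come from this thread; the
field names (epc, task_epc), the toy event array and both helper
functions are assumptions made up for the example, and the real
structures hold far more state.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 4	/* arbitrary toy limit, not a kernel constant */

/* Hypothetical stand-in: the events of ONE PMU in one scope (cpu or task). */
struct perf_event_pmu_context {
	int events[MAX_EVENTS];	/* toy event ids; index 0 is scheduled first */
	size_t nr_events;
};

/* Per cpu, per PMU: embeds the cpu scope context, points at the task one. */
struct perf_cpu_pmu_context {
	struct perf_event_pmu_context epc;		/* embedded cpu scope */
	struct perf_event_pmu_context *task_epc;	/* current task scope, or NULL */
};

/* One round-robin step: move the first event to the back of the list. */
void rotate_epc(struct perf_event_pmu_context *epc)
{
	int first;
	size_t i;

	if (epc->nr_events < 2)
		return;

	first = epc->events[0];
	for (i = 1; i < epc->nr_events; i++)
		epc->events[i - 1] = epc->events[i];
	epc->events[epc->nr_events - 1] = first;
}

/* Rotate everything currently active for this PMU on this cpu. */
void rotate_cpu_pmu_context(struct perf_cpu_pmu_context *cpc)
{
	assert(cpc != NULL);

	rotate_epc(&cpc->epc);		/* cpu scope events */
	if (cpc->task_epc)		/* task scope events, if a task is attached */
		rotate_epc(cpc->task_epc);
}
```

Calling rotate_cpu_pmu_context() on a context whose cpu scope list is
{1, 2} and whose task scope list is {10, 11, 12} leaves them as {2, 1}
and {11, 12, 10}. Both scopes of the same PMU are reached from the one
perf_cpu_pmu_context, which is the point of the two bullets.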