From: Stephane Eranian
Date: Wed, 23 Oct 2019 00:06:43 -0700
Subject: Re: [PATCH] perf/core: fix multiplexing event scheduling issue
To: Peter Zijlstra
Cc: LKML, mingo@elte.hu, Arnaldo Carvalho de Melo, Jiri Olsa,
    "Liang, Kan", Song Liu, Ian Rogers
In-Reply-To: <20191021100558.GC1800@hirez.programming.kicks-ass.net>
References: <20191018002746.149200-1-eranian@google.com>
    <20191021100558.GC1800@hirez.programming.kicks-ass.net>

On Mon, Oct 21, 2019 at 3:06 AM Peter Zijlstra wrote:
>
> On Thu, Oct 17, 2019 at 05:27:46PM -0700, Stephane Eranian wrote:
> > @@ -2153,6 +2157,7 @@ __perf_remove_from_context(struct perf_event *event,
> >                                 void *info)
> >  {
> >         unsigned long flags = (unsigned long)info;
> > +       int was_necessary = ctx->rotate_necessary;
> >
> >         if (ctx->is_active & EVENT_TIME) {
> >                 update_context_time(ctx);
> > @@ -2171,6 +2176,37 @@ __perf_remove_from_context(struct perf_event *event,
> >                         cpuctx->task_ctx = NULL;
> >                 }
> >         }
> > +
> > +       /*
> > +        * sanity check that event_sched_out() does not and will not
> > +        * change the state of ctx->rotate_necessary
> > +        */
> > +       WARN_ON(was_necessary != event->ctx->rotate_necessary);
>
> It doesn't... why is this important to check?
>
I can remove that. It is leftover from debugging. It is okay to look
at the situation after event_sched_out(). Today, it does not change
rotate_necessary.

> > +       /*
> > +        * if we remove an event AND we were multiplexing then, that means
> > +        * we had more events than we have counters for, and thus, at least,
> > +        * one event was in INACTIVE state. Now that we removed an event,
> > +        * we need to resched to give a chance to all events to get scheduled,
> > +        * otherwise some may get stuck.
> > +        *
> > +        * By the time this function is called the event is usually in the OFF
> > +        * state.
> > +        * Note that this is not a duplicate of the same code in _perf_event_disable()
> > +        * because the call paths are different. Some events may be simply disabled,
>
> It is the exact same code twice though; IIRC this C language has a
> feature to help with that.

Sure! I will make a function to check on the condition.

> > +        * others removed. There is a way to get removed and not be disabled first.
> > +        */
> > +       if (ctx->rotate_necessary && ctx->nr_events) {
> > +               int type = get_event_type(event);
> > +               /*
> > +                * In case we removed a pinned event, then we need to
> > +                * resched for both pinned and flexible events. The
> > +                * opposite is not true. A pinned event can never be
> > +                * inactive due to multiplexing.
> > +                */
> > +               if (type & EVENT_PINNED)
> > +                       type |= EVENT_FLEXIBLE;
> > +               ctx_resched(cpuctx, cpuctx->task_ctx, type);
> > +       }
>
> What you're relying on is that ->rotate_necessary implies ->is_active
> and there's pending events. And if we tighten ->rotate_necessary you can
> remove the && ->nr_events.
>
Imagine I have 6 events and 4 counters and I delete them all before the
timer expires. Then I can be in a situation where rotate_necessary is
still true and yet there are no more events in the context. That is
because only ctx_sched_out() clears rotate_necessary, IIRC. So that is
why the && nr_events is there. Calling ctx_resched() with no events
probably wouldn't cause any harm, it would just be wasted work.

If by tightening you mean clearing rotate_necessary earlier, that would
be tricky: the only reliable point to clear it is when you know you are
about to reschedule everything. Removing an event by itself may not be
enough to eliminate multiplexing.
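For reference, here is a rough sketch of the kind of helper alluded to
above ("a function to check on the condition"). The name and placement
are made up for illustration, it is not the posted patch, but it keeps
the && nr_events guard discussed above:

static void perf_ctx_resched_if_multiplexing(struct perf_cpu_context *cpuctx,
                                             struct perf_event_context *ctx,
                                             struct perf_event *event)
{
        int type;

        /*
         * rotate_necessary means at least one event could not get a
         * counter (multiplexing). The nr_events check avoids a pointless
         * resched when the context has already been emptied.
         */
        if (!ctx->rotate_necessary || !ctx->nr_events)
                return;

        type = get_event_type(event);
        /*
         * Freeing a counter held by a pinned event can unblock both
         * pinned and flexible events, so widen the resched in that case.
         */
        if (type & EVENT_PINNED)
                type |= EVENT_FLEXIBLE;

        ctx_resched(cpuctx, cpuctx->task_ctx, type);
}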
> > @@ -2232,6 +2270,35 @@ static void __perf_event_disable(struct perf_event *event,
> >                 event_sched_out(event, cpuctx, ctx);
> >
> >         perf_event_set_state(event, PERF_EVENT_STATE_OFF);
> > +       /*
> > +        * sanity check that event_sched_out() does not and will not
> > +        * change the state of ctx->rotate_necessary
> > +        */
> > +       WARN_ON_ONCE(was_necessary != event->ctx->rotate_necessary);
> > +
> > +       /*
> > +        * if we disable an event AND we were multiplexing then, that means
> > +        * we had more events than we have counters for, and thus, at least,
> > +        * one event was in INACTIVE state. Now that we disabled an event,
> > +        * we need to resched to give a chance to all events to be scheduled,
> > +        * otherwise some may get stuck.
> > +        *
> > +        * Note that this is not a duplicate of the same code in
> > +        * __perf_remove_from_context()
> > +        * because events can be disabled without being removed.
>
> It _IS_ a duplicate, it is the _exact_ same code twice. What you're
> trying to say is that we need it in both places, but that's something
> else entirely.
>
Will refactor.

> > +        */
> > +       if (ctx->rotate_necessary && ctx->nr_events) {
> > +               int type = get_event_type(event);
> > +               /*
> > +                * In case we removed a pinned event, then we need to
> > +                * resched for both pinned and flexible events. The
> > +                * opposite is not true. A pinned event can never be
> > +                * inactive due to multiplexing.
> > +                */
> > +               if (type & EVENT_PINNED)
> > +                       type |= EVENT_FLEXIBLE;
> > +               ctx_resched(cpuctx, cpuctx->task_ctx, type);
> > +       }
> > }
> >
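To make the refactor concrete, the two call sites would then reduce to
something like the following (illustrative only, using the made-up helper
name from the sketch above; surrounding code omitted):

        /* at the end of __perf_remove_from_context(), after the removal: */
        perf_ctx_resched_if_multiplexing(cpuctx, ctx, event);

        /* at the end of __perf_event_disable(), after the state is set to OFF: */
        perf_ctx_resched_if_multiplexing(cpuctx, ctx, event);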