Date: Sat, 20 Jan 2024 15:21:53 +0800
From: Luo Gengkun
To: Adrian Hunter
Subject: Re: [PATCH] perf/core: Fix small negative period being ignored
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240116083915.2859302-1-luogengkun2@huawei.com>
 <66cdc5f9-a09a-4841-8fca-252b7c78114b@intel.com>
In-Reply-To: <66cdc5f9-a09a-4841-8fca-252b7c78114b@intel.com>

On 2024/1/19 15:36, Adrian Hunter wrote:
> On 16/01/24 10:39, Luo Gengkun wrote:
>> In perf_adjust_period we first calculate the period, and then use
>> this period to calculate delta. However, when delta is less than 0,
>> the result deviates from the case where delta is greater than or
>> equal to 0. For example, when delta is in the range [-14, -1], the
>> range of delta = delta + 7 is [-7, 6], so the final value of
>> delta / 8 is 0. The adjustments -1 and -2 are therefore ignored.
>> This is unacceptable when the target period is very short, because
>> we lose a lot of samples.
>>
>> Here are some tests and analyses:
>>
>> before:
>> # perf record -e cs -F 1000 ./a.out
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.022 MB perf.data (518 samples) ]
>>
>> # perf script
>> ...
>> a.out 396 257.956048: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.957891: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.959730: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.961545: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.963355: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.965163: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.966973: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.968785: 23 cs: ffffffff81f4eeec schedul>
>> a.out 396 257.970593: 23 cs: ffffffff81f4eeec schedul>
>> ...
>>
>> after:
>> # perf record -e cs -F 1000 ./a.out
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.058 MB perf.data (1466 samples) ]
>>
>> # perf script
>> ...
>> a.out 395 59.338813: 11 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.339707: 12 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.340682: 13 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.341751: 13 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.342799: 12 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.343765: 11 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.344651: 11 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.345539: 12 cs: ffffffff81f4eeec schedul>
>> a.out 395 59.346502: 13 cs: ffffffff81f4eeec schedul>
>> ...
>>
>> test.c
>>
>> int main() {
>> 	for (int i = 0; i < 20000; i++)
>> 		usleep(10);
>>
>> 	return 0;
>> }
>>
>> # time ./a.out
>> real	0m1.583s
>> user	0m0.040s
>> sys	0m0.298s
>>
>> The above results were tested on x86-64 qemu with KVM enabled, using
>> test.c as the test program. Ideally, we should get around 1500
>> samples, but the previous algorithm produced only about 500, whereas
>> the modified algorithm now produces about 1400. Furthermore, the new
>> version shows 1 sample per 0.001s, while the previous one shows
>> 1 sample per 0.002s. This indicates that the new algorithm is more
>> sensitive to small negative values than the old algorithm.
>>
>> Fixes: bd2b5b12849a ("perf_counter: More aggressive frequency adjustment")
>> Signed-off-by: Luo Gengkun
>
> It seems better, and the maths makes sense, so:
>
> Reviewed-by: Adrian Hunter
>
> But the test case still seems to give unexpected results. Usually:
>
> # time taskset --cpu 1 ./test
> real	0m 1.25s
> user	0m 0.03s
> sys	0m 0.00
> # taskset --cpu 0 perf record -F 1000 -e cs -- taskset --cpu 1 ./test
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.051 MB perf.data (1290 samples) ]
>
> But occasionally:
>
> # taskset --cpu 0 perf record -F 1000 -e cs -- taskset --cpu 1 ./test
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.010 MB perf.data (204 samples) ]
> # perf script
> ...
> test 865 265.377846: 16 cs: ffffffff832e927b schedule+0x2b
> test 865 265.378900: 15 cs: ffffffff832e927b schedule+0x2b
> test 865 265.379845: 14 cs: ffffffff832e927b schedule+0x2b
> test 865 265.380770: 14 cs: ffffffff832e927b schedule+0x2b
> test 865 265.381647: 15 cs: ffffffff832e927b schedule+0x2b
> test 865 265.382638: 16 cs: ffffffff832e927b schedule+0x2b
> test 865 265.383647: 16 cs: ffffffff832e927b schedule+0x2b
> test 865 265.384704: 15 cs: ffffffff832e927b schedule+0x2b
> test 865 265.385649: 14 cs: ffffffff832e927b schedule+0x2b
> test 865 265.386578: 152 cs: ffffffff832e927b schedule+0x2b
> test 865 265.396383: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.406183: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.415839: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.425445: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.435052: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.444708: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.454314: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.463970: 154 cs: ffffffff832e927b schedule+0x2b
> test 865 265.473577: 154 cs: ffffffff832e927b schedule+0x2b
> ...

It seems that the unexpected results are caused by timer interrupts not
arriving every TICK_NSEC. I guess this is due to system idleness. I
tried the patch below, and it appears to fix the issue. You can give it
a try as well.

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index afb028c54f33..2708f1d0692c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -265,6 +265,7 @@ struct hw_perf_event {
 	 * State for freq target events, see __perf_event_overflow() and
 	 * perf_adjust_freq_unthr_context().
 	 */
+	u64				freq_tick_stamp;
 	u64				freq_time_stamp;
 	u64				freq_count_stamp;
 #endif
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cad50d3439f1..fe0d9b470365 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4112,7 +4112,7 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 {
 	struct perf_event *event;
 	struct hw_perf_event *hwc;
-	u64 now, period = TICK_NSEC;
+	u64 now, period, tick_stamp;
 	s64 delta;
 
 	/*
@@ -4151,6 +4151,10 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 		 */
 		event->pmu->stop(event, PERF_EF_UPDATE);
 
+		tick_stamp = perf_clock();
+		period = tick_stamp - hwc->freq_tick_stamp;
+		hwc->freq_tick_stamp = tick_stamp;
+
 		now = local64_read(&event->count);
 		delta = now - hwc->freq_count_stamp;
 		hwc->freq_count_stamp = now;
@@ -4162,8 +4166,14 @@ perf_adjust_freq_unthr_context(struct perf_event_context *ctx, bool unthrottle)
 		 * to perf_adjust_period() to avoid stopping it
 		 * twice.
 		 */
-		if (delta > 0)
-			perf_adjust_period(event, period, delta, false);
+		if (delta > 0) {
+			/*
+			 * we skip first tick adjust period
+			 */
+			if (likely(period != tick_stamp)) {
+				perf_adjust_period(event, period, delta, false);
+			}
+		}
 
 		event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
  next:

>> ---
>>  kernel/events/core.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 683dc086ef10..cad50d3439f1 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -4078,7 +4078,11 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo
>>  	period = perf_calculate_period(event, nsec, count);
>>  
>>  	delta = (s64)(period - hwc->sample_period);
>> -	delta = (delta + 7) / 8; /* low pass filter */
>> +	if (delta >= 0)
>> +		delta += 7;
>> +	else
>> +		delta -= 7;
>> +	delta /= 8; /* low pass filter */
>>  
>>  	sample_period = hwc->sample_period + delta;
>>
>