Date: Thu, 19 Jul 2018 13:54:05 -0400
From: Johannes Weiner
To: Linus Torvalds
Cc: Peter Zijlstra, Ingo Molnar, Andrew Morton, Tejun Heo,
	surenb@google.com, Vinayak Menon, Christoph Lameter, Mike Galbraith,
	shakeelb@google.com, linux-mm, cgroups, Linux Kernel Mailing List,
	kernel-team
Subject: Re: [PATCH 08/10] psi: pressure stall information for CPU, memory, and IO
Message-ID: <20180719175405.GA19230@cmpxchg.org>
References: <20180712172942.10094-1-hannes@cmpxchg.org>
	<20180712172942.10094-9-hannes@cmpxchg.org>
	<20180718120318.GC2476@hirez.programming.kicks-ass.net>

On Thu, Jul 19, 2018 at 08:08:20AM -0700, Linus Torvalds wrote:
> On Wed, Jul 18, 2018 at 5:03 AM Peter Zijlstra wrote:
> >
> > And as said before, we can compress the state from 12 bytes, to 6 bits
> > (or 1 byte), giving another 11 bytes for 59 bytes free.
> >
> > Leaving us just 5 bytes short of needing a single cacheline :/
>
> Do you actually need 64 bits for the times?
>
> That's the big cost. And it seems ridiculous, if you actually care about size.
>
> You already have a 64-bit start time. Everything else is some
> cumulative relative time. Do those really need 64-bit and nanosecond
> resolution?
>
> Maybe a 32-bit microsecond would be ok - would you ever account more
> than 35 minutes of anything without starting anew?

D'oh, you're right, the per-cpu buckets don't need to be this big at
all. In fact, we flush those deltas out every 2 seconds when there is
activity to maintain the running averages. Since we get 4.2s worth of
nanoseconds into a u32, we don't even need to divide in the hotpath.

Something along the lines of this here should work:

static void psi_group_change(struct psi_group *group, int cpu, u64 now,
			     unsigned int clear, unsigned int set)
{
	struct psi_group_cpu *groupc;
	unsigned int *tasks;
	unsigned int t;
	u32 delta;

	groupc = per_cpu_ptr(group->cpus, cpu);
	tasks = groupc->tasks;

	/* Time since last task change on this runqueue */
	delta = now - groupc->last_time;
	groupc->last_time = now;

	/* Tasks waited for IO? */
	if (tasks[NR_IOWAIT]) {
		if (!tasks[NR_RUNNING])
			groupc->full_time[PSI_IO] += delta;
		else
			groupc->some_time[PSI_IO] += delta;
	}

	/* Tasks waited for memory? */
	if (tasks[NR_MEMSTALL]) {
		if (!tasks[NR_RUNNING] ||
		    (cpu_curr(cpu)->flags & PF_MEMSTALL))
			groupc->full_time[PSI_MEM] += delta;
		else
			groupc->some_time[PSI_MEM] += delta;
	}

	/* Tasks waited for the CPU? */
	if (tasks[NR_RUNNING] > 1)
		groupc->some_time[PSI_CPU] += delta;

	/* Tasks were generally non-idle? To weigh the CPU in summaries */
	if (tasks[NR_RUNNING] || tasks[NR_IOWAIT] || tasks[NR_MEMSTALL])
		groupc->nonidle_time += delta;

	/* Update task counts according to the set/clear bitmasks */
	for (t = 0; clear; clear &= ~(1 << t), t++)
		if (clear & (1 << t))
			groupc->tasks[t]--;
	for (t = 0; set; set &= ~(1 << t), t++)
		if (set & (1 << t))
			groupc->tasks[t]++;

	/* Kick the stats aggregation worker if it's gone to sleep */
	if (!delayed_work_pending(&group->clock_work))
		schedule_delayed_work(&group->clock_work, PSI_FREQ);
}

And then we can pack it down to one cacheline:

struct psi_group_cpu {
	/* States of the tasks belonging to this group */
	unsigned int tasks[NR_PSI_TASK_COUNTS]; // 3

	/* Time sampling bucket for pressure states - no FULL for CPU */
	u32 some_time[NR_PSI_RESOURCES];
	u32 full_time[NR_PSI_RESOURCES - 1];

	/* Time sampling bucket for non-idle state (ns) */
	u32 nonidle_time;

	/* Time of last task change in this group (rq_clock) */
	u64 last_time;
};

I'm going to go test with this.

Thanks
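
For anyone who wants to sanity-check the two numbers above (the u32 nanosecond range and the one-cacheline claim), here is a minimal userspace sketch. It is not the kernel code. The `_sketch` struct name, the helpers `u32_ns_capacity_ms()` and `sample_delta()`, and `CACHELINE_BYTES` are hypothetical names made up for illustration. It assumes NR_PSI_TASK_COUNTS is 3 (per the "// 3" note), NR_PSI_RESOURCES is 3 (IO, memory, CPU), and a 64-byte cacheline.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumptions, not taken from the patch itself: three task counters
 * (from the "// 3" note), three resources (IO, memory, CPU), and a
 * 64-byte cacheline.
 */
#define NR_PSI_TASK_COUNTS	3
#define NR_PSI_RESOURCES	3
#define CACHELINE_BYTES		64

/* Userspace mirror of the packed layout; the _sketch name is hypothetical */
struct psi_group_cpu_sketch {
	unsigned int tasks[NR_PSI_TASK_COUNTS];
	uint32_t some_time[NR_PSI_RESOURCES];
	uint32_t full_time[NR_PSI_RESOURCES - 1];
	uint32_t nonidle_time;
	uint64_t last_time;
};

/* Fails to compile if the layout outgrows one assumed cacheline */
_Static_assert(sizeof(struct psi_group_cpu_sketch) <= CACHELINE_BYTES,
	       "psi_group_cpu_sketch must fit in one cacheline");

/* Whole milliseconds a u32 nanosecond bucket can hold before it wraps */
static inline uint32_t u32_ns_capacity_ms(void)
{
	return UINT32_MAX / 1000000u;
}

/*
 * Truncating delta in the style of psi_group_change(): exact as long as
 * fewer than 2^32 ns passed between the two clock samples.
 */
static inline uint32_t sample_delta(uint64_t now, uint64_t last)
{
	return (uint32_t)(now - last);
}
```

The capacity works out to a bit under 4.3 seconds, so a bucket that is flushed every 2 seconds has roughly 2x headroom before the u32 would wrap. That only holds if the flush really does run on time, which this sketch does not model.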