From: Suren Baghdasaryan
Date: Wed, 30 May 2018 16:32:52 -0700
Subject: Re: [PATCH 0/7] psi: pressure stall information for CPU, memory, and IO
To: Johannes Weiner
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-block@vger.kernel.org, cgroups@vger.kernel.org, Ingo Molnar,
    Peter Zijlstra, Andrew Morton, Tejun Heo, Balbir Singh, Mike Galbraith,
    Oliver Yang, Shakeel Butt, xxx xxx, Taras Kondratiuk, Daniel Walker,
    Vinayak Menon, Ruslan Ruslichenko, kernel-team@fb.com
In-Reply-To: <20180529181616.GB28689@cmpxchg.org>
References: <20180507210135.1823-1-hannes@cmpxchg.org>
    <20180529181616.GB28689@cmpxchg.org>

On Tue, May 29, 2018 at 11:16 AM, Johannes Weiner wrote:
> Hi Suren,
>
> On Fri, May 25, 2018 at 05:29:30PM -0700, Suren Baghdasaryan wrote:
>> Hi Johannes,
>> I tried your previous memdelay patches before this new set was posted
>> and results were promising for predicting when Android system is close
>> to OOM. I'm definitely going to try this one after I backport it to
>> 4.9.
>
> I'm happy to hear that!
>
>> Would it make sense to split CONFIG_PSI into CONFIG_PSI_CPU,
>> CONFIG_PSI_MEM and CONFIG_PSI_IO since one might need only specific
>> subset of this feature?
>
> Yes, that should be doable. I'll split them out in the next version.
>
>> > The total= value gives the absolute stall time in microseconds. This
>> > allows detecting latency spikes that might be too short to sway the
>> > running averages. It also allows custom time averaging in case the
>> > 10s/1m/5m windows aren't adequate for the usecase (or are too coarse
>> > with future hardware).
>>
>> Any reasons these specific windows were chosen (empirical
>> data/historical reasons)? I'm worried that with the smallest window
>> being 10s the signal might be too inert to detect fast memory pressure
>> buildup before OOM kill happens. I'll have to experiment with that
>> first, however if you have some insights into this already please
>> share them.
>
> They were chosen empirically. We started out with the loadavg window
> sizes, but had to reduce them for exactly the reason you mention -
> they're way too coarse to detect acute pressure buildup.
>
> 10s has been working well for us. We could make it smaller, but there
> is some worry that we don't have enough samples then and the average
> becomes too erratic - whereas monitoring total= directly would allow
> you to detect accute spikes and handle this erraticness explicitly.

Unfortunately, the total= field is now updated only at 2-second
intervals, which might be too late to react to mounting memory
pressure. With the previous memdelay patchset, md->aggregate, which is
reported as "total", was calculated directly inside
memdelay_task_change, so it was always up to date. Now group->some and
group->full are updated from inside psi_clock with up to a 2-second
delay, which prevents us from detecting these acute pressure spikes
immediately. I understand why you moved these calculations out of the
hot path, but maybe we could keep updating "total" inside
psi_group_update?
This would allow custom averaging and eliminate the delay in detecting
spikes in the pressure signal.

More conceptually, I would love to have a way to monitor the averages
at a slow rate and, when they rise and cross some threshold, increase
the monitoring rate and react quickly in case they shoot up. The
current 2-second delay poses a problem for doing that.

>
> Let me know how it works out in your tests.

I've done the backport to 4.9 and am running the tests, but the
2-second delay is problematic for getting a detailed look at the signal
and its usefulness. I'm thinking about workarounds, if only for data
collection, but I don't want to deviate too much from your baseline. I
would love to hear from you if a good compromise can be reached here.

>
> Thanks for your feedback.
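
To make the monitoring scheme described above concrete, here is a
minimal userspace sketch. It is not part of the patchset. It assumes
only that total= is a monotonically increasing stall time in
microseconds, as stated in the cover letter. The function names, the 5%
threshold, and the 10s/0.1s polling intervals are hypothetical, chosen
purely for illustration.

```python
def stall_percent(prev_total_us, cur_total_us, elapsed_s):
    """Percentage of wall time spent stalled between two total= samples.

    prev_total_us / cur_total_us: consecutive readings of total=
    (microseconds). elapsed_s: wall-clock seconds between the readings.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    delta_us = cur_total_us - prev_total_us
    return 100.0 * delta_us / (elapsed_s * 1_000_000)


def next_poll_interval(pressure_pct, threshold_pct=5.0,
                       slow_s=10.0, fast_s=0.1):
    """Poll slowly while pressure is low, quickly once it crosses the
    (hypothetical) threshold."""
    return fast_s if pressure_pct >= threshold_pct else slow_s
```

A monitor built this way only sees new stall time as often as the
kernel refreshes total=, so with a 2-second update period the fast
polling interval cannot usefully go below 2 seconds.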