Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp376496ybi; Thu, 1 Aug 2019 20:56:30 -0700 (PDT) X-Google-Smtp-Source: APXvYqxnYB+zR/FIUAsUztgsMAJoXAxKaoAwLyFMGsbvQjP557wxil0XEaWUufpjJNjhavAnkpS6 X-Received: by 2002:a65:53cb:: with SMTP id z11mr5507687pgr.200.1564718190240; Thu, 01 Aug 2019 20:56:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564718190; cv=none; d=google.com; s=arc-20160816; b=UBLyGKsbYo9zQoJh+qB2XlA9tnofczyBw3FpV1qb8IfsRQfZkP9Ekf9iy6ai3UjUEp DpbCFzNaZQ9qlPe3zce/ARMXad4TRUUw+9kniIHskFx5E2bYR4ZavUHCs/aBIusWYUc1 xgrV9A08uEY9SF5hWhwfp+Dik2d2r0hKs2r4ij+83g7khvYBeHKzAsjkWp6IO3c8HMmd SszazcxGE0mCAle7Z1BuSXC4gSH6I+K98F8Snd76SasAKujowUNSVHqp8t0e7+cV6CS1 DsdBPrjymwEegIpH5Ccht3Umbl+0oV+wROC8SeI32buDsgQq2PpSh25tuUVs1n4oliRk VXHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=At1IZWoUrcjzoFBo6+pPU11vzH0qsCf+6x4oe65bvIg=; b=FMsftVxU3iIbtqqscm4RgsBnnoLfQ5pkPF6P6RcctPRLWi4CBJQkFiL9eLTk3+8yB+ ZRYhdrzzd6wcnE4gaVKKqXutXFNgTgzlKkLk0Ool3x3bM82f5y1cV5vjiAW5hijF2Imf CO8jHvovv8FsX6q4+Yqmv+N3bPeV1+RPriYHdj+uyX2iaJsB8VQmHJiwZvp8oGqHFfFD Ca+pnyBMwOfRwpRZq+rrkbYGMacAfvW+wdhxLD9M3Qeqki5Yk0Tw3xH4vpHO/665WHYH avIetwXoGA6O3puSLOYx6cCxHc5otZB7i6O2bl2XtrR2BFhLL5yV0/jzKMNnklr5soT+ YjUA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=uStsrgDN; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 143si37675534pgc.479.2019.08.01.20.55.37; Thu, 01 Aug 2019 20:56:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=uStsrgDN; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388892AbfHAVBH (ORCPT + 99 others); Thu, 1 Aug 2019 17:01:07 -0400 Received: from mail-qt1-f193.google.com ([209.85.160.193]:38252 "EHLO mail-qt1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725846AbfHAVBG (ORCPT ); Thu, 1 Aug 2019 17:01:06 -0400 Received: by mail-qt1-f193.google.com with SMTP id n11so71745055qtl.5; Thu, 01 Aug 2019 14:01:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=At1IZWoUrcjzoFBo6+pPU11vzH0qsCf+6x4oe65bvIg=; b=uStsrgDNX3A3NRNPBmsfEftPfszUtOF9JqpdBbxcAseb2Cp1nI6QJ4BVFu5H/rIZ8v rbzDuskZ1BvP68LPeL0zq8ZjO/V5OD40IDX9ZX4jz0s3j6TiVwylL6qxzXVjoGygHzq9 NmtbEwgot5NskVLQwlSiyq0FuDuFiySrUDL+272PF2AvoPl6l1u6DBsQ8OlznD+uFli4 4OHGxfrZaIQUIYT2PfKT1uatEHmlFCJQpUua+5gcsJt6g8M0BXkEO+QPsDrI1K3I16L/ aTcKdJV9yk1ZrIuhzinYwy3k623M0pkPpqHfW8HbwvMAGxoiGTdSRxSA1LN1rg9S3KDA /nmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=At1IZWoUrcjzoFBo6+pPU11vzH0qsCf+6x4oe65bvIg=; b=asO6aDk2EJl6o6j7nzIcbGXZsoLRDcw836K0s+GuGY6scCSEwGmDxGH0bLIo2PVm86 nSPIEQCsk6RwM5QdqbBT6Vub3OS8U09Qc/YLytKlYmx/jEI3kwEPs6EqijIIcH/wZQP1 i+RJ4mJPm6077rKDVnjujXSlAXo8OfSwhLzcZucfkKI7eYZhz/z1m2H69X+DxR6A8LkM NRIhoVUNSRN+DSeFZoOJ4FTYnh/phKcTiYVSWPTe3wv63UJhEr8ezPholZ+uq9imjsEv 3YFxL0ZtvzJaZ4XvE/oyaTxsScjtak7Yh7sSsM7lueUuCM6hADCF/RAJocWhV/nzygwE iTnw== X-Gm-Message-State: APjAAAVaPOCP/PN+42shtcnt9wWJbpgaPxOEtmVz9PjiDAO2iavuK76s YkP+eQmWKj2XhYzkLxnDB7ftgmrZZA3ZTXXfO4g= X-Received: by 2002:ac8:2646:: with SMTP id v6mr91139076qtv.205.1564693265783; Thu, 01 Aug 2019 14:01:05 -0700 (PDT) MIME-Version: 1.0 References: <156431697805.3170.6377599347542228221.stgit@buzz> <20190729091738.GF9330@dhcp22.suse.cz> <3d6fc779-2081-ba4b-22cf-be701d617bb4@yandex-team.ru> <20190729103307.GG9330@dhcp22.suse.cz> <20190729184850.GH9330@dhcp22.suse.cz> In-Reply-To: <20190729184850.GH9330@dhcp22.suse.cz> From: Yang Shi Date: Thu, 1 Aug 2019 14:00:51 -0700 Message-ID: Subject: Re: [PATCH RFC] mm/memcontrol: reclaim severe usage over high limit in get_user_pages loop To: Michal Hocko Cc: Konstantin Khlebnikov , Linux MM , Linux Kernel Mailing List , cgroups@vger.kernel.org, Vladimir Davydov , Johannes Weiner Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 29, 2019 at 11:48 AM Michal Hocko wrote: > > On Mon 29-07-19 10:28:43, Yang Shi wrote: > [...] > > I don't worry too much about scale since the scale issue is not unique > > to background reclaim, direct reclaim may run into the same problem. > > Just to clarify. By scaling problem I mean 1:1 kswapd thread to memcg. > You can have thousands of memcgs and I do not think we really do want > to create one kswapd for each. Once we have a kswapd thread pool then we > get into a tricky land where a determinism/fairness would be non trivial > to achieve. Direct reclaim, on the other hand is bound by the workload > itself. Yes, I agree thread pool would introduce more latency than dedicated kswapd thread. But, it looks not that bad in our test. When memory allocation is fast, even though dedicated kswapd thread can't catch up. So, such background reclaim is best effort, not guaranteed. I don't quite get what you mean about fairness. Do you mean they may spend excessive cpu time then cause other processes starvation? I think this could be mitigated by properly organizing and setting groups. But, I agree this is tricky. Typically, the processes are placed into different cgroups according to their importance and priority. For example, in our cluster, system processes would go to system group, then latency sensitive jobs and batch jobs (they are usually second class citizens) go to different groups. The memcg kswapd would be enabled for latency sensitive groups only. The memcg kswapd threads would have the same priority with global kswapd. > -- > Michal Hocko > SUSE Labs