From: Joonsoo Kim
Date: Tue, 7 Apr 2020 10:27:44 +0900
Subject: Re: [PATCH v5 05/10] mm/swap: charge the page when adding to the swap cache
To: Yang Shi
Cc: Andrew Morton, Linux MM, Linux Kernel Mailing List, Johannes Weiner, Michal Hocko, Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman, kernel-team@lge.com, Joonsoo Kim
References: <1585892447-32059-1-git-send-email-iamjoonsoo.kim@lge.com> <1585892447-32059-6-git-send-email-iamjoonsoo.kim@lge.com>

On Tue, Apr 7, 2020 at 9:22 AM, Yang Shi wrote:
>
> On Sun, Apr 5, 2020 at 6:03 PM Joonsoo Kim wrote:
> >
> > On Sat, Apr 4, 2020 at 3:29 AM, Yang Shi wrote:
> > >
> > > On Thu, Apr 2, 2020 at 10:41 PM wrote:
> > > >
> > > > From: Joonsoo Kim
> > > >
> > > > Currently, some swapped-in pages are not charged to the memcg until
> > > > the page is actually accessed. I checked the code and found that
> > > > it could cause a problem.
> > > > In this implementation, even if the memcg
> > > > is enabled, one can consume a lot of memory in the system by exploiting
> > > > this hole. For example, one can make all the pages swapped out and
> > > > then call madvise_willneed() to load all the swapped-out pages without
> > > > pressuring the memcg. Although actual access requires charging, it is a
> > > > really big benefit to load the swapped-out pages into memory without
> > > > pressuring the memcg.
> > > >
> > > > And, for workingset detection, which is implemented in the following patch,
> > > > a memcg should be committed before the workingset detection is executed.
> > > > For this purpose, the best solution, I think, is charging the page when
> > > > adding it to the swap cache. Charging there is not that hard. The caller
> > > > adding the page to the swap cache has enough information about the charged
> > > > memcg. So, what we need to do is just pass this information to
> > > > the right place.
> > > >
> > > > With this patch, a specific memcg could be pressured more since readahead
> > > > pages are also charged to it now. This would result in performance
> > > > degradation for that user, but it would be fair since that readahead is for
> > > > that user.
> > >
> > > If I read the code correctly, the readahead pages may *not* be charged
> > > to it at all but to other memcgs, since mem_cgroup_try_charge() would
> > > retrieve the target memcg id from the swap entry and then charge to it
> > > (generally it is the memcg from which the page was swapped out). So, it
> > > may open a backdoor to let one memcg stress other memcgs?
> >
> > It looks like you are talking about the call path with CONFIG_MEMCG_SWAP.
> >
> > The owner (task) of an anonymous page cannot be changed. It means that
> > the previous owner recorded in the swap entry will be the next user. So,
> > I think that using the target memcg id from the swap entry for readahead pages
> > is a valid approach.
> >
> > As you are concerned, if someone could control swap readahead to read ahead
> > others' swap entries, one memcg could stress another memcg by using the fact above.
> > However, as far as I know, there is no explicit way to read ahead others' swap
> > entries, so there is no problem.
>
> Swap cluster readahead would read ahead pages on consecutive swap
> entries which may belong to different memcgs; however, I just figured
> out that patch #8 ("mm/swap: do not readahead if the previous owner of the
> swap entry isn't me") would prevent reading ahead pages belonging
> to other memcgs. This would kill the potential problem.

Yes, that patch kills the potential problem. However, I think that swap cluster
readahead would not open the backdoor even without patch #8 in the
CONFIG_MEMCG_SWAP case, because:

1. Consecutive swap space is usually filled by the same task.
2. Swap cluster readahead imposes a large I/O cost on the offender, and the
   effect on the target isn't serious.
3. Those pages would be charged to their previous owner, which is valid.

Thanks.
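
As a rough illustration of point 3 and of what patch #8 changes, here is a toy model in plain Python. It is not kernel code. The function name cluster_readahead(), the skip_foreign flag, and the aligned readahead window are all assumptions made for this sketch; the only behavior taken from the discussion above is that each readahead page is charged to the memcg recorded in its swap entry, and that patch #8 skips entries whose previous owner is not the faulting memcg.

```python
from collections import Counter


def cluster_readahead(slot_owners, fault_index, window, faulting_memcg,
                      skip_foreign=False):
    """Toy model: return {memcg: pages charged} for one readahead.

    slot_owners[i] is the memcg recorded for swap slot i at swap-out time
    (a hypothetical stand-in for the memcg id stored with the swap entry).
    The window is assumed to be aligned to `window` slots, which is a
    simplification and not a claim about the kernel's exact behavior.
    """
    start = (fault_index // window) * window
    end = min(start + window, len(slot_owners))
    charges = Counter()
    for i in range(start, end):
        owner = slot_owners[i]
        # Models the idea of patch #8: do not read ahead a slot whose
        # previous owner is not the faulting memcg.
        if skip_foreign and owner != faulting_memcg:
            continue
        # Each page brought in is charged to its previous owner.
        charges[owner] += 1
    return dict(charges)


# Swap space mostly filled by memcg "A", with one slot owned by "B".
slots = ["A", "A", "A", "B", "A", "A", "A", "A"]
without_patch8 = cluster_readahead(slots, 0, 8, "A")
with_patch8 = cluster_readahead(slots, 0, 8, "A", skip_foreign=True)
print(without_patch8, with_patch8)
```

In this model, without the skip, the single page owned by "B" is charged to "B" while "A" pays for the rest of the window it triggered; with the skip, "B" is not charged at all.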