From: Mina Almasry
Date: Fri, 27 Sep 2019 15:51:24 -0700
Subject: Re: [PATCH v5 0/7] hugetlb_cgroup: Add hugetlb_cgroup reservation limits
To: Mike Kravetz
Cc: Tejun Heo, David Rientjes, Aneesh Kumar, shuah, Shakeel Butt,
    Greg Thelen, Andrew Morton, khalid.aziz@oracle.com, open list,
    linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
    cgroups@vger.kernel.org, Michal Koutný
In-Reply-To: <794398cc-07a4-d235-a0da-0246f5a09f6e@oracle.com>
References: <20190919222421.27408-1-almasrymina@google.com>
    <3c73d2b7-f8d0-16bf-b0f0-86673c3e9ce3@oracle.com>
    <8f7db4f1-9c16-def5-79dc-d38d6b9d150e@oracle.com>
    <794398cc-07a4-d235-a0da-0246f5a09f6e@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 27, 2019 at 2:59 PM Mike Kravetz wrote:
>
> On 9/26/19 5:55 PM, Mina Almasry wrote:
> > Provided we keep the existing controller untouched, should the new
> > controller track:
> >
> > 1. only reservations, or
> > 2. both reservations and allocations for which no reservations exist
> > (such as the MAP_NORESERVE case)?
> >
> > I like the 'both' approach. Seems to me a counter like that would work
> > automatically regardless of whether the application is allocating
> > hugetlb memory with NORESERVE or not. NORESERVE allocations cannot cut
> > into reserved hugetlb pages, correct?
>
> Correct.
> One other easy way to allocate huge pages without reserves
> (that I know is used today) is via the fallocate system call.
>
> > If so, then applications that
> > allocate with NORESERVE will get sigbused when they hit their limit,
> > and applications that allocate without NORESERVE may get an error at
> > mmap time but will always be within their limits while they access the
> > mmap'd memory, correct?
>
> Correct. At page allocation time we can easily check to see if a reservation
> exists and not charge. For any specific page within a hugetlbfs file,
> a charge would happen at mmap time or allocation time.
>
> One exception (that I can think of) to this mmap(RESERVE) will not cause
> a SIGBUS rule is in the case of hole punch. If someone punches a hole in
> a file, not only do they remove pages associated with the file but the
> reservation information as well. Therefore, a subsequent fault will be
> the same as an allocation without reservation.
>

I don't think it causes a sigbus. This is the scenario, right:

1. Make cgroup with limit X bytes.
2. Task in cgroup mmaps a file with X bytes, causing the cgroup to get
   charged.
3. A hole of size Y is punched in the file, causing the cgroup to get
   uncharged Y bytes.
4. The task faults in memory from the hole, getting charged up to Y
   bytes again. But they will still be within their limits.

IIUC userspace only gets sigbus'd if the limit is lowered between
steps 3 and 4, and it's ok if it gets sigbus'd there in my opinion.

> I 'think' the code to remove/truncate a file will work correctly as it
> is today, but I need to think about this some more.
>
> > mmap'd memory, correct? So the 'both' counter seems like a one size
> > fits all.
> >
> > I think the only sticking point left is whether an added controller
> > can support both cgroup-v2 and cgroup-v1. If I could get confirmation
> > on that I'll provide a patchset.
>
> Sorry, but I can not provide cgroup expertise.
> --
> Mike Kravetz
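
The arithmetic of the four-step hole-punch scenario above can be checked with a toy counter. This is only a sketch of the accounting as described in the thread, not kernel code: the class `ToyHugetlbCounter`, its `try_charge`/`uncharge` methods, and the sizes chosen for X and Y are all hypothetical, and a failed charge here merely stands in for the mmap error or SIGBUS discussed above.

```python
class ToyHugetlbCounter:
    """Toy model of a per-cgroup hugetlb charge counter (hypothetical, not kernel code)."""

    def __init__(self, limit):
        self.limit = limit  # limit in bytes
        self.usage = 0      # bytes currently charged

    def try_charge(self, nbytes):
        """Charge nbytes if usage stays within the limit; return True on success."""
        if self.usage + nbytes > self.limit:
            return False    # stands in for an mmap error or a SIGBUS at fault time
        self.usage += nbytes
        return True

    def uncharge(self, nbytes):
        """Return nbytes to the counter, never dropping below zero."""
        self.usage -= min(nbytes, self.usage)


X = 8 * 2**21  # assumed limit: eight 2 MiB huge pages
Y = 2 * 2**21  # assumed hole size: two 2 MiB huge pages

# Steps 1-4 as listed in the message.
cg = ToyHugetlbCounter(limit=X)  # 1. cgroup with limit X
mmap_ok = cg.try_charge(X)       # 2. mmap of X bytes charges the cgroup
cg.uncharge(Y)                   # 3. hole punch uncharges Y bytes
fault_ok = cg.try_charge(Y)      # 4. faulting the hole back in recharges Y

# Variant: the limit is lowered between steps 3 and 4.
cg2 = ToyHugetlbCounter(limit=X)
cg2.try_charge(X)
cg2.uncharge(Y)
cg2.limit = X - Y
fault_after_lowering_ok = cg2.try_charge(Y)
```

In this model the refault in step 4 succeeds because the uncharge in step 3 freed exactly the room it needs; the charge only fails in the variant where the limit drops in between, which matches the one case the message says may legitimately SIGBUS.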