To: Michal Hocko, Mike Rapoport
Cc: Andrew Morton, Alexander Viro, Andy Lutomirski, Arnd Bergmann,
 Borislav Petkov, Catalin Marinas, Christopher Lameter, Dan Williams,
 Dave Hansen, Elena Reshetova, "H. Peter Anvin", Ingo Molnar,
 James Bottomley, "Kirill A. Shutemov", Matthew Wilcox, Mark Rutland,
 Mike Rapoport, Michael Kerrisk, Palmer Dabbelt, Paul Walmsley,
 Peter Zijlstra, Rick Edgecombe, Roman Gushchin, Shakeel Butt,
 Shuah Khan, Thomas Gleixner, Tycho Andersen, Will Deacon,
 linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-nvdimm@lists.01.org,
 linux-riscv@lists.infradead.org, x86@kernel.org, Hagen Paul Pfeifer,
 Palmer Dabbelt
References: <20210121122723.3446-1-rppt@kernel.org>
 <20210121122723.3446-8-rppt@kernel.org>
 <20210126114657.GL827@dhcp22.suse.cz>
From: David Hildenbrand
Organization: Red Hat GmbH
Subject: Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
Message-ID: <303f348d-e494-e386-d1f5-14505b5da254@redhat.com>
Date: Tue, 26 Jan 2021 12:56:48 +0100
In-Reply-To: <20210126114657.GL827@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 26.01.21 12:46, Michal Hocko wrote:
> On Thu 21-01-21 14:27:19, Mike Rapoport wrote:
>> From: Mike Rapoport
>>
>> Removing a PAGE_SIZE page from the direct map every time such page is
>> allocated for a secret memory mapping will cause severe fragmentation of
>> the direct map. This fragmentation can be reduced by using PMD-size pages
>> as a pool for small pages for secret memory mappings.
>>
>> Add a gen_pool per secretmem inode and lazily populate this pool with
>> PMD-size pages.
>>
>> As pages allocated by secretmem become unmovable, use CMA to back large
>> page caches so that page allocator won't be surprised by failing attempt to
>> migrate these pages.
>>
>> The CMA area used by secretmem is controlled by the "secretmem=" kernel
>> parameter. This allows explicit control over the memory available for
>> secretmem and provides upper hard limit for secretmem consumption.
>
> OK, so I have finally had a look at this closer and this is really not
> acceptable. I have already mentioned that in a response to other patch
> but any task is able to deprive access to secret memory to other tasks
> and cause OOM killer which wouldn't really recover ever and potentially
> panic the system. Now you could be less drastic and only make SIGBUS on
> fault but that would be still quite terrible. There is a very good
> reason why hugetlb implements its non-trivial reservation system to avoid
> exactly these problems.
>
> So unless I am really misreading the code
> Nacked-by: Michal Hocko
>
> That doesn't mean I reject the whole idea. There are some details to
> sort out as mentioned elsewhere but you cannot really depend on
> pre-allocated pool which can fail at a fault time like that.

So, to do it similarly to hugetlbfs (e.g., with CMA), there would have to
be a mechanism to actually try pre-reserving pages (e.g., from the CMA
area), at which point the pages would get moved to the secretmem pool,
and a mechanism for mmap() etc. to "reserve" from this secretmem pool,
such that there are guarantees at fault time?
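
To make the difference between the two behaviours concrete, here is a toy
userspace model. It is only a sketch, not kernel code and not what the
patch implements: SecretmemPool, Mapping and FaultError are hypothetical
names, pages are plain counters, and FaultError merely stands in for a
SIGBUS at fault time. With reserve=False, mmap() always succeeds and a
later fault can fail; with reserve=True, accounting happens at mmap() time
(roughly the idea behind hugetlb reservations), so a mapping that was
accepted can always be faulted in.

```python
from dataclasses import dataclass


class FaultError(Exception):
    """Stands in for SIGBUS delivered at page-fault time (hypothetical)."""


class SecretmemPool:
    """Toy fixed-size page pool, e.g. pages pre-reserved from a CMA area."""

    def __init__(self, total_pages: int) -> None:
        self.total = total_pages
        self.reserved = 0  # pages promised at mmap() time, not yet faulted
        self.in_use = 0    # pages actually handed out at fault time

    def free(self) -> int:
        # Pages that are neither in use nor promised to a reserved mapping.
        return self.total - self.reserved - self.in_use

    def mmap(self, npages: int, reserve: bool) -> "Mapping":
        if reserve:
            # Reservation model: fail early, while the caller can handle it.
            if npages > self.free():
                raise MemoryError("pool cannot back this mapping")
            self.reserved += npages
        # Overcommit model: nothing is checked here.
        return Mapping(self, npages, reserve)


@dataclass
class Mapping:
    pool: SecretmemPool
    npages: int
    reserved: bool
    faulted: int = 0

    def fault(self) -> None:
        """Fault in the next page of this mapping."""
        if self.faulted >= self.npages:
            raise ValueError("fault beyond end of mapping")
        if self.reserved:
            # Consume one of the pages promised at mmap() time.
            self.pool.reserved -= 1
        elif self.pool.free() < 1:
            # Overcommitted mapping and nothing left: fail at fault time.
            raise FaultError("pool exhausted at fault time")
        self.pool.in_use += 1
        self.faulted += 1


if __name__ == "__main__":
    pool = SecretmemPool(total_pages=4)
    hog = pool.mmap(4, reserve=False)
    victim = pool.mmap(4, reserve=False)
    for _ in range(4):
        hog.fault()
    try:
        victim.fault()
    except FaultError as err:
        print("overcommit:", err)
```

In the overcommit case, one task that faults in the whole pool makes every
other task's first fault fail. In the reservation case, the same hog would
either get the pages at mmap() time or be refused there, and an accepted
mapping never fails later.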

What we have right now feels like some kind of overcommit (read: like
overcommitting huge pages, so we might get SIGBUS at fault time).

TBH, the SIGBUS thingy doesn't sound terrible to me, as long as
applications using it expect this behavior right now and can handle it:
no guarantees. I fully agree that some kind of reservation/guarantee
mechanism would be preferable.

--
Thanks,

David / dhildenb