From: David Hildenbrand
Organization: Red Hat GmbH
To: "Kirill A. Shutemov"
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
    Arnd Bergmann, Michal Hocko, Oscar Salvador, Matthew Wilcox,
    Andrea Arcangeli, Minchan Kim, Jann Horn, Jason Gunthorpe,
    Dave Hansen, Hugh Dickins, Rik van Riel, "Michael S . Tsirkin",
    "Kirill A . Shutemov", Vlastimil Babka, Richard Henderson,
    Ivan Kokshaysky, Matt Turner, Thomas Bogendoerfer,
    "James E.J. Bottomley", Helge Deller, Chris Zankel, Max Filippov,
    Mike Kravetz, Peter Xu, Rolf Eike Beer, linux-alpha@vger.kernel.org,
    linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
    linux-xtensa@linux-xtensa.org, linux-arch@vger.kernel.org, Linux API
Subject: Re: [PATCH RFCv2] mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault/prealloc memory
Date: Mon, 15 Mar 2021 14:26:08 +0100
References: <20210308164520.18323-1-david@redhat.com>
    <20210315122213.k52wtlbbhsw42pks@box>
    <7d607d1c-efd5-3888-39bb-9e5f8bc08185@redhat.com>
    <20210315130353.iqnwsnp2c2wpt4y2@box>
In-Reply-To: <20210315130353.iqnwsnp2c2wpt4y2@box>
X-Mailing-List: linux-kernel@vger.kernel.org

On 15.03.21 14:03, Kirill A. Shutemov wrote:
> On Mon, Mar 15, 2021 at 01:25:40PM +0100, David Hildenbrand wrote:
>> On 15.03.21 13:22, Kirill A. Shutemov wrote:
>>> On Mon, Mar 08, 2021 at 05:45:20PM +0100, David Hildenbrand wrote:
>>>> +		case -EHWPOISON: /* Skip over any poisoned pages. */
>>>> +			start += PAGE_SIZE;
>>>> +			continue;
>>>
>>> Why is it a good approach? It's not obvious to me.
>>
>> My main motivation was to simplify return code handling. I don't want to
>> return -EHWPOISON to user space
>
> Why? Hiding the problem under the rug doesn't help anybody. SIGBUS later
> is not better than an error upfront.

Well, if you think about "prefaulting page tables", the first intuition
is certainly not to check for poisoned pages, right? After all, you are
not actually accessing memory, you are allocating memory if required and
filling page tables. OTOH, mlock() will also choke on poisoned pages.

With the current semantics, you can start and run a VM just fine.
Preallocation/prefaulting succeeded after all. On access you will get a
SIGBUS, from which e.g., QEMU can recover by injecting an MCE into the
guest, just as if you hit a poisoned page later.

The problem we are talking about is most probably very rare, especially
when using MADV_POPULATE_ for actual preallocation.

I don't have a strong opinion; not bailing out on poisoned pages felt
like the right thing to do.

-- 
Thanks,

David / dhildenb
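
To make the two options in this thread concrete, here is a small
user-space model of the populate loop. It is only a sketch and not the
patch itself: model_populate(), model_fault_in_page(), the poisoned[]
array and the MODEL_* constants are all made up for illustration, and
the real kernel loop handles more return codes than shown here. With
skip_poisoned set it behaves like the quoted hunk (step over the page
and keep going); without it, the first poisoned page ends the walk and
the error is returned to the caller, which is the "error upfront"
behaviour Kirill is asking about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL
#define MODEL_EHWPOISON 133 /* EHWPOISON in asm-generic/errno.h */

/* Hypothetical stand-in for the kernel's per-page fault-in step. */
static int model_fault_in_page(const bool *poisoned, size_t idx)
{
	return poisoned[idx] ? -MODEL_EHWPOISON : 0;
}

/*
 * Walk [start, end) one page at a time and count populated pages.
 * skip_poisoned == true:  step over poisoned pages (as in the hunk).
 * skip_poisoned == false: return -MODEL_EHWPOISON at the first one.
 */
static int model_populate(const bool *poisoned, unsigned long start,
			  unsigned long end, bool skip_poisoned,
			  size_t *populated)
{
	/* The model only handles page-aligned ranges. */
	assert(start % MODEL_PAGE_SIZE == 0 && end % MODEL_PAGE_SIZE == 0);

	*populated = 0;
	while (start < end) {
		int ret = model_fault_in_page(poisoned,
					      start / MODEL_PAGE_SIZE);

		switch (ret) {
		case 0:
			(*populated)++;
			start += MODEL_PAGE_SIZE;
			continue;
		case -MODEL_EHWPOISON:
			if (skip_poisoned) {
				/* Skip over the poisoned page. */
				start += MODEL_PAGE_SIZE;
				continue;
			}
			return ret;
		default:
			return ret;
		}
	}
	return 0;
}
```

For a four-page range whose second page is marked poisoned, the skipping
variant returns 0 with three pages populated, while the bail-out variant
returns -MODEL_EHWPOISON after populating only the first page. In the
first case the caller learns about the bad page only on access (SIGBUS);
in the second it learns immediately but has to handle one more error.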