Date: Thu, 23 Mar 2023 10:52:09 +0800
From: Baoquan He
To: Lorenzo Stoakes, David Hildenbrand
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton, Uladzislau Rezki,
    Matthew Wilcox, Liu Shixin, Jiri Olsa, Jens Axboe, Alexander Viro
Subject: Re: [PATCH v7 4/4] mm: vmalloc: convert vread() to vread_iter()
In-Reply-To: <941f88bc5ab928e6656e1e2593b91bf0f8c81e1b.1679511146.git.lstoakes@gmail.com>

On 03/22/23 at 06:57pm, Lorenzo Stoakes wrote:
> Having previously laid the foundation for converting vread() to an iterator
> function, pull the trigger and do so.
>
> This patch attempts to provide minimal refactoring and to reflect the
> existing logic as best we can, for example we continue to zero portions of
> memory not read, as before.
>
> Overall, there should be no functional difference other than a performance
> improvement in /proc/kcore access to vmalloc regions.
>
> Now we have eliminated the need for a bounce buffer in read_kcore_iter(),
> we dispense with it, and try to write to user memory optimistically but
> with faults disabled via copy_page_to_iter_nofault(). We already have
> preemption disabled by holding a spin lock. We continue faulting in until
> the operation is complete.

I don't understand these sentences. In vread_iter(), the actual reading
of content is done in aligned_vread_iter(); otherwise we zero-fill the
region. In aligned_vread_iter(), we use vmalloc_to_page() to get the
mapped page and read it out, otherwise we zero-fill. In this patch,
fault_in_iov_iter_writeable() faults in the iter's memory once and bails
out if that fails. I am wondering why we continue faulting in until the
operation is complete, and how that is done.

If we look at the failure points in vread_iter(), they mainly come from
copy_page_to_iter_nofault(), e.g. the page_copy_sane() check failing or
the i->data_source check failing. If those checks fail, should we keep
retrying the read again and again? That is not related to faulting
memory in. I saw your discussion with David, but I am still a little
lost. I hope to understand it; thanks in advance.

......
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index 08b795fd80b4..25b44b303b35 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
......
> @@ -507,13 +503,30 @@ read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
>
>  		switch (m->type) {
>  		case KCORE_VMALLOC:
> -			vread(buf, (char *)start, tsz);
> -			/* we have to zero-fill user buffer even if no read */
> -			if (copy_to_iter(buf, tsz, iter) != tsz) {
> -				ret = -EFAULT;
> -				goto out;
> +		{
> +			const char *src = (char *)start;
> +			size_t read = 0, left = tsz;
> +
> +			/*
> +			 * vmalloc uses spinlocks, so we optimistically try to
> +			 * read memory. If this fails, fault pages in and try
> +			 * again until we are done.
> +			 */
> +			while (true) {
> +				read += vread_iter(iter, src, left);
> +				if (read == tsz)
> +					break;
> +
> +				src += read;
> +				left -= read;
> +
> +				if (fault_in_iov_iter_writeable(iter, left)) {
> +					ret = -EFAULT;
> +					goto out;
> +				}
>  			}
>  			break;
> +		}
>  		case KCORE_USER:
>  			/* User page is handled prior to normal kernel page: */
>  			if (copy_to_iter((char *)start, tsz, iter) != tsz) {
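
To make the question concrete, here is a small user-space sketch of the
general "copy without faulting, then fault in and retry" pattern that the
quoted comment describes. It is only an illustration under stated
assumptions: struct toy_dst, copy_nofault(), fault_in() and copy_retry()
are hypothetical stand-ins invented for this sketch, not kernel APIs, and
the toy page size is arbitrary. The sketch keeps a single running total
and derives both the source offset and the remaining length from it; that
is a choice made here for clarity, not a description of what the patch
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE   8	/* toy page size, chosen only for the sketch */
#define NPAGES 4

/* Hypothetical stand-in for a user buffer whose pages may not be resident. */
struct toy_dst {
	char data[PAGE * NPAGES];
	bool resident[NPAGES];	/* writable right now without faulting */
	bool bad[NPAGES];	/* can never be faulted in */
};

/* Copy with "faults disabled": stop at the first non-resident page. */
size_t copy_nofault(struct toy_dst *d, size_t off, const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t pos = off + done;
		size_t chunk = PAGE - pos % PAGE;

		if (!d->resident[pos / PAGE])
			break;
		if (chunk > len - done)
			chunk = len - done;
		memcpy(d->data + pos, src + done, chunk);
		done += chunk;
	}
	return done;
}

/* Make [off, off + len) resident; nonzero means a page could not be faulted in. */
int fault_in(struct toy_dst *d, size_t off, size_t len)
{
	size_t last = (off + len - 1) / PAGE;

	for (size_t p = off / PAGE; p <= last; p++) {
		if (d->bad[p])
			return -1;
		d->resident[p] = true;
	}
	return 0;
}

/* Optimistic copy; on a short copy, fault in the remainder and try again. */
size_t copy_retry(struct toy_dst *d, const char *src, size_t len, int *err)
{
	size_t done = 0;

	*err = 0;
	while (done < len) {
		/* Offset and remaining length both come from the running total. */
		done += copy_nofault(d, done, src + done, len - done);
		assert(done <= len);
		if (done == len)
			break;
		if (fault_in(d, done, len - done)) {
			*err = -1;	/* analogous to returning -EFAULT */
			break;
		}
	}
	return done;
}
```

In this toy model the loop can only repeat when the short copy was caused
by a non-resident page, and each successful fault_in() guarantees the
next attempt makes progress. A short copy caused by anything other than a
missing page would not be helped by faulting in, which is the case the
question above is asking about.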