Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752256AbcKQUIW (ORCPT ); Thu, 17 Nov 2016 15:08:22 -0500 Received: from userp1040.oracle.com ([156.151.31.81]:36626 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752106AbcKQUIR (ORCPT ); Thu, 17 Nov 2016 15:08:17 -0500 From: Boris Ostrovsky To: david.vrabel@citrix.com, jgross@suse.com Cc: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, olaf@aepfle.de, Boris Ostrovsky , stable@vger.kernel.org Subject: [PATCH v3 (re-send)] xen/gntdev: Use mempolicy instead of VM_IO flag to avoid NUMA balancing Date: Thu, 17 Nov 2016 15:10:04 -0500 Message-Id: <1479413404-27332-1-git-send-email-boris.ostrovsky@oracle.com> X-Mailer: git-send-email 2.7.4 X-Source-IP: aserv0021.oracle.com [141.146.126.233] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2890 Lines: 102 Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to NUMA balancing") set VM_IO flag to prevent grant maps from being subjected to NUMA balancing. It was discovered recently that this flag causes get_user_pages() to always fail with -EFAULT. check_vma_flags __get_user_pages __get_user_pages_locked __get_user_pages_unlocked get_user_pages_fast iov_iter_get_pages dio_refill_pages do_direct_IO do_blockdev_direct_IO do_blockdev_direct_IO ext4_direct_IO_read generic_file_read_iter aio_run_iocb (which can happen if guest's vdisk has direct-io-safe option). To avoid this don't use vm_flags. Instead, use mempolicy that prohibits page migration (i.e. clear MPOL_F_MOF|MPOL_F_MORON) and make sure we don't consult task's policy (which may include those flags) if vma doesn't have one. Reported-by: Olaf Hering Signed-off-by: Boris Ostrovsky Cc: stable@vger.kernel.org --- Mis-spelled David's address. Changes in v3: * Don't use __mpol_dup() and get_task_policy() which are not exported for use by drivers. Add vm_operations_struct.get_policy(). * Copy to stable drivers/xen/gntdev.c | 27 ++++++++++++++++++++++++++- 1 files changed, 26 insertions(+), 1 deletions(-) diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index bb95212..632edd4 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include @@ -433,10 +434,28 @@ static void gntdev_vma_close(struct vm_area_struct *vma) return map->pages[(addr - map->pages_vm_start) >> PAGE_SHIFT]; } +#ifdef CONFIG_NUMA +/* + * We have this op to make sure callers (such as vma_policy_mof()) don't + * check current task's policy which may include migrate flags (MPOL_F_MOF + * or MPOL_F_MORON) + */ +static struct mempolicy *gntdev_vma_get_policy(struct vm_area_struct *vma, + unsigned long addr) +{ + if (mpol_needs_cond_ref(vma->vm_policy)) + mpol_get(vma->vm_policy); + return vma->vm_policy; +} +#endif + static const struct vm_operations_struct gntdev_vmops = { .open = gntdev_vma_open, .close = gntdev_vma_close, .find_special_page = gntdev_vma_find_special_page, +#ifdef CONFIG_NUMA + .get_policy = gntdev_vma_get_policy, +#endif }; /* ------------------------------------------------------------------ */ @@ -1007,7 +1026,13 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma) vma->vm_ops = &gntdev_vmops; - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP | VM_IO; + vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; + +#ifdef CONFIG_NUMA + /* Prevent NUMA balancing */ + if (vma->vm_policy) + vma->vm_policy->flags &= ~(MPOL_F_MOF | MPOL_F_MORON); +#endif if (use_ptemod) vma->vm_flags |= VM_DONTCOPY; -- 1.7.1