From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jann Horn, Will Deacon,
    Davidlohr Bueso, Oleg Nesterov, stable@kernel.org, Linus Torvalds
Subject: [PATCH 4.14 126/126] mm: get rid of vmacache_flush_all() entirely
Date: Tue, 18 Sep 2018 00:42:54 +0200
Message-Id: <20180917211712.213242780@linuxfoundation.org>
In-Reply-To: <20180917211703.481236999@linuxfoundation.org>
References: <20180917211703.481236999@linuxfoundation.org>
X-Mailer: git-send-email 2.19.0
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

4.14-stable review patch.
If anyone has any objections, please let me know.

------------------

From: Linus Torvalds

commit 7a9cdebdcc17e426fb5287e4a82db1dfe86339b2 upstream.

Jann Horn points out that the vmacache_flush_all() function is not only
potentially expensive, it's buggy too.  It also happens to be entirely
unnecessary, because the sequence number overflow case can be avoided by
simply making the sequence number be 64-bit.  That doesn't even grow the
data structures in question, because the other adjacent fields are
already 64-bit.

So simplify the whole thing by just making the sequence number overflow
case go away entirely, which gets rid of all the complications and makes
the code faster too.  Win-win.

[ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
  also just goes away entirely with this ]

Reported-by: Jann Horn
Suggested-by: Will Deacon
Acked-by: Davidlohr Bueso
Cc: Oleg Nesterov
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

---
 include/linux/mm_types.h      |    2 +-
 include/linux/mm_types_task.h |    2 +-
 include/linux/vm_event_item.h |    1 -
 include/linux/vmacache.h      |    5 -----
 mm/debug.c                    |    4 ++--
 mm/vmacache.c                 |   38 --------------------------------------
 6 files changed, 4 insertions(+), 48 deletions(-)

--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -354,7 +354,7 @@ struct kioctx_table;
 struct mm_struct {
 	struct vm_area_struct *mmap;		/* list of VMAs */
 	struct rb_root mm_rb;
-	u32 vmacache_seqnum;                   /* per-thread vmacache */
+	u64 vmacache_seqnum;                   /* per-thread vmacache */
 #ifdef CONFIG_MMU
 	unsigned long (*get_unmapped_area) (struct file *filp,
 				unsigned long addr, unsigned long len,
--- a/include/linux/mm_types_task.h
+++ b/include/linux/mm_types_task.h
@@ -32,7 +32,7 @@
 #define VMACACHE_MASK (VMACACHE_SIZE - 1)
 
 struct vmacache {
-	u32 seqnum;
+	u64 seqnum;
 	struct vm_area_struct *vmas[VMACACHE_SIZE];
 };
 
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -105,7 +105,6 @@ enum vm_event_item { PGPGIN, PGPGOUT, PS
 #ifdef CONFIG_DEBUG_VM_VMACACHE
 		VMACACHE_FIND_CALLS,
 		VMACACHE_FIND_HITS,
-		VMACACHE_FULL_FLUSHES,
 #endif
 #ifdef CONFIG_SWAP
 		SWAP_RA,
--- a/include/linux/vmacache.h
+++ b/include/linux/vmacache.h
@@ -16,7 +16,6 @@ static inline void vmacache_flush(struct
 	memset(tsk->vmacache.vmas, 0, sizeof(tsk->vmacache.vmas));
 }
 
-extern void vmacache_flush_all(struct mm_struct *mm);
 extern void vmacache_update(unsigned long addr, struct vm_area_struct *newvma);
 extern struct vm_area_struct *vmacache_find(struct mm_struct *mm,
 						unsigned long addr);
@@ -30,10 +29,6 @@ extern struct vm_area_struct *vmacache_f
 static inline void vmacache_invalidate(struct mm_struct *mm)
 {
 	mm->vmacache_seqnum++;
-
-	/* deal with overflows */
-	if (unlikely(mm->vmacache_seqnum == 0))
-		vmacache_flush_all(mm);
 }
 
 #endif /* __LINUX_VMACACHE_H */
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -100,7 +100,7 @@ EXPORT_SYMBOL(dump_vma);
 
 void dump_mm(const struct mm_struct *mm)
 {
-	pr_emerg("mm %p mmap %p seqnum %d task_size %lu\n"
+	pr_emerg("mm %p mmap %p seqnum %llu task_size %lu\n"
 #ifdef CONFIG_MMU
 		"get_unmapped_area %p\n"
 #endif
@@ -128,7 +128,7 @@ void dump_mm(const struct mm_struct *mm)
 		"tlb_flush_pending %d\n"
 		"def_flags: %#lx(%pGv)\n",
 
-		mm, mm->mmap, mm->vmacache_seqnum, mm->task_size,
+		mm, mm->mmap, (long long) mm->vmacache_seqnum, mm->task_size,
 #ifdef CONFIG_MMU
 		mm->get_unmapped_area,
 #endif
--- a/mm/vmacache.c
+++ b/mm/vmacache.c
@@ -8,44 +8,6 @@
 #include <linux/vmacache.h>
 
 /*
- * Flush vma caches for threads that share a given mm.
- *
- * The operation is safe because the caller holds the mmap_sem
- * exclusively and other threads accessing the vma cache will
- * have mmap_sem held at least for read, so no extra locking
- * is required to maintain the vma cache.
- */
-void vmacache_flush_all(struct mm_struct *mm)
-{
-	struct task_struct *g, *p;
-
-	count_vm_vmacache_event(VMACACHE_FULL_FLUSHES);
-
-	/*
-	 * Single threaded tasks need not iterate the entire
-	 * list of process.  We can avoid the flushing as well
-	 * since the mm's seqnum was increased and don't have
-	 * to worry about other threads' seqnum. Current's
-	 * flush will occur upon the next lookup.
-	 */
-	if (atomic_read(&mm->mm_users) == 1)
-		return;
-
-	rcu_read_lock();
-	for_each_process_thread(g, p) {
-		/*
-		 * Only flush the vmacache pointers as the
-		 * mm seqnum is already set and curr's will
-		 * be set upon invalidation when the next
-		 * lookup is done.
-		 */
-		if (mm == p->mm)
-			vmacache_flush(p);
-	}
-	rcu_read_unlock();
-}
-
-/*
  * This task may be accessing a foreign mm via (for example)
  * get_user_pages()->find_vma().  The vmacache is task-local and this
  * task's vmacache pertains to a different mm (ie, its own).  There is
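
[ Editor's illustration, not part of the patch: to make the changelog's
  overflow argument concrete, here is a small standalone user-space
  sketch (plain C, not kernel code) comparing how quickly a 32-bit and
  a 64-bit sequence counter would wrap.  The one-billion-invalidations-
  per-second rate is an assumed, deliberately pessimistic figure chosen
  only for illustration. ]

/*
 * Standalone sketch: time for a 32-bit vs. a 64-bit sequence counter
 * to wrap at an assumed rate of one billion invalidations per second.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	const double rate = 1e9;	/* assumed invalidations per second */
	double secs32 = (double)UINT32_MAX / rate;
	double secs64 = (double)UINT64_MAX / rate;

	printf("u32 seqnum wraps after ~%.1f seconds\n", secs32);
	printf("u64 seqnum wraps after ~%.0f years\n",
	       secs64 / (60.0 * 60.0 * 24.0 * 365.0));
	return 0;
}

[ Even at that pessimistic rate the 64-bit counter takes roughly 585
  years to wrap, which is why vmacache_invalidate() no longer needs a
  flush-all fallback at all. ]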