2018-07-12 14:58:21

by Kirill A. Shutemov

Subject: [PATCHv2 0/2] Fix crash due to vma_is_anonymous() false-positives


Fix crash found by syzkaller.

The fix allows us to remove ->vm_ops checks.

v2:
- Catch a few more cases where we need to initialize ->vm_ops:
+ nommu;
+ ia64;
- Make sure that we initialize ->vm_ops even if ->mmap() failed.
We need ->vm_ops in the error path too.

Kirill A. Shutemov (2):
mm: Fix vma_is_anonymous() false-positives
mm: Drop unneeded ->vm_ops checks

arch/ia64/kernel/perfmon.c | 1 +
arch/ia64/mm/init.c | 2 ++
drivers/char/mem.c | 1 +
fs/binfmt_elf.c | 2 +-
fs/exec.c | 1 +
fs/hugetlbfs/inode.c | 1 +
fs/kernfs/file.c | 20 +-------------------
fs/proc/task_mmu.c | 2 +-
include/linux/mm.h | 5 ++++-
kernel/events/core.c | 2 +-
kernel/fork.c | 2 +-
mm/gup.c | 2 +-
mm/hugetlb.c | 2 +-
mm/khugepaged.c | 4 ++--
mm/memory.c | 12 ++++++------
mm/mempolicy.c | 10 +++++-----
mm/mmap.c | 25 ++++++++++++++++++-------
mm/mremap.c | 2 +-
mm/nommu.c | 13 ++++++++++---
mm/shmem.c | 1 +
mm/util.c | 12 ++++++++++++
21 files changed, 72 insertions(+), 50 deletions(-)

--
2.18.0



2018-07-12 14:57:38

by Kirill A. Shutemov

Subject: [PATCHv2 1/2] mm: Fix vma_is_anonymous() false-positives

vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous
VMAs. This is unreliable, as ->mmap() may not set ->vm_ops.
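
The helper currently reads (see the include/linux/mm.h hunk below):

        static inline bool vma_is_anonymous(struct vm_area_struct *vma)
        {
                return !vma->vm_ops;
        }

so a file-backed VMA whose ->mmap() left ->vm_ops NULL looks anonymous to
the rest of the kernel.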

False-positive vma_is_anonymous() may lead to crashes:

next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0
prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000
pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000
flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare)
------------[ cut here ]------------
kernel BUG at mm/memory.c:1422!
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline]
RIP: 0010:zap_pud_range mm/memory.c:1466 [inline]
RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline]
RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508
Code: ff 31 ff 4c 89 e6 42 c6 04 33 f8 e8 92 dd d0 ff 4d 85 e4 0f 85 4a eb ff
ff e8 54 dc d0 ff 48 8b bd 10 fc ff ff e8 82 95 fe ff <0f> 0b e8 41 dc d0 ff
0f 0b 4c 89 ad 18 fc ff ff c7 85 7c fb ff ff
RSP: 0018:ffff8801b0587330 EFLAGS: 00010286
RAX: 000000000000013c RBX: 1ffff100360b0e9c RCX: ffffc90002620000
RDX: 0000000000000000 RSI: ffffffff81631851 RDI: 0000000000000001
RBP: ffff8801b05877c8 R08: ffff880199d40300 R09: ffffed003b5c4fc0
R10: ffffed003b5c4fc0 R11: ffff8801dae27e07 R12: 0000000000000000
R13: ffff88019c1e13c0 R14: dffffc0000000000 R15: 0000000020e01000
FS: 00007fca32251700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f04c540d000 CR3: 00000001ac1f0000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
unmap_single_vma+0x1a0/0x310 mm/memory.c:1553
zap_page_range_single+0x3cc/0x580 mm/memory.c:1644
unmap_mapping_range_vma mm/memory.c:2792 [inline]
unmap_mapping_range_tree mm/memory.c:2813 [inline]
unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845
unmap_mapping_range+0x48/0x60 mm/memory.c:2880
truncate_pagecache+0x54/0x90 mm/truncate.c:800
truncate_setsize+0x70/0xb0 mm/truncate.c:826
simple_setattr+0xe9/0x110 fs/libfs.c:409
notify_change+0xf13/0x10f0 fs/attr.c:335
do_truncate+0x1ac/0x2b0 fs/open.c:63
do_sys_ftruncate+0x492/0x560 fs/open.c:205
__do_sys_ftruncate fs/open.c:215 [inline]
__se_sys_ftruncate fs/open.c:213 [inline]
__x64_sys_ftruncate+0x59/0x80 fs/open.c:213
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Reproducer:

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <fcntl.h>

#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
#define KCOV_ENABLE _IO('c', 100)
#define KCOV_DISABLE _IO('c', 101)
#define COVER_SIZE (1024<<10)

#define KCOV_TRACE_PC 0
#define KCOV_TRACE_CMP 1

int main(int argc, char **argv)
{
        int fd;
        unsigned long *cover;

        /* set up kcov and map/unmap the coverage buffer */
        system("mount -t debugfs none /sys/kernel/debug");
        fd = open("/sys/kernel/debug/kcov", O_RDWR);
        ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
        cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                     PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        munmap(cover, COVER_SIZE * sizeof(unsigned long));

        /* map the kcov fd again, this time MAP_PRIVATE */
        cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                     PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        memset(cover, 0, COVER_SIZE * sizeof(unsigned long));

        /* truncating the file triggers the crash via unmap_mapping_range() */
        ftruncate(fd, 3UL << 20);
        return 0;
}

This can be fixed by assigning anonymous VMAs their own vm_ops and not
relying on it being NULL.

If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
dummy_vm_ops. This way we will have non-NULL ->vm_ops for all VMAs.
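
In short, after this patch (condensed from the hunks below):

        /* mm/util.c */
        const struct vm_operations_struct anon_vm_ops = {};
        const struct vm_operations_struct dummy_vm_ops = {};

        /* include/linux/mm.h */
        static inline bool vma_is_anonymous(struct vm_area_struct *vma)
        {
                return vma->vm_ops == &anon_vm_ops;
        }

Anonymous mappings (do_brk_flags(), the anonymous path in mmap_region(),
MAP_PRIVATE /dev/zero, the stack VMA in __bprm_mm_init()) get anon_vm_ops;
anything that still has NULL ->vm_ops after ->mmap() gets dummy_vm_ops.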

Signed-off-by: Kirill A. Shutemov <[email protected]>
Reported-by: [email protected]
Cc: [email protected]
Cc: Dmitry Vyukov <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
---
arch/ia64/kernel/perfmon.c | 1 +
arch/ia64/mm/init.c | 2 ++
drivers/char/mem.c | 1 +
fs/exec.c | 1 +
fs/hugetlbfs/inode.c | 1 +
include/linux/mm.h | 5 ++++-
mm/khugepaged.c | 4 ++--
mm/mmap.c | 11 +++++++++++
mm/nommu.c | 9 ++++++++-
mm/shmem.c | 1 +
mm/util.c | 12 ++++++++++++
11 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 3b38c717008a..13fd6f7d37e8 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2292,6 +2292,7 @@ pfm_smpl_buffer_alloc(struct task_struct *task, struct file *filp, pfm_context_t
vma->vm_file = get_file(filp);
vma->vm_flags = VM_READ|VM_MAYREAD|VM_DONTEXPAND|VM_DONTDUMP;
vma->vm_page_prot = PAGE_READONLY; /* XXX may need to change */
+ vma->vm_ops = &dummy_vm_ops;

/*
* Now we have everything we need and we can initialize
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index 18278b448530..f5dfcb723518 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -122,6 +122,7 @@ ia64_init_addr_space (void)
vma->vm_end = vma->vm_start + PAGE_SIZE;
vma->vm_flags = VM_DATA_DEFAULT_FLAGS|VM_GROWSUP|VM_ACCOUNT;
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+ vma->vm_ops = &dummy_vm_ops;
down_write(&current->mm->mmap_sem);
if (insert_vm_struct(current->mm, vma)) {
up_write(&current->mm->mmap_sem);
@@ -141,6 +142,7 @@ ia64_init_addr_space (void)
vma->vm_page_prot = __pgprot(pgprot_val(PAGE_READONLY) | _PAGE_MA_NAT);
vma->vm_flags = VM_READ | VM_MAYREAD | VM_IO |
VM_DONTEXPAND | VM_DONTDUMP;
+ vma->vm_ops = &dummy_vm_ops;
down_write(&current->mm->mmap_sem);
if (insert_vm_struct(current->mm, vma)) {
up_write(&current->mm->mmap_sem);
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index ffeb60d3434c..f0a8b0b1768b 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -708,6 +708,7 @@ static int mmap_zero(struct file *file, struct vm_area_struct *vma)
#endif
if (vma->vm_flags & VM_SHARED)
return shmem_zero_setup(vma);
+ vma->vm_ops = &anon_vm_ops;
return 0;
}

diff --git a/fs/exec.c b/fs/exec.c
index 2d4e0075bd24..a1a246062561 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -307,6 +307,7 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
* configured yet.
*/
BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP);
+ vma->vm_ops = &anon_vm_ops;
vma->vm_end = STACK_TOP_MAX;
vma->vm_start = vma->vm_end - PAGE_SIZE;
vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d508c7844681..5ff73b1398ad 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -597,6 +597,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
memset(&pseudo_vma, 0, sizeof(struct vm_area_struct));
pseudo_vma.vm_flags = (VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
pseudo_vma.vm_file = file;
+ pseudo_vma.vm_ops = &dummy_vm_ops;

for (index = start; index < end; index++) {
/*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a0fbb9ffe380..9a35362bbc92 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1536,9 +1536,12 @@ int clear_page_dirty_for_io(struct page *page);

int get_cmdline(struct task_struct *task, char *buffer, int buflen);

+extern const struct vm_operations_struct anon_vm_ops;
+extern const struct vm_operations_struct dummy_vm_ops;
+
static inline bool vma_is_anonymous(struct vm_area_struct *vma)
{
- return !vma->vm_ops;
+ return vma->vm_ops == &anon_vm_ops;
}

#ifdef CONFIG_SHMEM
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d7b2a4bf8671..5ae34097aed1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -440,7 +440,7 @@ int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
* page fault if needed.
*/
return 0;
- if (vma->vm_ops || (vm_flags & VM_NO_KHUGEPAGED))
+ if (!vma_is_anonymous(vma) || (vm_flags & VM_NO_KHUGEPAGED))
/* khugepaged not yet working on file or special mappings */
return 0;
hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
@@ -831,7 +831,7 @@ static bool hugepage_vma_check(struct vm_area_struct *vma)
return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
HPAGE_PMD_NR);
}
- if (!vma->anon_vma || vma->vm_ops)
+ if (!vma->anon_vma || !vma_is_anonymous(vma))
return false;
if (is_vma_temporary_stack(vma))
return false;
diff --git a/mm/mmap.c b/mm/mmap.c
index d1eb87ef4b1a..527c17f31635 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -561,6 +561,8 @@ static unsigned long count_vma_pages_range(struct mm_struct *mm,
void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
struct rb_node **rb_link, struct rb_node *rb_parent)
{
+ WARN_ONCE(!vma->vm_ops, "missing vma->vm_ops");
+
/* Update tracking information for the gap following the new vma. */
if (vma->vm_next)
vma_gap_update(vma->vm_next);
@@ -1762,6 +1764,11 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
*/
vma->vm_file = get_file(file);
error = call_mmap(file, vma);
+
+ /* All mappings must have ->vm_ops set */
+ if (!vma->vm_ops)
+ vma->vm_ops = &dummy_vm_ops;
+
if (error)
goto unmap_and_free_vma;

@@ -1780,6 +1787,9 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
error = shmem_zero_setup(vma);
if (error)
goto free_vma;
+ } else {
+ /* vma_is_anonymous() relies on this. */
+ vma->vm_ops = &anon_vm_ops;
}

vma_link(mm, vma, prev, rb_link, rb_parent);
@@ -2999,6 +3009,7 @@ static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long

INIT_LIST_HEAD(&vma->anon_vma_chain);
vma->vm_mm = mm;
+ vma->vm_ops = &anon_vm_ops;
vma->vm_start = addr;
vma->vm_end = addr + len;
vma->vm_pgoff = pgoff;
diff --git a/mm/nommu.c b/mm/nommu.c
index 4452d8bd9ae4..f00f209833ab 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1058,6 +1058,8 @@ static int do_mmap_shared_file(struct vm_area_struct *vma)
int ret;

ret = call_mmap(vma->vm_file, vma);
+ if (!vma->vm_ops)
+ vma->vm_ops = &dummy_vm_ops;
if (ret == 0) {
vma->vm_region->vm_top = vma->vm_region->vm_end;
return 0;
@@ -1089,6 +1091,8 @@ static int do_mmap_private(struct vm_area_struct *vma,
*/
if (capabilities & NOMMU_MAP_DIRECT) {
ret = call_mmap(vma->vm_file, vma);
+ if (!vma->vm_ops)
+ vma->vm_ops = &dummy_vm_ops;
if (ret == 0) {
/* shouldn't return success if we're not sharing */
BUG_ON(!(vma->vm_flags & VM_MAYSHARE));
@@ -1137,6 +1141,8 @@ static int do_mmap_private(struct vm_area_struct *vma,
fpos = vma->vm_pgoff;
fpos <<= PAGE_SHIFT;

+ vma->vm_ops = &dummy_vm_ops;
+
ret = kernel_read(vma->vm_file, base, len, &fpos);
if (ret < 0)
goto error_free;
@@ -1144,7 +1150,8 @@ static int do_mmap_private(struct vm_area_struct *vma,
/* clear the last little bit */
if (ret < len)
memset(base + ret, 0, len - ret);
-
+ } else {
+ vma->vm_ops = &anon_vm_ops;
}

return 0;
diff --git a/mm/shmem.c b/mm/shmem.c
index 2cab84403055..db5c319161ca 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1424,6 +1424,7 @@ static void shmem_pseudo_vma_init(struct vm_area_struct *vma,
/* Bias interleave by inode number to distribute better across nodes */
vma->vm_pgoff = index + info->vfs_inode.i_ino;
vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
+ vma->vm_ops = &dummy_vm_ops;
}

static void shmem_pseudo_vma_destroy(struct vm_area_struct *vma)
diff --git a/mm/util.c b/mm/util.c
index 3351659200e6..20e6a78f3237 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -20,6 +20,18 @@

#include "internal.h"

+/*
+ * All anonymous VMAs have ->vm_ops set to anon_vm_ops.
+ * vma_is_anonymous() relies on anon_vm_ops to detect anonymous VMAs.
+ */
+const struct vm_operations_struct anon_vm_ops = {};
+
+/*
+ * All VMAs have to have ->vm_ops set. dummy_vm_ops can be used if the VMA
+ * doesn't need to handle any of the operations.
+ */
+const struct vm_operations_struct dummy_vm_ops = {};
+
static inline int is_kernel_rodata(unsigned long addr)
{
return addr >= (unsigned long)__start_rodata &&
--
2.18.0


2018-07-12 14:58:08

by Kirill A. Shutemov

Subject: [PATCHv2 2/2] mm: Drop unneeded ->vm_ops checks

We now have all VMAs with ->vm_ops set and don't need to check it for
NULL everywhere.
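
The conversion is mechanical; a typical site (from the mm/mmap.c hunk below)
goes from

        if (vma->vm_ops && vma->vm_ops->close)
                vma->vm_ops->close(vma);

to

        if (vma->vm_ops->close)
                vma->vm_ops->close(vma);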

Signed-off-by: Kirill A. Shutemov <[email protected]>
---
fs/binfmt_elf.c | 2 +-
fs/kernfs/file.c | 20 +-------------------
fs/proc/task_mmu.c | 2 +-
kernel/events/core.c | 2 +-
kernel/fork.c | 2 +-
mm/gup.c | 2 +-
mm/hugetlb.c | 2 +-
mm/memory.c | 12 ++++++------
mm/mempolicy.c | 10 +++++-----
mm/mmap.c | 14 +++++++-------
mm/mremap.c | 2 +-
mm/nommu.c | 4 ++--
12 files changed, 28 insertions(+), 46 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 0ac456b52bdd..4f171cf21bc2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1302,7 +1302,7 @@ static bool always_dump_vma(struct vm_area_struct *vma)
* Assume that all vmas with a .name op should always be dumped.
* If this changes, a new vm_ops field can easily be added.
*/
- if (vma->vm_ops && vma->vm_ops->name && vma->vm_ops->name(vma))
+ if (vma->vm_ops->name && vma->vm_ops->name(vma))
return true;

/*
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 2015d8c45e4a..945c3d306d8f 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -336,9 +336,6 @@ static void kernfs_vma_open(struct vm_area_struct *vma)
struct file *file = vma->vm_file;
struct kernfs_open_file *of = kernfs_of(file);

- if (!of->vm_ops)
- return;
-
if (!kernfs_get_active(of->kn))
return;

@@ -354,9 +351,6 @@ static vm_fault_t kernfs_vma_fault(struct vm_fault *vmf)
struct kernfs_open_file *of = kernfs_of(file);
vm_fault_t ret;

- if (!of->vm_ops)
- return VM_FAULT_SIGBUS;
-
if (!kernfs_get_active(of->kn))
return VM_FAULT_SIGBUS;

@@ -374,9 +368,6 @@ static vm_fault_t kernfs_vma_page_mkwrite(struct vm_fault *vmf)
struct kernfs_open_file *of = kernfs_of(file);
vm_fault_t ret;

- if (!of->vm_ops)
- return VM_FAULT_SIGBUS;
-
if (!kernfs_get_active(of->kn))
return VM_FAULT_SIGBUS;

@@ -397,9 +388,6 @@ static int kernfs_vma_access(struct vm_area_struct *vma, unsigned long addr,
struct kernfs_open_file *of = kernfs_of(file);
int ret;

- if (!of->vm_ops)
- return -EINVAL;
-
if (!kernfs_get_active(of->kn))
return -EINVAL;

@@ -419,9 +407,6 @@ static int kernfs_vma_set_policy(struct vm_area_struct *vma,
struct kernfs_open_file *of = kernfs_of(file);
int ret;

- if (!of->vm_ops)
- return 0;
-
if (!kernfs_get_active(of->kn))
return -EINVAL;

@@ -440,9 +425,6 @@ static struct mempolicy *kernfs_vma_get_policy(struct vm_area_struct *vma,
struct kernfs_open_file *of = kernfs_of(file);
struct mempolicy *pol;

- if (!of->vm_ops)
- return vma->vm_policy;
-
if (!kernfs_get_active(of->kn))
return vma->vm_policy;

@@ -511,7 +493,7 @@ static int kernfs_fop_mmap(struct file *file, struct vm_area_struct *vma)
* So error if someone is trying to use close.
*/
rc = -EINVAL;
- if (vma->vm_ops && vma->vm_ops->close)
+ if (vma->vm_ops->close)
goto out_put;

rc = 0;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e9679016271f..e959623123e4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -326,7 +326,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)
goto done;
}

- if (vma->vm_ops && vma->vm_ops->name) {
+ if (vma->vm_ops->name) {
name = vma->vm_ops->name(vma);
if (name)
goto done;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8f0434a9951a..2e35401a5c68 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7269,7 +7269,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)

goto got_name;
} else {
- if (vma->vm_ops && vma->vm_ops->name) {
+ if (vma->vm_ops->name) {
name = (char *) vma->vm_ops->name(vma);
if (name)
goto cpy_name;
diff --git a/kernel/fork.c b/kernel/fork.c
index 9440d61b925c..e5e7a220a124 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -519,7 +519,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
if (!(tmp->vm_flags & VM_WIPEONFORK))
retval = copy_page_range(mm, oldmm, mpnt);

- if (tmp->vm_ops && tmp->vm_ops->open)
+ if (tmp->vm_ops->open)
tmp->vm_ops->open(tmp);

if (retval)
diff --git a/mm/gup.c b/mm/gup.c
index b70d7ba7cc13..b732768ed3ac 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -31,7 +31,7 @@ static struct page *no_page_table(struct vm_area_struct *vma,
* But we can only make this optimization where a hole would surely
* be zero-filled if handle_mm_fault() actually did handle it.
*/
- if ((flags & FOLL_DUMP) && (!vma->vm_ops || !vma->vm_ops->fault))
+ if ((flags & FOLL_DUMP) && !vma->vm_ops->fault)
return ERR_PTR(-EFAULT);
return NULL;
}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 039ddbc574e9..2065acc5a6aa 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -637,7 +637,7 @@ EXPORT_SYMBOL_GPL(linear_hugepage_index);
*/
unsigned long vma_kernel_pagesize(struct vm_area_struct *vma)
{
- if (vma->vm_ops && vma->vm_ops->pagesize)
+ if (vma->vm_ops->pagesize)
return vma->vm_ops->pagesize(vma);
return PAGE_SIZE;
}
diff --git a/mm/memory.c b/mm/memory.c
index 7206a634270b..02fbef2bd024 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -768,7 +768,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
(void *)addr, vma->vm_flags, vma->anon_vma, mapping, index);
pr_alert("file:%pD fault:%pf mmap:%pf readpage:%pf\n",
vma->vm_file,
- vma->vm_ops ? vma->vm_ops->fault : NULL,
+ vma->vm_ops->fault,
vma->vm_file ? vma->vm_file->f_op->mmap : NULL,
mapping ? mapping->a_ops->readpage : NULL);
dump_stack();
@@ -825,7 +825,7 @@ struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) {
if (likely(!pte_special(pte)))
goto check_pfn;
- if (vma->vm_ops && vma->vm_ops->find_special_page)
+ if (vma->vm_ops->find_special_page)
return vma->vm_ops->find_special_page(vma, addr);
if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
return NULL;
@@ -2404,7 +2404,7 @@ static void fault_dirty_shared_page(struct vm_area_struct *vma,
{
struct address_space *mapping;
bool dirtied;
- bool page_mkwrite = vma->vm_ops && vma->vm_ops->page_mkwrite;
+ bool page_mkwrite = vma->vm_ops->page_mkwrite;

dirtied = set_page_dirty(page);
VM_BUG_ON_PAGE(PageAnon(page), page);
@@ -2648,7 +2648,7 @@ static int wp_pfn_shared(struct vm_fault *vmf)
{
struct vm_area_struct *vma = vmf->vma;

- if (vma->vm_ops && vma->vm_ops->pfn_mkwrite) {
+ if (vma->vm_ops->pfn_mkwrite) {
int ret;

pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -2669,7 +2669,7 @@ static int wp_page_shared(struct vm_fault *vmf)

get_page(vmf->page);

- if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
+ if (vma->vm_ops->page_mkwrite) {
int tmp;

pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -4439,7 +4439,7 @@ int __access_remote_vm(struct task_struct *tsk, struct mm_struct *mm,
vma = find_vma(mm, addr);
if (!vma || vma->vm_start > addr)
break;
- if (vma->vm_ops && vma->vm_ops->access)
+ if (vma->vm_ops->access)
ret = vma->vm_ops->access(vma, addr, buf,
len, write);
if (ret <= 0)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 9ac49ef17b4e..f0fcf70bcec7 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -651,13 +651,13 @@ static int vma_replace_policy(struct vm_area_struct *vma,
pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
vma->vm_start, vma->vm_end, vma->vm_pgoff,
vma->vm_ops, vma->vm_file,
- vma->vm_ops ? vma->vm_ops->set_policy : NULL);
+ vma->vm_ops->set_policy);

new = mpol_dup(pol);
if (IS_ERR(new))
return PTR_ERR(new);

- if (vma->vm_ops && vma->vm_ops->set_policy) {
+ if (vma->vm_ops->set_policy) {
err = vma->vm_ops->set_policy(vma, new);
if (err)
goto err_out;
@@ -845,7 +845,7 @@ static long do_get_mempolicy(int *policy, nodemask_t *nmask,
up_read(&mm->mmap_sem);
return -EFAULT;
}
- if (vma->vm_ops && vma->vm_ops->get_policy)
+ if (vma->vm_ops->get_policy)
pol = vma->vm_ops->get_policy(vma, addr);
else
pol = vma->vm_policy;
@@ -1617,7 +1617,7 @@ struct mempolicy *__get_vma_policy(struct vm_area_struct *vma,
struct mempolicy *pol = NULL;

if (vma) {
- if (vma->vm_ops && vma->vm_ops->get_policy) {
+ if (vma->vm_ops->get_policy) {
pol = vma->vm_ops->get_policy(vma, addr);
} else if (vma->vm_policy) {
pol = vma->vm_policy;
@@ -1663,7 +1663,7 @@ bool vma_policy_mof(struct vm_area_struct *vma)
{
struct mempolicy *pol;

- if (vma->vm_ops && vma->vm_ops->get_policy) {
+ if (vma->vm_ops->get_policy) {
bool ret = false;

pol = vma->vm_ops->get_policy(vma, vma->vm_start);
diff --git a/mm/mmap.c b/mm/mmap.c
index 527c17f31635..5adaf9f9b941 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -177,7 +177,7 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
struct vm_area_struct *next = vma->vm_next;

might_sleep();
- if (vma->vm_ops && vma->vm_ops->close)
+ if (vma->vm_ops->close)
vma->vm_ops->close(vma);
if (vma->vm_file)
fput(vma->vm_file);
@@ -998,7 +998,7 @@ static inline int is_mergeable_vma(struct vm_area_struct *vma,
return 0;
if (vma->vm_file != file)
return 0;
- if (vma->vm_ops && vma->vm_ops->close)
+ if (vma->vm_ops->close)
return 0;
if (!is_mergeable_vm_userfaultfd_ctx(vma, vm_userfaultfd_ctx))
return 0;
@@ -1638,7 +1638,7 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot)
return 0;

/* The backer wishes to know when pages are first written to? */
- if (vm_ops && (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite))
+ if (vm_ops->page_mkwrite || vm_ops->pfn_mkwrite)
return 1;

/* The open routine did something to the protections that pgprot_modify
@@ -2624,7 +2624,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *new;
int err;

- if (vma->vm_ops && vma->vm_ops->split) {
+ if (vma->vm_ops->split) {
err = vma->vm_ops->split(vma, addr);
if (err)
return err;
@@ -2657,7 +2657,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
if (new->vm_file)
get_file(new->vm_file);

- if (new->vm_ops && new->vm_ops->open)
+ if (new->vm_ops->open)
new->vm_ops->open(new);

if (new_below)
@@ -2671,7 +2671,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
return 0;

/* Clean everything up if vma_adjust failed. */
- if (new->vm_ops && new->vm_ops->close)
+ if (new->vm_ops->close)
new->vm_ops->close(new);
if (new->vm_file)
fput(new->vm_file);
@@ -3232,7 +3232,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
goto out_free_mempol;
if (new_vma->vm_file)
get_file(new_vma->vm_file);
- if (new_vma->vm_ops && new_vma->vm_ops->open)
+ if (new_vma->vm_ops->open)
new_vma->vm_ops->open(new_vma);
vma_link(mm, new_vma, prev, rb_link, rb_parent);
*need_rmap_locks = false;
diff --git a/mm/mremap.c b/mm/mremap.c
index 5c2e18505f75..7ab222c283de 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -302,7 +302,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
need_rmap_locks);
if (moved_len < old_len) {
err = -ENOMEM;
- } else if (vma->vm_ops && vma->vm_ops->mremap) {
+ } else if (vma->vm_ops->mremap) {
err = vma->vm_ops->mremap(new_vma);
}

diff --git a/mm/nommu.c b/mm/nommu.c
index f00f209833ab..73f66e81cfb0 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -764,7 +764,7 @@ static void delete_vma_from_mm(struct vm_area_struct *vma)
*/
static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
{
- if (vma->vm_ops && vma->vm_ops->close)
+ if (vma->vm_ops->close)
vma->vm_ops->close(vma);
if (vma->vm_file)
fput(vma->vm_file);
@@ -1496,7 +1496,7 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
region->vm_pgoff = new->vm_pgoff += npages;
}

- if (new->vm_ops && new->vm_ops->open)
+ if (new->vm_ops->open)
new->vm_ops->open(new);

delete_vma_from_mm(vma);
--
2.18.0


2018-07-12 15:09:42

by Kirill A. Shutemov

Subject: Re: [PATCHv2 0/2] Fix crash due to vma_is_anonymous() false-positives

On Thu, Jul 12, 2018 at 05:56:24PM +0300, Kirill A. Shutemov wrote:
>
> Fix crash found by syzkaller.
>
> The fix allows us to remove ->vm_ops checks.
>
> v2:
> - Catch a few more cases where we need to initialize ->vm_ops:
> + nommu;
> + ia64;
> - Make sure that we initialize ->vm_ops even if ->mmap() failed.
> We need ->vm_ops in the error path too.

Just to be clear: it *should* help with the issues found, but I don't have a
setup to test the nommu changes.

And the ion-related bug was actually caused by fault injection failing a
page allocation, with ->mmap then not setting ->vm_ops. It should be fine
now, but again I wasn't able to trigger the exact situation.
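
With patch 1/2, mmap_region() assigns dummy_vm_ops right after call_mmap()
and before checking its return value, so the error path sees a valid
->vm_ops as well; the relevant part of that hunk:

        error = call_mmap(file, vma);

        /* All mappings must have ->vm_ops set */
        if (!vma->vm_ops)
                vma->vm_ops = &dummy_vm_ops;

        if (error)
                goto unmap_and_free_vma;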

--
Kirill A. Shutemov

2018-07-12 16:22:07

by Oleg Nesterov

Subject: Re: [PATCHv2 1/2] mm: Fix vma_is_anonymous() false-positives

Kirill, I am not trying to review this change (but it looks good to me),
just a silly question...

On 07/12, Kirill A. Shutemov wrote:
>
> This can be fixed by assigning anonymous VMAs their own vm_ops and not
> relying on it being NULL.

I agree, this makes sense, but...

> If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> dummy_vm_ops.

Shouldn't this change alone fix the problem?

Oleg.


2018-07-12 16:41:06

by Kirill A. Shutemov

Subject: Re: [PATCHv2 1/2] mm: Fix vma_is_anonymous() false-positives

On Thu, Jul 12, 2018 at 06:20:39PM +0200, Oleg Nesterov wrote:
> Kirill, I am not trying to review this change (but it looks good to me),
> just a silly question...
>
> On 07/12, Kirill A. Shutemov wrote:
> >
> > This can be fixed by assigning anonymous VMAs their own vm_ops and not
> > relying on it being NULL.
>
> I agree, this makes sense, but...
>
> > If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
> > dummy_vm_ops.
>
> Shouldn't this change alone fix the problem?

Unfortunately, no. I've tried it before. Mapping /dev/zero with
MAP_PRIVATE has to produce an anonymous VMA. The trick with dummy_vm_ops
wouldn't be able to handle that situation.
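
For reference, mmap_zero() (condensed from the drivers/char/mem.c hunk in
patch 1/2) now looks roughly like this; before, the MAP_PRIVATE path just
returned 0 without setting ->vm_ops:

        static int mmap_zero(struct file *file, struct vm_area_struct *vma)
        {
                if (vma->vm_flags & VM_SHARED)
                        return shmem_zero_setup(vma);
                vma->vm_ops = &anon_vm_ops;
                return 0;
        }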

--
Kirill A. Shutemov

2018-07-16 13:32:52

by Marcel Ziswiler

Subject: REGRESSION: [PATCHv2 2/2] mm: Drop unneeded ->vm_ops checks

On Thu, 2018-07-12 at 17:56 +0300, Kirill A. Shutemov wrote:
> We now have all VMAs with ->vm_ops set and don't need to check it for
> NULL everywhere.
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
> [full patch trimmed]

Today's -next on Apalis T30 [1] gives the following error upon boot:

[ 16.147496] Unable to handle kernel NULL pointer dereference at
virtual address 0000002c
[ 16.156152] pgd = 843045af
[ 16.158986] [0000002c] *pgd=facd9831
[ 16.162578] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 16.167970] Modules linked in:
[ 16.171034] CPU: 2 PID: 442 Comm: polkitd Not tainted 4.18.0-rc5-
next-20180716-dirty #75
[ 16.179111] Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
[ 16.185382] PC is at show_map_vma.constprop.3+0xac/0x158
[ 16.190686] LR is at show_map_vma.constprop.3+0xa8/0x158
[ 16.195989] pc : [<c02c4900>] lr : [<c02c48fc>] psr: 800e0013
[ 16.202243] sp : ec02de60 ip : 000003ce fp : c0f09a3c
[ 16.207457] r10: ec02df78 r9 : 00000000 r8 : 00000000
[ 16.212672] r7 : 00000000 r6 : eda8ec48 r5 : 00000000 r4 :
c0f09a3c
[ 16.219188] r3 : 00000000 r2 : ed1df000 r1 : 00000020 r0 :
eda8ec48
[ 16.225705] Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA
ARM Segment none
[ 16.232829] Control: 10c5387d Table: ac01804a DAC: 00000051
[ 16.238573] Process polkitd (pid: 442, stack limit = 0xc0e83ce5)
[ 16.244572] Stack: (0xec02de60 to 0xec02e000)
[ 16.248928] de60: 00000000 00000000 00000000 00000000 eda8ec48
eda8ec48 c0f09a3c 000003a6
[ 16.257097] de80: ecf46300 00000096 00000000 c02c4efc eda8ec48
00000000 000003a6 c0289908
[ 16.265287] dea0: 0000000c eda8ec78 ecf46300 000003f4 00081114
eda8ec60 00000000 c0f04c48
[ 16.273482] dec0: c028956c 00000400 ec02df78 00000000 00081108
00000400 00000000 c0263b20
[ 16.281671] dee0: 5b4c9a7c 0ee6b280 000039ea 00000000 c0f04c48
8bb3ec56 c0f04c48 be8c7a00
[ 16.289853] df00: ecf46308 00000000 000007ff c0f04c48 00000001
00000000 00000000 00000000
[ 16.298037] df20: 00000000 8bb3ec56 000039ea 8bb3ec56 ecf46300
00081108 00000400 ec02df78
[ 16.306210] df40: 00000000 00081108 00000400 c0263cdc c0f04c48
b686ac78 000005e8 c0f04c48
[ 16.314381] df60: ecf46303 00002400 00000000 ecf46300 00081108
c02641c0 00002400 00000000
[ 16.322549] df80: 00000000 8bb3ec56 00022698 b686ac78 000005e8
00000003 c0101204 ec02c000
[ 16.330718] dfa0: 00000003 c0101000 00022698 b686ac78 00000009
00081108 00000400 000000c2
[ 16.338886] dfc0: 00022698 b686ac78 000005e8 00000003 0000004b
be8c7af4 00000000 00000000
[ 16.347053] dfe0: 0004d1b2 be8c7a84 b686b94c b686ac98 000e0010
00000009 00000000 00000000
[ 16.355237] [<c02c4900>] (show_map_vma.constprop.3) from
[<c02c4efc>] (show_pid_map+0x10/0x34)
[ 16.363846] [<c02c4efc>] (show_pid_map) from [<c0289908>]
(seq_read+0x39c/0x4f4)
[ 16.371264] [<c0289908>] (seq_read) from [<c0263b20>]
(__vfs_read+0x2c/0x15c)
[ 16.378401] [<c0263b20>] (__vfs_read) from [<c0263cdc>]
(vfs_read+0x8c/0x110)
[ 16.385546] [<c0263cdc>] (vfs_read) from [<c02641c0>]
(ksys_read+0x4c/0xac)
[ 16.392519] [<c02641c0>] (ksys_read) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[ 16.400083] Exception stack(0xec02dfa8 to 0xec02dff0)
[ 16.405135] dfa0: 00022698 b686ac78 00000009
00081108 00000400 000000c2
[ 16.413311] dfc0: 00022698 b686ac78 000005e8 00000003 0000004b
be8c7af4 00000000 00000000
[ 16.421485] dfe0: 0004d1b2 be8c7a84 b686b94c b686ac98
[ 16.426542] Code: e1cd80f0 e5947020 ebfffb4f e5943048 (e593302c)
[ 16.432734] ---[ end trace 5dbf91c64da6bd91 ]---

Reverting this makes it behave as expected again. Does anybody know what
is going on?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/arm/boot/dts/tegra30-apalis-eval.dts?h=next-20180716
--
Best regards - Mit freundlichen Grüssen - Meilleures salutations

Marcel Ziswiler
Platform Manager Embedded Linux

Toradex AG
Altsagenstrasse 5 | 6048 Horw/Luzern | Switzerland | T: +41 41 500 48 00
(main line) | Direct: +41 41 500 48 10

2018-07-16 14:21:37

by Kirill A. Shutemov

Subject: Re: REGRESSION: [PATCHv2 2/2] mm: Drop unneeded ->vm_ops checks

On Mon, Jul 16, 2018 at 01:30:34PM +0000, Marcel Ziswiler wrote:
> On Thu, 2018-07-12 at 17:56 +0300, Kirill A. Shutemov wrote:
> > We now have all VMAs with ->vm_ops set and don't need to check it for
> > NULL everywhere.
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
> > [full patch trimmed]
>
> Today's -next on Apalis T30 [1] gives the following error upon boot:
>
> [oops trimmed]
>
> Reverting this makes it behave as expected again. Does anybody know what is going on?

Could you check if this fixup helps?

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 225d1c58d2de..553262999564 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -334,6 +334,7 @@ static struct vm_area_struct gate_vma = {
.vm_start = 0xffff0000,
.vm_end = 0xffff0000 + PAGE_SIZE,
.vm_flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC,
+ .vm_ops = &dummy_vm_ops,
};

static int __init gate_vma_init(void)
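
For context on the oops above: gate_vma here is a statically defined
vm_area_struct, so it never went through the code path that now gives
every VMA a non-NULL ->vm_ops, while show_map_vma() after this series
dereferences vma->vm_ops->name without checking ->vm_ops first. As a
rough sketch of the idea behind dummy_vm_ops (only the dummy_vm_ops name
is taken from the fixup above; the helper name vma_init() and its exact
body are illustrative, not a quote from the patch):

/* Sketch only: an empty operations table with every callback NULL, so
 * callers may test individual ops without first checking ->vm_ops. */
const struct vm_operations_struct dummy_vm_ops = {};

/* Illustrative initializer that dynamically created VMAs would run. */
static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
{
	vma->vm_mm = mm;
	vma->vm_ops = &dummy_vm_ops;		/* empty ops instead of NULL */
	INIT_LIST_HEAD(&vma->anon_vma_chain);
}

Statically defined VMAs like gate_vma bypass any such helper, which is why
the fixup has to set .vm_ops = &dummy_vm_ops explicitly.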
--
Kirill A. Shutemov

2018-07-16 16:11:34

by Marcel Ziswiler

[permalink] [raw]
Subject: Re: REGRESSION: [PATCHv2 2/2] mm: Drop unneeded ->vm_ops checks

On Mon, 2018-07-16 at 17:20 +0300, Kirill A. Shutemov wrote:
> Could you check if this fixup helps?
>
> diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
> index 225d1c58d2de..553262999564 100644
> --- a/arch/arm/kernel/process.c
> +++ b/arch/arm/kernel/process.c
> @@ -334,6 +334,7 @@ static struct vm_area_struct gate_vma = {
> .vm_start = 0xffff0000,
> .vm_end = 0xffff0000 + PAGE_SIZE,
> .vm_flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC,
> + .vm_ops = &dummy_vm_ops,
> };
>
> static int __init gate_vma_init(void)

Yep, that cuts it. Thanks!

I assume you will send a proper patch and/or do the needful?

2018-07-16 16:58:03

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: REGRESSION: [PATCHv2 2/2] mm: Drop unneeded ->vm_ops checks

On Mon, Jul 16, 2018 at 04:09:54PM +0000, Marcel Ziswiler wrote:
> Yep, that cuts it. Thanks!
>
> I assume you will send a proper patch and/or do the needful?

Andrew, could you fold it into 1/2?

--
Kirill A. Shutemov