Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp21340896ybl; Mon, 6 Jan 2020 02:43:45 -0800 (PST) X-Google-Smtp-Source: APXvYqy8rHo/dgka2Gdl0S8ljFg+1rM5M1WYgND2lhB1ap2O5OOoIi3rYHyLp2wTftFr7u1hIBQT X-Received: by 2002:a9d:6045:: with SMTP id v5mr107184033otj.252.1578307425591; Mon, 06 Jan 2020 02:43:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1578307425; cv=none; d=google.com; s=arc-20160816; b=XbExq5pJc/hPA77oEpvhXjTaVU9RPCGfF4DIWUco7HUzccY1VH2G65Kb81fUcjXHCp hLqKZit83XfmpNScM1SZMHadsKxTMl22t+gjYalVP14KLj7gZE6AH8NT/5t5VicXaG+o 7vJmv8NNxeqrEqH4rJ2XD/vz9hqs3kBKx/6T3kcmZj6qZxHmbOhrOj9jD5g2ABnuyDow CLmefCUuB+xj36CUhb0Q50m1GnUTvDdojp3JXI+CoU4KJSDAm2PN6pNhby77419dYzrm 06OaRRYRy/hvA4lYGcoQhJJg7vhBtKgh/Bkehg8r+MFi8cHavZDwQbobwGag9PBsKo41 NRUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:message-id:date:cc:to:from:subject:dkim-signature; bh=bRVeZsD0ssI2ITrXsE3HS5bNXpMA/dOYQCa6X9aIh44=; b=nY39AEchHLndkKu2p2lzFrtxHNeJJ/6j5VqLsxpO2Vm7qrqVTAD6PXU/w4E4RM+U5q DXMKn19BzlPjRt94xobHIir+m6kAJyqugpQCAGG6ptfSIh0hcweZ8SrsrtwPIud0KjtR V7RI8NUAOfzDG/RVZsBzdLAkfVRNEpZR1h51w5bsKN8nBilhgyxv4BdZdHM1rTUUBpTo UOa9rJUfEIz3r7MhfIk8uOCU48flLoaIE5hOFVJ0u5m4n3/OefgzyPhDYa/0wWXXtE2P rxaOK25Sb3vrlnBWBM3damhL9N5BZdMLwSCKEArgWBLoyAYlsnhMQgjioCjxWXVCuaJm OomQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=S6iDG8Pd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 24si32383475oip.248.2020.01.06.02.43.33; Mon, 06 Jan 2020 02:43:45 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=S6iDG8Pd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726368AbgAFKmp (ORCPT + 99 others); Mon, 6 Jan 2020 05:42:45 -0500 Received: from forwardcorp1o.mail.yandex.net ([95.108.205.193]:42482 "EHLO forwardcorp1o.mail.yandex.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726281AbgAFKmo (ORCPT ); Mon, 6 Jan 2020 05:42:44 -0500 Received: from mxbackcorp2j.mail.yandex.net (mxbackcorp2j.mail.yandex.net [IPv6:2a02:6b8:0:1619::119]) by forwardcorp1o.mail.yandex.net (Yandex) with ESMTP id 47ED62E1430; Mon, 6 Jan 2020 13:42:41 +0300 (MSK) Received: from myt5-70c90f7d6d7d.qloud-c.yandex.net (myt5-70c90f7d6d7d.qloud-c.yandex.net [2a02:6b8:c12:3e2c:0:640:70c9:f7d]) by mxbackcorp2j.mail.yandex.net (mxbackcorp/Yandex) with ESMTP id 2E6fIXUjzS-gePakUEF; Mon, 06 Jan 2020 13:42:41 +0300 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1578307361; bh=bRVeZsD0ssI2ITrXsE3HS5bNXpMA/dOYQCa6X9aIh44=; h=Message-ID:Date:To:From:Subject:Cc; b=S6iDG8PdFlivCb3idaunrZXUk+s8cjKe00WFd8ruvOMteYgbxUMcvKxRnlwLlCPKL jU7SyUXe7aEgJLKnKqk1uaR3LB0HyOftX67mVpzB8SiQBqjkJYUB8v+rxmDONr4rQD lIdx4c9TkLnzrdKTviHiYQOcxTF6gews81MRLPKg= Authentication-Results: mxbackcorp2j.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Received: from unknown (unknown [2a02:6b8:b080:6708::1:7]) by myt5-70c90f7d6d7d.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id j38nwkNrq9-geV4IAKt; Mon, 06 Jan 2020 13:42:40 +0300 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client certificate not present) Subject: [PATCH] mm/rmap: fix reusing mergeable anon_vma as parent when fork From: Konstantin Khlebnikov To: linux-mm@kvack.org, Andrew Morton , linux-kernel@vger.kernel.org, Wei Yang Cc: "Kirill A. Shutemov" Date: Mon, 06 Jan 2020 13:42:40 +0300 Message-ID: <157830736034.8148.7070851958306750616.stgit@buzz> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This fixes couple misconceptions in commit 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork"). First problem caused by initialization order in dup_mmap(): vma->vm_prev is set after calling anon_vma_fork(). Thus in anon_vma_fork() it points to previous VMA in parent mm. This is fixed by rearrangement in dup_mmap(). If in parent VMAs: SRC1 SRC2 .. SRCn share anon-vma ANON0, then after fork before all patches in child process related VMAs: DST1 DST2 .. DSTn will use different anon-vmas: ANON1 ANON2 .. ANONn. Before this patch only DST1 will fork new ANON1 and following DST2 .. DSTn will share parent's ANON0. With this patch DST1 will create new ANON1 and DST2 .. DSTn will share it. Also this patch moves sharing logic out of anon_vma_clone() into more specific anon_vma_fork() because this supposed to work only at fork(). Function anon_vma_clone() is more generic is also used at splitting VMAs. Second problem is hidden behind first one: assumption "Parent has vm_prev, which implies we have vm_prev" is wrong if first VMA in parent mm has set flag VM_DONTCOPY. Luckily prev->anon_vma doesn't dereference NULL pointer because in current code 'prev' actually is same as 'pprev'. To avoid that this patch just checks pointer and compares vm_start to verify relation between previous VMAs in parent and child. Signed-off-by: Konstantin Khlebnikov Fixes: 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork") --- kernel/fork.c | 4 ++-- mm/rmap.c | 25 ++++++++++++------------- 2 files changed, 14 insertions(+), 15 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index 2508a4f238a3..04ee5e243f65 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -548,6 +548,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, if (retval) goto fail_nomem_policy; tmp->vm_mm = mm; + tmp->vm_prev = prev; /* anon_vma_fork use this */ + tmp->vm_next = NULL; retval = dup_userfaultfd(tmp, &uf); if (retval) goto fail_nomem_anon_vma_fork; @@ -559,7 +561,6 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, } else if (anon_vma_fork(tmp, mpnt)) goto fail_nomem_anon_vma_fork; tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT); - tmp->vm_next = tmp->vm_prev = NULL; file = tmp->vm_file; if (file) { struct inode *inode = file_inode(file); @@ -592,7 +593,6 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, */ *pprev = tmp; pprev = &tmp->vm_next; - tmp->vm_prev = prev; prev = tmp; __vma_link_rb(mm, tmp, rb_link, rb_parent); diff --git a/mm/rmap.c b/mm/rmap.c index b3e381919835..77b3aa38d5c2 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -269,19 +269,6 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src) { struct anon_vma_chain *avc, *pavc; struct anon_vma *root = NULL; - struct vm_area_struct *prev = dst->vm_prev, *pprev = src->vm_prev; - - /* - * If parent share anon_vma with its vm_prev, keep this sharing in in - * child. - * - * 1. Parent has vm_prev, which implies we have vm_prev. - * 2. Parent and its vm_prev have the same anon_vma. - */ - if (!dst->anon_vma && src->anon_vma && - pprev && pprev->anon_vma == src->anon_vma) - dst->anon_vma = prev->anon_vma; - list_for_each_entry_reverse(pavc, &src->anon_vma_chain, same_vma) { struct anon_vma *anon_vma; @@ -334,6 +321,7 @@ int anon_vma_clone(struct vm_area_struct *dst, struct vm_area_struct *src) */ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma) { + struct vm_area_struct *prev = vma->vm_prev, *pprev = pvma->vm_prev; struct anon_vma_chain *avc; struct anon_vma *anon_vma; int error; @@ -345,6 +333,17 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma) /* Drop inherited anon_vma, we'll reuse existing or allocate new. */ vma->anon_vma = NULL; + /* + * If parent shares anon_vma with its vm_prev, keep this sharing. + * + * Previous VMA could be missing or not match previuos in parent + * if VM_DONTCOPY is set: compare vm_start to avoid this case. + */ + if (pvma->anon_vma && pprev && prev && + pprev->anon_vma == pvma->anon_vma && + pprev->vm_start == prev->vm_start) + vma->anon_vma = prev->anon_vma; + /* * First, attach the new VMA to the parent VMA's anon_vmas, * so rmap can find non-COWed pages in child processes.