Date: Thu, 21 Dec 2023 14:08:23 -0800
From: Andrew Morton
To: Jiajun Xie
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: fix unmap_mapping_range high bits shift bug
Message-Id: <20231221140823.2908189514c0081ae9efbda8@linux-foundation.org>
References: <20231220052839.26970-1-jiajun.xie.sh@gmail.com> <20231220095343.326584f605e8ce995ac151d0@linux-foundation.org>

On Thu, 21 Dec 2023 13:40:11 +0800 Jiajun Xie wrote:

> > (obviously bad, but it's good to spell it out) and under what
> > circumstances it occurs?
>
> Thanks for the quick reply.
>
> The issue happens in heterogeneous computing, where the
> device (e.g. gpu) and host share the same virtual address space.
>
> A simple workflow pattern which hits the issue is:
> /* host */
> 1. userspace first mmaps a file-backed VA range with a specified offset.
>    e.g. (offset=0x800..., mmap return: va_a)
> 2. write some data to the corresponding sys page
>    e.g. (va_a = 0xAABB)
> /* device */
> 3. gpu workload touches VA, triggers a gpu fault and notifies the host.
> /* host */
> 4. received gpu fault notification, then it will:
>   4.1 unmap host pages and also take care of cpu tlb
>       (use unmap_mapping_range with offset=0x800...)
>   4.2 migrate sys page to device
>   4.3 set up device page table and resolve device fault.
> /* device */
> 5. gpu workload continued, it accessed va_a and got 0xAABB.
> 6. gpu workload continued, it wrote 0xBBCC to va_a.
> /* host */
> 7. userspace accesses va_a; as expected, it will:
>   7.1 trigger a cpu vm fault.
>   7.2 have the driver handle the fault and migrate the gpu local page to host.
> 8. userspace then could correctly get 0xBBCC from va_a
> 9. done
>
> But in step 4.1, if we hit the bug this patch mentions, then user space
> would never trigger the cpu fault, and would still get the old value: 0xAABB.

Thanks. Based on the above, I added cc:stable to the changelog so the fix
will be backported into earlier kernels (it looks like that's 20+ years
worth!). And I pasted the above text into that changelog.
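
For readers following along without the patch in front of them: the thread does not show the diff, so the following is only a userspace sketch of what a "high bits shift bug" could look like. It assumes the problem is sign extension when a signed 64-bit file offset with its top bit set (like the offset=0x800... case above) is shifted right to get a page index. The helper names, SKETCH_PAGE_SHIFT and the 4 KiB page size are all hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

/* Assumption for illustration only: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: shift the signed offset directly.
 * Right-shifting a negative signed value is implementation-defined in C;
 * GCC and Clang perform an arithmetic shift, which copies the sign bit
 * into the vacated high bits of the result.
 */
static uint64_t pgoff_signed_shift(int64_t off)
{
	return (uint64_t)(off >> SKETCH_PAGE_SHIFT);
}

/*
 * Hypothetical helper: convert to an unsigned type first, then shift.
 * An unsigned right shift always fills the high bits with zeros.
 */
static uint64_t pgoff_unsigned_shift(int64_t off)
{
	return (uint64_t)off >> SKETCH_PAGE_SHIFT;
}
```

With off = INT64_MIN (bit pattern 0x8000000000000000, an illustrative value rather than the one from the report), the unsigned variant yields 0x0008000000000000, while the signed variant on GCC or Clang yields 0xFFF8000000000000. A range lookup keyed on the second value would miss the pages that were actually mapped, which would be consistent with step 4.1 silently unmapping nothing.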