From: Alexey Kardashevskiy
To: linux-kernel@vger.kernel.org
Cc: "Oliver O'Halloran", David Gibson, Sam Bobroff, Alistair Popple,
    Alexey Kardashevskiy, stable@vger.kernel.org
Subject: [PATCH kernel v4 1/4] powerpc/powernv/ioda: Fix race in TCE level allocation
Date: Fri, 12 Jul 2019 19:29:52 +1000
Message-Id: <20190712092955.56218-2-aik@ozlabs.ru>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190712092955.56218-1-aik@ozlabs.ru>
References: <20190712092955.56218-1-aik@ozlabs.ru>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

pnv_tce() returns a pointer to a TCE entry and originally a TCE table
would be pre-allocated. For the default case of a 2GB window the table
needs only a single level and that is fine. However, if more levels are
requested, it is possible to get a race when 2 threads want a pointer
to a TCE entry from the same page of TCEs.

This adds cmpxchg to handle the race.
Note that once TCE is non-zero, it cannot become zero again.

CC: stable@vger.kernel.org # v4.19+
Fixes: a68bd1267b72 ("powerpc/powernv/ioda: Allocate indirect TCE levels on demand")
Signed-off-by: Alexey Kardashevskiy
---

The race occurs about 30 times in the first 3 minutes of copying files
via rsync and that's about it.

This fixes EEH's from
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110810

---
Changes:
v2:
* replaced spin_lock with cmpxchg+readonce
---
 arch/powerpc/platforms/powernv/pci-ioda-tce.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
index e28f03e1eb5e..8d6569590161 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
@@ -48,6 +48,9 @@ static __be64 *pnv_alloc_tce_level(int nid, unsigned int shift)
 	return addr;
 }
 
+static void pnv_pci_ioda2_table_do_free_pages(__be64 *addr,
+		unsigned long size, unsigned int levels);
+
 static __be64 *pnv_tce(struct iommu_table *tbl, bool user, long idx, bool alloc)
 {
 	__be64 *tmp = user ? tbl->it_userspace : (__be64 *) tbl->it_base;
@@ -57,9 +60,9 @@ static __be64 *pnv_tce(struct iommu_table *tbl, bool user, long idx, bool alloc)
 
 	while (level) {
 		int n = (idx & mask) >> (level * shift);
-		unsigned long tce;
+		unsigned long oldtce, tce = be64_to_cpu(READ_ONCE(tmp[n]));
 
-		if (tmp[n] == 0) {
+		if (!tce) {
 			__be64 *tmp2;
 
 			if (!alloc)
@@ -70,10 +73,15 @@ static __be64 *pnv_tce(struct iommu_table *tbl, bool user, long idx, bool alloc)
 			if (!tmp2)
 				return NULL;
 
-			tmp[n] = cpu_to_be64(__pa(tmp2) |
-					TCE_PCI_READ | TCE_PCI_WRITE);
+			tce = __pa(tmp2) | TCE_PCI_READ | TCE_PCI_WRITE;
+			oldtce = be64_to_cpu(cmpxchg(&tmp[n], 0,
+					cpu_to_be64(tce)));
+			if (oldtce) {
+				pnv_pci_ioda2_table_do_free_pages(tmp2,
+					ilog2(tbl->it_level_size) + 3, 1);
+				tce = oldtce;
+			}
 		}
-		tce = be64_to_cpu(tmp[n]);
 
 		tmp = __va(tce & ~(TCE_PCI_READ | TCE_PCI_WRITE));
 		idx &= ~mask;
-- 
2.17.1
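
For readers outside the kernel tree, the allocate-then-publish pattern
that the patch applies in pnv_tce() can be sketched in userspace. This
is an illustration only, not the kernel code: C11 atomics and calloc()
stand in for cmpxchg(), READ_ONCE() and pnv_alloc_tce_level(), and the
names get_or_alloc_level and LEVEL_ENTRIES are hypothetical. It also
leaves out the endianness conversion and the TCE permission bits.

```c
/* Userspace sketch only: C11 atomics and calloc() stand in for the kernel's
 * cmpxchg(), READ_ONCE() and page allocator. All names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define LEVEL_ENTRIES 512	/* assumed entry count per level, for illustration */

/* Return the next-level table behind *slot, allocating it on first use. */
static uint64_t *get_or_alloc_level(_Atomic(uintptr_t) *slot)
{
	uintptr_t cur;

	assert(slot != NULL);

	/* Analogue of READ_ONCE(): load once, reuse the value afterwards. */
	cur = atomic_load(slot);

	if (cur == 0) {
		uint64_t *fresh = calloc(LEVEL_ENTRIES, sizeof(*fresh));
		uintptr_t expected = 0;

		if (!fresh)
			return NULL;

		/* Publish only if the slot is still empty (analogue of cmpxchg). */
		if (atomic_compare_exchange_strong(slot, &expected,
						   (uintptr_t)fresh)) {
			cur = (uintptr_t)fresh;
		} else {
			/* Lost the race: free our copy, use the winner's table. */
			free(fresh);
			cur = expected;
		}
	}

	return (uint64_t *)cur;
}
```

Because a slot never returns to zero once it has been set, the value
obtained from the single load or from the failed compare-exchange stays
valid, so no lock is needed around the lookup.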