Received: by 2002:ab2:7b86:0:b0:1f7:5705:b850 with SMTP id q6csp42433lqh; Fri, 3 May 2024 12:44:30 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCWF6ACBp+oGt3X+1at91yWSRuxOxTviNcPdrPFtUOQHX00rjMxcX3n/5IgNR3WWMEwrXS3w50bWZ6noKYZFlFZO/ZvLv/9lDtMhQtohaQ== X-Google-Smtp-Source: AGHT+IE8P6s1WvWsK4OBMB0S2uPuoIx60oByh0tCYzvWnMBBdR2OqA8IaO71hpkBu8x5MdnfMRLH X-Received: by 2002:a17:902:f7c6:b0:1e3:ff5e:159f with SMTP id h6-20020a170902f7c600b001e3ff5e159fmr3536326plw.18.1714765470487; Fri, 03 May 2024 12:44:30 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1714765470; cv=pass; d=google.com; s=arc-20160816; b=GCB0uR5VfS1/dGpGb5R42o2LE63iF9qkfHrQIiDgCKe9TCRadQDoopbyAnQhlESrGr 4j9S8ZQcX8NcyHa8UD41I8xS1/8/4Funn5FPf1SyZIbePr6uLSeOSlCGDSVmvOmp7Q9H wEhSuG5ssx1JM/6HirrhBe+9qTQWgsYS2ThFfJxLCre77H4eanuZDOZ662uNle34IW/I cTkgAoS1pbVG60vsHdjrOETjktylHfOJ6jJtOfU8mf5RSF/77177BXHaV/U+3i24D+Z5 QSmXtukxePSXHnHErKfVIJgebpwp1PqAMWR1HDjVKqbgQevtB4bQpIrkiUHTdHS8pZkf ObGA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:dkim-signature; bh=CGjDpzqhZQyAQMKyqH+Z7WQdcPMW4kVzNIgeNLfdaAY=; fh=U3JzLr8UcxI30xI1T5PEbF6TfWTsUmbwXn4eMrv3Uw8=; b=i1nquabjenlx3D28WV9dvdEhbUG3rxwuUZkCVa2WMX6yG5CHR7FJZf5nZayio7lZhi 82CenABD77M2Rgg2pQQBWR5L0cZpW3OzKhaQUvSAy3pH3lOnnrYrh/blH6ZghDwzcHvm CG3HxFq97qXqtHJopDit9H1CrHbDD1VE7z16ICyWXjuwfyORvnXkqX++2qxA92vpA1yr Fi3h5rjvrxG1dhoNsAw2ucDFDGp6SUSKk2goCLgCs1WckNo6wy0u158NUi9/28Bo/5xu vCZvguN6ZoXqrvr23bTxywAWfWpSkDFyBRP0ndJ7vnjfmhPfVfLquOz7SDXcZDm7MMYj Sk6w==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=2oFPrmCc; arc=pass (i=1 spf=pass spfdomain=rivosinc.com dkim=pass dkdomain=rivosinc-com.20230601.gappssmtp.com); spf=pass (google.com: domain of linux-kernel+bounces-168149-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-168149-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id ma12-20020a170903094c00b001e445b8108csi966946plb.391.2024.05.03.12.44.30 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 May 2024 12:44:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-168149-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20230601.gappssmtp.com header.s=20230601 header.b=2oFPrmCc; arc=pass (i=1 spf=pass spfdomain=rivosinc.com dkim=pass dkdomain=rivosinc-com.20230601.gappssmtp.com); spf=pass (google.com: domain of linux-kernel+bounces-168149-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-168149-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 1B26F282769 for ; Fri, 3 May 2024 19:44:30 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0662A158D61; Fri, 3 May 2024 19:44:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=rivosinc-com.20230601.gappssmtp.com header.i=@rivosinc-com.20230601.gappssmtp.com header.b="2oFPrmCc" Received: from mail-pg1-f179.google.com (mail-pg1-f179.google.com [209.85.215.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A15A158A04 for ; Fri, 3 May 2024 19:44:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.179 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714765463; cv=none; b=jzymy5ArPk0xd6uDKPBAhSD1tUnpIq2WJNitRDqB8vODoYnty8HhsrTxCe8n9Nkkv9OX9+WiDd+6aRtZQvTPSFbqamAdLDQf+e2blvgSnxBi3gSxS1D79MyHA21ZhwtCHImPisLP4QuEsd98MoN1zTXnfEFY1HmFnLxnZtIiEyk= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1714765463; c=relaxed/simple; bh=0NLlSbUWjtfkSNMe90tPQL2xSqgQxt8SNfb42yWTU8Y=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=MGj43Y6+nDU1qy33hFRLD2Of3T72dwraw/orhwU09mSf/av9toxro38+OrwZNVGxWunUbBJfBMwOPvaR063NwRN+7okHzUuDz+ZFF978sU3Od/CBF2qrwe0cPnHWL/iuuoKZzQoRsdB36l9luXSTIdEU8gOr7JPswoydqfi+/Wc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=rivosinc.com; spf=pass smtp.mailfrom=rivosinc.com; dkim=pass (2048-bit key) header.d=rivosinc-com.20230601.gappssmtp.com header.i=@rivosinc-com.20230601.gappssmtp.com header.b=2oFPrmCc; arc=none smtp.client-ip=209.85.215.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-pg1-f179.google.com with SMTP id 41be03b00d2f7-5d3907ff128so12827a12.3 for ; Fri, 03 May 2024 12:44:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1714765461; x=1715370261; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=CGjDpzqhZQyAQMKyqH+Z7WQdcPMW4kVzNIgeNLfdaAY=; b=2oFPrmCcjzasyoCpy5AhnCPSBFINufmdYubjfIa/ZpLhV9vXxe8Ab0fmp1qlLA17GO N4wUO8m1SF1FI5xDq7sPKJTdsQ5clYcPEWZZCrWDTMWkIou3dNxyeULIUnTkVbWE620S 8EjzELyznOoBaMQglczLqTy9ZMByUIwdyQwXVuFM0KyvqCb6nEVfaaOj+xMqjK+uSmOV VHOp/0s600qSUKASTlwUzDP9u4nsInS/qdtzhnW40vLkks4NHtU4yhIKGx9fm9V52h8i Ri0TKS6cMttWgjfe/4yC/lGe6d24MONCPSVfjglG+F4ZyFtcMgbpKCBuy3F4+Mrx3CPt 5e0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714765461; x=1715370261; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CGjDpzqhZQyAQMKyqH+Z7WQdcPMW4kVzNIgeNLfdaAY=; b=eH71HYY6Yl36iMixrtGV5NTT9jsxlyFENkYtS/UUaTgaVqIKSOhqFY7Z1+c+9+cISr vt1LLpxi0TtUWUp/XGCRU1zHmkxEiAamIMzozZ0sFAVl92dDNqwgVcZ76yrTSn3LYIuZ WQamCzIOIP06LDmG2RS1RC6XSjXoFvLGDcD/BAIqbIC46NDjp8J6oldtQT+ImLh1IJN4 FtLchqDSOnP2IS+w3vbiptytuEeqMrIuOXOdiTj+KbTmtfGoKR4+C0jRalhOOn6g8ovE +IrgDI4j5ZfWDgMcs1zX5qLtJUxZTXTGBGvQ17R9PKMo9bq7iZr4ivnNECe1adoIK0hF foLA== X-Forwarded-Encrypted: i=1; AJvYcCVJ0ktL3mRnbePQ3q09j4Cq+cJuTXxt4MEbelKdp0GKcixzU31ylqw7xDovMFM/j6LF6n9EUcme0+oj1h6EcYm08to6tBQO8m/cU0gM X-Gm-Message-State: AOJu0Yx4p51t387ymxZKIEbMYSvUNlmoMywlCJ4FQ8DYwsBUCDb/5/tD Qeb3mdLZAg+DMRw7gbcUoCbnVa9lAI/V9HiD3tmevba7j/liDUioAmYCq1D7FJ0kAOVaPPtLev2 EzEwxZ7P+BWQOEEHBeQ24zNCdmkm60EKZPSos+Q== X-Received: by 2002:a17:90b:354e:b0:2b4:33ee:25a1 with SMTP id lt14-20020a17090b354e00b002b433ee25a1mr3442632pjb.26.1714765460928; Fri, 03 May 2024 12:44:20 -0700 (PDT) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20240501145621.GD1723318@ziepe.ca> <20240503181059.GC901876@ziepe.ca> In-Reply-To: <20240503181059.GC901876@ziepe.ca> From: Tomasz Jeznach Date: Fri, 3 May 2024 12:44:09 -0700 Message-ID: Subject: Re: [PATCH v3 7/7] iommu/riscv: Paging domain support To: Jason Gunthorpe Cc: Joerg Roedel , Will Deacon , Robin Murphy , Paul Walmsley , Palmer Dabbelt , Albert Ou , Anup Patel , Sunil V L , Nick Kossifidis , Sebastien Boeuf , Rob Herring , Krzysztof Kozlowski , Conor Dooley , devicetree@vger.kernel.org, iommu@lists.linux.dev, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux@rivosinc.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, May 3, 2024 at 11:11=E2=80=AFAM Jason Gunthorpe wrot= e: > > On Fri, May 03, 2024 at 10:44:14AM -0700, Tomasz Jeznach wrote: > > > > + list_for_each_entry_rcu(bond, &domain->bonds, list) { > > > > + if (bond->dev =3D=3D dev) { > > > > + list_del_rcu(&bond->list); > > > > + found =3D bond; > > > > + } > > > > + } > > > > + spin_unlock_irqrestore(&domain->lock, flags); > > > > + > > > > + /* Release and wait for all read-rcu critical sections have c= ompleted. */ > > > > + kfree_rcu(found, rcu); > > > > + synchronize_rcu(); > > > > > > Please no, synchronize_rcu() on a path like this is not so > > > reasonable.. Also you don't need kfree_rcu() if you write it like thi= s. > > > > > > This still looks better to do what I said before, put the iommu not > > > the dev in the bond struct. > > > > > > > > > > I was trying not to duplicate data in bond struct and use whatever is > > available to be referenced from dev pointer (eg iommu / ids / private > > iommu dev data). > > I'm not sure that is a valuable goal considering the RCU > complexity.. But I suppose it would be a bit of a hassle to replicate > the ids list into bond structurs. Maybe something to do when you get > to ATS since you'll probably want to replicate the ATS RIDs. (see what > Intel did, which I think is pretty good) > > > If I'm reading core iommu code correctly, device pointer and iommu > > pointers should be valid between _probe_device and _release_device > > calls. I've moved synchronize_rcu out of the domain attach path to > > _release_device, LMK if that would be ok for now. I'll have a > > second another to rework other patches to avoid storing dev pointers > > at all. > > Yes, that seems better.. I'm happier to see device hot-unplug be slow > than un attach > > There is another issue with the RCU that I haven't wrapped my head > around.. > > Technically we can have concurrent map/unmap/invalidation along side > device attach/detach. Does the invalidation under RCU work correctly? > > For detach I think yes: > > Inv CPU Detach CPU > > write io_pte Update device descriptor > rcu_read_lock > list_for_each > > dma_wmb() dma_wmb() > > rcu_read_unlock > list_del_rcu() > > > In this case I think we never miss an invalidation, the list_del is > always after the HW has been fully fenced, so I don't think we can > have any issue. Maybe a suprious invalidation if the ASID gets > re-used, but who cares. > > Attach is different.. > > Inv CPU Attach CPU > > write io_pte > rcu_read_lock > list_for_each // empty > list_add_rcu() > Update device descriptor > > dma_wmb() > > rcu_read_unlock > > As above shows we can "miss" an invalidation. The issue is narrow, the > io_pte could still be sitting in write buffers in "Inv CPU" and not > yet globally visiable. "Attach CPU" could get the device descriptor > installed in the IOMMU and the IOMMU could walk an io_pte that is in > the old state. Effectively this is because there is no release/acquire > barrier passing the io_pte store from the Inv CPU to the Attach CPU to th= e > IOMMU. > > It seems like it should be solvable somehow: > 1) Inv CPU releases all the io ptes > 2) Attach CPU acquires the io ptes before updating the DDT > 3) Inv CPU acquires the RCU list in such a way that either attach > CPU will acquire the io_pte or inv CPU will acquire the RCU list. > 4) Either invalidation works or we release the new iopte to the SMMU > and don't need it. > > But #3 is a really weird statement. smb_mb() on both sides may do the > job?? > Actual attach sequence is slightly different. Inv CPU Attach CPU write io_pte rcu_read_lock list_for_each // empty list_add_rcu() IOTLB.INVAL(PSCID) dma_wmb() rcu_read_unlock I've tried to cover this case with riscv_iommu_iotlb_inval() called before the attached domain is visible to the device. This is something to optimize to avoid invalidating the whole PSCID if the domain was already attached to the same IOMMU, but I'd leave it for a separate patch series. > > > The number of radix levels is a tunable alot of iommus have that we > > > haven't really exposed to anything else yet. > > > > Makes sense. I've left an option to pick mode from MMU for cases where > > dev/iommu is not known at allocation time (with iommu_domain_alloc()). > > I'd guess it's reasonable to assume IOMMU supported page modes will > > match MMU. > > Reasonable, but for this case you'd be best to have a global static > that unifies the capability of all the iommu instances. Then you can > pick the right thing from the installed iommus, and arguably, that is > the right thing to do in all cases as we'd slightly prefer domains > that work everywhere in that edge case. > That is a viable option. If you're ok I'll address this as a separate patch= . I've been trying to avoid adding more global variables, but it might not be avoidable. > Jason Best, - Tomasz