Date: Tue, 19 Mar 2024 14:50:07 -0300
From: Jason Gunthorpe
To: Will Deacon
Cc: Robin Murphy, Tyler Hicks, Jerry Snitselaar,
    linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
    linux-kernel@vger.kernel.org, Dexuan Cui, Easwar Hariharan
Subject: Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?
Message-ID: <20240319175007.GC66976@ziepe.ca>
References: <120d0dec-450f-41f8-9e05-fd763e84f6dd@arm.com> <20240319154756.GB2901@willie-the-truck>
In-Reply-To: <20240319154756.GB2901@willie-the-truck>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 19, 2024 at 03:47:56PM +0000, Will Deacon wrote:

> Right, it's hard to win if DMA-active devices weren't quiesced properly
> by the outgoing kernel. Either the SMMU was left in abort (leading to the
> problems you list above) or the SMMU is left in bypass (leading to possible
> data corruption). Which is better?

For whatever reason (and I really don't like this design) a lot of work
was done on x86 so that devices continue to work as-was right up until
the crash kernel does the first DMA operation. This includes having the
crash kernel non-disruptively inherit and retain the IOMMU configuration
(e.g. see the translation_pre_enabled() stuff in the intel driver).

I think the idea was that the crash kernel driver will recover control
of the device prior to trying to do DMA. Devices without a driver, or
devices that are not operated by the crash kernel, just keep going as
they were.

In general practice this is unworkable as some devices can't be
recovered without doing DMA in the first place, creating a catch-22. So
now lots of devices use their shutdown handler to quiet the device
before handing over to the crash kernel.

I think this emerged as some 'small work' to try and make crash kernels
functional at all. Implementing every shutdown handler would be pretty
hard, but many (?) devices seem to work OK if the crash kernel driver
runs for a bit before destroying their DMA setup. We don't trigger weird
platform crashes or anything due to failing DMA operations either.

Now we have all kinds of infrastructure and deployed crash kernels that
have this assumption baked in.
:(

It sure would be nice to not spread this full complexity to ARM. The
original kernel could signal to the crash kernel that specific devices
are quieted, and then the crash kernel could simply ignore unquieted
devices: set the IOMMU to abort them and not allow any crash drivers to
attach (or maybe FLR them?). If someone wants a device to be usable in
the crash kernel then the original kernel needs to implement the
shutdown handler.

Regardless, I think if your goal is to support crash kernels then you
have to do at least a bit of the x86 'keep the iommu unchanged'. The
iommu shutdown should do less, like x86 does, and the iommu startup
should detect the special case and try to atomically switch to the new
STE table that aborts unquieted devices.

Booting a non-crash OS is a different matter and in that case you really
want every bit of HW put back to a clean "just booted" state, and
arguably it can't work unless the original kernel implements all the
shutdown handlers...

I don't know if x86 kexec actually supports this; it looks like it only
works on Linux OS, and things like the Linux iommu driver have code to
support the crash-focused handover even in non-crash cases.

Jason
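
For illustration only, the handover idea above (the original kernel
reports which devices it quieted, the crash kernel aborts everything
else and refuses to bind drivers to it) can be modelled as a toy Python
sketch. This is not kernel code and not an existing interface: the names
Ste, build_crash_ste_table() and may_attach_driver(), the stream IDs,
and the idea of passing a "quieted" set across kexec are all
hypothetical. Leaving quieted devices on their inherited configuration
is also an assumption of the sketch.

```python
from enum import Enum


class Ste(Enum):
    """Simplified per-stream STE states (a toy model, not the SMMU format)."""
    INHERITED = "inherited"  # left exactly as the original kernel configured it
    ABORT = "abort"          # DMA from this stream is rejected


def build_crash_ste_table(inherited, quieted):
    """Build a new table without modifying the live one, so that it can be
    swapped in as a single step. Streams not reported as quieted abort."""
    return {
        sid: (ste if sid in quieted else Ste.ABORT)
        for sid, ste in inherited.items()
    }


def may_attach_driver(sid, quieted):
    """Crash kernel drivers may only bind to devices reported as quieted."""
    return sid in quieted


# Hypothetical example: stream 0x10 was quieted by its shutdown handler,
# stream 0x20 was not.
live = {0x10: Ste.INHERITED, 0x20: Ste.INHERITED}
quieted = {0x10}
new_table = build_crash_ste_table(live, quieted)
```

The point of building a separate table is that the live configuration
stays untouched until the switch, so in-flight DMA from devices that
were never quieted keeps its old translation right up to the moment it
starts aborting.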