Date: Tue, 24 Dec 2019 09:59:26 +0800
From: Ming Lei
To: Marc Zyngier
Cc: John Garry, tglx@linutronix.de, "chenxiang (M)", bigeasy@linutronix.de,
    linux-kernel@vger.kernel.org, hare@suse.com, hch@lst.de, axboe@kernel.dk,
    bvanassche@acm.org, peterz@infradead.org, mingo@redhat.com, Zhang Yi
Subject: Re: [PATCH RFC 1/1] genirq: Make threaded handler use irq affinity for managed interrupt
Message-ID: <20191224015926.GC13083@ming.t460p>
References: <68058fd28c939b8e065524715494de95@www.loen.fr>
 <687cbcc4-89d9-63ea-a246-ce2abaae501a@huawei.com>
 <0fd543f8ffd90f90deb691aea1c275b4@www.loen.fr>
 <20191220233138.GB12403@ming.t460p>

On Mon, Dec 23, 2019 at 10:47:07AM +0000, Marc Zyngier wrote:
> On 2019-12-23 10:26, John Garry wrote:
> > > > > > I've also managed to trigger some of them now that I have
> > > > > > access to a decent box with nvme storage.
> > > > >
> > > > > I only have 2x NVMe SSDs when this occurs - I should not be
> > > > > hitting this...
> > > > >
> > > > > > Out of curiosity, have you tried with the SMMU disabled? I'm
> > > > > > wondering whether we hit some livelock condition on
> > > > > > unmapping buffers...
> > > > >
> > > > > No, but I can give it a try. Doing that should lower the CPU
> > > > > usage, though, so it might mask the issue - probably not.
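
FWIW, if it makes that test easier: you should be able to leave the SMMU
probed but have the default DMA domain bypass translation via the kernel
command line below (going from my reading of
Documentation/admin-guide/kernel-parameters.txt, so please double-check
on your kernel):

	iommu.passthrough=1

With that, dma_map_sg()/dma_unmap_sg() should fall back to the direct
mapping and skip IOVA allocation and TLB invalidation completely, which
is exactly the cost we are wondering about.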

> > > > Lots of CPU lockups can be performance issues if there isn't an
> > > > obvious bug.
> > > >
> > > > I am wondering if you could explain a bit why enabling the SMMU
> > > > might save some CPU?
> > >
> > > The other way around: mapping/unmapping IOVAs doesn't come for
> > > free. I'm trying to find out whether the NVMe map/unmap patterns
> > > trigger something unexpected in the SMMU driver, but that's a
> > > very long shot.
> >
> > So I tested v5.5-rc3 with and without the SMMU enabled, and without
> > the SMMU enabled I don't get the lockup.
>
> OK, so my hunch wasn't completely off... At least we have something
> to look into.
>
> [...]
>
> > Obviously this is not conclusive, especially with such limited
> > testing - 5 minute runs each. The CPU load goes up when disabling
> > the SMMU, but that could be attributed to the extra throughput
> > (1183K -> 1539K).
> >
> > I do notice that since we complete the NVMe request in irq context,
> > we also do the DMA unmap, i.e. talk to the SMMU, in the same
> > context, which is less than ideal.
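
Right. If I read v5.5-rc3 correctly, the unmap runs entirely in hard
interrupt context; the completion path is roughly the below (function
names from my reading of drivers/nvme/host/pci.c and blk-mq, so please
correct me if I got the chain wrong):

	nvme_irq()
	  nvme_process_cq()
	    nvme_handle_cqe()
	      blk_mq_complete_request()
	        nvme_pci_complete_rq()
	          nvme_unmap_data()
	            dma_unmap_sg()	/* SMMU: IOVA free + TLB invalidation */

so every single completion pays the SMMU cost with interrupts disabled
on that CPU.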

> It depends on how much overhead invalidating the TLB adds to the
> equation, but we should be able to do some tracing and find out.
>
> > I need to finish for the Christmas break today, so can't check this
> > much further ATM.
>
> No worries. May I suggest creating a new thread in the new year,
> maybe involving Robin and Will as well?

Zhang Yi has observed the CPU lockup issue once when running heavy IO
on a single NVMe drive, so please CC him if you have a new patch to
try.

It also looks like the DMA unmap cost is too high on aarch64 when the
SMMU is involved.

Thanks,
Ming
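
BTW, for anyone who wants rough numbers over the holidays, below is an
untested debug helper for timing the unmap: drop it into
drivers/nvme/host/pci.c and call it in place of the dma_unmap_sg() call
in nvme_unmap_data(). It is only a sketch, so the includes and the call
site may need adjusting for your tree:

#include <linux/dma-mapping.h>
#include <linux/ktime.h>
#include <linux/timekeeping.h>

/*
 * Untested debug helper: time one dma_unmap_sg() call and dump the
 * per-unmap latency into the ftrace buffer
 * (/sys/kernel/debug/tracing/trace).
 */
static inline void nvme_timed_unmap_sg(struct device *dev,
				       struct scatterlist *sg, int nents,
				       enum dma_data_direction dir)
{
	ktime_t t0 = ktime_get();

	dma_unmap_sg(dev, sg, nents, dir);
	trace_printk("dma_unmap_sg(%d nents) took %lld ns\n",
		     nents, ktime_to_ns(ktime_sub(ktime_get(), t0)));
}

Comparing those numbers with iommu.passthrough=1 vs. 0 should show how
much of the per-IO cost is the SMMU unmap.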