Subject: Re: Signal handling in a page fault handler
From: Thomas Hellstrom
To: Chris Wilson, Matthew Wilcox, dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Souptick Joarder
Cc: linux-kernel@vger.kernel.org
Date: Tue, 3 Apr 2018 15:12:35 +0200

On 04/03/2018 02:33 PM, Chris Wilson wrote:
> Quoting Matthew Wilcox (2018-04-02 15:10:58)
>> Souptick and I have been auditing the various page fault handler routines
>> and we've noticed that graphics drivers assume that a signal should be
>> able to interrupt a page fault. In contrast, the page cache takes great
>> care to allow only fatal signals to interrupt a page fault.
>>
>> I believe (but have not verified) that a non-fatal signal being delivered
>> to a task which is in the middle of a page fault may well end up in an
>> infinite loop, attempting to handle the page fault and failing forever.
>>
>> Here's one of the simpler ones:
>>
>>         ret = mutex_lock_interruptible(&etnaviv_obj->lock);
>>         if (ret)
>>                 return VM_FAULT_NOPAGE;
>>
>> (many other drivers do essentially the same thing including i915)
>>
>> On seeing NOPAGE, the fault handler believes the PTE is in the page
>> table, so does nothing before it returns to arch code at which point
>> I get lost in the magic assembler macros. I believe it will end up
>> returning to userspace if the signal is non-fatal, at which point it'll
>> go right back into the page fault handler, and mutex_lock_interruptible()
>> will immediately fail. So we've converted a sleeping lock into the most
>> expensive spinlock.
> I'll ask the obvious question: why isn't the signal handled on return to
> userspace?

+1

>
>> I don't think the graphics drivers really want to be interrupted by
>> any signal.
> Assume the worst case and we may block for 10s. Even a 10ms delay may be
> unacceptable to some signal handlers (one presumes). For the number one
> ^C usecase, yes that may be reduced to only bother if it's killable, but
> I wonder if there are not timing loops (e.g. sigitimer in Xorg < 1.19)
> that want to be able to interrupt random blockages.
> -Chris

I think the TTM page fault handler originally set the standard for this.
First, IMO any critical section that waits for the GPU (as the page fault
handler typically does) should use a lock that is at least killable. The
need for interruptible locks came from the X server's silken mouse relying
on signals for smooth mouse operations: you didn't want the X server to be
stuck in the kernel waiting for GPU completion when it should be handling
the cursor move request. That no longer seems to be the case, but to
reiterate Chris' question: why would the signal persist once returned to
user-space?

/Thomas
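
The two views in this thread can be put side by side with a toy user-space model. This is only a sketch, not kernel code: every name in it (task_model, model_fault_handler, model_return_to_user, model_fault_attempts) is hypothetical, and the deliver_on_return flag simply encodes the assumption being debated, namely whether a pending non-fatal signal is delivered and cleared on return to user space before the faulting instruction is retried. The model does not prove which behaviour the kernel has; it only shows what each assumption implies for the retry loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All names are hypothetical. */

enum fault_result { FAULT_DONE, FAULT_NOPAGE_RETRY };

struct task_model {
    bool signal_pending; /* stands in for a pending non-fatal signal */
    int handler_runs;    /* times the user-space signal handler ran */
};

/*
 * Models a fault handler that takes its lock interruptibly: with a
 * signal pending the lock attempt fails and the handler reports
 * "no page installed yet", so the access will fault again.
 */
static enum fault_result model_fault_handler(const struct task_model *t)
{
    return t->signal_pending ? FAULT_NOPAGE_RETRY : FAULT_DONE;
}

/*
 * Models the return to user space. ASSUMPTION under test: when
 * deliver_on_return is true, the pending signal is delivered (the
 * handler runs) and cleared before the faulting access is retried.
 */
static void model_return_to_user(struct task_model *t, bool deliver_on_return)
{
    if (deliver_on_return && t->signal_pending) {
        t->handler_runs++;
        t->signal_pending = false;
    }
}

/*
 * Returns how many times the fault handler is entered before the
 * fault is resolved, or -1 if it is still unresolved after
 * max_attempts entries (the "most expensive spinlock" case).
 */
int model_fault_attempts(bool signal_pending, bool deliver_on_return,
                         int max_attempts)
{
    struct task_model t = { .signal_pending = signal_pending,
                            .handler_runs = 0 };

    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (model_fault_handler(&t) == FAULT_DONE)
            return attempt;
        model_return_to_user(&t, deliver_on_return);
    }
    return -1;
}
```

In this model, model_fault_attempts(true, true, 10) resolves on the second entry, which is the outcome Chris' question implies, while model_fault_attempts(true, false, 10) never resolves and returns -1, which is the loop Matthew describes.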