Date: Fri, 26 Jan 2018 17:08:26 +0000
From: Al Viro
To: David Miller
Cc: baijiaju1990@gmail.com, 3chas3@gmail.com,
	linux-atm-general@lists.sourceforge.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] atm: firestream: Replace GFP_ATOMIC with GFP_KERNEL in fs_send
Message-ID: <20180126170826.GY13338@ZenIV.linux.org.uk>
References: <1516953627-30983-1-git-send-email-baijiaju1990@gmail.com>
	<20180126120522.GX13338@ZenIV.linux.org.uk>
	<20180126.110739.419621238821097728.davem@davemloft.net>
In-Reply-To: <20180126.110739.419621238821097728.davem@davemloft.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 26, 2018 at 11:07:39AM -0500, David Miller wrote:
> >> This is found by a static analysis tool named DCNS written by myself.
> >
> > The trouble is, places like
> > net/atm/raw.c:65:	vcc->send = atm_send_aal0;
> > net/atm/raw.c:74:	vcc->send = vcc->dev->ops->send;
> > net/atm/raw.c:83:	vcc->send = vcc->dev->ops->send;
> > mean extra call chains.  It's *not* just vcc_sendmsg(), and e.g.
> >                 ret = ATM_SKB(skb)->vcc->send(ATM_SKB(skb)->vcc, skb)
> >                     ? DROP_PACKET : 1;
> >                 bh_unlock_sock(sk_atm(vcc));
> > in pppoatm_send() definitely is called under a spinlock.
> >
> > Looking through the driver (in advanced bitrot, as usual for
> > drivers/atm), I'd say that submit_queue() is fucked in head in the
> > "queue full" case.  And judging by the history, had been thus since
> > the original merge...
>
> Jia-Ju, I'm probably not going to apply any of your GFP_KERNEL
> conversions.
>
> Al's analysis above is similar to how things looked for other patches
> you submitted of this nature.
>
> So because of the lack of care and serious auditing you put into these
> conversions, I really have no choice but to drop them from my queue,
> because on the whole they are adding bugs rather than improving the
> code.

FWIW, the tool *does* promise to be useful - as far as I understand, it
looks for places where a GFP_ATOMIC allocation sits near blocking
operations on every callchain leading to it.  That indicates something
fishy: either a pointless GFP_ATOMIC (the benign case), or something
much nastier - a callchain that genuinely requires GFP_ATOMIC, in which
case any blocking operation found along the way is a bug.

This time it has, AFAICS, caught a genuine bug in
drivers/atm/firestream.c -

static void submit_qentry (struct fs_dev *dev, struct queue *q,
			   struct FS_QENTRY *qe)
{
	u32 wp;
	struct FS_QENTRY *cqe;

	/* XXX Sanity check: the write pointer can be checked to be
	   still the same as the value passed as qe... -- REW */
	/*  udelay (5); */
	while ((wp = read_fs (dev, Q_WP (q->offset))) & Q_FULL) {
		fs_dprintk (FS_DEBUG_TXQ, "Found queue at %x full. Waiting.\n",
			    q->offset);
		schedule ();
	}
...
static void submit_queue (struct fs_dev *dev, struct queue *q,
			  u32 cmd, u32 p1, u32 p2, u32 p3)
{
	struct FS_QENTRY *qe;

	qe = get_qentry (dev, q);
	qe->cmd = cmd;
	qe->p0 = p1;
	qe->p1 = p2;
	qe->p2 = p3;
	submit_qentry (dev, q, qe);
...

static int fs_send (struct atm_vcc *atm_vcc, struct sk_buff *skb)
{
...
	td = kmalloc (sizeof (struct FS_BPENTRY), GFP_ATOMIC);
...
	submit_queue (dev, &dev->hp_txq,
		      QE_TRANSMIT_DE | vcc->channo,
		      virt_to_bus (td), 0,
		      virt_to_bus (td));
...

Either all callchains leading to fs_send() are in non-atomic contexts
(in which case that GFP_ATOMIC would be pointless), or there is one in
which we cannot block - and any such callchain would be a potential
deadlock.  The latter appears to be the case: fs_send() is atmdev_ops
->send(), which can end up as atm_vcc ->send(), which can be called
from under ->sk_lock.slock.

What would be really useful:

	* a list along the lines of "here are the locations where we do
something blocking; each callchain to this GFP_ATOMIC allocation passes
either upstream of one of them without leaving atomic context in
between, or downstream without entering one".

	* after that - backtracking those callchains further, watching
for ones in atomic contexts.  That can be done manually, but if the
tool can assist, all the better.  If we do find one, we have found a
deadlock - just take the blocking operation reported for that callchain
and that's it.  Unless it's an obvious false positive, e.g.
	if (!foo)
		spin_lock(&splat);
	.....
	if (foo)
		schedule();
it's worth reporting to the maintainers of the code in question.

	* if all callchains reach obviously non-atomic contexts (syscall
entry, for example, or a kernel thread payload, or a function documented
to require a mutex held by the caller, etc.), then a switch to
GFP_KERNEL might be appropriate - with the analysis of the callchains
posted, as you are posting now.

	* either way, having the tool print the callchains out would be
a good idea - for examining them, for writing reports, etc.
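To make the shape of that analysis concrete, here is a toy sketch in
Python.  The five-function callgraph, the "atomic roots" set and the
"blocking" set below are invented stand-ins modelled on the firestream
case discussed in this thread - this is not DCNS, and a real tool would
derive all three from the source rather than hard-code them:

```python
# Toy model of the callchain walk described above: enumerate every
# callchain from a root caller down to the function containing the
# GFP_ATOMIC allocation path, then flag chains that both start in an
# atomic context (spinlock held) and reach a blocking operation.

CALLS = {  # caller -> callees (invented edges for illustration)
    "vcc_sendmsg":   ["fs_send"],      # plain process context
    "pppoatm_send":  ["fs_send"],      # called under sk_lock.slock
    "fs_send":       ["submit_queue"],
    "submit_queue":  ["submit_qentry"],
    "submit_qentry": ["schedule"],     # blocks while the queue is full
}

ATOMIC_ROOTS = {"pppoatm_send"}   # entry points holding a spinlock
BLOCKING = {"schedule"}           # primitives that may sleep
TARGET = "submit_qentry"          # where the blocking wait lives

def chains(fn, path=()):
    """Yield every callchain from fn down to TARGET as a tuple."""
    path = path + (fn,)
    if fn == TARGET:
        yield path
    for callee in CALLS.get(fn, []):
        yield from chains(callee, path)

def audit():
    """Split callchains into (potential deadlocks, benign ones)."""
    callees = {c for cs in CALLS.values() for c in cs}
    roots = sorted(set(CALLS) - callees)   # never called by anyone
    bad, ok = [], []
    for root in roots:
        for chain in chains(root):
            # does anything on (or just below) this chain block?
            blocks = any(c in BLOCKING
                         for f in chain for c in CALLS.get(f, []))
            (bad if root in ATOMIC_ROOTS and blocks else ok).append(chain)
    return bad, ok

bad, ok = audit()
for chain in bad:
    # prints: potential deadlock: pppoatm_send -> fs_send ->
    #         submit_queue -> submit_qentry
    print("potential deadlock:", " -> ".join(chain))
```

The vcc_sendmsg chain still blocks, but its root is not atomic, so it
only shows that the GFP_ATOMIC would be pointless *if* it were the sole
caller - which is exactly the distinction the bullet points above ask
the tool to make, with the offending chain printed ready for a report.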