Received: by 10.223.176.5 with SMTP id f5csp162106wra; Fri, 26 Jan 2018 20:11:03 -0800 (PST) X-Google-Smtp-Source: AH8x224nep/EKSPzk2FW16s60sY3Oje3t2+dCqnC0p0G7/BrIbRMvumI9lrR69093YsKkEPHYEHj X-Received: by 2002:a17:902:4083:: with SMTP id c3-v6mr13229320pld.90.1517026263835; Fri, 26 Jan 2018 20:11:03 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1517026263; cv=none; d=google.com; s=arc-20160816; b=tq157UX+By9/lty5kEaXNtGJrqZuPUi5fc3Ek1T5YOzkZyTh2jxKos3DWnXz5/ScR3 2Gy0q64ae0J/gaU5VMFWW25NczBzt8AeIaMCKInvVqLIntNHmjRE7IbyeK+62ZQtqpYc HNRRIIi8NqLMSXXDlZ/wTbF6M+7nRtccxqP6ek7GXGP4ePhFRxPMGHF5OxJK9XGV8JVx GV3+x6ILYGEzaxlZT5Yq/ThcmHylKTE/dS0QB45V0dUB40FLwDb9mdFIopAVBdu2t+BV uZWWuTH7oM5TQhg7OEF0ctP6C0T2ZFJNuxh6Mp01ddb6W9ovE9CCXJxNuZoOqzPhfDNV 2vFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=6j6P6Ir2V+lsZUrgozg2sDUIsCAaMVD8/DdwarIhk5E=; b=AXJBGP8YpLPGneCkGZ4bDF4rxZ9qJvBIb4Tut+NmPZsHOXufgH929tlwaQN9VKI1eq 0hj9EJvqjZSBeucwDBqs1gHIueaEOWFqSUmFdJzN/TKBCKek9dtf2qvdJzzSvrB6h4XU /Njvf0GJpsOo58H0ro+FTp7slTnWvTuGTpyEX9TvKNx3nVKI/uCya4pGwq+M4cguUW3E qO2BQ5/oBJy1WkhWQF4CxgMYYvHxE7H6RwiztCHaeGR0oOVoMpAO45PDsxlS3TTrEv+H E5Mya9WHttnVB7mRgAwYdonTaMZSW/orUe588SeAaIn0WbcFS65xHYe8CtNtEgDaLt+5 yjgA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QROCiE4J; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a15si3920079pgn.436.2018.01.26.20.10.36; Fri, 26 Jan 2018 20:11:03 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QROCiE4J; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751828AbeA0EKB (ORCPT + 99 others); Fri, 26 Jan 2018 23:10:01 -0500 Received: from mail-pf0-f196.google.com ([209.85.192.196]:44283 "EHLO mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751475AbeA0EJ7 (ORCPT ); Fri, 26 Jan 2018 23:09:59 -0500 Received: by mail-pf0-f196.google.com with SMTP id 17so38983pfw.11; Fri, 26 Jan 2018 20:09:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=6j6P6Ir2V+lsZUrgozg2sDUIsCAaMVD8/DdwarIhk5E=; b=QROCiE4J0EPssyNLQjDdiHquLTnRCThpExZ/Ci38b+L5tpqhvqtN2FukY7wtitkkND xURJro3INNeVJb/JEajxo3lef493LalrgNicDHBfeNyUTRKXU3gyTWBhVnqdzYqxWZvi ip3jCgBQVHdvX502VwJzTNf4/MOSdRX16SxJcmCoxnVsST2kqosQgNn44DiYV+m4Y4RD 7nBSfvHnBKzp5lSZ8XUwsUUiipPSn0fWsNwKLSfiH6TtINopWB59shWOCAqZIRG2wAGN BHZWTTxY6N6QIo/QNSmEBcaKKTZ1+gbBrHfiHPERHTIkScVIfWpq2PqQchbFlnzjaoW6 7cAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=6j6P6Ir2V+lsZUrgozg2sDUIsCAaMVD8/DdwarIhk5E=; b=QiAohPuK4/Fbc0O6l2JwVVRFXqVElfHuSvkfF4l56IOWGjnkTsDJShznr8O+mG89ap szARJ3dPpbXXc+RYa3Bd/G8ZPeTdP8U1etk1mGhBLCjNxB+E3zWfmZTp9Xt7gVZ6GuvF 0A+8v+yKu45zg3X5pq/bdlw3fUJlzQJr/wqZM9+mNdULzBoKqhTm9EIEO+lzbJDHsyEO /v41lheDu4gVN1/hjPlU5uFTUr+0XSTcCCmBp5sz7qGFrLmgEcec1xLfUS8TYsaDRsn8 JWN0jD4i6a5C3b/sm/6FG2Od4Abrn8yptKPQRTtUbZ1zy9fJd60j9jAGVmvEOsic+4Id WZxA== X-Gm-Message-State: AKwxytdOeME7LeZJdaXxS42nDfeFTc59/VPy6AIW3lnn9CHQWppQi8Cb jw8+D39euDv6x7PLYKS+k4I9Byof X-Received: by 10.98.11.17 with SMTP id t17mr21031446pfi.201.1517026198863; Fri, 26 Jan 2018 20:09:58 -0800 (PST) Received: from ?IPv6:2402:f000:1:1501:200:5efe:166.111.70.14? ([2402:f000:1:1501:200:5efe:a66f:460e]) by smtp.gmail.com with ESMTPSA id p25sm19212983pfj.29.2018.01.26.20.09.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 26 Jan 2018 20:09:58 -0800 (PST) Subject: Re: [PATCH] atm: firestream: Replace GFP_ATOMIC with GFP_KERNEL in fs_send To: Al Viro , David Miller Cc: 3chas3@gmail.com, linux-atm-general@lists.sourceforge.net, netdev@vger.kernel.org, linux-kernel@vger.kernel.org References: <1516953627-30983-1-git-send-email-baijiaju1990@gmail.com> <20180126120522.GX13338@ZenIV.linux.org.uk> <20180126.110739.419621238821097728.davem@davemloft.net> <20180126170826.GY13338@ZenIV.linux.org.uk> From: Jia-Ju Bai Message-ID: Date: Sat, 27 Jan 2018 12:09:38 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: <20180126170826.GY13338@ZenIV.linux.org.uk> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018/1/27 1:08, Al Viro wrote: > On Fri, Jan 26, 2018 at 11:07:39AM -0500, David Miller wrote: >>>> This is found by a static analysis tool named DCNS written by myself. >>> The trouble is, places like >>> net/atm/raw.c:65: vcc->send = atm_send_aal0; >>> net/atm/raw.c:74: vcc->send = vcc->dev->ops->send; >>> net/atm/raw.c:83: vcc->send = vcc->dev->ops->send; >>> mean extra call chains. It's *not* just vcc_sendmsg(), and e.g. >>> ret = ATM_SKB(skb)->vcc->send(ATM_SKB(skb)->vcc, skb) >>> ? DROP_PACKET : 1; >>> bh_unlock_sock(sk_atm(vcc)); >>> in pppoatm_send() definitely is called under a spinlock. >>> >>> Looking through the driver (in advanced bitrot, as usual for drivers/atm), >>> I'd say that submit_queue() is fucked in head in the "queue full" case. >>> And judging by the history, had been thus since the original merge... >> Jia-Ju, I'm probably not going to apply any of your GFP_KERNEL >> conversions. >> >> Al's analysis above is similar to how things looked for other patches >> you submited of this nature. >> >> So because of the lack of care and serious auditing you put into these >> conversions, I really have no choice but to drop them from my queue >> because on the whole they are adding bugs rather than improving the >> code. > FWIW, the tool *does* promise to be useful Thanks, I am happy to hear that :) > - as far as I understand it > looks for places where GFP_ATOMIC allocation goes with blocking operations > near every callchain leading there. And that indicates something fishy > going on - either pointless GFP_ATOMIC (in benign case), or something > much nastier: a callchain that would require GFP_ATOMIC. In that case > whatever blocking operation found along the way is a bug. In fact, my tool first collects all places of GFP_ATOMIC and mdelay in the whole kernel code. Then it starts analysis from the entry of each interrupt handler and spin-lock function call in the whole kernel code, to mark the places of GFP_ATOMIC and mdelay that are called in atomic context. The remaining unmarked places of GFP_ATOMIC and mdelay are reported and can be replaced with GFP_KERNEL and mdelay (or usleep_range). Though my tool has handled some common situations of function pointers, But it does not well handle function pointer passing in this code (vcc->send = vcc->dev->ops->send), so the tool needs to be improved. I am sorry for my incorrect report... > This time it has, AFAICS, caught a genuine bug in drivers/atm/firestream.c - > static void submit_qentry (struct fs_dev *dev, struct queue *q, struct FS_QENTRY *qe) > { > u32 wp; > struct FS_QENTRY *cqe; > > /* XXX Sanity check: the write pointer can be checked to be > still the same as the value passed as qe... -- REW */ > /* udelay (5); */ > while ((wp = read_fs (dev, Q_WP (q->offset))) & Q_FULL) { > fs_dprintk (FS_DEBUG_TXQ, "Found queue at %x full. Waiting.\n", > q->offset); > schedule (); > } > ... > static void submit_queue (struct fs_dev *dev, struct queue *q, > u32 cmd, u32 p1, u32 p2, u32 p3) > { > struct FS_QENTRY *qe; > > qe = get_qentry (dev, q); > qe->cmd = cmd; > qe->p0 = p1; > qe->p1 = p2; > qe->p2 = p3; > submit_qentry (dev, q, qe); > ... > static int fs_send (struct atm_vcc *atm_vcc, struct sk_buff *skb) > { > ... > td = kmalloc (sizeof (struct FS_BPENTRY), GFP_ATOMIC); > ... > submit_queue (dev, &dev->hp_txq, > QE_TRANSMIT_DE | vcc->channo, > virt_to_bus (td), 0, > virt_to_bus (td)); > ... > > Either all callchains leading to fs_send() are in non-atomic contexts > (in which case that GFP_ATOMIC would be pointless) or there's one > where we cannot block. Any such would be a potential deadlock. > > And the latter appears to be the case - fs_send() is atmdev_ops ->send(), > which can end up in atm_vcc ->send(), which can be called from under > ->sk_lock.slock. Yes, I think schedule() can cause a sleep-in-atomic-context bug. > What would be really useful: > * list of "here's a list of locations where we do something > blocking; each callchain to this GFP_ATOMIC allocation passes either > upstream of one of those without leaving atomic context in between > or downstream without entering one". > * after that - backtracking these callchains further, watching > for ones in atomic contexts. Can be done manually, but if that tool > can assist in doing so, all the better. If we do find one, we have > found a deadlock - just take the blocking operation reported for > that callchain and that's it. If it's not an obvious false positive > (e.g. > if (!foo) > spin_lock(&splat); > ..... > if (foo) > schedule(); > ), it's worth reporting to maintainers of the code in question. > * if all callchains reach obviously non-atomic contexts > (syscall entry, for example, or a kernel thread payload, or > a function documented to require a mutex held by caller, etc.) - > then a switch to GFP_KERNEL might be appropriate. With analysis > of callchains posted as you are posting that. > * either way, having the tool print the callchains out > would be a good idea - for examining them, for writing reports, > etc. Thanks for your very helpful advice :) I will follow it in my patches. Thanks, Jia-Ju Bai