Date: Sat, 20 Feb 2021 19:29:57 +0000
From: Al Viro
To: Linus Torvalds
Cc: syzbot, Greg Kroah-Hartman, Jiri Slaby, linux-kernel@vger.kernel.org, snovitoll@gmail.com, syzkaller-bugs@googlegroups.com
Subject: Re: WARNING in iov_iter_revert (2)
References: <0000000000001fb73f05bb767334@google.com> <0000000000000ca18b05bbc556d6@google.com>

On Sat, Feb 20, 2021 at 05:38:49PM +0000, Al Viro wrote:
> On Sat, Feb 20, 2021 at 08:56:40AM -0800, Linus Torvalds wrote:
> > Al,
> >  This is the "FIXME! Have Al check this!" case in do_tty_write(). You were
> > in on that whole discussion, but we never did get to that issue...
> >
> > There are some subtle rules about doing the iov_iter_revert(), but what's
> > the best way to do this properly? Instead of doing a copy_from_iter() and
> > then reverting the part that didn't fit in the buffer, doing a
> > non-advancing copy and then advancing the amount that did fit, or what?
> >
> > I still don't have power, so this is all me on mobile with html email
> > (sorry), and limited ability to really look closer.
> >
> > "Help me, Albi-wan Viro, you're my only hope"
>
> Will check... BTW, when you get around to doing pulls, could you pick
> the replacement (in followup) instead of the first pull request for
> work.namei? Jens has caught a braino in the last commit there...

It turned out to be really amusing. What happens is write(fd, NULL, 0)
on /dev/ttyprintk, with N_GSM0710 for ldisc (== "pass the data as is
to tty->ops->write()"). And that's the first write since opening that
sucker, so we end up with
	/* write_buf/write_cnt is protected by the atomic_write_lock mutex */
	if (tty->write_cnt < chunk) {
		unsigned char *buf_chunk;

		if (chunk < 1024)
			chunk = 1024;

		buf_chunk = kmalloc(chunk, GFP_KERNEL);
		if (!buf_chunk) {
			ret = -ENOMEM;
			goto out;
		}
		kfree(tty->write_buf);
		tty->write_cnt = chunk;
		tty->write_buf = buf_chunk;
	}
doing nothing - ->write_cnt is still 0 and ->write_buf is still NULL.
Then we copy 0 bytes from source to ->write_buf, which reports that
0 bytes had been copied, TYVM. Then we call
	ret = write(tty, file, tty->write_buf, size);
i.e.
	ret = gsm_write(tty, file, NULL, 0);
which calls
	tpk_write(tty, NULL, 0)
which does
	tpk_printk(NULL, 0);
and _that_ has a very special semantics:
	int i = tpk_curr;

	if (buf == NULL) {
		tpk_flush();
		return i;
	}
i.e. it *can* return a positive number that gets propagated all the way
back to do_tty_write(). And then do_tty_write() notices that write() has
reported a successful write of an amount other than what had been passed
and tries to pull back. By amount passed - amount written.
With iov_iter_revert() saying that some tosser has asked it to revert by
something close to ~(size_t)0.

IOW, it's not iov_iter_revert() being weird or do_tty_write() misusing it -
it's tpk_write() playing silly buggers. Note that old tree would've gone
through seriously weird contortions on the same call:
	// chunk and count are 0, ->write_buf is NULL
	for (;;) {
		size_t size = count;
		if (size > chunk)
			size = chunk;
		ret = -EFAULT;
		if (copy_from_user(tty->write_buf, buf, size))
			break;
		ret = write(tty, file, tty->write_buf, size);
		if (ret <= 0)
			break;
		written += ret;
		buf += ret;
		count -= ret;
		if (!count)
			break;
		ret = -ERESTARTSYS;
		if (signal_pending(current))
			break;
		cond_resched();
	}
and we get written = ret = small positive, count = - that amount,
buf = NULL + that amount. On the next iteration size = 0 (since chunk is
still 0), with same no-op copy_from_user() of 0 bytes, then
gsm_write(tty, file, NULL, 0), and since tpk_flush() zeroes tpk_curr we
finally get 0 out of tpk_printk/tpk_write/gsm_write and bugger off on
if (ret <= 0). Then we have the value in written returned.

So yeah, this return value *was* returned to userland. Except that if
we had done any writes before that, we'd find ->write_buf non-NULL and
the magical semantics of write(fd, NULL, 0) would *not* have triggered -
we would've gotten zero.

Do we want to preserve that weirdness of /dev/ttyprintk writes?
That's orthogonal to the iov_iter uses in there.