Subject: Re: general protection fault in wb_workfn (2)
To: syzbot, syzkaller-bugs@googlegroups.com, jack@suse.cz
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    viro@zeniv.linux.org.uk, axboe@kernel.dk, tj@kernel.org,
    david@fromorbit.com, linux-block@vger.kernel.org
From: Tetsuo Handa
Date: Sun, 27 May 2018 09:47:54 +0900
Message-ID: <0c7c5dea-7312-8a59-9d1b-5467f69719bf@I-love.SAKURA.ne.jp>
In-Reply-To: <000000000000cbd959056d1851ca@google.com>

Forwarding http://lkml.kernel.org/r/201805251915.FGH64517.HVFJOOLFFMQStO@I-love.SAKURA.ne.jp .

Jan Kara wrote:
> > void delayed_work_timer_fn(struct timer_list *t)
> > {
> > 	struct delayed_work *dwork = from_timer(dwork, t, timer);
> >
> > 	/* should have been called from irqsafe timer with irq already off */
> > 	__queue_work(dwork->cpu, dwork->wq, &dwork->work);
> > }
> >
> > Then, wb_workfn() is after all scheduled even if we check for
> > WB_registered bit, isn't it?
>
> It can be queued after WB_registered bit is cleared but it cannot be queued
> after mod_delayed_work(bdi_wq, &wb->dwork, 0) has finished. That function
> deletes the pending timer (the timer cannot be armed again because
> WB_registered is cleared) and queues what should be the last round of
> wb_workfn().

mod_delayed_work() deletes the pending timer but does not wait for an
already invoked timer handler to complete, because it uses del_timer()
rather than del_timer_sync().

Then, what happens if __queue_work() is executed almost concurrently on
two CPUs: once from mod_delayed_work(bdi_wq, &wb->dwork, 0) in the
wb_shutdown() path (which is called without spin_lock_bh(&wb->work_lock)),
and once from the delayed_work_timer_fn() path (which is called without
checking the WB_registered bit under spin_lock_bh(&wb->work_lock))?

wb_wakeup_delayed() {
  spin_lock_bh(&wb->work_lock);
  if (test_bit(WB_registered, &wb->state)) // succeeds
    queue_delayed_work(bdi_wq, &wb->dwork, timeout) {
      queue_delayed_work_on(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork, timeout) {
        if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(&wb->dwork.work))) { // succeeds
          __queue_delayed_work(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork, timeout) {
            add_timer(timer); // arms the timer for delayed_work_timer_fn()
          }
        }
      }
    }
  spin_unlock_bh(&wb->work_lock);
}

delayed_work_timer_fn() {
  // del_timer() already returns false at this point because this timer
  // is already inside its handler. But did something take long enough here
  // for __queue_work() from the wb_shutdown() path to finish first?
  __queue_work(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork.work) {
    insert_work(pwq, work, worklist, work_flags);
  }
}

wb_shutdown() {
  mod_delayed_work(bdi_wq, &wb->dwork, 0) {
    mod_delayed_work_on(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork, 0) {
      ret = try_to_grab_pending(&wb->dwork.work, true, &flags) {
        if (likely(del_timer(&wb->dwork.timer))) // fails because already in delayed_work_timer_fn()
          return 1;
        if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(&wb->dwork.work))) // fails because already set by queue_delayed_work()
          return 0;
        // Returns 1 or -ENOENT after doing something?
      }
      if (ret >= 0)
        __queue_delayed_work(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork, 0) {
          __queue_work(WORK_CPU_UNBOUND, bdi_wq, &wb->dwork.work) {
            insert_work(pwq, work, worklist, work_flags);
          }
        }
    }
  }
}
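
The interleaving traced above can be replayed in a small single-threaded userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the struct fields merely stand in for the PENDING bit and the timer state, and the value try_to_grab_pending() returns once both of its first two checks fail is taken as a parameter (`fallthrough_ret`), because that is exactly the open question in the trace.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one delayed work item.
 * None of these names are kernel APIs. */
struct model_dwork {
	bool pending;         /* stands in for WORK_STRUCT_PENDING_BIT */
	bool timer_armed;     /* timer added and not yet expired */
	bool handler_running; /* timer expired, handler has not queued yet */
	int  times_queued;    /* number of insert_work()-like calls */
};

/* Sets the flag and returns its old value, like test_and_set_bit(). */
static bool model_test_and_set(bool *flag)
{
	bool old = *flag;
	*flag = true;
	return old;
}

/* wb_wakeup_delayed() path: claim PENDING, then arm the timer. */
static bool model_queue_delayed(struct model_dwork *w)
{
	if (model_test_and_set(&w->pending))
		return false;
	w->timer_armed = true;
	return true;
}

/* Timer expiry: once the handler starts, the timer is no longer armed. */
static void model_timer_expire(struct model_dwork *w)
{
	if (w->timer_armed) {
		w->timer_armed = false;
		w->handler_running = true;
	}
}

/* Remainder of the handler: the __queue_work() step. */
static void model_handler_queue(struct model_dwork *w)
{
	if (w->handler_running) {
		w->handler_running = false;
		w->times_queued++;
	}
}

/* wb_shutdown() path. fallthrough_ret is an ASSUMED result for the
 * case where both the timer check and the PENDING check fail. */
static int model_mod_delayed(struct model_dwork *w, int fallthrough_ret)
{
	int ret;

	if (w->timer_armed) {                 /* del_timer() succeeds */
		w->timer_armed = false;
		ret = 1;
	} else if (!model_test_and_set(&w->pending)) {
		ret = 0;                      /* PENDING was free */
	} else {
		ret = fallthrough_ret;        /* the open question */
	}
	if (ret >= 0)
		w->times_queued++;            /* delay 0: queue immediately */
	return ret;
}

/* Shutdown runs after the timer expired but before the handler queued. */
int model_race(int fallthrough_ret)
{
	struct model_dwork w = { false, false, false, 0 };

	model_queue_delayed(&w);
	model_timer_expire(&w);
	model_mod_delayed(&w, fallthrough_ret);
	model_handler_queue(&w);
	return w.times_queued;
}

/* Shutdown runs while the timer is still armed. */
int model_no_race(void)
{
	struct model_dwork w = { false, false, false, 0 };

	model_queue_delayed(&w);
	model_mod_delayed(&w, 1);
	model_timer_expire(&w);   /* no-op: the timer was deleted */
	model_handler_queue(&w);  /* no-op: the handler never started */
	return w.times_queued;
}
```

In this model the work item is queued twice only when the shutdown path proceeds (a non-negative `fallthrough_ret`) while the handler is still in flight. With a negative value, or when the timer is still armed at shutdown, it is queued once. The model says nothing about what the kernel actually returns in that third step.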