From: Dmitry Vyukov
Date: Thu, 28 Feb 2019 10:34:29 +0100
Subject: Re: INFO: rcu detected stall in ext4_file_write_iter
To: "Theodore Y. Ts'o", Dmitry Vyukov, syzbot, Andreas Dilger, linux-ext4@vger.kernel.org, LKML, linux-fsdevel, syzkaller-bugs, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
References: <0000000000009a01370582c6772a@google.com> <20190226151738.GA6430@mit.edu> <20190227215755.GD10828@mit.edu>
In-Reply-To: <20190227215755.GD10828@mit.edu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 27, 2019 at 10:58 PM Theodore Y. Ts'o wrote:
>
> On Wed, Feb 27, 2019 at 10:58:50AM +0100, Dmitry Vyukov wrote:
> > Peter, Ingo, do you have any updates on the
> > perf_event_open/sched_setattr stalls? This bug cause assorted hangs
> > throughout kernel and so is nasty.
> >
> > syzkaller tries to remove all syscalls from reproducers one-by-one.
> > Somehow without sched_setattr the hang did not reproduce (a bunch of
> > repros have perf_event_open+sched_setattr so somehow they seem to be
> > related)
>
> FWIW, at least for me, the repro.c with sched_setattr commented out
> (see the repro.c attached to a message[1] earlier in the thread) it
> was reproducing reliably on a 2 CPU, 2 GB memory KVM using the
> ext4.git tree (dev branch, 5.0-rc3 plus ext4 commits for the next
> merge window) using a Debian stable-based VM[2].
>
> [1] https://groups.google.com/d/msg/syzkaller-bugs/ByPpM3WZw1s/li7SsaEyAgAJ
> [2] https://mirrors.edge.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/root_fs.img.amd64
>
> > But even with perfect repros machines still won't be
> > able to tell in all cases that even though the hang happened in ext4
> > code, the root cause is actually another scheduler-related system
> > call. So thanks for looking into this.
>
> To be clear, there was *not* a scheduler-related system call in the
> repro.c I was playing with (see [2]); just perf_event_open(2) and
> sendfile(2).

Let me correct the statement then:
But even with perfect repros machines still won't be able to tell in
all cases that even though the hang happened in ext4 code, the root
cause is actually another perf-related system call. So thanks for
looking into this.
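
For readers unfamiliar with the one-by-one removal mentioned in the quoted text, here is a rough sketch of that kind of greedy minimization. It is a simplified illustration only, not syzkaller's actual implementation: `minimize`, `still_hangs`, and the toy predicate are hypothetical names, and the predicate is a stand-in for "run the program and watch for the stall", not a claim about what triggers this particular bug.

```python
from typing import Callable, List, Sequence


def minimize(calls: Sequence[str],
             still_hangs: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop calls one at a time.

    A removal is kept only if the shortened program still reproduces
    the hang according to the (hypothetical) still_hangs oracle.
    """
    current = list(calls)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_hangs(candidate):
            # Call i was not needed; index i now holds the next call.
            current = candidate
        else:
            # Call i appears to be needed; keep it and move on.
            i += 1
    return current


# Toy oracle (assumption for illustration): the "hang" needs both calls.
def toy_oracle(prog: List[str]) -> bool:
    return "perf_event_open" in prog and "sendfile" in prog


program = ["mmap", "perf_event_open", "sched_setattr", "sendfile"]
minimal = minimize(program, toy_oracle)
print(minimal)
```

Note that the result is only as good as the oracle: if a hang reproduces unreliably, a single failed run can make the loop keep a call that is not actually required.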