Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp484080pxb; Wed, 11 Nov 2020 08:27:42 -0800 (PST) X-Google-Smtp-Source: ABdhPJwPwpBUZ3H9mNRX4P6x/4XGVWk7siD56HQhbNnnPfuJA83jUZcENv6Bebo5FMQxg5PLjw7K X-Received: by 2002:a17:906:22c7:: with SMTP id q7mr26983338eja.291.1605112060785; Wed, 11 Nov 2020 08:27:40 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1605112060; cv=pass; d=google.com; s=arc-20160816; b=G27yPU13r7ryxHwppMGm6kPM6F/+4Oyso//7IEpP7dXckfk99qWqBCtlb3JOlGO3dy ASx62DeFJ+S6gWr78weHrBwkE4f2oqEVAw8f9gk2snA3A69mDK/lKIHmpVaUf3YrZl55 Kw/IYfmqvf+9pd3ygX5FuQLZ4qWP56FgKrHVErxIwaaEIHVVd69I3JPQJ6RZ6nw8ZIYn GYBX3H737UjWnssLCq5SqFTEvHnCHXWQHnSa0Bm42BvElPQuvWE0BhiJht42A91SxCMu omge8zPvIFVMo3Y7gLoQ8YmzEo36HJjljRFAr9c09ti1Qppjzu8KO8tysaZ3hEogep/8 yiyA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:content-transfer-encoding :content-language:in-reply-to:user-agent:date:message-id:from :references:cc:to:subject:dkim-signature; bh=kLmnittKlrQpwi8UxsdZcf8RkoJlLDQrmN2RuKhJbE4=; b=AFXvo/tVdHu7RnU/HF6xjVyjsJgyQ0uNU+d5pGFpAur7QoQUujbwHtIkkFT+GkAb3/ 3lFEXW6fWZK/JOvF/DianvhnGOw11fvHGb/JGxChFSOEC8zstviZ+C4KEiJw40E51d+l IQhkFOvsEJvvRp/qESR99q5lEYSfZzR/yp3ivR34rnNd2jWoJviGbuiakNypLG/3ZIjP +XXci4AU1tWhDRuP6ISswjK8enNquOWd6COmiJllW6H3nPkhMUtLHG2ov2/U2O6Zz01T KRau8Wtolxq8FAlwFQUfmuS9iQWqeS8V69oCeGc5sr3PRZSwkEbRXA8rQ9qRzp4j6zUQ mvAw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@windriversystems.onmicrosoft.com header.s=selector2-windriversystems-onmicrosoft-com header.b=WHFtNcfe; arc=pass (i=1 spf=pass spfdomain=windriver.com dkim=pass dkdomain=windriver.com dmarc=pass fromdomain=windriver.com); spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a10si1875454eda.423.2020.11.11.08.27.10; Wed, 11 Nov 2020 08:27:40 -0800 (PST) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@windriversystems.onmicrosoft.com header.s=selector2-windriversystems-onmicrosoft-com header.b=WHFtNcfe; arc=pass (i=1 spf=pass spfdomain=windriver.com dkim=pass dkdomain=windriver.com dmarc=pass fromdomain=windriver.com); spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726208AbgKKQYm (ORCPT + 99 others); Wed, 11 Nov 2020 11:24:42 -0500 Received: from mail-bn7nam10on2052.outbound.protection.outlook.com ([40.107.92.52]:1632 "EHLO NAM10-BN7-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726036AbgKKQYl (ORCPT ); Wed, 11 Nov 2020 11:24:41 -0500 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Wp837lmgj3iRA//ZBFcZ0jB6K3c7/msnZ8cjsqyWkqXckrKCiFuEEwRJp23XvxqAJHj71bJMbR0EKtk/vKSctuPwEnC3eg9Thh8/J2i9MK4vWjhvhEtvyGWepEyWE3omuSZU1vwIzqscT4GJqTqw61dQP9tOVhRIr2YByPL52Yr4hk4GAezwU4A5I2rPHawzwHkZUcHZs4m1zWqxlq/FQf7BT39w0Lw2/XabaJgsJ5/RXn6e8V8emYsSHXdu26VhBuF482l1Vi+U3HH1G9++NM5XDZby4ARfkCGERCwdLxTcDvORYNnns3LmESsbwza1NXo2hjDqI9rTy/PC4zWNRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=kLmnittKlrQpwi8UxsdZcf8RkoJlLDQrmN2RuKhJbE4=; b=jLalon9swfk6l6Qyc8/jPj3UN+BMG/M9ucXD9CtEpBNYogkGjZcTlDqKAmUiZFlbLA4rzaN+c1d8m2ITC6UMiZY38znLfHhbVUCwbKolN6NBKwvhm3BzZ7Mrlu5Mxshxj1ZmTcMTtctyU4Wet7TBiXRbLR40w7qSaorC5AYJoerkc0VYVUgxN0iaosNcpteNkmFGX5Zq2UPj1lf0nQLH9OAYo2drwpYyaLSMGdQx0lHa05nVCSbu7E5BQBkT15jtEnzWtbOwdBvfh5mZmBW3uOSOCbFq8qYQKeNBJY15AW4uLbc6inwLQ7iGGT+A2aKt06vioMAL+hS+0p8mo6bMIQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=windriver.com; dmarc=pass action=none header.from=windriver.com; dkim=pass header.d=windriver.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=windriversystems.onmicrosoft.com; s=selector2-windriversystems-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=kLmnittKlrQpwi8UxsdZcf8RkoJlLDQrmN2RuKhJbE4=; b=WHFtNcfeyU+gcV4ns39+u3OmccCACN55ivD+bgCbbonXp0cWZauZnIkMZYTq1gnuYvhgburH/AfKWcOn7hhmCsZQn3U7VJ9O8Uly21l1SZajyQ0eJOFrTNTxg5pwHxs/W/XNAtXCNOTjJyKQbGfeIgAkSgkn3LaY4/gcKMJm8bM= Authentication-Results: vger.kernel.org; dkim=none (message not signed) header.d=none;vger.kernel.org; dmarc=none action=none header.from=windriver.com; Received: from SJ0PR11MB5120.namprd11.prod.outlook.com (2603:10b6:a03:2d1::13) by BYAPR11MB2855.namprd11.prod.outlook.com (2603:10b6:a02:ca::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3541.25; Wed, 11 Nov 2020 16:24:38 +0000 Received: from SJ0PR11MB5120.namprd11.prod.outlook.com ([fe80::c048:b134:f828:e40]) by SJ0PR11MB5120.namprd11.prod.outlook.com ([fe80::c048:b134:f828:e40%6]) with mapi id 15.20.3541.025; Wed, 11 Nov 2020 16:24:38 +0000 Subject: Re: looking for assistance with jbd2 (and other processes) hung trying to write to disk To: Jan Kara Cc: linux-ext4@vger.kernel.org References: <17a059de-6e95-ef97-6e0a-5e52af1b9a04@windriver.com> <20201110114202.GF20780@quack2.suse.cz> <7fa5a43f-bdd6-9cf1-172a-b2af47239e96@windriver.com> <20201111155743.GG28132@quack2.suse.cz> From: Chris Friesen Message-ID: <654925b2-7c4a-2432-f75e-4fb9ef08a816@windriver.com> Date: Wed, 11 Nov 2020 10:24:35 -0600 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 In-Reply-To: <20201111155743.GG28132@quack2.suse.cz> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [70.64.76.237] X-ClientProxiedBy: BYAPR04CA0012.namprd04.prod.outlook.com (2603:10b6:a03:40::25) To SJ0PR11MB5120.namprd11.prod.outlook.com (2603:10b6:a03:2d1::13) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from [172.25.39.5] (70.64.76.237) by BYAPR04CA0012.namprd04.prod.outlook.com (2603:10b6:a03:40::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3541.21 via Frontend Transport; Wed, 11 Nov 2020 16:24:37 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 0fe19525-a214-4897-77f6-08d8865e48ca X-MS-TrafficTypeDiagnostic: BYAPR11MB2855: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:8882; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 2YWnAVlv/fAkbl91NHS2uM42KDAVUy8BnvF1+oGpgCC9eIrm4fYgRM9NUFkBk9W5pu53rsGMOxhRIDZI0pd37nuVMhCVjsnR22YkGeRF5965KFTRVcWw9pB6eMP67uhGoD/300L1d00J+kWw7Dz0G5DD/N+Qk7ovktp1JRfMivI5bJgzBklKwZQAkxYSsbYkZo4Ebq6qdnrnkodac5g1Fc8lo8+yMuwP4RDi6bkA1b2gc89eGOrBOV/zdzUY2xrMJQXzI0IEkvN/+2pI4ey/QZxeSkjgXz7feTWPaxAt8OjBTrxrla9Jdb2Fk5ZEk1rkvRTp0MDj37VfxMuh4PgdPfsUxMl5ssYg9dTOn8a9G8rEwbcpgXUo0UQIFglyhZdD X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:SJ0PR11MB5120.namprd11.prod.outlook.com;PTR:;CAT:NONE;SFS:(4636009)(366004)(346002)(396003)(39850400004)(376002)(136003)(31696002)(44832011)(83380400001)(5660300002)(16526019)(66476007)(66556008)(26005)(186003)(6916009)(8676002)(86362001)(53546011)(31686004)(36756003)(66946007)(8936002)(316002)(16576012)(478600001)(2906002)(52116002)(6486002)(956004)(4326008)(2616005)(43740500002);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData: s1tb4I1kS1RLKqck9gahKQQmeysEfWSajpxDoYQseTviZMuOLEbksq3mG9lXaDJmugCzkfXaX34rcc4kJUfdbud18P/ElYZi+wMmcjT9fM2oolLi3gg3flS4veJm03Ufy8BMdwKrB6uVRhUZWYN9NPVzem7CqpwWkMEsv2Gb5eloM73vozGOkZpawKRW8EPKYMUqEJz1gSUnjq+j4wDLrIqY2ygCz5DQRU9z97YTHMYb3ypDoyPv8i6sNxcNRjp7Pbx1QPXK3ZKvz6knNYnSVV+Ud5z0fNNoSqPvrWzsLA0cSz7Rbixr7Y5K2/Bu7hwO2qA5Lb3Icq74llX4tBRjF8W3gM8DZHYQmvuz/hfXFaFHRikLParpOld4ze3csE/GgriV2O/z6NV7G1ME9yU0GFZvGeoEbvC3sgrjvEPB3rvm/Nshdy03vpu3bnMM0F5sB/L3zFU6Vd7n/bCJ9U3wrzZiIyalBXyNREYN4k6Ghp7V6aGIgNhkv/VEC/MR1OhQjHoatd4P8NBJEbZne6RjOURKxc7JD5UNQSsNf9FoFBh6TALGaY1En6QD5fDrN1wuOsegBZdAKt7vKwWfqzkmO19seKqb1jqIjJ45f/Nx93rleXPolai6ZMLesvaxW7TISYvYjpkW+kC/JHM6n+QDqGwPMWh3rbHNOwGsa9v9UzvjsyG1AAU5kcbiw2+XhN1lNgJ0pzjbbpe7oZ88Zj7t5+yAYcIT7opJDTRn8jjFfWTq6Avu4KX/i6YQW/TaodcWLwyVkdLGmeX0XYCifA06FY4C/JiFj0Bpqjv0Xnxe12JAAafPyBDOu8ZAAOfnzBqPJQauzuH8mzZZw9hVDWZC2yrIbi8eOVa7jxsP+pPcopyZE8soTy92xhYRjgCGlrYEX5mqnKqaLiVeZtxC4Tv4HA== X-OriginatorOrg: windriver.com X-MS-Exchange-CrossTenant-Network-Message-Id: 0fe19525-a214-4897-77f6-08d8865e48ca X-MS-Exchange-CrossTenant-AuthSource: SJ0PR11MB5120.namprd11.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 11 Nov 2020 16:24:38.0595 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8ddb2873-a1ad-4a18-ae4e-4644631433be X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: LvhzG2RE4zFERZDKK69+p3cy3KMdSu2wsdjkbHV3IgH/1L3t0cTo7hJ4jLtSQCleOvn85TCFzDN3r2834pllG/pWbVmZ7KqHnNIFk9H+wR4= X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB2855 Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On 11/11/2020 9:57 AM, Jan Kara wrote: > On Tue 10-11-20 09:57:39, Chris Friesen wrote: >> Just to be sure, I'm looking for whoever has the BH_Lock bit set on the >> buffer_head "b_state" field, right? I don't see any ownership field the way >> we have for mutexes. Is there some way to find out who would have locked >> the buffer? > > Buffer lock is a bitlock so there's no owner field. If you can reproduce > the problem at will and can use debug kernels, then it's easiest to add > code to lock_buffer() (and fields to struct buffer_head) to track lock > owner and then see who locked the buffer. Without this, the only way is to > check stack traces of all UN processes and see whether some stacktrace > looks suspicious like it could hold the buffer locked (e.g. recursing into > memory allocation and reclaim while holding buffer locked or something like > that)... That's what I thought. :) Debug kernels are doable, but unfortunately we can't (yet) reproduce the problem at will. Naturally it's only shown up in a couple of customer sites so far and not in any test labs. > As Ted wrote the buffer is indeed usually locked because the IO is running > and that would be the expected situation with the jdb2 stacktrace you > posted. So it could also be the IO got stuck somewhere in the block layer > or NVME (frankly, AFAIR NVME was pretty rudimentary with 3.10). To see > whether that's the case, you need to find 'bio' pointing to the buffer_head > (through bi_private field), possibly also struct request for that bio and see > what state they are in... Again, if you can run debug kernels, you can > write code to simplify this search for you... Thanks, that's helpful. Chris