From: Amir Goldstein
Date: Fri, 15 Oct 2021 11:38:08 +0300
Subject: Re: [PATCH v7 00/28] file system-wide error monitoring
To: Gabriel Krisman Bertazi
Cc: Jan Kara, "Darrick J. Wong", Theodore Tso, David Howells, Khazhismel Kumykov, linux-fsdevel, Ext4, Linux API, Matthew Bobrowski, kernel@collabora.com, Dave Chinner
In-Reply-To: <20211014213646.1139469-1-krisman@collabora.com>

On Fri, Oct 15, 2021 at 12:37 AM Gabriel Krisman Bertazi wrote:
>
> Hi,
>
> This attempts to get the ball rolling again for the FAN_FS_ERROR. This
> version is slightly different from the previous approaches, since it uses
> mempool for memory allocation, as suggested by Jan. It has the
> advantage of simplifying a lot the enqueue/dequeue, which is now much
> more similar to other event types, but it also means the guarantee that
> an error event will be available is diminished.

Makes me very happy not having to worry about new enqueue/dequeue bugs :)

>
> The way we propagate superblock errors also changed. Now we use
> FILEID_ROOT internally, and mangle it prior to copy_to_user.
>
> I am no longer sure how to guarantee that at least one mempool slot will
> be available for each filesystem. Since we are now tying the pool to
> the entire group, a stream of errors in a single file system might
> prevent others from emitting an error. The possibility of this is
> reduced since we merge errors to the same filesystem, but it is still
> possible that they occur during the small window where the event is
> dequeued and before it is freed, in which case another filesystem might
> not be able to obtain a slot.

Double buffering. Each mark/fs should have one slot reserved for enqueue
and one reserved for copying the event to user.

>
> I'm also creating a pool of 32 entries initially to avoid spending too
> much memory. This means that only 32 filesystems can be watched per
> group with the FAN_FS_ERROR mark, before fanotify_mark starts returning
> ENOMEM.

I don't see a problem with growing the pool dynamically up to a reasonable
size, although it is a shame that the pool is not accounted to the group's
memcg (I think?).

Overall, the series looks very good to me, modulo the above comments about
the mempool size/resize and a few minor implementation details.

Good job!

Thanks,
Amir.
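
To illustrate the double-buffering suggestion above, here is a rough
user-space sketch in C. It is not kernel code and not taken from the patch
series: struct fs_mark, report_error(), dequeue_event(), free_event() and
demo_no_starvation() are all hypothetical names, and the model assumes a
single reader that copies at most one event per filesystem at a time. The
idea it shows is that with two slots reserved per mark, an error arriving
while the previous event is still being copied to user lands in the second
slot, so one noisy filesystem cannot take a slot away from another.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: names below are illustrative, not from the series. */
enum slot_state { SLOT_FREE, SLOT_QUEUED, SLOT_COPYING };

struct fs_mark {
	enum slot_state slot[2];   /* two slots reserved per watched fs */
	unsigned int err_count[2]; /* errors merged into each slot */
};

/* Report an error: merge into a queued slot, otherwise claim a free one. */
bool report_error(struct fs_mark *m)
{
	for (int i = 0; i < 2; i++) {
		if (m->slot[i] == SLOT_QUEUED) {
			m->err_count[i]++;
			return true;
		}
	}
	for (int i = 0; i < 2; i++) {
		if (m->slot[i] == SLOT_FREE) {
			m->slot[i] = SLOT_QUEUED;
			m->err_count[i] = 1;
			return true;
		}
	}
	/* Not reached while at most one slot is in SLOT_COPYING. */
	return false;
}

/* Reader takes the queued event; returns the slot index, or -1 if none. */
int dequeue_event(struct fs_mark *m)
{
	for (int i = 0; i < 2; i++) {
		if (m->slot[i] == SLOT_QUEUED) {
			m->slot[i] = SLOT_COPYING;
			return i;
		}
	}
	return -1;
}

/* Reader finished copying the event to user: release the slot. */
void free_event(struct fs_mark *m, int i)
{
	assert(m->slot[i] == SLOT_COPYING);
	m->slot[i] = SLOT_FREE;
	m->err_count[i] = 0;
}

/*
 * Flood filesystem "a" during the window between dequeue and free, then
 * check that filesystem "b" can still report. Returns 1 on success.
 */
int demo_no_starvation(void)
{
	struct fs_mark a = {0}, b = {0};
	int copying;

	if (!report_error(&a))
		return 0;
	copying = dequeue_event(&a);      /* slot is now mid-copy */
	for (int i = 0; i < 100; i++) {
		if (!report_error(&a))    /* merges into the second slot */
			return 0;
	}
	if (!report_error(&b))            /* b owns its slots: unaffected */
		return 0;
	free_event(&a, copying);
	return a.err_count[1] == 100 && b.err_count[0] == 1;
}
```

The cost of this scheme is memory: two slots per mark instead of a shared
pool, which is the trade-off against the 32-entry pool discussed above.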