Date: Tue, 14 Jul 2020 15:14:13 +0200
From: Jan Kara
To: Francesco Ruggeri
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, amir73il@gmail.com, jack@suse.cz
Subject: Re: soft lockup in fanotify_read
Message-ID: <20200714131413.GJ23073@quack2.suse.cz>
In-Reply-To: <20200714025417.A25EB95C0339@us180.sjc.aristanetworks.com>
References: <20200714025417.A25EB95C0339@us180.sjc.aristanetworks.com>

Hello!

On Mon 13-07-20 19:54:17, Francesco Ruggeri wrote:
> We are getting this soft lockup in fanotify_read.
> The reason is that this code does not seem to scale to cases where there
> are big bursts of events generated by fanotify_handle_event.
> fanotify_read acquires group->notification_lock for each event.
> fanotify_handle_event uses the lock to add one event, which also involves
> fanotify_merge, which scans the whole list trying to find an event to
> merge the new one with.
> In our case fanotify_read is invoked with a buffer big enough for 200
> events, and what happens is that every time fanotify_read dequeues an
> event and releases the lock, fanotify_handle_event adds several more,
> scanning a longer and longer list. This causes fanotify_read to wait
> longer and longer for the lock, and the soft lockup happens before
> fanotify_read can reach 200 events.
> Is it intentional for fanotify_read to acquire the lock for each event,
> rather than batching together a user buffer worth of events?

Thanks for the report and the analysis. I agree that what you describe is
possible. The locking itself is actually fine, I think, but you are correct
that the merging logic is not ideal and may be too slow for large numbers
of queued events. We were already discussing with Amir how to speed it up,
but we did not end up doing anything yet since the issue was not really
pressing.

WRT fanotify_read() removing events from the list in batches: that is
certainly one possible optimization, but (especially with the recent
extensions to the fanotify interface) it is difficult to tell how many
events will actually fit in the provided buffer, so we would have to
provide a way to push events back onto the event queue, which may get a bit
tricky. And as I wrote above, I think the real problem is actually the
fanotify merge logic, which ends up holding the notification_lock for too
long. But we may want to add cond_resched() to the loop in fanotify_read(),
as that can currently take time proportional to the user-provided buffer
size, which can be a lot.
Adding cond_resched() will probably silence the soft lockup for you as well
(although it does not really fix the underlying issue). We'll have a look
at what we can do about this :)

								Honza
-- 
Jan Kara
SUSE Labs, CR