Date: Sat, 3 Jul 2021 04:31:03 -0400
From: "Michael S. Tsirkin"
To: He Zhe
Cc: xieyongji@bytedance.com, jasowang@redhat.com, stefanha@redhat.com,
	sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org,
	christian.brauner@canonical.com, rdunlap@infradead.org,
	willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk,
	bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com,
	dan.carpenter@oracle.com, gregkh@linuxfoundation.org,
	songmuchun@bytedance.com, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	qiang.zhang@windriver.com
Subject: Re: [PATCH] eventfd: Enlarge recursion limit to allow vhost to work
Message-ID: <20210703043039-mutt-send-email-mst@kernel.org>
References: <20210618084412.18257-1-zhe.he@windriver.com>
In-Reply-To: <20210618084412.18257-1-zhe.he@windriver.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 18, 2021 at 04:44:12PM +0800, He Zhe wrote:
> commit b5e683d5cab8 ("eventfd: track eventfd_signal() recursion depth")
> introduces a percpu counter that tracks the percpu recursion depth and
> warn if it greater than zero, to avoid potential deadlock and stack
> overflow.
>
> However sometimes different eventfds may be used in parallel. Specifically,
> when heavy network load goes through kvm and vhost, working as below, it
> would trigger the following call trace.
>
> -  100.00%
>    - 66.51%
>         ret_from_fork
>         kthread
>       - vhost_worker
>          - 33.47% handle_tx_kick
>               handle_tx
>               handle_tx_copy
>               vhost_tx_batch.isra.0
>               vhost_add_used_and_signal_n
>               eventfd_signal
>          - 33.05% handle_rx_net
>               handle_rx
>               vhost_add_used_and_signal_n
>               eventfd_signal
>    - 33.49%
>         ioctl
>         entry_SYSCALL_64_after_hwframe
>         do_syscall_64
>         __x64_sys_ioctl
>         ksys_ioctl
>         do_vfs_ioctl
>         kvm_vcpu_ioctl
>         kvm_arch_vcpu_ioctl_run
>         vmx_handle_exit
>         handle_ept_misconfig
>         kvm_io_bus_write
>         __kvm_io_bus_write
>         eventfd_signal
>
> 001: WARNING: CPU: 1 PID: 1503 at fs/eventfd.c:73 eventfd_signal+0x85/0xa0
> ---- snip ----
> 001: Call Trace:
> 001: vhost_signal+0x15e/0x1b0 [vhost]
> 001: vhost_add_used_and_signal_n+0x2b/0x40 [vhost]
> 001: handle_rx+0xb9/0x900 [vhost_net]
> 001: handle_rx_net+0x15/0x20 [vhost_net]
> 001: vhost_worker+0xbe/0x120 [vhost]
> 001: kthread+0x106/0x140
> 001: ? log_used.part.0+0x20/0x20 [vhost]
> 001: ? kthread_park+0x90/0x90
> 001: ret_from_fork+0x35/0x40
> 001: ---[ end trace 0000000000000003 ]---
>
> This patch enlarges the limit to 1 which is the maximum recursion depth we
> have found so far.
>
> The credit of modification for eventfd_signal_count goes to
> Xie Yongji
>

And maybe:

Fixes: b5e683d5cab8 ("eventfd: track eventfd_signal() recursion depth")

who's merging this?

> Signed-off-by: He Zhe
> ---
>  fs/eventfd.c            | 3 ++-
>  include/linux/eventfd.h | 5 ++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/eventfd.c b/fs/eventfd.c
> index e265b6dd4f34..add6af91cacf 100644
> --- a/fs/eventfd.c
> +++ b/fs/eventfd.c
> @@ -71,7 +71,8 @@ __u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
>  	 * it returns true, the eventfd_signal() call should be deferred to a
>  	 * safe context.
>  	 */
> -	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count)))
> +	if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count) >
> +	    EFD_WAKE_COUNT_MAX))
>  		return 0;
>
>  	spin_lock_irqsave(&ctx->wqh.lock, flags);
> diff --git a/include/linux/eventfd.h b/include/linux/eventfd.h
> index fa0a524baed0..74be152ebe87 100644
> --- a/include/linux/eventfd.h
> +++ b/include/linux/eventfd.h
> @@ -29,6 +29,9 @@
>  #define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK)
>  #define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE)
>
> +/* This is the maximum recursion depth we find so far */
> +#define EFD_WAKE_COUNT_MAX 1
> +
>  struct eventfd_ctx;
>  struct file;
>
> @@ -47,7 +50,7 @@ DECLARE_PER_CPU(int, eventfd_wake_count);
>
>  static inline bool eventfd_signal_count(void)
>  {
> -	return this_cpu_read(eventfd_wake_count);
> +	return this_cpu_read(eventfd_wake_count) > EFD_WAKE_COUNT_MAX;
>  }
>
>  #else /* CONFIG_EVENTFD */
> --
> 2.17.1
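
For anyone following the thread, here is a rough userspace sketch of the depth check the patch changes. It is only an illustration under stated assumptions, not the kernel code: WAKE_COUNT_MAX, wake_count, struct event, signal_event() and forward_wake() are hypothetical stand-ins, a thread-local variable replaces the per-CPU eventfd_wake_count, and there is no locking or wait queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for EFD_WAKE_COUNT_MAX from the patch. */
#define WAKE_COUNT_MAX 1

/* Stand-in for the per-CPU eventfd_wake_count; thread-local in this sketch. */
static _Thread_local int wake_count;

struct event;
typedef void (*wake_fn)(struct event *self);

/* Hypothetical event: a counter plus one optional wakeup callback. */
struct event {
	unsigned long long count;
	wake_fn on_wake;       /* runs while wake_count is raised */
	struct event *chained; /* event the callback signals, if any */
	bool nested_refused;   /* set when a chained signal was rejected */
};

/*
 * Add n to ev->count and run its callback. Returns n on success, or 0 when
 * the nesting depth on this thread already exceeds WAKE_COUNT_MAX.
 */
unsigned long long signal_event(struct event *ev, unsigned long long n)
{
	if (wake_count > WAKE_COUNT_MAX)
		return 0; /* the kernel check would WARN_ON_ONCE() here */

	wake_count++;
	ev->count += n;
	if (ev->on_wake)
		ev->on_wake(ev);
	assert(wake_count > 0); /* increments and decrements must pair up */
	wake_count--;
	return n;
}

/* Callback that signals the chained event from inside a wakeup. */
void forward_wake(struct event *self)
{
	if (self->chained && signal_event(self->chained, 1) == 0)
		self->nested_refused = true;
}
```

With three events chained a -> b -> c through forward_wake(), signalling a succeeds, the nested signal on b is accepted because the depth is 1 at that point, and the signal on c is refused because the depth has reached 2. With the limit at 0, as in the check before the patch, the signal on b would already be refused.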