Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp1135972ybv; Thu, 20 Feb 2020 14:01:07 -0800 (PST) X-Google-Smtp-Source: APXvYqyRpTO/YCyAb8mka0t43j2sOZF9Y5DSzQl/WLrzDop6sZ4DcHEDGDsr1B708MiCjWiNwscY X-Received: by 2002:aca:44c1:: with SMTP id r184mr3522162oia.151.1582236067773; Thu, 20 Feb 2020 14:01:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582236067; cv=none; d=google.com; s=arc-20160816; b=lI7hlgqOlAK4kBaCi9m6NVpI5SwliVah4d5kruiFXApwnRpvxGVSDPXBwysRJQDn+f 3gb0NWy2H3I36XoR6Vu9JEy9HBQac3gSNSTy46iQSkaeLtl9Al7nEV4VD5Auamzp2L1P mfuxHhCMpq4H+BGUhpuG71NflRHO6wv5tt8IOqhxmf3X0UkK2MH7zA3Szny6aRhIWKMa e6bvDTm0rKZW+5vXwJ/d9GdUwy0mfURRL+HVGRWQXAQHgeZ6h8BlamVanxT2YIndfhoz AMJC2+a4cj2DRdyJx7VNWsw3G98Lodp6+DBZcdT6oA1gQfeUu3Q3qjhLwHaEfXFX2azn YSWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=4DEb+MFtlfOkYpWfs64nM33/4kdQBlL8UrNBhh3ectk=; b=ZCfxr29u9wAn4og02a6NdAQ92L4PwHIfU7Q0ya3K8NYGbqO2tIWrmFDgFM5VuDGy4E RoOTitG2ykUkVcLxxGvFGxXsVgWAeU9yqsgjXxc3iKcAFpzKFcARK4RQOkRu25o+4ss8 6WTClWcsFOsoUL52WSG0SCQCqeWz+MnUYh6eIzgsOGfT/68o1NtdwkhnEPdHRQrKtx4k bar25f/Nq6clHYhEhHTr4Q7h2kDxNb0u1euoaUh9S29+WSx2To+Kh4xWVYFLAdg3h018 FswHd3kqWO4/ejeuofnffxUUgiCgVEpf/Rkxi/K1d9D1ycGHUDcZ6hrg55qN3yfpRx+5 nMOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cmpxchg-org.20150623.gappssmtp.com header.s=20150623 header.b=Kmj9oigm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d83si306690oib.166.2020.02.20.14.00.54; Thu, 20 Feb 2020 14:01:07 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@cmpxchg-org.20150623.gappssmtp.com header.s=20150623 header.b=Kmj9oigm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729145AbgBTWAN (ORCPT + 99 others); Thu, 20 Feb 2020 17:00:13 -0500 Received: from mail-qt1-f196.google.com ([209.85.160.196]:42815 "EHLO mail-qt1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729090AbgBTWAM (ORCPT ); Thu, 20 Feb 2020 17:00:12 -0500 Received: by mail-qt1-f196.google.com with SMTP id r5so4053788qtt.9 for ; Thu, 20 Feb 2020 14:00:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=4DEb+MFtlfOkYpWfs64nM33/4kdQBlL8UrNBhh3ectk=; b=Kmj9oigm0jMnd4HOhL2kj5adwwG8XTmckWqG+btnWkfj+uDXxwZkdqrxOQ7PGHF76d UQIhwBsksdU7D+Tr0CtIQnWeoDw2S0FzAeaOZkKFaGQjh+4tG+7lcFKp1TJdyxwnWBvz 0xMlnq2i02KkLLLUS3e6GXooE84fiTc0bIa0MWZiFa5PTTiLv4F1ka6+J2bubbPVkwSF irbntREuc6iemK8X0WglElHqFXvXyME57ic83jBhi08/Z35rZMunYsQcPNE+C4gQG3TA oGV2OVyd97f+UK61/rT8CudV1HpMf+Y52tCRrkdnvwvKDGVKpsp7LtK+f5lf8R+fTbSD 4wUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=4DEb+MFtlfOkYpWfs64nM33/4kdQBlL8UrNBhh3ectk=; b=UZewUlI8LI3S+7bkHHmsrZgE8fQOjf6x/ZsO0Ts7bRrAb3ZLVNXqP8hyV0rYj18Z+8 P/AL/cJIXeaZN6I+Ppo0m3/eQTillTXyuIzHzI/kLRlgP4tYaIslgEKoZAdE8V/WvV8T qMDyag0xFz4O5yAduouP59TeMkFxCIyM0muBRW1IuY48LAs1yUf1/x2FeLgwjcufq2qL Jn74naX8l/F2o8jPQja2dmIdCgbBD274iTIQJVB+CpgYASwiqGQ08my8qHqz1sPwVqym EfKCUHjwa/50W52SkYA4gnIDKBnhHdMOyofYzmNxBlB+SFHsG3KmvhE8HZGMostAz4Tz +jTw== X-Gm-Message-State: APjAAAWYV1wIo24f6Cy2DcjbTvxsZ9Anm3sgD4nBZGevMy3gFUcqJ0NZ WMQq9OK1HAlFrTcgWIi2y3r+mA== X-Received: by 2002:ac8:72d6:: with SMTP id o22mr28347095qtp.174.1582236011362; Thu, 20 Feb 2020 14:00:11 -0800 (PST) Received: from localhost ([2620:10d:c091:500::1:3504]) by smtp.gmail.com with ESMTPSA id t3sm513004qtc.8.2020.02.20.14.00.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 20 Feb 2020 14:00:10 -0800 (PST) Date: Thu, 20 Feb 2020 17:00:09 -0500 From: Johannes Weiner To: Dan Schatzberg Cc: Jens Axboe , Tejun Heo , Li Zefan , Michal Hocko , Vladimir Davydov , Andrew Morton , Hugh Dickins , Roman Gushchin , Shakeel Butt , Chris Down , Yang Shi , Thomas Gleixner , "open list:BLOCK LAYER" , open list , "open list:CONTROL GROUP (CGROUP)" , "open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)" Subject: Re: [PATCH v3 1/3] loop: Use worker per cgroup instead of kworker Message-ID: <20200220220009.GA68937@cmpxchg.org> References: <118a1bd99d12f1980c7fc01ab732b40ffd8f0537.1582216294.git.schatzberg.dan@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <118a1bd99d12f1980c7fc01ab732b40ffd8f0537.1582216294.git.schatzberg.dan@gmail.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Feb 20, 2020 at 11:51:51AM -0500, Dan Schatzberg wrote: > Existing uses of loop device may have multiple cgroups reading/writing > to the same device. Simply charging resources for I/O to the backing > file could result in priority inversion where one cgroup gets > synchronously blocked, holding up all other I/O to the loop device. > > In order to avoid this priority inversion, we use a single workqueue > where each work item is a "struct loop_worker" which contains a queue of > struct loop_cmds to issue. The loop device maintains a tree mapping blk > css_id -> loop_worker. This allows each cgroup to independently make > forward progress issuing I/O to the backing file. > > There is also a single queue for I/O associated with the rootcg which > can be used in cases of extreme memory shortage where we cannot allocate > a loop_worker. > > The locking for the tree and queues is fairly heavy handed - we acquire > the per-loop-device spinlock any time either is accessed. The existing > implementation serializes all I/O through a single thread anyways, so I > don't believe this is any worse. > > Signed-off-by: Dan Schatzberg FWIW, this looks good to me, please feel free to include: Acked-by: Johannes Weiner I have only some minor style nitpicks (along with the other email I sent earlier on this patch), that would be nice to get fixed: > +static void loop_queue_work(struct loop_device *lo, struct loop_cmd *cmd) > +{ > + struct rb_node **node = &(lo->worker_tree.rb_node), *parent = NULL; > + struct loop_worker *cur_worker, *worker = NULL; > + struct work_struct *work; > + struct list_head *cmd_list; > + > + spin_lock_irq(&lo->lo_lock); > + > + if (!cmd->css) > + goto queue_work; > + > + node = &(lo->worker_tree.rb_node); -> and . are > &, the parentheses aren't necessary. > + while (*node) { > + parent = *node; > + cur_worker = container_of(*node, struct loop_worker, rb_node); > + if ((long)cur_worker->css == (long)cmd->css) { The casts aren't necessary, but they made me doubt myself and look up the types. I wouldn't add them just to be symmetrical with the other arm of the branch. > + worker = cur_worker; > + break; > + } else if ((long)cur_worker->css < (long)cmd->css) { > + node = &((*node)->rb_left); > + } else { > + node = &((*node)->rb_right); The outer parentheses aren't necessary. > + } > + } > + if (worker) > + goto queue_work; > + > + worker = kzalloc(sizeof(struct loop_worker), > + GFP_NOWAIT | __GFP_NOWARN); This fits on an 80 character line.