Date: Wed, 8 May 2013 16:23:01 -0700
From: "Darrick J. Wong"
To: Mike Snitzer
Cc: Joe Thornber, device-mapper development, linux-kernel@vger.kernel.org
Subject: Re: dm-cache not writing out cache metadata at reboot?
Message-ID: <20130508232301.GB8371@blackbox.djwong.org>
References: <20130508214845.GA7729@blackbox.djwong.org> <20130508220526.GA24132@redhat.com>
In-Reply-To: <20130508220526.GA24132@redhat.com>

On Wed, May 08, 2013 at 06:05:26PM -0400, Mike Snitzer wrote:
> On Wed, May 08 2013 at 5:48pm -0400,
> Darrick J. Wong wrote:
>
> > Hi,
> >
> > So I've been watching the hit/miss counters in dmcache and I've noticed a
> > couple of things that look like errors to me:
> >
> > First, I noticed that if I reboot the system, neither cache_postsuspend nor
> > cache_dtr get called. This might simply be expected behavior, but it means
> > that the in-memory superblock structure doesn't get written out to disk upon
> > reboot. Just to be sure, I put a printk into __commit_transaction. It prints
> > out for 'dmsetup info' and 'dmsetup remove' but nothing at reboot.
>
> We don't have reboot notifiers that auto-magically tear down an
> arbitrary DM stack. Typically the device shutdown includes unmounting
> filesystems, stopping LVM (which tears down DM devices, etc).
>
> So given that we don't have any userspace LVM2 support for dm-cache yet
> I'm not surprised by this. In fact it is expected.

Hmm, I wasn't aware that the lvm2 package had any teardown scripts. It doesn't
seem to have any in RHEL5.8 or Ubuntu...

> > Second, cache_status calls dm_cache_commit, which writes out a superblock to
> > the metadata device. However, there's no call to save_stats to copy the
> > current values of the counters out to the disk's copy prior to calling
> > dm_cache_commit. Therefore, we seem to be writing out stale copies of
> > superblock fields.
> >
> > The second one seems fixable with the attached patch
>
> I'll defer to Joe on this but I think sync_metadata() is pretty heavy to
> be doing every 'dmsetup info'. BTW, with just dm_cache_commit() the
> superblock fields aren't stale; only the on-disk hints are.

How often does dmsetup info run? I admit that it becomes slower with the
patch, but I didn't think it was really in anyone's hot path. But given that
there's a comment just prior that says:

/* Commit to ensure statistics aren't out-of-date */

it feels like we ought at least to be calling save_stats() so that we update
the on-disk statistics. Though, given that the metadata size should be about
10MB for a 100GB cache device, I don't mind flushing out 10MB of metadata to
get the device info.

Really the problem is that with both of these complaints active, the superblock
counters and tables /never/ seem to get updated, even across multiple reboots.

(I'm still digging for why I see such weird unreproducible benchmark numbers.)

--D
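
For readers following the thread, here is a toy userspace model of the ordering concern raised in this message. It is only a sketch of the behaviour as described in the report, not the dm-cache code: the class ToyCache, its fields, and the save_first flag are hypothetical, and the method names merely echo save_stats, dm_cache_commit, and cache_status. Whether the real commit path leaves the superblock counters stale is exactly the point under discussion.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    read_hits: int = 0
    read_misses: int = 0


class ToyCache:
    """Hypothetical model: live counters, a staged superblock copy, and disk."""

    def __init__(self):
        self.live = Stats()     # counters bumped on every I/O
        self.staged = Stats()   # in-memory superblock image
        self.on_disk = Stats()  # what would be read back after a reboot

    def record_read(self, hit):
        if hit:
            self.live.read_hits += 1
        else:
            self.live.read_misses += 1

    def save_stats(self):
        # Copy the live counters into the staged superblock image.
        self.staged = Stats(self.live.read_hits, self.live.read_misses)

    def commit(self):
        # Write whatever is currently staged; does not look at live counters.
        self.on_disk = Stats(self.staged.read_hits, self.staged.read_misses)

    def status(self, save_first):
        # save_first=False models "commit without save_stats".
        if save_first:
            self.save_stats()
        self.commit()
        return self.live


if __name__ == "__main__":
    cache = ToyCache()
    cache.record_read(True)
    cache.record_read(False)
    cache.status(save_first=False)
    print("commit only:      ", cache.on_disk)
    cache.status(save_first=True)
    print("save_stats+commit:", cache.on_disk)
```

In this model, a status call that commits without first copying the counters writes the old staged values, which matches the symptom reported above; calling save_stats first is the cheap part of the proposed change, separate from the heavier full metadata sync.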