2011-02-07 04:16:03

by Dan Magenheimer

Subject: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

(Historical note: This "new" zcache patchset supersedes both the
kztmem patchset and the "old" zcache patchset as described in:
http://lkml.org/lkml/2011/2/5/148)

HIGH LEVEL OVERVIEW

Zcache can roughly double RAM efficiency on some workloads while
providing a significant performance boost on many workloads.

Summary for kernel DEVELOPERS: Zcache uses lzo1x compression to increase
RAM efficiency for both page cache and swap resulting in a significant
performance increase (3-4% or more) on memory-pressured workloads
due to a large reduction in disk I/O. To do this, zcache uses an
in-kernel (no virtualization required) implementation of transcendent
memory ("tmem"), which has other proven uses and intriguing future uses
as well.

Summary for kernel MAINTAINERS: Zcache is a fully-functional,
in-kernel (non-virtualization) implementation of transcendent memory
("tmem"), providing an in-kernel user for cleancache and frontswap.
The patch is based on 2.6.37 and requires either the cleancache patch
or the frontswap patch or both. The patch is proposed as a staging
driver to obtain broader exposure for further evolution and,
GregKH-willing, is merge-able at the next opportunity. Zcache will
hopefully also, Linus-and-akpm-willing, remove the barrier to merge-ability
for cleancache and frontswap. Please note that there is a dependency
on xvmalloc.[ch], currently in drivers/staging/zram.

For kernel USERS seeking a new toy: Want to try it out? A complete
monolithic patch for 2.6.37 including zcache, cleancache, and frontswap
can be downloaded at:
http://oss.oracle.com/projects/tmem/dist/files/zcache/zcache-linux-2.6.37-110205.patch
(IMPORTANT NOTE: zcache MUST be specified as a kernel boot parameter or
nothing happens!) And if you love to see tons of detailed statistics
changing dynamically, try running the following bash script in a big window
with "watch -d":
http://oss.oracle.com/projects/tmem/dist/files/zcache/zcache-stats

VERSION HISTORY

Version 2 is a bit more restrictive of concurrency (disabling irqs
in gets and flushes) and fixes a build problem reported by gregkh.

Version 1 changed considerably from V0 thanks to some excellent feedback
from Jeremy Fitzhardinge.

Feedback from others would be greatly appreciated. See "SPECIFIC
REQUESTED AREAS FOR ADVICE/FEEDBACK" below.

"ACADEMIC" OVERVIEW

The objective of all of this code (including previously posted
cleancache and frontswap patches) is to provide a mechanism
by which the kernel can store a potentially huge amount of
certain kinds of page-oriented data so that it (the kernel)
can be more flexible, dynamic, and/or efficient in the amount
of directly-addressable RAM that it uses with little or no loss
of performance and, on some workloads and configurations, even a
substantial increase in performance.

The data store for this page-oriented data, called "page-addressable
memory", or "PAM", is assumed to be cheaper, slower, more plentiful,
and/or more idiosyncratic than RAM, but faster, more expensive, and/or
scarcer than disk.
Data in this store is page-addressable only, not byte-addressable,
which increases flexibility for the methods by which the
data can be stored, for example allowing for compression and
efficient deduplication. Further, the number of pages that
can be stored is entirely dynamic, which allows for multiple
independent data sources to share PAM resources effectively
and securely.

Cleancache and frontswap are data sources for two types of this
page-oriented data: "ephemeral pages" such as clean page cache
pages that can be recovered elsewhere if necessary (e.g. from
disk); and "persistent" pages which are dirty pages that need
a short-term home to survive a brief RAM utilization spike but
need not be permanently saved to survive a reboot (e.g. swap).
The data source "puts" and "gets" pages and is also responsible
for directing coherency, via explicit "flushes" of pages and
related-groups of pages called "objects".

Transcendent memory, or "tmem", is a clean API/ABI that provides
for an efficient address translation layer and a set of highly
concurrent access methods to copy data between the data source
and the PAM data store. The first tmem implementation is in Xen.
This second tmem implementation is in-kernel (no virtualization
required) but is designed to be easily extensible for KVM or
possibly for cgroups.

A PAM data store must be fast enough to be accessed synchronously
since, when a put/get/flush is invoked by a data source, the
data transfer or invalidation is assumed to be completed on return.
The first PAM is implemented as a secure pool of Xen hypervisor memory
to allow highly-dynamic memory load balancing between guests.
This second PAM implementation uses in-kernel compression to roughly
halve RAM requirements for some workloads. Future proposed PAM
possibilities include: fast NVRAM, memory blades, far-far NUMA.
The clean layering provided here should simplify the implementation
of these future PAM data stores for Linux.
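The put/get/flush semantics described above can be sketched in a few
lines of userspace Python. This is only an illustration, not the code in
tmem.c: the class name TmemPool, the max_pages limit, the
oldest-first eviction, and the use of zlib (standing in for lzo1x, which
the Python standard library does not provide) are all assumptions made
for the sketch. It shows the one distinction the overview relies on: an
ephemeral pool may silently drop pages, while a persistent pool must
refuse a put rather than lose data.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for this sketch


class TmemPool:
    """Toy page-addressable store keyed by (object id, page index)."""

    def __init__(self, ephemeral, max_pages=None):
        self.ephemeral = ephemeral  # True: pages may be dropped at any time
        self.max_pages = max_pages  # hypothetical capacity limit
        self.pages = {}             # (oid, index) -> compressed page

    def put(self, oid, index, data):
        """Copy one whole page in; return False if the store declines it."""
        if len(data) != PAGE_SIZE:
            raise ValueError("store is page-addressable, not byte-addressable")
        key = (oid, index)
        full = (self.max_pages is not None
                and key not in self.pages
                and len(self.pages) >= self.max_pages)
        if full:
            if not self.ephemeral or not self.pages:
                return False  # persistent data is never dropped to make room
            oldest = next(iter(self.pages))  # dicts keep insertion order
            del self.pages[oldest]
        self.pages[key] = zlib.compress(data)
        return True

    def get(self, oid, index):
        """Return the page, or None if it was never stored or was dropped."""
        blob = self.pages.get((oid, index))
        return None if blob is None else zlib.decompress(blob)

    def flush_page(self, oid, index):
        """Invalidate one page; the data source calls this for coherency."""
        self.pages.pop((oid, index), None)

    def flush_object(self, oid):
        """Invalidate every page belonging to one object."""
        for key in [k for k in self.pages if k[0] == oid]:
            del self.pages[key]
```

Every call completes synchronously, which matches the requirement that a
put, get, or flush is finished when it returns. A caller of an ephemeral
pool has to treat a None from get() as a normal miss and fall back to
disk.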

THIS PATCHSET

(NOTE: use requires cleancache and/or frontswap patches!)

This patchset provides an in-kernel implementation of transcendent
memory ("tmem") [1] and a PAM implementation where pages are compressed
and kept in kernel space (i.e. no virtualization, neither Xen nor KVM,
is required).

This patch is fully functional, but will benefit from some tuning and
some "policy" implementation. It demonstrates an in-kernel user for
the cleancache and frontswap patches [2,3] and, in many ways,
supplements/replaces the zram and "old" zcache patches [4,5] with a
more dynamic mechanism. Though some or all of this code may eventually
belong in mm or lib, this patch places it with staging drivers
so it can obtain exposure as its usage evolves.

The in-kernel transcendent memory implementation (see tmem.c)
conforms to the same ABI as the Xen tmem shim [6] but also provides
a generic interface to be used by one or more page-addressable
memory ("PAM") [7] implementations. This generic tmem code is
also designed to support multiple "clients", so should be easily
adaptable for KVM or possibly cgroups, allowing multiple guests
to more efficiently "timeshare" physical memory.

Zcache (see zcache.c) provides both "host" services (setup and
core memory allocation) for a single client for the generic tmem
code plus two different PAM implementations:

A. "compression buddies" ("zbud") which mates compression with a
shrinker interface to store ephemeral pages so they can be
easily reclaimed; compressed pages are paired and stored in
a physical page, resulting in higher internal fragmentation
B. a shim to xvMalloc [8] which is more space-efficient but
less receptive to page reclamation, so is fine for persistent
pages

Both of these use lzo1x compression (see linux/lib/lzo/*).
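The "compression buddies" idea in (A) can be illustrated with a small
userspace sketch. It is not taken from zcache.c: the class name
ZbudFrames, the first-fit search, and the (frame, slot) handle are
assumptions, and zlib again stands in for lzo1x. Each frame models one
physical page that holds at most two compressed pages, which is what
makes a frame easy to reclaim and also why some space inside each frame
is wasted.

```python
import zlib

PAGE_SIZE = 4096  # assumed size of one physical page frame


class ZbudFrames:
    """Toy model: each frame holds at most two compressed pages."""

    def __init__(self):
        self.frames = []  # each frame is a list of one or two blobs

    def store(self, page):
        """Compress a page and return a (frame, slot) handle, or None."""
        blob = zlib.compress(page)
        if len(blob) >= PAGE_SIZE:
            return None  # page did not compress; storing it gains nothing
        # First fit: look for a frame with one buddy and enough room left.
        for i, frame in enumerate(self.frames):
            if len(frame) == 1 and len(frame[0]) + len(blob) <= PAGE_SIZE:
                frame.append(blob)
                return (i, 1)
        # No suitable partner found, so start a new frame.
        self.frames.append([blob])
        return (len(self.frames) - 1, 0)

    def load(self, handle):
        """Decompress and return the page behind a handle from store()."""
        i, slot = handle
        return zlib.decompress(self.frames[i][slot])
```

Because a frame never holds more than two pages, the best case is two
pages per frame even when both compress to a few bytes. That is the
internal fragmentation mentioned in (A), traded for simple reclaim.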

IMHO, it should be relatively easy to plug in other PAM implementations,
such as: PRAM [9], disaggregated memory [10], or far-far NUMA.

References:
[1] http://oss.oracle.com/projects/tmem
[2] http://lkml.org/lkml/2010/9/3/383
[3] https://lkml.org/lkml/2010/9/22/337
[4] http://lkml.org/lkml/2010/8/9/226
[5] http://lkml.org/lkml/2010/7/16/161
[6] http://lkml.org/lkml/2010/9/3/405
[7] http://marc.info/?l=linux-mm&m=127811271605009
[8] http://code.google.com/p/compcache/wiki/xvMalloc
[9] http://www.linuxsymposium.org/2010/view_abstract.php?content_kty=35
[10] http://www.eecs.umich.edu/~tnm/trev_test/dissertationsPDF/kevinL.pdf

SPECIFIC REQUESTED AREAS FOR ADVICE/FEEDBACK

1. Some debugging code and extensive sysfs entries have been left in
place for this patch so its activity can be easily monitored. We welcome
other developers to play with it.
2. Little policy is in place (yet) to limit zcache from eventually
absorbing all free memory for compressed frontswap pages or
(if the shrinker isn't "fast enough") compressed cleancache
pages. On some workloads and some memory sizes, this eventually
results in OOMs. (In my testing, the OOM'ing is not worse, just
different.) We'd appreciate feedback on, or patches that try
out, various policies.
3. We've studied the GFP flags but are still not fully clear on the best
combination to use with zcache memory allocation. In particular,
we think "timid" GFP choices result in a lower hit rate, while using
GFP_ATOMIC might be considered rude, but results in a higher hit
rate and may be fine for this usage. We'd appreciate guidance on this.
4. We think we have the irq/softirq/preemption code correct but we're
definitely not expert in this area, so review would be appreciated.
5. Cleancache works best when the "clean working set" is larger
than the active file cache, but smaller than the memory available
for cleancache store. This scenario can be difficult to duplicate
in a kernel with fixed RAM size. For best results, zcache may benefit
from tuning changes to file cache parameters.
6. Benchmarking: Theoretically, zcache should have a negligible
worst case performance loss and a substantial best case performance
gain. Older processors may show a bigger worst case hit. We'd
appreciate any help running workloads on different boxes to better
characterize worst case and best case performance.

Signed-off-by: Dan Magenheimer <[email protected]>
Signed-off-by: Nitin Gupta <[email protected]>

drivers/staging/Kconfig | 2
drivers/staging/Makefile | 1
drivers/staging/zcache/Kconfig | 13
drivers/staging/zcache/Makefile | 1
drivers/staging/zcache/tmem.c | 710 +++++++++++++++++
drivers/staging/zcache/tmem.h | 195 ++++
drivers/staging/zcache/zcache.c | 1657 ++++++++++++++++++++++++++++++++++++++++
7 files changed, 2579 insertions(+)


2011-02-09 01:05:40

by Dan Magenheimer

Subject: RE: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

> (Historical note: This "new" zcache patchset supercedes both the
> kztmem patchset and the "old" zcache patchset as described in:
> http://lkml.org/lkml/2011/2/5/148)

(In order to move discussion from the old kztmem patchset to
the new zcache patchset, I am replying here to Matt's email
sent at: https://lkml.org/lkml/2011/2/4/199 )

> From: Matt [mailto:[email protected]]

Hi Matt --

Thanks for all the thoughtful work and questions! Sorry it
took me a few days to reply...

> This finally makes Cleancache's functionality usable for desktop and
> other small device (non-enterprise) users (especially regarding
> frontswap) :)

> 2) feedback
>
> WARNING: at kernel/softirq.c:159 local_bh_enable+0xba/0x110()

These should be gone in V2.

> I also observed that it takes some time until volumes (which use
> kztmem's ephemeral nodes) are unmounted - probably due to emptying
> slub/slab taking longer - so this should be normal.

If "some time" becomes a problem, I have a design in my
head how to fix this. But I'll consider it lower priority
for now.

> 2.2) a user (32bit box) who's running a pretty similar kernel to mine
> (details later) has had some assert_spinlocks thrown while

The specific sequence of asserts indicates a race, but I think
a harmless one. I haven't been able to reproduce it and stared
at various race possibilities for a couple of hours without
luck. (Aha! There it is! Oops, no that's not it. Repeat.)
Hopefully getting broader exposure to more experienced
kernel developers will help find/fix this one.

> 2.3) rsync-operations seemed to speed up quite noticably to say the
> least (significantly)
> :
> so job (2) could be cut by 1-2 minutes. Unmounting the drive/partition
> :
> So kztmem also seems to help where low latency needs to be met, e.g.
> pro-audio.
> :
> So productivity is improved quite a lot.

Thanks for running some performance tests on a broader set of
test cases! The numbers look very nice!

> Questions:
> • What exactly is kztmem?
> ∘ is it a tmem similar functionality like provided in the project
> "Xen's Transcent Memory"
> ∘ and zmem is simply a "plugin" for memory compression support to tmem
> ? (is that what zcache does ?)
> • so simplified (superficially without taking into account advantages
> or certain unique characteristics) some equivalents:
> ∘ frontswap == ramzswap
> ∘ kztmem == zcache
> ∘ cleancache == is the "core", "mastermind" or "hypervisor" behind all
> this, making frontswap and kztmem kind of "plugins" for it ?

This is best described in the "Academic Overview" section
of PATCH V2 0/3: https://lkml.org/lkml/2011/2/6/346
Cleancache and frontswap are "data sources" for page-oriented
data that can easily be stored in "transcendent memory"
(aka "tmem"). Once pages of data are accessible only via tmem,
lots of things can be done to the data, including compression,
deduplication, being sent to the hypervisor, etc.

> So kztmem (or more accurately: cleancache) is open for adding more
> functionality in the future ?

Very definitely... I'm working on another interesting use
model right now!

> • What are advantages of kztmem compared to ramzswap ("compcache") &
> zcache ? From what I understood - it's more dynamic in it's nature
> than compcache & zcache: they need to preallocate predetermined amount
> of memory, several "ram-drives" would be needed for SMP-scalability
> ∘ whereas this (pre-allocated RAM and multiple "ram-drives" aren't
> needed for kztmem, cleancache and frontswap since cleancache,
> frontswap & kztmem are concurrency-safe and dynamic (according to
> documentation) ?

Yes, that's a good overview of the differences.

> • Coming back to usage of compcache - how about the problem of 60%
> memory fragmentation (according to compcache/zcache wiki,
> http://code.google.com/p/compcache/wiki/Fragmentation) ?
> Could the situation be improved with in-kernel "memory compaction" ?
> I'm not a developer so I don't know exactly how lumpy reclaim/memory
> compaction and xvmalloc would interact with each other

Nitin is the expert on compcache and xvmalloc, so I will leave
this question unanswered for now.

> • According to the Documentation you posted "e.g. a ram-based FS such
> as tmpfs should not enable cleancache" - so it's not using block i/o
> layer ? what are the performance or other advantages of that approach
> ?

Correct, no block i/o layer involved. The block i/o layer is
optimized for disks (though it is slowly becoming adapted to
faster devices). The real "advantage" is that EVERY put/get
has immediate feedback and this is very important to making
things as dynamic as possible.

> • Is there support for XFS or reiserfs - how difficult would it be to
> add that ?

I'm not familiar with either, but most filesystems are easy to
add... I'm just not able to do the testing. If zcache moves
into upstream, other filesystem experts should be able to try
zcache easily on other filesystems.

> • Very interesting would be: support for FUSE (taking into account zfs
> and ntfs3g, etc.) - would that be possible ?

I don't know enough about those to feel comfortable answering,
but would be happy to consult if someone else wants to try it.

> • Was there testing done on 32bit boxes ? How about alternative
> architectures such as ARM, PPC, etc. ?
> ∘ I'm especially interested in ARM since surely a lot on the

Sadly, I haven't done any testing on 32-bit boxes. All the code
is designed to be entirely architecture-independent though I'm
sure a bug or three will be found on other architectures.

> be / Is there a port of cleancache, kztmem and frontswap available for
> 2.6.32* kernels ? (most android devices are currently running those)

I've found porting cleancache and frontswap to other recent
Linux versions to be straightforward. And zcache is just a
staging driver so should also port easily.

> • Considerung UP boxes - is the usage even beneficial on those ?
> ∘ If not - why not (written in the documentation) - due to missing raw
> CPU power ?

Should work fine on a UP box. The majority of the performance
advantage is "converting" disk seek wait time into CPU compress/
decompress time.

> • How is the scaling ? In case of Multiprocessors - are the
> operations/parallelism or concurrency, how it's called, realized
> through "work queues" - (there have been lots of changes recently in
> the kernel [2.6.37, 2.6.38]). ?

Good questions. The concurrency should be pretty good, but in
the current version, interrupts are disabled during compression,
which could lead to some problems in a more real-time load.
This design is fixable but will take some work.

> • Are there higher latencies during high memory pressure or high CPU
> load situations, e.g. where the latencies would even go down below
> without usage of kztmem ?

Theoretically, if there is no disk wait time (e.g. CPUs are always
loaded even during disk reads) AND there is high disk demand,
zcache could cause a reduction in performance.

> • The compression algorithm in use seems to be lzo. Are any additional
> selectable compressions planned such as lzf, gzip - maybe even bzip2 ?
> - Would they be selectable via Kconfig ?
> ∘ are these threaded / scaling with multiple processors - e.g. like pcrypt ?

Good ideas for future enhancements!

> • "Exactly how much memory it provides is entirely dynamic and
> random." - can maximum limits be set ? ("watermarks" ? - if that is
> the correct term)
> How efficient is the algorithm ? What is it based on ?

For cleancache pages, all can be reclaimed so no maximum needs
to be set as long as the kernel reclaim mechanism is working properly.
For frontswap pages, there is a maximum currently hardcoded,
but this could be changed to be handled through a /sys fs file.

> • Can the operations be sped up even more using spice() system call or
> something similar (if existant) - if even applicable ?

Sorry, I don't know the answer to this.

> • Are userland hooks planned ? e.g. for other virtualization solutions
> such as KVM, qemu, etc.

We've thought of userland hooks, but haven't tried them yet.

KVM should be able to take advantage of zcache with a little effort.

> • How about deduplication support for the ephemeral (filesystem) pools?
> ∘ in my (humble) opinion this might be really useful - since in the
> future there will be more and more CPU power but due to available RAM
> not growing as linear (or fast) as CPU's power this could be a kind of
> compensation to gain more memory
> ∘ would that work with "Kernel Samepage Merging"?
> ∘ is KSM even similar to tmem's deduplication functionality (tmem -
> which is used or planned for Xen)
> Referring to http://marc.info/?l=linux-kernel&m=129683713531791&w=2
> slides 20 to 21 on the presentation deduplication would seem much more
> efficient than KSM.

Deduplication support could be added.
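As a rough illustration of why a page-addressable store makes this
straightforward, here is a userspace sketch of content-based
deduplication. Nothing here comes from the patchset: the class name
DedupStore, the use of SHA-256 as the content key, and the reference
counting are assumptions for the example. A real implementation would
also compare page contents on a hash match instead of trusting the
digest alone.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this sketch


class DedupStore:
    """Toy store that keeps one copy of each distinct page content."""

    def __init__(self):
        self.blobs = {}  # digest -> [page bytes, reference count]
        self.index = {}  # (oid, index) -> digest

    def put(self, oid, index, page):
        """Store a page, sharing storage with any identical page."""
        self.flush(oid, index)  # drop a previous version of this page
        digest = hashlib.sha256(page).digest()
        entry = self.blobs.setdefault(digest, [page, 0])
        entry[1] += 1
        self.index[(oid, index)] = digest

    def get(self, oid, index):
        """Return the page, or None if it is not in the store."""
        digest = self.index.get((oid, index))
        return None if digest is None else self.blobs[digest][0]

    def flush(self, oid, index):
        """Remove one page; free the content when nothing references it."""
        digest = self.index.pop((oid, index), None)
        if digest is None:
            return
        entry = self.blobs[digest]
        entry[1] -= 1
        if entry[1] == 0:
            del self.blobs[digest]
```

Since callers only ever see whole pages through put and get, the store
is free to share one copy between any number of keys without the data
source noticing.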

> Kztmem seems to be quite useful on memory constrained devices:

You have suggested several interesting possibilities!

If I've missed anything important, please let me know!

Thanks again!
Dan

2011-02-09 02:31:45

by Nitin Gupta

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On 02/08/2011 08:03 PM, Dan Magenheimer wrote:
>> (Historical note: This "new" zcache patchset supercedes both the
>> kztmem patchset and the "old" zcache patchset as described in:
>> http://lkml.org/lkml/2011/2/5/148)
>
> (In order to move discussion from the old kztmem patchset to
> the new zcache patchset, I am replying here to Matt's email
> sent at: https://lkml.org/lkml/2011/2/4/199 )
>
>> From: Matt [mailto:[email protected]]
>
<snip>

>
>> • Coming back to usage of compcache - how about the problem of 60%
>> memory fragmentation (according to compcache/zcache wiki,
>> http://code.google.com/p/compcache/wiki/Fragmentation) ?
>> Could the situation be improved with in-kernel "memory compaction" ?
>> I'm not a developer so I don't know exactly how lumpy reclaim/memory
>> compaction and xvmalloc would interact with each other
>
> Nitin is the expert on compcache and xvmalloc, so I will leave
> this question unanswered for now.
>


I'm currently in the process of designing a new allocator that gives
predictable memory fragmentation guarantees (at the expense of extra CPU
cycles). I've not yet posted details anywhere but many of the ideas are
from the "Compact Fit" allocator:
http://www.usenix.org/event/usenix08/tech/full_papers/craciunas/craciunas_html/

I'm not sure how much time it will take since I'm not yet done with some
of the design details, and then userspace implementation, testing,
profiling and finally kernel port. Add to that extra concurrency issues
when integrating with zcache!

Thanks,
Nitin

2011-02-14 00:09:04

by Matt

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
<[email protected]> wrote:
[snip]
>
> If I've missed anything important, please let me know!
>
> Thanks again!
> Dan
>

Hi Dan,

thank you so much for answering my email in such detail !

I shall pick up on that mail in my next email sending to the mailing list :)


currently I've got a problem with btrfs which seems to get triggered
by cleancache get-operations:


Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
354120c992a00761-5fa07d400126a895 devid 1 transid 7
/dev/mapper/portage
Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
tmem pool, id=3
Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
kernel paging request at 0000000001400050
Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
/sys/devices/platform/coretemp.3/temp1_input
Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
ehci_hcd [last unloaded: tg3]
Feb 14 00:39:20 lupus kernel: [ 2951.853682]
Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
G3710
Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
0010:[<ffffffff8133ef1b>] [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
0018:ffff880129a11b00 EFLAGS: 00010246
Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
RBX: ffff88014a1ce628 RCX: 0000000000000000
Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
RSI: ffff880129a11b70 RDI: 0000000000000006
Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
R08: ffffffff8133eef0 R09: ffff880129a11c68
Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
R11: 0000000000000001 R12: ffff88014a1ce780
Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
R14: ffff88021fef9000 R15: 0000000000000000
Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
0000000000000000(0000) GS:ffff8800bf500000(0000)
knlGS:0000000000000000
Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
CR3: 0000000001c27000 CR4: 00000000000006e0
Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
(pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
Feb 14 00:39:20 lupus kernel: [ 2951.854006] ffff880129a11b50
ffff880000000003 ffff88003c60a098 0000000000000003
Feb 14 00:39:20 lupus kernel: [ 2951.854035] ffffffffffffffff
ffffffff810e6aaa 0000000000000000 0000000602e4ac40
Feb 14 00:39:20 lupus kernel: [ 2951.854063] ffffffff8133e3f0
ffffffff810e6cee 0000000000001000 0000000000000000
Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
Feb 14 00:39:20 lupus kernel: [ 2951.854103] [<ffffffff810e6aaa>] ?
cleancache_get_key+0x4a/0x60
Feb 14 00:39:20 lupus kernel: [ 2951.854122] [<ffffffff8133e3f0>] ?
btrfs_wake_function+0x0/0x20
Feb 14 00:39:20 lupus kernel: [ 2951.854140] [<ffffffff810e6cee>] ?
__cleancache_flush_inode+0x3e/0x70
Feb 14 00:39:20 lupus kernel: [ 2951.854161] [<ffffffff810b34d2>] ?
truncate_inode_pages_range+0x42/0x440
Feb 14 00:39:20 lupus kernel: [ 2951.854182] [<ffffffff812f115e>] ?
btrfs_search_slot+0x89e/0xa00
Feb 14 00:39:20 lupus kernel: [ 2951.854201] [<ffffffff810c3a45>] ?
unmap_mapping_range+0xc5/0x2a0
Feb 14 00:39:20 lupus kernel: [ 2951.854220] [<ffffffff810b3930>] ?
truncate_pagecache+0x40/0x70
Feb 14 00:39:20 lupus kernel: [ 2951.854240] [<ffffffff813458b1>] ?
btrfs_truncate_free_space_cache+0x81/0xe0
Feb 14 00:39:20 lupus kernel: [ 2951.854261] [<ffffffff812fce15>] ?
btrfs_write_dirty_block_groups+0x245/0x500
Feb 14 00:39:20 lupus kernel: [ 2951.854283] [<ffffffff812fcb6a>] ?
btrfs_run_delayed_refs+0x1ba/0x220
Feb 14 00:39:20 lupus kernel: [ 2951.854304] [<ffffffff8130afff>] ?
commit_cowonly_roots+0xff/0x1d0
Feb 14 00:39:20 lupus kernel: [ 2951.854323] [<ffffffff8130c583>] ?
btrfs_commit_transaction+0x363/0x760
Feb 14 00:39:20 lupus kernel: [ 2951.854344] [<ffffffff81067ea0>] ?
autoremove_wake_function+0x0/0x30
Feb 14 00:39:20 lupus kernel: [ 2951.854364] [<ffffffff81305bc3>] ?
transaction_kthread+0x283/0x2a0
Feb 14 00:39:20 lupus kernel: [ 2951.854383] [<ffffffff81305940>] ?
transaction_kthread+0x0/0x2a0
Feb 14 00:39:20 lupus kernel: [ 2951.854401] [<ffffffff81305940>] ?
transaction_kthread+0x0/0x2a0
Feb 14 00:39:20 lupus kernel: [ 2951.854420] [<ffffffff81067a16>] ?
kthread+0x96/0xa0
Feb 14 00:39:20 lupus kernel: [ 2951.854437] [<ffffffff81003514>] ?
kernel_thread_helper+0x4/0x10
Feb 14 00:39:20 lupus kernel: [ 2951.854455] [<ffffffff81067980>] ?
kthread+0x0/0xa0
Feb 14 00:39:20 lupus kernel: [ 2951.854471] [<ffffffff81003510>] ?
kernel_thread_helper+0x0/0x10
Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
48 89 06 84 c9 48 8b 85 68 fe ff ff
Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 00:39:20 lupus kernel: [ 2951.854762] RSP <ffff880129a11b00>
Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
f831c5ceeaa49287 ]---

in my case I had compress-force with lzo and disk_cache enabled


another user of the kernel I'm currently running has had the same
problem with zcache
(http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)

(looks like in his case compression and any other fancy additional
features weren't enabled)


changes made by this kernel or patchset to btrfs are from
* io-less dirty throttling patchset (44 patches)
* zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
applied in both cases)
* [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
* btrfs-unstable changes to state of
3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
from 2.6.38-rc4+)

I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,

but kind of upgraded btrfs to the latest state of the btrfs-unstable
repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6

this also didn't help and seemed to produce the same error-message

so to summarize:

1) error message appearing with all 4 patchsets applied changing
btrfs-code and compress-force=lzo and disk_cache enabled

2) error message appearing with default mount-options and btrfs from
2.6.37 and changes for zcache & io-less dirty throttling patchset
applied (first 2 patch(sets) from list)


in my case I tried to extract / play back a 1.7 GiB tarball of my
portage-directory (lots of small files and some tar.bzip2 archives)
via pbzip2 or 7z when the error happened and the message was shown

Due to KMS, sound (webradio streaming) was still running but I couldn't
continue work (X switching to kernel output) so I did the magic sysrq
combo (reisub)


Does that BUG message ring a bell for anyone ?

(if I should leave out anyone from the CC in the next emails or
future, please holler - I don't want to spam your inboxes)

Thanks

Matt

2011-02-14 01:24:25

by Matt

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
> <[email protected]> wrote:
> [snip]
>>
>> If I've missed anything important, please let me know!
>>
>> Thanks again!
>> Dan
>>
>
> Hi Dan,
>
> thank you so much for answering my email in such detail !
>
> I shall pick up on that mail in my next email sending to the mailing list :)
>
>
> currently I've got a problem with btrfs which seems to get triggered
> by cleancache get-operations:
>
>
> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
> /dev/mapper/portage
> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
> tmem pool, id=3
> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
> kernel paging request at 0000000001400050
> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
> /sys/devices/platform/coretemp.3/temp1_input
> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
> ehci_hcd [last unloaded: tg3]
> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
> G3710
> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
> 0010:[<ffffffff8133ef1b>] [<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
> 0018:ffff880129a11b00 EFLAGS: 00010246
> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
> RBX: ffff88014a1ce628 RCX: 0000000000000000
> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
> RSI: ffff880129a11b70 RDI: 0000000000000006
> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
> R08: ffffffff8133eef0 R09: ffff880129a11c68
> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
> R11: 0000000000000001 R12: ffff88014a1ce780
> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
> R14: ffff88021fef9000 R15: 0000000000000000
> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
> 0000000000000000(0000) GS:ffff8800bf500000(0000)
> knlGS:0000000000000000
> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: 0010 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
> CR3: 0000000001c27000 CR4: 00000000000006e0
> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ffff880129a11b50
> ffff880000000003 ffff88003c60a098 0000000000000003
> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ffffffffffffffff
> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ffffffff8133e3f0
> ffffffff810e6cee 0000000000001000 0000000000000000
> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
> Feb 14 00:39:20 lupus kernel: [ 2951.854103] [<ffffffff810e6aaa>] ?
> cleancache_get_key+0x4a/0x60
> Feb 14 00:39:20 lupus kernel: [ 2951.854122] [<ffffffff8133e3f0>] ?
> btrfs_wake_function+0x0/0x20
> Feb 14 00:39:20 lupus kernel: [ 2951.854140] [<ffffffff810e6cee>] ?
> __cleancache_flush_inode+0x3e/0x70
> Feb 14 00:39:20 lupus kernel: [ 2951.854161] [<ffffffff810b34d2>] ?
> truncate_inode_pages_range+0x42/0x440
> Feb 14 00:39:20 lupus kernel: [ 2951.854182] [<ffffffff812f115e>] ?
> btrfs_search_slot+0x89e/0xa00
> Feb 14 00:39:20 lupus kernel: [ 2951.854201] [<ffffffff810c3a45>] ?
> unmap_mapping_range+0xc5/0x2a0
> Feb 14 00:39:20 lupus kernel: [ 2951.854220] [<ffffffff810b3930>] ?
> truncate_pagecache+0x40/0x70
> Feb 14 00:39:20 lupus kernel: [ 2951.854240] [<ffffffff813458b1>] ?
> btrfs_truncate_free_space_cache+0x81/0xe0
> Feb 14 00:39:20 lupus kernel: [ 2951.854261] [<ffffffff812fce15>] ?
> btrfs_write_dirty_block_groups+0x245/0x500
> Feb 14 00:39:20 lupus kernel: [ 2951.854283] [<ffffffff812fcb6a>] ?
> btrfs_run_delayed_refs+0x1ba/0x220
> Feb 14 00:39:20 lupus kernel: [ 2951.854304] [<ffffffff8130afff>] ?
> commit_cowonly_roots+0xff/0x1d0
> Feb 14 00:39:20 lupus kernel: [ 2951.854323] [<ffffffff8130c583>] ?
> btrfs_commit_transaction+0x363/0x760
> Feb 14 00:39:20 lupus kernel: [ 2951.854344] [<ffffffff81067ea0>] ?
> autoremove_wake_function+0x0/0x30
> Feb 14 00:39:20 lupus kernel: [ 2951.854364] [<ffffffff81305bc3>] ?
> transaction_kthread+0x283/0x2a0
> Feb 14 00:39:20 lupus kernel: [ 2951.854383] [<ffffffff81305940>] ?
> transaction_kthread+0x0/0x2a0
> Feb 14 00:39:20 lupus kernel: [ 2951.854401] [<ffffffff81305940>] ?
> transaction_kthread+0x0/0x2a0
> Feb 14 00:39:20 lupus kernel: [ 2951.854420] [<ffffffff81067a16>] ?
> kthread+0x96/0xa0
> Feb 14 00:39:20 lupus kernel: [ 2951.854437] [<ffffffff81003514>] ?
> kernel_thread_helper+0x4/0x10
> Feb 14 00:39:20 lupus kernel: [ 2951.854455] [<ffffffff81067980>] ?
> kthread+0x0/0xa0
> Feb 14 00:39:20 lupus kernel: [ 2951.854471] [<ffffffff81003510>] ?
> kernel_thread_helper+0x0/0x10
> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP [<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 00:39:20 lupus kernel: [ 2951.854762] RSP <ffff880129a11b00>
> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
> f831c5ceeaa49287 ]---
>
> in my case I had compress-force with lzo and disk_cache enabled
>
>
> another user of the kernel I'm currently running has had the same
> problem with zcache
> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>
> (looks like in his case compression and any other fancy additional
> features weren't enabled)
>
>
> changes made by this kernel or patchset to btrfs are from
> * io-less dirty throttling patchset (44 patches)
> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
> applied in both cases)
> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
> * btrfs-unstable changes to state of
> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
> from 2.6.38-rc4+)
>
> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>
> but kind of upgraded btrfs to the latest state of the btrfs-unstable
> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>
> this also didn't help and seemed to produce the same error-message
>
> so to summarize:
>
> 1) error message appearing with all 4 patchsets applied changing
> btrfs-code and compress-force=lzo and disk_cache enabled
>
> 2) error message appearing with default mount-options and btrfs from
> 2.6.37 and changes for zcache & io-less dirty throttling patchset
> applied (the first 2 patch(sets) from the list)
>
>
> in my case I tried to extract / play back a 1.7 GiB tarball of my
> portage-directory (lots of small files and some tar.bzip2 archives)
> via pbzip2 or 7z when the error happened and the message was shown
>
> Sound (webradio streaming) kept running, but with KMS the display
> switched from X to the kernel output and I couldn't continue working,
> so I did the magic sysrq combo (reisub)
>
>
> Does that BUG message ring a bell for anyone ?
>
> (if I should leave out anyone from the CC in the next emails or
> future, please holler - I don't want to spam your inboxes)
>
> Thanks
>
> Matt
>


OK,

here's the output of a kernel -

staying as close to vanilla (2.6.37) as the current situation allows
(only including some corruption or leak fixes for zram & zcache and
"zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
fixes)):

so in total the following patches are included in this new kernel
(2.6.37-zcache):

zram changes:
1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
2 [PATCH] zram: fix data corruption issue
3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations

zcache:
1 zcache-linux-2.6.37-110205
2 [PATCH] staging: zcache: fix memory leak
3 [PATCH] zcache: Fix build error when sysfs is not defined

reiserfs:
1 [PATCH] reiserfs: Make sure va_end() is always called after
2 [patch] reiserfs: potential ERR_PTR dereference


the same procedure:

trying to extract the mentioned portage-tarball:

time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
-xp -C /usr/gentoo/)


this hopefully should make it easier to track down the problem:


Feb 14 01:59:59 lupus kernel: [ 364.777143] device fsid
684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
/dev/mapper/portage
Feb 14 01:59:59 lupus kernel: [ 364.844994] zcache: created ephemeral
tmem pool, id=2
Feb 14 02:02:49 lupus kernel: [ 534.577573] BUG: unable to handle
kernel paging request at 0000000037610050
Feb 14 02:02:49 lupus kernel: [ 534.577605] IP: [<ffffffff81338cbb>]
btrfs_encode_fh+0x2b/0x110
Feb 14 02:02:49 lupus kernel: [ 534.577630] PGD 0
Feb 14 02:02:49 lupus kernel: [ 534.577640] Oops: 0000 [#1] PREEMPT SMP
Feb 14 02:02:49 lupus kernel: [ 534.577665] last sysfs file:
/sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Feb 14 02:02:49 lupus kernel: [ 534.577693] CPU 5
Feb 14 02:02:49 lupus kernel: [ 534.577701] Modules linked in: radeon
ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
ehci_hcd [last unloaded: tg3]
Feb 14 02:02:49 lupus kernel: [ 534.578114]
Feb 14 02:02:49 lupus kernel: [ 534.578124] Pid: 8285, comm: tar Not
tainted 2.6.37-zcache #2 FMP55/ipower G3710
Feb 14 02:02:49 lupus kernel: [ 534.578146] RIP:
0010:[<ffffffff81338cbb>] [<ffffffff81338cbb>]
btrfs_encode_fh+0x2b/0x110
Feb 14 02:02:49 lupus kernel: [ 534.578172] RSP:
0018:ffff88023ea9dcc8 EFLAGS: 00010246
Feb 14 02:02:49 lupus kernel: [ 534.578189] RAX: 00000000000000ff
RBX: ffff8800b8643228 RCX: 0000000000000000
Feb 14 02:02:49 lupus kernel: [ 534.578210] RDX: ffff88023ea9dd04
RSI: ffff88023ea9dd38 RDI: 0000000000000006
Feb 14 02:02:49 lupus kernel: [ 534.578230] RBP: 0000000037610000
R08: ffffffff81338c90 R09: 0000000000000000
Feb 14 02:02:49 lupus kernel: [ 534.578251] R10: 0000000000000019
R11: 0000000000000001 R12: ffff8800b8643380
Feb 14 02:02:49 lupus kernel: [ 534.578272] R13: ffff8800b8643258
R14: 00007fff806f1f00 R15: 0000000000000000
Feb 14 02:02:49 lupus kernel: [ 534.578293] FS:
00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
knlGS:0000000000000000
Feb 14 02:02:49 lupus kernel: [ 534.578317] CS: 0010 DS: 0000 ES:
0000 CR0: 0000000080050033
Feb 14 02:02:49 lupus kernel: [ 534.578334] CR2: 0000000037610050
CR3: 000000023dcef000 CR4: 00000000000006e0
Feb 14 02:02:49 lupus kernel: [ 534.578356] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Feb 14 02:02:49 lupus kernel: [ 534.578377] DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 14 02:02:49 lupus kernel: [ 534.578398] Process tar (pid: 8285,
threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
Feb 14 02:02:49 lupus kernel: [ 534.578421] Stack:
Feb 14 02:02:49 lupus kernel: [ 534.578428] 000000013d096000
ffff88023ed84800 ffff88023ea9c000 0000000000000002
Feb 14 02:02:49 lupus kernel: [ 534.578458] ffffffffffffffff
ffffffff810e3b1a 0000000000000001 000000061e1d5240
Feb 14 02:02:49 lupus kernel: [ 534.578486] fffffffffffffffb
ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
Feb 14 02:02:49 lupus kernel: [ 534.578514] Call Trace:
Feb 14 02:02:49 lupus kernel: [ 534.578525] [<ffffffff810e3b1a>] ?
cleancache_get_key+0x4a/0x60
Feb 14 02:02:49 lupus kernel: [ 534.578544] [<ffffffff810e3d5e>] ?
__cleancache_flush_inode+0x3e/0x70
Feb 14 02:02:49 lupus kernel: [ 534.578565] [<ffffffff810b0ed2>] ?
truncate_inode_pages_range+0x42/0x440
Feb 14 02:02:49 lupus kernel: [ 534.578586] [<ffffffff81338451>] ?
btrfs_tree_unlock+0x41/0x50
Feb 14 02:02:49 lupus kernel: [ 534.578605] [<ffffffff812e4ed5>] ?
btrfs_release_path+0x15/0x70
Feb 14 02:02:49 lupus kernel: [ 534.578624] [<ffffffff8130bf29>] ?
btrfs_run_delayed_iputs+0x49/0x120
Feb 14 02:02:49 lupus kernel: [ 534.578644] [<ffffffff813107e7>] ?
btrfs_evict_inode+0x27/0x1e0
Feb 14 02:02:49 lupus kernel: [ 534.578663] [<ffffffff810fc3aa>] ?
evict+0x1a/0xa0
Feb 14 02:02:49 lupus kernel: [ 534.578678] [<ffffffff810fc6bd>] ?
iput+0x1cd/0x2b0
Feb 14 02:02:49 lupus kernel: [ 534.578694] [<ffffffff810f266f>] ?
do_unlinkat+0x12f/0x1d0
Feb 14 02:02:49 lupus kernel: [ 534.578712] [<ffffffff810027bb>] ?
system_call_fastpath+0x16/0x1b
Feb 14 02:02:49 lupus kernel: [ 534.578730] Code: 55 b8 ff 00 00 00
53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
48 89 06 84 c9 48 8b 85 68 fe ff ff
Feb 14 02:02:49 lupus kernel: [ 534.578986] RIP [<ffffffff81338cbb>]
btrfs_encode_fh+0x2b/0x110
Feb 14 02:02:49 lupus kernel: [ 534.579081] RSP <ffff88023ea9dcc8>
Feb 14 02:02:49 lupus kernel: [ 534.579093] CR2: 0000000037610050
Feb 14 02:02:49 lupus kernel: [ 534.587513] ---[ end trace
c596b12e66c0b360 ]---
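
A note on reading the oops above: CR2 is exactly RBP + 0x50, and the
marked bytes in the Code: line, <48> 8b 45 50, decode to a 64-bit mov
from 0x50(%rbp) into %rax. So the fault looks like a load through
whatever pointer ended up in RBP, and 0000000037610000 is not a kernel
address. The Python sketch below checks both facts against an excerpt of
that oops. The helper names are made up for illustration, the decoder
deliberately handles only this one instruction form, and which structure
field lives at offset 0x50 is not something the log itself tells us.

```python
import re

# Excerpt of the 2.6.37-zcache oops above, syslog prefixes dropped and
# wrapped lines rejoined. Only the fields used below are kept.
OOPS = "\n".join([
    "BUG: unable to handle kernel paging request at 0000000037610050",
    "RBP: 0000000037610000 R08: ffffffff81338c90 R09: 0000000000000000",
    "CR2: 0000000037610050 CR3: 000000023dcef000 CR4: 00000000000006e0",
    "Code: 55 b8 ff 00 00 00 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a "
    "83 ff 04 0f 86 d5 00 00 00 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 "
    "<48> 8b 45 50 bf 05 00 00 00 48 89 06 84 c9 48 8b 85 68 fe ff ff",
])

# Register numbering used by the ModRM byte (no REX extension bits set).
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def register(oops, name):
    """Return the 64-bit value printed after 'NAME: ' in the oops text."""
    match = re.search(r"\b%s: ([0-9a-f]{16})" % name, oops)
    return int(match.group(1), 16)


def faulting_bytes(oops, count=4):
    """Return `count` code bytes starting at the one marked with <..>."""
    code = re.search(r"Code: (.*)", oops).group(1).split()
    start = next(i for i, tok in enumerate(code) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in code[start:start + count]]


def decode_mov_disp8(insn):
    """Decode only 'REX.W 8B /r' with an 8-bit displacement.

    Returns (destination register, base register, displacement).
    """
    rex, opcode, modrm, disp = insn
    if rex != 0x48 or opcode != 0x8B or (modrm >> 6) != 0b01:
        raise ValueError("not a 64-bit MOV load with disp8")
    if modrm & 7 == 4:
        raise ValueError("SIB addressing is not handled in this sketch")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS[(modrm >> 3) & 7], REGS[modrm & 7], disp


offset = register(OOPS, "CR2") - register(OOPS, "RBP")
dest, base, disp = decode_mov_disp8(faulting_bytes(OOPS))
print("CR2 - RBP = %#x" % offset)
print("faulting insn: mov %#x(%%%s),%%%s" % (disp, base, dest))
```

The same check can be repeated on the other oopses in this mail by
swapping in their RBP, CR2 and Code: lines.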


for reference I've pasted it to pastebin.com:

"2.6.37_zcache_V2.patch"
http://pastebin.com/cVSkwQ6M





after the reboot I forgot to keep the btrfs volume from being mounted,
so it threw a similar error message again and several partitions
(including the system partition) got remounted read-only.
the partition with btrfs (/usr/gentoo) couldn't be unmounted since the
umount process more or less hung

so here's the error message after a reboot (it might not be accurate, or
might be kind of "skewed", since other patches are included: io-less dirty
throttling, "[PATCH] fix (latent?) memory corruption in
btrfs_encode_fh()" and the latest changes for btrfs), but it might help to
get some more evidence:


Feb 14 02:05:46 lupus kernel: [ 63.922648] device fsid
684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
/dev/mapper/portage
Feb 14 02:05:46 lupus kernel: [ 64.047118] btrfs: unlinked 1 orphans
Feb 14 02:05:46 lupus kernel: [ 64.051956] zcache: created ephemeral
tmem pool, id=3
Feb 14 02:05:48 lupus kernel: [ 65.801364] hub 2-1:1.0: hub_suspend
Feb 14 02:05:48 lupus kernel: [ 65.801376] usb 2-1: unlink
qh256-0001/ffff88023fefd180 start 1 [1/0 us]
Feb 14 02:05:48 lupus kernel: [ 65.801559] usb 2-1: usb auto-suspend
Feb 14 02:05:50 lupus kernel: [ 67.797929] hub 2-0:1.0: hub_suspend
Feb 14 02:05:50 lupus kernel: [ 67.797939] usb usb2: bus auto-suspend
Feb 14 02:05:50 lupus kernel: [ 67.797942] ehci_hcd 0000:00:1d.0:
suspend root hub
Feb 14 02:05:52 lupus kernel: [ 70.050493] BUG: unable to handle
kernel paging request at 0000030341ed0050
Feb 14 02:05:52 lupus kernel: [ 70.050670] IP: [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:05:52 lupus kernel: [ 70.050807] PGD 0
Feb 14 02:05:52 lupus kernel: [ 70.050929] Oops: 0000 [#1] PREEMPT SMP
Feb 14 02:05:52 lupus kernel: [ 70.051223] last sysfs file:
/sys/module/pcie_aspm/parameters/policy
Feb 14 02:05:52 lupus kernel: [ 70.051365] CPU 6
Feb 14 02:05:52 lupus kernel: [ 70.051411] Modules linked in:
ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
usb_storage ehci_hcd [last unloaded: tg3]
Feb 14 02:05:52 lupus kernel: [ 70.054694]
Feb 14 02:05:52 lupus kernel: [ 70.054776] Pid: 7962, comm: umount
Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
Feb 14 02:05:52 lupus kernel: [ 70.054912] RIP:
0010:[<ffffffff8133ef1b>] [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:05:52 lupus kernel: [ 70.055084] RSP:
0018:ffff88023c77d6f8 EFLAGS: 00010246
Feb 14 02:05:52 lupus kernel: [ 70.055173] RAX: 00000000000000ff
RBX: ffff88023cde0168 RCX: 0000000000000000
Feb 14 02:05:52 lupus kernel: [ 70.055265] RDX: ffff88023c77d734
RSI: ffff88023c77d768 RDI: 0000000000000006
Feb 14 02:05:52 lupus kernel: [ 70.055357] RBP: 0000030341ed0000
R08: ffffffff8133eef0 R09: ffff88023c77d8d8
Feb 14 02:05:52 lupus kernel: [ 70.055448] R10: 0000000000000003
R11: 0000000000000001 R12: 00000000ffffffff
Feb 14 02:05:52 lupus kernel: [ 70.055540] R13: ffff88023cde0030
R14: ffffea0007dd39f0 R15: 0000000000000001
Feb 14 02:05:52 lupus kernel: [ 70.055633] FS:
00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
knlGS:0000000000000000
Feb 14 02:05:52 lupus kernel: [ 70.055762] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Feb 14 02:05:52 lupus kernel: [ 70.055851] CR2: 0000030341ed0050
CR3: 000000023c7d5000 CR4: 00000000000006e0
Feb 14 02:05:52 lupus kernel: [ 70.055943] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Feb 14 02:05:52 lupus kernel: [ 70.056035] DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 14 02:05:52 lupus kernel: [ 70.056128] Process umount (pid:
7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
Feb 14 02:05:52 lupus kernel: [ 70.056257] Stack:
Feb 14 02:05:52 lupus kernel: [ 70.056338] 0000000000000000
0000000000000002 ffff880200000000 0000000000000003
Feb 14 02:05:52 lupus kernel: [ 70.056630] ffffea0007dd39f0
ffffffff810e6aaa ffff880200000041 0000000600000246
Feb 14 02:05:52 lupus kernel: [ 70.056922] ffff88023cdcd300
ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
Feb 14 02:05:52 lupus kernel: [ 70.057213] Call Trace:
Feb 14 02:05:52 lupus kernel: [ 70.057301] [<ffffffff810e6aaa>] ?
cleancache_get_key+0x4a/0x60
Feb 14 02:05:52 lupus kernel: [ 70.057393] [<ffffffff810e6b3a>] ?
__cleancache_get_page+0x7a/0xd0
Feb 14 02:05:52 lupus kernel: [ 70.057487] [<ffffffff8132bb7c>] ?
merge_state+0x7c/0x150
Feb 14 02:05:52 lupus kernel: [ 70.057579] [<ffffffff8132e4de>] ?
__extent_read_full_page+0x52e/0x710
Feb 14 02:05:52 lupus kernel: [ 70.057673] [<ffffffff813bdea4>] ?
rb_insert_color+0xa4/0x140
Feb 14 02:05:52 lupus kernel: [ 70.057766] [<ffffffff8134b0b6>] ?
tree_insert+0x86/0x1e0
Feb 14 02:05:52 lupus kernel: [ 70.057859] [<ffffffff81058c73>] ?
lock_timer_base.clone.22+0x33/0x70
Feb 14 02:05:52 lupus kernel: [ 70.058004] [<ffffffff81305060>] ?
btree_get_extent+0x0/0x1c0
Feb 14 02:05:52 lupus kernel: [ 70.058097] [<ffffffff81330b21>] ?
read_extent_buffer_pages+0x2d1/0x470
Feb 14 02:05:52 lupus kernel: [ 70.058191] [<ffffffff81305060>] ?
btree_get_extent+0x0/0x1c0
Feb 14 02:05:52 lupus kernel: [ 70.058283] [<ffffffff8130674d>] ?
btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
Feb 14 02:05:52 lupus kernel: [ 70.058415] [<ffffffff813076f9>] ?
read_tree_block+0x39/0x60
Feb 14 02:05:52 lupus kernel: [ 70.058508] [<ffffffff812ed5e6>] ?
read_block_for_search.clone.40+0x116/0x410
Feb 14 02:05:52 lupus kernel: [ 70.058638] [<ffffffff812eb228>] ?
btrfs_cow_block+0x118/0x2b0
Feb 14 02:05:52 lupus kernel: [ 70.058731] [<ffffffff812f0bc7>] ?
btrfs_search_slot+0x307/0xa00
Feb 14 02:05:52 lupus kernel: [ 70.058823] [<ffffffff812f6b18>] ?
lookup_inline_extent_backref+0x98/0x4a0
Feb 14 02:05:52 lupus kernel: [ 70.058919] [<ffffffff810e33d7>] ?
kmem_cache_alloc+0x87/0xa0
Feb 14 02:05:52 lupus kernel: [ 70.059032] [<ffffffff812f891c>] ?
__btrfs_free_extent+0xcc/0x6f0
Feb 14 02:05:52 lupus kernel: [ 70.059125] [<ffffffff812fc4cf>] ?
run_clustered_refs+0x39f/0x880
Feb 14 02:05:52 lupus kernel: [ 70.059220] [<ffffffff810b1f98>] ?
pagevec_lookup_tag+0x18/0x20
Feb 14 02:05:52 lupus kernel: [ 70.059312] [<ffffffff810a7c81>] ?
filemap_fdatawait_range+0x91/0x180
Feb 14 02:05:52 lupus kernel: [ 70.059405] [<ffffffff812fca77>] ?
btrfs_run_delayed_refs+0xc7/0x220
Feb 14 02:05:52 lupus kernel: [ 70.059498] [<ffffffff8130c29c>] ?
btrfs_commit_transaction+0x7c/0x760
Feb 14 02:05:52 lupus kernel: [ 70.059591] [<ffffffff81067ea0>] ?
autoremove_wake_function+0x0/0x30
Feb 14 02:05:52 lupus kernel: [ 70.059683] [<ffffffff8130cdef>] ?
start_transaction+0x1bf/0x270
Feb 14 02:05:52 lupus kernel: [ 70.059775] [<ffffffff8110e96a>] ?
__sync_filesystem+0x5a/0x90
Feb 14 02:05:52 lupus kernel: [ 70.059867] [<ffffffff810eae8d>] ?
generic_shutdown_super+0x2d/0x100
Feb 14 02:05:52 lupus kernel: [ 70.059960] [<ffffffff810eafb9>] ?
kill_anon_super+0x9/0x50
Feb 14 02:05:52 lupus kernel: [ 70.060051] [<ffffffff810eb266>] ?
deactivate_locked_super+0x26/0x80
Feb 14 02:05:52 lupus kernel: [ 70.060144] [<ffffffff811043ea>] ?
sys_umount+0x7a/0x390
Feb 14 02:05:52 lupus kernel: [ 70.060235] [<ffffffff810027bb>] ?
system_call_fastpath+0x16/0x1b
Feb 14 02:05:52 lupus kernel: [ 70.060325] Code: 55 b8 ff 00 00 00
53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
48 89 06 84 c9 48 8b 85 68 fe ff ff
Feb 14 02:05:52 lupus kernel: [ 70.063170] RIP [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:05:52 lupus kernel: [ 70.063302] RSP <ffff88023c77d6f8>
Feb 14 02:05:52 lupus kernel: [ 70.063386] CR2: 0000030341ed0050
Feb 14 02:05:52 lupus kernel: [ 70.063528] ---[ end trace
3313552d105b1535 ]---
Feb 14 02:06:16 lupus kernel: [ 93.961960] BUG: unable to handle
kernel paging request at 0000030341ed0050
Feb 14 02:06:16 lupus kernel: [ 93.962171] IP: [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:06:16 lupus kernel: [ 93.962307] PGD 0
Feb 14 02:06:16 lupus kernel: [ 93.962430] Oops: 0000 [#2] PREEMPT SMP
Feb 14 02:06:16 lupus kernel: [ 93.962637] last sysfs file:
/sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Feb 14 02:06:16 lupus kernel: [ 93.962766] CPU 5
Feb 14 02:06:16 lupus kernel: [ 93.962812] Modules linked in:
ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
usb_storage ehci_hcd [last unloaded: tg3]
Feb 14 02:06:16 lupus kernel: [ 93.966044]
Feb 14 02:06:16 lupus kernel: [ 93.966127] Pid: 7915, comm:
btrfs-transacti Tainted: G D 2.6.37-plus_v16_zcache #4
FMP55/ipower G3710
Feb 14 02:06:16 lupus kernel: [ 93.966266] RIP:
0010:[<ffffffff8133ef1b>] [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:06:16 lupus kernel: [ 93.966440] RSP:
0018:ffff88023c63b6e0 EFLAGS: 00010246
Feb 14 02:06:16 lupus kernel: [ 93.966528] RAX: 00000000000000ff
RBX: ffff88023cde0168 RCX: 0000000000000000
Feb 14 02:06:16 lupus kernel: [ 93.966620] RDX: ffff88023c63b71c
RSI: ffff88023c63b750 RDI: 0000000000000006
Feb 14 02:06:16 lupus kernel: [ 93.966713] RBP: 0000030341ed0000
R08: ffffffff8133eef0 R09: ffff88023c63b8c0
Feb 14 02:06:16 lupus kernel: [ 93.966805] R10: 0000000000000003
R11: 0000000000000001 R12: 00000000ffffffff
Feb 14 02:06:16 lupus kernel: [ 93.966897] R13: ffff88023cde0030
R14: ffffea0007d59bc8 R15: 0000000000000001
Feb 14 02:06:16 lupus kernel: [ 93.966990] FS:
0000000000000000(0000) GS:ffff8800bf540000(0000)
knlGS:0000000000000000
Feb 14 02:06:16 lupus kernel: [ 93.967120] CS: 0010 DS: 0000 ES:
0000 CR0: 000000008005003b
Feb 14 02:06:16 lupus kernel: [ 93.967209] CR2: 0000030341ed0050
CR3: 0000000001c27000 CR4: 00000000000006e0
Feb 14 02:06:16 lupus kernel: [ 93.967302] DR0: 0000000000000000
DR1: 0000000000000000 DR2: 0000000000000000
Feb 14 02:06:16 lupus kernel: [ 93.967394] DR3: 0000000000000000
DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 14 02:06:16 lupus kernel: [ 93.967500] Process btrfs-transacti
(pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
Feb 14 02:06:16 lupus kernel: [ 93.967630] Stack:
Feb 14 02:06:16 lupus kernel: [ 93.967711] 0000000000000000
0000000000000002 0000000000000000 0000000000000003
Feb 14 02:06:16 lupus kernel: [ 93.968057] ffffea0007d59bc8
ffffffff810e6aaa 0000000000000041 0000000600000002
Feb 14 02:06:16 lupus kernel: [ 93.968348] 0000000000000000
ffffffff810e6b3a 0000000000000001 ffffffff00000001
Feb 14 02:06:16 lupus kernel: [ 93.968639] Call Trace:
Feb 14 02:06:16 lupus kernel: [ 93.968728] [<ffffffff810e6aaa>] ?
cleancache_get_key+0x4a/0x60
Feb 14 02:06:16 lupus kernel: [ 93.968820] [<ffffffff810e6b3a>] ?
__cleancache_get_page+0x7a/0xd0
Feb 14 02:06:16 lupus kernel: [ 93.968914] [<ffffffff8132e4de>] ?
__extent_read_full_page+0x52e/0x710
Feb 14 02:06:16 lupus kernel: [ 93.969008] [<ffffffff812f3f93>] ?
update_reserved_bytes+0xb3/0x140
Feb 14 02:06:16 lupus kernel: [ 93.969102] [<ffffffff81305060>] ?
btree_get_extent+0x0/0x1c0
Feb 14 02:06:16 lupus kernel: [ 93.969193] [<ffffffff8132bb7c>] ?
merge_state+0x7c/0x150
Feb 14 02:06:16 lupus kernel: [ 93.969285] [<ffffffff81330b21>] ?
read_extent_buffer_pages+0x2d1/0x470
Feb 14 02:06:16 lupus kernel: [ 93.969378] [<ffffffff81305060>] ?
btree_get_extent+0x0/0x1c0
Feb 14 02:06:16 lupus kernel: [ 93.969470] [<ffffffff8130674d>] ?
btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
Feb 14 02:06:16 lupus kernel: [ 93.969602] [<ffffffff813076f9>] ?
read_tree_block+0x39/0x60
Feb 14 02:06:16 lupus kernel: [ 93.969694] [<ffffffff812ed5e6>] ?
read_block_for_search.clone.40+0x116/0x410
Feb 14 02:06:16 lupus kernel: [ 93.969878] [<ffffffff812f0bc7>] ?
btrfs_search_slot+0x307/0xa00
Feb 14 02:06:16 lupus kernel: [ 93.969970] [<ffffffff812f6b18>] ?
lookup_inline_extent_backref+0x98/0x4a0
Feb 14 02:06:16 lupus kernel: [ 93.970065] [<ffffffff810e33d7>] ?
kmem_cache_alloc+0x87/0xa0
Feb 14 02:06:16 lupus kernel: [ 93.970157] [<ffffffff812f891c>] ?
__btrfs_free_extent+0xcc/0x6f0
Feb 14 02:06:16 lupus kernel: [ 93.970249] [<ffffffff812f8434>] ?
update_block_group.clone.62+0xc4/0x280
Feb 14 02:06:16 lupus kernel: [ 93.970343] [<ffffffff812fc4cf>] ?
run_clustered_refs+0x39f/0x880
Feb 14 02:06:16 lupus kernel: [ 93.970436] [<ffffffff812fca77>] ?
btrfs_run_delayed_refs+0xc7/0x220
Feb 14 02:06:16 lupus kernel: [ 93.970529] [<ffffffff810e15f9>] ?
new_slab+0x169/0x1f0
Feb 14 02:06:16 lupus kernel: [ 93.970619] [<ffffffff8130c29c>] ?
btrfs_commit_transaction+0x7c/0x760
Feb 14 02:06:16 lupus kernel: [ 93.970713] [<ffffffff81067ea0>] ?
autoremove_wake_function+0x0/0x30
Feb 14 02:06:16 lupus kernel: [ 93.970806] [<ffffffff81305bc3>] ?
transaction_kthread+0x283/0x2a0
Feb 14 02:06:16 lupus kernel: [ 93.970898] [<ffffffff81305940>] ?
transaction_kthread+0x0/0x2a0
Feb 14 02:06:16 lupus kernel: [ 93.970990] [<ffffffff81305940>] ?
transaction_kthread+0x0/0x2a0
Feb 14 02:06:16 lupus kernel: [ 93.971083] [<ffffffff81067a16>] ?
kthread+0x96/0xa0
Feb 14 02:06:16 lupus kernel: [ 93.971174] [<ffffffff81003514>] ?
kernel_thread_helper+0x4/0x10
Feb 14 02:06:16 lupus kernel: [ 93.971266] [<ffffffff81067980>] ?
kthread+0x0/0xa0
Feb 14 02:06:16 lupus kernel: [ 93.971355] [<ffffffff81003510>] ?
kernel_thread_helper+0x0/0x10
Feb 14 02:06:16 lupus kernel: [ 93.971444] Code: 55 b8 ff 00 00 00
53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
48 89 06 84 c9 48 8b 85 68 fe ff ff
Feb 14 02:06:16 lupus kernel: [ 93.974280] RIP [<ffffffff8133ef1b>]
btrfs_encode_fh+0x2b/0x120
Feb 14 02:06:16 lupus kernel: [ 93.974412] RSP <ffff88023c63b6e0>
Feb 14 02:06:16 lupus kernel: [ 93.974497] CR2: 0000030341ed0050
Feb 14 02:06:16 lupus kernel: [ 93.974599] ---[ end trace
3313552d105b1536 ]---
Feb 14 02:07:04 lupus kernel: [ 141.906124] zcache: destroyed pool id=2
Feb 14 02:07:17 lupus kernel: [ 154.783358] SysRq : Keyboard mode set
to system default
Feb 14 02:07:18 lupus kernel: [ 155.486147] SysRq : Terminate All Tasks


That's all for now

Thanks & Regards

Matt

2011-02-14 01:30:07

by Matt

[permalink] [raw]
Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>> <[email protected]> wrote:
>> [snip]
>>>
>>> If I've missed anything important, please let me know!
>>>
>>> Thanks again!
>>> Dan
>>>
>>
>> Hi Dan,
>>
>> thank you so much for answering my email in such detail !
>>
>> I shall pick up on that mail in my next email sending to the mailing list :)
>>
>>
>> currently I've got a problem with btrfs which seems to get triggered
>> by cleancache get-operations:
>>
>>
>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>> /dev/mapper/portage
>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
>> tmem pool, id=3
>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>> kernel paging request at 0000000001400050
>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>> /sys/devices/platform/coretemp.3/temp1_input
>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>> ehci_hcd [last unloaded: tg3]
>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>> G3710
>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>> 0010:[<ffffffff8133ef1b>] [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>> 0018:ffff880129a11b00 EFLAGS: 00010246
>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>> RSI: ffff880129a11b70 RDI: 0000000000000006
>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>> R11: 0000000000000001 R12: ffff88014a1ce780
>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>> R14: ffff88021fef9000 R15: 0000000000000000
>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>> knlGS:0000000000000000
>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: 0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>> CR3: 0000000001c27000 CR4: 00000000000006e0
>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
>> ffff880000000003 ffff88003c60a098 0000000000000003
>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
>> ffffffff810e6cee 0000000000001000 0000000000000000
>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>] ?
>> btrfs_wake_function+0x0/0x20
>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>] ?
>> __cleancache_flush_inode+0x3e/0x70
>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>] ?
>> truncate_inode_pages_range+0x42/0x440
>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>] ?
>> btrfs_search_slot+0x89e/0xa00
>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>] ?
>> unmap_mapping_range+0xc5/0x2a0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>] ?
>> truncate_pagecache+0x40/0x70
>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>] ?
>> btrfs_truncate_free_space_cache+0x81/0xe0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>] ?
>> btrfs_write_dirty_block_groups+0x245/0x500
>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>] ?
>> btrfs_run_delayed_refs+0x1ba/0x220
>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>] ?
>> commit_cowonly_roots+0xff/0x1d0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>] ?
>> btrfs_commit_transaction+0x363/0x760
>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>] ?
>> autoremove_wake_function+0x0/0x30
>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>] ?
>> transaction_kthread+0x283/0x2a0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>] ?
>> kthread+0x96/0xa0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>] ?
>> kernel_thread_helper+0x4/0x10
>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>] ?
>> kthread+0x0/0xa0
>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>] ?
>> kernel_thread_helper+0x0/0x10
>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP ?[<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP <ffff880129a11b00>
>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>> f831c5ceeaa49287 ]---
>>
>> in my case I had compress-force with lzo and disk_cache enabled
>>
>>
>> another user of the kernel I'm currently running has had the same
>> problem with zcache
>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>>
>> (looks like in his case compression and any other fancy additional
>> features weren't enabled)
>>
>>
>> changes made by this kernel or patchset to btrfs are from
>> * io-less dirty throttling patchset (44 patches)
>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>> applied in both cases)
>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>> * btrfs-unstable changes to state of
>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
>> from 2.6.38-rc4+)
>>
>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>>
>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
>> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>>
>> this also didn't help and seemed to produce the same error-message
>>
>> so to summarize:
>>
>> 1) the error message appears with all 4 patch(sets) that change
>> btrfs code applied, and with compress-force=lzo and disk_cache enabled
>>
>> 2) the error message appears with default mount options, btrfs from
>> 2.6.37, and only the changes for zcache & io-less dirty throttling
>> applied (the first 2 patch(sets) from the list)
>>
>>
>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>> portage-directory (lots of small files and some tar.bzip2 archives)
>> via pbzip2 or 7z when the error happened and the message was shown
>>
>> Sound (webradio streaming) was still running, but due to KMS X
>> switched to the kernel output and I couldn't continue working, so I
>> did the magic SysRq combo (REISUB)
>>
>>
>> Does that BUG message ring a bell for anyone ?
>>
>> (if I should leave out anyone from the CC in the next emails or
>> future, please holler - I don't want to spam your inboxes)
>>
>> Thanks
>>
>> Matt
>>
>
>
> OK,
>
> here's the output of a kernel -
>
> staying as close to vanilla (2.6.37) as the current situation allows
> (only including some corruption or leak fixes for zram & zcache and
> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
> fixes)):
>
> so in total the following patches are included in this new kernel
> (2.6.37-zcache):
>
> zram changes:
> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
> 2 [PATCH] zram: fix data corruption issue
> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>
> zcache:
> 1 zcache-linux-2.6.37-110205
> 2 [PATCH] staging: zcache: fix memory leak
> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>
> reiserfs:
> 1 [PATCH] reiserfs: Make sure va_end() is always called after
> 2 [patch] reiserfs: potential ERR_PTR dereference
>
>
> the same procedure:
>
> trying to extract the mentioned portage-tarball:
>
> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
> -xp -C /usr/gentoo/)
>
>
> this hopefully should make it easier to track down the problem:
>
>
> Feb 14 01:59:59 lupus kernel: [  364.777143] device fsid
> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
> /dev/mapper/portage
> Feb 14 01:59:59 lupus kernel: [  364.844994] zcache: created ephemeral
> tmem pool, id=2
> Feb 14 02:02:49 lupus kernel: [  534.577573] BUG: unable to handle
> kernel paging request at 0000000037610050
> Feb 14 02:02:49 lupus kernel: [  534.577605] IP: [<ffffffff81338cbb>]
> btrfs_encode_fh+0x2b/0x110
> Feb 14 02:02:49 lupus kernel: [  534.577630] PGD 0
> Feb 14 02:02:49 lupus kernel: [  534.577640] Oops: 0000 [#1] PREEMPT SMP
> Feb 14 02:02:49 lupus kernel: [  534.577665] last sysfs file:
> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> Feb 14 02:02:49 lupus kernel: [  534.577693] CPU 5
> Feb 14 02:02:49 lupus kernel: [  534.577701] Modules linked in: radeon
> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
> ehci_hcd [last unloaded: tg3]
> Feb 14 02:02:49 lupus kernel: [  534.578114]
> Feb 14 02:02:49 lupus kernel: [  534.578124] Pid: 8285, comm: tar Not
> tainted 2.6.37-zcache #2 FMP55/ipower G3710
> Feb 14 02:02:49 lupus kernel: [  534.578146] RIP:
> 0010:[<ffffffff81338cbb>]  [<ffffffff81338cbb>]
> btrfs_encode_fh+0x2b/0x110
> Feb 14 02:02:49 lupus kernel: [  534.578172] RSP:
> 0018:ffff88023ea9dcc8  EFLAGS: 00010246
> Feb 14 02:02:49 lupus kernel: [  534.578189] RAX: 00000000000000ff
> RBX: ffff8800b8643228 RCX: 0000000000000000
> Feb 14 02:02:49 lupus kernel: [  534.578210] RDX: ffff88023ea9dd04
> RSI: ffff88023ea9dd38 RDI: 0000000000000006
> Feb 14 02:02:49 lupus kernel: [  534.578230] RBP: 0000000037610000
> R08: ffffffff81338c90 R09: 0000000000000000
> Feb 14 02:02:49 lupus kernel: [  534.578251] R10: 0000000000000019
> R11: 0000000000000001 R12: ffff8800b8643380
> Feb 14 02:02:49 lupus kernel: [  534.578272] R13: ffff8800b8643258
> R14: 00007fff806f1f00 R15: 0000000000000000
> Feb 14 02:02:49 lupus kernel: [  534.578293] FS:
> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
> knlGS:0000000000000000
> Feb 14 02:02:49 lupus kernel: [  534.578317] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Feb 14 02:02:49 lupus kernel: [  534.578334] CR2: 0000000037610050
> CR3: 000000023dcef000 CR4: 00000000000006e0
> Feb 14 02:02:49 lupus kernel: [  534.578356] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Feb 14 02:02:49 lupus kernel: [  534.578377] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Feb 14 02:02:49 lupus kernel: [  534.578398] Process tar (pid: 8285,
> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
> Feb 14 02:02:49 lupus kernel: [  534.578421] Stack:
> Feb 14 02:02:49 lupus kernel: [  534.578428]  000000013d096000
> ffff88023ed84800 ffff88023ea9c000 0000000000000002
> Feb 14 02:02:49 lupus kernel: [  534.578458]  ffffffffffffffff
> ffffffff810e3b1a 0000000000000001 000000061e1d5240
> Feb 14 02:02:49 lupus kernel: [  534.578486]  fffffffffffffffb
> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
> Feb 14 02:02:49 lupus kernel: [  534.578514] Call Trace:
> Feb 14 02:02:49 lupus kernel: [  534.578525]  [<ffffffff810e3b1a>] ?
> cleancache_get_key+0x4a/0x60
> Feb 14 02:02:49 lupus kernel: [  534.578544]  [<ffffffff810e3d5e>] ?
> __cleancache_flush_inode+0x3e/0x70
> Feb 14 02:02:49 lupus kernel: [  534.578565]  [<ffffffff810b0ed2>] ?
> truncate_inode_pages_range+0x42/0x440
> Feb 14 02:02:49 lupus kernel: [  534.578586]  [<ffffffff81338451>] ?
> btrfs_tree_unlock+0x41/0x50
> Feb 14 02:02:49 lupus kernel: [  534.578605]  [<ffffffff812e4ed5>] ?
> btrfs_release_path+0x15/0x70
> Feb 14 02:02:49 lupus kernel: [  534.578624]  [<ffffffff8130bf29>] ?
> btrfs_run_delayed_iputs+0x49/0x120
> Feb 14 02:02:49 lupus kernel: [  534.578644]  [<ffffffff813107e7>] ?
> btrfs_evict_inode+0x27/0x1e0
> Feb 14 02:02:49 lupus kernel: [  534.578663]  [<ffffffff810fc3aa>] ?
> evict+0x1a/0xa0
> Feb 14 02:02:49 lupus kernel: [  534.578678]  [<ffffffff810fc6bd>] ?
> iput+0x1cd/0x2b0
> Feb 14 02:02:49 lupus kernel: [  534.578694]  [<ffffffff810f266f>] ?
> do_unlinkat+0x12f/0x1d0
> Feb 14 02:02:49 lupus kernel: [  534.578712]  [<ffffffff810027bb>] ?
> system_call_fastpath+0x16/0x1b
> Feb 14 02:02:49 lupus kernel: [  534.578730] Code: 55 b8 ff 00 00 00
> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> Feb 14 02:02:49 lupus kernel: [  534.578986] RIP  [<ffffffff81338cbb>]
> btrfs_encode_fh+0x2b/0x110
> Feb 14 02:02:49 lupus kernel: [  534.579081]  RSP <ffff88023ea9dcc8>
> Feb 14 02:02:49 lupus kernel: [  534.579093] CR2: 0000000037610050
> Feb 14 02:02:49 lupus kernel: [  534.587513] ---[ end trace
> c596b12e66c0b360 ]---
>
>
> for reference I've pasted it to pastebin.com:
>
> "2.6.37_zcache_V2.patch"
> http://pastebin.com/cVSkwQ6M
>
>
>
>
>
> after the reboot I had forgotten to not mount the btrfs volume, and it
> threw a similar error message again and remounted several partitions
> read-only (including the system partition).
> The partition with btrfs (/usr/gentoo) couldn't be unmounted since the
> umount process kind of hung
>
> so here's the error message after a reboot. It might not be accurate, or
> may be kind of "skewed", since other patches are included (io-less dirty
> throttling, "[PATCH] fix (latent?) memory corruption in
> btrfs_encode_fh()" and the latest changes for btrfs), but it might help
> to get some more evidence:
>
>
> Feb 14 02:05:46 lupus kernel: [   63.922648] device fsid
> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
> /dev/mapper/portage
> Feb 14 02:05:46 lupus kernel: [   64.047118] btrfs: unlinked 1 orphans
> Feb 14 02:05:46 lupus kernel: [   64.051956] zcache: created ephemeral
> tmem pool, id=3
> Feb 14 02:05:48 lupus kernel: [   65.801364] hub 2-1:1.0: hub_suspend
> Feb 14 02:05:48 lupus kernel: [   65.801376] usb 2-1: unlink
> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
> Feb 14 02:05:48 lupus kernel: [   65.801559] usb 2-1: usb auto-suspend
> Feb 14 02:05:50 lupus kernel: [   67.797929] hub 2-0:1.0: hub_suspend
> Feb 14 02:05:50 lupus kernel: [   67.797939] usb usb2: bus auto-suspend
> Feb 14 02:05:50 lupus kernel: [   67.797942] ehci_hcd 0000:00:1d.0:
> suspend root hub
> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
> kernel paging request at 0000030341ed0050
> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP: [<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1] PREEMPT SMP
> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
> /sys/module/pcie_aspm/parameters/policy
> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
> usb_storage ehci_hcd [last unloaded: tg3]
> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm: umount
> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
> RBX: ffff88023cde0168 RCX: 0000000000000000
> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
> RSI: ffff88023c77d768 RDI: 0000000000000006
> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
> R11: 0000000000000001 R12: 00000000ffffffff
> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
> R14: ffffea0007dd39f0 R15: 0000000000000001
> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
> knlGS:0000000000000000
> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
> CR3: 000000023c7d5000 CR4: 00000000000006e0
> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
> 0000000000000002 ffff880200000000 0000000000000003
> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
> ffffffff810e6aaa ffff880200000041 0000000600000246
> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>] ?
> cleancache_get_key+0x4a/0x60
> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>] ?
> __cleancache_get_page+0x7a/0xd0
> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>] ?
> merge_state+0x7c/0x150
> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>] ?
> __extent_read_full_page+0x52e/0x710
> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>] ?
> rb_insert_color+0xa4/0x140
> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>] ?
> tree_insert+0x86/0x1e0
> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>] ?
> lock_timer_base.clone.22+0x33/0x70
> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>] ?
> btree_get_extent+0x0/0x1c0
> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>] ?
> read_extent_buffer_pages+0x2d1/0x470
> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>] ?
> btree_get_extent+0x0/0x1c0
> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>] ?
> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>] ?
> read_tree_block+0x39/0x60
> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>] ?
> read_block_for_search.clone.40+0x116/0x410
> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>] ?
> btrfs_cow_block+0x118/0x2b0
> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>] ?
> btrfs_search_slot+0x307/0xa00
> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>] ?
> lookup_inline_extent_backref+0x98/0x4a0
> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>] ?
> kmem_cache_alloc+0x87/0xa0
> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>] ?
> __btrfs_free_extent+0xcc/0x6f0
> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>] ?
> run_clustered_refs+0x39f/0x880
> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>] ?
> pagevec_lookup_tag+0x18/0x20
> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>] ?
> filemap_fdatawait_range+0x91/0x180
> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>] ?
> btrfs_run_delayed_refs+0xc7/0x220
> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>] ?
> btrfs_commit_transaction+0x7c/0x760
> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>] ?
> autoremove_wake_function+0x0/0x30
> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>] ?
> start_transaction+0x1bf/0x270
> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>] ?
> __sync_filesystem+0x5a/0x90
> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>] ?
> generic_shutdown_super+0x2d/0x100
> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>] ?
> kill_anon_super+0x9/0x50
> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>] ?
> deactivate_locked_super+0x26/0x80
> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>] ?
> sys_umount+0x7a/0x390
> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>] ?
> system_call_fastpath+0x16/0x1b
> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00 00
> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP ?[<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP <ffff88023c77d6f8>
> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
> 3313552d105b1535 ]---
> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
> kernel paging request at 0000030341ed0050
> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP: [<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2] PREEMPT SMP
> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
> usb_storage ehci_hcd [last unloaded: tg3]
> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
> FMP55/ipower G3710
> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
> RBX: ffff88023cde0168 RCX: 0000000000000000
> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
> RSI: ffff88023c63b750 RDI: 0000000000000006
> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
> R11: 0000000000000001 R12: 00000000ffffffff
> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
> R14: ffffea0007d59bc8 R15: 0000000000000001
> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
> 0000000000000000(0000) GS:ffff8800bf540000(0000)
> knlGS:0000000000000000
> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
> 0000 CR0: 000000008005003b
> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
> CR3: 0000000001c27000 CR4: 00000000000006e0
> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
> DR1: 0000000000000000 DR2: 0000000000000000
> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-transacti
> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
> 0000000000000002 0000000000000000 0000000000000003
> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
> ffffffff810e6aaa 0000000000000041 0000000600000002
> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
> ffffffff810e6b3a 0000000000000001 ffffffff00000001
> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>] ?
> cleancache_get_key+0x4a/0x60
> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>] ?
> __cleancache_get_page+0x7a/0xd0
> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>] ?
> __extent_read_full_page+0x52e/0x710
> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>] ?
> update_reserved_bytes+0xb3/0x140
> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>] ?
> btree_get_extent+0x0/0x1c0
> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>] ?
> merge_state+0x7c/0x150
> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>] ?
> read_extent_buffer_pages+0x2d1/0x470
> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>] ?
> btree_get_extent+0x0/0x1c0
> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>] ?
> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>] ?
> read_tree_block+0x39/0x60
> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>] ?
> read_block_for_search.clone.40+0x116/0x410
> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>] ?
> btrfs_search_slot+0x307/0xa00
> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>] ?
> lookup_inline_extent_backref+0x98/0x4a0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>] ?
> kmem_cache_alloc+0x87/0xa0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>] ?
> __btrfs_free_extent+0xcc/0x6f0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>] ?
> update_block_group.clone.62+0xc4/0x280
> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>] ?
> run_clustered_refs+0x39f/0x880
> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>] ?
> btrfs_run_delayed_refs+0xc7/0x220
> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>] ?
> new_slab+0x169/0x1f0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>] ?
> btrfs_commit_transaction+0x7c/0x760
> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>] ?
> autoremove_wake_function+0x0/0x30
> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>] ?
> transaction_kthread+0x283/0x2a0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>] ?
> transaction_kthread+0x0/0x2a0
> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>] ?
> transaction_kthread+0x0/0x2a0
> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>] ?
> kthread+0x96/0xa0
> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>] ?
> kernel_thread_helper+0x4/0x10
> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>] ?
> kthread+0x0/0xa0
> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>] ?
> kernel_thread_helper+0x0/0x10
> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00 00
> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> Feb 14 02:06:16 lupus kernel: [ ? 93.974280] RIP ?[<ffffffff8133ef1b>]
> btrfs_encode_fh+0x2b/0x120
> Feb 14 02:06:16 lupus kernel: [ ? 93.974412] ?RSP <ffff88023c63b6e0>
> Feb 14 02:06:16 lupus kernel: [ ? 93.974497] CR2: 0000030341ed0050
> Feb 14 02:06:16 lupus kernel: [ ? 93.974599] ---[ end trace
> 3313552d105b1536 ]---
> Feb 14 02:07:04 lupus kernel: [  141.906124] zcache: destroyed pool id=2
> Feb 14 02:07:17 lupus kernel: [  154.783358] SysRq : Keyboard mode set
> to system default
> Feb 14 02:07:18 lupus kernel: [  155.486147] SysRq : Terminate All Tasks
>
>
> That's all for now
>
> Thanks & Regards
>
> Matt
>

(leaving out several folks from the CC to avoid spamming - if I left
out someone wrongfully please re-add)

Running addr2line on the faulting IP from the 2.6.37-zcache oops reveals:


addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
export.c:0
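
As a cross-check that needs no debug info, the numbers in the
2.6.37-zcache oops quoted above can be reconciled by hand. The following
is only a minimal sketch in Python: the helper names (split_symbol,
decode_mov_load_disp8) are made up for illustration, the values are
copied from that trace, and the tiny decoder handles just the one
instruction form found at the <48> marker in the Code: line, not x86-64
in general.

```python
import re

# Values copied from the 2.6.37-zcache oops (comm: tar) quoted above.
IP = 0xffffffff81338cbb
SYMBOL = "btrfs_encode_fh+0x2b/0x110"
RBP = 0x0000000037610000
CR2 = 0x0000000037610050
# The four bytes starting at the <48> marker in the "Code:" line.
FAULTING_BYTES = bytes.fromhex("488b4550")

# 64-bit register names indexed by their 3-bit encoding (no REX.R/REX.B).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def split_symbol(sym):
    """Split 'name+0xOFF/0xSIZE' into (name, offset, size)."""
    m = re.fullmatch(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", sym)
    if m is None:
        raise ValueError("unexpected symbol format: %r" % sym)
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def decode_mov_load_disp8(code):
    """Decode only 'REX.W 8B /r' with mod=01: mov r64, [base + disp8].

    Returns (dest, base, disp); any other encoding raises ValueError.
    """
    if len(code) < 4 or code[0] != 0x48 or code[1] != 0x8B:
        raise ValueError("not a REX.W 'mov r64, r/m64'")
    modrm = code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:  # rm == 100 would need a SIB byte
        raise ValueError("unsupported addressing mode")
    # disp8 is a signed byte.
    disp = code[3] - 256 if code[3] >= 0x80 else code[3]
    return REG64[reg], REG64[rm], disp


name, offset, size = split_symbol(SYMBOL)
func_start = IP - offset
dest, base, disp = decode_mov_load_disp8(FAULTING_BYTES)

print(f"{name} starts at {func_start:#x} (size {size:#x})")
print(f"faulting insn: mov {disp:#x}(%{base}),%{dest}")
print(f"RBP + disp = {RBP + disp:#x}, CR2 = {CR2:#x}")
```

Under these assumptions the script reports that btrfs_encode_fh starts
at 0xffffffff81338c90 (the same value the dump shows in R08), that the
faulting instruction is a load from 0x50(%rbp), and that RBP + 0x50
equals CR2, so the bad pointer appears to be whatever was loaded into
RBP.
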


hope that helps


Regards

Matt

2011-02-14 04:35:15

by Minchan Kim

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 10:29 AM, Matt <[email protected]> wrote:
> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>>> <[email protected]> wrote:
>>> [snip]
>>>>
>>>> If I've missed anything important, please let me know!
>>>>
>>>> Thanks again!
>>>> Dan
>>>>
>>>
>>> Hi Dan,
>>>
>>> thank you so much for answering my email in such detail !
>>>
>>> I shall pick up on that mail in my next email sending to the mailing list :)
>>>
>>>
>>> currently I've got a problem with btrfs which seems to get triggered
>>> by cleancache get-operations:
>>>
>>>
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>>> /dev/mapper/portage
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
>>> tmem pool, id=3
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>>> kernel paging request at 0000000001400050
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>>> /sys/devices/platform/coretemp.3/temp1_input
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>> ehci_hcd [last unloaded: tg3]
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>>> G3710
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>>> 0010:[<ffffffff8133ef1b>]  [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>>> 0018:ffff880129a11b00  EFLAGS: 00010246
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>>> RSI: ffff880129a11b70 RDI: 0000000000000006
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>>> R11: 0000000000000001 R12: ffff88014a1ce780
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>>> R14: ffff88021fef9000 R15: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS:  0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006]  ffff880129a11b50
>>> ffff880000000003 ffff88003c60a098 0000000000000003
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035]  ffffffffffffffff
>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063]  ffffffff8133e3f0
>>> ffffffff810e6cee 0000000000001000 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103]  [<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122]  [<ffffffff8133e3f0>] ?
>>> btrfs_wake_function+0x0/0x20
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140]  [<ffffffff810e6cee>] ?
>>> __cleancache_flush_inode+0x3e/0x70
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161]  [<ffffffff810b34d2>] ?
>>> truncate_inode_pages_range+0x42/0x440
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182]  [<ffffffff812f115e>] ?
>>> btrfs_search_slot+0x89e/0xa00
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201]  [<ffffffff810c3a45>] ?
>>> unmap_mapping_range+0xc5/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220]  [<ffffffff810b3930>] ?
>>> truncate_pagecache+0x40/0x70
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240]  [<ffffffff813458b1>] ?
>>> btrfs_truncate_free_space_cache+0x81/0xe0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261]  [<ffffffff812fce15>] ?
>>> btrfs_write_dirty_block_groups+0x245/0x500
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283]  [<ffffffff812fcb6a>] ?
>>> btrfs_run_delayed_refs+0x1ba/0x220
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304]  [<ffffffff8130afff>] ?
>>> commit_cowonly_roots+0xff/0x1d0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323]  [<ffffffff8130c583>] ?
>>> btrfs_commit_transaction+0x363/0x760
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344]  [<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364]  [<ffffffff81305bc3>] ?
>>> transaction_kthread+0x283/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383]  [<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401]  [<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420]  [<ffffffff81067a16>] ?
>>> kthread+0x96/0xa0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437]  [<ffffffff81003514>] ?
>>> kernel_thread_helper+0x4/0x10
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455]  [<ffffffff81067980>] ?
>>> kthread+0x0/0xa0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471]  [<ffffffff81003510>] ?
>>> kernel_thread_helper+0x0/0x10
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP  [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762]  RSP <ffff880129a11b00>
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>>> f831c5ceeaa49287 ]---
>>>
>>> in my case I had compress-force with lzo and disk_cache enabled
>>>
>>>
>>> another user of the kernel I'm currently running has had the same
>>> problem with zcache
>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>>>
>>> (looks like in his case compression and any other fancy additional
>>> features weren't enabled)
>>>
>>>
>>> changes made by this kernel or patchset to btrfs are from
>>> * io-less dirty throttling patchset (44 patches)
>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>>> applied in both cases)
>>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>>> * btrfs-unstable changes to state of
>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
>>> from 2.6.38-rc4+)
>>>
>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>>>
>>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
>>> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>>>
>>> this also didn't help and seemed to produce the same error-message
>>>
>>> so to summarize:
>>>
>>> 1) error message appears with all 4 patchsets that change btrfs code
>>> applied, and with compress-force=lzo and disk_cache enabled
>>>
>>> 2) error message appears with default mount options and btrfs from
>>> 2.6.37, with only the changes for zcache & the io-less dirty throttling
>>> patchset applied (the first 2 patch(sets) from the list)
>>>
>>>
>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>>> portage-directory (lots of small files and some tar.bzip2 archives)
>>> via pbzip2 or 7z when the error happened and the message was shown
>>>
>>> Sound (webradio streaming) was still running, but I couldn't continue
>>> working (due to KMS, X switched to the kernel output), so I did the
>>> magic sysrq combo (reisub)
>>>
>>>
>>> Does that BUG message ring a bell for anyone ?
>>>
>>> (if I should leave out anyone from the CC in the next emails or
>>> future, please holler - I don't want to spam your inboxes)
>>>
>>> Thanks
>>>
>>> Matt
>>>
>>
>>
>> OK,
>>
>> here's the output of a kernel -
>>
>> staying as close to vanilla (2.6.37) as the current situation allows
>> (only including some corruption or leak fixes for zram & zcache and
>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
>> fixes)):
>>
>> so in total the following patches are included in this new kernel
>> (2.6.37-zcache):
>>
>> zram changes:
>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
>> 2 [PATCH] zram: fix data corruption issue
>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>>
>> zcache:
>> 1 zcache-linux-2.6.37-110205
>> 2 [PATCH] staging: zcache: fix memory leak
>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>>
>> reiserfs:
>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
>> 2 [patch] reiserfs: potential ERR_PTR dereference
>>
>>
>> the same procedure:
>>
>> trying to extract the mentioned portage-tarball:
>>
>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
>> -xp -C /usr/gentoo/)
>>
>>
>> this hopefully should make it easier to track down the problem:
>>
>>
>> Feb 14 01:59:59 lupus kernel: [  364.777143] device fsid
>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
>> /dev/mapper/portage
>> Feb 14 01:59:59 lupus kernel: [  364.844994] zcache: created ephemeral
>> tmem pool, id=2
>> Feb 14 02:02:49 lupus kernel: [  534.577573] BUG: unable to handle
>> kernel paging request at 0000000037610050
>> Feb 14 02:02:49 lupus kernel: [  534.577605] IP: [<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [  534.577630] PGD 0
>> Feb 14 02:02:49 lupus kernel: [  534.577640] Oops: 0000 [#1] PREEMPT SMP
>> Feb 14 02:02:49 lupus kernel: [  534.577665] last sysfs file:
>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> Feb 14 02:02:49 lupus kernel: [  534.577693] CPU 5
>> Feb 14 02:02:49 lupus kernel: [  534.577701] Modules linked in: radeon
>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>> ehci_hcd [last unloaded: tg3]
>> Feb 14 02:02:49 lupus kernel: [  534.578114]
>> Feb 14 02:02:49 lupus kernel: [  534.578124] Pid: 8285, comm: tar Not
>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
>> Feb 14 02:02:49 lupus kernel: [  534.578146] RIP:
>> 0010:[<ffffffff81338cbb>]  [<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [  534.578172] RSP:
>> 0018:ffff88023ea9dcc8  EFLAGS: 00010246
>> Feb 14 02:02:49 lupus kernel: [  534.578189] RAX: 00000000000000ff
>> RBX: ffff8800b8643228 RCX: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [  534.578210] RDX: ffff88023ea9dd04
>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
>> Feb 14 02:02:49 lupus kernel: [  534.578230] RBP: 0000000037610000
>> R08: ffffffff81338c90 R09: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [  534.578251] R10: 0000000000000019
>> R11: 0000000000000001 R12: ffff8800b8643380
>> Feb 14 02:02:49 lupus kernel: [  534.578272] R13: ffff8800b8643258
>> R14: 00007fff806f1f00 R15: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [  534.578293] FS:
>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:02:49 lupus kernel: [  534.578317] CS:  0010 DS: 0000 ES:
>> 0000 CR0: 0000000080050033
>> Feb 14 02:02:49 lupus kernel: [  534.578334] CR2: 0000000037610050
>> CR3: 000000023dcef000 CR4: 00000000000006e0
>> Feb 14 02:02:49 lupus kernel: [  534.578356] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [  534.578377] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:02:49 lupus kernel: [  534.578398] Process tar (pid: 8285,
>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
>> Feb 14 02:02:49 lupus kernel: [  534.578421] Stack:
>> Feb 14 02:02:49 lupus kernel: [  534.578428]  000000013d096000
>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
>> Feb 14 02:02:49 lupus kernel: [  534.578458]  ffffffffffffffff
>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
>> Feb 14 02:02:49 lupus kernel: [  534.578486]  fffffffffffffffb
>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
>> Feb 14 02:02:49 lupus kernel: [  534.578514] Call Trace:
>> Feb 14 02:02:49 lupus kernel: [  534.578525]  [<ffffffff810e3b1a>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:02:49 lupus kernel: [  534.578544]  [<ffffffff810e3d5e>] ?
>> __cleancache_flush_inode+0x3e/0x70
>> Feb 14 02:02:49 lupus kernel: [  534.578565]  [<ffffffff810b0ed2>] ?
>> truncate_inode_pages_range+0x42/0x440
>> Feb 14 02:02:49 lupus kernel: [  534.578586]  [<ffffffff81338451>] ?
>> btrfs_tree_unlock+0x41/0x50
>> Feb 14 02:02:49 lupus kernel: [  534.578605]  [<ffffffff812e4ed5>] ?
>> btrfs_release_path+0x15/0x70
>> Feb 14 02:02:49 lupus kernel: [  534.578624]  [<ffffffff8130bf29>] ?
>> btrfs_run_delayed_iputs+0x49/0x120
>> Feb 14 02:02:49 lupus kernel: [  534.578644]  [<ffffffff813107e7>] ?
>> btrfs_evict_inode+0x27/0x1e0
>> Feb 14 02:02:49 lupus kernel: [  534.578663]  [<ffffffff810fc3aa>] ?
>> evict+0x1a/0xa0
>> Feb 14 02:02:49 lupus kernel: [  534.578678]  [<ffffffff810fc6bd>] ?
>> iput+0x1cd/0x2b0
>> Feb 14 02:02:49 lupus kernel: [  534.578694]  [<ffffffff810f266f>] ?
>> do_unlinkat+0x12f/0x1d0
>> Feb 14 02:02:49 lupus kernel: [  534.578712]  [<ffffffff810027bb>] ?
>> system_call_fastpath+0x16/0x1b
>> Feb 14 02:02:49 lupus kernel: [  534.578730] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:02:49 lupus kernel: [  534.578986] RIP  [<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [  534.579081]  RSP <ffff88023ea9dcc8>
>> Feb 14 02:02:49 lupus kernel: [  534.579093] CR2: 0000000037610050
>> Feb 14 02:02:49 lupus kernel: [  534.587513] ---[ end trace
>> c596b12e66c0b360 ]---
>>
>>
>> for reference I've pasted it to pastebin.com:
>>
>> "2.6.37_zcache_V2.patch"
>> http://pastebin.com/cVSkwQ6M
>>
>>
>>
>>
>>
>> after the reboot I forgot to leave the btrfs volume unmounted, and it
>> threw a similar error message again and remounted several partitions
>> read-only (including the system partition)
>> the partition with btrfs (/usr/gentoo) couldn't be unmounted since the
>> umount process more or less hung
>>
>> so here's the error message after a reboot (might not be accurate or
>> kind of "skewed" since other patches are included (io-less dirty
>> throttling, [PATCH] fix (latent?) memory corruption in
>> btrfs_encode_fh() and latest changes for btrfs)) but might help to get
>> some more evidence:
>>
>>
>> Feb 14 02:05:46 lupus kernel: [   63.922648] device fsid
>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
>> /dev/mapper/portage
>> Feb 14 02:05:46 lupus kernel: [   64.047118] btrfs: unlinked 1 orphans
>> Feb 14 02:05:46 lupus kernel: [   64.051956] zcache: created ephemeral
>> tmem pool, id=3
>> Feb 14 02:05:48 lupus kernel: [   65.801364] hub 2-1:1.0: hub_suspend
>> Feb 14 02:05:48 lupus kernel: [   65.801376] usb 2-1: unlink
>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
>> Feb 14 02:05:48 lupus kernel: [   65.801559] usb 2-1: usb auto-suspend
>> Feb 14 02:05:50 lupus kernel: [   67.797929] hub 2-0:1.0: hub_suspend
>> Feb 14 02:05:50 lupus kernel: [   67.797939] usb usb2: bus auto-suspend
>> Feb 14 02:05:50 lupus kernel: [   67.797942] ehci_hcd 0000:00:1d.0:
>> suspend root hub
>> Feb 14 02:05:52 lupus kernel: [   70.050493] BUG: unable to handle
>> kernel paging request at 0000030341ed0050
>> Feb 14 02:05:52 lupus kernel: [   70.050670] IP: [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [   70.050807] PGD 0
>> Feb 14 02:05:52 lupus kernel: [   70.050929] Oops: 0000 [#1] PREEMPT SMP
>> Feb 14 02:05:52 lupus kernel: [   70.051223] last sysfs file:
>> /sys/module/pcie_aspm/parameters/policy
>> Feb 14 02:05:52 lupus kernel: [   70.051365] CPU 6
>> Feb 14 02:05:52 lupus kernel: [   70.051411] Modules linked in:
>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> usb_storage ehci_hcd [last unloaded: tg3]
>> Feb 14 02:05:52 lupus kernel: [   70.054694]
>> Feb 14 02:05:52 lupus kernel: [   70.054776] Pid: 7962, comm: umount
>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
>> Feb 14 02:05:52 lupus kernel: [   70.054912] RIP:
>> 0010:[<ffffffff8133ef1b>]  [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [   70.055084] RSP:
>> 0018:ffff88023c77d6f8  EFLAGS: 00010246
>> Feb 14 02:05:52 lupus kernel: [   70.055173] RAX: 00000000000000ff
>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> Feb 14 02:05:52 lupus kernel: [   70.055265] RDX: ffff88023c77d734
>> RSI: ffff88023c77d768 RDI: 0000000000000006
>> Feb 14 02:05:52 lupus kernel: [   70.055357] RBP: 0000030341ed0000
>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
>> Feb 14 02:05:52 lupus kernel: [   70.055448] R10: 0000000000000003
>> R11: 0000000000000001 R12: 00000000ffffffff
>> Feb 14 02:05:52 lupus kernel: [   70.055540] R13: ffff88023cde0030
>> R14: ffffea0007dd39f0 R15: 0000000000000001
>> Feb 14 02:05:52 lupus kernel: [   70.055633] FS:
>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:05:52 lupus kernel: [   70.055762] CS:  0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> Feb 14 02:05:52 lupus kernel: [   70.055851] CR2: 0000030341ed0050
>> CR3: 000000023c7d5000 CR4: 00000000000006e0
>> Feb 14 02:05:52 lupus kernel: [   70.055943] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:05:52 lupus kernel: [   70.056035] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:05:52 lupus kernel: [   70.056128] Process umount (pid:
>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
>> Feb 14 02:05:52 lupus kernel: [   70.056257] Stack:
>> Feb 14 02:05:52 lupus kernel: [   70.056338]  0000000000000000
>> 0000000000000002 ffff880200000000 0000000000000003
>> Feb 14 02:05:52 lupus kernel: [   70.056630]  ffffea0007dd39f0
>> ffffffff810e6aaa ffff880200000041 0000000600000246
>> Feb 14 02:05:52 lupus kernel: [   70.056922]  ffff88023cdcd300
>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
>> Feb 14 02:05:52 lupus kernel: [   70.057213] Call Trace:
>> Feb 14 02:05:52 lupus kernel: [   70.057301]  [<ffffffff810e6aaa>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:05:52 lupus kernel: [   70.057393]  [<ffffffff810e6b3a>] ?
>> __cleancache_get_page+0x7a/0xd0
>> Feb 14 02:05:52 lupus kernel: [   70.057487]  [<ffffffff8132bb7c>] ?
>> merge_state+0x7c/0x150
>> Feb 14 02:05:52 lupus kernel: [   70.057579]  [<ffffffff8132e4de>] ?
>> __extent_read_full_page+0x52e/0x710
>> Feb 14 02:05:52 lupus kernel: [   70.057673]  [<ffffffff813bdea4>] ?
>> rb_insert_color+0xa4/0x140
>> Feb 14 02:05:52 lupus kernel: [   70.057766]  [<ffffffff8134b0b6>] ?
>> tree_insert+0x86/0x1e0
>> Feb 14 02:05:52 lupus kernel: [   70.057859]  [<ffffffff81058c73>] ?
>> lock_timer_base.clone.22+0x33/0x70
>> Feb 14 02:05:52 lupus kernel: [   70.058004]  [<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:05:52 lupus kernel: [   70.058097]  [<ffffffff81330b21>] ?
>> read_extent_buffer_pages+0x2d1/0x470
>> Feb 14 02:05:52 lupus kernel: [   70.058191]  [<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:05:52 lupus kernel: [   70.058283]  [<ffffffff8130674d>] ?
>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> Feb 14 02:05:52 lupus kernel: [   70.058415]  [<ffffffff813076f9>] ?
>> read_tree_block+0x39/0x60
>> Feb 14 02:05:52 lupus kernel: [   70.058508]  [<ffffffff812ed5e6>] ?
>> read_block_for_search.clone.40+0x116/0x410
>> Feb 14 02:05:52 lupus kernel: [   70.058638]  [<ffffffff812eb228>] ?
>> btrfs_cow_block+0x118/0x2b0
>> Feb 14 02:05:52 lupus kernel: [   70.058731]  [<ffffffff812f0bc7>] ?
>> btrfs_search_slot+0x307/0xa00
>> Feb 14 02:05:52 lupus kernel: [   70.058823]  [<ffffffff812f6b18>] ?
>> lookup_inline_extent_backref+0x98/0x4a0
>> Feb 14 02:05:52 lupus kernel: [   70.058919]  [<ffffffff810e33d7>] ?
>> kmem_cache_alloc+0x87/0xa0
>> Feb 14 02:05:52 lupus kernel: [   70.059032]  [<ffffffff812f891c>] ?
>> __btrfs_free_extent+0xcc/0x6f0
>> Feb 14 02:05:52 lupus kernel: [   70.059125]  [<ffffffff812fc4cf>] ?
>> run_clustered_refs+0x39f/0x880
>> Feb 14 02:05:52 lupus kernel: [   70.059220]  [<ffffffff810b1f98>] ?
>> pagevec_lookup_tag+0x18/0x20
>> Feb 14 02:05:52 lupus kernel: [   70.059312]  [<ffffffff810a7c81>] ?
>> filemap_fdatawait_range+0x91/0x180
>> Feb 14 02:05:52 lupus kernel: [   70.059405]  [<ffffffff812fca77>] ?
>> btrfs_run_delayed_refs+0xc7/0x220
>> Feb 14 02:05:52 lupus kernel: [   70.059498]  [<ffffffff8130c29c>] ?
>> btrfs_commit_transaction+0x7c/0x760
>> Feb 14 02:05:52 lupus kernel: [   70.059591]  [<ffffffff81067ea0>] ?
>> autoremove_wake_function+0x0/0x30
>> Feb 14 02:05:52 lupus kernel: [   70.059683]  [<ffffffff8130cdef>] ?
>> start_transaction+0x1bf/0x270
>> Feb 14 02:05:52 lupus kernel: [   70.059775]  [<ffffffff8110e96a>] ?
>> __sync_filesystem+0x5a/0x90
>> Feb 14 02:05:52 lupus kernel: [   70.059867]  [<ffffffff810eae8d>] ?
>> generic_shutdown_super+0x2d/0x100
>> Feb 14 02:05:52 lupus kernel: [   70.059960]  [<ffffffff810eafb9>] ?
>> kill_anon_super+0x9/0x50
>> Feb 14 02:05:52 lupus kernel: [   70.060051]  [<ffffffff810eb266>] ?
>> deactivate_locked_super+0x26/0x80
>> Feb 14 02:05:52 lupus kernel: [   70.060144]  [<ffffffff811043ea>] ?
>> sys_umount+0x7a/0x390
>> Feb 14 02:05:52 lupus kernel: [   70.060235]  [<ffffffff810027bb>] ?
>> system_call_fastpath+0x16/0x1b
>> Feb 14 02:05:52 lupus kernel: [   70.060325] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:05:52 lupus kernel: [   70.063170] RIP  [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [   70.063302]  RSP <ffff88023c77d6f8>
>> Feb 14 02:05:52 lupus kernel: [   70.063386] CR2: 0000030341ed0050
>> Feb 14 02:05:52 lupus kernel: [   70.063528] ---[ end trace
>> 3313552d105b1535 ]---
>> Feb 14 02:06:16 lupus kernel: [   93.961960] BUG: unable to handle
>> kernel paging request at 0000030341ed0050
>> Feb 14 02:06:16 lupus kernel: [   93.962171] IP: [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [   93.962307] PGD 0
>> Feb 14 02:06:16 lupus kernel: [   93.962430] Oops: 0000 [#2] PREEMPT SMP
>> Feb 14 02:06:16 lupus kernel: [   93.962637] last sysfs file:
>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> Feb 14 02:06:16 lupus kernel: [   93.962766] CPU 5
>> Feb 14 02:06:16 lupus kernel: [   93.962812] Modules linked in:
>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> usb_storage ehci_hcd [last unloaded: tg3]
>> Feb 14 02:06:16 lupus kernel: [   93.966044]
>> Feb 14 02:06:16 lupus kernel: [   93.966127] Pid: 7915, comm:
>> btrfs-transacti Tainted: G      D     2.6.37-plus_v16_zcache #4
>> FMP55/ipower G3710
>> Feb 14 02:06:16 lupus kernel: [   93.966266] RIP:
>> 0010:[<ffffffff8133ef1b>]  [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [   93.966440] RSP:
>> 0018:ffff88023c63b6e0  EFLAGS: 00010246
>> Feb 14 02:06:16 lupus kernel: [   93.966528] RAX: 00000000000000ff
>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> Feb 14 02:06:16 lupus kernel: [   93.966620] RDX: ffff88023c63b71c
>> RSI: ffff88023c63b750 RDI: 0000000000000006
>> Feb 14 02:06:16 lupus kernel: [   93.966713] RBP: 0000030341ed0000
>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
>> Feb 14 02:06:16 lupus kernel: [   93.966805] R10: 0000000000000003
>> R11: 0000000000000001 R12: 00000000ffffffff
>> Feb 14 02:06:16 lupus kernel: [   93.966897] R13: ffff88023cde0030
>> R14: ffffea0007d59bc8 R15: 0000000000000001
>> Feb 14 02:06:16 lupus kernel: [   93.966990] FS:
>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:06:16 lupus kernel: [   93.967120] CS:  0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> Feb 14 02:06:16 lupus kernel: [   93.967209] CR2: 0000030341ed0050
>> CR3: 0000000001c27000 CR4: 00000000000006e0
>> Feb 14 02:06:16 lupus kernel: [   93.967302] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:06:16 lupus kernel: [   93.967394] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:06:16 lupus kernel: [   93.967500] Process btrfs-transacti
>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
>> Feb 14 02:06:16 lupus kernel: [   93.967630] Stack:
>> Feb 14 02:06:16 lupus kernel: [   93.967711]  0000000000000000
>> 0000000000000002 0000000000000000 0000000000000003
>> Feb 14 02:06:16 lupus kernel: [   93.968057]  ffffea0007d59bc8
>> ffffffff810e6aaa 0000000000000041 0000000600000002
>> Feb 14 02:06:16 lupus kernel: [   93.968348]  0000000000000000
>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
>> Feb 14 02:06:16 lupus kernel: [   93.968639] Call Trace:
>> Feb 14 02:06:16 lupus kernel: [   93.968728]  [<ffffffff810e6aaa>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:06:16 lupus kernel: [   93.968820]  [<ffffffff810e6b3a>] ?
>> __cleancache_get_page+0x7a/0xd0
>> Feb 14 02:06:16 lupus kernel: [   93.968914]  [<ffffffff8132e4de>] ?
>> __extent_read_full_page+0x52e/0x710
>> Feb 14 02:06:16 lupus kernel: [   93.969008]  [<ffffffff812f3f93>] ?
>> update_reserved_bytes+0xb3/0x140
>> Feb 14 02:06:16 lupus kernel: [   93.969102]  [<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:06:16 lupus kernel: [   93.969193]  [<ffffffff8132bb7c>] ?
>> merge_state+0x7c/0x150
>> Feb 14 02:06:16 lupus kernel: [   93.969285]  [<ffffffff81330b21>] ?
>> read_extent_buffer_pages+0x2d1/0x470
>> Feb 14 02:06:16 lupus kernel: [   93.969378]  [<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:06:16 lupus kernel: [   93.969470]  [<ffffffff8130674d>] ?
>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> Feb 14 02:06:16 lupus kernel: [   93.969602]  [<ffffffff813076f9>] ?
>> read_tree_block+0x39/0x60
>> Feb 14 02:06:16 lupus kernel: [   93.969694]  [<ffffffff812ed5e6>] ?
>> read_block_for_search.clone.40+0x116/0x410
>> Feb 14 02:06:16 lupus kernel: [   93.969878]  [<ffffffff812f0bc7>] ?
>> btrfs_search_slot+0x307/0xa00
>> Feb 14 02:06:16 lupus kernel: [   93.969970]  [<ffffffff812f6b18>] ?
>> lookup_inline_extent_backref+0x98/0x4a0
>> Feb 14 02:06:16 lupus kernel: [   93.970065]  [<ffffffff810e33d7>] ?
>> kmem_cache_alloc+0x87/0xa0
>> Feb 14 02:06:16 lupus kernel: [   93.970157]  [<ffffffff812f891c>] ?
>> __btrfs_free_extent+0xcc/0x6f0
>> Feb 14 02:06:16 lupus kernel: [   93.970249]  [<ffffffff812f8434>] ?
>> update_block_group.clone.62+0xc4/0x280
>> Feb 14 02:06:16 lupus kernel: [   93.970343]  [<ffffffff812fc4cf>] ?
>> run_clustered_refs+0x39f/0x880
>> Feb 14 02:06:16 lupus kernel: [   93.970436]  [<ffffffff812fca77>] ?
>> btrfs_run_delayed_refs+0xc7/0x220
>> Feb 14 02:06:16 lupus kernel: [   93.970529]  [<ffffffff810e15f9>] ?
>> new_slab+0x169/0x1f0
>> Feb 14 02:06:16 lupus kernel: [   93.970619]  [<ffffffff8130c29c>] ?
>> btrfs_commit_transaction+0x7c/0x760
>> Feb 14 02:06:16 lupus kernel: [   93.970713]  [<ffffffff81067ea0>] ?
>> autoremove_wake_function+0x0/0x30
>> Feb 14 02:06:16 lupus kernel: [   93.970806]  [<ffffffff81305bc3>] ?
>> transaction_kthread+0x283/0x2a0
>> Feb 14 02:06:16 lupus kernel: [   93.970898]  [<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 02:06:16 lupus kernel: [   93.970990]  [<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 02:06:16 lupus kernel: [   93.971083]  [<ffffffff81067a16>] ?
>> kthread+0x96/0xa0
>> Feb 14 02:06:16 lupus kernel: [   93.971174]  [<ffffffff81003514>] ?
>> kernel_thread_helper+0x4/0x10
>> Feb 14 02:06:16 lupus kernel: [   93.971266]  [<ffffffff81067980>] ?
>> kthread+0x0/0xa0
>> Feb 14 02:06:16 lupus kernel: [   93.971355]  [<ffffffff81003510>] ?
>> kernel_thread_helper+0x0/0x10
>> Feb 14 02:06:16 lupus kernel: [   93.971444] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:06:16 lupus kernel: [   93.974280] RIP  [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [   93.974412]  RSP <ffff88023c63b6e0>
>> Feb 14 02:06:16 lupus kernel: [   93.974497] CR2: 0000030341ed0050
>> Feb 14 02:06:16 lupus kernel: [   93.974599] ---[ end trace
>> 3313552d105b1536 ]---
>> Feb 14 02:07:04 lupus kernel: [  141.906124] zcache: destroyed pool id=2
>> Feb 14 02:07:17 lupus kernel: [  154.783358] SysRq : Keyboard mode set
>> to system default
>> Feb 14 02:07:18 lupus kernel: [  155.486147] SysRq : Terminate All Tasks
>>
>>
>> That's all for now
>>
>> Thanks & Regards
>>
>> Matt
>>
>
> (leaving out several folks from the CC to avoid spamming - if I left
> out someone wrongfully please re-add)
>
> running an addr2line reveals:
>
>
> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
> export.c:0
>
>
> hope that helps
>
>
> Regards
>
> Matt
>

This is just a guess; I might be wrong.

__cleancache_flush_inode calls cleancache_get_key with a cleancache_filekey.
The size of cleancache_filekey is just 6 * u32.
cleancache_get_key then calls btrfs_encode_fh with that key.
But btrfs_encode_fh casts the key to a btrfs_fid, which is larger than
cleancache_filekey, so it can end up accessing fields beyond the end of
the cleancache_filekey, which it should not do.
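
To make the suspected size mismatch concrete, here is a minimal user-space
sketch in C. It is only an illustration of the guess above, not a confirmed
diagnosis and not kernel code: demo_filekey, demo_fid, demo_encode_fh and
DEMO_FID_INVALID are hypothetical stand-ins, and the field list of demo_fid
is an assumption chosen only so that it is larger than 6 * u32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's definitions. */
#define DEMO_KEY_WORDS   6     /* key buffer holds 6 * u32, as described above */
#define DEMO_FID_INVALID 255   /* sentinel chosen for this sketch */

struct demo_filekey {
	uint32_t fh[DEMO_KEY_WORDS];
};

/* An assumed file-handle layout that needs more room than the key buffer. */
struct demo_fid {
	uint64_t objectid;
	uint64_t root_objectid;
	uint32_t gen;
	uint64_t parent_objectid;
	uint32_t parent_gen;
	uint64_t parent_root_objectid;
} __attribute__((packed));

/* Number of u32 words needed to hold a complete demo_fid. */
int demo_fid_words(void)
{
	return (int)((sizeof(struct demo_fid) + 3) / 4);
}

/*
 * Encode a handle into buf, which has room for max_words u32 values.
 * Returns the number of words written, or DEMO_FID_INVALID when the
 * buffer is too small. The length check is what keeps the encoder from
 * touching memory beyond the caller's buffer.
 */
int demo_encode_fh(uint32_t *buf, int max_words,
		   uint64_t objectid, uint64_t root_objectid, uint32_t gen)
{
	struct demo_fid fid;

	assert(buf != NULL);
	if (max_words < demo_fid_words())
		return DEMO_FID_INVALID;

	memset(&fid, 0, sizeof(fid));
	fid.objectid = objectid;
	fid.root_objectid = root_objectid;
	fid.gen = gen;
	memcpy(buf, &fid, sizeof(fid));
	return demo_fid_words();
}
```

With a 6-word demo_filekey as the destination, demo_encode_fh() returns
DEMO_FID_INVALID instead of writing past the end of the buffer; with a
buffer of at least demo_fid_words() words it succeeds. An encoder that
skipped this check and simply cast the buffer to the larger structure
would read or write beyond the key, which is the scenario guessed at above.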

I think some file systems use an extended fid, so this problem can
happen there. I don't know why we didn't find it earlier. Maybe Dan and
others have tested it for a long time.

Am I missing something?



--
Kind regards,
Minchan Kim

2011-02-14 20:59:19

by Matt

[permalink] [raw]
Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 1:29 AM, Matt <[email protected]> wrote:
> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>>> <[email protected]> wrote:
>>> [snip]
>>>>
>>>> If I've missed anything important, please let me know!
>>>>
>>>> Thanks again!
>>>> Dan
>>>>
>>>
>>> Hi Dan,
>>>
>>> thank you so much for answering my email in such detail !
>>>
>>> I shall pick up on that mail in my next email sending to the mailing list :)
>>>
>>>
>>> currently I've got a problem with btrfs which seems to get triggered
>>> by cleancache get-operations:
>>>
>>>
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>>> /dev/mapper/portage
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
>>> tmem pool, id=3
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>>> kernel paging request at 0000000001400050
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>>> /sys/devices/platform/coretemp.3/temp1_input
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>> ehci_hcd [last unloaded: tg3]
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>>> G3710
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>>> 0018:ffff880129a11b00 ?EFLAGS: 00010246
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>>> RSI: ffff880129a11b70 RDI: 0000000000000006
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>>> R11: 0000000000000001 R12: ffff88014a1ce780
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>>> R14: ffff88021fef9000 R15: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
>>> ffff880000000003 ffff88003c60a098 0000000000000003
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
>>> ffffffff810e6cee 0000000000001000 0000000000000000
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>] ?
>>> btrfs_wake_function+0x0/0x20
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>] ?
>>> __cleancache_flush_inode+0x3e/0x70
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>] ?
>>> truncate_inode_pages_range+0x42/0x440
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>] ?
>>> btrfs_search_slot+0x89e/0xa00
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>] ?
>>> unmap_mapping_range+0xc5/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>] ?
>>> truncate_pagecache+0x40/0x70
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>] ?
>>> btrfs_truncate_free_space_cache+0x81/0xe0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>] ?
>>> btrfs_write_dirty_block_groups+0x245/0x500
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>] ?
>>> btrfs_run_delayed_refs+0x1ba/0x220
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>] ?
>>> commit_cowonly_roots+0xff/0x1d0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>] ?
>>> btrfs_commit_transaction+0x363/0x760
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>] ?
>>> transaction_kthread+0x283/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>] ?
>>> kthread+0x96/0xa0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>] ?
>>> kernel_thread_helper+0x4/0x10
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>] ?
>>> kthread+0x0/0xa0
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>] ?
>>> kernel_thread_helper+0x0/0x10
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP <ffff880129a11b00>
>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>>> f831c5ceeaa49287 ]---
>>>
>>> in my case I had compress-force with lzo and disk_cache enabled
>>>
>>>
>>> another user of the kernel I'm currently running has had the same
>>> problem with zcache
>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>>>
>>> (looks like in his case compression and any other fancy additional
>>> features weren't enabled)
>>>
>>>
>>> changes made by this kernel or patchset to btrfs are from
>>> * io-less dirty throttling patchset (44 patches)
>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>>> applied in both cases)
>>> * PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>>> * btrfs-unstable changes to state of
>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
>>> from 2.6.38-rc4+)
>>>
>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>>>
>>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
>>> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>>>
>>> this also didn't help and seemed to produce the same error-message
>>>
>>> so to summarize:
>>>
>>> 1) error message appearing with all 4 patchsets applied changing
>>> btrfs-code and compress-force=lzo and disk_cache enabled
>>>
>>> 2) error message appearing with default mount-options and btrfs from
>>> 2.6.37 and changes for zcache & io-less dirty throttling patchset
>>> applied (first 2 patch(sets)) from list)
>>>
>>>
>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>>> portage-directory (lots of small files and some tar.bzip2 archives)
>>> via pbzip2 or 7z when the error happened and the message was shown
>>>
>>> Due to KMS sound (webradio streaming) was still running but I couldn't
>>> continue work (X switching to kernel output) so I did the magic sysrq
>>> combo (reisub)
>>>
>>>
>>> Does that BUG message ring a bell for anyone ?
>>>
>>> (if I should leave out anyone from the CC in the next emails or
>>> future, please holler - I don't want to spam your inboxes)
>>>
>>> Thanks
>>>
>>> Matt
>>>
>>
>>
>> OK,
>>
>> here's the output of a kernel -
>>
>> staying as close to vanilla (2.6.37) as the current situation allows
>> (only including some corruption or leak fixes for zram & zcache and
>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
>> fixes)):
>>
>> so in total the following patches are included in this new kernel
>> (2.6.37-zcache):
>>
>> zram changes:
>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
>> 2 [PATCH] zram: fix data corruption issue
>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>>
>> zcache:
>> 1 zcache-linux-2.6.37-110205
>> 2 [PATCH] staging: zcache: fix memory leak
>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>>
>> reiserfs:
>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
>> 2 [patch] reiserfs: potential ERR_PTR dereference
>>
>>
>> the same procedure:
>>
>> trying to extract the mentioned portage-tarball:
>>
>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
>> -xp -C /usr/gentoo/)
>>
>>
>> this hopefully should make it easier to track down the problem:
>>
>>
>> Feb 14 01:59:59 lupus kernel: [ ?364.777143] device fsid
>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
>> /dev/mapper/portage
>> Feb 14 01:59:59 lupus kernel: [ ?364.844994] zcache: created ephemeral
>> tmem pool, id=2
>> Feb 14 02:02:49 lupus kernel: [ ?534.577573] BUG: unable to handle
>> kernel paging request at 0000000037610050
>> Feb 14 02:02:49 lupus kernel: [ ?534.577605] IP: [<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [ ?534.577630] PGD 0
>> Feb 14 02:02:49 lupus kernel: [ ?534.577640] Oops: 0000 [#1] PREEMPT SMP
>> Feb 14 02:02:49 lupus kernel: [ ?534.577665] last sysfs file:
>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> Feb 14 02:02:49 lupus kernel: [ ?534.577693] CPU 5
>> Feb 14 02:02:49 lupus kernel: [ ?534.577701] Modules linked in: radeon
>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>> ehci_hcd [last unloaded: tg3]
>> Feb 14 02:02:49 lupus kernel: [ ?534.578114]
>> Feb 14 02:02:49 lupus kernel: [ ?534.578124] Pid: 8285, comm: tar Not
>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
>> Feb 14 02:02:49 lupus kernel: [ ?534.578146] RIP:
>> 0010:[<ffffffff81338cbb>] ?[<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [ ?534.578172] RSP:
>> 0018:ffff88023ea9dcc8 ?EFLAGS: 00010246
>> Feb 14 02:02:49 lupus kernel: [ ?534.578189] RAX: 00000000000000ff
>> RBX: ffff8800b8643228 RCX: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [ ?534.578210] RDX: ffff88023ea9dd04
>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
>> Feb 14 02:02:49 lupus kernel: [ ?534.578230] RBP: 0000000037610000
>> R08: ffffffff81338c90 R09: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [ ?534.578251] R10: 0000000000000019
>> R11: 0000000000000001 R12: ffff8800b8643380
>> Feb 14 02:02:49 lupus kernel: [ ?534.578272] R13: ffff8800b8643258
>> R14: 00007fff806f1f00 R15: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [ ?534.578293] FS:
>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:02:49 lupus kernel: [ ?534.578317] CS: ?0010 DS: 0000 ES:
>> 0000 CR0: 0000000080050033
>> Feb 14 02:02:49 lupus kernel: [ ?534.578334] CR2: 0000000037610050
>> CR3: 000000023dcef000 CR4: 00000000000006e0
>> Feb 14 02:02:49 lupus kernel: [ ?534.578356] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:02:49 lupus kernel: [ ?534.578377] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:02:49 lupus kernel: [ ?534.578398] Process tar (pid: 8285,
>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
>> Feb 14 02:02:49 lupus kernel: [ ?534.578421] Stack:
>> Feb 14 02:02:49 lupus kernel: [ ?534.578428] ?000000013d096000
>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
>> Feb 14 02:02:49 lupus kernel: [ ?534.578458] ?ffffffffffffffff
>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
>> Feb 14 02:02:49 lupus kernel: [ ?534.578486] ?fffffffffffffffb
>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
>> Feb 14 02:02:49 lupus kernel: [ ?534.578514] Call Trace:
>> Feb 14 02:02:49 lupus kernel: [ ?534.578525] ?[<ffffffff810e3b1a>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:02:49 lupus kernel: [ ?534.578544] ?[<ffffffff810e3d5e>] ?
>> __cleancache_flush_inode+0x3e/0x70
>> Feb 14 02:02:49 lupus kernel: [ ?534.578565] ?[<ffffffff810b0ed2>] ?
>> truncate_inode_pages_range+0x42/0x440
>> Feb 14 02:02:49 lupus kernel: [ ?534.578586] ?[<ffffffff81338451>] ?
>> btrfs_tree_unlock+0x41/0x50
>> Feb 14 02:02:49 lupus kernel: [ ?534.578605] ?[<ffffffff812e4ed5>] ?
>> btrfs_release_path+0x15/0x70
>> Feb 14 02:02:49 lupus kernel: [ ?534.578624] ?[<ffffffff8130bf29>] ?
>> btrfs_run_delayed_iputs+0x49/0x120
>> Feb 14 02:02:49 lupus kernel: [ ?534.578644] ?[<ffffffff813107e7>] ?
>> btrfs_evict_inode+0x27/0x1e0
>> Feb 14 02:02:49 lupus kernel: [ ?534.578663] ?[<ffffffff810fc3aa>] ?
>> evict+0x1a/0xa0
>> Feb 14 02:02:49 lupus kernel: [ ?534.578678] ?[<ffffffff810fc6bd>] ?
>> iput+0x1cd/0x2b0
>> Feb 14 02:02:49 lupus kernel: [ ?534.578694] ?[<ffffffff810f266f>] ?
>> do_unlinkat+0x12f/0x1d0
>> Feb 14 02:02:49 lupus kernel: [ ?534.578712] ?[<ffffffff810027bb>] ?
>> system_call_fastpath+0x16/0x1b
>> Feb 14 02:02:49 lupus kernel: [ ?534.578730] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:02:49 lupus kernel: [ ?534.578986] RIP ?[<ffffffff81338cbb>]
>> btrfs_encode_fh+0x2b/0x110
>> Feb 14 02:02:49 lupus kernel: [ ?534.579081] ?RSP <ffff88023ea9dcc8>
>> Feb 14 02:02:49 lupus kernel: [ ?534.579093] CR2: 0000000037610050
>> Feb 14 02:02:49 lupus kernel: [ ?534.587513] ---[ end trace
>> c596b12e66c0b360 ]---
>>
>>
>> for reference I've pasted it to pastebin.com:
>>
>> "2.6.37_zcache_V2.patch"
>> http://pastebin.com/cVSkwQ6M
>>
>>
>>
>>
>>
>> after the reboot I had forgotten to not mount the btrfs volume and it
>> threw a similar error-message again and remounted several partitions
>> read-only (including the system partition)
>> the partition with btrfs (/usr/gentoo) couldn't be unmounted since the
>> umount process kind of hang
>>
>> so here's the error message after a reboot (might not be accurate or
>> kind of "skewed" since other patches are included (io-less dirty
>> throttling, PATCH] fix (latent?) memory corruption in
>> btrfs_encode_fh() and latest changes for btrfs)) but might help to get
>> some more evidence:
>>
>>
>> Feb 14 02:05:46 lupus kernel: [ ? 63.922648] device fsid
>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
>> /dev/mapper/portage
>> Feb 14 02:05:46 lupus kernel: [ ? 64.047118] btrfs: unlinked 1 orphans
>> Feb 14 02:05:46 lupus kernel: [ ? 64.051956] zcache: created ephemeral
>> tmem pool, id=3
>> Feb 14 02:05:48 lupus kernel: [ ? 65.801364] hub 2-1:1.0: hub_suspend
>> Feb 14 02:05:48 lupus kernel: [ ? 65.801376] usb 2-1: unlink
>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
>> Feb 14 02:05:48 lupus kernel: [ ? 65.801559] usb 2-1: usb auto-suspend
>> Feb 14 02:05:50 lupus kernel: [ ? 67.797929] hub 2-0:1.0: hub_suspend
>> Feb 14 02:05:50 lupus kernel: [ ? 67.797939] usb usb2: bus auto-suspend
>> Feb 14 02:05:50 lupus kernel: [ ? 67.797942] ehci_hcd 0000:00:1d.0:
>> suspend root hub
>> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
>> kernel paging request at 0000030341ed0050
>> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP: [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1] PREEMPT SMP
>> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
>> /sys/module/pcie_aspm/parameters/policy
>> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
>> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> usb_storage ehci_hcd [last unloaded: tg3]
>> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
>> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm: umount
>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
>> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
>> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
>> RSI: ffff88023c77d768 RDI: 0000000000000006
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
>> R11: 0000000000000001 R12: 00000000ffffffff
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
>> R14: ffffea0007dd39f0 R15: 0000000000000001
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
>> CR3: 000000023c7d5000 CR4: 00000000000006e0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
>> 0000000000000002 ffff880200000000 0000000000000003
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
>> ffffffff810e6aaa ffff880200000041 0000000600000246
>> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>] ?
>> __cleancache_get_page+0x7a/0xd0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>] ?
>> merge_state+0x7c/0x150
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>] ?
>> __extent_read_full_page+0x52e/0x710
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>] ?
>> rb_insert_color+0xa4/0x140
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>] ?
>> tree_insert+0x86/0x1e0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>] ?
>> lock_timer_base.clone.22+0x33/0x70
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>] ?
>> read_extent_buffer_pages+0x2d1/0x470
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>] ?
>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>] ?
>> read_tree_block+0x39/0x60
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>] ?
>> read_block_for_search.clone.40+0x116/0x410
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>] ?
>> btrfs_cow_block+0x118/0x2b0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>] ?
>> btrfs_search_slot+0x307/0xa00
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>] ?
>> lookup_inline_extent_backref+0x98/0x4a0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>] ?
>> kmem_cache_alloc+0x87/0xa0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>] ?
>> __btrfs_free_extent+0xcc/0x6f0
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>] ?
>> run_clustered_refs+0x39f/0x880
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>] ?
>> pagevec_lookup_tag+0x18/0x20
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>] ?
>> filemap_fdatawait_range+0x91/0x180
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>] ?
>> btrfs_run_delayed_refs+0xc7/0x220
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>] ?
>> btrfs_commit_transaction+0x7c/0x760
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>] ?
>> autoremove_wake_function+0x0/0x30
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>] ?
>> start_transaction+0x1bf/0x270
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>] ?
>> __sync_filesystem+0x5a/0x90
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>] ?
>> generic_shutdown_super+0x2d/0x100
>> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>] ?
>> kill_anon_super+0x9/0x50
>> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>] ?
>> deactivate_locked_super+0x26/0x80
>> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>] ?
>> sys_umount+0x7a/0x390
>> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>] ?
>> system_call_fastpath+0x16/0x1b
>> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP ?[<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP <ffff88023c77d6f8>
>> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
>> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
>> 3313552d105b1535 ]---
>> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
>> kernel paging request at 0000030341ed0050
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP: [<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2] PREEMPT SMP
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
>> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> usb_storage ehci_hcd [last unloaded: tg3]
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
>> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
>> FMP55/ipower G3710
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
>> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
>> RSI: ffff88023c63b750 RDI: 0000000000000006
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
>> R11: 0000000000000001 R12: 00000000ffffffff
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
>> R14: ffffea0007d59bc8 R15: 0000000000000001
>> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
>> knlGS:0000000000000000
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
>> 0000 CR0: 000000008005003b
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
>> CR3: 0000000001c27000 CR4: 00000000000006e0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
>> DR1: 0000000000000000 DR2: 0000000000000000
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-transacti
>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
>> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
>> 0000000000000002 0000000000000000 0000000000000003
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
>> ffffffff810e6aaa 0000000000000041 0000000600000002
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>] ?
>> cleancache_get_key+0x4a/0x60
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>] ?
>> __cleancache_get_page+0x7a/0xd0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>] ?
>> __extent_read_full_page+0x52e/0x710
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>] ?
>> update_reserved_bytes+0xb3/0x140
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>] ?
>> merge_state+0x7c/0x150
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>] ?
>> read_extent_buffer_pages+0x2d1/0x470
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>] ?
>> btree_get_extent+0x0/0x1c0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>] ?
>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>] ?
>> read_tree_block+0x39/0x60
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>] ?
>> read_block_for_search.clone.40+0x116/0x410
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>] ?
>> btrfs_search_slot+0x307/0xa00
>> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>] ?
>> lookup_inline_extent_backref+0x98/0x4a0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>] ?
>> kmem_cache_alloc+0x87/0xa0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>] ?
>> __btrfs_free_extent+0xcc/0x6f0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>] ?
>> update_block_group.clone.62+0xc4/0x280
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>] ?
>> run_clustered_refs+0x39f/0x880
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>] ?
>> btrfs_run_delayed_refs+0xc7/0x220
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>] ?
>> new_slab+0x169/0x1f0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>] ?
>> btrfs_commit_transaction+0x7c/0x760
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>] ?
>> autoremove_wake_function+0x0/0x30
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>] ?
>> transaction_kthread+0x283/0x2a0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>] ?
>> transaction_kthread+0x0/0x2a0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>] ?
>> kthread+0x96/0xa0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>] ?
>> kernel_thread_helper+0x4/0x10
>> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>] ?
>> kthread+0x0/0xa0
>> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>] ?
>> kernel_thread_helper+0x0/0x10
>> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00 00
>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> Feb 14 02:06:16 lupus kernel: [ ? 93.974280] RIP ?[<ffffffff8133ef1b>]
>> btrfs_encode_fh+0x2b/0x120
>> Feb 14 02:06:16 lupus kernel: [ ? 93.974412] ?RSP <ffff88023c63b6e0>
>> Feb 14 02:06:16 lupus kernel: [ ? 93.974497] CR2: 0000030341ed0050
>> Feb 14 02:06:16 lupus kernel: [ ? 93.974599] ---[ end trace
>> 3313552d105b1536 ]---
>> Feb 14 02:07:04 lupus kernel: [ ?141.906124] zcache: destroyed pool id=2
>> Feb 14 02:07:17 lupus kernel: [ ?154.783358] SysRq : Keyboard mode set
>> to system default
>> Feb 14 02:07:18 lupus kernel: [ ?155.486147] SysRq : Terminate All Tasks
>>
>>
>> That's all for now
>>
>> Thanks & Regards
>>
>> Matt
>>
>
> (leaving out several folks from the CC to avoid spamming - if I left
> out someone wrongfully please re-add)
>
> running an addr2line reveals:
>
>
> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
> export.c:0
>
>
> hope that helps
>
>
> Regards
>
> Matt
>


OK, maybe it's useful to have some more details on how to reproduce it as
easily as possible, and on my configuration:

preparation steps:

1) a vanilla 2.6.37 kernel with the mentioned changes to zram (and
xvmalloc) and zcache (+ fixes)

configuration specifics:

CONFIG_CRYPTO_PCRYPT=y

2) on a non-btrfs partition or with zcache disabled: get a
portage tarball comparable to mine by downloading a snapshot from one of
the Gentoo mirrors (http://www.gentoo.org/main/en/mirrors2.xml), e.g. the
University of California:
ftp://ftp.ucsb.edu/pub/mirrors/linux/gentoo/snapshots/

(weighing only around 40 MiB), then

bring that tree up to a decent size by adding the latest changes from an
rsync mirror (http://www.gentoo.org/main/en/mirrors-rsync.xml)

I can also upload my specific tarball, weighing in at 1.7 GiB, on request -
just point me to a place to drop it

3) create a tar.bz2 archive from it (preferably via 7z or pbzip2 so that
it can be extracted in parallel later to create some pressure)
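
A minimal sketch of step 3, assuming pbzip2 is installed. The function
name, the example paths and the default of 5 threads are placeholders,
not the exact command used to build my tarball:

```shell
# Pack a directory into a bzip2-compressed tarball using pbzip2.
# Usage: make_tarball <source-dir> <output-file> [threads]
# (make_tarball is a hypothetical helper name, chosen for illustration)
make_tarball() {
    local src="$1" out="$2" threads="${3:-5}"
    # tar writes the archive to stdout, pbzip2 compresses it in parallel
    tar -cf - -C "$(dirname "$src")" "$(basename "$src")" \
        | pbzip2 -c -p"$threads" > "$out"
}
```

e.g. make_tarball /path/to/portage /system/portage_backup_022011.tbz2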


Hardware:
Core i7 860 (4 cores, with HT -> 8 threads), 8 GiB of RAM,
the underlying hard drive is a Samsung HD203WI, NCQ is disabled
(queue_depth set to "1"), using CFQ as I/O scheduler

echo "13" > /proc/sys/vm/page-cluster
echo "60" > /proc/sys/vm/swappiness
echo "3000" > /proc/sys/vm/dirty_expire_centisecs
echo "1500" > /proc/sys/vm/dirty_writeback_centisecs
echo "15" > /proc/sys/vm/dirty_background_ratio
echo "50" > /proc/sys/vm/dirty_ratio
echo "50" > /proc/sys/vm/vfs_cache_pressure
echo "32768" > /proc/sys/vm/min_free_kbytes

for i in /sys/block/sd*; do
    /bin/echo "4096" > "$i/queue/read_ahead_kb"
    /bin/echo "64" > "$i/queue/max_sectors_kb"
    /bin/echo "1" > "$i/queue/rq_affinity"
done

nr_requests is set to "1024"

slice_sync to "150"
fifo_expire_sync to "50"

echo 4096 > /sys/class/bdi/default/read_ahead_kb
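
The nr_requests, slice_sync and fifo_expire_sync values above are given
without paths. A minimal sketch of how they might be applied, assuming the
usual block-queue and CFQ sysfs locations (that placement is an assumption,
as are the function names apply_queue_settings and set_value):

```shell
#!/bin/sh
# Sketch only: apply the three values listed above to every sd* disk.
# The sysfs locations are an assumption; the mail gives values, not paths.
# apply_queue_settings, set_value and the optional root argument are
# illustrative names.
apply_queue_settings() {
    sysfs_root=${1:-/sys}

    for dev in "$sysfs_root"/block/sd*; do
        # Skip the unexpanded glob when no sd* device exists.
        [ -d "$dev/queue" ] || continue

        set_value "$dev/queue/nr_requests" 1024
        # The iosched/ entries only exist while CFQ is the active scheduler.
        set_value "$dev/queue/iosched/slice_sync" 150
        set_value "$dev/queue/iosched/fifo_expire_sync" 50
    done
}

# Write a value only if the target file exists and is writable.
set_value() {
    if [ -w "$1" ]; then
        echo "$2" > "$1"
    else
        echo "skipping $1 (missing or not writable)" >&2
    fi
}
```

Run as root with no argument (apply_queue_settings) to act on /sys; passing
another directory allows a trial run against a copy of the tree.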


steps to get the "result":

1) cryptsetup-partition with aes or twofish-encryption (512 bits)
[using cryptsetup 1.1.3* or 1.2*]

2) on top of that btrfs

3) extracting the tarball (usually I'm adding a "time" in front to see
how long it took):

time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar -xp -C /usr/gentoo/)
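
The three steps above can be sketched as one sequence. This is only an
illustration under stated assumptions: the mail does not say whether LUKS or
plain dm-crypt was used, nor which cipher mode, so luksFormat and
aes-xts-plain are guesses, and the names run, reproduce and DRY_RUN are
invented here. The mapping name "portage" and the mount point /usr/gentoo
are taken from the thread. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: the three "result" steps as one dry-run-able sequence.
# Assumptions (not stated in the mail): LUKS format, the aes-xts-plain
# cipher spec, and the function/variable names used here.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reproduce() {
    dev=$1       # block device to encrypt; ALL DATA ON IT IS DESTROYED
    tarball=$2   # e.g. /system/portage_backup_022011.tbz2

    # 1) encrypted partition with a 512-bit key
    run cryptsetup luksFormat --cipher aes-xts-plain --key-size 512 "$dev"
    run cryptsetup luksOpen "$dev" portage

    # 2) btrfs on top of the mapping
    run mkfs.btrfs /dev/mapper/portage
    run mount /dev/mapper/portage /usr/gentoo

    # 3) parallel extraction; the pipeline needs a shell of its own
    run sh -c '7z e -so -tbzip2 -mmt=5 "$1" | tar -xp -C /usr/gentoo/' sh "$tarball"
}
```

Calling reproduce with a device and the tarball path prints the five
commands; setting DRY_RUN=0 beforehand would execute them, which wipes the
given device.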


"result":

1) it seems to take several seconds or even minutes until that BUG
message gets shown, so maybe some memory (or other subsystem)
pressure is needed to trigger it


I hope that's useful in reproducing the BUG


Thanks & Regards

Matt

2011-02-15 23:48:35

by Matt

[permalink] [raw]
Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 8:59 PM, Matt <[email protected]> wrote:
> On Mon, Feb 14, 2011 at 1:29 AM, Matt <[email protected]> wrote:
>> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
>>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
>>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>>>> <[email protected]> wrote:
>>>> [snip]
>>>>>
>>>>> If I've missed anything important, please let me know!
>>>>>
>>>>> Thanks again!
>>>>> Dan
>>>>>
>>>>
>>>> Hi Dan,
>>>>
>>>> thank you so much for answering my email in such detail !
>>>>
>>>> I shall pick up on that mail in my next email sending to the mailing list :)
>>>>
>>>>
>>>> currently I've got a problem with btrfs which seems to get triggered
>>>> by cleancache get-operations:
>>>>
>>>>
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>>>> /dev/mapper/portage
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
>>>> tmem pool, id=3
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>>>> kernel paging request at 0000000001400050
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>>>> /sys/devices/platform/coretemp.3/temp1_input
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
>>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>>> ehci_hcd [last unloaded: tg3]
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>>>> G3710
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>>>> 0018:ffff880129a11b00 ?EFLAGS: 00010246
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>>>> RSI: ffff880129a11b70 RDI: 0000000000000006
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>>>> R11: 0000000000000001 R12: ffff88014a1ce780
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>>>> R14: ffff88021fef9000 R15: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>>>> knlGS:0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: ?0010 DS: 0000 ES:
>>>> 0000 CR0: 000000008005003b
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>>>> DR1: 0000000000000000 DR2: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
>>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
>>>> ffff880000000003 ffff88003c60a098 0000000000000003
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
>>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
>>>> ffffffff810e6cee 0000000000001000 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>] ?
>>>> cleancache_get_key+0x4a/0x60
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>] ?
>>>> btrfs_wake_function+0x0/0x20
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>] ?
>>>> __cleancache_flush_inode+0x3e/0x70
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>] ?
>>>> truncate_inode_pages_range+0x42/0x440
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>] ?
>>>> btrfs_search_slot+0x89e/0xa00
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>] ?
>>>> unmap_mapping_range+0xc5/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>] ?
>>>> truncate_pagecache+0x40/0x70
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>] ?
>>>> btrfs_truncate_free_space_cache+0x81/0xe0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>] ?
>>>> btrfs_write_dirty_block_groups+0x245/0x500
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>] ?
>>>> btrfs_run_delayed_refs+0x1ba/0x220
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>] ?
>>>> commit_cowonly_roots+0xff/0x1d0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>] ?
>>>> btrfs_commit_transaction+0x363/0x760
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>] ?
>>>> autoremove_wake_function+0x0/0x30
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>] ?
>>>> transaction_kthread+0x283/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>] ?
>>>> transaction_kthread+0x0/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>] ?
>>>> transaction_kthread+0x0/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>] ?
>>>> kthread+0x96/0xa0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>] ?
>>>> kernel_thread_helper+0x4/0x10
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>] ?
>>>> kthread+0x0/0xa0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>] ?
>>>> kernel_thread_helper+0x0/0x10
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
>>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP ?[<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP <ffff880129a11b00>
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>>>> f831c5ceeaa49287 ]---
>>>>
>>>> in my case I had compress-force with lzo and disk_cache enabled
>>>>
>>>>
>>>> another user of the kernel I'm currently running has had the same
>>>> problem with zcache
>>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>>>>
>>>> (looks like in his case compression and any other fancy additional
>>>> features weren't enabled)
>>>>
>>>>
>>>> changes made by this kernel or patchset to btrfs are from
>>>> * io-less dirty throttling patchset (44 patches)
>>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>>>> applied in both cases)
>>>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>>>> * btrfs-unstable changes to state of
>>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
>>>> from 2.6.38-rc4+)
>>>>
>>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>>>>
>>>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
>>>> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
>>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>>>>
>>>> this also didn't help and seemed to produce the same error-message
>>>>
>>>> so to summarize:
>>>>
>>>> 1) error message appearing with all 4 patchsets applied changing
>>>> btrfs-code and compress-force=lzo and disk_cache enabled
>>>>
>>>> 2) error message appearing with default mount-options and btrfs from
>>>> 2.6.37 and changes for zcache & io-less dirty throttling patchset
>>>> applied (first 2 patch(sets) from the list)
>>>>
>>>>
>>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>>>> portage-directory (lots of small files and some tar.bzip2 archives)
>>>> via pbzip2 or 7z when the error happened and the message was shown
>>>>
>>>> Due to KMS sound (webradio streaming) was still running but I couldn't
>>>> continue work (X switching to kernel output) so I did the magic sysrq
>>>> combo (reisub)
>>>>
>>>>
>>>> Does that BUG message ring a bell for anyone ?
>>>>
>>>> (if I should leave out anyone from the CC in the next emails or
>>>> future, please holler - I don't want to spam your inboxes)
>>>>
>>>> Thanks
>>>>
>>>> Matt
>>>>
>>>
>>>
>>> OK,
>>>
>>> here's the output of a kernel -
>>>
>>> staying as close to vanilla (2.6.37) as the current situation allows
>>> (only including some corruption or leak fixes for zram & zcache and
>>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
>>> fixes)):
>>>
>>> so in total the following patches are included in this new kernel
>>> (2.6.37-zcache):
>>>
>>> zram changes:
>>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
>>> 2 [PATCH] zram: fix data corruption issue
>>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>>>
>>> zcache:
>>> 1 zcache-linux-2.6.37-110205
>>> 2 [PATCH] staging: zcache: fix memory leak
>>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>>>
>>> reiserfs:
>>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
>>> 2 [patch] reiserfs: potential ERR_PTR dereference
>>>
>>>
>>> the same procedure:
>>>
>>> trying to extract the mentioned portage-tarball:
>>>
>>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
>>> -xp -C /usr/gentoo/)
>>>
>>>
>>> this hopefully should make it easier to track down the problem:
>>>
>>>
>>> Feb 14 01:59:59 lupus kernel: [ ?364.777143] device fsid
>>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
>>> /dev/mapper/portage
>>> Feb 14 01:59:59 lupus kernel: [ ?364.844994] zcache: created ephemeral
>>> tmem pool, id=2
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577573] BUG: unable to handle
>>> kernel paging request at 0000000037610050
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577605] IP: [<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577630] PGD 0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577640] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577665] last sysfs file:
>>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577693] CPU 5
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577701] Modules linked in: radeon
>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>> ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578114]
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578124] Pid: 8285, comm: tar Not
>>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578146] RIP:
>>> 0010:[<ffffffff81338cbb>] ?[<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578172] RSP:
>>> 0018:ffff88023ea9dcc8 ?EFLAGS: 00010246
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578189] RAX: 00000000000000ff
>>> RBX: ffff8800b8643228 RCX: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578210] RDX: ffff88023ea9dd04
>>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578230] RBP: 0000000037610000
>>> R08: ffffffff81338c90 R09: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578251] R10: 0000000000000019
>>> R11: 0000000000000001 R12: ffff8800b8643380
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578272] R13: ffff8800b8643258
>>> R14: 00007fff806f1f00 R15: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578293] FS:
>>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578317] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 0000000080050033
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578334] CR2: 0000000037610050
>>> CR3: 000000023dcef000 CR4: 00000000000006e0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578356] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578377] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578398] Process tar (pid: 8285,
>>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578421] Stack:
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578428] ?000000013d096000
>>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578458] ?ffffffffffffffff
>>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578486] ?fffffffffffffffb
>>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578514] Call Trace:
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578525] ?[<ffffffff810e3b1a>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578544] ?[<ffffffff810e3d5e>] ?
>>> __cleancache_flush_inode+0x3e/0x70
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578565] ?[<ffffffff810b0ed2>] ?
>>> truncate_inode_pages_range+0x42/0x440
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578586] ?[<ffffffff81338451>] ?
>>> btrfs_tree_unlock+0x41/0x50
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578605] ?[<ffffffff812e4ed5>] ?
>>> btrfs_release_path+0x15/0x70
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578624] ?[<ffffffff8130bf29>] ?
>>> btrfs_run_delayed_iputs+0x49/0x120
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578644] ?[<ffffffff813107e7>] ?
>>> btrfs_evict_inode+0x27/0x1e0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578663] ?[<ffffffff810fc3aa>] ?
>>> evict+0x1a/0xa0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578678] ?[<ffffffff810fc6bd>] ?
>>> iput+0x1cd/0x2b0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578694] ?[<ffffffff810f266f>] ?
>>> do_unlinkat+0x12f/0x1d0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578712] ?[<ffffffff810027bb>] ?
>>> system_call_fastpath+0x16/0x1b
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578730] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578986] RIP ?[<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.579081] ?RSP <ffff88023ea9dcc8>
>>> Feb 14 02:02:49 lupus kernel: [ ?534.579093] CR2: 0000000037610050
>>> Feb 14 02:02:49 lupus kernel: [ ?534.587513] ---[ end trace
>>> c596b12e66c0b360 ]---
>>>
>>>
>>> for reference I've pasted it to pastebin.com:
>>>
>>> "2.6.37_zcache_V2.patch"
>>> http://pastebin.com/cVSkwQ6M
>>>
>>>
>>>
>>>
>>>
>>> after the reboot I had forgotten to not mount the btrfs volume and it
>>> threw a similar error-message again and remounted several partitions
>>> read-only (including the system partition)
>>> the partition with btrfs (/usr/gentoo) couldn't be unmounted since the
>>> umount process kind of hung
>>>
>>> so here's the error message after a reboot (might not be accurate or
>>> kind of "skewed" since other patches are included (io-less dirty
>>> throttling, [PATCH] fix (latent?) memory corruption in
>>> btrfs_encode_fh() and latest changes for btrfs)) but might help to get
>>> some more evidence:
>>>
>>>
>>> Feb 14 02:05:46 lupus kernel: [ ? 63.922648] device fsid
>>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
>>> /dev/mapper/portage
>>> Feb 14 02:05:46 lupus kernel: [ ? 64.047118] btrfs: unlinked 1 orphans
>>> Feb 14 02:05:46 lupus kernel: [ ? 64.051956] zcache: created ephemeral
>>> tmem pool, id=3
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801364] hub 2-1:1.0: hub_suspend
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801376] usb 2-1: unlink
>>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801559] usb 2-1: usb auto-suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797929] hub 2-0:1.0: hub_suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797939] usb usb2: bus auto-suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797942] ehci_hcd 0000:00:1d.0:
>>> suspend root hub
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
>>> kernel paging request at 0000030341ed0050
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
>>> /sys/module/pcie_aspm/parameters/policy
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
>>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>>> usb_storage ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm: umount
>>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
>>> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
>>> RBX: ffff88023cde0168 RCX: 0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
>>> RSI: ffff88023c77d768 RDI: 0000000000000006
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
>>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
>>> R11: 0000000000000001 R12: 00000000ffffffff
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
>>> R14: ffffea0007dd39f0 R15: 0000000000000001
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
>>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
>>> CR3: 000000023c7d5000 CR4: 00000000000006e0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
>>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
>>> 0000000000000002 ffff880200000000 0000000000000003
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
>>> ffffffff810e6aaa ffff880200000041 0000000600000246
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
>>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>] ?
>>> __cleancache_get_page+0x7a/0xd0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>] ?
>>> merge_state+0x7c/0x150
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>] ?
>>> __extent_read_full_page+0x52e/0x710
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>] ?
>>> rb_insert_color+0xa4/0x140
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>] ?
>>> tree_insert+0x86/0x1e0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>] ?
>>> lock_timer_base.clone.22+0x33/0x70
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>] ?
>>> read_extent_buffer_pages+0x2d1/0x470
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>] ?
>>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>] ?
>>> read_tree_block+0x39/0x60
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>] ?
>>> read_block_for_search.clone.40+0x116/0x410
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>] ?
>>> btrfs_cow_block+0x118/0x2b0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>] ?
>>> btrfs_search_slot+0x307/0xa00
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>] ?
>>> lookup_inline_extent_backref+0x98/0x4a0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>] ?
>>> kmem_cache_alloc+0x87/0xa0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>] ?
>>> __btrfs_free_extent+0xcc/0x6f0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>] ?
>>> run_clustered_refs+0x39f/0x880
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>] ?
>>> pagevec_lookup_tag+0x18/0x20
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>] ?
>>> filemap_fdatawait_range+0x91/0x180
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>] ?
>>> btrfs_run_delayed_refs+0xc7/0x220
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>] ?
>>> btrfs_commit_transaction+0x7c/0x760
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>] ?
>>> start_transaction+0x1bf/0x270
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>] ?
>>> __sync_filesystem+0x5a/0x90
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>] ?
>>> generic_shutdown_super+0x2d/0x100
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>] ?
>>> kill_anon_super+0x9/0x50
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>] ?
>>> deactivate_locked_super+0x26/0x80
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>] ?
>>> sys_umount+0x7a/0x390
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>] ?
>>> system_call_fastpath+0x16/0x1b
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP <ffff88023c77d6f8>
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
>>> 3313552d105b1535 ]---
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
>>> kernel paging request at 0000030341ed0050
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2] PREEMPT SMP
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
>>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
>>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>>> usb_storage ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
>>> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
>>> FMP55/ipower G3710
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
>>> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
>>> RBX: ffff88023cde0168 RCX: 0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
>>> RSI: ffff88023c63b750 RDI: 0000000000000006
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
>>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
>>> R11: 0000000000000001 R12: 00000000ffffffff
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
>>> R14: ffffea0007d59bc8 R15: 0000000000000001
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
>>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-transacti
>>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
>>> 0000000000000002 0000000000000000 0000000000000003
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
>>> ffffffff810e6aaa 0000000000000041 0000000600000002
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
>>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>] ?
>>> __cleancache_get_page+0x7a/0xd0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>] ?
>>> __extent_read_full_page+0x52e/0x710
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>] ?
>>> update_reserved_bytes+0xb3/0x140
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>] ?
>>> merge_state+0x7c/0x150
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>] ?
>>> read_extent_buffer_pages+0x2d1/0x470
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>] ?
>>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>] ?
>>> read_tree_block+0x39/0x60
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>] ?
>>> read_block_for_search.clone.40+0x116/0x410
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>] ?
>>> btrfs_search_slot+0x307/0xa00
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>] ?
>>> lookup_inline_extent_backref+0x98/0x4a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>] ?
>>> kmem_cache_alloc+0x87/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>] ?
>>> __btrfs_free_extent+0xcc/0x6f0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>] ?
>>> update_block_group.clone.62+0xc4/0x280
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>] ?
>>> run_clustered_refs+0x39f/0x880
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>] ?
>>> btrfs_run_delayed_refs+0xc7/0x220
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>] ?
>>> new_slab+0x169/0x1f0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>] ?
>>> btrfs_commit_transaction+0x7c/0x760
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>] ?
>>> transaction_kthread+0x283/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>] ?
>>> kthread+0x96/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>] ?
>>> kernel_thread_helper+0x4/0x10
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>] ?
>>> kthread+0x0/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>] ?
>>> kernel_thread_helper+0x0/0x10
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:06:16 lupus kernel: [   93.974280] RIP  [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [   93.974412]  RSP <ffff88023c63b6e0>
>>> Feb 14 02:06:16 lupus kernel: [   93.974497] CR2: 0000030341ed0050
>>> Feb 14 02:06:16 lupus kernel: [   93.974599] ---[ end trace
>>> 3313552d105b1536 ]---
>>> Feb 14 02:07:04 lupus kernel: [  141.906124] zcache: destroyed pool id=2
>>> Feb 14 02:07:17 lupus kernel: [  154.783358] SysRq : Keyboard mode set
>>> to system default
>>> Feb 14 02:07:18 lupus kernel: [  155.486147] SysRq : Terminate All Tasks
>>>
>>>
>>> That's all for now
>>>
>>> Thanks & Regards
>>>
>>> Matt
>>>
>>
>> (leaving out several folks from the CC to avoid spamming - if I left
>> out someone wrongfully please re-add)
>>
>> running an addr2line reveals:
>>
>>
>> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
>> export.c:0
>>
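For anyone repeating this lookup: addr2line can also print the enclosing function (-f) and reads addresses from standard input when none are given on the command line, so all bracketed addresses from a saved oops can be resolved in one pass. A minimal sketch, assuming the oops text has been saved to a file; the helper name oops_addrs and the file name oops.txt are made up for illustration, and the vmlinux has to be the one the crashing kernel was built from (ideally with CONFIG_DEBUG_INFO enabled):

```shell
# Print every bracketed kernel address ("[<ffffffff81338cbb>]") found on
# stdin, one per line, without the surrounding "[<" and ">]".
oops_addrs() {
    grep -oE '\[<[0-9a-f]+>\]' | sed -e 's/^\[<//' -e 's/>\]$//'
}

# Usage (hypothetical file name oops.txt): resolve each unique address
# to its function name and file:line.
#   oops_addrs < oops.txt | sort -u | \
#       addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -f -i
```

Because the log lines are wrapped after the address, the function name and the address usually sit on different lines of the mail; extracting only the bracketed part avoids having to re-join them.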
>>
>> hope that helps
>>
>>
>> Regards
>>
>> Matt
>>
>
>
> OK, maybe it's useful to have some more details on how to reproduce it as
> easily as possible, and about my configuration:
>
> preparation steps:
>
> 1) 2.6.37 vanilla kernel with the mentioned changes to zram (and
> xvmalloc), zcache (+ fixes)
>
> configuration specifics:
>
> CONFIG_CRYPTO_PCRYPT=y
>
> 2) on a non-btrfs partition, or with zcache disabled: download a
> portage tarball comparable to mine from one of the
> gentoo mirrors (http://www.gentoo.org/main/en/mirrors2.xml), e.g. the
> University of California:
> ftp://ftp.ucsb.edu/pub/mirrors/linux/gentoo/snapshots/
>
> (it weighs only around 40 MiB), then
>
> get that tarball to a decent size by adding the latest changes from an
> rsync mirror (http://www.gentoo.org/main/en/mirrors-rsync.xml)
>
> I can also upload my specific tarball, which weighs 1.7 GiB, on request -
> just point me to a place to drop it
>
> 3) create a tar.bz2 archive (preferably via 7z or pbzip2 so that it
> can be extracted later in parallel to create some pressure)
>
>
> Hardware:
> Core i7 860 (4 cores, 8 threads with HT), 8 GiB of RAM,
> underlying hard drive is a Samsung HD203WI, NCQ is disabled
> (queue_depth set to "1"), using CFQ as I/O scheduler
>
> echo "13" > /proc/sys/vm/page-cluster
> echo "60" > /proc/sys/vm/swappiness
> echo "3000" > /proc/sys/vm/dirty_expire_centisecs
> echo "1500" > /proc/sys/vm/dirty_writeback_centisecs
> echo "15" > /proc/sys/vm/dirty_background_ratio
> echo "50" > /proc/sys/vm/dirty_ratio
> echo "50" > /proc/sys/vm/vfs_cache_pressure
> echo "32768" > /proc/sys/vm/min_free_kbytes
>
> for i in /sys/block/sd*; do
>         /bin/echo "4096" > $i/queue/read_ahead_kb
>         /bin/echo "64" > $i/queue/max_sectors_kb
>         /bin/echo "1" > $i/queue/rq_affinity
> done
>
> nr_requests is set to "1024"
>
> slice_sync to "150"
> fifo_expire_sync to "50"
>
> echo 4096 > /sys/class/bdi/default/read_ahead_kb
>
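To make the vm settings above easier to re-apply on a test box, they can be kept in one table and written in a loop. This is only a sketch of the /proc/sys/vm values listed above; the function name apply_vm_tunables and its optional root-prefix argument are made up here (the prefix exists purely so the function can be dry-run against a scratch directory instead of the real /proc):

```shell
# Write the vm tunables from the list above.
# $1: optional root prefix (hypothetical, for dry runs); empty means the
#     real /proc, which requires root privileges.
apply_vm_tunables() {
    root="${1:-}"
    while read -r name value; do
        # One "name value" pair per line of the table below.
        printf '%s\n' "$value" > "$root/proc/sys/vm/$name" || return 1
    done <<'EOF'
page-cluster 13
swappiness 60
dirty_expire_centisecs 3000
dirty_writeback_centisecs 1500
dirty_background_ratio 15
dirty_ratio 50
vfs_cache_pressure 50
min_free_kbytes 32768
EOF
}

# Usage on a real system (as root):
#   apply_vm_tunables
```

The block-queue settings (read_ahead_kb, max_sectors_kb, rq_affinity and the scheduler knobs) are per device and are left out of this sketch.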
>
> steps to get the "result":
>
> 1) cryptsetup-partition with aes or twofish-encryption (512 bits)
> [using cryptsetup 1.1.3* or 1.2*]
>
> 2) on top of that btrfs
>
> 3) extracting the tarball (usually I'm adding a "time" in front to see
> how long it took):
>
> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
> -xp -C /usr/gentoo/)
>
>
> "result":
>
> 1) it seems to take several seconds or even minutes until that BUG
> message is shown, so maybe some memory (or other subsystem)
> pressure is needed to trigger it
>
>
> I hope that's useful in reproducing the BUG
>
>
> Thanks & Regards
>
> Matt
>

*bump*

adding: Li Zefan, Miao Xie, Yan Zheng, Dan Rosenberg, Josef Bacik and
the btrfs mailing list to CC

2011-02-16 00:12:31

by Matt

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Mon, Feb 14, 2011 at 4:35 AM, Minchan Kim <[email protected]> wrote:
> On Mon, Feb 14, 2011 at 10:29 AM, Matt <[email protected]> wrote:
>> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
>>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]> wrote:
>>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>>>> <[email protected]> wrote:
>>>> [snip]
>>>>>
>>>>> If I've missed anything important, please let me know!
>>>>>
>>>>> Thanks again!
>>>>> Dan
>>>>>
>>>>
>>>> Hi Dan,
>>>>
>>>> thank you so much for answering my email in such detail!
>>>>
>>>> I shall pick up on that mail in my next email to the mailing list :)
>>>>
>>>>
>>>> currently I've got a problem with btrfs which seems to get triggered
>>>> by cleancache get-operations:
>>>>
>>>>
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>>>> /dev/mapper/portage
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk space caching
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo compression
>>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created ephemeral
>>>> tmem pool, id=3
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>>>> kernel paging request at 0000000001400050
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP: [<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1] PREEMPT SMP
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>>>> /sys/devices/platform/coretemp.3/temp1_input
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in: radeon
>>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>>> ehci_hcd [last unloaded: tg3]
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>>>> G3710
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>>>> 0018:ffff880129a11b00 ?EFLAGS: 00010246
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>>>> RSI: ffff880129a11b70 RDI: 0000000000000006
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>>>> R11: 0000000000000001 R12: ffff88014a1ce780
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>>>> R14: ffff88021fef9000 R15: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>>>> knlGS:0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: ?0010 DS: 0000 ES:
>>>> 0000 CR0: 000000008005003b
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>>>> DR1: 0000000000000000 DR2: 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-transacti
>>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
>>>> ffff880000000003 ffff88003c60a098 0000000000000003
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
>>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
>>>> ffffffff810e6cee 0000000000001000 0000000000000000
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>] ?
>>>> cleancache_get_key+0x4a/0x60
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>] ?
>>>> btrfs_wake_function+0x0/0x20
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>] ?
>>>> __cleancache_flush_inode+0x3e/0x70
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>] ?
>>>> truncate_inode_pages_range+0x42/0x440
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>] ?
>>>> btrfs_search_slot+0x89e/0xa00
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>] ?
>>>> unmap_mapping_range+0xc5/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>] ?
>>>> truncate_pagecache+0x40/0x70
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>] ?
>>>> btrfs_truncate_free_space_cache+0x81/0xe0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>] ?
>>>> btrfs_write_dirty_block_groups+0x245/0x500
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>] ?
>>>> btrfs_run_delayed_refs+0x1ba/0x220
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>] ?
>>>> commit_cowonly_roots+0xff/0x1d0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>] ?
>>>> btrfs_commit_transaction+0x363/0x760
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>] ?
>>>> autoremove_wake_function+0x0/0x30
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>] ?
>>>> transaction_kthread+0x283/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>] ?
>>>> transaction_kthread+0x0/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>] ?
>>>> transaction_kthread+0x0/0x2a0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>] ?
>>>> kthread+0x96/0xa0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>] ?
>>>> kernel_thread_helper+0x4/0x10
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>] ?
>>>> kthread+0x0/0xa0
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>] ?
>>>> kernel_thread_helper+0x0/0x10
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00 00
>>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP ?[<ffffffff8133ef1b>]
>>>> btrfs_encode_fh+0x2b/0x120
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP <ffff880129a11b00>
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>>>> f831c5ceeaa49287 ]---
>>>>
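One detail that can be read directly off the register dump: the faulting address in CR2 is exactly RBP + 0x50. The bytes marked in the Code: line, <48> 8b 45 50, decode to mov 0x50(%rbp),%rax, and the earlier 48 8b 6f 10 is mov 0x10(%rdi),%rbp. So the function loads a pointer from offset 0x10 of its first argument and faults when it dereferences that pointer at offset 0x50. Which structure fields those offsets correspond to is not something this arithmetic shows. A small sketch of the check in shell arithmetic (the helper name fault_offset is made up; it assumes a shell with 64-bit arithmetic, e.g. bash):

```shell
# Difference between the faulting address (CR2) and a base register, in hex.
# Both arguments are hex strings without the 0x prefix, as printed in the oops.
fault_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

# CR2 and RBP from the trace above; prints 0x50.
fault_offset 0000000001400050 0000000001400000
```

The CR2/RBP pairs of the other oopses in this thread (0000000037610050/0000000037610000 and 0000030341ed0050/0000030341ed0000) give the same 0x50, so the faulting instruction is the same each time and only the bogus pointer value in RBP differs.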
>>>> in my case I had compress-force with lzo and disk_cache enabled
>>>>
>>>>
>>>> another user of the kernel I'm currently running has had the same
>>>> problem with zcache
>>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>>>>
>>>> (looks like in his case compression and any other fancy additional
>>>> features weren't enabled)
>>>>
>>>>
>>>> changes made by this kernel or patchset to btrfs are from
>>>> * io-less dirty throttling patchset (44 patches)
>>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>>>> applied in both cases)
>>>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>>>> * btrfs-unstable changes to state of
>>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
>>>> from 2.6.38-rc4+)
>>>>
>>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
>>>>
>>>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
>>>> repository (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
>>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>>>>
>>>> this also didn't help and seemed to produce the same error-message
>>>>
>>>> so to summarize:
>>>>
>>>> 1) error message appears with all 4 patchsets that change btrfs code
>>>> applied, and with compress-force=lzo and disk_cache enabled
>>>>
>>>> 2) error message appears with default mount options, btrfs from
>>>> 2.6.37, and the changes for zcache & the io-less dirty throttling
>>>> patchset applied (the first 2 patch(sets) from the list)
>>>>
>>>>
>>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>>>> portage-directory (lots of small files and some tar.bzip2 archives)
>>>> via pbzip2 or 7z when the error happened and the message was shown
>>>>
>>>> Sound (webradio streaming) was still running, but due to KMS X had
>>>> switched to the kernel output and I couldn't continue working, so I
>>>> did the magic SysRq combo (REISUB)
>>>>
>>>>
>>>> Does that BUG message ring a bell for anyone ?
>>>>
>>>> (if I should leave out anyone from the CC in the next emails or
>>>> future, please holler - I don't want to spam your inboxes)
>>>>
>>>> Thanks
>>>>
>>>> Matt
>>>>
>>>
>>>
>>> OK,
>>>
>>> here's the output of a kernel -
>>>
>>> staying as close to vanilla (2.6.37) as the current situation allows
>>> (only including some corruption or leak fixes for zram & zcache and
>>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
>>> fixes)):
>>>
>>> so in total the following patches are included in this new kernel
>>> (2.6.37-zcache):
>>>
>>> zram changes:
>>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
>>> 2 [PATCH] zram: fix data corruption issue
>>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>>>
>>> zcache:
>>> 1 zcache-linux-2.6.37-110205
>>> 2 [PATCH] staging: zcache: fix memory leak
>>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>>>
>>> reiserfs:
>>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
>>> 2 [patch] reiserfs: potential ERR_PTR dereference
>>>
>>>
>>> the same procedure:
>>>
>>> trying to extract the mentioned portage-tarball:
>>>
>>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
>>> -xp -C /usr/gentoo/)
>>>
>>>
>>> this hopefully should make it easier to track down the problem:
>>>
>>>
>>> Feb 14 01:59:59 lupus kernel: [ ?364.777143] device fsid
>>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
>>> /dev/mapper/portage
>>> Feb 14 01:59:59 lupus kernel: [ ?364.844994] zcache: created ephemeral
>>> tmem pool, id=2
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577573] BUG: unable to handle
>>> kernel paging request at 0000000037610050
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577605] IP: [<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577630] PGD 0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577640] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577665] last sysfs file:
>>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577693] CPU 5
>>> Feb 14 02:02:49 lupus kernel: [ ?534.577701] Modules linked in: radeon
>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack xt_mark
>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss snd_seq_midi_event
>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>>> ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578114]
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578124] Pid: 8285, comm: tar Not
>>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578146] RIP:
>>> 0010:[<ffffffff81338cbb>] ?[<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578172] RSP:
>>> 0018:ffff88023ea9dcc8 ?EFLAGS: 00010246
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578189] RAX: 00000000000000ff
>>> RBX: ffff8800b8643228 RCX: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578210] RDX: ffff88023ea9dd04
>>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578230] RBP: 0000000037610000
>>> R08: ffffffff81338c90 R09: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578251] R10: 0000000000000019
>>> R11: 0000000000000001 R12: ffff8800b8643380
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578272] R13: ffff8800b8643258
>>> R14: 00007fff806f1f00 R15: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578293] FS:
>>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578317] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 0000000080050033
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578334] CR2: 0000000037610050
>>> CR3: 000000023dcef000 CR4: 00000000000006e0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578356] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578377] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578398] Process tar (pid: 8285,
>>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578421] Stack:
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578428] ?000000013d096000
>>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578458] ?ffffffffffffffff
>>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578486] ?fffffffffffffffb
>>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578514] Call Trace:
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578525] ?[<ffffffff810e3b1a>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578544] ?[<ffffffff810e3d5e>] ?
>>> __cleancache_flush_inode+0x3e/0x70
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578565] ?[<ffffffff810b0ed2>] ?
>>> truncate_inode_pages_range+0x42/0x440
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578586] ?[<ffffffff81338451>] ?
>>> btrfs_tree_unlock+0x41/0x50
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578605] ?[<ffffffff812e4ed5>] ?
>>> btrfs_release_path+0x15/0x70
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578624] ?[<ffffffff8130bf29>] ?
>>> btrfs_run_delayed_iputs+0x49/0x120
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578644] ?[<ffffffff813107e7>] ?
>>> btrfs_evict_inode+0x27/0x1e0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578663] ?[<ffffffff810fc3aa>] ?
>>> evict+0x1a/0xa0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578678] ?[<ffffffff810fc6bd>] ?
>>> iput+0x1cd/0x2b0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578694] ?[<ffffffff810f266f>] ?
>>> do_unlinkat+0x12f/0x1d0
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578712] ?[<ffffffff810027bb>] ?
>>> system_call_fastpath+0x16/0x1b
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578730] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:02:49 lupus kernel: [ ?534.578986] RIP ?[<ffffffff81338cbb>]
>>> btrfs_encode_fh+0x2b/0x110
>>> Feb 14 02:02:49 lupus kernel: [ ?534.579081] ?RSP <ffff88023ea9dcc8>
>>> Feb 14 02:02:49 lupus kernel: [ ?534.579093] CR2: 0000000037610050
>>> Feb 14 02:02:49 lupus kernel: [ ?534.587513] ---[ end trace
>>> c596b12e66c0b360 ]---
>>>
>>>
>>> for reference I've pasted it to pastebin.com:
>>>
>>> "2.6.37_zcache_V2.patch"
>>> http://pastebin.com/cVSkwQ6M
>>>
>>>
>>>
>>>
>>>
>>> after the reboot I had forgotten to not mount the btrfs volume, and it
>>> threw a similar error message again and remounted several partitions
>>> read-only (including the system partition).
>>> The partition with btrfs (/usr/gentoo) couldn't be unmounted since the
>>> umount process hung
>>>
>>> so here's the error message after a reboot (it might not be accurate, or
>>> might be kind of "skewed", since other patches are included: io-less dirty
>>> throttling, "[PATCH] fix (latent?) memory corruption in
>>> btrfs_encode_fh()" and the latest changes for btrfs), but it might help to
>>> get some more evidence:
>>>
>>>
>>> Feb 14 02:05:46 lupus kernel: [ ? 63.922648] device fsid
>>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
>>> /dev/mapper/portage
>>> Feb 14 02:05:46 lupus kernel: [ ? 64.047118] btrfs: unlinked 1 orphans
>>> Feb 14 02:05:46 lupus kernel: [ ? 64.051956] zcache: created ephemeral
>>> tmem pool, id=3
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801364] hub 2-1:1.0: hub_suspend
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801376] usb 2-1: unlink
>>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
>>> Feb 14 02:05:48 lupus kernel: [ ? 65.801559] usb 2-1: usb auto-suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797929] hub 2-0:1.0: hub_suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797939] usb usb2: bus auto-suspend
>>> Feb 14 02:05:50 lupus kernel: [ ? 67.797942] ehci_hcd 0000:00:1d.0:
>>> suspend root hub
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
>>> kernel paging request at 0000030341ed0050
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1] PREEMPT SMP
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
>>> /sys/module/pcie_aspm/parameters/policy
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
>>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>>> usb_storage ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm: umount
>>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
>>> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
>>> RBX: ffff88023cde0168 RCX: 0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
>>> RSI: ffff88023c77d768 RDI: 0000000000000006
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
>>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
>>> R11: 0000000000000001 R12: 00000000ffffffff
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
>>> R14: ffffea0007dd39f0 R15: 0000000000000001
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
>>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
>>> CR3: 000000023c7d5000 CR4: 00000000000006e0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
>>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
>>> 0000000000000002 ffff880200000000 0000000000000003
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
>>> ffffffff810e6aaa ffff880200000041 0000000600000246
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
>>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>] ?
>>> __cleancache_get_page+0x7a/0xd0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>] ?
>>> merge_state+0x7c/0x150
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>] ?
>>> __extent_read_full_page+0x52e/0x710
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>] ?
>>> rb_insert_color+0xa4/0x140
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>] ?
>>> tree_insert+0x86/0x1e0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>] ?
>>> lock_timer_base.clone.22+0x33/0x70
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>] ?
>>> read_extent_buffer_pages+0x2d1/0x470
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>] ?
>>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>] ?
>>> read_tree_block+0x39/0x60
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>] ?
>>> read_block_for_search.clone.40+0x116/0x410
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>] ?
>>> btrfs_cow_block+0x118/0x2b0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>] ?
>>> btrfs_search_slot+0x307/0xa00
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>] ?
>>> lookup_inline_extent_backref+0x98/0x4a0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>] ?
>>> kmem_cache_alloc+0x87/0xa0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>] ?
>>> __btrfs_free_extent+0xcc/0x6f0
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>] ?
>>> run_clustered_refs+0x39f/0x880
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>] ?
>>> pagevec_lookup_tag+0x18/0x20
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>] ?
>>> filemap_fdatawait_range+0x91/0x180
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>] ?
>>> btrfs_run_delayed_refs+0xc7/0x220
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>] ?
>>> btrfs_commit_transaction+0x7c/0x760
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>] ?
>>> start_transaction+0x1bf/0x270
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>] ?
>>> __sync_filesystem+0x5a/0x90
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>] ?
>>> generic_shutdown_super+0x2d/0x100
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>] ?
>>> kill_anon_super+0x9/0x50
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>] ?
>>> deactivate_locked_super+0x26/0x80
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>] ?
>>> sys_umount+0x7a/0x390
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>] ?
>>> system_call_fastpath+0x16/0x1b
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP <ffff88023c77d6f8>
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
>>> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
>>> 3313552d105b1535 ]---
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
>>> kernel paging request at 0000030341ed0050
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP: [<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2] PREEMPT SMP
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
>>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
>>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit
>>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack xt_string
>>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
>>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp e1000e
>>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>>> usb_storage ehci_hcd [last unloaded: tg3]
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
>>> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
>>> FMP55/ipower G3710
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
>>> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
>>> RBX: ffff88023cde0168 RCX: 0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
>>> RSI: ffff88023c63b750 RDI: 0000000000000006
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
>>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
>>> R11: 0000000000000001 R12: 00000000ffffffff
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
>>> R14: ffffea0007d59bc8 R15: 0000000000000001
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
>>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
>>> knlGS:0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
>>> DR1: 0000000000000000 DR2: 0000000000000000
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-transacti
>>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
>>> 0000000000000002 0000000000000000 0000000000000003
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
>>> ffffffff810e6aaa 0000000000000041 0000000600000002
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
>>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>] ?
>>> cleancache_get_key+0x4a/0x60
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>] ?
>>> __cleancache_get_page+0x7a/0xd0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>] ?
>>> __extent_read_full_page+0x52e/0x710
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>] ?
>>> update_reserved_bytes+0xb3/0x140
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>] ?
>>> merge_state+0x7c/0x150
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>] ?
>>> read_extent_buffer_pages+0x2d1/0x470
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>] ?
>>> btree_get_extent+0x0/0x1c0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>] ?
>>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>] ?
>>> read_tree_block+0x39/0x60
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>] ?
>>> read_block_for_search.clone.40+0x116/0x410
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>] ?
>>> btrfs_search_slot+0x307/0xa00
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>] ?
>>> lookup_inline_extent_backref+0x98/0x4a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>] ?
>>> kmem_cache_alloc+0x87/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>] ?
>>> __btrfs_free_extent+0xcc/0x6f0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>] ?
>>> update_block_group.clone.62+0xc4/0x280
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>] ?
>>> run_clustered_refs+0x39f/0x880
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>] ?
>>> btrfs_run_delayed_refs+0xc7/0x220
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>] ?
>>> new_slab+0x169/0x1f0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>] ?
>>> btrfs_commit_transaction+0x7c/0x760
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>] ?
>>> autoremove_wake_function+0x0/0x30
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>] ?
>>> transaction_kthread+0x283/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>] ?
>>> transaction_kthread+0x0/0x2a0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>] ?
>>> kthread+0x96/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>] ?
>>> kernel_thread_helper+0x4/0x10
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>] ?
>>> kthread+0x0/0xa0
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>] ?
>>> kernel_thread_helper+0x0/0x10
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00 00
>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00 00
>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00 00
>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.974280] RIP ?[<ffffffff8133ef1b>]
>>> btrfs_encode_fh+0x2b/0x120
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.974412] ?RSP <ffff88023c63b6e0>
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.974497] CR2: 0000030341ed0050
>>> Feb 14 02:06:16 lupus kernel: [ ? 93.974599] ---[ end trace
>>> 3313552d105b1536 ]---
>>> Feb 14 02:07:04 lupus kernel: [ ?141.906124] zcache: destroyed pool id=2
>>> Feb 14 02:07:17 lupus kernel: [ ?154.783358] SysRq : Keyboard mode set
>>> to system default
>>> Feb 14 02:07:18 lupus kernel: [ ?155.486147] SysRq : Terminate All Tasks
>>>
>>>
>>> That's all for now
>>>
>>> Thanks & Regards
>>>
>>> Matt
>>>
>>
>> (leaving out several folks from the CC to avoid spamming - if I left
>> out someone wrongfully please re-add)
>>
>> running an addr2line reveals:
>>
>>
>> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
>> export.c:0
>>
>>
>> hope that helps
>>
>>
>> Regards
>>
>> Matt
>>
>
> Just my guess; I might be wrong.
>
> __cleancache_flush_inode calls cleancache_get_key with a cleancache_filekey.
> The size of cleancache_filekey is just 6 * u32.
> cleancache_get_key calls btrfs_encode_fh with the key,
> but btrfs_encode_fh casts the key to a btrfs_fid, which is
> bigger than cleancache_filekey, so it must not access
> fields beyond the end of the cleancache_filekey.
>
> I think some file systems use an extended fid, so this problem can
> happen there. I don't know why it wasn't found earlier; Dan and
> others have presumably tested it for a long time.
>
> Am I missing something?
>
>
>
> --
> Kind regards,
> Minchan Kim
>
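
To make the size concern in Minchan's note concrete, here is a minimal,
hypothetical sketch in plain userspace C (not kernel code). The demo_*
names and the field layout are assumptions for illustration only and are
not taken from the cleancache or btrfs sources. It only shows how an
encoder can refuse to write a file ID that is larger than a 6 * u32 key
buffer instead of writing past its end; it is not a diagnosis of the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_KEY_WORDS 6 /* the "6 * u32" key size mentioned above */

/* Hypothetical stand-in for a fixed-size cache key. */
struct demo_filekey {
    uint32_t fh[DEMO_KEY_WORDS];
};

/* Hypothetical filesystem-specific file ID that is larger than the key. */
struct demo_fid {
    uint64_t objectid;
    uint64_t root_objectid;
    uint32_t gen;
    uint64_t parent_objectid;
    uint32_t parent_gen;
    uint64_t parent_root_objectid;
} __attribute__((packed));

/* The mismatch under discussion: the ID does not fit into the key. */
static_assert(sizeof(struct demo_fid) > sizeof(struct demo_filekey),
              "demo_fid is expected to be larger than demo_filekey");

/* Number of 32-bit words needed to hold a full demo_fid. */
size_t demo_fid_words(void)
{
    return (sizeof(struct demo_fid) + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}

/*
 * Encode a file ID into a caller-supplied buffer of *max_len words.
 * Returns 0 and sets *max_len to the words written on success.
 * Returns -1 and sets *max_len to the words required if the buffer is
 * too small; in that case the buffer is not touched.
 */
int demo_encode_fh(uint32_t *fh, size_t *max_len,
                   uint64_t objectid, uint64_t root_objectid, uint32_t gen)
{
    struct demo_fid fid;
    size_t need = demo_fid_words();

    if (*max_len < need) {
        *max_len = need;
        return -1;
    }

    memset(&fid, 0, sizeof(fid));
    fid.objectid = objectid;
    fid.root_objectid = root_objectid;
    fid.gen = gen;
    memcpy(fh, &fid, sizeof(fid));
    *max_len = need;
    return 0;
}
```

Called with a demo_filekey and max_len = DEMO_KEY_WORDS, demo_encode_fh
reports failure and the required length rather than overrunning the key.
A caller would then have to fall back to another key or skip caching for
that inode.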

Reposting Minchan's message to the btrfs mailing list for reference,
and also adding

Li Zefan, Miao Xie, Yan Zheng, Dan Rosenberg and Josef Bacik to CC.

Regards

Matt

2011-02-16 01:30:23

by Dan Magenheimer

Subject: RE: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

> -----Original Message-----
> From: Matt [mailto:[email protected]]
> Sent: Tuesday, February 15, 2011 5:12 PM
> To: Minchan Kim
> Cc: Dan Magenheimer; [email protected]; Chris Mason; linux-
> [email protected]; [email protected]; [email protected]; linux-
> [email protected]; Josef Bacik; Dan Rosenberg; Yan Zheng;
> [email protected]; Li Zefan
> Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page
> cache/swap compression
>
> On Mon, Feb 14, 2011 at 4:35 AM, Minchan Kim <[email protected]>
> wrote:
> > On Mon, Feb 14, 2011 at 10:29 AM, Matt <[email protected]> wrote:
> >> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
> >>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]>
> wrote:
> >>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
> >>>> <[email protected]> wrote:
> >>>> [snip]
> >>>>>
> >>>>> If I've missed anything important, please let me know!
> >>>>>
> >>>>> Thanks again!
> >>>>> Dan
> >>>>>
> >>>>
> >>>> Hi Dan,
> >>>>
> >>>> thank you so much for answering my email in such detail !
> >>>>
> >>>> I shall pick up on that mail in my next email to the
> >>>> mailing list :)
> >>>>
> >>>>
> >>>> currently I've got a problem with btrfs which seems to get triggered
> >>>> by cleancache get-operations:
> >>>>
> >>>>
> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
> >>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
> >>>> /dev/mapper/portage
> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk
> space caching
> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo
> compression
> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created
> ephemeral
> >>>> tmem pool, id=3
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
> >>>> kernel paging request at 0000000001400050
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP:
> [<ffffffff8133ef1b>]
> >>>> btrfs_encode_fh+0x2b/0x120
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1]
> PREEMPT SMP
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
> >>>> /sys/devices/platform/coretemp.3/temp1_input
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in:
> radeon
> >>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
> >>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
> >>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_conntrack_ftp
> >>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
> >>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack
> xt_mark
> >>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables
> x_tables
> >>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
> snd_seq_midi_event
> >>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> snd_hda_codec_hdmi
> >>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep
> snd_pcm
> >>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
> >>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
> >>>> ehci_hcd [last unloaded: tg3]
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
> >>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
> >>>> G3710
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
> >>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
> >>>> btrfs_encode_fh+0x2b/0x120
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
> >>>> 0018:ffff880129a11b00 ?EFLAGS: 00010246
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
> >>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
> >>>> RSI: ffff880129a11b70 RDI: 0000000000000006
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
> >>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
> >>>> R11: 0000000000000001 R12: ffff88014a1ce780
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
> >>>> R14: ffff88021fef9000 R15: 0000000000000000
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
> >>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
> >>>> knlGS:0000000000000000
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: ?0010 DS: 0000
> ES:
> >>>> 0000 CR0: 000000008005003b
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
> >>>> CR3: 0000000001c27000 CR4: 00000000000006e0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
> >>>> DR1: 0000000000000000 DR2: 0000000000000000
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
> >>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-
> transacti
> >>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
> >>>> ffff880000000003 ffff88003c60a098 0000000000000003
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
> >>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
> >>>> ffffffff810e6cee 0000000000001000 0000000000000000
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>]
> ?
> >>>> cleancache_get_key+0x4a/0x60
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>]
> ?
> >>>> btrfs_wake_function+0x0/0x20
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>]
> ?
> >>>> __cleancache_flush_inode+0x3e/0x70
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>]
> ?
> >>>> truncate_inode_pages_range+0x42/0x440
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>]
> ?
> >>>> btrfs_search_slot+0x89e/0xa00
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>]
> ?
> >>>> unmap_mapping_range+0xc5/0x2a0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>]
> ?
> >>>> truncate_pagecache+0x40/0x70
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>]
> ?
> >>>> btrfs_truncate_free_space_cache+0x81/0xe0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>]
> ?
> >>>> btrfs_write_dirty_block_groups+0x245/0x500
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>]
> ?
> >>>> btrfs_run_delayed_refs+0x1ba/0x220
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>]
> ?
> >>>> commit_cowonly_roots+0xff/0x1d0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>]
> ?
> >>>> btrfs_commit_transaction+0x363/0x760
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>]
> ?
> >>>> autoremove_wake_function+0x0/0x30
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>]
> ?
> >>>> transaction_kthread+0x283/0x2a0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>]
> ?
> >>>> transaction_kthread+0x0/0x2a0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>]
> ?
> >>>> transaction_kthread+0x0/0x2a0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>]
> ?
> >>>> kthread+0x96/0xa0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>]
> ?
> >>>> kernel_thread_helper+0x4/0x10
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>]
> ?
> >>>> kthread+0x0/0xa0
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>]
> ?
> >>>> kernel_thread_helper+0x0/0x10
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00
> 00
> >>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
> 00
> >>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00
> 00 00
> >>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP
> ?[<ffffffff8133ef1b>]
> >>>> btrfs_encode_fh+0x2b/0x120
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP
> <ffff880129a11b00>
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
> >>>> f831c5ceeaa49287 ]---
> >>>>
> >>>> in my case I had compress-force with lzo and disk_cache enabled
> >>>>
> >>>>
> >>>> another user of the kernel I'm currently running has had the same
> >>>> problem with zcache
> >>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
> >>>>
> >>>> (looks like in his case compression and any other fancy additional
> >>>> features weren't enabled)
> >>>>
> >>>>
> >>>> changes made by this kernel or patchset to btrfs are from
> >>>> * io-less dirty throttling patchset (44 patches)
> >>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
> >>>> applied in both cases)
> >>>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
> >>>> * btrfs-unstable changes to state of
> >>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals btrfs
> >>>> from 2.6.38-rc4+)
> >>>>
> >>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only, yet,
> >>>>
> >>>> but kind of upgraded btrfs to the latest state of the btrfs-unstable
> >>>> repository
> >>>> (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=summary)
> >>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
> >>>>
> >>>> this also didn't help and seemed to produce the same error-message
> >>>>
> >>>> so to summarize:
> >>>>
> >>>> 1) error message appearing with all 4 patchsets applied changing
> >>>> btrfs-code and compress-force=lzo and disk_cache enabled
> >>>>
> >>>> 2) error message appearing with default mount-options and btrfs from
> >>>> 2.6.37 and changes for zcache & io-less dirty throttling patchset
> >>>> applied (first 2 patch(sets) from the list)
> >>>>
> >>>>
> >>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
> >>>> portage-directory (lots of small files and some tar.bzip2 archives)
> >>>> via pbzip2 or 7z when the error happened and the message was shown
> >>>> Sound (webradio streaming) was still running, but due to KMS I
> >>>> couldn't continue working (X switched to kernel output), so I did
> >>>> the magic sysrq combo (reisub)
> >>>>
> >>>>
> >>>> Does that BUG message ring a bell for anyone ?
> >>>>
> >>>> (if I should leave out anyone from the CC in the next emails or
> >>>> future, please holler - I don't want to spam your inboxes)
> >>>>
> >>>> Thanks
> >>>>
> >>>> Matt
> >>>>
> >>>
> >>>
> >>> OK,
> >>>
> >>> here's the output of a kernel -
> >>>
> >>> staying as close to vanilla (2.6.37) as the current situation allows
> >>> (only including some corruption or leak fixes for zram & zcache and
> >>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
> >>> fixes)):
> >>>
> >>> so in total the following patches are included in this new kernel
> >>> (2.6.37-zcache):
> >>>
> >>> zram changes:
> >>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
> >>> 2 [PATCH] zram: fix data corruption issue
> >>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
> >>>
> >>> zcache:
> >>> 1 zcache-linux-2.6.37-110205
> >>> 2 [PATCH] staging: zcache: fix memory leak
> >>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
> >>>
> >>> reiserfs:
> >>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
> >>> 2 [patch] reiserfs: potential ERR_PTR dereference
> >>>
> >>>
> >>> the same procedure:
> >>>
> >>> trying to extract the mentioned portage-tarball:
> >>>
> >>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 | tar
> >>> -xp -C /usr/gentoo/)
> >>>
> >>>
> >>> this hopefully should make it easier to track down the problem:
> >>>
> >>>
> >>> Feb 14 01:59:59 lupus kernel: [ ?364.777143] device fsid
> >>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
> >>> /dev/mapper/portage
> >>> Feb 14 01:59:59 lupus kernel: [ ?364.844994] zcache: created
> ephemeral
> >>> tmem pool, id=2
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577573] BUG: unable to handle
> >>> kernel paging request at 0000000037610050
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577605] IP:
> [<ffffffff81338cbb>]
> >>> btrfs_encode_fh+0x2b/0x110
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577630] PGD 0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577640] Oops: 0000 [#1]
> PREEMPT SMP
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577665] last sysfs file:
> >>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577693] CPU 5
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577701] Modules linked in:
> radeon
> >>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
> >>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
> >>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
> >>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
> >>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack
> xt_mark
> >>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
> >>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
> snd_seq_midi_event
> >>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
> >>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
> >>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
> >>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
> >>> ehci_hcd [last unloaded: tg3]
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578114]
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578124] Pid: 8285, comm: tar
> Not
> >>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578146] RIP:
> >>> 0010:[<ffffffff81338cbb>] ?[<ffffffff81338cbb>]
> >>> btrfs_encode_fh+0x2b/0x110
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578172] RSP:
> >>> 0018:ffff88023ea9dcc8 ?EFLAGS: 00010246
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578189] RAX: 00000000000000ff
> >>> RBX: ffff8800b8643228 RCX: 0000000000000000
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578210] RDX: ffff88023ea9dd04
> >>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578230] RBP: 0000000037610000
> >>> R08: ffffffff81338c90 R09: 0000000000000000
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578251] R10: 0000000000000019
> >>> R11: 0000000000000001 R12: ffff8800b8643380
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578272] R13: ffff8800b8643258
> >>> R14: 00007fff806f1f00 R15: 0000000000000000
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578293] FS:
> >>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
> >>> knlGS:0000000000000000
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578317] CS: ?0010 DS: 0000 ES:
> >>> 0000 CR0: 0000000080050033
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578334] CR2: 0000000037610050
> >>> CR3: 000000023dcef000 CR4: 00000000000006e0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578356] DR0: 0000000000000000
> >>> DR1: 0000000000000000 DR2: 0000000000000000
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578377] DR3: 0000000000000000
> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578398] Process tar (pid:
> 8285,
> >>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578421] Stack:
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578428] ?000000013d096000
> >>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578458] ?ffffffffffffffff
> >>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578486] ?fffffffffffffffb
> >>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578514] Call Trace:
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578525] ?[<ffffffff810e3b1a>]
> ?
> >>> cleancache_get_key+0x4a/0x60
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578544] ?[<ffffffff810e3d5e>]
> ?
> >>> __cleancache_flush_inode+0x3e/0x70
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578565] ?[<ffffffff810b0ed2>]
> ?
> >>> truncate_inode_pages_range+0x42/0x440
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578586] ?[<ffffffff81338451>]
> ?
> >>> btrfs_tree_unlock+0x41/0x50
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578605] ?[<ffffffff812e4ed5>]
> ?
> >>> btrfs_release_path+0x15/0x70
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578624] ?[<ffffffff8130bf29>]
> ?
> >>> btrfs_run_delayed_iputs+0x49/0x120
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578644] ?[<ffffffff813107e7>]
> ?
> >>> btrfs_evict_inode+0x27/0x1e0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578663] ?[<ffffffff810fc3aa>]
> ?
> >>> evict+0x1a/0xa0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578678] ?[<ffffffff810fc6bd>]
> ?
> >>> iput+0x1cd/0x2b0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578694] ?[<ffffffff810f266f>]
> ?
> >>> do_unlinkat+0x12f/0x1d0
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578712] ?[<ffffffff810027bb>]
> ?
> >>> system_call_fastpath+0x16/0x1b
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578730] Code: 55 b8 ff 00 00
> 00
> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
> 00
> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
> 00
> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578986] RIP
> ?[<ffffffff81338cbb>]
> >>> btrfs_encode_fh+0x2b/0x110
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.579081] ?RSP
> <ffff88023ea9dcc8>
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.579093] CR2: 0000000037610050
> >>> Feb 14 02:02:49 lupus kernel: [ ?534.587513] ---[ end trace
> >>> c596b12e66c0b360 ]---
> >>>
> >>>
> >>> for reference I've pasted it to pastebin.com:
> >>>
> >>> "2.6.37_zcache_V2.patch"
> >>> http://pastebin.com/cVSkwQ6M
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> after the reboot I had forgotten not to mount the btrfs volume, and it
> >>> threw a similar error message again and remounted several partitions
> >>> read-only (including the system partition).
> >>> The partition with btrfs (/usr/gentoo) couldn't be unmounted since the
> >>> umount process hung.
> >>> so here's the error message after a reboot (might not be accurate or
> >>> kind of "skewed" since other patches are included (io-less dirty
> >>> throttling, [PATCH] fix (latent?) memory corruption in
> >>> btrfs_encode_fh() and latest changes for btrfs)) but might help to get
> >>> some more evidence:
> >>>
> >>>
> >>> Feb 14 02:05:46 lupus kernel: [ ? 63.922648] device fsid
> >>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
> >>> /dev/mapper/portage
> >>> Feb 14 02:05:46 lupus kernel: [ ? 64.047118] btrfs: unlinked 1
> orphans
> >>> Feb 14 02:05:46 lupus kernel: [ ? 64.051956] zcache: created
> ephemeral
> >>> tmem pool, id=3
> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801364] hub 2-1:1.0:
> hub_suspend
> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801376] usb 2-1: unlink
> >>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801559] usb 2-1: usb auto-
> suspend
> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797929] hub 2-0:1.0:
> hub_suspend
> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797939] usb usb2: bus auto-
> suspend
> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797942] ehci_hcd 0000:00:1d.0:
> >>> suspend root hub
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
> >>> kernel paging request at 0000030341ed0050
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP:
> [<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1]
> PREEMPT SMP
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
> >>> /sys/module/pcie_aspm/parameters/policy
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
> >>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
> >>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> >>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
> >>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner
> xt_hashlimit
> >>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack
> xt_string
> >>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy
> snd_seq_oss
> >>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> >>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel
> snd_hda_codec
> >>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp
> e1000e
> >>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
> >>> usb_storage ehci_hcd [last unloaded: tg3]
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm:
> umount
> >>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
> >>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
> >>> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
> >>> RBX: ffff88023cde0168 RCX: 0000000000000000
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
> >>> RSI: ffff88023c77d768 RDI: 0000000000000006
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
> >>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
> >>> R11: 0000000000000001 R12: 00000000ffffffff
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
> >>> R14: ffffea0007dd39f0 R15: 0000000000000001
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
> >>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
> >>> knlGS:0000000000000000
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
> >>> 0000 CR0: 000000008005003b
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
> >>> CR3: 000000023c7d5000 CR4: 00000000000006e0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
> >>> DR1: 0000000000000000 DR2: 0000000000000000
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
> >>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
> >>> 0000000000000002 ffff880200000000 0000000000000003
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
> >>> ffffffff810e6aaa ffff880200000041 0000000600000246
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
> >>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>]
> ?
> >>> cleancache_get_key+0x4a/0x60
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>]
> ?
> >>> __cleancache_get_page+0x7a/0xd0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>]
> ?
> >>> merge_state+0x7c/0x150
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>]
> ?
> >>> __extent_read_full_page+0x52e/0x710
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>]
> ?
> >>> rb_insert_color+0xa4/0x140
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>]
> ?
> >>> tree_insert+0x86/0x1e0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>]
> ?
> >>> lock_timer_base.clone.22+0x33/0x70
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>]
> ?
> >>> btree_get_extent+0x0/0x1c0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>]
> ?
> >>> read_extent_buffer_pages+0x2d1/0x470
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>]
> ?
> >>> btree_get_extent+0x0/0x1c0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>]
> ?
> >>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>]
> ?
> >>> read_tree_block+0x39/0x60
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>]
> ?
> >>> read_block_for_search.clone.40+0x116/0x410
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>]
> ?
> >>> btrfs_cow_block+0x118/0x2b0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>]
> ?
> >>> btrfs_search_slot+0x307/0xa00
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>]
> ?
> >>> lookup_inline_extent_backref+0x98/0x4a0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>]
> ?
> >>> kmem_cache_alloc+0x87/0xa0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>]
> ?
> >>> __btrfs_free_extent+0xcc/0x6f0
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>]
> ?
> >>> run_clustered_refs+0x39f/0x880
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>]
> ?
> >>> pagevec_lookup_tag+0x18/0x20
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>]
> ?
> >>> filemap_fdatawait_range+0x91/0x180
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>]
> ?
> >>> btrfs_run_delayed_refs+0xc7/0x220
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>]
> ?
> >>> btrfs_commit_transaction+0x7c/0x760
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>]
> ?
> >>> autoremove_wake_function+0x0/0x30
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>]
> ?
> >>> start_transaction+0x1bf/0x270
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>]
> ?
> >>> __sync_filesystem+0x5a/0x90
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>]
> ?
> >>> generic_shutdown_super+0x2d/0x100
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>]
> ?
> >>> kill_anon_super+0x9/0x50
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>]
> ?
> >>> deactivate_locked_super+0x26/0x80
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>]
> ?
> >>> sys_umount+0x7a/0x390
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>]
> ?
> >>> system_call_fastpath+0x16/0x1b
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00
> 00
> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
> 00
> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
> 00
> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP
> ?[<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP
> <ffff88023c77d6f8>
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
> >>> 3313552d105b1535 ]---
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
> >>> kernel paging request at 0000030341ed0050
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP:
> [<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2]
> PREEMPT SMP
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
> >>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
> >>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
> >>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
> >>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
> >>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner
> xt_hashlimit
> >>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack
> xt_string
> >>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy
> snd_seq_oss
> >>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
> >>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel
> snd_hda_codec
> >>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp
> e1000e
> >>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
> >>> usb_storage ehci_hcd [last unloaded: tg3]
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
> >>> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
> >>> FMP55/ipower G3710
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
> >>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
> >>> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
> >>> RBX: ffff88023cde0168 RCX: 0000000000000000
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
> >>> RSI: ffff88023c63b750 RDI: 0000000000000006
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
> >>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
> >>> R11: 0000000000000001 R12: 00000000ffffffff
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
> >>> R14: ffffea0007d59bc8 R15: 0000000000000001
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
> >>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
> >>> knlGS:0000000000000000
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
> >>> 0000 CR0: 000000008005003b
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
> >>> CR3: 0000000001c27000 CR4: 00000000000006e0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
> >>> DR1: 0000000000000000 DR2: 0000000000000000
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-
> transacti
> >>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
> >>> 0000000000000002 0000000000000000 0000000000000003
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
> >>> ffffffff810e6aaa 0000000000000041 0000000600000002
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
> >>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>]
> ?
> >>> cleancache_get_key+0x4a/0x60
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>]
> ?
> >>> __cleancache_get_page+0x7a/0xd0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>]
> ?
> >>> __extent_read_full_page+0x52e/0x710
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>]
> ?
> >>> update_reserved_bytes+0xb3/0x140
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>]
> ?
> >>> btree_get_extent+0x0/0x1c0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>]
> ?
> >>> merge_state+0x7c/0x150
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>]
> ?
> >>> read_extent_buffer_pages+0x2d1/0x470
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>]
> ?
> >>> btree_get_extent+0x0/0x1c0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>]
> ?
> >>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>]
> ?
> >>> read_tree_block+0x39/0x60
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>]
> ?
> >>> read_block_for_search.clone.40+0x116/0x410
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>]
> ?
> >>> btrfs_search_slot+0x307/0xa00
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>]
> ?
> >>> lookup_inline_extent_backref+0x98/0x4a0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>]
> ?
> >>> kmem_cache_alloc+0x87/0xa0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>]
> ?
> >>> __btrfs_free_extent+0xcc/0x6f0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>]
> ?
> >>> update_block_group.clone.62+0xc4/0x280
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>]
> ?
> >>> run_clustered_refs+0x39f/0x880
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>]
> ?
> >>> btrfs_run_delayed_refs+0xc7/0x220
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>]
> ?
> >>> new_slab+0x169/0x1f0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>]
> ?
> >>> btrfs_commit_transaction+0x7c/0x760
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>]
> ?
> >>> autoremove_wake_function+0x0/0x30
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>]
> ?
> >>> transaction_kthread+0x283/0x2a0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>]
> ?
> >>> transaction_kthread+0x0/0x2a0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>]
> ?
> >>> transaction_kthread+0x0/0x2a0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>]
> ?
> >>> kthread+0x96/0xa0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>]
> ?
> >>> kernel_thread_helper+0x4/0x10
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>]
> ?
> >>> kthread+0x0/0xa0
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>]
> ?
> >>> kernel_thread_helper+0x0/0x10
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00
> 00
> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
> 00
> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
> 00
> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974280] RIP
> ?[<ffffffff8133ef1b>]
> >>> btrfs_encode_fh+0x2b/0x120
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974412] ?RSP
> <ffff88023c63b6e0>
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974497] CR2: 0000030341ed0050
> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974599] ---[ end trace
> >>> 3313552d105b1536 ]---
> >>> Feb 14 02:07:04 lupus kernel: [ ?141.906124] zcache: destroyed pool
> id=2
> >>> Feb 14 02:07:17 lupus kernel: [ ?154.783358] SysRq : Keyboard mode
> set
> >>> to system default
> >>> Feb 14 02:07:18 lupus kernel: [ ?155.486147] SysRq : Terminate All
> Tasks
> >>>
> >>>
> >>> That's all for now
> >>>
> >>> Thanks & Regards
> >>>
> >>> Matt
> >>>
> >>
> >> (leaving out several folks from the CC to avoid spamming - if I left
> >> out someone wrongfully please re-add)
> >>
> >> running an addr2line reveals:
> >>
> >>
> >> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
> >> export.c:0
> >>
> >>
> >> hope that helps
> >>
> >>
> >> Regards
> >>
> >> Matt
> >>
> >
> > Just a guess; I might be wrong.
> >
> > __cleancache_flush_inode calls cleancache_get_key with a
> > cleancache_filekey, whose size is just 6 * u32.
> > cleancache_get_key calls btrfs_encode_fh with that key, but
> > btrfs_encode_fh typecasts the key to a btrfs_fid, which is bigger
> > than cleancache_filekey, so it could access fields beyond the end
> > of the key, which it should not.
> >
> > I think some file systems use an extended fid, so this problem can
> > happen there. I don't know why we didn't find it earlier; maybe
> > because Dan and others have tested it for a long time.
> >
> > Am I missing something?
> >
> >
> >
> > --
> > Kind regards,
> > Minchan Kim
> >
>
> reposting Minchan's message for reference to the btrfs mailing list
> while also adding
>
> Li Zefan, Miao Xie, Yan Zheng, Dan Rosenberg and Josef Bacik to CC
>
> Regards
>
> Matt

Hi Matt and Minchan --

(BTRFS EXPERTS SEE *** BELOW)

I definitely see a bug in cleancache_get_key in the monolithic
zcache+cleancache+frontswap patch I posted on oss.oracle.com. It is
already corrected in linux-next, but I don't see how btrfs could
provoke it.

The bug is that, in cleancache_get_key, the return value of fhfn should
be checked against 255. If the return value is 255, cleancache_get_key
should return -1. This should disable cleancache for any filesystem
whose file handle does not fit in CLEANCACHE_KEY_MAX 32-bit words.
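
To make the intended check concrete, here is a minimal userspace sketch.
It is not the code from mm/cleancache.c: the hook signature is simplified,
and model_get_key, small_fs_encode and big_fs_encode are hypothetical names
made up for illustration. The only behaviour it assumes is the one described
above, namely that the hook returns 255 when the buffer is too small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_MAX 6          /* stands in for CLEANCACHE_KEY_MAX (u32 words) */
#define FH_TOO_SMALL 255   /* what an encode_fh-style hook returns on overflow */

struct filekey {
	uint32_t key[KEY_MAX];
};

/* Hypothetical, simplified stand-in for a filesystem's encode_fh hook. */
typedef int (*encode_fh_fn)(uint32_t *fh, int *max_len);

/* Returns 0 on success, -1 if the handle cannot be used as a key. */
static int model_get_key(encode_fh_fn fhfn, struct filekey *key)
{
	int maxlen = KEY_MAX;
	int len;

	assert(fhfn != NULL && key != NULL);
	memset(key, 0, sizeof(*key));
	len = fhfn(key->key, &maxlen);
	if (len <= 0 || len == FH_TOO_SMALL)
		return -1;      /* the check missing from the monolithic patch */
	if (maxlen > KEY_MAX)
		return -1;      /* hook claims to have used more than we offered */
	return 0;
}

/* A hook that needs 5 words, so it fits in a 6-word key. */
static int small_fs_encode(uint32_t *fh, int *max_len)
{
	if (*max_len < 5)
		return FH_TOO_SMALL;
	for (int i = 0; i < 5; i++)
		fh[i] = (uint32_t)(i + 1);
	*max_len = 5;
	return 1;               /* any non-error fileid type */
}

/* A hook that needs 8 words and therefore cannot use a 6-word key. */
static int big_fs_encode(uint32_t *fh, int *max_len)
{
	(void)fh;
	if (*max_len < 8)
		return FH_TOO_SMALL;
	*max_len = 8;
	return 2;
}
```

With these two hooks, model_get_key succeeds for small_fs_encode and
returns -1 for big_fs_encode, which is the "cleancache disabled for this
filesystem" outcome.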

But cleancache_get_key always calls fhfn with connectable == 0, and
CLEANCACHE_KEY_MAX==6 should be greater than
BTRFS_FID_SIZE_NON_CONNECTABLE (which I think should be 5?). So the
elements written into the typecast btrfs_fid should only cover the
first 5 32-bit words.
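
The word counts can be checked with a small standalone program. The
struct below is my recollection of struct btrfs_fid from
fs/btrfs/export.h, so treat the layout as an assumption; fid_model,
KEY_WORDS, the FID_WORDS_* macros and fits_in_key are names invented for
this sketch. It relies on the GCC/Clang packed attribute and on C11
static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, modelled on struct btrfs_fid (packed, no padding). */
struct fid_model {
	uint64_t objectid;
	uint64_t root_objectid;
	uint32_t gen;
	uint64_t parent_objectid;
	uint32_t parent_gen;
	uint64_t parent_root_objectid;
} __attribute__((packed));

#define KEY_WORDS 6  /* stands in for CLEANCACHE_KEY_MAX */

/* Handle sizes in 32-bit words, derived the same way btrfs derives them. */
#define FID_WORDS_NON_CONNECTABLE \
	(offsetof(struct fid_model, parent_objectid) / 4)
#define FID_WORDS_CONNECTABLE \
	(offsetof(struct fid_model, parent_root_objectid) / 4)
#define FID_WORDS_CONNECTABLE_ROOT (sizeof(struct fid_model) / 4)

static_assert(FID_WORDS_NON_CONNECTABLE == 5, "non-connectable handle");
static_assert(FID_WORDS_CONNECTABLE == 8, "connectable handle");
static_assert(FID_WORDS_CONNECTABLE_ROOT == 10, "connectable handle + root");

/* True if a handle of the given size fits in the cleancache key. */
static int fits_in_key(size_t words)
{
	return words <= KEY_WORDS;
}
```

If the layout assumption holds, only the non-connectable form fits in a
6-word key, which is the form requested by connectable == 0.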

So if the problem is the one Minchan suggests, I am confused. Matt,
can you first confirm that you are using the cleancache patches from
my monolithic patch from oss.oracle.com (which I think you are)?

***

Looking over the stacktrace and the code, I have an alternate theory.
I wonder if it is ever possible the inode->dentry list is empty
(or corrupt)? list_first_entry() assumes the list is non-empty.
If this is possible and unusual, maybe my testing didn't see the
problem?
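
To illustrate why an empty list matters: on an empty list head->next
points back at the head, so list_first_entry() would compute an entry
pointer from the list head itself rather than from a real entry. Below is
a minimal userspace model of the guard I have in mind. The list helpers
imitate include/linux/list.h; inode_model, dentry_model and
first_alias_or_null are hypothetical names, and the i_dentry/d_alias
field names are borrowed from the kernel only for readability.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
/* Like the kernel macro: valid only when the list is non-empty. */
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

struct inode_model {
	struct list_head i_dentry;   /* list of aliases for this inode */
};

struct dentry_model {
	int id;
	struct list_head d_alias;    /* link in inode_model.i_dentry */
};

/* Return the first alias, or NULL instead of a bogus pointer if none. */
static struct dentry_model *first_alias_or_null(struct inode_model *inode)
{
	assert(inode != NULL);
	if (list_empty(&inode->i_dentry))
		return NULL;
	return list_first_entry(&inode->i_dentry, struct dentry_model, d_alias);
}
```

A caller that gets NULL back would have to give up on building the key
rather than pass a made-up dentry on to the filesystem.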

Thanks,
Dan

P.S. For those new to cleancache, the code is in linux-next here:
http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=mm/cleancache.c;h=f545eb8f11180cfb3aaf3f4f85a5255be8f9f881;hb=a57cb3bc013d2e262a663df50af6a9e7cc88bdad

2011-02-16 01:58:20

by Matt

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Wed, Feb 16, 2011 at 1:27 AM, Dan Magenheimer
<[email protected]> wrote:
>> -----Original Message-----
>> From: Matt [mailto:[email protected]]
>> Sent: Tuesday, February 15, 2011 5:12 PM
>> To: Minchan Kim
>> Cc: Dan Magenheimer; [email protected]; Chris Mason; linux-
>> [email protected]; [email protected]; [email protected]; linux-
>> [email protected]; Josef Bacik; Dan Rosenberg; Yan Zheng;
>> [email protected]; Li Zefan
>> Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page
>> cache/swap compression
>>
>> On Mon, Feb 14, 2011 at 4:35 AM, Minchan Kim <[email protected]>
>> wrote:
>> > On Mon, Feb 14, 2011 at 10:29 AM, Matt <[email protected]> wrote:
>> >> On Mon, Feb 14, 2011 at 1:24 AM, Matt <[email protected]> wrote:
>> >>> On Mon, Feb 14, 2011 at 12:08 AM, Matt <[email protected]>
>> wrote:
>> >>>> On Wed, Feb 9, 2011 at 1:03 AM, Dan Magenheimer
>> >>>> <[email protected]> wrote:
>> >>>> [snip]
>> >>>>>
>> >>>>> If I've missed anything important, please let me know!
>> >>>>>
>> >>>>> Thanks again!
>> >>>>> Dan
>> >>>>>
>> >>>>
>> >>>> Hi Dan,
>> >>>>
>> >>>> thank you so much for answering my email in such detail !
>> >>>>
>> >>>> I shall pick up on that mail in my next email sending to the
>> mailing list :)
>> >>>>
>> >>>>
>> >>>> currently I've got a problem with btrfs which seems to get
>> triggered
>> >>>> by cleancache get-operations:
>> >>>>
>> >>>>
>> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297377] device fsid
>> >>>> 354120c992a00761-5fa07d400126a895 devid 1 transid 7
>> >>>> /dev/mapper/portage
>> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297698] btrfs: enabling disk
>> space caching
>> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.297700] btrfs: force lzo
>> compression
>> >>>> Feb 14 00:37:19 lupus kernel: [ 2831.315844] zcache: created
>> ephemeral
>> >>>> tmem pool, id=3
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853188] BUG: unable to handle
>> >>>> kernel paging request at 0000000001400050
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853219] IP:
>> [<ffffffff8133ef1b>]
>> >>>> btrfs_encode_fh+0x2b/0x120
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853242] PGD 0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853251] Oops: 0000 [#1]
>> PREEMPT SMP
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853275] last sysfs file:
>> >>>> /sys/devices/platform/coretemp.3/temp1_input
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853295] CPU 4
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853303] Modules linked in:
>> radeon
>> >>>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>> >>>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>> >>>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> nf_conntrack_ftp
>> >>>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>> >>>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack
>> xt_mark
>> >>>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables
>> x_tables
>> >>>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event
>> >>>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> snd_hda_codec_hdmi
>> >>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep
>> snd_pcm
>> >>>> snd_timer snd soundcore i2c_i801 wmi e1000e shpchp snd_page_alloc
>> >>>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>> >>>> ehci_hcd [last unloaded: tg3]
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853682]
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853690] Pid: 11394, comm:
>> >>>> btrfs-transacti Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower
>> >>>> G3710
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853725] RIP:
>> >>>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>> >>>> btrfs_encode_fh+0x2b/0x120
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853751] RSP:
>> >>>> 0018:ffff880129a11b00 ?EFLAGS: 00010246
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853767] RAX: 00000000000000ff
>> >>>> RBX: ffff88014a1ce628 RCX: 0000000000000000
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853788] RDX: ffff880129a11b3c
>> >>>> RSI: ffff880129a11b70 RDI: 0000000000000006
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853808] RBP: 0000000001400000
>> >>>> R08: ffffffff8133eef0 R09: ffff880129a11c68
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853829] R10: 0000000000000001
>> >>>> R11: 0000000000000001 R12: ffff88014a1ce780
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853849] R13: ffff88021fefc000
>> >>>> R14: ffff88021fef9000 R15: 0000000000000000
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853870] FS:
>> >>>> 0000000000000000(0000) GS:ffff8800bf500000(0000)
>> >>>> knlGS:0000000000000000
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853894] CS: ?0010 DS: 0000
>> ES:
>> >>>> 0000 CR0: 000000008005003b
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853911] CR2: 0000000001400050
>> >>>> CR3: 0000000001c27000 CR4: 00000000000006e0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853932] DR0: 0000000000000000
>> >>>> DR1: 0000000000000000 DR2: 0000000000000000
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853952] DR3: 0000000000000000
>> >>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853973] Process btrfs-
>> transacti
>> >>>> (pid: 11394, threadinfo ffff880129a10000, task ffff880202e4ac40)
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.853999] Stack:
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854006] ?ffff880129a11b50
>> >>>> ffff880000000003 ffff88003c60a098 0000000000000003
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854035] ?ffffffffffffffff
>> >>>> ffffffff810e6aaa 0000000000000000 0000000602e4ac40
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854063] ?ffffffff8133e3f0
>> >>>> ffffffff810e6cee 0000000000001000 0000000000000000
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854092] Call Trace:
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854103] ?[<ffffffff810e6aaa>]
>> ?
>> >>>> cleancache_get_key+0x4a/0x60
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854122] ?[<ffffffff8133e3f0>]
>> ?
>> >>>> btrfs_wake_function+0x0/0x20
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854140] ?[<ffffffff810e6cee>]
>> ?
>> >>>> __cleancache_flush_inode+0x3e/0x70
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854161] ?[<ffffffff810b34d2>]
>> ?
>> >>>> truncate_inode_pages_range+0x42/0x440
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854182] ?[<ffffffff812f115e>]
>> ?
>> >>>> btrfs_search_slot+0x89e/0xa00
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854201] ?[<ffffffff810c3a45>]
>> ?
>> >>>> unmap_mapping_range+0xc5/0x2a0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854220] ?[<ffffffff810b3930>]
>> ?
>> >>>> truncate_pagecache+0x40/0x70
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854240] ?[<ffffffff813458b1>]
>> ?
>> >>>> btrfs_truncate_free_space_cache+0x81/0xe0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854261] ?[<ffffffff812fce15>]
>> ?
>> >>>> btrfs_write_dirty_block_groups+0x245/0x500
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854283] ?[<ffffffff812fcb6a>]
>> ?
>> >>>> btrfs_run_delayed_refs+0x1ba/0x220
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854304] ?[<ffffffff8130afff>]
>> ?
>> >>>> commit_cowonly_roots+0xff/0x1d0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854323] ?[<ffffffff8130c583>]
>> ?
>> >>>> btrfs_commit_transaction+0x363/0x760
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854344] ?[<ffffffff81067ea0>]
>> ?
>> >>>> autoremove_wake_function+0x0/0x30
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854364] ?[<ffffffff81305bc3>]
>> ?
>> >>>> transaction_kthread+0x283/0x2a0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854383] ?[<ffffffff81305940>]
>> ?
>> >>>> transaction_kthread+0x0/0x2a0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854401] ?[<ffffffff81305940>]
>> ?
>> >>>> transaction_kthread+0x0/0x2a0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854420] ?[<ffffffff81067a16>]
>> ?
>> >>>> kthread+0x96/0xa0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854437] ?[<ffffffff81003514>]
>> ?
>> >>>> kernel_thread_helper+0x4/0x10
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854455] ?[<ffffffff81067980>]
>> ?
>> >>>> kthread+0x0/0xa0
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854471] ?[<ffffffff81003510>]
>> ?
>> >>>> kernel_thread_helper+0x0/0x10
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854488] Code: 55 b8 ff 00 00
>> 00
>> >>>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
>> 00
>> >>>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00
>> 00 00
>> >>>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854742] RIP
>> ?[<ffffffff8133ef1b>]
>> >>>> btrfs_encode_fh+0x2b/0x120
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854762] ?RSP
>> <ffff880129a11b00>
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.854773] CR2: 0000000001400050
>> >>>> Feb 14 00:39:20 lupus kernel: [ 2951.860906] ---[ end trace
>> >>>> f831c5ceeaa49287 ]---
>> >>>>
>> >>>> in my case I had compress-force with lzo and disk_cache enabled
>> >>>>
>> >>>>
>> >>>> another user of the kernel I'm currently running has had the same
>> >>>> problem with zcache
>> >>>> (http://forums.gentoo.org/viewtopic-p-6571799.html#6571799)
>> >>>>
>> >>>> (looks like in his case compression and any other fancy additional
>> >>>> features weren't enabled)
>> >>>>
>> >>>>
>> >>>> changes made by this kernel or patchset to btrfs are from
>> >>>> * io-less dirty throttling patchset (44 patches)
>> >>>> * zcache V2 ("[PATCH] staging: zcache: fix memory leak" should be
>> >>>> applied in both cases)
>> >>>> * [PATCH] fix (latent?) memory corruption in btrfs_encode_fh()
>> >>>> * btrfs-unstable changes to state of
>> >>>> 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6 (so practically equals
>> btrfs
>> >>>> from 2.6.38-rc4+)
>> >>>>
>> >>>> I haven't tried downgrading to vanilla 2.6.37 with zcache only,
>> yet,
>> >>>>
>> >>>> but kind of upgraded btrfs to the latest state of the btrfs-
>> unstable
>> >>>> repository
>> (http://git.eu.kernel.org/?p=linux/kernel/git/mason/btrfs-
>> unstable.git;a=summary)
>> >>>> namely 3a90983dbdcb2f4f48c0d771d8e5b4d88f27fae6
>> >>>>
>> >>>> this also didn't help and seemed to produce the same error-message
>> >>>>
>> >>>> so to summarize:
>> >>>>
>> >>>> 1) the error message appears with all 4 patchsets that change
>> >>>> btrfs code applied, and compress-force=lzo and disk_cache enabled
>> >>>>
>> >>>> 2) the error message appears with default mount options, btrfs
>> >>>> from 2.6.37, and only the zcache & io-less dirty throttling
>> >>>> changes applied (the first 2 patch(sets) from the list)
>> >>>>
>> >>>>
>> >>>> in my case I tried to extract / play back a 1.7 GiB tarball of my
>> >>>> portage-directory (lots of small files and some tar.bzip2
>> archives)
>> >>>> via pbzip2 or 7z when the error happened and the message was shown
>> >>>>
>> >>>> Due to KMS sound (webradio streaming) was still running but I
>> couldn't
>> >>>> continue work (X switching to kernel output) so I did the magic
>> sysrq
>> >>>> combo (reisub)
>> >>>>
>> >>>>
>> >>>> Does that BUG message ring a bell for anyone ?
>> >>>>
>> >>>> (if I should leave out anyone from the CC in the next emails or
>> >>>> future, please holler - I don't want to spam your inboxes)
>> >>>>
>> >>>> Thanks
>> >>>>
>> >>>> Matt
>> >>>>
>> >>>
>> >>>
>> >>> OK,
>> >>>
>> >>> here's the output of a kernel -
>> >>>
>> >>> staying as close to vanilla (2.6.37) as the current situation
>> allows
>> >>> (only including some corruption or leak fixes for zram & zcache and
>> >>> "zram_xvmalloc: 64K page fixes and optimizations" (and 2 reiserfs
>> >>> fixes)):
>> >>>
>> >>> so in total the following patches are included in this new kernel
>> >>> (2.6.37-zcache):
>> >>>
>> >>> zram changes:
>> >>> 1 zram: Fix sparse warning 'Using plain integer as NULL pointer'
>> >>> 2 [PATCH] zram: fix data corruption issue
>> >>> 3 [PATCH 0/7][v2] zram_xvmalloc: 64K page fixes and optimizations
>> >>>
>> >>> zcache:
>> >>> 1 zcache-linux-2.6.37-110205
>> >>> 2 [PATCH] staging: zcache: fix memory leak
>> >>> 3 [PATCH] zcache: Fix build error when sysfs is not defined
>> >>>
>> >>> reiserfs:
>> >>> 1 [PATCH] reiserfs: Make sure va_end() is always called after
>> >>> 2 [patch] reiserfs: potential ERR_PTR dereference
>> >>>
>> >>>
>> >>> the same procedure:
>> >>>
>> >>> trying to extract the mentioned portage-tarball:
>> >>>
>> >>> time (7z e -so -tbzip2 -mmt=5 /system/portage_backup_022011.tbz2 |
>> tar
>> >>> -xp -C /usr/gentoo/)
>> >>>
>> >>>
>> >>> this hopefully should make it easier to track down the problem:
>> >>>
>> >>>
>> >>> Feb 14 01:59:59 lupus kernel: [ ?364.777143] device fsid
>> >>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 7
>> >>> /dev/mapper/portage
>> >>> Feb 14 01:59:59 lupus kernel: [ ?364.844994] zcache: created
>> ephemeral
>> >>> tmem pool, id=2
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577573] BUG: unable to handle
>> >>> kernel paging request at 0000000037610050
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577605] IP:
>> [<ffffffff81338cbb>]
>> >>> btrfs_encode_fh+0x2b/0x110
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577630] PGD 0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577640] Oops: 0000 [#1]
>> PREEMPT SMP
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577665] last sysfs file:
>> >>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577693] CPU 5
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.577701] Modules linked in:
>> radeon
>> >>> ttm drm_kms_helper cfbcopyarea cfbimgblt cfbfillrect ipt_REJECT
>> >>> ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc nf_conntrack_irc
>> >>> nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp
>> >>> iptable_filter ipt_addrtype xt_DSCP xt_dscp xt_iprange ip_tables
>> >>> ip6table_filter xt_NFQUEUE xt_owner xt_hashlimit xt_conntrack
>> xt_mark
>> >>> xt_multiport xt_connmark nf_conntrack xt_string ip6_tables x_tables
>> >>> it87 hwmon_vid coretemp snd_seq_dummy snd_seq_oss
>> snd_seq_midi_event
>> >>> snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_codec_hdmi
>> >>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm
>> >>> snd_timer snd e1000e soundcore i2c_i801 shpchp snd_page_alloc wmi
>> >>> libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb usb_storage
>> >>> ehci_hcd [last unloaded: tg3]
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578114]
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578124] Pid: 8285, comm: tar
>> Not
>> >>> tainted 2.6.37-zcache #2 FMP55/ipower G3710
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578146] RIP:
>> >>> 0010:[<ffffffff81338cbb>] ?[<ffffffff81338cbb>]
>> >>> btrfs_encode_fh+0x2b/0x110
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578172] RSP:
>> >>> 0018:ffff88023ea9dcc8 ?EFLAGS: 00010246
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578189] RAX: 00000000000000ff
>> >>> RBX: ffff8800b8643228 RCX: 0000000000000000
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578210] RDX: ffff88023ea9dd04
>> >>> RSI: ffff88023ea9dd38 RDI: 0000000000000006
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578230] RBP: 0000000037610000
>> >>> R08: ffffffff81338c90 R09: 0000000000000000
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578251] R10: 0000000000000019
>> >>> R11: 0000000000000001 R12: ffff8800b8643380
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578272] R13: ffff8800b8643258
>> >>> R14: 00007fff806f1f00 R15: 0000000000000000
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578293] FS:
>> >>> 00007f823d7ed700(0000) GS:ffff8800bf540000(0000)
>> >>> knlGS:0000000000000000
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578317] CS: ?0010 DS: 0000 ES:
>> >>> 0000 CR0: 0000000080050033
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578334] CR2: 0000000037610050
>> >>> CR3: 000000023dcef000 CR4: 00000000000006e0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578356] DR0: 0000000000000000
>> >>> DR1: 0000000000000000 DR2: 0000000000000000
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578377] DR3: 0000000000000000
>> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578398] Process tar (pid:
>> 8285,
>> >>> threadinfo ffff88023ea9c000, task ffff88023e8b9d40)
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578421] Stack:
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578428] ?000000013d096000
>> >>> ffff88023ed84800 ffff88023ea9c000 0000000000000002
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578458] ?ffffffffffffffff
>> >>> ffffffff810e3b1a 0000000000000001 000000061e1d5240
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578486] ?fffffffffffffffb
>> >>> ffffffff810e3d5e ffff88010f383000 0000001ab86cb908
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578514] Call Trace:
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578525] ?[<ffffffff810e3b1a>]
>> ?
>> >>> cleancache_get_key+0x4a/0x60
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578544] ?[<ffffffff810e3d5e>]
>> ?
>> >>> __cleancache_flush_inode+0x3e/0x70
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578565] ?[<ffffffff810b0ed2>]
>> ?
>> >>> truncate_inode_pages_range+0x42/0x440
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578586] ?[<ffffffff81338451>]
>> ?
>> >>> btrfs_tree_unlock+0x41/0x50
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578605] ?[<ffffffff812e4ed5>]
>> ?
>> >>> btrfs_release_path+0x15/0x70
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578624] ?[<ffffffff8130bf29>]
>> ?
>> >>> btrfs_run_delayed_iputs+0x49/0x120
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578644] ?[<ffffffff813107e7>]
>> ?
>> >>> btrfs_evict_inode+0x27/0x1e0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578663] ?[<ffffffff810fc3aa>]
>> ?
>> >>> evict+0x1a/0xa0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578678] ?[<ffffffff810fc6bd>]
>> ?
>> >>> iput+0x1cd/0x2b0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578694] ?[<ffffffff810f266f>]
>> ?
>> >>> do_unlinkat+0x12f/0x1d0
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578712] ?[<ffffffff810027bb>]
>> ?
>> >>> system_call_fastpath+0x16/0x1b
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578730] Code: 55 b8 ff 00 00
>> 00
>> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
>> 00
>> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
>> 00
>> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.578986] RIP
>> ?[<ffffffff81338cbb>]
>> >>> btrfs_encode_fh+0x2b/0x110
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.579081] ?RSP
>> <ffff88023ea9dcc8>
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.579093] CR2: 0000000037610050
>> >>> Feb 14 02:02:49 lupus kernel: [ ?534.587513] ---[ end trace
>> >>> c596b12e66c0b360 ]---
>> >>>
>> >>>
>> >>> for reference I've pasted it to pastebin.com:
>> >>>
>> >>> "2.6.37_zcache_V2.patch"
>> >>> http://pastebin.com/cVSkwQ6M
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> After the reboot I forgot to skip mounting the btrfs volume, and it
>> >>> threw a similar error message again and remounted several partitions
>> >>> read-only (including the system partition).
>> >>> The btrfs partition (/usr/gentoo) couldn't be unmounted because the
>> >>> umount process hung.
>> >>>
>> >>> So here's the error message after a reboot. It might not be accurate,
>> >>> or might be somewhat "skewed", since other patches are included
>> >>> (io-less dirty throttling, "[PATCH] fix (latent?) memory corruption in
>> >>> btrfs_encode_fh()", and the latest changes for btrfs), but it might
>> >>> help to get some more evidence:
>> >>>
>> >>>
>> >>> Feb 14 02:05:46 lupus kernel: [ ? 63.922648] device fsid
>> >>> 684a4213565dd3fe-ca991821badc2aac devid 1 transid 13
>> >>> /dev/mapper/portage
>> >>> Feb 14 02:05:46 lupus kernel: [ ? 64.047118] btrfs: unlinked 1
>> orphans
>> >>> Feb 14 02:05:46 lupus kernel: [ ? 64.051956] zcache: created
>> ephemeral
>> >>> tmem pool, id=3
>> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801364] hub 2-1:1.0:
>> hub_suspend
>> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801376] usb 2-1: unlink
>> >>> qh256-0001/ffff88023fefd180 start 1 [1/0 us]
>> >>> Feb 14 02:05:48 lupus kernel: [ ? 65.801559] usb 2-1: usb auto-
>> suspend
>> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797929] hub 2-0:1.0:
>> hub_suspend
>> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797939] usb usb2: bus auto-
>> suspend
>> >>> Feb 14 02:05:50 lupus kernel: [ ? 67.797942] ehci_hcd 0000:00:1d.0:
>> >>> suspend root hub
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050493] BUG: unable to handle
>> >>> kernel paging request at 0000030341ed0050
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050670] IP:
>> [<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050807] PGD 0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.050929] Oops: 0000 [#1]
>> PREEMPT SMP
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051223] last sysfs file:
>> >>> /sys/module/pcie_aspm/parameters/policy
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051365] CPU 6
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.051411] Modules linked in:
>> >>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> >>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> >>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> >>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner
>> xt_hashlimit
>> >>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack
>> xt_string
>> >>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy
>> snd_seq_oss
>> >>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> >>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel
>> snd_hda_codec
>> >>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp
>> e1000e
>> >>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> >>> usb_storage ehci_hcd [last unloaded: tg3]
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054694]
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054776] Pid: 7962, comm:
>> umount
>> >>> Not tainted 2.6.37-plus_v16_zcache #4 FMP55/ipower G3710
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.054912] RIP:
>> >>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055084] RSP:
>> >>> 0018:ffff88023c77d6f8 ?EFLAGS: 00010246
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055173] RAX: 00000000000000ff
>> >>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055265] RDX: ffff88023c77d734
>> >>> RSI: ffff88023c77d768 RDI: 0000000000000006
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055357] RBP: 0000030341ed0000
>> >>> R08: ffffffff8133eef0 R09: ffff88023c77d8d8
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055448] R10: 0000000000000003
>> >>> R11: 0000000000000001 R12: 00000000ffffffff
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055540] R13: ffff88023cde0030
>> >>> R14: ffffea0007dd39f0 R15: 0000000000000001
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055633] FS:
>> >>> 00007fb1cad04760(0000) GS:ffff8800bf580000(0000)
>> >>> knlGS:0000000000000000
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055762] CS: ?0010 DS: 0000 ES:
>> >>> 0000 CR0: 000000008005003b
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055851] CR2: 0000030341ed0050
>> >>> CR3: 000000023c7d5000 CR4: 00000000000006e0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.055943] DR0: 0000000000000000
>> >>> DR1: 0000000000000000 DR2: 0000000000000000
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056035] DR3: 0000000000000000
>> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056128] Process umount (pid:
>> >>> 7962, threadinfo ffff88023c77c000, task ffff88023c7a4260)
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056257] Stack:
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056338] ?0000000000000000
>> >>> 0000000000000002 ffff880200000000 0000000000000003
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056630] ?ffffea0007dd39f0
>> >>> ffffffff810e6aaa ffff880200000041 0000000600000246
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.056922] ?ffff88023cdcd300
>> >>> ffffffff810e6b3a 0000000000000001 ffffffff8132bb7c
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057213] Call Trace:
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057301] ?[<ffffffff810e6aaa>]
>> ?
>> >>> cleancache_get_key+0x4a/0x60
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057393] ?[<ffffffff810e6b3a>]
>> ?
>> >>> __cleancache_get_page+0x7a/0xd0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057487] ?[<ffffffff8132bb7c>]
>> ?
>> >>> merge_state+0x7c/0x150
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057579] ?[<ffffffff8132e4de>]
>> ?
>> >>> __extent_read_full_page+0x52e/0x710
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057673] ?[<ffffffff813bdea4>]
>> ?
>> >>> rb_insert_color+0xa4/0x140
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057766] ?[<ffffffff8134b0b6>]
>> ?
>> >>> tree_insert+0x86/0x1e0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.057859] ?[<ffffffff81058c73>]
>> ?
>> >>> lock_timer_base.clone.22+0x33/0x70
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058004] ?[<ffffffff81305060>]
>> ?
>> >>> btree_get_extent+0x0/0x1c0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058097] ?[<ffffffff81330b21>]
>> ?
>> >>> read_extent_buffer_pages+0x2d1/0x470
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058191] ?[<ffffffff81305060>]
>> ?
>> >>> btree_get_extent+0x0/0x1c0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058283] ?[<ffffffff8130674d>]
>> ?
>> >>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058415] ?[<ffffffff813076f9>]
>> ?
>> >>> read_tree_block+0x39/0x60
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058508] ?[<ffffffff812ed5e6>]
>> ?
>> >>> read_block_for_search.clone.40+0x116/0x410
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058638] ?[<ffffffff812eb228>]
>> ?
>> >>> btrfs_cow_block+0x118/0x2b0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058731] ?[<ffffffff812f0bc7>]
>> ?
>> >>> btrfs_search_slot+0x307/0xa00
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058823] ?[<ffffffff812f6b18>]
>> ?
>> >>> lookup_inline_extent_backref+0x98/0x4a0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.058919] ?[<ffffffff810e33d7>]
>> ?
>> >>> kmem_cache_alloc+0x87/0xa0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059032] ?[<ffffffff812f891c>]
>> ?
>> >>> __btrfs_free_extent+0xcc/0x6f0
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059125] ?[<ffffffff812fc4cf>]
>> ?
>> >>> run_clustered_refs+0x39f/0x880
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059220] ?[<ffffffff810b1f98>]
>> ?
>> >>> pagevec_lookup_tag+0x18/0x20
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059312] ?[<ffffffff810a7c81>]
>> ?
>> >>> filemap_fdatawait_range+0x91/0x180
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059405] ?[<ffffffff812fca77>]
>> ?
>> >>> btrfs_run_delayed_refs+0xc7/0x220
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059498] ?[<ffffffff8130c29c>]
>> ?
>> >>> btrfs_commit_transaction+0x7c/0x760
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059591] ?[<ffffffff81067ea0>]
>> ?
>> >>> autoremove_wake_function+0x0/0x30
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059683] ?[<ffffffff8130cdef>]
>> ?
>> >>> start_transaction+0x1bf/0x270
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059775] ?[<ffffffff8110e96a>]
>> ?
>> >>> __sync_filesystem+0x5a/0x90
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059867] ?[<ffffffff810eae8d>]
>> ?
>> >>> generic_shutdown_super+0x2d/0x100
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.059960] ?[<ffffffff810eafb9>]
>> ?
>> >>> kill_anon_super+0x9/0x50
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060051] ?[<ffffffff810eb266>]
>> ?
>> >>> deactivate_locked_super+0x26/0x80
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060144] ?[<ffffffff811043ea>]
>> ?
>> >>> sys_umount+0x7a/0x390
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060235] ?[<ffffffff810027bb>]
>> ?
>> >>> system_call_fastpath+0x16/0x1b
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.060325] Code: 55 b8 ff 00 00
>> 00
>> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
>> 00
>> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
>> 00
>> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063170] RIP
>> ?[<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063302] ?RSP
>> <ffff88023c77d6f8>
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063386] CR2: 0000030341ed0050
>> >>> Feb 14 02:05:52 lupus kernel: [ ? 70.063528] ---[ end trace
>> >>> 3313552d105b1535 ]---
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.961960] BUG: unable to handle
>> >>> kernel paging request at 0000030341ed0050
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962171] IP:
>> [<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962307] PGD 0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962430] Oops: 0000 [#2]
>> PREEMPT SMP
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962637] last sysfs file:
>> >>> /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962766] CPU 5
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.962812] Modules linked in:
>> >>> ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state nf_nat_irc
>> >>> nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>> >>> nf_conntrack_ftp iptable_filter ipt_addrtype xt_DSCP xt_dscp
>> >>> xt_iprange ip_tables ip6table_filter xt_NFQUEUE xt_owner
>> xt_hashlimit
>> >>> xt_conntrack xt_mark xt_multiport xt_connmark nf_conntrack
>> xt_string
>> >>> ip6_tables x_tables it87 hwmon_vid coretemp snd_seq_dummy
>> snd_seq_oss
>> >>> snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss
>> >>> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel
>> snd_hda_codec
>> >>> snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore wmi shpchp
>> e1000e
>> >>> snd_page_alloc libphy e1000 scsi_wait_scan sl811_hcd ohci_hcd ssb
>> >>> usb_storage ehci_hcd [last unloaded: tg3]
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966044]
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966127] Pid: 7915, comm:
>> >>> btrfs-transacti Tainted: G ? ? ?D ? ? 2.6.37-plus_v16_zcache #4
>> >>> FMP55/ipower G3710
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966266] RIP:
>> >>> 0010:[<ffffffff8133ef1b>] ?[<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966440] RSP:
>> >>> 0018:ffff88023c63b6e0 ?EFLAGS: 00010246
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966528] RAX: 00000000000000ff
>> >>> RBX: ffff88023cde0168 RCX: 0000000000000000
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966620] RDX: ffff88023c63b71c
>> >>> RSI: ffff88023c63b750 RDI: 0000000000000006
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966713] RBP: 0000030341ed0000
>> >>> R08: ffffffff8133eef0 R09: ffff88023c63b8c0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966805] R10: 0000000000000003
>> >>> R11: 0000000000000001 R12: 00000000ffffffff
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966897] R13: ffff88023cde0030
>> >>> R14: ffffea0007d59bc8 R15: 0000000000000001
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.966990] FS:
>> >>> 0000000000000000(0000) GS:ffff8800bf540000(0000)
>> >>> knlGS:0000000000000000
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967120] CS: ?0010 DS: 0000 ES:
>> >>> 0000 CR0: 000000008005003b
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967209] CR2: 0000030341ed0050
>> >>> CR3: 0000000001c27000 CR4: 00000000000006e0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967302] DR0: 0000000000000000
>> >>> DR1: 0000000000000000 DR2: 0000000000000000
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967394] DR3: 0000000000000000
>> >>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967500] Process btrfs-
>> transacti
>> >>> (pid: 7915, threadinfo ffff88023c63a000, task ffff88023c7a1620)
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967630] Stack:
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.967711] ?0000000000000000
>> >>> 0000000000000002 0000000000000000 0000000000000003
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968057] ?ffffea0007d59bc8
>> >>> ffffffff810e6aaa 0000000000000041 0000000600000002
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968348] ?0000000000000000
>> >>> ffffffff810e6b3a 0000000000000001 ffffffff00000001
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968639] Call Trace:
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968728] ?[<ffffffff810e6aaa>]
>> ?
>> >>> cleancache_get_key+0x4a/0x60
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968820] ?[<ffffffff810e6b3a>]
>> ?
>> >>> __cleancache_get_page+0x7a/0xd0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.968914] ?[<ffffffff8132e4de>]
>> ?
>> >>> __extent_read_full_page+0x52e/0x710
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969008] ?[<ffffffff812f3f93>]
>> ?
>> >>> update_reserved_bytes+0xb3/0x140
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969102] ?[<ffffffff81305060>]
>> ?
>> >>> btree_get_extent+0x0/0x1c0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969193] ?[<ffffffff8132bb7c>]
>> ?
>> >>> merge_state+0x7c/0x150
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969285] ?[<ffffffff81330b21>]
>> ?
>> >>> read_extent_buffer_pages+0x2d1/0x470
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969378] ?[<ffffffff81305060>]
>> ?
>> >>> btree_get_extent+0x0/0x1c0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969470] ?[<ffffffff8130674d>]
>> ?
>> >>> btree_read_extent_buffer_pages.clone.65+0x4d/0xa0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969602] ?[<ffffffff813076f9>]
>> ?
>> >>> read_tree_block+0x39/0x60
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969694] ?[<ffffffff812ed5e6>]
>> ?
>> >>> read_block_for_search.clone.40+0x116/0x410
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969878] ?[<ffffffff812f0bc7>]
>> ?
>> >>> btrfs_search_slot+0x307/0xa00
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.969970] ?[<ffffffff812f6b18>]
>> ?
>> >>> lookup_inline_extent_backref+0x98/0x4a0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970065] ?[<ffffffff810e33d7>]
>> ?
>> >>> kmem_cache_alloc+0x87/0xa0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970157] ?[<ffffffff812f891c>]
>> ?
>> >>> __btrfs_free_extent+0xcc/0x6f0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970249] ?[<ffffffff812f8434>]
>> ?
>> >>> update_block_group.clone.62+0xc4/0x280
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970343] ?[<ffffffff812fc4cf>]
>> ?
>> >>> run_clustered_refs+0x39f/0x880
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970436] ?[<ffffffff812fca77>]
>> ?
>> >>> btrfs_run_delayed_refs+0xc7/0x220
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970529] ?[<ffffffff810e15f9>]
>> ?
>> >>> new_slab+0x169/0x1f0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970619] ?[<ffffffff8130c29c>]
>> ?
>> >>> btrfs_commit_transaction+0x7c/0x760
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970713] ?[<ffffffff81067ea0>]
>> ?
>> >>> autoremove_wake_function+0x0/0x30
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970806] ?[<ffffffff81305bc3>]
>> ?
>> >>> transaction_kthread+0x283/0x2a0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970898] ?[<ffffffff81305940>]
>> ?
>> >>> transaction_kthread+0x0/0x2a0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.970990] ?[<ffffffff81305940>]
>> ?
>> >>> transaction_kthread+0x0/0x2a0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971083] ?[<ffffffff81067a16>]
>> ?
>> >>> kthread+0x96/0xa0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971174] ?[<ffffffff81003514>]
>> ?
>> >>> kernel_thread_helper+0x4/0x10
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971266] ?[<ffffffff81067980>]
>> ?
>> >>> kthread+0x0/0xa0
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971355] ?[<ffffffff81003510>]
>> ?
>> >>> kernel_thread_helper+0x0/0x10
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.971444] Code: 55 b8 ff 00 00
>> 00
>> >>> 53 48 89 fb 48 83 ec 18 48 8b 6f 10 8b 3a 83 ff 04 0f 86 d5 00 00
>> 00
>> >>> 85 c9 0f 95 c1 83 ff 07 0f 86 d5 00 00 00 <48> 8b 45 50 bf 05 00 00
>> 00
>> >>> 48 89 06 84 c9 48 8b 85 68 fe ff ff
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974280] RIP
>> ?[<ffffffff8133ef1b>]
>> >>> btrfs_encode_fh+0x2b/0x120
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974412] ?RSP
>> <ffff88023c63b6e0>
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974497] CR2: 0000030341ed0050
>> >>> Feb 14 02:06:16 lupus kernel: [ ? 93.974599] ---[ end trace
>> >>> 3313552d105b1536 ]---
>> >>> Feb 14 02:07:04 lupus kernel: [ ?141.906124] zcache: destroyed pool
>> id=2
>> >>> Feb 14 02:07:17 lupus kernel: [ ?154.783358] SysRq : Keyboard mode
>> set
>> >>> to system default
>> >>> Feb 14 02:07:18 lupus kernel: [ ?155.486147] SysRq : Terminate All
>> Tasks
>> >>>
>> >>>
>> >>> That's all for now
>> >>>
>> >>> Thanks & Regards
>> >>>
>> >>> Matt
>> >>>
>> >>
>> >> (leaving out several folks from the CC to avoid spamming - if I left
>> >> out someone wrongfully please re-add)
>> >>
>> >> running an addr2line reveals:
>> >>
>> >>
>> >> addr2line -e /usr/src/linux-2.6.37_vanilla/vmlinux -i ffffffff81338cbb
>> >> export.c:0
>> >>
>> >>
>> >> hope that helps
>> >>
>> >>
>> >> Regards
>> >>
>> >> Matt
>> >>
>> >
>> > Just a guess; I might be wrong.
>> >
>> > __cleancache_flush_inode calls cleancache_get_key with a
>> > cleancache_filekey, whose size is just 6 * u32.
>> > cleancache_get_key then calls btrfs_encode_fh with that key,
>> > but btrfs_encode_fh typecasts the key to a btrfs_fid, which is
>> > larger than a cleancache_filekey, so it must not access fields
>> > beyond the end of the cleancache_filekey.
>> >
>> > I think some filesystems use an extended fid, so this problem can
>> > happen there. I don't know why we didn't find it earlier. Maybe Dan
>> > and others tested it for a long time.
>> >
>> > Am I missing something?
>> >
>> >
>> >
>> > --
>> > Kind regards,
>> > Minchan Kim
>> >
>>
>> reposting Minchan's message for reference to the btrfs mailing list
>> while also adding
>>
>> Li Zefan, Miao Xie, Yan Zheng, Dan Rosenberg and Josef Bacik to CC
>>
>> Regards
>>
>> Matt
>
> Hi Matt and Minchan --
>
> (BTRFS EXPERTS SEE *** BELOW)
>
> I definitely see a bug in cleancache_get_key in the monolithic
> zcache+cleancache+frontswap patch I posted on oss.oracle.com
> that is corrected in linux-next but I don't see how it could
> get provoked by btrfs.
>
> The bug is that, in cleancache_get_key, the return value of fhfn should
> be checked against 255.  If the return value is 255, cleancache_get_key
> should return -1.  This should disable cleancache for any filesystem
> where KEY_MAX is too large.
>
> But cleancache_get_key always calls fhfn with connectable == 0 and
> CLEANCACHE_KEY_MAX==6 should be greater than BTRFS_FID_SIZE_CONNECTABLE
> (which I think should be 5?).  And the elements written into the
> typecast btrfs_fid should be only writing the first 5 32-bit words.
>
> So if the problem is the one Minchan suggests, I am confused.  Matt,
> can you first confirm that you are using the cleancache patches from
> my monolithic patch from oss.oracle.com (which I think you are)?
>
> ***
>
> Looking over the stacktrace and the code, I have an alternate theory.
> I wonder if it is ever possible the inode->dentry list is empty
> (or corrupt)?  list_first_entry() assumes the list is non-empty.
> If this is possible and unusual, maybe my testing didn't see the
> problem?
>
> Thanks,
> Dan
>
> P.S. For those new to cleancache, the code is in linux-next here:
> http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=blob;f=mm/cleancache.c;h=f545eb8f11180cfb3aaf3f4f85a5255be8f9f881;hb=a57cb3bc013d2e262a663df50af6a9e7cc88bdad
>

Hi Dan,

Yes, I downloaded the monolithic patch you mentioned in the original
message (http://marc.info/?l=linux-kernel&m=129705217700769&w=2) and
am using that.

I just checked the md5sum of the file on my hard drive against the one
from http://oss.oracle.com/projects/tmem/files/zcache/ and it's the same
file (119d91d81d99fdf3b95919e6012d5fa8).

If you could point out where to download the latest versions of each
of the (broken-out) patches for frontswap, cleancache and zcache - or
simply the updated ones needed - I'd give them another test run (read:
try to use them for everyday tasks).

Currently I have V5 of Cleancache and V3 of Frontswap, in addition to V2
of Zcache, on my hard drive. I don't know whether those are the most
current versions available.

Thanks

Matt

2011-02-16 04:36:21

by Minchan Kim

Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

On Wed, Feb 16, 2011 at 10:27 AM, Dan Magenheimer
<[email protected]> wrote:
>> -----Original Message-----
>> From: Matt [mailto:[email protected]]
>> Sent: Tuesday, February 15, 2011 5:12 PM
>> To: Minchan Kim
>> Cc: Dan Magenheimer; [email protected]; Chris Mason; linux-
>> [email protected]; [email protected]; [email protected]; linux-
>> [email protected]; Josef Bacik; Dan Rosenberg; Yan Zheng;
>> [email protected]; Li Zefan
>> Subject: Re: [PATCH V2 0/3] drivers/staging: zcache: dynamic page
>> cache/swap compression
>>
>> On Mon, Feb 14, 2011 at 4:35 AM, Minchan Kim <[email protected]>
>> > Just a guess; I might be wrong.
>> >
>> > __cleancache_flush_inode calls cleancache_get_key with a
>> > cleancache_filekey, whose size is just 6 * u32.
>> > cleancache_get_key then calls btrfs_encode_fh with that key,
>> > but btrfs_encode_fh typecasts the key to a btrfs_fid, which is
>> > larger than a cleancache_filekey, so it must not access fields
>> > beyond the end of the cleancache_filekey.
>> >
>> > I think some filesystems use an extended fid, so this problem can
>> > happen there. I don't know why we didn't find it earlier. Maybe Dan
>> > and others tested it for a long time.
>> >
>> > Am I missing something?
>> >
>> >
>> >
>> > --
>> > Kind regards,
>> > Minchan Kim
>> >
>>
>> reposting Minchan's message for reference to the btrfs mailing list
>> while also adding
>>
>> Li Zefan, Miao Xie, Yan Zheng, Dan Rosenberg and Josef Bacik to CC
>>
>> Regards
>>
>> Matt
>
> Hi Matt and Minchan --
>
> (BTRFS EXPERTS SEE *** BELOW)
>
> I definitely see a bug in cleancache_get_key in the monolithic
> zcache+cleancache+frontswap patch I posted on oss.oracle.com
> that is corrected in linux-next but I don't see how it could
> get provoked by btrfs.
>
> The bug is that, in cleancache_get_key, the return value of fhfn should
> be checked against 255.  If the return value is 255, cleancache_get_key
> should return -1.  This should disable cleancache for any filesystem
> where KEY_MAX is too large.
>
> But cleancache_get_key always calls fhfn with connectable == 0 and
> CLEANCACHE_KEY_MAX==6 should be greater than BTRFS_FID_SIZE_CONNECTABLE
> (which I think should be 5?).  And the elements written into the
> typecast btrfs_fid should be only writing the first 5 32-bit words.

BTRFS_FID_SIZE_NON_CONNECTABLE is 5, not BTRFS_FID_SIZE_CONNECTABLE.
Anyway, you pass connectable as 0, so it should only write the
first 5 32-bit words, as you said.
That's the part I missed. ;-)

Thanks.
--
Kind regards,
Minchan Kim

2011-03-03 17:29:56

by Dan Magenheimer

Subject: RE: [PATCH V2 0/3] drivers/staging: zcache: dynamic page cache/swap compression

> > I definitely see a bug in cleancache_get_key in the monolithic
> > zcache+cleancache+frontswap patch I posted on oss.oracle.com
> > that is corrected in linux-next but I don't see how it could
> > get provoked by btrfs.
> >
> > The bug is that, in cleancache_get_key, the return value of fhfn should
> > be checked against 255.  If the return value is 255, cleancache_get_key
> > should return -1.  This should disable cleancache for any filesystem
> > where KEY_MAX is too large.
> >
> > But cleancache_get_key always calls fhfn with connectable == 0 and
> > CLEANCACHE_KEY_MAX==6 should be greater than BTRFS_FID_SIZE_CONNECTABLE
> > (which I think should be 5?).  And the elements written into the
> > typecast btrfs_fid should be only writing the first 5 32-bit words.
>
> BTRFS_FID_SIZE_NON_CONNECTABLE is 5, not BTRFS_FID_SIZE_CONNECTABLE.
> Anyway, you passed connectable with 0 so it should be only writing the
> first 5 32-bit words as you said.
> That's one I missed. ;-)
>
> Thanks.
> --
> Kind regards,
> Minchan Kim

Sorry, I realized that I solved this with Matt offlist and never
posted the solution on-list, so for the archives:

This patch applies on top of the cleancache patch. It is really
a horrible hack but solving it correctly requires the interface
to encode_fh ops to change, which would require changes to many
filesystems, so best saved for a later time. If/when cleancache
gets merged, this patch will need to be applied on top of it
for btrfs to work properly when cleancache is enabled.

Basically, the problem is that, in all current filesystems,
obtaining the filehandle requires a dentry ONLY if connectable
is set. Otherwise, the dentry is only used to get the inode.
But cleancache_get_key only has an inode, and the alias list
of dentries associated with the inode may be empty. So
either the encode_fh interface would need to be changed
or, in this hack-y solution, a dentry is created temporarily
only for the purpose of dereferencing it.

Signed-off-by: Dan Magenheimer <[email protected]>

diff -Napur -X linux-2.6.37.1/Documentation/dontdiff linux-2.6.37.1/mm/cleancache.c linux-2.6.37.1-fix/mm/cleancache.c
--- linux-2.6.37.1/mm/cleancache.c 2011-02-25 11:38:47.000000000 -0800
+++ linux-2.6.37.1-fix/mm/cleancache.c 2011-02-25 08:53:46.000000000 -0800
@@ -78,15 +78,14 @@ static int cleancache_get_key(struct ino
 	int (*fhfn)(struct dentry *, __u32 *fh, int *, int);
 	int maxlen = CLEANCACHE_KEY_MAX;
 	struct super_block *sb = inode->i_sb;
-	struct dentry *d;

 	key->u.ino = inode->i_ino;
 	if (sb->s_export_op != NULL) {
 		fhfn = sb->s_export_op->encode_fh;
 		if (fhfn) {
-			d = list_first_entry(&inode->i_dentry,
-					     struct dentry, d_alias);
-			(void)(*fhfn)(d, &key->u.fh[0], &maxlen, 0);
+			struct dentry d;
+			d.d_inode = inode;
+			(void)(*fhfn)(&d, &key->u.fh[0], &maxlen, 0);
 			if (maxlen > CLEANCACHE_KEY_MAX)
 				return -1;
 		}