2008-07-28 16:13:34

by Eric Sandeen

Subject: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

With SLUB debugging turned on in 2.6.26, I was getting memory corruption
when testing eCryptfs. The root cause turned out to be that eCryptfs
was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
as a nice page-aligned chunk of memory. But at least with SLUB debugging
on, this is not always true, and the page we get from virt_to_page does
not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
kmalloc.

My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
2 different multi-megabyte files. With this change I no longer see
the corruption.

Thanks,
-Eric

Signed-off-by: Eric Sandeen <[email protected]>
---

Index: linux-2.6.26/fs/ecryptfs/crypto.c
===================================================================
--- linux-2.6.26.orig/fs/ecryptfs/crypto.c
+++ linux-2.6.26/fs/ecryptfs/crypto.c
@@ -474,8 +474,8 @@ int ecryptfs_encrypt_page(struct page *p
 {
 	struct inode *ecryptfs_inode;
 	struct ecryptfs_crypt_stat *crypt_stat;
-	char *enc_extent_virt = NULL;
-	struct page *enc_extent_page;
+	char *enc_extent_virt;
+	struct page *enc_extent_page = NULL;
 	loff_t extent_offset;
 	int rc = 0;

@@ -491,14 +491,14 @@ int ecryptfs_encrypt_page(struct page *p
 			       page->index);
 		goto out;
 	}
-	enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
-	if (!enc_extent_virt) {
+	enc_extent_page = alloc_page(GFP_USER);
+	if (!enc_extent_page) {
 		rc = -ENOMEM;
 		ecryptfs_printk(KERN_ERR, "Error allocating memory for "
 				"encrypted extent\n");
 		goto out;
 	}
-	enc_extent_page = virt_to_page(enc_extent_virt);
+	enc_extent_virt = kmap(enc_extent_page);
 	for (extent_offset = 0;
 	     extent_offset < (PAGE_CACHE_SIZE / crypt_stat->extent_size);
 	     extent_offset++) {
@@ -526,7 +526,10 @@ int ecryptfs_encrypt_page(struct page *p
 		}
 	}
 out:
-	kfree(enc_extent_virt);
+	if (enc_extent_page) {
+		kunmap(enc_extent_page);
+		__free_page(enc_extent_page);
+	}
 	return rc;
 }

@@ -608,8 +611,8 @@ int ecryptfs_decrypt_page(struct page *p
 {
 	struct inode *ecryptfs_inode;
 	struct ecryptfs_crypt_stat *crypt_stat;
-	char *enc_extent_virt = NULL;
-	struct page *enc_extent_page;
+	char *enc_extent_virt;
+	struct page *enc_extent_page = NULL;
 	unsigned long extent_offset;
 	int rc = 0;

@@ -626,14 +629,14 @@ int ecryptfs_decrypt_page(struct page *p
 			       page->index);
 		goto out;
 	}
-	enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
-	if (!enc_extent_virt) {
+	enc_extent_page = alloc_page(GFP_USER);
+	if (!enc_extent_page) {
 		rc = -ENOMEM;
 		ecryptfs_printk(KERN_ERR, "Error allocating memory for "
 				"encrypted extent\n");
 		goto out;
 	}
-	enc_extent_page = virt_to_page(enc_extent_virt);
+	enc_extent_virt = kmap(enc_extent_page);
 	for (extent_offset = 0;
 	     extent_offset < (PAGE_CACHE_SIZE / crypt_stat->extent_size);
 	     extent_offset++) {
@@ -661,7 +664,10 @@ int ecryptfs_decrypt_page(struct page *p
 		}
 	}
 out:
-	kfree(enc_extent_virt);
+	if (enc_extent_page) {
+		kunmap(enc_extent_page);
+		__free_page(enc_extent_page);
+	}
 	return rc;
 }
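
For readers following the cleanup logic in the patch above, here is a
userspace analogue of its shape, offered only as a sketch. It is not
kernel code: aligned_alloc() stands in for alloc_page(), free() stands
in for kunmap() plus __free_page(), and process_one_page(), bail_early
and SKETCH_PAGE_SIZE are hypothetical names invented for illustration.
The point it shows is that the handle that owns the memory starts out
NULL, so the single "out:" label can tell whether there is anything to
release.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/*
 * Hypothetical stand-in for the encrypt/decrypt functions.
 * bail_early simulates a "goto out" taken before the allocation.
 * *was_aligned reports whether the scratch buffer was page aligned.
 */
static int process_one_page(int bail_early, int *was_aligned)
{
	unsigned char *scratch = NULL;	/* NULL means nothing to release */
	int rc = 0;

	if (bail_early) {
		rc = -EINVAL;
		goto out;
	}
	/* Ask explicitly for a page-aligned, page-sized buffer. */
	scratch = aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
	if (!scratch) {
		rc = -ENOMEM;
		goto out;
	}
	*was_aligned = ((uintptr_t)scratch % SKETCH_PAGE_SIZE) == 0;
	memset(scratch, 0, SKETCH_PAGE_SIZE);	/* placeholder for real work */
out:
	/* Mirrors the patch: release only what was actually acquired. */
	if (scratch)
		free(scratch);
	return rc;
}
```

In userspace free(NULL) would be harmless, so the check is there only
to mirror the patch, where the unmap and free calls need a valid page.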


2008-07-28 16:25:29

by Michael Halcrow

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

On Mon, Jul 28, 2008 at 11:13:03AM -0500, Eric Sandeen wrote:
> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
> when testing eCryptfs. The root cause turned out to be that eCryptfs
> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
> as a nice page-aligned chunk of memory. But at least with SLUB debugging
> on, this is not always true, and the page we get from virt_to_page does
> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
> kmalloc.
>
> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
> 2 different multi-megabyte files. With this change I no longer see
> the corruption.
>
> Thanks,
> -Eric
>
> Signed-off-by: Eric Sandeen <[email protected]>

Acked-by: Michael Halcrow <[email protected]>

> ---
>
> Index: linux-2.6.26/fs/ecryptfs/crypto.c
> ===================================================================
> --- linux-2.6.26.orig/fs/ecryptfs/crypto.c
> +++ linux-2.6.26/fs/ecryptfs/crypto.c
> @@ -474,8 +474,8 @@ int ecryptfs_encrypt_page(struct page *p
>  {
>  	struct inode *ecryptfs_inode;
>  	struct ecryptfs_crypt_stat *crypt_stat;
> -	char *enc_extent_virt = NULL;
> -	struct page *enc_extent_page;
> +	char *enc_extent_virt;
> +	struct page *enc_extent_page = NULL;
>  	loff_t extent_offset;
>  	int rc = 0;
>
> @@ -491,14 +491,14 @@ int ecryptfs_encrypt_page(struct page *p
>  			       page->index);
>  		goto out;
>  	}
> -	enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
> -	if (!enc_extent_virt) {
> +	enc_extent_page = alloc_page(GFP_USER);
> +	if (!enc_extent_page) {
>  		rc = -ENOMEM;
>  		ecryptfs_printk(KERN_ERR, "Error allocating memory for "
>  				"encrypted extent\n");
>  		goto out;
>  	}
> -	enc_extent_page = virt_to_page(enc_extent_virt);
> +	enc_extent_virt = kmap(enc_extent_page);
>  	for (extent_offset = 0;
>  	     extent_offset < (PAGE_CACHE_SIZE / crypt_stat->extent_size);
>  	     extent_offset++) {
> @@ -526,7 +526,10 @@ int ecryptfs_encrypt_page(struct page *p
>  		}
>  	}
>  out:
> -	kfree(enc_extent_virt);
> +	if (enc_extent_page) {
> +		kunmap(enc_extent_page);
> +		__free_page(enc_extent_page);
> +	}
>  	return rc;
>  }
>
> @@ -608,8 +611,8 @@ int ecryptfs_decrypt_page(struct page *p
>  {
>  	struct inode *ecryptfs_inode;
>  	struct ecryptfs_crypt_stat *crypt_stat;
> -	char *enc_extent_virt = NULL;
> -	struct page *enc_extent_page;
> +	char *enc_extent_virt;
> +	struct page *enc_extent_page = NULL;
>  	unsigned long extent_offset;
>  	int rc = 0;
>
> @@ -626,14 +629,14 @@ int ecryptfs_decrypt_page(struct page *p
>  			       page->index);
>  		goto out;
>  	}
> -	enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
> -	if (!enc_extent_virt) {
> +	enc_extent_page = alloc_page(GFP_USER);
> +	if (!enc_extent_page) {
>  		rc = -ENOMEM;
>  		ecryptfs_printk(KERN_ERR, "Error allocating memory for "
>  				"encrypted extent\n");
>  		goto out;
>  	}
> -	enc_extent_page = virt_to_page(enc_extent_virt);
> +	enc_extent_virt = kmap(enc_extent_page);
>  	for (extent_offset = 0;
>  	     extent_offset < (PAGE_CACHE_SIZE / crypt_stat->extent_size);
>  	     extent_offset++) {
> @@ -661,7 +664,10 @@ int ecryptfs_decrypt_page(struct page *p
>  		}
>  	}
>  out:
> -	kfree(enc_extent_virt);
> +	if (enc_extent_page) {
> +		kunmap(enc_extent_page);
> +		__free_page(enc_extent_page);
> +	}
>  	return rc;
>  }
>
>

2008-07-28 17:11:22

by Rik van Riel

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

On Mon, 28 Jul 2008 11:13:03 -0500
Eric Sandeen <[email protected]> wrote:

> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
> when testing eCryptfs. The root cause turned out to be that eCryptfs
> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
> as a nice page-aligned chunk of memory.

uh oh

> Signed-off-by: Eric Sandeen <[email protected]>

Acked-by: Rik van Riel <[email protected]>

--
All Rights Reversed

2008-07-28 20:35:53

by Andrew Morton

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

On Mon, 28 Jul 2008 11:13:03 -0500
Eric Sandeen <[email protected]> wrote:

> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
> when testing eCryptfs. The root cause turned out to be that eCryptfs
> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
> as a nice page-aligned chunk of memory. But at least with SLUB debugging
> on, this is not always true, and the page we get from virt_to_page does
> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
> kmalloc.
>
> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
> 2 different multi-megabyte files. With this change I no longer see
> the corruption.

The fix applies to both 2.6.25 and to 2.6.26 and appears to be needed
in both kernel versions, so I have tagged it for backporting into both.

2008-07-28 20:42:47

by Pekka Enberg

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

On Mon, Jul 28, 2008 at 11:35 PM, Andrew Morton
<[email protected]> wrote:
> On Mon, 28 Jul 2008 11:13:03 -0500
> Eric Sandeen <[email protected]> wrote:
>
>> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
>> when testing eCryptfs. The root cause turned out to be that eCryptfs
>> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
>> as a nice page-aligned chunk of memory. But at least with SLUB debugging
>> on, this is not always true, and the page we get from virt_to_page does
>> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
>> kmalloc.
>>
>> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
>> 2 different multi-megabyte files. With this change I no longer see
>> the corruption.
>
> The fix applies to both 2.6.25 and to 2.6.26 and appears to be needed
> in both kernel versions, so I have tagged it for backporting into both.

Hmm, SLUB will use the page allocator directly for PAGE_CACHE_SIZE
regardless of whether debugging is enabled or not...

2008-07-28 20:49:22

by Eric Sandeen

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

Andrew Morton wrote:
> On Mon, 28 Jul 2008 11:13:03 -0500
> Eric Sandeen <[email protected]> wrote:
>
>> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
>> when testing eCryptfs. The root cause turned out to be that eCryptfs
>> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
>> as a nice page-aligned chunk of memory. But at least with SLUB debugging
>> on, this is not always true, and the page we get from virt_to_page does
>> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
>> kmalloc.
>>
>> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
>> 2 different multi-megabyte files. With this change I no longer see
>> the corruption.
>
> The fix applies to both 2.6.25 and to 2.6.26 and appears to be needed
> in both kernel versions, so I have tagged it for backporting into both.

I agree, thanks.

-Eric

2008-07-28 21:18:45

by Eric Sandeen

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

Pekka Enberg wrote:
> On Mon, Jul 28, 2008 at 11:35 PM, Andrew Morton
> <[email protected]> wrote:
>> On Mon, 28 Jul 2008 11:13:03 -0500
>> Eric Sandeen <[email protected]> wrote:
>>
>>> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
>>> when testing eCryptfs. The root cause turned out to be that eCryptfs
>>> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
>>> as a nice page-aligned chunk of memory. But at least with SLUB debugging
>>> on, this is not always true, and the page we get from virt_to_page does
>>> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
>>> kmalloc.
>>>
>>> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
>>> 2 different multi-megabyte files. With this change I no longer see
>>> the corruption.
>> The fix applies to both 2.6.25 and to 2.6.26 and appears to be needed
>> in both kernel versions, so I have tagged it for backporting into both.
>
> Hmm, SLUB will use the page allocator directly for PAGE_CACHE_SIZE
> regardless of whether debugging is enabled or not...

For whatever reason, I did see non-page-aligned memory returned from
kmalloc(PAGE_CACHE_SIZE), and I think this is what caused the problem
once virt_to_page() was used to get hold of a page to pass around in the
ecryptfs/crypto code...
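
To make the alignment problem concrete, here is a small userspace
illustration. It is only a sketch and assumes 4K pages; page_base(),
bytes_in_first_page(), foreign_bytes_before() and SKETCH_PAGE_SIZE are
made-up names, and page_base() merely imitates the "which page contains
this address" lookup that virt_to_page() performs in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

/* Start address of the page that contains addr (round down). */
static uintptr_t page_base(uintptr_t addr)
{
	return addr & ~((uintptr_t)SKETCH_PAGE_SIZE - 1);
}

/*
 * For a page-sized buffer starting at buf: how many of its bytes
 * actually live in the page that contains buf. The remainder spill
 * into the following page.
 */
static size_t bytes_in_first_page(uintptr_t buf)
{
	return SKETCH_PAGE_SIZE - (size_t)(buf - page_base(buf));
}

/*
 * Bytes at the start of that page which sit before buf and therefore
 * belong to some other object or to allocator metadata.
 */
static size_t foreign_bytes_before(uintptr_t buf)
{
	return (size_t)(buf - page_base(buf));
}
```

For a buffer at 0x10040, for example, the containing page starts at
0x10000: 64 bytes of that page are not ours, and the last 64 bytes of
the buffer are in the next page. Anything that treats the page as if it
covered the whole buffer can then read or write the wrong memory.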

-Eric

2008-07-28 21:19:49

by Pekka Enberg

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

Hi Eric,

Eric Sandeen wrote:
> Pekka Enberg wrote:
>> On Mon, Jul 28, 2008 at 11:35 PM, Andrew Morton
>> <[email protected]> wrote:
>>> On Mon, 28 Jul 2008 11:13:03 -0500
>>> Eric Sandeen <[email protected]> wrote:
>>>
>>>> With SLUB debugging turned on in 2.6.26, I was getting memory corruption
>>>> when testing eCryptfs. The root cause turned out to be that eCryptfs
>>>> was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that
>>>> as a nice page-aligned chunk of memory. But at least with SLUB debugging
>>>> on, this is not always true, and the page we get from virt_to_page does
>>>> not necessarily match the PAGE_CACHE_SIZE worth of memory we got from
>>>> kmalloc.
>>>>
>>>> My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for
>>>> 2 different multi-megabyte files. With this change I no longer see
>>>> the corruption.
>>> The fix applies to both 2.6.25 and to 2.6.26 and appears to be needed
>>> in both kernel versions, so I have tagged it for backporting into both.
>> Hmm, SLUB will use the page allocator directly for PAGE_CACHE_SIZE
>> regardless of whether debugging is enabled or not...
>
> For whatever reason, I did see non-page-aligned memory returned from
> kmalloc(PAGE_CACHE_SIZE), and I think this is what caused the problem
> once virt_to_page() was used to get hold of a page to pass around in the
> ecryptfs/crypto code...

With SLUB? I can't see how that's possible. I can see this with SLAB,
though, for 4K pages.

In any case, the patch, of course, makes sense as kmalloc() behavior
varies between allocators.

Pekka

2008-07-30 14:41:19

by Christoph Lameter

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

Pekka Enberg wrote:

> Hmm, SLUB will use the page allocator directly for PAGE_CACHE_SIZE
> regardless of whether debugging is enabled or not...

Only until 2.6.24. Since then, SLUB hands off to the page allocator only
for sizes > PAGE_CACHE_SIZE.

2008-07-30 14:56:30

by Christoph Lameter

Subject: Re: [PATCH] eCryptfs - use page_alloc not kmalloc to get a page of memory

Pekka Enberg wrote:
>> For whatever reason, I did see non-page-aligned memory returned from
>> kmalloc(PAGE_CACHE_SIZE), and I think this is what caused the problem
>> once virt_to_page() was used to get hold of a page to pass around in the
>> ecryptfs/crypto code...
>
> With SLUB? I can't see how that's possible. I can see this with SLAB,
> though, for 4K pages.

It is possible because PAGE_CACHE_SIZE is still handled by SLUB, and if
debugging is on then kmalloc may return non-page-aligned objects. The
handoff to the page allocator only occurs for objects > 4k. We used to do
this for 4k objects as well, but then we got performance regressions in
tbench.
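
The boundary case described here can be written down as a toy model.
This is a sketch of the rule as stated in this thread, not SLUB code;
the enum, the function names and SKETCH_PAGE_SIZE are all hypothetical,
and 4K pages are assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4K pages */

enum handoff_rule {
	HANDOFF_AT_PAGE_SIZE,		/* older rule: size >= one page */
	HANDOFF_ABOVE_PAGE_SIZE,	/* later rule: size > one page */
};

/* Does a kmalloc-style request of this size bypass the slab layer? */
static bool goes_to_page_allocator(size_t size, enum handoff_rule rule)
{
	if (rule == HANDOFF_AT_PAGE_SIZE)
		return size >= SKETCH_PAGE_SIZE;
	return size > SKETCH_PAGE_SIZE;
}

/*
 * In this model only requests served by the page allocator come back
 * page aligned; slab objects may be offset, e.g. by debug metadata.
 */
static bool may_assume_page_aligned(size_t size, enum handoff_rule rule)
{
	return goes_to_page_allocator(size, rule);
}
```

Under the later rule a request of exactly one page stays in the slab
layer, which is the case eCryptfs hit. Since the answer depends on the
allocator and its version, callers that need a real page are better off
asking the page allocator for one directly.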



> In any case, the patch, of course, makes sense as kmalloc() behavior
> varies between allocators.
>
> Pekka