2008-02-17 17:03:34

by Oliver Pinter

[permalink] [raw]
Subject: request for v2.6.22.19-queue

>From a8a7690626756b6dcd49ad23b58f4406bfa59d7f Mon Sep 17 00:00:00 2001
From: Jonathan Corbet <[email protected]>
Date: Mon, 11 Feb 2008 16:17:33 -0700
Subject: [PATCH] Be more robust about bad arguments in get_user_pages()

MAINLINE: 900cf086fd2fbad07f72f4575449e0d0958f860f

So I spent a while pounding my head against my monitor trying to figure
out the vmsplice() vulnerability - how could a failure to check for
*read* access turn into a root exploit? It turns out that it's a buffer
overflow problem which is made easy by the way get_user_pages() is
coded.

In particular, "len" is a signed int, and it is only checked at the
*end* of a do {} while() loop. So, if it is passed in as zero, the loop
will execute once and decrement len to -1. At that point, the loop will
proceed until the next invalid address is found; in the process, it will
likely overflow the pages array passed in to get_user_pages().

I think that, if get_user_pages() has been asked to grab zero pages,
that's what it should do. Thus this patch; it is, among other things,
enough to block the (already fixed) root exploit and any others which
might be lurking in similar code. I also think that the number of pages
should be unsigned, but changing the prototype of this function probably
requires some more careful review.

Signed-off-by: Jonathan Corbet <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/mm/memory.c b/mm/memory.c
index f64cbf9..538f054 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -983,6 +983,8 @@ int get_user_pages(struct task_struct *tsk, struct
mm_struct *mm,
int i;
unsigned int vm_flags;

+ if (len <= 0)
+ return 0;
/*
* Require read or write permissions.
* If 'force' is set, we only require the "MAY" flags.
>From 7d0cf656110bc5a64798162c5c03913b2ffed65b Mon Sep 17 00:00:00 2001
From: Jesper Juhl <[email protected]>
Date: Tue, 31 Jul 2007 00:39:39 -0700
Subject: [PATCH] cciss: fix memory leak

There's a memory leak in the cciss driver.

MAINLINE: f2912a1223c0917a7b4e054f18086209137891ea

in alloc_cciss_hba() we may leak sizeof(ctlr_info_t) bytes if a
call to alloc_disk(1 << NWD_SHIFT) fails.
This patch should fix the issue.

Spotted by the Coverity checker.

Signed-off-by: Jesper Juhl <[email protected]>
Acked-by: Mike Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 5acc6c4..132f76b 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -3225,12 +3225,15 @@ static int alloc_cciss_hba(void)
for (i = 0; i < MAX_CTLR; i++) {
if (!hba[i]) {
ctlr_info_t *p;
+
p = kzalloc(sizeof(ctlr_info_t), GFP_KERNEL);
if (!p)
goto Enomem;
p->gendisk[0] = alloc_disk(1 << NWD_SHIFT);
- if (!p->gendisk[0])
+ if (!p->gendisk[0]) {
+ kfree(p);
goto Enomem;
+ }
hba[i] = p;
return i;
}
>From a228962494b69e7b3ed16f8cda693c816c68bbe7 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <[email protected]>
Date: Fri, 15 Feb 2008 20:58:22 +0100
Subject: [PATCH] x86_64: CPA, fix cache attribute inconsistency bug,
v2.6.22 backport

MAINLINE: not-yet-in-mainline

fix CPA cache attribute bug in v2.6.2[234]. When phys_base is nonzero
(when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
the secondary alias address by -14 MB (depending on the configured
offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel texto
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/arch/x86_64/mm/pageattr.c b/arch/x86_64/mm/pageattr.c
index 7cd5254..d6cd5c4 100644
--- a/arch/x86_64/mm/pageattr.c
+++ b/arch/x86_64/mm/pageattr.c
@@ -204,7 +204,7 @@ int change_page_attr_addr(unsigned long address,
int numpages, pgprot_t prot)
if (__pa(address) < KERNEL_TEXT_SIZE) {
unsigned long addr2;
pgprot_t prot2;
- addr2 = __START_KERNEL_map + __pa(address);
+ addr2 = __START_KERNEL_map + __pa(address) - phys_base;
/* Make sure the kernel mappings stay executable */
prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
err = __change_page_attr(addr2, pfn, prot2,
>From 3dacfc9387587eca249e91d819930308508a4b15 Mon Sep 17 00:00:00 2001
From: Roland McGrath <[email protected]>
Date: Mon, 16 Jul 2007 08:03:16 -0700
Subject: [PATCH] Handle bogus %cs selector in single-step instruction decoding

Handle bogus %cs selector in single-step instruction decoding

MAINLINE: 29eb51101c02df517ca64ec472d7501127ad1da8

The code for LDT segment selectors was not robust in the face of a bogus
selector set in %cs via ptrace before the single-step was done.

Signed-off-by: Roland McGrath <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/arch/i386/kernel/ptrace.c b/arch/i386/kernel/ptrace.c
index 0c0ceec..120a63b 100644
--- a/arch/i386/kernel/ptrace.c
+++ b/arch/i386/kernel/ptrace.c
@@ -164,14 +164,22 @@ static unsigned long
convert_eip_to_linear(struct task_struct *child, struct pt_
u32 *desc;
unsigned long base;

- down(&child->mm->context.sem);
- desc = child->mm->context.ldt + (seg & ~7);
- base = (desc[0] >> 16) | ((desc[1] & 0xff) << 16) | (desc[1] & 0xff000000);
+ seg &= ~7UL;

- /* 16-bit code segment? */
- if (!((desc[1] >> 22) & 1))
- addr &= 0xffff;
- addr += base;
+ down(&child->mm->context.sem);
+ if (unlikely((seg >> 3) >= child->mm->context.size))
+ addr = -1L; /* bogus selector, access would fault */
+ else {
+ desc = child->mm->context.ldt + seg;
+ base = ((desc[0] >> 16) |
+ ((desc[1] & 0xff) << 16) |
+ (desc[1] & 0xff000000));
+
+ /* 16-bit code segment? */
+ if (!((desc[1] >> 22) & 1))
+ addr &= 0xffff;
+ addr += base;
+ }
up(&child->mm->context.sem);
}
return addr;
diff --git a/arch/x86_64/kernel/ptrace.c b/arch/x86_64/kernel/ptrace.c
index 8d89d8c..7fc0e73 100644
--- a/arch/x86_64/kernel/ptrace.c
+++ b/arch/x86_64/kernel/ptrace.c
@@ -102,16 +102,25 @@ unsigned long convert_rip_to_linear(struct
task_struct *child, struct pt_regs *r
u32 *desc;
unsigned long base;

- down(&child->mm->context.sem);
- desc = child->mm->context.ldt + (seg & ~7);
- base = (desc[0] >> 16) | ((desc[1] & 0xff) << 16) | (desc[1] & 0xff000000);
+ seg &= ~7UL;

- /* 16-bit code segment? */
- if (!((desc[1] >> 22) & 1))
- addr &= 0xffff;
- addr += base;
+ down(&child->mm->context.sem);
+ if (unlikely((seg >> 3) >= child->mm->context.size))
+ addr = -1L; /* bogus selector, access would fault */
+ else {
+ desc = child->mm->context.ldt + seg;
+ base = ((desc[0] >> 16) |
+ ((desc[1] & 0xff) << 16) |
+ (desc[1] & 0xff000000));
+
+ /* 16-bit code segment? */
+ if (!((desc[1] >> 22) & 1))
+ addr &= 0xffff;
+ addr += base;
+ }
up(&child->mm->context.sem);
}
+
return addr;
}

>From 502ebf4d4b9b0508bebd88594985d1e8ec968d01 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <[email protected]>
Date: Wed, 18 Jul 2007 18:59:22 +0200
Subject: [PATCH] i386: fixup TRACE_IRQ breakage

i386: fixup TRACE_IRQ breakage

MAINLINE: a10d9a71bafd3a283da240d2868e71346d2aef6f

The TRACE_IRQS_ON function in iret_exc: calls a C function without
ensuring that the segments are set properly. Move the trace function and
the enabling of interrupt into the C stub.

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/arch/i386/kernel/entry.S b/arch/i386/kernel/entry.S
index 3c3c220..b7be5cf 100644
--- a/arch/i386/kernel/entry.S
+++ b/arch/i386/kernel/entry.S
@@ -409,8 +409,6 @@ restore_nocheck_notrace:
1: INTERRUPT_RETURN
.section .fixup,"ax"
iret_exc:
- TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_NONE)
pushl $0 # no error code
pushl $do_iret_error
jmp error_code
diff --git a/arch/i386/kernel/traps.c b/arch/i386/kernel/traps.c
index 90da057..4995b92 100644
--- a/arch/i386/kernel/traps.c
+++ b/arch/i386/kernel/traps.c
@@ -517,10 +517,12 @@ fastcall void do_##name(struct pt_regs * regs,
long error_code) \
do_trap(trapnr, signr, str, 0, regs, error_code, NULL); \
}

-#define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr) \
+#define DO_ERROR_INFO(trapnr, signr, str, name, sicode, siaddr, irq) \
fastcall void do_##name(struct pt_regs * regs, long error_code) \
{ \
siginfo_t info; \
+ if (irq) \
+ local_irq_enable(); \
info.si_signo = signr; \
info.si_errno = 0; \
info.si_code = sicode; \
@@ -560,13 +562,13 @@ DO_VM86_ERROR( 3, SIGTRAP, "int3", int3)
#endif
DO_VM86_ERROR( 4, SIGSEGV, "overflow", overflow)
DO_VM86_ERROR( 5, SIGSEGV, "bounds", bounds)
-DO_ERROR_INFO( 6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN, regs->eip)
+DO_ERROR_INFO( 6, SIGILL, "invalid opcode", invalid_op, ILL_ILLOPN,
regs->eip, 0)
DO_ERROR( 9, SIGFPE, "coprocessor segment overrun",
coprocessor_segment_overrun)
DO_ERROR(10, SIGSEGV, "invalid TSS", invalid_TSS)
DO_ERROR(11, SIGBUS, "segment not present", segment_not_present)
DO_ERROR(12, SIGBUS, "stack segment", stack_segment)
-DO_ERROR_INFO(17, SIGBUS, "alignment check", alignment_check, BUS_ADRALN, 0)
-DO_ERROR_INFO(32, SIGSEGV, "iret exception", iret_error, ILL_BADSTK, 0)
+DO_ERROR_INFO(17, SIGBUS, "alignment check", alignment_check, BUS_ADRALN, 0, 0)
+DO_ERROR_INFO(32, SIGSEGV, "iret exception", iret_error, ILL_BADSTK, 0, 1)

fastcall void __kprobes do_general_protection(struct pt_regs * regs,
long error_code)
>From 8aa4c64e9902139d19418a2c13782e191f1a674a Mon Sep 17 00:00:00 2001
From: Wang Zhenyu <[email protected]>
Date: Fri, 8 Feb 2008 23:05:15 +0100
Subject: [PATCH] Intel_agp: really fix 945/965GME

Intel_agp: really fix 945/965GME

MAINLINE: dde4787642ee3cb85aef80bdade04b6f8ddc3df8

Fix some missing places to check with device id info, which
should probe the device gart correctly.

Signed-off-by: Wang Zhenyu <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Acked-by: Takashi Iwai <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index a124060..d06b652 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -20,7 +20,9 @@
#define PCI_DEVICE_ID_INTEL_82965G_IG 0x29A2
#define PCI_DEVICE_ID_INTEL_82965GM_HB 0x2A00
#define PCI_DEVICE_ID_INTEL_82965GM_IG 0x2A02
+#define PCI_DEVICE_ID_INTEL_82965GME_HB 0x2A10
#define PCI_DEVICE_ID_INTEL_82965GME_IG 0x2A12
+#define PCI_DEVICE_ID_INTEL_82945GME_HB 0x27AC
#define PCI_DEVICE_ID_INTEL_82945GME_IG 0x27AE
#define PCI_DEVICE_ID_INTEL_G33_HB 0x29C0
#define PCI_DEVICE_ID_INTEL_G33_IG 0x29C2
@@ -33,7 +35,8 @@
agp_bridge->dev->device ==
PCI_DEVICE_ID_INTEL_82965G_1_HB || \
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965Q_HB || \
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965G_HB || \
- agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GM_HB)
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GM_HB || \
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82965GME_HB)

#define IS_G33 (agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_G33_HB || \
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_Q35_HB || \
@@ -527,6 +530,7 @@ static void intel_i830_init_gtt_entries(void)
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82915GM_HB ||
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945G_HB ||
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945GM_HB ||
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945GME_HB ||
IS_I965 || IS_G33)
gtt_entries = MB(48) - KB(size);
else
@@ -538,6 +542,7 @@ static void intel_i830_init_gtt_entries(void)
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82915GM_HB ||
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945G_HB ||
agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945GM_HB ||
+ agp_bridge->dev->device == PCI_DEVICE_ID_INTEL_82945GME_HB ||
IS_I965 || IS_G33)
gtt_entries = MB(64) - KB(size);
else
@@ -1848,9 +1853,9 @@ static const struct intel_driver_description {
NULL, &intel_915_driver },
{ PCI_DEVICE_ID_INTEL_82945G_HB, PCI_DEVICE_ID_INTEL_82945G_IG, 0, "945G",
NULL, &intel_915_driver },
- { PCI_DEVICE_ID_INTEL_82945GM_HB, PCI_DEVICE_ID_INTEL_82945GM_IG, 1, "945GM",
+ { PCI_DEVICE_ID_INTEL_82945GM_HB, PCI_DEVICE_ID_INTEL_82945GM_IG, 0, "945GM",
NULL, &intel_915_driver },
- { PCI_DEVICE_ID_INTEL_82945GM_HB, PCI_DEVICE_ID_INTEL_82945GME_IG,
0, "945GME",
+ { PCI_DEVICE_ID_INTEL_82945GME_HB, PCI_DEVICE_ID_INTEL_82945GME_IG,
0, "945GME",
NULL, &intel_915_driver },
{ PCI_DEVICE_ID_INTEL_82946GZ_HB, PCI_DEVICE_ID_INTEL_82946GZ_IG, 0, "946GZ",
NULL, &intel_i965_driver },
@@ -1860,9 +1865,9 @@ static const struct intel_driver_description {
NULL, &intel_i965_driver },
{ PCI_DEVICE_ID_INTEL_82965G_HB, PCI_DEVICE_ID_INTEL_82965G_IG, 0, "965G",
NULL, &intel_i965_driver },
- { PCI_DEVICE_ID_INTEL_82965GM_HB, PCI_DEVICE_ID_INTEL_82965GM_IG, 1, "965GM",
+ { PCI_DEVICE_ID_INTEL_82965GM_HB, PCI_DEVICE_ID_INTEL_82965GM_IG, 0, "965GM",
NULL, &intel_i965_driver },
- { PCI_DEVICE_ID_INTEL_82965GM_HB, PCI_DEVICE_ID_INTEL_82965GME_IG,
0, "965GME/GLE",
+ { PCI_DEVICE_ID_INTEL_82965GME_HB, PCI_DEVICE_ID_INTEL_82965GME_IG,
0, "965GME/GLE",
NULL, &intel_i965_driver },
{ PCI_DEVICE_ID_INTEL_7505_0, 0, 0, "E7505", &intel_7505_driver, NULL },
{ PCI_DEVICE_ID_INTEL_7205_0, 0, 0, "E7205", &intel_7505_driver, NULL },
@@ -2051,11 +2056,13 @@ static struct pci_device_id agp_intel_pci_table[] = {
ID(PCI_DEVICE_ID_INTEL_82915GM_HB),
ID(PCI_DEVICE_ID_INTEL_82945G_HB),
ID(PCI_DEVICE_ID_INTEL_82945GM_HB),
+ ID(PCI_DEVICE_ID_INTEL_82945GME_HB),
ID(PCI_DEVICE_ID_INTEL_82946GZ_HB),
ID(PCI_DEVICE_ID_INTEL_82965G_1_HB),
ID(PCI_DEVICE_ID_INTEL_82965Q_HB),
ID(PCI_DEVICE_ID_INTEL_82965G_HB),
ID(PCI_DEVICE_ID_INTEL_82965GM_HB),
+ ID(PCI_DEVICE_ID_INTEL_82965GME_HB),
ID(PCI_DEVICE_ID_INTEL_G33_HB),
ID(PCI_DEVICE_ID_INTEL_Q35_HB),
ID(PCI_DEVICE_ID_INTEL_Q33_HB),
>From 2debc65425a0d38ea4af888ff265fb252c01b67b Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <[email protected]>
Date: Fri, 2 Nov 2007 15:36:08 -0400
Subject: [PATCH] knfsd: fix spurious EINVAL errors on first access of
new filesystem

mainline: ac8587dcb58e40dd336d99d60f852041e06cc3dd

The v2/v3 acl code in nfsd is translating any return from fh_verify() to
nfserr_inval. This is particularly unfortunate in the case of an
nfserr_dropit return, which is an internal error meant to indicate to
callers that this request has been deferred and should just be dropped
pending the results of an upcall to mountd.

Thanks to Roland <[email protected]> for bug report and data collection.

Cc: Roland <[email protected]>
Acked-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/fs/nfsd/nfs2acl.c b/fs/nfsd/nfs2acl.c
index b617428..0e5fa11 100644
--- a/fs/nfsd/nfs2acl.c
+++ b/fs/nfsd/nfs2acl.c
@@ -41,7 +41,7 @@ static __be32 nfsacld_proc_getacl(struct svc_rqst * rqstp,

fh = fh_copy(&resp->fh, &argp->fh);
if ((nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP)))
- RETURN_STATUS(nfserr_inval);
+ RETURN_STATUS(nfserr);

if (argp->mask & ~(NFS_ACL|NFS_ACLCNT|NFS_DFACL|NFS_DFACLCNT))
RETURN_STATUS(nfserr_inval);
diff --git a/fs/nfsd/nfs3acl.c b/fs/nfsd/nfs3acl.c
index 3e3f2de..b647f2f 100644
--- a/fs/nfsd/nfs3acl.c
+++ b/fs/nfsd/nfs3acl.c
@@ -37,7 +37,7 @@ static __be32 nfsd3_proc_getacl(struct svc_rqst * rqstp,

fh = fh_copy(&resp->fh, &argp->fh);
if ((nfserr = fh_verify(rqstp, &resp->fh, 0, MAY_NOP)))
- RETURN_STATUS(nfserr_inval);
+ RETURN_STATUS(nfserr);

if (argp->mask & ~(NFS_ACL|NFS_ACLCNT|NFS_DFACL|NFS_DFACLCNT))
RETURN_STATUS(nfserr_inval);
>From 2175db184faa2202fe6c819b7050836febebab68 Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <[email protected]>
Date: Fri, 28 Sep 2007 16:45:51 -0400
Subject: [PATCH] knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME

mainline: a16e92edcd0a2846455a30823e1bac964e743baa

Without this we always return 2^32-1 as the the maximum namelength.

Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 15809df..0898aec 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -1453,7 +1453,8 @@ nfsd4_encode_fattr(struct svc_fh *fhp, struct
svc_export *exp,
err = vfs_getattr(exp->ex_mnt, dentry, &stat);
if (err)
goto out_nfserr;
- if ((bmval0 & (FATTR4_WORD0_FILES_FREE | FATTR4_WORD0_FILES_TOTAL)) ||
+ if ((bmval0 & (FATTR4_WORD0_FILES_FREE | FATTR4_WORD0_FILES_TOTAL |
+ FATTR4_WORD0_MAXNAME)) ||
(bmval1 & (FATTR4_WORD1_SPACE_AVAIL | FATTR4_WORD1_SPACE_FREE |
FATTR4_WORD1_SPACE_TOTAL))) {
err = vfs_statfs(dentry, &statfs);
@@ -1699,7 +1700,7 @@ out_acl:
if (bmval0 & FATTR4_WORD0_MAXNAME) {
if ((buflen -= 4) < 0)
goto out_resource;
- WRITE32(~(u32) 0);
+ WRITE32(statfs.f_namelen);
}
if (bmval0 & FATTR4_WORD0_MAXREAD) {
if ((buflen -= 8) < 0)
>From 2b28f2bc9372d6fe659810292e621c9282bc9396 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Fri, 28 Sep 2007 12:27:41 -0400
Subject: [PATCH] NFS: Fix an Oops in encode_lookup()

MAINLINE: 54af3bb543c071769141387a42deaaab5074da55

It doesn't look as if the NFS file name limit is being initialised correctly
in the struct nfs_server. Make sure that we limit whatever is being set in
nfs_probe_fsinfo() and nfs_init_server().

Also ensure that readdirplus and nfs4_path_walk respect our file name
limits.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: NeilBrown <[email protected]>
CC: Oliver Pinter <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index 881fa49..aaf4b04 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -614,16 +614,6 @@ static int nfs_init_server(struct nfs_server
*server, const struct nfs_mount_dat
server->namelen = data->namlen;
/* Create a client RPC handle for the NFSv3 ACL management interface */
nfs_init_server_aclclient(server);
- if (clp->cl_nfsversion == 3) {
- if (server->namelen == 0 || server->namelen > NFS3_MAXNAMLEN)
- server->namelen = NFS3_MAXNAMLEN;
- if (!(data->flags & NFS_MOUNT_NORDIRPLUS))
- server->caps |= NFS_CAP_READDIRPLUS;
- } else {
- if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN)
- server->namelen = NFS2_MAXNAMLEN;
- }
-
dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp);
return 0;

@@ -820,6 +810,16 @@ struct nfs_server *nfs_create_server(const struct
nfs_mount_data *data,
error = nfs_probe_fsinfo(server, mntfh, &fattr);
if (error < 0)
goto error;
+ if (server->nfs_client->rpc_ops->version == 3) {
+ if (server->namelen == 0 || server->namelen > NFS3_MAXNAMLEN)
+ server->namelen = NFS3_MAXNAMLEN;
+ if (!(data->flags & NFS_MOUNT_NORDIRPLUS))
+ server->caps |= NFS_CAP_READDIRPLUS;
+ } else {
+ if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN)
+ server->namelen = NFS2_MAXNAMLEN;
+ }
+
if (!(fattr.valid & NFS_ATTR_FATTR)) {
error = server->nfs_client->rpc_ops->getattr(server, mntfh, &fattr);
if (error < 0) {
@@ -1010,6 +1010,9 @@ struct nfs_server *nfs4_create_server(const
struct nfs4_mount_data *data,
if (error < 0)
goto error;

+ if (server->namelen == 0 || server->namelen > NFS4_MAXNAMLEN)
+ server->namelen = NFS4_MAXNAMLEN;
+
BUG_ON(!server->nfs_client);
BUG_ON(!server->nfs_client->rpc_ops);
BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops);
@@ -1082,6 +1085,9 @@ struct nfs_server
*nfs4_create_referral_server(struct nfs_clone_mount *data,
if (error < 0)
goto error;

+ if (server->namelen == 0 || server->namelen > NFS4_MAXNAMLEN)
+ server->namelen = NFS4_MAXNAMLEN;
+
dprintk("Referral FSID: %llx:%llx\n",
(unsigned long long) server->fsid.major,
(unsigned long long) server->fsid.minor);
@@ -1141,6 +1147,9 @@ struct nfs_server *nfs_clone_server(struct
nfs_server *source,
if (error < 0)
goto out_free_server;

+ if (server->namelen == 0 || server->namelen > NFS4_MAXNAMLEN)
+ server->namelen = NFS4_MAXNAMLEN;
+
dprintk("Cloned FSID: %llx:%llx\n",
(unsigned long long) server->fsid.major,
(unsigned long long) server->fsid.minor);
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 95f2631..db1d6b9 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1162,6 +1162,8 @@ static struct dentry
*nfs_readdir_lookup(nfs_readdir_descriptor_t *desc)
}
if (!desc->plus || !(entry->fattr->valid & NFS_ATTR_FATTR))
return NULL;
+ if (name.len > NFS_SERVER(dir)->namelen)
+ return NULL;
/* Note: caller is already holding the dir->i_mutex! */
dentry = d_alloc(parent, &name);
if (dentry == NULL)
diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c
index d1cbf0a..522e5ad 100644
--- a/fs/nfs/getroot.c
+++ b/fs/nfs/getroot.c
@@ -175,6 +175,9 @@ next_component:
path++;
name.len = path - (const char *) name.name;

+ if (name.len > NFS4_MAXNAMLEN)
+ return -ENAMETOOLONG;
+
eat_dot_dir:
while (*path == '/')
path++;
>From 42fb057088829339af4c37207aef5351c083e263 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Fri, 8 Feb 2008 14:23:35 -0500
Subject: [PATCH] NFS: Fix a potential file corruption issue when writing

patch 5d47a35600270e7115061cb1320ee60ae9bcb6b8 in mainline.

If the inode is flagged as having an invalid mapping, then we can't rely on
the PageUptodate() flag. Ensure that we don't use the "anti-fragmentation"
write optimisation in nfs_updatepage(), since that will cause NFS to write
out areas of the page that are no longer guaranteed to be up to date.

A potential corruption could occur in the following scenario:

client 1 client 2
=============== ===============
fd=open("f",O_CREAT|O_WRONLY,0644);
write(fd,"fubar\n",6); // cache last page
close(fd);
fd=open("f",O_WRONLY|O_APPEND);
write(fd,"foo\n",4);
close(fd);

fd=open("f",O_WRONLY|O_APPEND);
write(fd,"bar\n",4);
close(fd);
-----
The bug may lead to the file "f" reading 'fubar\n\0\0\0\nbar\n' because
client 2 does not update the cached page after re-opening the file for
write. Instead it keeps it marked as PageUptodate() until someone calls
invalidate_inode_pages2() (typically by calling read()).

The bug was introduced by commit 44b11874ff583b6e766a05856b04f3c492c32b84
"NFS: Separate metadata and page cache revalidation mechanisms"

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index af344a1..380a7ae 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -710,6 +710,17 @@ int nfs_flush_incompatible(struct file *file,
struct page *page)
}

/*
+ * If the page cache is marked as unsafe or invalid, then we can't rely on
+ * the PageUptodate() flag. In this case, we will need to turn off
+ * write optimisations that depend on the page contents being correct.
+ */
+static int nfs_write_pageuptodate(struct page *page, struct inode *inode)
+{
+ return PageUptodate(page) &&
+ !(NFS_I(inode)->cache_validity &
(NFS_INO_REVAL_PAGECACHE|NFS_INO_INVALID_DATA));
+}
+
+/*
* Update and possibly write a cached page of an NFS file.
*
* XXX: Keep an eye on generic_file_read to make sure it doesn't do bad
@@ -730,10 +741,13 @@ int nfs_updatepage(struct file *file, struct page *page,
(long long)(page_offset(page) +offset));

/* If we're not using byte range locks, and we know the page
- * is entirely in cache, it may be more efficient to avoid
- * fragmenting write requests.
+ * is up to date, it may be more efficient to extend the write
+ * to cover the entire page in order to avoid fragmentation
+ * inefficiencies.
*/
- if (PageUptodate(page) && inode->i_flock == NULL && !(file->f_mode &
O_SYNC)) {
+ if (nfs_write_pageuptodate(page, inode) &&
+ inode->i_flock == NULL &&
+ !(file->f_mode & O_SYNC)) {
count = max(count + offset, nfs_page_length(page));
offset = 0;
}
>From 56fe4e34e97f8d0c65a2cb6ca1a4477d0b1b711a Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Tue, 5 Jun 2007 13:26:15 -0400
Subject: [PATCH] NFS: Fix nfs_reval_fsid()

MAINLINE: a0356862bcbeb20acf64bc1a82d28a4c5bb957a7

We don't need to revalidate the fsid on the root directory. It suffices to
revalidate it on the current directory.

Signed-off-by: Trond Myklebust <[email protected]>
Acked-by: NeilBrown <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index c27258b..95f2631 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -897,14 +897,13 @@ int nfs_is_exclusive_create(struct inode *dir,
struct nameidata *nd)
return (nd->intent.open.flags & O_EXCL) != 0;
}

-static inline int nfs_reval_fsid(struct vfsmount *mnt, struct inode *dir,
- struct nfs_fh *fh, struct nfs_fattr *fattr)
+static inline int nfs_reval_fsid(struct inode *dir, const struct
nfs_fattr *fattr)
{
struct nfs_server *server = NFS_SERVER(dir);

if (!nfs_fsid_equal(&server->fsid, &fattr->fsid))
- /* Revalidate fsid on root dir */
- return __nfs_revalidate_inode(server, mnt->mnt_root->d_inode);
+ /* Revalidate fsid using the parent directory */
+ return __nfs_revalidate_inode(server, dir);
return 0;
}

@@ -946,7 +945,7 @@ static struct dentry *nfs_lookup(struct inode
*dir, struct dentry * dentry, stru
res = ERR_PTR(error);
goto out_unlock;
}
- error = nfs_reval_fsid(nd->mnt, dir, &fhandle, &fattr);
+ error = nfs_reval_fsid(dir, &fattr);
if (error < 0) {
res = ERR_PTR(error);
goto out_unlock;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bd9f5a8..2219b6c 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -961,8 +961,8 @@ static int nfs_update_inode(struct inode *inode,
struct nfs_fattr *fattr)
goto out_changed;

server = NFS_SERVER(inode);
- /* Update the fsid if and only if this is the root directory */
- if (inode == inode->i_sb->s_root->d_inode
+ /* Update the fsid? */
+ if (S_ISDIR(inode->i_mode)
&& !nfs_fsid_equal(&server->fsid, &fattr->fsid))
server->fsid = fattr->fsid;

>From c39bf6c63dd9102fe053bcaed51d236ae4256357 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Tue, 11 Dec 2007 11:05:19 -0500
Subject: [PATCH] NFSv2/v3: Fix a memory leak when using -onolock

MAINLINE: 13ef7b69b54aa8ae4ed264d0bf41339737f8543a

Neil Brown said:
> Hi Trond,
>
> We found that a machine which made moderately heavy use of
> 'automount' was leaking some nfs data structures - particularly the
> 4K allocated by rpc_alloc_iostats.
> It turns out that this only happens with filesystems with -onolock
> set.

> The problem is that if NFS_MOUNT_NONLM is set, nfs_start_lockd doesn't
> set server->destroy, so when the filesystem is unmounted, the
> ->client_acl is not shutdown, and so several resources are still
> held. Multiple mount/umount cycles will slowly eat away memory
> several pages at a time.

Signed-off-by: Trond Myklebust <[email protected]>

Acked-by: Neil Brown <[email protected]>
Signed-off-by: Neil Brown <[email protected]>
CC: Oliver Pinter <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index aaf4b04..b6fd8a7 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -433,9 +433,6 @@ static int nfs_create_rpc_client(struct nfs_client
*clp, int proto,
*/
static void nfs_destroy_server(struct nfs_server *server)
{
- if (!IS_ERR(server->client_acl))
- rpc_shutdown_client(server->client_acl);
-
if (!(server->flags & NFS_MOUNT_NONLM))
lockd_down(); /* release rpc.lockd */
}
@@ -771,6 +768,9 @@ void nfs_free_server(struct nfs_server *server)

if (server->destroy != NULL)
server->destroy(server);
+
+ if (!IS_ERR(server->client_acl))
+ rpc_shutdown_client(server->client_acl);
if (!IS_ERR(server->client))
rpc_shutdown_client(server->client);

>From 0ce012c0868f3bfbe8778e1f291c75b238257dc3 Mon Sep 17 00:00:00 2001
From: Lee Schermerhorn <[email protected]>
Date: Fri, 21 Sep 2007 08:33:55 +0200
Subject: [PATCH] Panic in blk_rq_map_sg() from CCISS driver

MAINLINE: a683d652d334a546be9175b894f42dbd8e399536

New scatter/gather list chaining [sg_next()] treats 'page' member of
struct scatterlist with low bit set [0x01] as a chain pointer to
another struct scatterlist [array]. The CCISS driver request function
passes an uninitialized, temporary, on-stack scatterlist array to
blk_rq_map_sq(). sg_next() interprets random data on the stack as a
chain pointer and eventually tries to de-reference an invalid pointer,
resulting in:

[<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
PGD 6090c3067 PUD 0
Oops: 0000 [1] SMP
last sysfs file: /block/cciss!c0d0/cciss!c0d0p1/dev
CPU 6
Modules linked in: ehci_hcd ohci_hcd uhci_hcd
Pid: 1, comm: init Not tainted 2.6.23-rc6-mm1 #3
RIP: 0010:[<ffffffff8031dd70>] [<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
RSP: 0018:ffff81060901f768 EFLAGS: 00010206
RAX: 000000040b161000 RBX: ffff81060901f7d8 RCX: 000000040b162c00
RDX: 0000000000000000 RSI: ffff81060b13a260 RDI: ffff81060b139600
RBP: 0000000000001400 R08: 00000000fffffffe R09: 0000000000000400
R10: 0000000000000000 R11: 000000040b163000 R12: ffff810102fe0000
R13: 0000000000000001 R14: 0000000000000001 R15: 00001e0000000000
FS: 00000000026108f0(0063) GS:ffff810409000b80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000010000001e CR3: 00000006090c6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process init (pid: 1, threadinfo ffff81060901e000, task ffff810409020800)
last branch before last exception/interrupt
from [<ffffffff8031de0a>] blk_rq_map_sg+0x10a/0x170
to [<ffffffff8031dd70>] blk_rq_map_sg+0x70/0x170
Stack: 000000018068ea00 ffff810102fe0000 0000000000000000 ffff810011400000
0000000000000002 0000000000000000 ffff81040b172000 ffffffff803acd3d
0000000000003ec1 ffff8106090d5000 ffff8106090d5000 ffff810102fe0000
Call Trace:
[<ffffffff803acd3d>] do_cciss_request+0x15d/0x4c0
[<ffffffff80298968>] new_slab+0x1c8/0x270
[<ffffffff80298ffd>] __slab_alloc+0x22d/0x470
[<ffffffff8027327b>] mempool_alloc+0x4b/0x130
[<ffffffff8032b21e>] cfq_set_request+0xee/0x380
[<ffffffff8027327b>] mempool_alloc+0x4b/0x130
[<ffffffff8031ff98>] get_request+0x168/0x360
[<ffffffff80331b0d>] rb_insert_color+0x8d/0x110
[<ffffffff8031cfd8>] elv_rb_add+0x58/0x60
[<ffffffff8032a329>] cfq_add_rq_rb+0x69/0xa0
[<ffffffff8031c1ab>] elv_merged_request+0x5b/0x60
[<ffffffff803224fd>] __make_request+0x23d/0x650
[<ffffffff80298ffd>] __slab_alloc+0x22d/0x470
[<ffffffff80270000>] generic_write_checks+0x140/0x190
[<ffffffff8031f012>] generic_make_request+0x1c2/0x3a0
<etc>
Kernel panic - not syncing: Attempted to kill init!

This patch initializes the tmp_sg array to zeroes. Perhaps not the ultimate
fix, but an effective work-around. I can now boot 23-rc6-mm1 on an HP
Proliant x86_64 with CCISS boot disk.

Signed-off-by: Lee Schermerhorn <[email protected]>
CC: Oliver Pinter <[email protected]>

drivers/block/cciss.c | 1 +
1 file changed, 1 insertion(+)
Signed-off-by: Jens Axboe <[email protected]>

diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index 132f76b..355fc17 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -2568,6 +2568,7 @@ static void do_cciss_request(request_queue_t *q)
(int)creq->nr_sectors);
#endif /* CCISS_DEBUG */

+ memset(tmp_sg, 0, sizeof(tmp_sg));
seg = blk_rq_map_sg(q, creq, tmp_sg);

/* get the DMA records for the setup */
>From 077e603aedf4827ceb1ef6901620b4faedcab6ee Mon Sep 17 00:00:00 2001
From: Ian Abbott <[email protected]>
Date: Mon, 4 Feb 2008 13:56:36 +0000
Subject: [PATCH] PCI: Fix fakephp deadlock

MAINLINE: 5c796ae7a7ebe56967ed9b9963d7c16d733635ff

This patch works around a problem in the fakephp driver when a process
writing "0" to a "power" sysfs file to fake removal of a PCI device ends
up deadlocking itself in the sysfs code.

The patch is functionally identical to the one in Linus' tree post 2.6.24:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=5c796ae7a7ebe56967ed9b9963d7c16d733635ff

I have tested it on a 2.6.22 kernel.

Signed-off-by: Ian Abbott <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/pci/hotplug/fakephp.c b/drivers/pci/hotplug/fakephp.c
index 027f686..02a09d5 100644
--- a/drivers/pci/hotplug/fakephp.c
+++ b/drivers/pci/hotplug/fakephp.c
@@ -39,6 +39,7 @@
#include <linux/init.h>
#include <linux/string.h>
#include <linux/slab.h>
+#include <linux/workqueue.h>
#include "../pci.h"

#if !defined(MODULE)
@@ -63,10 +64,16 @@ struct dummy_slot {
struct list_head node;
struct hotplug_slot *slot;
struct pci_dev *dev;
+ struct work_struct remove_work;
+ unsigned long removed;
};

static int debug;
static LIST_HEAD(slot_list);
+static struct workqueue_struct *dummyphp_wq;
+
+static void pci_rescan_worker(struct work_struct *work);
+static DECLARE_WORK(pci_rescan_work, pci_rescan_worker);

static int enable_slot (struct hotplug_slot *slot);
static int disable_slot (struct hotplug_slot *slot);
@@ -109,7 +116,7 @@ static int add_slot(struct pci_dev *dev)
slot->name = &dev->dev.bus_id[0];
dbg("slot->name = %s\n", slot->name);

- dslot = kmalloc(sizeof(struct dummy_slot), GFP_KERNEL);
+ dslot = kzalloc(sizeof(struct dummy_slot), GFP_KERNEL);
if (!dslot)
goto error_info;

@@ -164,6 +171,14 @@ static void remove_slot(struct dummy_slot *dslot)
err("Problem unregistering a slot %s\n", dslot->slot->name);
}

+/* called from the single-threaded workqueue handler to remove a slot */
+static void remove_slot_worker(struct work_struct *work)
+{
+ struct dummy_slot *dslot =
+ container_of(work, struct dummy_slot, remove_work);
+ remove_slot(dslot);
+}
+
/**
* Rescan slot.
* Tries hard not to re-enable already existing devices
@@ -267,11 +282,17 @@ static inline void pci_rescan(void) {
pci_rescan_buses(&pci_root_buses);
}

+/* called from the single-threaded workqueue handler to rescan all pci buses */
+static void pci_rescan_worker(struct work_struct *work)
+{
+ pci_rescan();
+}

static int enable_slot(struct hotplug_slot *hotplug_slot)
{
/* mis-use enable_slot for rescanning of the pci bus */
- pci_rescan();
+ cancel_work_sync(&pci_rescan_work);
+ queue_work(dummyphp_wq, &pci_rescan_work);
return -ENODEV;
}

@@ -306,6 +327,10 @@ static int disable_slot(struct hotplug_slot *slot)
err("Can't remove PCI devices with other PCI devices behind it yet.\n");
return -ENODEV;
}
+ if (test_and_set_bit(0, &dslot->removed)) {
+ dbg("Slot already scheduled for removal\n");
+ return -ENODEV;
+ }
/* search for subfunctions and disable them first */
if (!(dslot->dev->devfn & 7)) {
for (func = 1; func < 8; func++) {
@@ -328,8 +353,9 @@ static int disable_slot(struct hotplug_slot *slot)
/* remove the device from the pci core */
pci_remove_bus_device(dslot->dev);

- /* blow away this sysfs entry and other parts. */
- remove_slot(dslot);
+ /* queue work item to blow away this sysfs entry and other parts. */
+ INIT_WORK(&dslot->remove_work, remove_slot_worker);
+ queue_work(dummyphp_wq, &dslot->remove_work);

return 0;
}
@@ -340,6 +366,7 @@ static void cleanup_slots (void)
struct list_head *next;
struct dummy_slot *dslot;

+ destroy_workqueue(dummyphp_wq);
list_for_each_safe (tmp, next, &slot_list) {
dslot = list_entry (tmp, struct dummy_slot, node);
remove_slot(dslot);
@@ -351,6 +378,10 @@ static int __init dummyphp_init(void)
{
info(DRIVER_DESC "\n");

+ dummyphp_wq = create_singlethread_workqueue(MY_NAME);
+ if (!dummyphp_wq)
+ return -ENOMEM;
+
return pci_scan_buses();
}

>From 053a4418b8da611b1a8b750ffcda8411432b7ad7 Mon Sep 17 00:00:00 2001
From: Kees Cook <[email protected]>
Date: Wed, 19 Sep 2007 05:46:32 -0700
Subject: [PATCH] pci: fix unterminated pci_device_id lists

pci: fix unterminated pci_device_id lists

mainline: 248bdd5efca5a113cbf443a993c69e53d370236b

Fix a couple drivers that do not correctly terminate their pci_device_id
lists. This results in garbage being spewed into modules.pcimap when the
module happens to not have 28 NULL bytes following the table, and/or the
last PCI ID is actually truncated from the table when calculating the
modules.alias PCI aliases, cause those unfortunate device IDs to not
auto-load.

Signed-off-by: Kees Cook <[email protected]>
Acked-by: Corey Minyard <[email protected]>
Cc: David Woodhouse <[email protected]>
Acked-by: Jeff Garzik <[email protected]>
Cc: Greg KH <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 78e1b96..eb894f8 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2214,7 +2214,8 @@ static int ipmi_pci_resume(struct pci_dev *pdev)

static struct pci_device_id ipmi_pci_devices[] = {
{ PCI_DEVICE(PCI_HP_VENDOR_ID, PCI_MMC_DEVICE_ID) },
- { PCI_DEVICE_CLASS(PCI_ERMC_CLASSCODE, PCI_ERMC_CLASSCODE_MASK) }
+ { PCI_DEVICE_CLASS(PCI_ERMC_CLASSCODE, PCI_ERMC_CLASSCODE_MASK) },
+ { 0, }
};
MODULE_DEVICE_TABLE(pci, ipmi_pci_devices);

diff --git a/drivers/media/video/usbvision/usbvision-cards.c
b/drivers/media/video/usbvision/usbvision-cards.c
index 51ab265..31db1ed 100644
--- a/drivers/media/video/usbvision/usbvision-cards.c
+++ b/drivers/media/video/usbvision/usbvision-cards.c
@@ -1081,6 +1081,7 @@ struct usb_device_id usbvision_table [] = {
{ USB_DEVICE(0x2304, 0x0301), .driver_info=PINNA_LINX_VD_IN_CAB_PAL },
{ USB_DEVICE(0x2304, 0x0419), .driver_info=PINNA_PCTV_BUNGEE_PAL_FM },
{ USB_DEVICE(0x2400, 0x4200), .driver_info=HPG_WINTV },
+ { }, /* terminate list */
};

MODULE_DEVICE_TABLE (usb, usbvision_table);
diff --git a/drivers/mtd/nand/cafe_nand.c b/drivers/mtd/nand/cafe_nand.c
index cff969d..6f32a35 100644
--- a/drivers/mtd/nand/cafe_nand.c
+++ b/drivers/mtd/nand/cafe_nand.c
@@ -816,7 +816,8 @@ static void __devexit cafe_nand_remove(struct pci_dev *pdev)
}

static struct pci_device_id cafe_nand_tbl[] = {
- { 0x11ab, 0x4100, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_MEMORY_FLASH <<
8, 0xFFFF0 }
+ { 0x11ab, 0x4100, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_MEMORY_FLASH <<
8, 0xFFFF0 },
+ { 0, }
};

MODULE_DEVICE_TABLE(pci, cafe_nand_tbl);
>From b5cf6be2dfcb4ae3885edec574526221c58f0216 Mon Sep 17 00:00:00 2001
From: Christoph Lameter <[email protected]>
Date: Sat, 22 Dec 2007 14:03:23 -0800
Subject: [PATCH] quicklists: do not release off node pages early

patch ed367fc3a7349b17354c7acef551533337764859 in mainline.

quicklists must keep even off node pages on the quicklists until the TLB
flush has been completed.

Signed-off-by: Christoph Lameter <[email protected]>
Cc: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/include/linux/quicklist.h b/include/linux/quicklist.h
index 9371c61..39b6671 100644
--- a/include/linux/quicklist.h
+++ b/include/linux/quicklist.h
@@ -56,14 +56,6 @@ static inline void __quicklist_free(int nr, void
(*dtor)(void *), void *p,
struct page *page)
{
struct quicklist *q;
- int nid = page_to_nid(page);
-
- if (unlikely(nid != numa_node_id())) {
- if (dtor)
- dtor(p);
- __free_page(page);
- return;
- }

q = &get_cpu_var(quicklist)[nr];
*(void **)p = q->page;
>From fcae8056673aa7b5b55d669fe5255ae76a93ae53 Mon Sep 17 00:00:00 2001
From: Christoph Lameter <[email protected]>
Date: Wed, 16 Jan 2008 00:21:19 +0530
Subject: [PATCH] quicklists: Only consider memory that can be used
with GFP_KERNEL

patch 96990a4ae979df9e235d01097d6175759331e88c in mainline.

Quicklists calculates the size of the quicklists based on the number of
free pages. This must be the number of free pages that can be allocated
with GFP_KERNEL. node_page_state() includes the pages in ZONE_HIGHMEM and
ZONE_MOVABLE which may lead the quicklists to become too large causing OOM.

Signed-off-by: Christoph Lameter <[email protected]>
Tested-by: Dhaval Giani <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/mm/quicklist.c b/mm/quicklist.c
index ae8189c..3f703f7 100644
--- a/mm/quicklist.c
+++ b/mm/quicklist.c
@@ -26,9 +26,17 @@ DEFINE_PER_CPU(struct quicklist, quicklist)[CONFIG_NR_QUICK];
static unsigned long max_pages(unsigned long min_pages)
{
unsigned long node_free_pages, max;
+ struct zone *zones = NODE_DATA(numa_node_id())->node_zones;
+
+ node_free_pages =
+#ifdef CONFIG_ZONE_DMA
+ zone_page_state(&zones[ZONE_DMA], NR_FREE_PAGES) +
+#endif
+#ifdef CONFIG_ZONE_DMA32
+ zone_page_state(&zones[ZONE_DMA32], NR_FREE_PAGES) +
+#endif
+ zone_page_state(&zones[ZONE_NORMAL], NR_FREE_PAGES);

- node_free_pages = node_page_state(numa_node_id(),
- NR_FREE_PAGES);
max = node_free_pages / FRACTION_OF_NODE_MEM;
return max(max, min_pages);
}
>From e38523fd7f35f485e8190b0ad286b8137a852b13 Mon Sep 17 00:00:00 2001
From: Mikael Pettersson <[email protected]>
Date: Wed, 16 Jan 2008 10:32:17 +0100
Subject: [PATCH] sata_promise: ASIC PRD table bug workaround

patch b9ccd4a90bbb964506f01b4bdcff4f50f8d5d334 in mainline.

Second-generation Promise SATA controllers have an ASIC bug
which can trigger if the last PRD entry is larger than 164 bytes,
resulting in intermittent errors and possible data corruption.

Work around this by replacing calls to ata_qc_prep() with a
private version that fills the PRD, checks the size of the
last entry, and if necessary splits it to avoid the bug.
Also reduce sg_tablesize by 1 to accommodate the new entry.

Tested on the second-generation SATA300 TX4 and SATA300 TX2plus,
and the first-generation PDC20378.

Thanks to Alexander Sabourenkov for verifying the bug by
studying the vendor driver, and for writing the initial patch
upon which this one is based.

Signed-off-by: Mikael Pettersson <[email protected]>
Cc: Jeff Garzik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Oliver Pinter <[email protected]>

diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c
index 6dc0b01..1d90ce1 100644
--- a/drivers/ata/sata_promise.c
+++ b/drivers/ata/sata_promise.c
@@ -51,6 +51,7 @@
enum {
PDC_MAX_PORTS = 4,
PDC_MMIO_BAR = 3,
+ PDC_MAX_PRD = LIBATA_MAX_PRD - 1, /* -1 for ASIC PRD bug workaround */

/* register offsets */
PDC_FEATURE = 0x04, /* Feature/Error reg (per port) */
@@ -157,7 +158,7 @@ static struct scsi_host_template pdc_ata_sht = {
.queuecommand = ata_scsi_queuecmd,
.can_queue = ATA_DEF_QUEUE,
.this_id = ATA_SHT_THIS_ID,
- .sg_tablesize = LIBATA_MAX_PRD,
+ .sg_tablesize = PDC_MAX_PRD,
.cmd_per_lun = ATA_SHT_CMD_PER_LUN,
.emulated = ATA_SHT_EMULATED,
.use_clustering = ATA_SHT_USE_CLUSTERING,
@@ -531,6 +532,84 @@ static void pdc_atapi_pkt(struct ata_queued_cmd *qc)
memcpy(buf+31, cdb, cdb_len);
}

+/**
+ * pdc_fill_sg - Fill PCI IDE PRD table
+ * @qc: Metadata associated with taskfile to be transferred
+ *
+ * Fill PCI IDE PRD (scatter-gather) table with segments
+ * associated with the current disk command.
+ * Make sure hardware does not choke on it.
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ */
+static void pdc_fill_sg(struct ata_queued_cmd *qc)
+{
+ struct ata_port *ap = qc->ap;
+ struct scatterlist *sg;
+ unsigned int idx;
+ const u32 SG_COUNT_ASIC_BUG = 41*4;
+
+ if (!(qc->flags & ATA_QCFLAG_DMAMAP))
+ return;
+
+ WARN_ON(qc->__sg == NULL);
+ WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
+
+ idx = 0;
+ ata_for_each_sg(sg, qc) {
+ u32 addr, offset;
+ u32 sg_len, len;
+
+ /* determine if physical DMA addr spans 64K boundary.
+ * Note h/w doesn't support 64-bit, so we unconditionally
+ * truncate dma_addr_t to u32.
+ */
+ addr = (u32) sg_dma_address(sg);
+ sg_len = sg_dma_len(sg);
+
+ while (sg_len) {
+ offset = addr & 0xffff;
+ len = sg_len;
+ if ((offset + sg_len) > 0x10000)
+ len = 0x10000 - offset;
+
+ ap->prd[idx].addr = cpu_to_le32(addr);
+ ap->prd[idx].flags_len = cpu_to_le32(len & 0xffff);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ idx++;
+ sg_len -= len;
+ addr += len;
+ }
+ }
+
+ if (idx) {
+ u32 len = le32_to_cpu(ap->prd[idx - 1].flags_len);
+
+ if (len > SG_COUNT_ASIC_BUG) {
+ u32 addr;
+
+ VPRINTK("Splitting last PRD.\n");
+
+ addr = le32_to_cpu(ap->prd[idx - 1].addr);
+ ap->prd[idx - 1].flags_len = cpu_to_le32(len - SG_COUNT_ASIC_BUG);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx - 1, addr, SG_COUNT_ASIC_BUG);
+
+ addr = addr + len - SG_COUNT_ASIC_BUG;
+ len = SG_COUNT_ASIC_BUG;
+ ap->prd[idx].addr = cpu_to_le32(addr);
+ ap->prd[idx].flags_len = cpu_to_le32(len);
+ VPRINTK("PRD[%u] = (0x%X, 0x%X)\n", idx, addr, len);
+
+ idx++;
+ }
+
+ ap->prd[idx - 1].flags_len |= cpu_to_le32(ATA_PRD_EOT);
+ }
+}
+
static void pdc_qc_prep(struct ata_queued_cmd *qc)
{
struct pdc_port_priv *pp = qc->ap->private_data;
@@ -540,7 +619,7 @@ static void pdc_qc_prep(struct ata_queued_cmd *qc)

switch (qc->tf.protocol) {
case ATA_PROT_DMA:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
/* fall through */

case ATA_PROT_NODATA:
@@ -556,11 +635,11 @@ static void pdc_qc_prep(struct ata_queued_cmd *qc)
break;

case ATA_PROT_ATAPI:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
break;

case ATA_PROT_ATAPI_DMA:
- ata_qc_prep(qc);
+ pdc_fill_sg(qc);
/*FALLTHROUGH*/
case ATA_PROT_ATAPI_NODATA:
pdc_atapi_pkt(qc);
>From 3c565dcb45570eec3471b33ace87ecb4b51e70d4 Mon Sep 17 00:00:00 2001
From: Mikael Pettersson <[email protected]>
Date: Wed, 16 Jan 2008 10:31:22 +0100
Subject: [PATCH] sata_promise: FastTrack TX4200 is a second-generation chip

patch 7f9992a23190418592f0810900e4f91546ec41da in mainline.

This patch corrects sata_promise to classify FastTrack TX4200
(DID 3515/3519) as a second-generation chip. Promise's partial-
source FT TX4200 driver confirms this classification.

Treating it as a first-generation chip causes several problems:
1. Detection failures. This is a recent regression triggered by
the hotplug-enabling changes in 2.6.23-rc1.
2. Various "failed to resume link for reset" warnings.

This patch fixes <http://bugzilla.kernel.org/show_bug.cgi?id=8936>.

Thanks to Stephen Ziemba for reporting the bug and for testing the fix.

Signed-off-by: Mikael Pettersson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c
index 1d90ce1..681b76a 100644
--- a/drivers/ata/sata_promise.c
+++ b/drivers/ata/sata_promise.c
@@ -331,8 +331,8 @@ static const struct pci_device_id pdc_ata_pci_tbl[] = {

{ PCI_VDEVICE(PROMISE, 0x3318), board_20319 },
{ PCI_VDEVICE(PROMISE, 0x3319), board_20319 },
- { PCI_VDEVICE(PROMISE, 0x3515), board_20319 },
- { PCI_VDEVICE(PROMISE, 0x3519), board_20319 },
+ { PCI_VDEVICE(PROMISE, 0x3515), board_40518 },
+ { PCI_VDEVICE(PROMISE, 0x3519), board_40518 },
{ PCI_VDEVICE(PROMISE, 0x3d17), board_40518 },
{ PCI_VDEVICE(PROMISE, 0x3d18), board_40518 },

>From 17ddbac1c59c8d184c8ceb0b5bd39e6a97f9e7a8 Mon Sep 17 00:00:00 2001
From: Mattia Dongili <[email protected]>
Date: Sun, 12 Aug 2007 07:20:27 +0900
Subject: [PATCH] sony-laptop: call sonypi_compat_init earlier

sony-laptop: call sonypi_compat_init earlier

mainline: 015a916fbbf105bb15f4bbfd80c3b9b2f2e0d7db

sonypi_compat uses a kfifo that needs to be present before _SRS is
called to be able to cope with the IRQs triggered when setting
resources.

Signed-off-by: Mattia Dongili <[email protected]>
Signed-off-by: Len Brown <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/misc/sony-laptop.c b/drivers/misc/sony-laptop.c
index ab2ca63..6d2d64f 100644
--- a/drivers/misc/sony-laptop.c
+++ b/drivers/misc/sony-laptop.c
@@ -2056,8 +2056,6 @@ static int sony_pic_remove(struct acpi_device
*device, int type)
struct sony_pic_ioport *io, *tmp_io;
struct sony_pic_irq *irq, *tmp_irq;

- sonypi_compat_exit();
-
if (sony_pic_disable(device)) {
printk(KERN_ERR DRV_PFX "Couldn't disable device.\n");
return -ENXIO;
@@ -2067,6 +2065,8 @@ static int sony_pic_remove(struct acpi_device
*device, int type)
release_region(spic_dev.cur_ioport->io.minimum,
spic_dev.cur_ioport->io.address_length);

+ sonypi_compat_exit();
+
sony_laptop_remove_input();

/* pf attrs */
@@ -2132,6 +2132,9 @@ static int sony_pic_add(struct acpi_device *device)
goto err_free_resources;
}

+ if (sonypi_compat_init())
+ goto err_remove_input;
+
/* request io port */
list_for_each_entry(io, &spic_dev.ioports, list) {
if (request_region(io->io.minimum, io->io.address_length,
@@ -2146,7 +2149,7 @@ static int sony_pic_add(struct acpi_device *device)
if (!spic_dev.cur_ioport) {
printk(KERN_ERR DRV_PFX "Failed to request_region.\n");
result = -ENODEV;
- goto err_remove_input;
+ goto err_remove_compat;
}

/* request IRQ */
@@ -2186,9 +2189,6 @@ static int sony_pic_add(struct acpi_device *device)
if (result)
goto err_remove_pf;

- if (sonypi_compat_init())
- goto err_remove_pf;
-
return 0;

err_remove_pf:
@@ -2204,6 +2204,9 @@ err_release_region:
release_region(spic_dev.cur_ioport->io.minimum,
spic_dev.cur_ioport->io.address_length);

+err_remove_compat:
+ sonypi_compat_exit();
+
err_remove_input:
sony_laptop_remove_input();

>From eddfe53b77e6c2008f4a2c46a179b7d8a63c8d71 Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <[email protected]>
Date: Thu, 15 Nov 2007 03:47:27 -0800
Subject: [PATCH] Don't oops on MTU change.

[VIA_VELOCITY]: Don't oops on MTU change.

MAINLINE: bd7b3f34198071d8bec05180530c362f1800ba46

Simple mtu change when device is down.
Fix http://bugzilla.kernel.org/show_bug.cgi?id=9382.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index b670b97..63a2f81 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -1798,6 +1798,11 @@ static int velocity_change_mtu(struct
net_device *dev, int new_mtu)
return -EINVAL;
}

+ if (!netif_running(dev)) {
+ dev->mtu = new_mtu;
+ return 0;
+ }
+
if (new_mtu != oldmtu) {
spin_lock_irqsave(&vptr->lock, flags);

>From ed27e4a69a8c2b0eaa9fd7b3d74a0da95baa7780 Mon Sep 17 00:00:00 2001
From: Stephen Hemminger <[email protected]>
Date: Wed, 28 Nov 2007 22:20:16 -0800
Subject: [PATCH] via-velocity: don't oops on MTU change (resend)

via-velocity: don't oops on MTU change (resend)

mainline: 48f6b053613b62fed7a2fe3255e5568260a8d615

The VIA veloicty driver needs the following to allow changing MTU when down.
The buffer size needs to be computed when device is brought up, not when
device is initialized. This also fixes a bug where the buffer size was
computed differently on change_mtu versus initial setting.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>
Acked-by: Jeff Mahoney <[email protected]>
CC: Oliver Pinter <[email protected]>

diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index 63a2f81..431269e 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -1075,6 +1075,9 @@ static int velocity_init_rd_ring(struct
velocity_info *vptr)
int ret = -ENOMEM;
unsigned int rsize = sizeof(struct velocity_rd_info) *
vptr->options.numrx;
+ int mtu = vptr->dev->mtu;
+
+ vptr->rx_buf_sz = (mtu <= ETH_DATA_LEN) ? PKT_BUF_SZ : mtu + 32;

vptr->rd_info = kmalloc(rsize, GFP_KERNEL);
if(vptr->rd_info == NULL)
@@ -1733,8 +1736,6 @@ static int velocity_open(struct net_device *dev)
struct velocity_info *vptr = netdev_priv(dev);
int ret;

- vptr->rx_buf_sz = (dev->mtu <= 1504 ? PKT_BUF_SZ : dev->mtu + 32);
-
ret = velocity_init_rings(vptr);
if (ret < 0)
goto out;
@@ -1813,12 +1814,6 @@ static int velocity_change_mtu(struct
net_device *dev, int new_mtu)
velocity_free_rd_ring(vptr);

dev->mtu = new_mtu;
- if (new_mtu > 8192)
- vptr->rx_buf_sz = 9 * 1024;
- else if (new_mtu > 4096)
- vptr->rx_buf_sz = 8192;
- else
- vptr->rx_buf_sz = 4 * 1024;

ret = velocity_init_rd_ring(vptr);
if (ret < 0)


--
Thanks,
Oliver


Attachments:
(No filename) (53.48 kB)
v2.6.22.19-queue.tar.bz2 (18.98 kB)
Download all attachments