2003-08-02 22:21:23

by Andrew Morton

Subject: 2.6.0-test2-mm3


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test2/2.6.0-test2-mm3/

. Con's CPU scheduler rework has been dropped out and Ingo's changes have
been added.

Con's changes demonstrated that additional infrastructure is needed to
solve these problems correctly. Ingo's patch adds it. Con is
continuing to rebase his work on these changes.

. Added Ingo's 4G/4G memory split patch. It takes my kernel build from
1:51 to 1:53 so gee.

Big fat warning: whenever you change the value of CONFIG_X86_4G you will
need to run a `make clean'. Or just remove arch/i386/boot/setup.o.

The build system seems to not notice that setup.S depends on
CONFIG_X86_4G and the resulting kernel immediately triplefaults.

. A device mapper update.

. Several reiserfs bugfixes

. I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
PS/2 or synaptics problems. You are all very bad.




Changes since 2.6.0-test2-mm2:


+linus.patch

Latest Linus tree

-alsa-bk-2003-07-28.patch
-x86_64-merge.patch
-misc31.patch
-selinux.patch
-reslabify-pgds-and-pmds.patch
-buffer-debug.patch
-centrino-update.patch
-3c59x-pm-fix.patch
-dev_t-printing.patch
-rootdisk-parsing-fix.patch
-3c59x-eisa-fix.patch
-slab-reclaim-accounting-fix.patch
-stack-leak-fix.patch
-unlock_buffer-barrier.patch
-invalidate_mmap_range.patch
-buffer_io_error-readahead-fix.patch
-force_page_cache_readahead.patch
-truncate-pagefault-race-fix.patch
-truncate-pagefault-race-fix-fix.patch
-no_page-memory-barriers.patch
-ext3-elide-inode-block-reading.patch
-ext3_getblk-race-fix.patch
-ext3_write_super-speedup.patch
-alloc_bootmem_low_pages-ordering-fix.patch
-sis-drm-fix.patch
-soundcard-devfs-fix.patch
-6pack-hz-fix.patch
-devfs_lookup-revert-and-refix.patch
-write-mark_page_accessed.patch
-less-kswapd-throttling.patch
-zone-pressure.patch
-reclaim-mapped-pressure.patch
-xfs-dio-unwritten-extents.patch
-force-CONFIG_INPUT.patch
-ipt_helper-build-fix.patch
-select-xoffed-tty-fix.patch
-conntrack-build-fix.patch
-arcnet-typo-fix.patch
-ext3-commit-assertion-fix.patch
-read_dir-fix.patch
-blk_start_queue-fix.patch
-special_file-move.patch
-remove-queue_wait.patch
-uidhash-locking.patch
-osf-partition-handling.patch
-com20020_cs-build-fix.patch
-hdlc-build-fix.patch
-add-mandocs-target.patch
-binfmt_script-argv0-fix.patch
-bttv-driver-update.patch
-ppp-xon-xoff-handling.patch
-dac960-devfs-fix.patch
-dquot-typo-fix.patch
-i810-fix.patch
-intel-agp-oops-fix.patch
-export-agp_memory_reserved.patch
-pci_device_id-devinitdata.patch
-airo-fixes.patch
-ppc32-cpu-registration-fix.patch
-impi-build-fix.patch
-document-nfs-utils.patch
-untested-quota-fix.patch
-generic-hdlc-updates.patch
-stallion-devfs-fix.patch
-dm-rename-resume.patch
-serial-is-not-experimental.patch
-ftl-warning-fix.patch
-watchdog-module-param-fixes.patch

merged

+execve-fixes.patch

Fix execve() emulation for various 64-bit architectures

+x86_64-cpumask_t-fix.patch

Maybe fix x86_64 merge for cpumask_t

+ppc64-local.patch
+ppc64-sections.patch
+ppc64-sched_clock.patch
+ppc64-prom-compile-fix.patch

Various ppc64 hacks to make it build and work.

-rcu-stats.patch

Dropped. rcu-grace-period.patch broke it.

+rcu-grace-period.patch

Instrumentation to help solve the rcu-for-route-cache starvation problem.

-o1-interactivity.patch
-o2int.patch
-o3int.patch
-o4int.patch
-o5int-2.patch
-o6int.patch
-o6.1int.patch
-o7int.patch
-o8int.patch
-o9int.patch
-o10int.patch
-o11int.patch
-o11int.1.patch
-o11.2int.patch

Dropped.

+sched-2.6.0-test2-mm2-A3.patch

Ingo's CPU scheduler update.

+sched-warning-fix.patch

Fix a warning in it.

+nforce2-acpi-fixes-fix.patch

Fix for the fix for ACPI problems with the nforce chipset

+synaptics-mode-set.patch

Fix old synaptics touchpads.

+4g-2.6.0-test2-mm2-A5.patch

4G/4G split

+4g4g-cleanups.patch

Tidy some warnings in it

+kgdb-4g4g-fix-2.patch

Fix kgdb for 4G/4G

+4g4g-config-fix.patch

Tidy up the config options.

+dm-1-module-param.patch
+dm-2-blk.patch
+dm-3-use-hex.patch
+dm-4-64-bit-ioctls.patch
+dm-5-missing-include.patch
+dm-6-sector_div.patch
+dm-7-rename-resume.patch

Device mapper update

+reiserfs-savelinks-endianness-fix.patch
+reiserfs-enospc-fix.patch
+reiserfs-link-unlink-race-fix.patch

Reiserfs fixes

+mremap-atomicity-fix.patch

mremap() fix

+spurious-SIGCHLD-fix.patch

signal fix

+aic7xxx_old-oops-fix.patch

aic7xxx_old oops workaround; not a real fix.

+ide-cd-oops-fix.patch

Fix oops with ide-cd on end-of-disk errors.

+awe-core.patch
+awe-core-fixes.patch
+awe-use-gfp_flags.patch
+awe-use-gfp_flags-fixes.patch
+awe-fix-truncate-errors.patch
+awe-fix-truncate-errors-fixes.patch

Report EIO and ENOSPC errors during async writeout to userspace.

+as-remove-hash-valid-stuff.patch

Anticipatory scheduler leftovers

+usercopy-might_sleep-checks.patch

might_sleep() checks in usercopy functions.



All 130 patches:

linus.patch
cset-20030802_1915.txt.gz

mm.patch
add -mmN to EXTRAVERSION

kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)

kgdb-remove-cpu_callout_map.patch
kgdb: remove cpu_callout_map decls

kgdb-use-ggdb.patch

kgdb-ga-docco-fixes.patch
kgdb doc. edits/corrections

execve-fixes.patch
fix 64-bit architectures for the binprm change

cpumask_t-1.patch
cpumask_t: allow more than BITS_PER_LONG CPUs
cpumask_t fix for s390
fix cpumask_t for s390
Fix cpumask changes for x86_64
fix cpumask_t for sparc64

cpumask_t-gcc-workaround-46.patch
cpumask_t: more gcc workarounds

cpumask_t-gcc-workaround-47.patch
cpumask_t gcc bug workarounds

cpumask-acpi-fix.patch
cpumask_t: build fix

kgdb-cpumask_t.patch

x86_64-cpumask_t-fix.patch

config_spinline.patch
uninline spinlocks for profiling accuracy.

ppc64-bar-0-fix.patch
Allow PCI BARs that start at 0

ppc64-reloc_hide.patch

ppc64-semaphore-reimplementation.patch
ppc64: use the ia32 semaphore implementation

ppc64-local.patch
ppc64: local.h implementation

ppc64-sections.patch
ppc64: implement sections.h

ppc64-sched_clock.patch
ppc64: sched_clock()

ppc64-prom-compile-fix.patch
ppc64: prom.c compile fix

sym-do-160.patch
make the SYM driver do 160 MB/sec

ia64-percpu-revert.patch
revert percpu changes

x86_64-fixes.patch
x86_64 fixes

delay-ksoftirqd-fallback.patch
Try harder in IRQ context before falling back to ksoftirqd

ds-09-vicam-usercopy-fix.patch
vicam usercopy fix

rcu-grace-period.patch
Monitor RCU grace period

mtrr-hang-fix.patch
Fix mtrr-related hang

intel8x0-cleanup.patch
intel8x0 cleanups

bio-too-big-fix.patch
Fix raid "bio too big" failures

ppa-fix.patch
ppa fix

linux-isp-2.patch

linux-isp-2-fix-again.patch
lost feral fix

feral-bounce-fix.patch
Feral driver - highmem issues

feral-bounce-fix-2.patch
Feral driver bouncing fix

list_del-debug.patch
list_del debug check

print-build-options-on-oops.patch
print a few config options on oops

show_task-free-stack-fix.patch
show_task() fix and cleanup

put_task_struct-debug.patch

ia32-mknod64.patch
mknod64 for ia32

ext2-64-bit-special-inodes.patch
ext2: support for 64-bit device nodes

ext3-64-bit-special-inodes.patch
ext3: support for 64-bit device nodes

64-bit-dev_t-kdev_t.patch
64-bit dev_t and kdev_t

64-bit-dev_t-other-archs.patch
enable 64-bit dev_t for other archs

oops-dump-preceding-code.patch
i386 oops output: dump preceding code

lockmeter.patch

printk-oops-mangle-fix.patch
disentangle printk's whilst oopsing on SMP

20-odirect_enable.patch

21-odirect_cruft.patch

22-read_proc.patch

23-write_proc.patch

24-commit_proc.patch

25-odirect.patch

nfs-O_DIRECT-always-enabled.patch
Force CONFIG_NFS_DIRECTIO

kjournald-PF_SYNCWRITE.patch

sched-2.6.0-test2-mm2-A3.patch
sched-2.6.0-test2-mm2-A3

sched-warning-fix.patch

sched-balance-tuning.patch
CPU scheduler balancing fix

ext3-block-allocation-cleanup.patch

nfs-revert-backoff.patch
nfs: revert backoff changes

floppy-smp-fixes.patch
floppy smp fixes

1000HZ-time-accuracy-fix.patch
missing #if for 1000 HZ

signal-race-fix.patch
signal handling race condition causing reboot hangs

vmscan-defer-writepage.patch
vmscan: give dirty referenced pages another pass around the LRU

blacklist-asus-L3800C-dmi.patch
add ASUS l3800P to DMI black list

nforce2-acpi-fixes.patch
ACPI patch which fixes all my IRQ problems on nforce2

nforce2-acpi-fixes-fix.patch

remove-const-initdata.patch
__initdata can't be marked const

timer-race-fixes.patch
timer race fixes

local-apic-enable-fixes.patch
Local APIC enable fixes

p00001_synaptics-restore-on-close.patch

p00002_psmouse-reset-timeout.patch

p00003_synaptics-multi-button.patch

p00004_synaptics-optional.patch

p00005_synaptics-pass-through.patch

p00006_psmouse-suspend-resume.patch

p00007_synaptics-old-proto.patch

synaptics-mode-set.patch
Synaptics mode setting

bridge-notification-fix.patch
Fix bridge notification processing

keyboard-resend-fix.patch
keyboard resend fix

kobject-paranoia-checks.patch
Driver core and kobject paranoia checks

4g-2.6.0-test2-mm2-A5.patch
4G/4G split patch

4g4g-cleanups.patch

kgdb-4g4g-fix-2.patch

4g4g-config-fix.patch

dm-1-module-param.patch
dm: don't use MODULE_PARM

dm-2-blk.patch
dm: remove blk.h include

dm-3-use-hex.patch
dm: decimal device num sscanf

dm-4-64-bit-ioctls.patch
dm: 64 bit ioctl fixes

dm-5-missing-include.patch
dm: missing #include

dm-6-sector_div.patch
dm: use sector_div()

dm-7-rename-resume.patch
dm: resume() name clash

reiserfs-savelinks-endianness-fix.patch
reiserfs: fix savelinks on bigendian arches

reiserfs-enospc-fix.patch
reiserfs: fix problem when fs is out of space

reiserfs-link-unlink-race-fix.patch
reiserfs: fix races between link and unlink on same file

mremap-atomicity-fix.patch
move_one_page() atomicity fix

spurious-SIGCHLD-fix.patch
spurious SIGCHLD from dying thread group leader

aic7xxx_old-oops-fix.patch

ide-cd-oops-fix.patch
ide-cd error handling oops fix

xfs-use-after-free-fix.patch
XFS use-after-free fix

awe-core.patch
async write errors: report truncate and io errors on async writes

awe-core-fixes.patch
async write errors core: fixes

awe-use-gfp_flags.patch
async write errors: use flags in address space

awe-use-gfp_flags-fixes.patch
async write errors: mapping->flags fixes

awe-fix-truncate-errors.patch
async write errors: fix spurious fs truncate errors

awe-fix-truncate-errors-fixes.patch
async write errors: truncate handling fixes

as-remove-hash-valid-stuff.patch
AS: remove hash valid stuff

usercopy-might_sleep-checks.patch
might_sleep() checks for usercopy functions

aio-mm-refcounting-fix.patch
fix /proc mm_struct refcounting bug

aio-01-retry.patch
AIO: Core retry infrastructure

io_submit_one-EINVAL-fix.patch
Fix aio process hang on EINVAL

aio-02-lockpage_wq.patch
AIO: Async page wait

aio-03-fs_read.patch
AIO: Filesystem aio read

aio-04-buffer_wq.patch
AIO: Async buffer wait

aio-05-fs_write.patch
AIO: Filesystem aio write

aio-05-fs_write-fix.patch

aio-06-bread_wq.patch
AIO: Async block read

aio-06-bread_wq-fix.patch

aio-07-ext2getblk_wq.patch
AIO: Async get block for ext2

O_SYNC-speedup-2.patch
speed up O_SYNC writes

aio-09-o_sync.patch
aio O_SYNC

aio-10-BUG-fix.patch
AIO: fix a BUG

aio-11-workqueue-flush.patch
AIO: flush workqueues before destroying ioctx'es

aio-12-readahead.patch
AIO: readahead fixes

aio-dio-no-readahead.patch
aio O_DIRECT no readahead

lock_buffer_wq-fix.patch
lock_buffer_wq fix

unuse_mm-locked.patch
AIO: hold the context lock across unuse_mm

aio-take-task_lock.patch
From: Suparna Bhattacharya <[email protected]>
Subject: Re: 2.5.72-mm1 - Under heavy testing with AIO,.. vmstat seems to blow the kernel

aio-O_SYNC-fix.patch
Unify o_sync changes for aio and regular writes

aio-readahead-rework.patch
Unified page range readahead for aio and regular reads




2003-08-02 22:31:42

by bert hubert

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:

> . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> PS/2 or synaptics problems. You are all very bad.

Will report 12 hours from now or so, I have synaptics problems currently.

> -selinux.patch
(...)
> merged

Sure about this?

> +4g-2.6.0-test2-mm2-A5.patch
>
> 4G/4G split

Linus called this patch 'tasteless' - do you see this being merged?

Thanks.

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://lartc.org Linux Advanced Routing & Traffic Control HOWTO

2003-08-02 23:54:15

by Andrew Morton

Subject: Re: 2.6.0-test2-mm3

bert hubert <[email protected]> wrote:
>
> On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:
>
> > . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> > PS/2 or synaptics problems. You are all very bad.
>
> Will report 12 hours from now or so, I have synaptics problems currently.
>
> > -selinux.patch
> (...)
> > merged
>
> Sure about this?

yes.

> > +4g-2.6.0-test2-mm2-A5.patch
> >
> > 4G/4G split
>
> Linus called this patch 'tasteless' - do you see this being merged?

Bolting 64G of memory onto a 32-bit CPU is tasteless too...

We already have a bucketload of highmem hacks in the kernel, and they are
not sufficient for some people. We have several more (large) highmem hacks
being proposed.

It is fairly clear that a number of users will need more highmem hacks than
we currently have. I'd rather add one more big highmem hack than a whole
bunch more little ones. And I'd rather that the big highmem hack be in the
base kernel, rather than having different versions floating about in
different vendor trees.

It seems that 4G+4G is the most viable patch at this stage. Wider testing
will tell.

I rather wish that the patch had been available a year ago, so we would now
have fewer little highmem hacks in the tree.

wrt long-term kernel purity: one approach would be to not merge 4G+4G into
2.7 at all. This keeps the long-term kernel codebase saner. It assumes
that the monster 32-bit boxes will have been obsoleted by 64-bit machines
within 3-4 years and that it is acceptable to end-of-line those machines on
a 2.6-based kernel. I think that's pretty safe.


The main concern is that I don't want to see vendor kernels madly diverging
from the public kernel right at the outset of the 2.6 series.

2003-08-03 00:10:04

by William Lee Irwin III

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:
> . Added Ingo's 4G/4G memory split patch. It takes my kernel build from
> 1:51 to 1:53 so gee.

No idea who, if anyone, listened last time I answered questions on
this. Sending in fixes shortly...


-- wli

2003-08-03 01:49:58

by Felipe Alfaro Solana

Subject: Re: 2.6.0-test2-mm3

On Sun, 2003-08-03 at 00:22, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test2/2.6.0-test2-mm3/
>
> . Con's CPU scheduler rework has been dropped out and Ingo's changes have
> been added.

Why?

2003-08-03 01:59:52

by Andrew Morton

Subject: Re: 2.6.0-test2-mm3

Felipe Alfaro Solana <[email protected]> wrote:
>
> On Sun, 2003-08-03 at 00:22, Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test2/2.6.0-test2-mm3/
> >
> > . Con's CPU scheduler rework has been dropped out and Ingo's changes have
> > been added.
>
> Why?

Because of the other reasons which I mentioned? We need additional
infrastructure such as the nanosecond timing to do this right.

2003-08-03 02:03:15

by Felipe Alfaro Solana

Subject: Re: 2.6.0-test2-mm3

On Sun, 2003-08-03 at 04:00, Andrew Morton wrote:
> Felipe Alfaro Solana <[email protected]> wrote:
> >
> > On Sun, 2003-08-03 at 00:22, Andrew Morton wrote:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test2/2.6.0-test2-mm3/
> > >
> > > . Con's CPU scheduler rework has been dropped out and Ingo's changes have
> > > been added.
> >
> > Why?
>
> Because of the other reasons which I mentioned? We need additional
> infrastructure such as the nanosecond timing to do this right.

That's what happens when one doesn't read an e-mail message carefully.
Thanks, Andrew...

2003-08-03 02:13:18

by William Lee Irwin III

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:
>> . Added Ingo's 4G/4G memory split patch. It takes my kernel build from
>> 1:51 to 1:53 so gee.

On Sat, Aug 02, 2003 at 05:11:19PM -0700, William Lee Irwin III wrote:
> No idea who, if anyone, listened last time I answered questions on
> this. Sending in fixes shortly...

Alright, let's get people an idea of what I'm talking about:

(a) pgd_ctor() is called once per pgd _object_, not once per page.
The code as posted does list_add() to the same page during
the ctor calls, and list_del() to the same page during dtor
calls when PAE is configured, for the precise reason that
there are multiple PAE pgd's per page. This is why mbligh hit
list poison. This patch restores the check for PTRS_PER_PMD == 1
in order to avoid oopsing on list poison.

(b) The entire pgd_ctor business (apart from the AGP fix, which was
actually no longer necessary once the rest was backed out) was
essentially backed out, since all preconstruction was backed out.
This restores the world to doing preconstruction as it should.

(c) clear_page_tables() only clears _userspace_ pmd's; pgd_free() isn't
clearing the trampoline pmd entries for the kernel pmd. This
patch separates out a slab for kernel pmd's to enforce the
separation (or otherwise bitblitting) _required_ by slab
preconstruction invariants and so fixes bad pmd bugs (I suppose
no one's reported them yet; perhaps not enough pmd_bad() checks
are around to catch them all or the ordering of allocations and
frees tends to be just right to avoid it for some odd reason).

(d) The #ifdef-lessness was backed out. Since everything else had to be
rearranged, this patch restores the #ifdef-lessness for free.
(This is actually a moderately useful property, since it means
all cases get compiletested regardless of .config).

(e) Either PAE or XKVA is enough to dodge needing pgd_lock and pgd_list
entirely. Don't touch them in those cases.

So the below is what it should actually look like. I didn't bother
waiting to test because there's a whole matrix to run through and that's
going to take ages, and apparently something needs to be out there to
demonstrate the right way to do this faster than that.

Yes, I'm testing (and bugfixing) the below myself.


-- wli


diff -prauN mm3-2.6.0-test2-1/arch/i386/mm/init.c mm3-2.6.0-test2-2/arch/i386/mm/init.c
--- mm3-2.6.0-test2-1/arch/i386/mm/init.c 2003-08-02 16:40:15.000000000 -0700
+++ mm3-2.6.0-test2-2/arch/i386/mm/init.c 2003-08-02 18:53:47.000000000 -0700
@@ -515,11 +515,13 @@ void __init mem_init(void)
load_LDT(&init_mm.context);
}

-kmem_cache_t *pgd_cache;
-kmem_cache_t *pmd_cache;
+kmem_cache_t *pgd_cache, *pmd_cache, *kpmd_cache;

void __init pgtable_cache_init(void)
{
+ void (*ctor)(void *, kmem_cache_t *, unsigned long);
+ void (*dtor)(void *, kmem_cache_t *, unsigned long);
+
if (PTRS_PER_PMD > 1) {
pmd_cache = kmem_cache_create("pmd",
PTRS_PER_PMD*sizeof(pmd_t),
@@ -529,13 +531,36 @@ void __init pgtable_cache_init(void)
NULL);
if (!pmd_cache)
panic("pgtable_cache_init(): cannot create pmd cache");
+
+ if (TASK_SIZE > PAGE_OFFSET) {
+ kpmd_cache = kmem_cache_create("kpmd",
+ PTRS_PER_PMD*sizeof(pmd_t),
+ 0,
+ SLAB_HWCACHE_ALIGN | SLAB_MUST_HWCACHE_ALIGN,
+ kpmd_ctor,
+ NULL);
+ if (!kpmd_cache)
+ panic("pgtable_cache_init(): "
+ "cannot create kpmd cache");
+ }
}
+
+ if (PTRS_PER_PMD == 1 || TASK_SIZE <= PAGE_OFFSET)
+ ctor = pgd_ctor;
+ else
+ ctor = NULL;
+
+ if (PTRS_PER_PMD == 1 && TASK_SIZE <= PAGE_OFFSET)
+ dtor = pgd_dtor;
+ else
+ dtor = NULL;
+
pgd_cache = kmem_cache_create("pgd",
PTRS_PER_PGD*sizeof(pgd_t),
0,
SLAB_HWCACHE_ALIGN | SLAB_MUST_HWCACHE_ALIGN,
- pgd_ctor,
- pgd_dtor);
+ ctor,
+ dtor);
if (!pgd_cache)
panic("pgtable_cache_init(): Cannot create pgd cache");
}
diff -prauN mm3-2.6.0-test2-1/arch/i386/mm/pgtable.c mm3-2.6.0-test2-2/arch/i386/mm/pgtable.c
--- mm3-2.6.0-test2-1/arch/i386/mm/pgtable.c 2003-08-02 16:40:15.000000000 -0700
+++ mm3-2.6.0-test2-2/arch/i386/mm/pgtable.c 2003-08-02 18:53:36.000000000 -0700
@@ -158,6 +158,17 @@ void pmd_ctor(void *pmd, kmem_cache_t *c
memset(pmd, 0, PTRS_PER_PMD*sizeof(pmd_t));
}

+void kpmd_ctor(void *__pmd, kmem_cache_t *cache, unsigned long flags)
+{
+ pmd_t *kpmd, *pmd;
+ kpmd = pmd_offset(&swapper_pg_dir[PTRS_PER_PGD-1],
+ (PTRS_PER_PMD - NR_SHARED_PMDS)*PMD_SIZE);
+ pmd = (pmd_t *)__pmd + (PTRS_PER_PMD - NR_SHARED_PMDS);
+
+ memset(__pmd, 0, (PTRS_PER_PMD - NR_SHARED_PMDS)*sizeof(pmd_t));
+ memcpy(pmd, kpmd, NR_SHARED_PMDS*sizeof(pmd_t));
+}
+
/*
* List of all pgd's needed so it can invalidate entries in both cached
* and uncached pgd's. This is essentially codepath-based locking
@@ -169,21 +180,60 @@ void pmd_ctor(void *pmd, kmem_cache_t *c
* could be used. The locking scheme was chosen on the basis of
* manfred's recommendations and having no core impact whatsoever.
* -- wli
+ *
+ * The entire issue goes away when XKVA is configured.
*/
spinlock_t pgd_lock = SPIN_LOCK_UNLOCKED;
LIST_HEAD(pgd_list);

+/*
+ * This is not that hard to figure out.
+ * (a) PTRS_PER_PMD == 1 means non-PAE.
+ * (b) PTRS_PER_PMD > 1 means PAE.
+ * (c) TASK_SIZE > PAGE_OFFSET means XKVA.
+ * (d) TASK_SIZE <= PAGE_OFFSET means non-XKVA.
+ *
+ * Do *NOT* back out the preconstruction like the patch I'm cleaning
+ * up after this very instant did, or at all, for that matter.
+ * This is never called when PTRS_PER_PMD > 1 && TASK_SIZE > PAGE_OFFSET.
+ * -- wli
+ */
void pgd_ctor(void *__pgd, kmem_cache_t *cache, unsigned long unused)
{
+ pgd_t *pgd = (pgd_t *)__pgd;
unsigned long flags;
- pgd_t *pgd0 = __pgd;

- spin_lock_irqsave(&pgd_lock, flags);
- list_add(&virt_to_page(pgd0)->lru, &pgd_list);
- spin_unlock_irqrestore(&pgd_lock, flags);
+ if (PTRS_PER_PMD == 1) {
+ if (TASK_SIZE <= PAGE_OFFSET)
+ spin_lock_irqsave(&pgd_lock, flags);
+ else
+ memcpy(&pgd[PTRS_PER_PGD - NR_SHARED_PMDS],
+ &swapper_pg_dir[PTRS_PER_PGD - NR_SHARED_PMDS],
+ NR_SHARED_PMDS * sizeof(pgd_t));
+ }
+
+ if (TASK_SIZE <= PAGE_OFFSET)
+ memcpy(pgd + USER_PTRS_PER_PGD,
+ swapper_pg_dir + USER_PTRS_PER_PGD,
+ (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
+
+ if (PTRS_PER_PMD > 1)
+ return;
+
+ if (TASK_SIZE > PAGE_OFFSET)
+ memset(pgd, 0, (PTRS_PER_PGD - NR_SHARED_PMDS)*sizeof(pgd_t));
+ else {
+ list_add(&virt_to_page(pgd)->lru, &pgd_list);
+ spin_unlock_irqrestore(&pgd_lock, flags);
+ memset(pgd, 0, USER_PTRS_PER_PGD*sizeof(pgd_t));
+ }
}

-/* never called when PTRS_PER_PMD > 1 */
+/*
+ * Never called when PTRS_PER_PMD > 1 || TASK_SIZE > PAGE_OFFSET
+ * for with PAE we would list_del() multiple times, and for non-PAE
+ * with XKVA all the AGP pgd shootdown code is unnecessary.
+ */
void pgd_dtor(void *pgd, kmem_cache_t *cache, unsigned long unused)
{
unsigned long flags; /* can be called from interrupt context */
@@ -193,87 +243,80 @@ void pgd_dtor(void *pgd, kmem_cache_t *c
spin_unlock_irqrestore(&pgd_lock, flags);
}

-#ifdef CONFIG_X86_PAE
-
+/*
+ * See the comments above pgd_ctor() wrt. preconstruction.
+ * Do *NOT* memcpy() here. If you do, you back out important
+ * anti- cache pollution code.
+ *
+ */
pgd_t *pgd_alloc(struct mm_struct *mm)
{
int i;
pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL);

- if (pgd) {
-#ifdef CONFIG_X86_4G_VM_LAYOUT
- pmd_t *pmd0, *kernel_pmd0;
-#endif
- pmd_t *pmd;
+ if (PTRS_PER_PMD == 1 || !pgd)
+ return pgd;

- for (i = 0; i < USER_PTRS_PER_PGD; ++i) {
- pmd = kmem_cache_alloc(pmd_cache, GFP_KERNEL);
- if (!pmd)
- goto out_oom;
- set_pgd(&pgd[i], __pgd(1 + __pa((u64)((u32)pmd))));
- }
+ /*
+ * In the 4G userspace case alias the top 16 MB virtual
+ * memory range into the user mappings as well (these
+ * include the trampoline and CPU data structures).
+ */
+ for (i = 0; i < USER_PTRS_PER_PGD; ++i) {
+ kmem_cache_t *cache;
+ pmd_t *pmd;

-#ifdef CONFIG_X86_4G_VM_LAYOUT
- /*
- * In the 4G userspace case alias the top 16 MB virtual
- * memory range into the user mappings as well (these
- * include the trampoline and CPU data structures).
- */
- pmd0 = pmd;
- kernel_pmd0 = (pmd_t *)__va(pgd_val(swapper_pg_dir[PTRS_PER_PGD-1]) & PAGE_MASK);
- memcpy(pmd0 + PTRS_PER_PMD - NR_SHARED_PMDS, kernel_pmd0 + PTRS_PER_PMD - NR_SHARED_PMDS, sizeof(pmd_t) * NR_SHARED_PMDS);
-#else
- memcpy(pgd + USER_PTRS_PER_PGD,
- swapper_pg_dir + USER_PTRS_PER_PGD,
- (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
-#endif
+ if (TASK_SIZE > PAGE_OFFSET && i == USER_PTRS_PER_PGD - 1)
+ cache = kpmd_cache;
+ else
+ cache = pmd_cache;
+
+ pmd = kmem_cache_alloc(cache, GFP_KERNEL);
+ if (!pmd)
+ goto out_oom;
+ set_pgd(&pgd[i], __pgd(1 + __pa((u64)((u32)pmd))));
}
+
return pgd;
out_oom:
+ /*
+ * we don't have to handle the kpmd_cache here, since it's the
+ * last allocation, and has either nothing to free or when it
+ * succeeds the whole operation succeeds.
+ */
for (i--; i >= 0; i--)
kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i])-1));
kmem_cache_free(pgd_cache, pgd);
return NULL;
}

-#else /* ! PAE */
-
-pgd_t *pgd_alloc(struct mm_struct *mm)
-{
- pgd_t *pgd = kmem_cache_alloc(pgd_cache, GFP_KERNEL);
-
- if (pgd) {
-#ifdef CONFIG_X86_4G_VM_LAYOUT
- memset(pgd, 0, PTRS_PER_PGD * sizeof(pgd_t));
- /*
- * In the 4G userspace case alias the top 16 MB virtual
- * memory range into the user mappings as well (these
- * include the trampoline and CPU data structures).
- */
- memcpy(pgd + PTRS_PER_PGD-NR_SHARED_PMDS,
- swapper_pg_dir + PTRS_PER_PGD-NR_SHARED_PMDS,
- NR_SHARED_PMDS * sizeof(pgd_t));
-#else
- memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
- memcpy(pgd + USER_PTRS_PER_PGD,
- swapper_pg_dir + USER_PTRS_PER_PGD,
- (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
-#endif
- }
- return pgd;
-}
-
-#endif /* CONFIG_X86_PAE */
-
void pgd_free(pgd_t *pgd)
{
int i;

- /* in the PAE case user pgd entries are overwritten before usage */
- if (PTRS_PER_PMD > 1)
- for (i = 0; i < USER_PTRS_PER_PGD; ++i)
- kmem_cache_free(pmd_cache, (void *)__va(pgd_val(pgd[i])-1));
/* in the non-PAE case, clear_page_tables() clears user pgd entries */
+ if (PTRS_PER_PMD == 1)
+ goto out_free;
+
+ /* in the PAE case user pgd entries are overwritten before usage */
+ for (i = 0; i < USER_PTRS_PER_PGD; ++i) {
+ kmem_cache_t *cache;
+ pmd_t *pmd = __va(pgd_val(pgd[i]) - 1);
+
+ /*
+ * only userspace pmd's are cleared for us
+ * by mm/memory.c; it's a slab cache invariant
+ * that we must separate the kernel pmd slab
+ * at all times, else we'll have bad pmd's.
+ */
+ if (TASK_SIZE > PAGE_OFFSET && i == USER_PTRS_PER_PGD - 1)
+ cache = kpmd_cache;
+ else
+ cache = pmd_cache;
+
+ kmem_cache_free(cache, pmd);
+ }
+out_free:
kmem_cache_free(pgd_cache, pgd);
}

diff -prauN mm3-2.6.0-test2-1/include/asm-i386/pgtable.h mm3-2.6.0-test2-2/include/asm-i386/pgtable.h
--- mm3-2.6.0-test2-1/include/asm-i386/pgtable.h 2003-08-02 16:40:24.000000000 -0700
+++ mm3-2.6.0-test2-2/include/asm-i386/pgtable.h 2003-08-02 18:29:46.000000000 -0700
@@ -32,12 +32,12 @@
#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
extern unsigned long empty_zero_page[1024];
extern pgd_t swapper_pg_dir[1024];
-extern kmem_cache_t *pgd_cache;
-extern kmem_cache_t *pmd_cache;
+extern kmem_cache_t *pgd_cache, *pmd_cache, *kpmd_cache;
extern spinlock_t pgd_lock;
extern struct list_head pgd_list;

void pmd_ctor(void *, kmem_cache_t *, unsigned long);
+void kpmd_ctor(void *, kmem_cache_t *, unsigned long);
void pgd_ctor(void *, kmem_cache_t *, unsigned long);
void pgd_dtor(void *, kmem_cache_t *, unsigned long);
void pgtable_cache_init(void);

2003-08-03 03:07:39

by Raphael Kubo da Costa

Subject: Re: 2.6.0-test2-mm3

Ok, here's my first patch. ;)
This is a fix for i386's fpu_system.h, which was causing an error during
the compilation of fpu_entry.c.

--- linux-2.6.0-test2-mm3/arch/i386/math-emu/fpu_system.h 2003-08-02 22:54:44.000000000 -0300
+++ linux-2.6.0-test2-mm3-fix/arch/i386/math-emu/fpu_system.h 2003-08-02 22:53:55.000000000 -0300
@@ -22,7 +22,7 @@

/* s is always from a cpu register, and the cpu does bounds checking
* during register load --> no further bounds checks needed */
-#define LDT_DESCRIPTOR(s) (((struct desc_struct *)current->mm->context.ldt)[(s) >> 3])
+#define LDT_DESCRIPTOR(s) (((struct desc_struct *)current->mm->context.ldt_pages)[(s) >> 3])
#define SEG_D_SIZE(x) ((x).b & (3 << 21))
#define SEG_G_BIT(x) ((x).b & (1 << 23))
#define SEG_GRANULARITY(x) (((x).b & (1 << 23)) ? 4096 : 1)


2003-08-03 05:19:27

by Zwane Mwaikambo

Subject: Re: 2.6.0-test2-mm3

On Sat, 2 Aug 2003, Andrew Morton wrote:

> . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> PS/2 or synaptics problems. You are all very bad.

It works now by disabling CONFIG_MOUSE_PS2_SYNAPTICS

Thanks,
Zwane

2003-08-03 05:34:38

by Zwane Mwaikambo

Subject: Re: 2.6.0-test2-mm3

On Sat, 2 Aug 2003, Andrew Morton wrote:

> Zwane Mwaikambo <[email protected]> wrote:
> >
> > On Sat, 2 Aug 2003, Andrew Morton wrote:
> >
> > > . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> > > PS/2 or synaptics problems. You are all very bad.
> >
> > It works now by disabling CONFIG_MOUSE_PS2_SYNAPTICS
> >
>
> err, that's a bug isn't it?

I've had a hard time following the saga behind the synaptics code. I know
there is some external thing you have to download but never got round to
doing it. I'll give that a go now too with CONFIG_MOUSE_PS2_SYNAPTICS.
Colour me lazy...

--
function.linuxpower.ca

2003-08-03 05:27:37

by Andrew Morton

Subject: Re: 2.6.0-test2-mm3

Zwane Mwaikambo <[email protected]> wrote:
>
> On Sat, 2 Aug 2003, Andrew Morton wrote:
>
> > . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> > PS/2 or synaptics problems. You are all very bad.
>
> It works now by disabling CONFIG_MOUSE_PS2_SYNAPTICS
>

err, that's a bug isn't it?

2003-08-03 05:38:56

by Joshua Kwan

Subject: Re: 2.6.0-test2-mm3

On Sun, Aug 03, 2003 at 01:22:51AM -0400, Zwane Mwaikambo wrote:
> > > It works now by disabling CONFIG_MOUSE_PS2_SYNAPTICS
> > >
> >
> > err, that's a bug isn't it?
>
> I've had a hard time following the saga behind the synaptics code. I know
> there is some external thing you have to download but never got round to
> doing it. I'll give that a go now too with CONFIG_MOUSE_PS2_SYNAPTICS.
> Colour me lazy...

I really don't understand the point behind the synaptics code. I would
have imagined it to be an extension to the generic PS/2 code that would
finally allow me to use my 'scroll buttons' on my trackpad, but it has
caused nothing but problems. I'm also kind of a console jockey, so I
really need GPM working, which is why I'm always booting with
psmouse_noext these days...

-Josh

--
Using words to describe magic is like using a screwdriver to cut roast beef.
-- Tom Robbins



2003-08-03 06:00:34

by Zwane Mwaikambo

Subject: Re: 2.6.0-test2-mm3

On Sun, 3 Aug 2003, Zwane Mwaikambo wrote:

> > err, that's a bug isn't it?
>
> I've had a hard time following the saga behind the synaptics code. I know
> there is some external thing you have to download but never got round to
> doing it. I'll give that a go now too with CONFIG_MOUSE_PS2_SYNAPTICS.
> Colour me lazy...

Ok, after downloading the XFree86 driver and enabling
CONFIG_MOUSE_PS2_SYNAPTICS everything is peachy, plus I get to use the
scroll buttons.

So it's confirmed working on my formerly 'broken' setup.

Thanks,
Zwane

2003-08-03 07:05:53

by Danek Duvall

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:

> . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> PS/2 or synaptics problems. You are all very bad.

I tried it on my Fujitsu P2120, hoping that the PS/2 resume patch would
help it wake up from S3 properly, but no such luck. The radeon
framebuffer doesn't restore, and the keyboard doesn't work. The mouse
might, but there's no way for me to tell.

If I remember correctly, the network functioned properly on resume in
test1-mm2, but doesn't in test2-mm3, so I had to do a reset.

Danek

2003-08-03 07:15:25

by Eugene Teo

Subject: Re: 2.6.0-test2-mm3

<quote sender="Danek Duvall">
> On Sat, Aug 02, 2003 at 03:22:02PM -0700, Andrew Morton wrote:
>
> > . I don't think anyone has reported on whether 2.6.0-test2-mm2 fixed any
> > PS/2 or synaptics problems. You are all very bad.
>
> I tried it on my Fujitsu P2120, hoping that the PS/2 resume patch would
> help it wake up from S3 properly, but no such luck. The radeon
> framebuffer doesn't restore, and the keyboard doesn't work. The mouse
> might, but there's no way for me to tell.
>
> If I remember correctly, the network functioned properly on resume in
> test1-mm2, but doesn't in test2-mm3, so I had to do a reset.

Do your logs say that the network is not functioning, yet syslog seems to
have been running all this while? Did you use radeontool to turn off your
LCD screen? FYI, I am using a Fujitsu E-7010.

Eugene

2003-08-03 07:28:58

by Danek Duvall

Subject: Re: 2.6.0-test2-mm3

On Sun, Aug 03, 2003 at 03:15:20PM +0800, Eugene Teo wrote:

> Do your logs say that the network is not functioning, yet syslog seems
> to have been running all this while? Did you use radeontool to turn off
> your LCD screen? FYI, I am using a Fujitsu E-7010.

There's nothing in my logs at all, so there's no way to tell what, if
anything, survived the resume. I'm not using radeontool; I hadn't even
been aware of it until now.

Danek

2003-08-03 07:37:24

by William Lee Irwin III

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 04:42:05PM -0700, Andrew Morton wrote:
> We already have a bucketload of highmem hacks in the kernel, and they are
> not sufficient for some people. We have several more (large) highmem hacks
> being proposed.

Please don't put page clustering anywhere near that blacklist. There's
a lot more to it than "gee, wli shrank mem_map[] again".


On Sat, Aug 02, 2003 at 04:42:05PM -0700, Andrew Morton wrote:
> wrt long-term kernel purity: one approach would be to not merge 4G+4G into
> 2.7 at all. This keeps the long-term kernel codebase saner. It assumes
> that the monster 32-bit boxes will have been obsoleted by 64-bit machines
> within 3-4 years and that it is acceptable to end-of-line those machines on
> a 2.6-based kernel. I think that's pretty safe.

Maybe some way to get feedback to/from cpu vendors about this would
help. If we really want to kill highmem dead in 2.7, beating cpu
vendors with a baseball bat until they^W^W^W^W^W^W^W^Wkindly asking
cpu vendors to kill that fucking PAE shit dead (goddammit!) might help.


-- wli

2003-08-03 07:45:07

by Eugene Teo

Subject: Re: 2.6.0-test2-mm3

<quote sender="Danek Duvall">
> On Sun, Aug 03, 2003 at 03:15:20PM +0800, Eugene Teo wrote:
>
> > Do your logs say that the network is not functioning, yet syslog seems
> > to have been running all this while? Did you use radeontool to turn off
> > your LCD screen? FYI, I am using a Fujitsu E-7010.
>
> There's nothing in my logs at all, so there's no way to tell what, if
> anything, survived the resume. I'm not using radeontool; I hadn't even
> been aware of it until now.

Check /var/log/messages. Did you get something like:

Aug 3 07:01:22 amaryllis -- MARK --
Aug 3 07:21:22 amaryllis -- MARK --
Aug 3 07:41:22 amaryllis -- MARK --
Aug 3 08:01:22 amaryllis -- MARK --
Aug 3 08:21:22 amaryllis -- MARK --
Aug 3 08:41:22 amaryllis -- MARK --
Aug 3 09:01:22 amaryllis -- MARK --
Aug 3 09:21:22 amaryllis -- MARK --
Aug 3 09:41:22 amaryllis -- MARK --

It shows that even though the laptop "freezes", it is
still functioning. I just can't bring it back to life
or resume it. The network is down, as is evident from
my getmail logs.
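
The MARK check described above can be sketched in C. This is only an illustration under stated assumptions: the function names `mark_seconds` and `largest_mark_gap` are hypothetical, the line format is taken from the sample above, and all lines are assumed to fall on the same day. A gap much larger than the usual spacing (20 minutes in the sample) would suggest that syslog itself stopped running for a while.

```c
#include <stdio.h>
#include <string.h>

/* Seconds since midnight for a syslog "-- MARK --" line, or -1 if the
 * line is not a MARK line or its timestamp does not parse.
 * Expected shape (from the sample log): "Aug 3 07:01:22 host -- MARK --" */
static int mark_seconds(const char *line)
{
    char month[4];
    int day, hour, min, sec;

    if (strstr(line, "-- MARK --") == NULL)
        return -1;
    if (sscanf(line, "%3s %d %d:%d:%d", month, &day, &hour, &min, &sec) != 5)
        return -1;
    return hour * 3600 + min * 60 + sec;
}

/* Largest gap in seconds between consecutive MARK lines, or -1 if fewer
 * than two MARK lines were found. Simplification: day rollover is ignored,
 * so all lines are assumed to share one date. */
static int largest_mark_gap(const char *lines[], int count)
{
    int prev = -1, largest = -1;

    for (int i = 0; i < count; i++) {
        int now = mark_seconds(lines[i]);
        if (now < 0)
            continue;            /* skip non-MARK lines */
        if (prev >= 0 && now - prev > largest)
            largest = now - prev;
        prev = now;
    }
    return largest;
}
```

Feeding it the lines of the log around the suspend time and comparing the result with the normal MARK interval gives a quick yes/no answer to "did syslog keep running?".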

radeontool is simply a userspace tool that turns off
your LCD backlight.

Eugene

2003-08-03 10:36:32

by Jose Luis Domingo Lopez

Subject: Re: 2.6.0-test2-mm3

On Saturday, 02 August 2003, at 23:57:07 -0300,
Raphael Kubo da Costa wrote:

> Ok, here's my first patch. ;)
> This is a fix for i386's fpu_system.h, which was causing an error during
> the compilation of fpu_entry.c.
>
> --- linux-2.6.0-test2-mm3/arch/i386/math-emu/fpu_system.h 2003-08-02
> 22:54:44.000000000 -0300
> +++ linux-2.6.0-test2-mm3-fix/arch/i386/math-emu/fpu_system.h 2003-08-02
> 22:53:55.000000000 -0300
> @@ -22,7 +22,7 @@
>
Your mailer seems to have messed up newlines, so the patch (that I have
no idea if it's correct or not) won't apply ;-)

Regards,

--
Jose Luis Domingo Lopez
Linux Registered User #189436 Debian Linux Sid (Linux 2.6.0-test2-mm2)

2003-08-03 15:10:33

by Marcelo Abreu

Subject: Re: 2.6.0-test2-mm3

Jose Luis Domingo Lopez wrote:
> Your mailer seems to have messed up newlines, so the patch (that I have
> no idea if it's correct or not) won't apply ;-)


The 4G patch has changed the 'ldt' member of mm_context_t, calling it
'ldt_pages'. Patch from Raphael fixes fpu_system.h for correct
compilation, but system won't boot with 'no387' parameter. So semantics
must have been changed too.

Maybe Ingo can review the effects of his changes on FPU emulation code.


Marcelo Abreu


2003-08-03 16:00:05

by Ingo Molnar

Subject: Re: 2.6.0-test2-mm3


On Sun, 3 Aug 2003, Marcelo Abreu wrote:

> The 4G patch has changed the 'ldt' member of mm_context_t, calling
> it 'ldt_pages'. Patch from Raphael fixes fpu_system.h for correct
> compilation, but system won't boot with 'no387' parameter. So semantics
> must have been changed too.

I sent the correct patch to Andrew already:

--- linux/arch/i386/math-emu/fpu_system.h.orig
+++ linux/arch/i386/math-emu/fpu_system.h
@@ -15,6 +15,7 @@
 #include <linux/sched.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <asm/atomic_kmap.h>
 
 /* This sets the pointer FPU_info to point to the argument part
    of the stack frame of math_emulate() */
@@ -22,7 +23,7 @@
 
 /* s is always from a cpu register, and the cpu does bounds checking
  * during register load --> no further bounds checks needed */
-#define LDT_DESCRIPTOR(s) (((struct desc_struct *)current->mm->context.ldt)[(s) >> 3])
+#define LDT_DESCRIPTOR(s) (((struct desc_struct *)__kmap_atomic_vaddr(KM_LDT_PAGE0))[(s) >> 3])
 #define SEG_D_SIZE(x) ((x).b & (3 << 21))
 #define SEG_G_BIT(x) ((x).b & (1 << 23))
 #define SEG_GRANULARITY(x) (((x).b & (1 << 23)) ? 4096 : 1)
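
For readers unfamiliar with the macro, here is a minimal user-space sketch of the indexing that LDT_DESCRIPTOR(s) performs. It is not kernel code: `mock_ldt`, `MOCK_LDT_ENTRIES`, `selector_index` and `mock_ldt_descriptor` are hypothetical names, `struct desc_struct` is reduced to two 32-bit words, and a plain array stands in for the page that the patched macro reads via `__kmap_atomic_vaddr(KM_LDT_PAGE0)`. The point is only that `(s) >> 3` drops the RPL and TI bits of a segment selector and leaves the table index.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct desc_struct. */
struct desc_struct {
    uint32_t a;
    uint32_t b;
};

#define MOCK_LDT_ENTRIES 16   /* hypothetical table size for this sketch */

/* Hypothetical array standing in for the mapped LDT page. */
static struct desc_struct mock_ldt[MOCK_LDT_ENTRIES];

/* Bits 0-1 of a selector are the RPL, bit 2 is the table indicator,
 * and bits 3-15 are the index into the descriptor table. */
static unsigned selector_index(uint16_t s)
{
    return (unsigned)s >> 3;
}

/* Same shape as LDT_DESCRIPTOR(s), but bounds-checked, because here the
 * selector does not come from a CPU register. */
static struct desc_struct *mock_ldt_descriptor(uint16_t s)
{
    unsigned idx = selector_index(s);

    assert(idx < MOCK_LDT_ENTRIES);
    return &mock_ldt[idx];
}

/* Helper macros as they appear in the patch context above. */
#define SEG_D_SIZE(x)      ((x).b & (3 << 21))
#define SEG_G_BIT(x)       ((x).b & (1 << 23))
#define SEG_GRANULARITY(x) (((x).b & (1 << 23)) ? 4096 : 1)
```

For example, selector 0x17 (index 2, TI set, RPL 3) resolves to `mock_ldt[2]`, and `SEG_GRANULARITY` on that entry yields 4096 only when bit 23 of its second word is set.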


2003-08-04 12:10:45

by jlnance

Subject: Re: 2.6.0-test2-mm3

On Sat, Aug 02, 2003 at 04:42:05PM -0700, Andrew Morton wrote:

> Bolting 64G of memory onto a 32-bit CPU is tasteless too...
>
> We already have a bucketload of highmem hacks in the kernel, and they are
> not sufficient for some people. We have several more (large) highmem hacks
> being proposed.

Do you see us removing the other highmem hacks if we add the 4G/4G patch?

Thanks,

Jim