short version:
1) When I am referencing a pointer in the kernel, is the value of that
pointer variable interpreted by the cpu as a logical or linear address?
2) if I have two overlapping data/stack segments presently selected,
each with a different base, how does the cpu know which segment/base
address to use to get the linear address?
longer version:
(please stop me if I'm just being stupid.)
As I understand it, logical addresses are interpreted as offsets from
the base address of some current segment descriptor. In at least the
2.0 kernel, the base address of the data segment selected by the
"KERNEL_DS" segment selector is 0xc0000000. So, to get a linear
address I would do this...
int anything;
unsigned long linear_anything_addr = ((unsigned long) &anything) + 0xc0000000;
right?
and...
I'm messing with stack segments, trying to give each kernel stack its
own segment (just for kicks). I have one data segment with a base
address of 0xc0000000 and a limit of 1GB(4GB?) that I think I use for
all data access. If I make a stack segment with a base address that
falls within the area governed by the data segment, say the base address
is 0xc00000ff, what happens when I do something like this?:
struct pt_regs * regs;
unsigned long * stack_pointer;
regs = ((struct pt_regs *) (current->kernel_stack_page + PAGE_SIZE))-1;
stack_pointer = (unsigned long *) regs->esp;
printk("%lu", *stack_pointer); /* what is this pointing to? */
So, what I'm thinking here is that the value of %esp is treated as an
offset from the base address of the stack segment, 0xc00000ff, but when
I use it in a non-stack-oriented context, it's possibly treated as an
offset from some other segment's base address. How does the cpu choose
which segment to use?
Thanks a lot.
- Ben
At some point in the past, someone had their attribution stripped from:
>> 1) When I am referencing a pointer in the kernel, is the value of that
>> pointer variable interpreted by the cpu as a logical or linear address?
>> 2) if I have two overlapping data/stack segments presently selected,
>> each with a different base, how does the cpu know which segment/base
>> address to use to get the linear address?
On Tue, Sep 16, 2003 at 05:48:04PM -0700, Martin J. Bligh wrote:
> IIRC, all the base segments are 0, so none of this matters ;-)
2.0.x used such things as the segment bits, hardware tasking, and so on.
-- wli
> 1) When I am referencing a pointer in the kernel, is the value of that
> pointer variable interpreted by the cpu as a logical or linear address?
>
> 2) if I have two overlapping data/stack segments presently selected,
> each with a different base, how does the cpu know which segment/base
> address to use to get the linear address?
IIRC, all the base segments are 0, so none of this matters ;-)
M.
> At some point in the past, someone had their attribution stripped from:
>>> 1) When I am referencing a pointer in the kernel, is the value of that
>>> pointer variable interpreted by the cpu as a logical or linear address?
>>> 2) if I have two overlapping data/stack segments presently selected,
>>> each with a different base, how does the cpu know which segment/base
>>> address to use to get the linear address?
>
> On Tue, Sep 16, 2003 at 05:48:04PM -0700, Martin J. Bligh wrote:
>> IIRC, all the base segments are 0, so none of this matters ;-)
>
> 2.0.x used such things as the segment bits, hardware tasking, and so on.
And happily is now supremely irrelevant ;-)
BTW, to the original question ... chapter 2 of "Understanding the Linux Kernel"
had a good explanation of all this.
M.
On Tue, Sep 16, 2003 at 05:58:08PM -0700, Martin J. Bligh wrote:
>
> BTW, to the original question ... chapter 2 of "Understanding the Linux Kernel"
> had a good explanation of all this.
Thank you. I've been reading the first edition. Is there a second?
The second chapter has a very good explanation of paging and how linear
addresses are used. Logical addresses, on the other hand, are barely
mentioned. Segmentation is described well, but the translation of
logical into linear addresses is not described.
I've read elsewhere that logical addresses consist of a 16-bit
segment selector and a 32-bit offset. I thought pointers were always
exactly 32 bits (on 32-bit Intel). Where is the 16-bit selector?
Thanks again.
- Ben
On Tue, Sep 16, 2003 at 06:44:21PM -0700, Ben Johnson wrote:
> Thank you. I've been reading the first edition. is there a second?
> the second chapter has a very good explanation of paging and how linear
> addresses are used. logical addresses on the other hand are barely
> mentioned. Segmentation is described well, but the translation of
> logical into linear addresses is not described.
> I've read elsewhere that logical addresses are comprised of a 16-bit
> segment selector and a 32-bit offset. I thought pointers were always
> exactly 32-bits (on 32-bit intel). where is the 16-bit selector?
You might want to look at Intel's volume 3. The selectors are kept in
dedicated registers separate from the pointers and used implicitly.
-- wli
On Wed, 2003-09-17 at 03:44, Ben Johnson wrote:
> On Tue, Sep 16, 2003 at 05:58:08PM -0700, Martin J. Bligh wrote:
> >
> > BTW, to the original question ... chapter 2 of "Understanding the Linux Kernel"
> > had a good explanation of all this.
>
> Thank you. I've been reading the first edition. is there a second?
> the second chapter has a very good explanation of paging and how linear
> addresses are used. logical addresses on the other hand are barely
> mentioned. Segmentation is described well, but the translation of
> logical into linear addresses is not described.
>
> I've read elsewhere that logical addresses are comprised of a 16-bit
> segment selector and a 32-bit offset. I thought pointers were always
> exactly 32-bits (on 32-bit intel). where is the 16-bit selector?
It has been a long time since I read about the i386 addressing mechanisms,
but if my memory serves me well:
The selector is a 16-bit number which points to a segment descriptor in
either the GDT or the LDT.
The LDT is a Local Descriptor Table which is particular to a given task.
The CPU has a task register (TR) which selects the task's TSS, a region
of memory used to store CPU registers when a hardware context switch is
performed; the TSS also holds the selector of the LDT for that task. The
GDT is global to the system and shared by all tasks.
Basically, the segment descriptor tells what kind of segment it is
(code, data, stack), defines its privilege level (the ring, 0 being the
most privileged and 3 the least), whether it's present in memory, and
defines its base linear address and its size (either in bytes or in 4KB
pages, depending on its granularity).
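A minimal sketch of how those fields are packed, assuming the standard 8-byte i386 segment descriptor layout described in the Intel manuals. The struct and function names are invented for illustration; this is not kernel code.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields of one descriptor. */
struct seg_desc_fields {
    uint32_t base;            /* 32-bit linear base address */
    uint32_t byte_limit;      /* highest valid offset, in bytes */
    unsigned type;            /* 4-bit type field */
    unsigned is_code_or_data; /* S bit: 1 = code/data, 0 = system */
    unsigned dpl;             /* privilege level, 0 (most) to 3 (least) */
    unsigned present;         /* P bit */
    unsigned page_granular;   /* G bit: limit counted in 4KB pages */
};

/* Unpack a raw 64-bit descriptor into its fields. */
struct seg_desc_fields decode_descriptor(uint64_t raw)
{
    struct seg_desc_fields f;
    /* The 20-bit limit is split: bits 0-15 and bits 48-51. */
    uint32_t limit = (uint32_t)(raw & 0xFFFF) |
                     ((uint32_t)((raw >> 48) & 0xF) << 16);

    /* The base is split too: bits 16-39 and bits 56-63. */
    f.base = (uint32_t)((raw >> 16) & 0xFFFFFF) |
             ((uint32_t)((raw >> 56) & 0xFF) << 24);
    f.type = (unsigned)((raw >> 40) & 0xF);
    f.is_code_or_data = (unsigned)((raw >> 44) & 1);
    f.dpl = (unsigned)((raw >> 45) & 3);
    f.present = (unsigned)((raw >> 47) & 1);
    f.page_granular = (unsigned)((raw >> 55) & 1);
    /* With G set, the limit counts 4KB pages; convert it to bytes. */
    f.byte_limit = f.page_granular ? ((limit << 12) | 0xFFF) : limit;
    return f;
}
```

For example, decode_descriptor(0xC0C392000000FFFFULL) gives a base of 0xc0000000 and a byte limit of 0x3fffffff (1GB) at privilege level 0.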
A selector is stored in any of the CPU segment registers (CS, DS, ES,
FS, GS and SS).
Thus from the <selector:offset> pair, you get the base address from the
segment descriptor pointed to by the selector, and then add the offset
to it. The resulting linear address is then converted by the MMU to a
physical address using the page tables. Note that even though
selector+offset is a 48-bit quantity, the resulting linear address is
always a 32-bit address.
Modern processors can address 36-bit physical addresses using PAE, but
that reminds me of the times when we mapped Expanded Memory into 64KB
frames below 1MB of RAM.
I recommend reading the Intel manuals, they are worth it :-)
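The translation described above can be sketched roughly as follows. This is a simplified model, not how any kernel does it: it assumes a single descriptor table (the GDT/LDT choice is decoded but otherwise ignored), expand-up segments only, and a byte-sized access. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-decoded descriptor: just base and byte limit. */
struct sketch_segment {
    uint32_t base;
    uint32_t byte_limit;
};

/* A selector packs a table index, a table indicator and an RPL. */
unsigned selector_index(uint16_t sel)    { return sel >> 3; }
unsigned selector_uses_ldt(uint16_t sel) { return (sel >> 2) & 1; }
unsigned selector_rpl(uint16_t sel)      { return sel & 3; }

/*
 * Turn selector:offset into a linear address.
 * Returns 0 on success, -1 where real hardware would raise a fault.
 */
int logical_to_linear(const struct sketch_segment *table, size_t entries,
                      uint16_t sel, uint32_t offset, uint32_t *linear)
{
    unsigned idx = selector_index(sel);

    if (idx == 0 || idx >= entries)
        return -1;                      /* null selector or outside table */
    if (offset > table[idx].byte_limit)
        return -1;                      /* offset beyond the segment limit */
    *linear = table[idx].base + offset; /* 32-bit add, wraps modulo 2^32 */
    return 0;
}
```

With a table whose entry 3 has base 0xc0000000, selector 0x18 and offset 0x1000 yield linear address 0xc0001000. The selector never appears in the pointer itself; it comes from whichever segment register the instruction uses.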
On Tue, Sep 16, 2003 at 06:55:27PM -0700, William Lee Irwin III wrote:
>> You might want to look at intel's volume 3. They're kept in dedicated
>> registers separate from the pointers and used implicitly.
On Tue, Sep 16, 2003 at 07:07:23PM -0700, Ben Johnson wrote:
> I've been reading that too. The problem is that there are 6 segment
> selector registers and 4 of those are just for data segments. several
> data segments can be in use simultaneously and they can all have
> different base addresses and limits. The only explanation I've found so
> far about how a segment is chosen is that logical address are 48-bit
> values, yet sizeof(void *) == 4. there has to be a way to match up
> pointer with a segment, but I am unable to find it so far. (maybe I
> need a nap.)
Logical addresses aren't 48-bit; they're just offset from linear (and
vice-versa).
The way the extra data segment registers are used is by explicitly
qualifying operands with segments.
-- wli
On Tue, Sep 16, 2003 at 06:55:27PM -0700, William Lee Irwin III wrote:
>
> You might want to look at intel's volume 3. They're kept in dedicated
> registers separate from the pointers and used implicitly.
I've been reading that too. The problem is that there are 6 segment
selector registers and 4 of those are just for data segments. several
data segments can be in use simultaneously and they can all have
different base addresses and limits. The only explanation I've found so
far about how a segment is chosen is that logical addresses are 48-bit
values, yet sizeof(void *) == 4. There has to be a way to match up a
pointer with a segment, but I am unable to find it so far. (Maybe I
need a nap.)
Thanks,
- Ben
On Tue, Sep 16, 2003 at 07:10:45PM -0700, William Lee Irwin III wrote:
>
> The way the extra data segment registers are used is by explicitly
> qualifying operands with segments.
ah! now that makes sense.
So, I'm guessing the DS register is used by default to select the base
address for all non-stack-oriented operations. And I bet the SS
register is used by default for stack-oriented operations, and all ops
that act on %esp (and %ebp?). I think that makes my life easier.
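That guess matches the default rule for ordinary memory operands on the i386, which can be sketched as below. The enum and function names are made up for illustration, and the model deliberately leaves out instruction fetches (always CS), push/pop (always SS) and string instructions.

```c
/* Hypothetical names for the segment registers and base registers. */
enum seg_reg { SEG_CS, SEG_DS, SEG_ES, SEG_FS, SEG_GS, SEG_SS, SEG_NONE };
enum base_reg {
    BASE_EAX, BASE_EBX, BASE_ECX, BASE_EDX,
    BASE_ESI, BASE_EDI, BASE_ESP, BASE_EBP,
    BASE_ABSENT /* plain displacement, no base register */
};

/*
 * Which segment register supplies the base for a memory operand.
 * 'override' is SEG_NONE unless the instruction carries an explicit
 * segment override prefix (e.g. %fs:4(%edx) in AT&T syntax).
 */
enum seg_reg segment_for_operand(enum base_reg base, enum seg_reg override)
{
    if (override != SEG_NONE)
        return override;        /* explicit qualification wins */
    if (base == BASE_ESP || base == BASE_EBP)
        return SEG_SS;          /* stack-relative addressing */
    return SEG_DS;              /* everything else */
}
```

So 4(%esp) and 4(%ebp) go through SS, 4(%edx) goes through DS, and the extra data segment registers only come into play when an operand is explicitly qualified.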
Thanks a lot!
- Ben
On Tue, 16 Sep 2003, Ben Johnson wrote:
> short version:
>
> 1) When I am referencing a pointer in the kernel, is the value of that
> pointer variable interpreted by the cpu as a logical or linear address?
>
> 2) if I have two overlapping data/stack segments presently selected,
> each with a different base, how does the cpu know which segment/base
> address to use to get the linear address?
>
[SNIPPED...]
All stack offsets are accessed relative to SS. No exceptions.
However a compiler may calculate those offsets based upon
something else.
This is why DS must equal SS if 'C' is going to access both
stack data variables and data segment variables. This is how
the 'C' code converter is set up. It is not a CPU limitation.
If you change the SS in the kernel, strange and wonderful
things will occur.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.22 on an i686 machine (794.73 BogoMips).
Note 96.31% of all statistics are fiction.
On Wed, Sep 17, 2003 at 07:39:53AM -0400, Richard B. Johnson wrote:
>
> All stack offsets are accessed relative to SS. No exceptions.
> However a compiler may calculate those offsets based upon
> something else.
> This is why DS must equal SS if 'C' is going to access both
> stack data variables and data segment variables. This is how
> the 'C' code converter is set up. It is not a CPU limitation.
> If you change the SS in the kernel, strange and wonderful
> things will occur.
Let me see if I understand you. If SS and DS point to segments that
have different base addresses then code like this... (I'm an assembly
newbie. hope I get this right.)
# get whatever is at %ss:%esp + 4 and put it in eax
movl 4(%esp), %eax
movl %esp, %edx
# get whatever is at %ds:%edx + 4 and put it in eax
movl 4(%edx), %eax
# eax probably changed twice because while esp and edx have same value,
# if SS->baseaddr != DS->baseaddr, then (%esp) and (%edx) don't point to
# the same memory location.
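The arithmetic behind that comment, as a rough sketch. The base values in the usage note are the ones from earlier in the thread (data segment at 0xc0000000, stack segment at 0xc00000ff); the function names are hypothetical.

```c
#include <stdint.h>

/* Linear address reached by 'offset' through a segment with 'seg_base'. */
uint32_t linear_address(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;   /* 32-bit add, wraps modulo 2^32 */
}

/* Nonzero when one offset names the same byte through both segments. */
int same_location(uint32_t ss_base, uint32_t ds_base, uint32_t offset)
{
    return linear_address(ss_base, offset) == linear_address(ds_base, offset);
}

/*
 * The adjustment a compiler would have to apply before dereferencing
 * an SS-relative offset through DS when the two bases differ.
 */
uint32_t ss_offset_to_ds_offset(uint32_t ss_base, uint32_t ds_base,
                                uint32_t ss_offset)
{
    return ss_offset + ss_base - ds_base;
}
```

With %esp = 0x1f00, 4(%esp) reaches linear 0xc0002003 through SS, while 4(%edx) reaches linear 0xc0001f04 through DS. The two only agree when the bases are equal, or when the offset is adjusted as in ss_offset_to_ds_offset(), which ordinary compiled C does not do.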
I'm pretty sure I've seen plenty of code like this, which must mean,
like you just told me, that the C compiler assumes the base address of
DS and SS are the same. So, if I want to change segment base addresses
then I'm up shit creek.
Thanks very much for the info!
- Ben
On Wed, 17 Sep 2003, Ben Johnson wrote:
> On Wed, Sep 17, 2003 at 07:39:53AM -0400, Richard B. Johnson wrote:
> >
> > All stack offsets are accessed relative to SS. No exceptions.
> > However a compiler may calculate those offsets based upon
> > something else.
> > This is why DS must equal SS if 'C' is going to access both
> > stack data variables and data segment variables. This is how
> > the 'C' code converter is set up. It is not a CPU limitation.
> > If you change the SS in the kernel, strange and wonderful
> > things will occur.
>
> Let me see if I understand you. If SS and DS point to segments that
> have different base addresses then code like this... (I'm an assembly
> newbie. hope I get this right.)
>
> # get whatever is at %ss:%esp + 4 and put it in eax
> movl 4(%esp), %eax
> movl %esp, %edx
> # get whatever is at %ds:%edx + 4 and put it in eax
> movl 4(%edx), %eax
>
> # eax probably changed twice because while esp and edx have same value,
> # if SS->baseaddr != DS->baseaddr, then (%esp) and (%edx) don't point to
> # the same memory location.
Correct. You can make segments with different attributes, like
you can make the stack non-executable, etc. However, the base
addresses need to be the same so that DS:[0], ES:[0], and SS:[0]
all point to the same memory location or else the offsets computed
by the 'C' compiler won't work.
You can force an assembler to calculate different offsets. For instance,
you could have the base of DS be one page lower (0x1000) than SS.
Then you could make the assembler calculate the offsets based upon that
difference (in the Intel assembler, TASM, and MASM, with the ASSUME
directive). But with most 'C' compilers, you are stuck with what you have.
>
> I'm pretty sure I've seen plenty of code like this, which must mean,
> like you just told me, that the C compiler assumes the base address of
> DS and SS are the same. So, if I want to change segment base addresses
> then I'm up shit creek.
>
Pretty much unless you want to rewrite everything that accesses data
on the stack in assembly!
> Thanks very much for the info!
>
If you wanted to write everything in assembly (shudder), in
principle, you could set up a stack segment that pointed
to entirely different RAM (real RAM) than the data segment.
However, it would mean that you need to rewrite everything
that copies data to/from local data (the stack), etc. Without
that, memcpy(), etc., wouldn't work. In fact, the Intel string
instructions like movsb, movsw, movsl, etc., assume DS:ESI, ES:EDI, for
index (pointer) registers. Note also that there are two seldom-used
segment registers, FS and GS.
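To make the DS:ESI, ES:EDI point concrete, here is a toy model of "rep movsb" with the direction flag clear. It is only a sketch: a byte array stands in for linear memory, limit checks are omitted, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical snapshot of the state "rep movsb" depends on. */
struct string_op_state {
    uint32_t ds_base;   /* base of the segment selected by DS */
    uint32_t es_base;   /* base of the segment selected by ES */
    uint32_t esi;       /* source offset, relative to DS */
    uint32_t edi;       /* destination offset, relative to ES */
    uint32_t ecx;       /* remaining byte count */
};

/*
 * Copy ECX bytes from DS:ESI to ES:EDI, one byte at a time.
 * The caller must make sure every index stays inside linear_mem.
 */
void sketch_rep_movsb(uint8_t *linear_mem, struct string_op_state *s)
{
    while (s->ecx != 0) {
        linear_mem[s->es_base + s->edi] = linear_mem[s->ds_base + s->esi];
        s->esi++;
        s->edi++;
        s->ecx--;
    }
}
```

If the two bases differ, equal values in ESI and EDI refer to different memory, which is why a memcpy() built on these instructions only behaves as C expects when DS and ES (and SS, for stack buffers) share a base.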
Cheers,
Dick Johnson
Richard B. Johnson wrote:
> Note also that there are two seldom-used segment registers, FS and GS.
GS is used for thread-local storage in userspace now, since NPTL
became the standard Linux threading library.
-- Jamie