echo: os2prog
to: Ivan Todoroski
from: Denis Tonn
date: 1998-12-31 15:07:00
subject: Some weird addressing...

Original from  IVAN TODOROSKI  to DENIS TONN on 12-29-1998
Original Subject: Some weird addressing...

                         ---------------------------------------

IT>   Hey, what's all this 64 KB allocation stuff?!

 It's not 64K allocation, it's allocation units occurring on 64K 
boundaries.

IT>   I thought that was gone?

 It is. A 32 bit program uses a "flat" selector with a base of zero 
and a limit of 512MB (448MB for Warp 3 GA). 

IT>    Do you get a single descriptor with a 1000000 byte limit, or a bunch
IT>    of small descriptors with 65536 limits?

 The 32 bit program uses 3 "flat" 32 bit selectors (CS, DS, and ES), 
all with a base of zero and a limit of 512MB. A 32 bit application can
ignore selectors. It is ONLY when it needs to pass an address to a
16:16 piece of code that the tiling of the LDT comes into play... 

IT>   What's going on inside there? :)

 OK.. As you suggest, an example might help. 

 Let's say you have a 32 bit app that calls DosAllocMem for 132K bytes. 
Alloc returns an address to you of (say) 00050000h where you have 132K 
available. Now your program calls Alloc again for 204K, Alloc will 
return an address of 00080000h where you have 204K available. Your 32 
bit program can use these addresses directly, and can access any of
the addresses between 00050000-00070FFF (132K) and 00080000-000B2FFF 
(204K).
 Let's add 2 more allocations of 4K only. Alloc returns an address of 
000C0000 where you have only 4K usable for the first one, and an 
address of 000D0000 for the second one, again only 4K usable. Your 32 
bit app can now use addresses between 000C0000-000C0FFF and 
000D0000-000D0FFF. 
 Your app CANNOT access addresses between 00071000-0007FFFF,
000B3000-000BFFFF, 000C1000-000CFFFF, 000D1000-000DFFFF, etc.. 

 The allocations are in units of 4K (always rounded up), but *start*
on a 64K boundary. Your 32 bit application uses these 32 bit addresses 
directly, using the flat CS, DS, and ES selectors, and the page 
tables prevent you from using any other addresses. All is as you 
expect and understand (except for the 64K allocation boundaries).  
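
 The layout in this example can be reproduced with a little 
arithmetic. What follows is only a model of the placement rule (round 
the size up to 4K, start the next block on the next 64K boundary); it 
is not DosAllocMem, and the names Block and model_alloc are made up 
for illustration. It also assumes blocks are handed out back to back, 
which the real system does not promise.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u    /* 4K allocation granularity */
#define TILE_SIZE 0x10000u   /* 64K placement boundary */

/* Hypothetical model of the address layout only; not an OS/2 API. */
typedef struct {
    uint32_t base;  /* first usable address, always 64K aligned */
    uint32_t last;  /* last usable address (size rounded up to 4K) */
} Block;

/* Place a block of 'size' bytes at *next, then advance *next to the
   first 64K boundary past the end of the block. */
Block model_alloc(uint32_t *next, uint32_t size)
{
    Block b;
    uint32_t usable;

    assert(size > 0 && *next % TILE_SIZE == 0);
    usable = (size + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
    b.base = *next;
    b.last = b.base + usable - 1;
    *next = (b.last / TILE_SIZE + 1) * TILE_SIZE;
    return b;
}
```

 Starting with next = 00050000h and asking for 132K, 204K, 4K and 4K in 
turn gives the same four ranges as listed above.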

 The reason for allocating on 64K boundaries is so that *IF* your 
program needs to pass an address to older 16:16 code, it can be 
converted easily to a 16:16 form that the older code will understand. 
 The flat selectors assigned to the 32 bit app are in the GDT. Behind
the scenes, the system *also* builds LDT selectors that "map" the same
ranges of memory (during the Alloc). Your 32 bit app does *not* use
these selectors, they are just there in case you need to convert a 32
bit address into a 16:16 form. In the above example, the system would 
have built LDT entries in the following layout:
   
  Selector 002F    Base 00050000 Limit FFFF \
  Selector 0037    Base 00060000 Limit FFFF   Maps 132K allocation
  Selector 003F    Base 00070000 Limit 0FFF /
  Selector 0047    Base 00080000 Limit FFFF \
  Selector 004F    Base 00090000 Limit FFFF   Maps 204K allocation
  Selector 0057    Base 000A0000 Limit FFFF
  Selector 005F    Base 000B0000 Limit 2FFF /
  Selector 0067    Base 000C0000 Limit 0FFF   Maps 4K allocation
  Selector 006F    Base 000D0000 Limit 0FFF   Maps 4K allocation
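
 The table can be derived mechanically from the base and size of each 
allocation. The sketch below is a model only: the Tile struct and 
build_tiles are hypothetical names, the real descriptors are built by 
the system, and the size is assumed to be already rounded up to 4K.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one tiled LDT entry. */
typedef struct {
    uint16_t selector;  /* ring 3 LDT selector for this tile */
    uint32_t base;      /* 64K aligned base address */
    uint16_t limit;     /* last valid offset within the tile */
} Tile;

/* Fill 'out' with the tiles that cover 'size' bytes starting at 'base'
   and return how many were written (at most 'max'). */
int build_tiles(uint32_t base, uint32_t size, Tile *out, int max)
{
    int n = 0;

    assert(base % 0x10000u == 0 && size > 0);
    while (size > 0 && n < max) {
        uint32_t chunk = size > 0x10000u ? 0x10000u : size;

        out[n].selector = (uint16_t)(((base >> 16) << 3) | 7);
        out[n].base = base;
        out[n].limit = (uint16_t)(chunk - 1);
        base += 0x10000u;
        size -= chunk;
        n++;
    }
    return n;
}
```

 For the 204K block at 00080000 this yields four tiles, the last one 
being selector 005F with a limit of 2FFF, as in the table.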

 Now, *IF* your application needed to convert one of the 32 bit 
addresses to a 16:16 form, it can easily do so. Say that you wanted a 
field at address 000B1234 to be updated by some 16:16 code (an old DLL 
for example). You need to convert this 32 bit address into something 
the 16:16 code can handle. Because of the tiling of the LDT, this is 
very simple. Take the upper 16 bits of the 32 bit address and shift to
the left by 3 bits (times 8) and (since you are trying to convert this
into a ring 3 LDT selector) turn on the last 3 bits (plus 7). In this 
example 000B*8+7=005F. That becomes our "selector" part of the 16:16 
address. The last 16 bits of the 32 bit address become the offset part
of the 16:16 address, thus 005F:1234 is the address we pass to the 
16:16 code to use. The LDT entry (and the page tables) prevents the
16 bit code from accessing beyond the allocation unit (limit of 2FFF). 
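
 In C the conversion just described comes down to a shift, an OR and a 
mask. This is a sketch of the arithmetic only; Far16 and 
flat_to_far16 are names invented here, and real code would normally 
rely on whatever conversion support its compiler or toolkit provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for a 16:16 address. */
typedef struct {
    uint16_t selector;
    uint16_t offset;
} Far16;

/* Convert a flat 32 bit address into the tiled 16:16 form. */
Far16 flat_to_far16(uint32_t flat)
{
    Far16 p;

    assert(flat <= 0x1FFFFFFFu);  /* only addresses below 512MB are tiled */
    /* upper 16 bits * 8, then set the low 3 bits (LDT, ring 3) */
    p.selector = (uint16_t)(((flat >> 16) << 3) | 7);
    /* lower 16 bits are used unchanged as the offset */
    p.offset = (uint16_t)(flat & 0xFFFFu);
    return p;
}
```

 Calling flat_to_far16(0x000B1234) gives selector 005F and offset 1234, 
matching the worked example.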

 To convert a 16:16 address form back to 32 bit we just reverse the 
process. Shift the selector portion to the right by 3 bits (integer 
divide by 8), which becomes the upper 16 bits of the 32 bit address, 
and the offset portion becomes the lower 16 bits of the 32 bit 
address. This allows your 32 bit code to use a 32 bit address to 
access memory that is allocated via DosAllocSeg by a 16:16 piece of 
code.
 (One caution is that the 32 bit code will ALWAYS have access to a 
full page, whereas the 16:16 code can have a selector limit less than
a 4K boundary when allocated via AllocSeg).
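
 The reverse direction, again as a sketch of the arithmetic only. The 
function name far16_to_flat is invented here, and the check on the low 
3 bits simply assumes the selector is one of the tiled ring 3 LDT 
selectors discussed in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a tiled 16:16 address back into a flat 32 bit address. */
uint32_t far16_to_flat(uint16_t selector, uint16_t offset)
{
    /* assumption: a ring 3 LDT selector, so the low 3 bits are all set */
    assert((selector & 7) == 7);
    /* selector / 8 becomes the upper 16 bits, offset the lower 16 bits */
    return ((uint32_t)(selector >> 3) << 16) | offset;
}
```

 far16_to_flat(0x005F, 0x1234) returns 000B1234, the address the 
earlier example started from.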

 Note: Obviously the "size" of each memory block passed to the 16:16 
code via this address conversion technique cannot cross a 64K boundary. 
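
 A stub that hands buffers to 16:16 code could test for this before 
converting the address. A minimal sketch, with fits_in_one_tile being 
a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if the 'len' bytes starting at 'flat' lie inside a single
   64K tile (and so can be reached through one 16:16 selector), else 0. */
int fits_in_one_tile(uint32_t flat, uint32_t len)
{
    assert(len > 0);
    if (len > 0x10000u)
        return 0;
    /* offset within the tile plus the length must not pass 64K */
    return (flat & 0xFFFFu) + len <= 0x10000u;
}
```

 A 256 byte field at 000B1234 passes the test; a 32 byte field at 
0008FFF0 does not, because it runs over into the next tile.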
 
 Hope this explains the reason for allocating units on 64K boundaries. 
It is only for compatibility with older 16:16 code. Your 32 bit app
does not normally know (or care) that the system has built "mappings"
for all your allocation units in the LDT. They are there just in case
a conversion needs to take place between a flat 32 bit address and a
16:16 address. Since the allocations always occur on a 64K boundary
(tiling of the LDT), the 16 bit offset can be used directly, and the
selector can be "created" on the fly quickly. This allows older 16:16
code to coexist with flat 32 bit code. The 32 bit code must "know"
that it is calling older code, or a "stub" must be inserted between
the 2 pieces that knows the parms and how to convert addresses between
the formats. 

 Again, keep in mind that the 32 bit code does not "use" the tiled LDT
selectors, it uses flat 32 bit selectors with 32 bit "offsets" 
(effectively a 32 bit "address" with a flat selector). The ONLY reason 
for LDT tiling is for address conversion from flat to 16:16 form. It's
all the same "memory", just accessed with different address formats. 


 Aside: The largest 32 bit address that can be converted in this 
fashion is 1FFFFFFF (512MB). The region below 512MB is called the 
"compatibility region" because of this. The only versions of OS/2 that 
allow applications to allocate memory above this region are Warp 
Server SMP and the Aurora beta. Addresses in the Himem region (2.5GB)
cannot be "converted" for use by older 16:16 code in this fashion.
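
 The 512MB figure falls out of the selector format: a selector has a 
13 bit index, so an LDT can hold at most 8192 descriptors, and 8192 
tiles of 64K each cover exactly 512MB. A small sketch of that 
arithmetic (the function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define LDT_ENTRIES 8192u     /* 13 bit selector index */
#define TILE_SIZE   0x10000u  /* each tile maps 64K */

/* Size of the address range that LDT tiling can cover. */
uint32_t tiled_region_size(void)
{
    return LDT_ENTRIES * TILE_SIZE;
}

/* Return 1 if 'flat' lies in the region that can be converted. */
int is_convertible(uint32_t flat)
{
    return flat < tiled_region_size();
}

/* Index of the LDT tile that maps 'flat'; only defined below 512MB. */
uint32_t tile_index(uint32_t flat)
{
    assert(is_convertible(flat));
    return flat >> 16;
}
```

 Address 1FFFFFFF is the last one that passes is_convertible and maps 
to tile index 1FFF, the final LDT slot.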


 The fallout of all this is that it is "better" in terms of address 
space utilization to allocate a single large (64K minimum) block of
memory than to allocate a series of smaller units. Memory is still
allocated in 4K "blocks", but the location (address) of each
allocation is on a 64K boundary. If you allocate a LOT of small blocks,
you could run into "address space" exhaustion. You might still have
lots of "memory" that can be allocated (it's all virtual anyway), just
no available "addresses" to assign to that memory. Your "address
space" has been used up. Rewrite your program to allocate a large
block and then suballocate (see DosSubAllocMem) from that pool and you
don't have the problem. 
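
 To put rough numbers on that, the sketch below compares the address 
space an allocation takes up with the memory actually behind it. It 
assumes the whole 512MB region is available to the application, which 
it never is in practice (the system, DLLs and the program itself live 
there too), so treat the result as an upper bound. The function names 
are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x1000u    /* memory is handed out in 4K pages */
#define TILE_SIZE 0x10000u   /* each allocation starts on a 64K boundary */

/* Address space taken up by one allocation: whole 64K tiles. */
uint32_t address_space_used(uint32_t size)
{
    assert(size > 0);
    return (size + TILE_SIZE - 1) / TILE_SIZE * TILE_SIZE;
}

/* Memory actually backing one allocation: whole 4K pages. */
uint32_t memory_used(uint32_t size)
{
    assert(size > 0);
    return (size + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;
}

/* Upper bound on how many allocations of 'size' bytes fit into
   'region' bytes of address space. */
uint32_t max_allocations(uint32_t region, uint32_t size)
{
    return region / address_space_used(size);
}
```

 With 4K blocks, max_allocations(0x20000000, 4096) is 8192, so the 
addresses run out after at most 32MB of memory has been handed out. 
The same 512MB carved into 64K blocks also gives 8192 allocations, 
but every byte of the address space is usable.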

 Warp Server SMP and Aurora allocate on 64K boundaries in the 
compatibility region and 4K boundaries in the himem region (above
512MB). 
 

 Another aside: There is no reason a 32 bit piece of code could not 
use a 16:16 address directly. It is faster (in large blocks of memory)
to use a flat 32 bit address. The "speed" increases in 32 bit code
come from NOT having to reload the selector registers all the time. I
occasionally see 32 bit code that does this, but not often. 
 Older 16:16 code cannot normally use a 32 bit offset. 16:16 code 
*CAN* use a size override on the instruction, but that means the
16:16 code would not be "old" per se. It must "know" to expect 32 bit
addresses. The only places I have seen such 16:16 code is in the
kernel and in some device drivers, neither of them "old", always
written in assembler. 



 Oh, and if anyone wants to point it out, I have not covered any of 
the differences between Warp 3 and Warp 4 concerning the actions of 
DosSetMem on the "invalid" addresses beyond a 4K allocation up to the 
next 64K boundry. It is related to this discussion, but not really 
relevant (and can confuse the issue). Suffice it to say that using 
DosSetMem under Warp 4, you *can* gain access to additional memory 
that these "invalid" addresses could point to. DosSetMem updates the
LDT "under the covers" to match the flat 32 bit mapping of memory. 


   Denis       

 All opinions are my very own, IBM has no claim upon them.

--- Maximus/2 3.01
* Origin: T-Board - (604) 277-4574 (1:153/908)
SEEN-BY: 396/1 632/0 371 633/260 262 267 270 371 635/444 506 728 639/252
SEEN-BY: 670/218
@PATH: 153/908 8086 800 140/1 396/1 633/260 635/506 728 633/267

SOURCE: echomail via fidonet.ozzmosis.com
