echo: os2prog
to: Jonathan de Boyne Pollar
from: Denis Tonn
date: 1998-09-30 13:20:04
subject: How do DLLs load and unl

Original from  Jonathan de Boyne Pollard  to Denis Tonn on 09-27-1998
Original Subject: How do DLLs load and unlo

                         ---------------------------------------

  DT> If you are asking about the sequence in which each DLL's InitTerm is 
  DT> called during process end, I would have to check into it. 
 
JP> No.  I'm asking how OS/2 resolves the chicken-and-egg problem outlined in my 
JP> original message (not being able to unmap the modules until 
JP> the InitTerm has been called, but not being able to 
JP> determine what InitTerms to call until it has traversed the 
JP> module graph and unmapped the modules whose reference 
JP> counts have dropped to zero).  As far as I could see, the 

 The reference count is only the number of *processes* that are using 
the DLL. It is incremented in the same code that maps the DLL into the
address space of the process, and decremented in the code that unmaps
the DLL from that address space. The system uses the reference count
to determine when it can drop the DLL from the SYSTEM; it has no
effect on the process. 
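 To make the bookkeeping above concrete, here is a minimal sketch in
Python. It is only a model of the rule as described, not OS/2 code; the
names ModuleTable, map_into and unmap_from are invented for the example.

```python
class ModuleTable:
    """Toy model of system-wide DLL reference counting (not real OS/2 code)."""

    def __init__(self):
        # DLL name -> set of process ids that currently have it mapped
        self.users = {}

    def map_into(self, dll, pid):
        # Mapping the DLL into a process's address space is the increment.
        self.users.setdefault(dll, set()).add(pid)

    def unmap_from(self, dll, pid):
        # Unmapping is the decrement; at zero the system may drop the DLL.
        procs = self.users.get(dll, set())
        procs.discard(pid)
        if not procs:
            self.users.pop(dll, None)

    def refcount(self, dll):
        # The count is the number of processes, not the number of load calls.
        return len(self.users.get(dll, ()))

    def loaded(self, dll):
        return dll in self.users
```

 Using a set of process ids means that a second mapping request from
the same process does not change the count, which matches the point
that the count is per process and not per call.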

JP> only way to resolve this dilemma is to have the kernel call 
JP> into user-mode code *while* it is traversing the module 
JP> graph and altering the reference counts, which I think is 
JP> *very* ugly.  Your message confirmed this:

 Yep.. But there is no way around it.. 
 And when you look at the code involved, it is not all that ugly.. In 
some ways I think of it as somewhat elegant (elegant assembler?). 

  DT> Yep, it calls ring 3 from ring 0, and sets up a return address on the
  DT> stack that points into DOSCALL1. The code in DOSCALL1 then gets back 
  DT> to ring 0 through a callgate (actually, the stack setup is done in
  DT> DOSCALL1). Simple stack manipulation ..   
 
JP> The problem with this is with recursion, and pathological 
JP> InitTerm functions.  As I said:
 
  JP>> [...] the pathological, but permissible, case that the InitTerm
  JP>> function of a module may itself call DosLoadModule or 
  JP>> DosFreeModule ?  Or the equally pathological case that the InitTerm
  JP>> function, either accidentally or deliberately, never returns at
  JP>> all (leaving the kernel internals in an intermediate state) [...]
 
JP> I wasn't asking whether this sort of code was *valid*, by 
JP> the way.  It's obviously valid, and in any case a decent 
JP> operating system has to protect itself against mischievous 
JP> application code.  The question is how does the kernel deal 
JP> with the kernel stack overflow or deadlock issues that 
JP> result ?  You didn't see that this was the main thrust of 
JP> my question, and only referred to this obliquely:

 It doesn't operate with a kernel stack during the callback. It
operates with the Ring 3 stack. If the process/thread stack 
overflows, normal exception processing applies. 
 For deadlock issues, it is again a process related situation. The 
kernel is not affected. If the DLL sets things up such that a DLL is 
loaded during Term (you will need 2 "cross-referencing" DLLs to 
create the loop) and therefore can never be unmapped from the process 
address space, then the process is deadlocked (and the DLL's MTE will
always have a reference count, and cannot be unloaded from the
system). 
 Key point here: a DLL's Init is called *once* when the DLL is mapped 
into the process, and its Term *once* when it is freed. Multiple 
DosLoadModule calls for the same DLL will not create multiple calls to 
its Init routine, unless they are interspersed with DosFreeModule calls. 
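 The once-per-mapping rule can be sketched as follows. This is a toy
model in Python, not the loader's actual logic; the per-process load
counter and the names Process, load_module and free_module are
assumptions made for illustration.

```python
class Process:
    """Toy model: Init/Term fire only when a DLL is mapped or unmapped."""

    def __init__(self):
        # DLL name -> outstanding load calls (an assumed per-process counter)
        self.load_counts = {}
        # Record of simulated InitTerm invocations, in order
        self.events = []

    def load_module(self, dll):
        count = self.load_counts.get(dll, 0)
        if count == 0:
            # First load maps the DLL into the process: Init runs once.
            self.events.append(("init", dll))
        self.load_counts[dll] = count + 1

    def free_module(self, dll):
        count = self.load_counts.get(dll, 0)
        if count == 0:
            raise ValueError(f"{dll} is not loaded")
        if count == 1:
            # Last free unmaps the DLL from the process: Term runs once.
            self.events.append(("term", dll))
            del self.load_counts[dll]
        else:
            self.load_counts[dll] = count - 1
```

 Loading the same DLL twice records a single "init" event; a second
"init" only appears if the DLL was completely freed in between.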

  DT> It's possible to create a loop here during cleanup. I don't know 
  DT> if there are any checks to prevent this. 
 
JP> *That's* the issue.  Obviously calls to DosLoadModule and DosFreeModule must 
JP> be serialised.  This in turn implies that the thread that 
JP> is making the call must lock the module table.  But what if 
JP> a call to DosFreeModule occurs in the termination function 
JP> of a DLL ?  Surely this causes a deadlock ?  If it 

 Yes, it is serialized. No, the MTE is not locked. The reference count 
is updated before Init and after Term, and in both cases the system is 
in code which is not preemptable. 
 There are spin locks on the reference count field of the MTE in the 
SMP kernel, but this is irrelevant to the discussion. Application code
cannot access the MTE. It is in Ring 0 addressable space only. 

JP> *doesn't* cause a deadlock (i.e. if the module table is 
JP> guarded by some sort of recursive mutex), then what happens 
JP> if the InitTerm code tries to do *really* pathological 
JP> things, like DosFreeModule *itself*, or attempt to 
JP> DosLoadModule a module that references it and causes its 
JP> reference count to become *non*-zero ?  And how does the 
JP> kernel handle other pathological cases, such as when many 
JP> modules have calls to DosFreeModule in their termination 
JP> code, resulting in significant nesting on the kernel stack ?

 Each thread has its own Ring 0 stack. In fact, each thread has 3
stacks: a Ring 3 stack, a Ring 2 stack (for IOPL) and a Ring 0 stack.
Each of these is part of the "process address space" even though some
parts of it may only be reached in ring 0 code. If an application
exhausts its Ring 0 stack, the process is ended and the address
*after* the last Ring 3/2 API call is reported as being at fault.
 The TSS contains pointers to the various stacks that will be used on 
a transition upwards through a callgate. Obviously there is no Ring 3
stack in the TSS, as there is no way to transition "up to" Ring 3. 
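 The stack selection described above can be modelled in a few lines of
Python. This is an illustration of the x86 rule only, not kernel code;
the class name TSS and the method stack_for_gate_call are invented, and
the stack values are placeholders.

```python
class TSS:
    """Toy model of the stack pointers held in a task state segment."""

    def __init__(self, ring0_stack, ring1_stack, ring2_stack):
        # Only rings 0..2 have a slot; there is no entry for ring 3.
        self.stacks = {0: ring0_stack, 1: ring1_stack, 2: ring2_stack}

    def stack_for_gate_call(self, current_ring, target_ring):
        # A call through a gate never moves to a less privileged ring.
        if target_ring > current_ring:
            raise ValueError("a call gate cannot transfer to a less privileged ring")
        if target_ring == current_ring:
            return None  # same privilege level: no stack switch
        # Moving to a more privileged ring: the CPU takes the stack from the TSS.
        return self.stacks[target_ring]
```

 A thread in ring 3 calling into ring 0 gets the ring 0 stack from the
table; asking for a transition from ring 0 "up to" ring 3 is rejected,
which is why no ring 3 slot is needed.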

JP> Incidentally, what do you mean by :
 
  DT> (actually, the stack setup is done in DOSCALL1). 
 
JP> Do you mean that the user-mode stub in DOSCALL1 for DosFreeModule sets up a 
JP> stack, "just in case a termination function needs to be 
JP> called", *before* it transfers into the kernel in the first 
JP> place ?  Or do you mean that when it needs to set up a ring 
JP> 3 callback the kernel hand-crafts a stack that, when the 
JP> termination function returns, causes code in DOSCALL1 to be 
JP> called ?

 No, the kernel/loader is the one that checks the DLL flags, and if 
the DLL has term requirements, calls DosCall1 with the process's ring
3 stack to set up the dummy return address. Then it manipulates the 
Ring 0 stack to directly "return" to the DLL's InitTerm entry. The 
stack on entry to the DLL's InitTerm is in ring 3, with a return
address pointing into Doscall1 (which will get back to the kernel
through a call gate). 
 The only reason the "return address" setup is done in Doscall1 is to 
keep some measure of independence between various kernel levels and 
Doscall1 (although there are other things that may require matched 
levels). 

JP> And how does the call gate that is the "return" from the 
JP> InitTerm function protect itself from malicious code that 
JP> would otherwise use it as a back door for entering the 
JP> kernel at any place that it liked ?

 That callgate never returns from the kernel code. It enters the 
kernel at a *particular* place. All callgates have SPECIFIC addresses
that they point to; it is part of the Intel specs. The callgates are 
set up in the GDT/LDT at system initialization. All transitions to a 
more privileged ring MUST go through a callgate, which takes the code 
to a particular address. 
 There is no way that user code could "use" a callgate to "enter the 
kernel at any place it liked". It can't change the GDT/LDT.. 
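 A small Python model may help show why the gate cannot be used as a
back door. It is a sketch under stated assumptions: the selector values
and entry point names are hypothetical, and PermissionError merely
stands in for a protection fault.

```python
from types import MappingProxyType


def build_gate_table(entries):
    """Return a read-only table: gate selector -> fixed kernel entry point."""
    # MappingProxyType models the fact that user code cannot alter the GDT/LDT.
    return MappingProxyType(dict(entries))


def call_through_gate(gates, selector, requested_offset=None):
    """Return the entry point execution resumes at for a call via `selector`."""
    if selector not in gates:
        # Stands in for the fault raised for an invalid selector.
        raise PermissionError("no such call gate")
    # The caller's offset is ignored: the gate descriptor alone
    # decides where the more privileged code is entered.
    return gates[selector]
```

 Whatever offset the caller supplies, the result is the entry point
recorded in the table, and an attempt to assign into the table from
"user" code fails with a TypeError.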
 


 When I look at the questions you are asking, I don't know what pieces
are missing, nor what I should focus on.. But here are some points
that may help, sorry if I am repeating things you already know:

 A process is a set of resources (address space, files, private sems,
threads, etc). When switching contexts, all the resources are remapped 
to the new one. Kernel addresses (code, data, etc) can be thought of 
as part of the "process", even though they cannot be "addressed" from
the application's ring3/ring2 code. 

 Threads are a "unit of execution" only. They have no logical 
connection to "code" per se. The only connection they have to code is
that they have "something to execute". A thread can be executing code
in Ring 3, Ring 2, or Ring 0. The programmer of the appropriate piece
of code defines what the code can do, not what the "thread" can do. 
 All threads within an app have 3 stacks. Ring 3, 2, and 0 stacks. The
Ring 0 stack is only usable by Ring 0 code, and the way a thread gets
to Ring 0 (or Ring 2) is through a callgate. The callgate transition
defines the entry point in Ring 0 that is the new point of execution,
and through the TSS what memory addresses are used as the Ring 0
stack. These stack addresses are virtualized, and part of the process
resources (they are unique to a thread - part of the TCB/TSD).

 "Kernel mode" is more than just Ring 0. Internally, there are
"EnterKmode" and "ExitKmode" routines that switch to/from an internal 
kernel stack (among other things). It is within Kernel mode that 
process/thread switching takes place. Exceptions that happen within 
Ring 0 code, but outside of Kernel mode, will result in a process level
exception, with the exception packet pointing at the instruction after
the application API that resulted in the callgate transition. In most 
cases, calls to Doscall1 result in a call through a callgate: some 
directly, and some after some further massaging of the application 
parms. Some APIs can be done completely in Ring 3, and do not 
require a callgate. 
 Exceptions while in Kmode result in a system halt. Device drivers 
are called in Kmode. 
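 The distinction drawn above between Ring 0 and Kernel mode can be
summarised as a small decision function. This is a Python sketch of the
rules as stated here, not of any real kernel routine; the function name
and the result strings are invented, and the last branch (faults in
Ring 3 or Ring 2 code) is simply labelled as normal handling.

```python
def classify_exception(ring, in_kmode):
    """Toy model of how a fault is handled, per the description above."""
    if in_kmode:
        # Kernel mode proper (internal kernel stack): the system halts.
        return "system halt"
    if ring == 0:
        # Ring 0 but outside Kmode: charged to the process, reported at
        # the instruction after the API call that went through the gate.
        return "process exception, reported after the API call"
    # Ring 3 or Ring 2 application code: ordinary exception processing.
    return "process exception, normal handling"
```

 For example, a Ring 0 stack overflow outside Kmode falls into the
second case, which is why the process is ended rather than the system.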

 Callbacks from Ring 0 back to Ring 3 do exist. Exception and Exit 
handlers, along with DLL InitTerm, are examples. The Ring 3 stack is
set up to have an address in Doscall1 that will reenter Ring 0 using a 
particular callgate, then a stack frame is built on the Ring 0 stack
that "looks" (to the CPU) as though the kernel is "returning" through 
a callgate. The return instruction causes the CPU to restore the 
application's ring 3 stack. 
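 The callback sequence above can be traced with a toy simulation in
Python. It only mirrors the order of the steps as described; the
function name, the dictionary used for the Ring 0 frame, and the
address strings are all illustrative assumptions, not real structures.

```python
def simulate_callback(ring3_stack, doscall1_return, initterm_entry, user_ss_esp):
    """Toy trace of a Ring 0 -> Ring 3 callback; returns the control flow."""
    # 1. Plant a return address into Doscall1 on the application's Ring 3 stack.
    ring3_stack.append(doscall1_return)
    # 2. Build a Ring 0 frame shaped like a return through a callgate:
    #    the code address to resume at, plus the Ring 3 stack to restore.
    ring0_frame = {"cs_eip": initterm_entry, "ss_esp": user_ss_esp}
    # 3. The "return" resumes at InitTerm, now running on the Ring 3 stack.
    executing = ring0_frame["cs_eip"]
    # 4. When InitTerm returns, it pops the planted address and lands in
    #    Doscall1, which re-enters Ring 0 through its callgate.
    after_initterm = ring3_stack.pop()
    return [executing, after_initterm]
```

 Calling it with an empty list as the Ring 3 stack yields the InitTerm
entry followed by the Doscall1 address, and leaves the stack empty
again once the simulated InitTerm has returned.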

 


   Denis       

 All opinions are my very own, IBM has no claim upon them.
 

 




--- Maximus/2 3.01
* Origin: T-Board - (604) 277-4574 (1:153/908)

SOURCE: echomail via fidonet.ozzmosis.com
