echo: os2prog
to: Jonathan de Boyne Pollar
from: Denis Tonn
date: 1998-10-09 18:06:04
subject: How do DLLs load and unl

Original from  Jonathan de Boyne Pollard  to Denis Tonn on 10-06-1998
Original Subject: How do DLLs load and unlo

                         ---------------------------------------

 JP> But this brings up a further question: What happens when a DLL cannot be 
 JP> found ?  This should be, and *is*, reported in the pObjName 
 JP> buffer for DosExecPgm.  I can see how this would be easy to 
 JP> implement if it were the *parent* process that resolved all 
 JP> of the import module references and built the initial user 
 JP> address space of the child.  But how does this happen if 
 JP> the loading is occurring as part of the execution of user-
 JP> mode code in the child process ?  Don't tell me that 
 JP> there's a back door in OS/2 Warp for user-mode code in one 
 JP> process to write to the user address space of another 
 JP> process!
 
  DT> The loader is part of the kernel code, even though it operates in the
  DT> context of the process (and is called on a thread of the process). 
 
JP> Naturally.  Unless one has a client-server design like 
JP> Windows NT 3.x, this is the only sensible way to implement 
JP> it.  I hope that Herbert Rosenau is reading this.  (-:
 
  DT> Being part of the kernel code, it can switch contexts if required to 
  DT> access the parent address space...
 
JP> Now *this* is interesting, because it implies that, essentially, a thread can 
JP> migrate back and forth between processes -- or, at least, 
JP> migrate back and forth between the address spaces that 
JP> processes have.  It's a very ugly design, to my mind, 
JP> though.  To be honest, I think that having the parent 
JP> process be fully responsible for resolving all of the load-
JP> time DLLs as well as DOSCALL1 and the main EXE would have 
JP> been a far "cleaner" solution.  

 No, No, No... 

 I think we had better start over.. I can see where I have missed 
some of the gist of what you are trying to understand, and where I may
have used imprecise terms (comes from teaching too many "beginners"). 

 We are getting wrapped up in the ideas of "process address space" = 
"private address space" = "process context". None of these are
equivalent, although from the application programmer's point of view 
they may as well be (and are often spoken of as such). 

 See below.. 

JP> The solution that you describe means that there has to be 
JP> special case code in the loader for when it is loading 
JP> "load-time" DLLs as opposed to "run-time" DLLs.  This is an 
JP> important point that I shall return to in a bit.

 Not in the loader.. See below.. 

  DT> Keep in mind that DosExecPgm is a kernel API, it does enter ring 0 
  DT> (through Doscall1) and can access all processes. 
 
JP> It can access all of the kernel objects that represent processes, true.  That 
JP> is a given, and is how just about *all* multiprocess operating systems are 
JP> designed.  But what we are talking about is accessing the 
JP> *user address space* of a different process.  That's a far 
JP> more involved procedure, and a far more unusual one, 
JP> because on most operating systems threads cannot "migrate" 
JP> between processes, or their address spaces, in this 
JP> fashion.  Threads switch between user and kernel mode, and 
JP> can poke around with the kernel-mode data structures of 
JP> other processes.  But I've not before encountered an 
JP> operating system where a thread switches to the page tables 
JP> of a completely different process in order to directly 
JP> access its user address space.  All other operating systems 
JP> that need to communicate results in this way generally use 
JP> message passing to achieve this, or some common kernel data 
JP> area.

 I may have been imprecise in my previous descriptions. The ring 0
code (executing on the thread) can access all the kernel control
blocks (including the ring 0 stack of another process). If required,
the code can enter Kmode (and lose much of the meaning we apply to a
"thread"), and do address space switching etc. 

 A point that I haven't specifically pointed out: The system does NOT
have a "no context active" or "no thread active" state. There will
*always* appear to be an active context and an active thread, even
though the system may be in code that is essentially "no-thread" and
"no-context". Hardware interrupt handlers and Kernel mode code both
fall into this category. The context/thread is irrelevant for this
code. 

 In the case under discussion, when a parent starts a child process, 
the parent's context (and thread) are the ones that are the "last 
used" during the start of the creation of the new process (and 
context). At some point, the child "context" must be switched to, in 
order to complete the creation and initialization of the child. In
reality, the code that does all this work is nearly all "no-context,
no thread", or at the very least "partial-context, partial thread"
code. The creation and initialization of the new process is NOT atomic
from a system perspective. 

 Information on the success/failure of the child process creation and 
initialization is returned via the ring 0 stack of the parent (it will 
not return to the parent's API call until after the above completes). 
If you want to use your perspective, it is a matter of the parent 
"seeing inside the child" for the duration of its creation, not the
child "seeing into the parent". In reality, much of this happens in a
sort of "no-context, no-thread" code.. 
 
 Once the child is started, DosLoadModule will return the result of the
success/failure to the child, not the parent. As far as the loader is 
concerned, it still returns the result to the "API/context" that 
caused it to do its work.. 

  DT> Until the loader has resolved all the "load time" linking, the 
  DT> parent process is in a kind of "childwait", [...] Once the child is
  DT> loaded, the parent does not receive notifications for explicit
  DT> DosLoadModule calls done in the child. 
 
JP> As I said above, this again implies that there is special 
JP> case code for load-time loading as opposed to run-time 
JP> loading, to deal with the different ways of reporting a 
JP> load failure (for load-time the thread has to temporarily 
JP> switch address spaces and write to its parent process' 
JP> address space, for run-time the thread has to write to its 
JP> own process' address space).  This is not a particularly 
JP> clean design, in my view.

 Not really.. The loader is called by the process creation and init 
code. It just returns the result to the "caller". The process creation
code returns the information to its "caller" (DosStartSession) which 
then returns the same back to the app.. 
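
 To make the layering concrete, here is a minimal toy model in portable
C. It is NOT OS/2 kernel code, and every name in it (loader_resolve,
create_process, copy_out, api_exec_pgm, api_load_module, the "MISSING"
module and the error code 2) is a hypothetical stand-in invented for
the illustration. It only shows the shape of the argument: one loader
routine that knows nothing about who called it, with each layer
handing the result back to its own caller. 

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJNAME_MAX 32

/* Hypothetical loader result: status plus the name of the failing module. */
typedef struct {
    int  ok;                     /* 1 = all modules resolved, 0 = failure */
    char failed[OBJNAME_MAX];    /* module that could not be found */
} LoadResult;

/* Toy "loader": walks a NULL-terminated list of module names.
 * A module named "MISSING" stands in for a DLL that cannot be found.
 * The loader has no idea who called it; it just returns a result. */
static LoadResult loader_resolve(const char *const *modules)
{
    LoadResult r = { 1, "" };
    assert(modules != NULL);
    for (size_t i = 0; modules[i] != NULL; i++) {
        if (strcmp(modules[i], "MISSING") == 0) {
            r.ok = 0;
            strncpy(r.failed, modules[i], OBJNAME_MAX - 1);
            r.failed[OBJNAME_MAX - 1] = '\0';
            return r;
        }
    }
    return r;
}

/* Toy process creation: runs the loader over the load-time imports
 * and hands the result back to its own caller unchanged. */
static LoadResult create_process(const char *const *imports)
{
    return loader_resolve(imports);
}

/* Copy a result into the buffer supplied by whichever API was called. */
static int copy_out(const LoadResult *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL && len > 0);
    strncpy(buf, r->failed, len - 1);
    buf[len - 1] = '\0';
    return r->ok ? 0 : 2;        /* 2 is an arbitrary toy error code */
}

/* Stand-in for the exec-style API called by the parent. */
int api_exec_pgm(const char *const *imports, char *buf, size_t len)
{
    LoadResult r = create_process(imports);
    return copy_out(&r, buf, len);
}

/* Stand-in for a run-time load request made by the child itself. */
int api_load_module(const char *module, char *buf, size_t len)
{
    const char *one[2] = { module, NULL };
    LoadResult r = loader_resolve(one);
    return copy_out(&r, buf, len);
}
```

 In this sketch both entry points funnel into the same loader_resolve,
and the only difference between "load-time" and "run-time" is which
API's caller receives the copy of the result. Calling api_exec_pgm
with an import list containing "MISSING" leaves that name in the
parent's buffer; calling api_load_module("MISSING", ...) leaves it in
the child's. 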

JP> It also causes some very strange race conditions.  Suppose that one thread in 
JP> the parent had called DosExecPgm.  This would cause the 
JP> child to run, and its primary thread to load all of the 
JP> load-time DLLs.  Now suppose that the child encountered a 
JP> failure, *and* in the meantime another thread *in the 
JP> parent* had maliciously called DosFreeMem for the pObjName 
JP> buffer that the first thread had initially passed to 
JP> DosExecPgm.  The primary thread in the child process would 
JP> switch address spaces, attempt to write the name of the 
JP> module that caused the failure to the buffer, and fault 
JP> because the pages were no longer valid.  This would have 
JP> the bizarre effect of causing a page fault in the *child* 
JP> process, when, intuitively, one would expect that an 
JP> invalid buffer passed as an argument to a kernel function 
JP> would cause a page fault *in the thread that had passed the 
JP> buffer*.  

 No.. See above.. The result of the DosExecPgm is passed back to the 
parent *at the end* of the child creation, in the context of the 
parent (via the parent's ring 0 stack). The child may (or may not) 
have preempted the CPU and gotten a time slice in the meantime... 

 In the above example, the parent would be the one that trapped as the
kernel tried to return the data to the parent. 
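
 The same point as a small toy model in portable C (again NOT real
kernel code; Proc, copy_to_caller, toy_exec and the buf_valid flag are
hypothetical names made up for the sketch). The assumption being
modelled is only the one stated above: the copy-out happens at the
end, in the context of the process whose API call is returning, so a
buffer that has been freed in the meantime is charged to that process. 

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical process record; only what the sketch needs. */
typedef struct {
    const char *name;
    int         faults;      /* exceptions charged to this process */
    char        buf[16];     /* stands in for the pObjName buffer */
    int         buf_valid;   /* 0 once another thread has freed it */
} Proc;

/* Copy-out always targets the process whose API call is returning,
 * so an invalid buffer is charged to that process.
 * Returns 0 on success, -1 if the caller "trapped". */
static int copy_to_caller(Proc *caller, const char *text)
{
    assert(caller != NULL && text != NULL);
    if (!caller->buf_valid) {
        caller->faults++;    /* the trap lands in the caller */
        return -1;
    }
    strncpy(caller->buf, text, sizeof caller->buf - 1);
    caller->buf[sizeof caller->buf - 1] = '\0';
    return 0;
}

/* Toy exec: creation of the child fails on module `missing`.
 * `freed_meanwhile` models a second parent thread freeing the
 * buffer while the child was being created. */
int toy_exec(Proc *parent, Proc *child, const char *missing,
             int freed_meanwhile)
{
    assert(parent != NULL && child != NULL && missing != NULL);
    /* ... child creation and load-time linking would run here ... */
    if (freed_meanwhile)
        parent->buf_valid = 0;
    /* The child is never asked to write into the parent's memory. */
    return copy_to_caller(parent, missing);
}
```

 With freed_meanwhile set, toy_exec returns -1 and the fault count
goes up on the parent record only; the child record is never touched. 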

JP> And what, in the above scenario, happens to the first 
JP> thread, that is sitting patiently in "child wait" mode 
JP> waiting for the child process to report either "all 
JP> load-time DLLs have been loaded" or "a load-time DLL 
JP> failed, and I've written its name to your result buffer for 
JP> you".  The child process has died with a page fault.  Does 
JP> the parent thread simply hang forever in "child wait" mode ?

 If the child completes the init/creation (including loading load-time
DLLs and fixups), the result is returned to the parent (assuming no 
childwait in the DosExecPgm). If the child dies with a fatal exception
after that point, it is a child failure only. The "creation" was 
successful. 
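
 As a rough illustration of that split, here is a tiny state sketch in
portable C. The names (ChildState, Child, finish_creation,
fatal_exception) and the return codes are hypothetical and are not
taken from the OS/2 kernel. It only models the claim above: the result
handed to the parent is fixed when creation finishes, and a later
fatal exception changes the child's state without touching it. 

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CHILD_CREATING, CHILD_RUNNING, CHILD_DEAD } ChildState;

typedef struct {
    ChildState state;
    int        create_rc;    /* result handed to the parent; -1 = not yet */
} Child;

/* Finish creation: load-time DLLs and fixups either worked or not.
 * The return value is what the parent's exec-style call would see. */
int finish_creation(Child *c, int load_ok)
{
    assert(c != NULL && c->state == CHILD_CREATING);
    c->create_rc = load_ok ? 0 : 1;
    c->state     = load_ok ? CHILD_RUNNING : CHILD_DEAD;
    return c->create_rc;
}

/* A fatal exception after creation only changes the child's state;
 * the creation result already reported to the parent stays as it was. */
void fatal_exception(Child *c)
{
    assert(c != NULL && c->state == CHILD_RUNNING);
    c->state = CHILD_DEAD;
}
```

 So a child that loads cleanly and then traps ends up CHILD_DEAD with
create_rc still 0: the parent was told "created" and was not left
waiting on anything that the later failure could block. 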

JP> I'm sorry for all of the questions.  But I'm trying to form 
JP> the clearest picture that I can of this whole area, and it 
JP> has turned out to be a *very* complicated subject, full of 
JP> hidden pitfalls and nasty situations like the above.

 I think you are trying to make it more complex than it really is. 
Since I don't know (exactly) what it is that is missing in your 
understanding (and I suspect you don't either), we can only hammer 
around the subject until the elusive concept/information surfaces. 

 This would be a lot easier if I had a whiteboard and could draw 
pictures of the flow/state/etc., but I find it interesting 
nonetheless.. 



   Denis       

 All opinions are my very own, IBM has no claim upon them.

--- Maximus/2 3.01
* Origin: T-Board - (604) 277-4574 (1:153/908)

SOURCE: echomail via fidonet.ozzmosis.com
