Hi, Frank.
*** Quoting Frank Ramsey to Eric Decker dated 05 Jan 97 ***
FR> When the head is over the proper location, a fragment of the file
FR> is read or written. Remember the heads are always in a smooth
FR> sweeping motion. Shortly, the heads are no longer on the track
FR> of interest.
I think this picture is not precise and may be misleading. AFAIK the
heads move to the target track, then stop while the reading/writing is
performed; then they move to the next target, and so on. So the correct
picture is: the heads jump along a radius of the drive like a kangaroo
in one direction, then turn back and jump in the opposite direction -
_if there is a need for it, i.e. if there are requests to different
locations of the disk_ .
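For clarity, here is a minimal sketch of that "elevator" order of
servicing, in Python (the function name and the cylinder numbers in the
example are only my illustration, not anybody's actual code):

def elevator_order(head, requests, direction=+1):
    """Order in which pending cylinder requests are serviced.
    head      - current cylinder of the heads
    requests  - list of requested cylinder numbers
    direction - +1 = moving toward higher cylinders, -1 = toward lower
    """
    # Requests lying ahead of the heads are picked up on the current
    # sweep, nearest first; the remaining ones are picked up after the
    # turn, in the order the returning heads pass over them.
    ahead  = sorted((c for c in requests if (c - head) * direction >= 0),
                    key=lambda c: abs(c - head))
    behind = sorted((c for c in requests if (c - head) * direction < 0),
                    key=lambda c: abs(c - head))
    return ahead + behind

# Heads at cylinder 50, moving toward higher cylinders:
print(elevator_order(50, [10, 95, 52, 47, 80]))   # -> [52, 80, 95, 47, 10]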
FR> The next fragment is located immediately after the previous
FR> fragment. The next read/write must wait till the heads have
FR> travelled the entire range and returned to the track.
FR> Over time, file fragments are all over the disk. The read/write
FR> process is performed when the head is over the track. Then the
FR> heads move. If the disk is fragmented, the chance the next
FR> fragment of the file is on the same track is reduced. The chance
FR> that the next file fragment can be found before the heads have
FR> returned to the track just visited is enhanced. The amount of time
FR> required to find the next file fragment is smaller when the disk
FR> is fragmented. Therefore, elevator seeking can actually improve
FR> as the disk is fragmented.
Good picture, but it's easy to draw totally different pictures.
1. Let's consider a situation: a small net where only one user works
with the server, and only with one sequential file. Then if the file is
defragmented, the heads move rarely - only when the piece of the file
located on the current cylinder has been processed and there is a need
to advance to the next cylinder (much as in DOS). If the file is
fragmented, the heads jump along the locations of the file's fragments,
resulting in a significant loss of speed; the exact percentage depends
on the speed of the net, the time needed to read/write a block, the
time needed to position the heads (which obviously depends on the
number of cylinders between fragments), and the time the user's
application needs to process a piece of information.
2. Next situation: two users work with the server, each with one
sequential file. Then if both files are defragmented, the heads jump
from the 1st file's location to the 2nd file's and back; if both files
are fragmented, the heads jump, say, randomly. The average speed in
both cases seems roughly the same.
3. The extreme case: lots of simultaneous requests, and the speed of
the net or of the users' applications is low (a la an Internet server).
Then between the moment one request is serviced and the moment the next
request to the same file comes, the heads end up in a totally different
(effectively random) position; whether the file is defragmented or not
does not matter.
4. Somewhere between cases (2) and (3) lies your picture. I'll try to
estimate the gains and pains. Let's suppose a file fragment is located
in the middle of the disk. If the file is defragmented, the heads will
return to the same cylinder only after they travel in the selected
direction to (nearly) the border of the disk and then back, so the
total number of crossed cylinders roughly equals the total number of
cylinders on the disk. If the file is fragmented, the location of the
next fragment will vary from the current position of the heads
(resulting in nearly zero time - only the positioning itself, of
course) to the border opposite to the selected direction of head
movement (and the resulting number of crossed cylinders will be 1.5
times the total number of cylinders on the disk), so the average number
of crossed cylinders will be 0.75 of the total. This 0.75 is indeed a
gain compared to 1.0 (when the file is defragmented). But we must not
forget that this is the case when the net and the applications are very
fast (compared to the disk); taking net and application delays into
account can reduce this gain.
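Just to check this arithmetic, here is a small simulation sketch in
Python (the number of cylinders, the starting position of the heads and
the uniform distribution of fragment locations are my own assumptions,
only for illustration):

import random

N = 1000         # total number of cylinders (arbitrary)
head = N // 2    # heads start in the middle of the disk, moving "up"
trials = 100000

def crossed(target):
    # Cylinders crossed before the elevator pass lands on 'target',
    # with the heads starting at 'head' and moving toward cylinder N-1.
    if target >= head:
        return target - head                   # picked up on this sweep
    return (N - 1 - head) + (N - 1 - target)   # to the border, then back

# Defragmented: the next piece is on the cylinder just left behind, so
# the heads finish the sweep and come back - about 1.0 of all cylinders.
print("defragmented:", 2 * (N - 1 - head) / N)

# Fragmented: the next piece sits on a random cylinder.
avg = sum(crossed(random.randrange(N)) for _ in range(trials)) / trials
print("fragmented:  ", avg / N)                # comes out near 0.75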
*** Quoting Eric Decker to Frank Ramsey dated 04 Jan 97 ***
ED> If disk fragmentation is an issue it will manifest itself more as
ED> time goes on. Certainly this issue can be put to rest if anyone
ED> has made any performance measurements of a server before total
ED> backup and then restoring it. The process of tape backup is one of
ED> queuing all file fragments together in the right order to form a
ED> contiguous write on tape; a total restore will automatically create
ED> a 100% contiguous file structure on the disk.
I saw an article in which the author - a "mythbreaker" - stated that
this procedure, performed by him, led to a 2 to 3 times speedup. I have
not performed this procedure myself. Sorry, I don't remember the
author's name or the origin of the article. It does not seem
unbelievable to me - in a single-user test, and/or when the amount of
information is much less than the volume's capacity (in this case,
after defragmenting, the files occupy only the beginning of the volume,
thus eliminating far jumps of the heads).
To sum up: IMHO defragmenting can result in a significant speedup in
small nets with no heavy load, say up to 20 users (it seems to me that
users spend most of their time thinking, typing their input, working
with local info and so on). However, in some circumstances
defragmenting can actually lead to a loss of speed. Of course, an
experiment is needed.
I don't know exactly what the current situation in the USA is - how
large the number of small nets is today. Here in less developed
countries the percentage of small nets is surely quite significant (I
think), and even in the USA, at the time Netware came up, that was the
case.
P.S.
*** Quoting Frank Ramsey to Ben dated 30 Jan 97 ***
FR> Drew Major and company did a GREAT job with this technique.
FR> As far as I know, no one else has anything even approaching
FR> Netware's technique.
In 1980 I implemented such an "elevator" method as an option in our
real-time operating system (ASPO), which (the OS) had some spread here
in the Soviet Union. It was a known idea at that time; I don't remember
now, but it seems very probable to me that it had already been
implemented in some (more famous) operating system(s). We should also
agree that the really great throughput of Netware servers is achieved
to a significant degree by the total caching of the FAT, which, along
with a large disk cache, requires a huge amount of RAM. Thank God,
nowadays this amount is usually not a very big problem - at least while
disk capacities do not become truly huge, but the trend is exactly
that.
WBR,
Nick Filippov.
---
* Origin: Tubular Bells point (2:465/142.2)