echo: public_domain
to: Paul Edwards
from: rowan_crowe
date: 1996-01-11 00:57:28
subject: memmove

Here's a message from a while back, in the internet newsgroup
comp.lang.asm.x86. The code demonstrates how to align to a doubleword
boundary before using REP MOVSD (appears that the source is better aligned,
if both are not already on dword boundaries).

This was a good summary and conclusion to a looong thread with much
disinformation and bickering.


─ [211] Area COMP.LANG.ASM.X86 (3:635/727.1) ───
 Msg  : 200 of 500 -190 +202
 From : qed{at}xenon.chromatic.com     3:633/243.100   Fri 28 Jul 95 05:35
 To   : All                                         Sun 30 Jul 95 11:31
 Subj : Re: rep movsw problem
────────────────────────────────────────────────
{at}MSGID: 3:633/243.100 000077ba
{at}REPLYTO 3:633/243.100 UUCP
{at}REPLYADDR qed{at}xenon.chromatic.com
{at}PID GIGO unreg at foxy vsn 0.99.950303
From: Paul Hsieh 
{at}Newsgroups: comp.lang.asm.x86,rec.games.programmer
Subject: Re: rep movsw problem
{at}Date: 27 Jul 1995 18:35:44 GMT
Organization: Zocalo Engineering - Berkeley, California, USA
{at}Lines: 62
{at}Message-Id: 
{at}References: 

{at}Nntp-Posting-Host: paulh.chromatic.com
{at}Mime-Version: 1.0
{at}Content-Type: text/plain; charset=us-ascii
{at}Content-Transfer-Encoding: 7bit
{at}X-Mailer: Mozilla 1.1N (Windows; I; 32bit)
{at}Xref: melbourne.DIALix.oz.au comp.lang.asm.x86:3037 rec.games.programmer:14155

My GOD!  Everyone!  This is an extremely trivial question.  Let me
just state a few things:

(1)  The original code as posted was *CORRECT*!  Except that s/he
     misspelled "movsw" as "mosvw".  His/her problem must be related
     to something else.

(2) RCL is *NOT* a synonym or acronym for SHL.  Any substitution of
    ROR, or RCL or SAR or ROL or anything of that nature will cause
    the originally posted program to have unexpected behavior (dependent
    on the unstated value of the carry flag upon entry to the routine.)

(3) SHR and SHL *DO* set the carry flag with the value of the bit that
    fell off the end.

(4) The value of CX after a rep movs operation has completed is guaranteed
    to be 0.  Hence "adc cx,cx" becomes equivalent to "movzx cx,cf" (if you
    get my meaning.)

(5) If CX=0 rep movs will copy 0 bytes (not 65536.)
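
To make points (3) to (5) concrete, here is a rough C model of the kind of
sequence being discussed.  The originally posted code is not quoted in this
message, so the "shr cx,1 / rep movsw / adc cx,cx / rep movsb" sequence and
the name model_copy16 are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of:  shr cx,1 / rep movsw / adc cx,cx / rep movsb
   (an assumed reconstruction; the original post's code is not shown). */
static size_t model_copy16(uint8_t *dst, const uint8_t *src, uint16_t cx)
{
    size_t copied = 0;

    /* shr cx,1: CF receives the bit that fell off the end (point 3). */
    unsigned cf = cx & 1u;
    cx >>= 1;

    /* rep movsw: copies cx words and nothing at all when cx == 0
       (point 5).  MOVS does not modify flags, so CF survives. */
    while (cx != 0) {
        dst[copied] = src[copied];
        dst[copied + 1] = src[copied + 1];
        copied += 2;
        cx--;
    }

    /* cx is now 0 (point 4), so adc cx,cx yields 0 + 0 + CF. */
    cx = (uint16_t)(cx + cx + cf);

    /* rep movsb: copies the odd trailing byte, if there is one. */
    while (cx != 0) {
        dst[copied] = src[copied];
        copied++;
        cx--;
    }
    return copied;
}
```

Calling model_copy16(dst, src, 5) moves two words and then one byte.  Had
the shift been an RCL instead, the bit rotated into CX would depend on
whatever CF held on entry, which is the problem described in point (2).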

What I want to know is, how can such a simple question cause so much net
traffic?  From the posts I saw, it looked like the net was batting a
little under 50% there.

My final comment (since I have not actually contributed anything that you
people shouldn't have already known) is that the 32 bit version of that
code which I use is as follows:

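/* Copy Len bytes from Src (ESI) to Dest (EDI):
     1. copy 0-3 bytes with rep movsb until EDI is dword aligned,
     2. copy the bulk with rep movsd,
     3. copy the remaining 0-3 bytes with rep movsb.
   If Len does not exceed the alignment count, everything is copied
   bytewise at LEndBytes. */
void rep_movsb(char * Src, char * Dest, unsigned int Len);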
#pragma aux rep_movsb =                         \
"               mov     eax, ecx        "       \
"               lea     ecx, [edi+3]    "       \
"               not     ecx             "       \
"               and     ecx, 3          "       \
"               sub     eax,ecx         "       \
"               jle     short LEndBytes "       \
"           rep movsb                   "       \
"               mov     ecx, eax        "       \
"               and     eax, 3          "       \
"               shr     ecx, 2          "       \
"           rep movsd                   "       \
"LEndBytes:     add     ecx, eax        "       \
"           rep movsb                   "       \
parm [ESI] [EDI] [ECX]                          \
modify [EAX ECX ESI EDI];

Please excuse the WATCOM C/C++ style inline but I don't imagine anyone
will have difficulty with it.  It's not nearly as elegant, but it's
breakneck fast.  I posted this a while back hoping somebody might see some
improvements, but nobody has yet come forward.  I submit this as another
opportunity.  Otherwise I'll just have to live with its performance.  :)
(I'll give you guys something to start you going: not ecx is generally a
slower command than xor ecx,FFFFFFFFh on pentiums ... )
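
For anyone who would rather read it as C, the structure of the routine can
be sketched roughly as below.  This is only an illustrative rendering, not
a replacement for the inline version: the names head_bytes and
copy_aligned_dest are made up here, and the sketch assumes that only the
low two bits of the destination address matter.  The first helper mirrors
"lea ecx,[edi+3] / not ecx / and ecx,3", which gives the number of bytes
(0 to 3) needed to reach the next dword boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to bring addr up to a 4-byte boundary (0 to 3).
   ~(addr + 3) & 3 is the same value as (-addr) & 3. */
static uint32_t head_bytes(uint32_t addr)
{
    return ~(addr + 3u) & 3u;
}

/* Hypothetical portable rendering of the three-phase copy. */
static void copy_aligned_dest(unsigned char *dst, const unsigned char *src,
                              size_t len)
{
    size_t head = head_bytes((uint32_t)(uintptr_t)dst);
    size_t i;

    if (len <= head) {                    /* jle LEndBytes: bytes only */
        for (i = 0; i < len; i++)
            dst[i] = src[i];
        return;
    }

    for (i = 0; i < head; i++)            /* first rep movsb */
        dst[i] = src[i];
    dst += head;
    src += head;
    len -= head;

    for (i = len >> 2; i != 0; i--) {     /* rep movsd */
        uint32_t w;
        memcpy(&w, src, sizeof w);        /* source may be unaligned */
        memcpy(dst, &w, sizeof w);
        src += 4;
        dst += 4;
    }

    for (i = 0; i < (len & 3); i++)       /* final rep movsb */
        dst[i] = src[i];
}
```

Note that only the destination is aligned; the source is read at whatever
alignment it happens to have.  The NOT in head_bytes can equally be written
as an XOR with FFFFFFFFh, which produces the same bits.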

--
Paul Hsieh
qed{at}xenon.chromatic.com

What I say and what my company says is not always the same

---
* Origin: Jelly-Bean software development, Melbourne AUST. (3:635/727.1)
SEEN-BY: 50/99 632/103 348 998 633/371 634/384 635/402 503 544 727 638/102
SEEN-BY: 639/252 640/230 690/718 711/401 410 413 430 808 809 934 713/888
SEEN-BY: 800/1 7877/2809
@PATH: 635/727 632/348 635/503 50/99 711/808 809 934

SOURCE: echomail via fidonet.ozzmosis.com
