echo: public_domain
to: Paul Edwards
from: Frank Malcolm
date: 1995-12-30 17:26:20
subject: movsb

Hi, Paul.

PE> FM> Yep, you can do that if speed is super important. I assume you're
PE> FM> absolutely sure there's a \n in there somewhere. Stick a big number in

PE> Yes, I have put a sentinel in.

PE> FM> CX (or ECX), say $ffff, and set up ds:si from the pointer to u; es:di
PE> FM> from the pointer to t. REP SCASB for $0a, reset ds:si, subtract CX from
PE> FM> $ffff then REP MOVSB.

Small error there; SCASB works on es:di, not ds:si.

PE> What a shame.  That basically confirms that you need to look at
PE> each bit of data twice.  Thanks + bye.  Paul.

Basically, yes; there's no combined MOVSB/SCASB instruction. I was
about to write that "anyway, it'd be much faster doing it with the
single instructions", but it looks like it would be close. REPNE SCASB
(the form that stops at the $0a) is 7 + 5n clocks on a 486, REP MOVSB is
12 + 3n. (I think - the timings documented in my book are complicated
for these instructions.) Ignoring the setting up of the registers
(assuming this is relatively insignificant for "reasonably" long average
lines), that's 19 + 8n clocks for the two passes.
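
For anyone following along in C, the scan-then-copy idea can be sketched roughly as below. This is only an illustration, not code from the thread: the function name copy_line_two_pass and the max parameter (standing in for the big count loaded into CX) are made up here, and memchr/memcpy merely play the roles of REPNE SCASB and REP MOVSB. It assumes t has room for the line.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy one line, including its '\n', from u to t.
   max bounds the scan, much like the count placed in CX.
   Returns the number of bytes copied, or 0 if no '\n' was found. */
size_t copy_line_two_pass(char *t, const char *u, size_t max)
{
    /* Pass 1: look at every byte once to find the newline. */
    const char *nl = memchr(u, '\n', max);
    if (nl == NULL)
        return 0;               /* no sentinel within max bytes */

    /* Length up to and including the '\n' (the "$ffff minus CX" step). */
    size_t len = (size_t)(nl - u) + 1;

    /* Pass 2: look at every byte a second time to copy it. */
    memcpy(t, u, len);
    return len;
}
```

Called as copy_line_two_pass(t, "hello\nworld", 11), this should copy the six bytes "hello\n" and return 6.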

Then for a loop, say

L1: LODSB
    STOSB
    CMP AL,AH ;AH is where you've put $0a
    JNE L1

it's 5+5+1+3 = 14 clocks per character, so 14n for the loop, once again
ignoring setting up the registers.
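
The same single-pass loop, written out in C as a rough sketch. The name copy_line_one_pass is invented for illustration, and it assumes, as discussed earlier in the thread, that a '\n' sentinel is definitely present in u and that t is big enough.

```c
#include <stddef.h>

/* Hypothetical helper: copy bytes from u to t until a '\n' has been
   copied. Each byte is touched once. Returns the number of bytes copied,
   counting the '\n'. */
size_t copy_line_one_pass(char *t, const char *u)
{
    const char *start = u;
    char c;

    do {
        c = *u++;           /* like LODSB: load a byte, advance the source */
        *t++ = c;           /* like STOSB: store it, advance the destination */
    } while (c != '\n');    /* like CMP AL,AH / JNE L1 */

    return (size_t)(u - start);
}
```

Note that the newline itself is stored before the loop exits, which matches what the assembler loop does.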

So the SCASB/MOVSB gains 6 clocks per character, which repays its
19-clock overhead from n = 4 up (14n > 19 + 8n), and with a bit of
work you could make it faster still - use REP MOVSD instead, and handle
the last 0, 1, 2 or 3 bytes as a special case.
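
The dword-plus-tail split might look something like this in C, once the length is known from the scan. Again a sketch only: copy_dwords_then_tail is a made-up name, and the 4-byte memcpy calls stand in for what REP MOVSD would do in one instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy len bytes from u to t, four at a time,
   then finish the 0 to 3 leftover bytes singly. */
void copy_dwords_then_tail(char *t, const char *u, size_t len)
{
    size_t dwords = len / 4;    /* the count REP MOVSD would get */
    size_t tail   = len % 4;    /* the last 0, 1, 2 or 3 bytes */

    while (dwords--) {
        uint32_t w;
        memcpy(&w, u, 4);       /* 4-byte load, safe for any alignment */
        memcpy(t, &w, 4);       /* 4-byte store */
        u += 4;
        t += 4;
    }

    while (tail--)              /* special case for the leftover bytes */
        *t++ = *u++;
}
```

For a 7-byte line this does one 4-byte move followed by three single-byte moves.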

Regards, FIM.


---
* Origin: Pedants Inc. (3:711/934.24)
SEEN-BY: 690/718 711/809 934

SOURCE: echomail via fidonet.ozzmosis.com
