
Thread: [mASM Routine] String Copy

  1. #41

    Default

    Explain to me what is the point of negating everything I suggested with obscure statements nobody can understand? Your response is nonsense, yes I write drivers too.

    You're a bit of a showman, but you do help people so I don't want to put you down Jakor. Just don't talk rubbish directly at me, that ain't cool.


  2. #42
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Quote Originally Posted by Dyndrilliac View Post
    I simply have the wisdom to refuse to sit here and senselessly argue with you when it's clear your beyond being convinced of anything. Unfortunately there is nothing I can do to prevent you from acting like a woman.
    Alright, keep going, taking more notes.
    Quote Originally Posted by Voice Of Korhal View Post
    Let us hope so, if that Reverse Engineering crap actually works, I'll be amazed.

  3. #43
    Formerly Known as Jakor Senior Member
    Developer

    Evangelist
    laocoon's Avatar
    Join Date
    Jan 2005
    Posts
    1,280

    Default

    All I said was that I thought some error handling in the functions I use was appropriate for my use, so I left it in. I was backing up what you were saying about the speed difference between the functions by testing the ones I use alongside your example, which I had to convert to the same format as the rest of my functions (error checking and length-driven, not null-terminated; I wasn't comparing my null-terminated function).
    I was not negating these facts, only explaining my reasoning for changing them from what you had posted.

    I was unsure of how rep-prefixed instructions were handled as far as stopping in the middle of the instruction to give time to the next process/thread.

    Maybe an instruction with a rep prefix causes the thread context to be saved, and then the processor, on coming back to the thread mid-instruction, sees a flag telling it to continue with a context specifically for the rep instruction before returning to the calling thread's context. I have not looked into the processor logic behind this prefix and didn't want to introduce a bias if one process got less true processing time due to task switching, which would otherwise not be applicable with a priority that high. I may be completely wrong, and most likely am, which is why I went as unbiased as I could by not changing the priority.

    While your timing numbers were meant to show how fast the instructions were, mine were meant to compare functions with each other. I was only explaining why I did what I did, not dismissing what you said as wrong.

  4. #44

    Heretic
    delzz's Avatar
    Join Date
    Jun 2009
    Location
    In your nightmares.
    Posts
    32

    Default

    Programming in mASM is a waste of time anyway.

  5. #45
    Anarchist Gold Member

    Blessed
    Antihaxer's Avatar
    Join Date
    Aug 2005
    Posts
    2,066

    Default

    Quote Originally Posted by delzz View Post
    Programming in mASM is a waste of time anyway.
    You're a waste of life. :-)

    I'm done with site politics.

  6. #46
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Quote Originally Posted by delzz View Post
    Programming in mASM is a waste of time anyway.
    Posting saying programming in masm is a waste of time is even a bigger waste of time. So we win.

  7. #47

    Heretic
    delzz's Avatar
    Join Date
    Jun 2009
    Location
    In your nightmares.
    Posts
    32

    Default

    Quote Originally Posted by antihaxer View Post
    You're a waste of life. :-)



    Quote Originally Posted by K? Pŕo?ćtiόnŹ View Post
    Posting saying programming in masm is a waste of time is even a bigger waste of time. So we win.
    Ouch, time to get a fresh icepack.

  8. #48

    Default

    Quote Originally Posted by Jakor View Post
    All I said was that I thought some error handling in the functions I use was appropriate for my use, so I left it in. I was backing up what you were saying about the speed difference between the functions by testing the ones I use alongside your example, which I had to convert to the same format as the rest of my functions (error checking and length-driven, not null-terminated; I wasn't comparing my null-terminated function).
    I was not negating these facts, only explaining my reasoning for changing them from what you had posted.

    I was unsure of how rep-prefixed instructions were handled as far as stopping in the middle of the instruction to give time to the next process/thread.

    Maybe an instruction with a rep prefix causes the thread context to be saved, and then the processor, on coming back to the thread mid-instruction, sees a flag telling it to continue with a context specifically for the rep instruction before returning to the calling thread's context. I have not looked into the processor logic behind this prefix and didn't want to introduce a bias if one process got less true processing time due to task switching, which would otherwise not be applicable with a priority that high. I may be completely wrong, and most likely am, which is why I went as unbiased as I could by not changing the priority.

    While your timing numbers were meant to show how fast the instructions were, mine were meant to compare functions with each other. I was only explaining why I did what I did, not dismissing what you said as wrong.
    Let's go over what you've been saying.

    9 iterations certainly isn't enough, but that's too trivial to write about.

    There are string manipulation functions in the RTL if you're coding device drivers on the win32 platform. That's why I don't buy your excuse.

    Your mention of the rep* instruction prefix sadly does not make it an atomic operation, so a thread context switch could still occur in any of those functions before the operation completes. That gives inaccurate results even if you are only profiling to compare functions against each other. Your responses imply that my advice to raise thread priority is unnecessary, which is untrue.

    For those who are new to the concepts mentioned, you can read about one here: Context switch - Wikipedia, the free encyclopedia

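    An atomic operation just means that the CPU executes a combined sequence of operations that appears to the rest of the system as a single indivisible step, like for example lock cmpxchg (a single cmov* instruction likewise cannot be split by an interrupt). rep* is not one of those: a rep-prefixed string instruction can be interrupted between iterations and resumed afterwards.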
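
    To make the lock cmpxchg case concrete, here is a small C11 sketch using <stdatomic.h>. The helper name try_claim is made up for illustration. On x86, compilers typically lower atomic_compare_exchange_strong to a lock cmpxchg, but that is an implementation detail and not something the C standard promises.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically replace *slot with `desired`, but only
 * if it currently holds `expected`. Returns true when the swap happened.
 * The compare and the store form one indivisible step, so no other
 * thread can observe or modify *slot in between. */
static bool try_claim(atomic_int *slot, int expected, int desired)
{
    /* `expected` is a local copy; on failure the call overwrites it with
     * the value actually found, which this sketch simply discards. */
    return atomic_compare_exchange_strong(slot, &expected, desired);
}
```

    A second try_claim with a stale expected value fails and leaves the slot untouched, which is the property a rep-prefixed copy does not give you.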

    If you're profiling, any context switch means the CPU is off doing something else while your timer is ticking away, waiting for the function to finish execution. That gives false timings.
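
    A minimal C sketch of the profiling point, under assumptions of my own: run the function many times and keep the fastest run, since a run that gets hit by a context switch can only come out slower, never faster. The names best_time_ns, copy_workload and struct copy_job are made up for illustration, and timespec_get (C11, wall-clock time) stands in for whatever high-resolution timer you have; on Windows you would more likely reach for QueryPerformanceCounter together with SetThreadPriority. Taking the minimum is a different technique from raising priority, not a replacement for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef void (*bench_fn)(void *ctx);

/* Current time in nanoseconds, read via C11 timespec_get. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Call fn(ctx) `runs` times and return the shortest single-run time.
 * Runs interrupted by the scheduler are longer and get discarded. */
static uint64_t best_time_ns(bench_fn fn, void *ctx, size_t runs)
{
    uint64_t best = UINT64_MAX;
    for (size_t i = 0; i < runs; i++) {
        uint64_t start = now_ns();
        fn(ctx);
        uint64_t elapsed = now_ns() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Hypothetical workload: one bounded string copy per call. */
struct copy_job {
    char dst[64];
    const char *src;
    size_t calls;
};

static void copy_workload(void *ctx)
{
    struct copy_job *job = ctx;
    strncpy(job->dst, job->src, sizeof job->dst - 1);
    job->dst[sizeof job->dst - 1] = '\0';
    job->calls++;
}
```

    With thousands of runs instead of 9, the minimum settles down quickly, and comparing the minimum of one function against the minimum of another is a fairer fight than comparing two single runs.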


  9. #49
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Quote Originally Posted by delzz View Post





    Ouch, time to get a fresh icepack.
    Hey at least I dont suck a lot of penis. you queer why dont u go suck some more penis there penis boy.

    Your chrismtas list to santa:
    1. Penis.
    2. And another penis.



    god yoru so fgay

  10. #50

    Heretic
    delzz's Avatar
    Join Date
    Jun 2009
    Location
    In your nightmares.
    Posts
    32

    Default

    Quote Originally Posted by K? Pŕo?ćtiόnŹ View Post
    Hey at least I dont suck a lot of penis. you queer why dont u go suck some more penis there penis boy.

    Your chrismtas list to santa:
    1. Penis.
    2. And another penis.



    god yoru so fgay
    I'm a girl, this totally ****s up your flame fest .

  11. #51
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Haha, men make you there bitch. And you still write christmas lists to santa, what a dumbass

  12. #52

    Heretic
    delzz's Avatar
    Join Date
    Jun 2009
    Location
    In your nightmares.
    Posts
    32

    Default

    Quote Originally Posted by K? Pŕo?ćtiόnŹ View Post
    Hey at least you dont suck a lot of penis.


    Im a queer why dont i go suck some more penis because im penis boy.

    My chrismtas list to santa:
    1. Penis.
    2. And another penis. god im so cool.
    Cool story, bro.

  13. #53
    Formerly Known as Jakor Senior Member
    Developer

    Evangelist
    laocoon's Avatar
    Join Date
    Jan 2005
    Posts
    1,280

    Default

    @umbra;
    I have only just started messing with device drivers, and RtlMoveMemory was the only function I had been using. The C drivers I was using as examples included the stdio header and were using copy/compare functions that got linked into the app. :-/

    I also used GetTickCount initially, but that didn't update nearly often enough. =p I stand corrected on how to time a function; however, my results came back matching yours even after only 9 cycles.

    @ the other two, there is still serious discussion going on here, let's keep the flaming down...

  14. #54
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Quote Originally Posted by delzz View Post
    Cool story, bro.
    Alright you got me good. You win this round.

  15. #55

    Disciple
    ulliklliwi's Avatar
    Join Date
    May 2007
    Location
    The Code Cave after the JMP Gate
    Posts
    568

    Default

    I don't program in masm anymore, but I do know asm.

    This asm code was reversed from the MS strncpy function:

    Code:
    strncpy        proc near
    Dest            = dword ptr  4
    Source          = dword ptr  8
    Count           = dword ptr  0Ch
    
                     mov     ecx, [esp+Count]
                     push    edi
                     test    ecx, ecx
                     jz      finish
                     push    esi
                     push    ebx
                     mov     ebx, ecx
                     mov     esi, [esp+0Ch+Source]
                     test    esi, 3
                     mov     edi, [esp+0Ch+Dest]
                     jnz     short src_misaligned
                     shr     ecx, 2
                     jnz     main_loop_entrance
                     jmp     short copy_tail_loop
     ; ---------------------------------------------------------------------------
     src_misaligned:
                     mov     al, [esi]
                     add     esi, 1
                     mov     [edi], al
                     add     edi, 1
                     sub     ecx, 1
                     jz      short fill_tail_end1
                     test    al, al
                     jz      short align_dest
                     test    esi, 3
                     jnz     short src_misaligned
                     mov     ebx, ecx
                     shr     ecx, 2
                     jnz     short main_loop_entrance
     tail_loop_start:
                     and     ebx, 3
                     jz      short fill_tail_end1
     copy_tail_loop:
                     mov     al, [esi]
                     add     esi, 1
                     mov     [edi], al
                     add     edi, 1
                     test    al, al
                     jz      short fill_tail_zero_bytes
                     sub     ebx, 1
                     jnz     short copy_tail_loop
     fill_tail_end1:
                     mov     eax, [esp+0Ch+Dest]
                     pop     ebx
                     pop     esi
                     pop     edi
                     retn
     ; ---------------------------------------------------------------------------
     align_dest:
                     test    edi, 3
                     jz      short dest_align_loop_end
     dest_align_loop:
                     mov     [edi], al
                     add     edi, 1
                     sub     ecx, 1
                     jz      fill_tail_end
                     test    edi, 3
                     jnz     short dest_align_loop
     dest_align_loop_end:
                     mov     ebx, ecx
                     shr     ecx, 2
                     jnz     short fill_dwords_with_EOS
     finish_loop:
                     mov     [edi], al
                     add     edi, 1
     fill_tail_zero_bytes:
                     sub     ebx, 1
                     jnz     short finish_loop
                     pop     ebx
                     pop     esi
     finish:
                     mov     eax, [esp+4+Dest]
                     pop     edi
                     retn
     ; ---------------------------------------------------------------------------
     main_loop_0:
                     mov     [edi], edx
                     add     edi, 4
                     sub     ecx, 1
                     jz      short tail_loop_start
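     main_loop_entrance:
                     mov     edx, 7EFEFEFFh          ; magic constant for the zero-byte test
                     mov     eax, [esi]              ; load the next source dword
                     add     edx, eax
                     xor     eax, 0FFFFFFFFh         ; eax = not eax
                     xor     eax, edx
                     mov     edx, [esi]              ; reload the dword for the store
                     add     esi, 4
                     test    eax, 81010100h          ; any bit set: dword may contain a zero byte
                     jz      short main_loop_0       ; none set: store the dword and keep copying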
                     test    dl, dl
                     jz      short loc_44D059
                     test    dh, dh
                     jz      short loc_44D04F
                     test    edx, 0FF0000h
                     jz      short loc_44D045
                     test    edx, 0FF000000h
                     jnz     short main_loop_0
                     mov     [edi], edx
                     jmp     short fill_with_EOS_dwords
     ; ---------------------------------------------------------------------------
     loc_44D045:
                     and     edx, 0FFFFh
                     mov     [edi], edx
                     jmp     short fill_with_EOS_dwords
     ; ---------------------------------------------------------------------------
     loc_44D04F:
                     and     edx, 0FFh
                     mov     [edi], edx
                     jmp     short fill_with_EOS_dwords
     ; ---------------------------------------------------------------------------
     loc_44D059:
                     xor     edx, edx
                     mov     [edi], edx
     fill_with_EOS_dwords:
                     add     edi, 4
                     xor     eax, eax
                     sub     ecx, 1
                     jz      short fill_tail
     fill_dwords_with_EOS:
                     xor     eax, eax
     fill_with_EOS_loop:
                     mov     [edi], eax
                     add     edi, 4
                     sub     ecx, 1
                     jnz     short fill_with_EOS_loop
     fill_tail:
                     and     ebx, 3
                     jnz     finish_loop
     fill_tail_end:
                     mov     eax, [esp+0Ch+Dest]
                     pop     ebx
                     pop     esi
                     pop     edi
                     retn
     strncpy        endp
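
    For anyone puzzling over the constants 7EFEFEFFh and 81010100h in the main loop: they form a quick test for whether a dword might contain a zero byte, so the copy can move four bytes at a time. Here is the same arithmetic as a C sketch. The function names are my own, and the exact per-byte check stands in for the test dl / test dh / test edx sequence that follows in the listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Quick filter matching the add / xor / xor / test sequence in the
 * listing. A false result means no byte of v is zero. A true result
 * means "look closer": it can fire on a dword with no zero byte,
 * for example when the top byte is 80h. */
static bool may_contain_zero_byte(uint32_t v)
{
    return (((v + 0x7EFEFEFFu) ^ ~v) & 0x81010100u) != 0;
}

/* Exact check, byte by byte, used to confirm a hit from the filter. */
static bool has_zero_byte_exact(uint32_t v)
{
    return (v & 0x000000FFu) == 0 ||
           (v & 0x0000FF00u) == 0 ||
           (v & 0x00FF0000u) == 0 ||
           (v & 0xFF000000u) == 0;
}
```

    That false positive is why the listing still tests each byte after the filter fires and jumps back to main_loop_0 when none of them turns out to be zero.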

  16. #56

    Default

    Which is written in C so it's kind of useless for low level coders. Thanks for sharing though, people can use this to compare between what the compiler generates to our smaller refined samples.


  17. #57

    Default

    The size of ulliklliwi's assembly has more to do with the extra sanity checks it does on the input than the language it is written in. Here's a plain strncpy generated by GCC 3.4.4:
    Code:
            .text
    .globl _strncpy
            .def    _strncpy;       .scl    2;      .type   32;     .endef
    _strncpy:
            pushl   %ebp
            movl    %esp, %ebp
            movl    8(%ebp), %ecx
            pushl   %ebx
            movl    16(%ebp), %edx
            movl    12(%ebp), %ebx
    L12:
            testl   %edx, %edx
            je      L3
            movzbl  (%ebx), %eax
            decl    %edx
            incl    %ebx
            movb    %al, (%ecx)
            incl    %ecx
            testb   %al, %al
            jne     L12
    L3:
            decl    %edx
            cmpl    $-1, %edx
            je      L11
            movb    $0, (%ecx)
            incl    %ecx
            jmp     L3
    L11:
            popl    %ebx
            movl    8(%ebp), %eax
            popl    %ebp
            ret
    It's 48 bytes assembled.
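
    For reference, C along these lines would produce a loop structure like the one in that listing. This is a reconstruction from reading the assembly, not the actual source that was compiled, and the name plain_strncpy is made up so it doesn't clash with the library function.

```c
#include <stddef.h>

/* Copy at most n bytes from src to dest. If src ends early, pad the
 * rest of dest with zero bytes. Like the real strncpy, dest is NOT
 * terminated when src has n or more characters. */
static char *plain_strncpy(char *dest, const char *src, size_t n)
{
    char *d = dest;

    /* First loop (L12 in the listing): copy until n runs out or the
     * terminator has been copied. */
    while (n != 0) {
        n--;
        if ((*d++ = *src++) == '\0')
            break;
    }

    /* Second loop (L3 in the listing): zero-fill whatever is left. */
    while (n-- != 0)
        *d++ = '\0';

    return dest;
}
```

    Two byte-at-a-time loops and nothing else, which is where the small size comes from.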

  18. #58
    Programmer/PC Enthusiast Developer
    Gold Member

    Enlightened
    AgentGOD's Avatar
    Join Date
    Jul 2004
    Location
    127.0.0.1
    Posts
    2,760

    Default

    Why reinvent the wheel?

    lstrcpy is a nicely available function in the Windows API and it does what you want.
    Case: Antec 900
    CPU: Q9650 @ 4.0 GHz [IntelBurnTest stable]
    GPU: ATI Radeon HD6950 UL
    Motherboard: ASUS P5Q Deluxe
    Memory: 8 GB (4x 2 GB) OCZ Reaper HPC DDR2 1066
    Sound: Creative SB X-Fi Fatal1ty Pro
    PSU: Corsair AX-1200
    O/S: 7 Ultimate SP1 X64

    Purchase products here (e.g. Premium membership, Black Ops Intervention, H2SO4 for CSS/TF2/L4D, EliteControl for SC, MW2 Liberation v1.06+, etc)

    Get Premium for Just $8, and get MW2 Liberation v1.06!

  19. #59
    =) Senior Member
    Developer

    Supreme Being
    K? Pŕo?ćtiόnŹ's Avatar
    Join Date
    Oct 2004
    Posts
    11,794

    Default

    Because none of us are aware of that function, thanks AgentGOD for pointing out the obvious.

  20. #60
    Programmer/PC Enthusiast Developer
    Gold Member

    Enlightened
    AgentGOD's Avatar
    Join Date
    Jul 2004
    Location
    127.0.0.1
    Posts
    2,760

    Default

    Quote Originally Posted by K? Pŕo?ćtiόnŹ View Post
    Because none of us are aware of that function, thanks AgentGOD for pointing out the obvious.
    So why continue to reinvent the wheel if it's so obviously obvious to you people?
