[MMX Instruction Version] Mobility Calculation for Othello (Reversi)

xiaoxiao2021-03-06  1

Foreword: I should have posted this long ago, since I already mentioned it in a post on Nowcan's blog. I held off because I was not in the mood, or because I felt it was not finished. It is still not finished, but I am sharing it anyway. This code was written by the author of WZebra. It computes mobility in "black and white chess" (Othello/Reversi) using the bitboard technique and MMX instructions. The author says he took instruction pairing into account, so please do not reorder the code unless you are really sure you are right. The original code is inline assembly in GCC format. I converted it, and this is the version that compiles under VC 6.0. I hope it is of some use to you.

Also: I did this conversion a long time ago, about 2 or 3 years back (and I did not study the code very closely at the time), so you will see that the handling of the int count at the start and end of the function is not very rigorous (though perhaps it is more intuitive this way). It can in fact be made more efficient. I am too lazy to change it, so I leave the job of making the code faster to you. It is actually quite simple.
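For orientation, here is a rough portable C sketch of what a bitboard mobility routine computes. It is not the WZebra author's code and makes no attempt at instruction pairing. It assumes a C99 compiler with <stdint.h> (not VC 6.0), assumes bit index = 8 * row + column, and the names fill_up, fill_down, legal_moves and mobility are made up for illustration. The three mask constants are the same values that init_mmx assigns.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper (not from the original): squares reached by walking
   toward higher bit indices in steps of `shift`, starting next to a disc in
   `my` and crossing 1..6 discs of `opp` (already masked for this direction). */
static uint64_t fill_up(uint64_t my, uint64_t opp, int shift)
{
    uint64_t flip = (my << shift) & opp;
    int i;
    for (i = 0; i < 5; i++)
        flip |= (flip << shift) & opp;
    return flip << shift;
}

/* The same walk toward lower bit indices. */
static uint64_t fill_down(uint64_t my, uint64_t opp, int shift)
{
    uint64_t flip = (my >> shift) & opp;
    int i;
    for (i = 0; i < 5; i++)
        flip |= (flip >> shift) & opp;
    return flip >> shift;
}

/* Bitboard of legal moves for the side that owns `my`. */
uint64_t legal_moves(uint64_t my, uint64_t opp)
{
    const uint64_t diag = 0x007e7e7e7e7e7e00ULL; /* shifts 7 and 9 */
    const uint64_t vert = 0x00ffffffffffff00ULL; /* shift 8 */
    const uint64_t horz = 0x7e7e7e7e7e7e7e7eULL; /* shift 1 */
    uint64_t m;

    assert((my & opp) == 0); /* a square cannot hold both colours */

    m  = fill_up(my, opp & diag, 9) | fill_down(my, opp & diag, 9);
    m |= fill_up(my, opp & vert, 8) | fill_down(my, opp & vert, 8);
    m |= fill_up(my, opp & diag, 7) | fill_down(my, opp & diag, 7);
    m |= fill_up(my, opp & horz, 1) | fill_down(my, opp & horz, 1);
    return m & ~(my | opp); /* keep only empty squares */
}

/* Mobility = number of legal moves (simple clear-lowest-bit loop). */
int mobility(uint64_t my, uint64_t opp)
{
    uint64_t m = legal_moves(my, opp);
    int n = 0;
    while (m) {
        m &= m - 1;
        n++;
    }
    return n;
}
```

Masking the opponent's discs before each shift is what stops a walk from wrapping around a board edge into the next row. The assembly applies the same idea, but interleaves the MMX and integer instructions by hand.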

static unsigned __int64 dir_mask0;
static unsigned __int64 dir_mask1;
static unsigned __int64 dir_mask2;
static unsigned __int64 dir_mask3;
static unsigned __int64 dir_mask4;
static unsigned __int64 dir_mask5;
static unsigned __int64 dir_mask6;
static unsigned __int64 dir_mask7;
static unsigned __int64 c0f;
static unsigned __int64 c33;
static unsigned __int64 c55;

void init_mmx(void)
{
    dir_mask0 = 0x007e7e7e7e7e7e00;
    dir_mask1 = 0x00ffffffffffff00;
    dir_mask2 = 0x007e7e7e7e7e7e00;
    dir_mask3 = 0x7e7e7e7e7e7e7e7e;
    dir_mask4 = 0x7e7e7e7e7e7e7e7e;
    dir_mask5 = 0x007e7e7e7e7e7e00;
    dir_mask6 = 0x00ffffffffffff00;
    dir_mask7 = 0x007e7e7e7e7e7e00;
    c0f = 0x0f0f0f0f0f0f0f0f;
    c33 = 0x3333333333333333;
    c55 = 0x5555555555555555;
}

/* The two 32-bit halves read below as my_bits.high / my_bits.low. */
typedef struct {
    unsigned long high;
    unsigned long low;
} bitboard;
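The mask constants are easier to read as 8x8 bit patterns, one byte per board row, which is how the comments in the assembly list them. A small sketch of a helper that prints a mask this way, assuming a C99 compiler with <stdint.h>. The name mask_rows is made up for illustration and is not part of the original code.

```c
#include <stdint.h>

/* Write the 8 bytes of `mask` as 8 rows of '0'/'1' characters, most
   significant byte first. Each row is NUL-terminated. */
void mask_rows(uint64_t mask, char rows[8][9])
{
    int r, c;
    for (r = 0; r < 8; r++) {
        unsigned byte = (unsigned)(mask >> (8 * (7 - r))) & 0xffu;
        for (c = 0; c < 8; c++)
            rows[r][c] = (byte & (0x80u >> c)) ? '1' : '0';
        rows[r][8] = '\0';
    }
}
```

For 0x007e7e7e7e7e7e00 this gives a row of zeros, six rows of 01111110, and another row of zeros: the inner 6x6 block of the board, which contains every disc that can be flipped along a diagonal.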

int bitboard_mobility(const bitboard my_bits, const bitboard opp_bits)
{
    unsigned int count;

    __asm {

        /* Ready for init data */
        mov ebx, my_bits.high
        mov ecx, my_bits.low
        mov edi, opp_bits.high
        mov esi, opp_bits.low

        //
        movd mm0, ebx
        psllq mm0, 32
        movd mm3, ecx
        por mm0, mm3        ; mm0 is the bitboard of my_bits
        movd mm1, edi
        psllq mm1, 32
        movd mm4, esi
        por mm1, mm4        ; mm1 is the bitboard of opp_bits
        pxor mm2, mm2       ; mm2 <- 0x0000000000000000

        /* shift=-9 rowdelta=-1 coldelta=-1 */
        /* shift= 9 rowdelta= 1 coldelta= 1 */

        /* Disc #1, flip direction 0. */
        /* Disc #1, flip direction 7. */
        movq mm3, mm1       ; mm3

        /* Disc #2, flip direction 0. */
        /* Disc #2, flip direction 7. */
        movq mm5, mm4
        movq mm7, mm6
        psllq mm5, 9
        psrlq mm7, 9
        push ebx
        pand mm5, mm3
        pand mm7, mm3
        and edi, 0x7e7e7e7e ; 0x7e7e7e7e
        and esi, 0x7e7e7e7e ; 0x7e7e7e7e
        ; value of 0x7e7e7e7e:
        ; 01111110
        ; 01111110
        ; 01111110
        ; 01111110
        por mm4, mm5
        por mm6, mm7
        shl ebx, 1
        shl ecx, 1

        /* Disc #3, flip direction 0. */
        /* Disc #3, flip direction 7. */
        movq mm5, mm4
        movq mm7, mm6
        and ebx, edi
        and ecx, esi
        psllq mm5, 9
        psrlq mm7, 9
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shl edx, 1
        shl eax, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi

        /* Disc #4, flip direction 0. */
        /* Disc #4, flip direction 7. */
        movq mm5, mm4
        movq mm7, mm6
        or ebx, eax
        or ecx, edx
        psllq mm5, 9
        psrlq mm7, 9
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shl edx, 1
        shl eax, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi

        /* Disc #5, flip direction 0. */
        /* Disc #5, flip direction 7. */
        movq mm5, mm4
        movq mm7, mm6
        or ebx, eax
        or ecx, edx
        psllq mm5, 9
        psrlq mm7, 9
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shl edx, 1
        shl eax, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi

        /* Disc #6, flip direction 0. */
        /* Disc #6, flip direction 7. */
        movq mm5, mm4
        movq mm7, mm6
        or ebx, eax
        or ecx, edx
        psrlq mm7, 9
        psllq mm5, 9
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shl edx, 1
        shl eax, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi
        psllq mm4, 9
        psrlq mm6, 9
        or ebx, eax
        or ecx, edx
        por mm2, mm4
        por mm2, mm6
        mov eax, ebx
        mov edx, ecx

        /* shift=-8 rowdelta=-1 coldelta=0 */
        /* shift= 8 rowdelta= 1 coldelta=0 */

        /* Disc #1, flip direction 1. */
        /* Disc #1, flip direction 6. */
        movq mm3, mm1
        movq mm4, mm0
        movq mm6, mm0
        pand mm3, dir_mask1 ; 0x00ffffffffffff00
        ; value of dir_mask1:
        and eax, edi
        and edx, esi

        /* Disc #2, flip direction 1. */
        /* Disc #2, flip direction 6. */
        movq mm5, mm4
        movq mm7, mm6
        or ebx, eax
        or ecx, edx
        psllq mm5, 8
        psrlq mm7, 8
        shl ebx, 1
        shl ecx, 1
        pand mm5, mm3
        pand mm7, mm3
        por mm4, mm5
        por mm6, mm7

        /* Serialize here: add horizontal shl flips. */

        movd mm5, ebx
        psllq mm5, 32
        movd mm7, ecx
        por mm5, mm7
        por mm2, mm5

        /* Disc #3, flip direction 1. */
        /* Disc #3, flip direction 6. */
        movq mm5, mm4
        movq mm7, mm6
        psllq mm5, 8
        psrlq mm7, 8
        pop ebx
        pand mm5, mm3
        pand mm7, mm3
        pop ecx
        por mm4, mm5
        por mm6, mm7
        push ecx

        /* Disc #4, flip direction 1. */
        /* Disc #4, flip direction 6. */
        movq mm5, mm4
        movq mm7, mm6
        push ebx
        psllq mm5, 8
        psrlq mm7, 8
        shr ebx, 1
        shr ecx, 1
        pand mm5, mm3
        pand mm7, mm3
        and ebx, edi
        and ecx, esi
        por mm4, mm5
        por mm6, mm7

        /* Disc #5, flip direction 1. */
        /* Disc #5, flip direction 6. */
        movq mm5, mm4
        movq mm7, mm6
        psllq mm5, 8
        psrlq mm7, 8
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shr eax, 1
        shr edx, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi

        /* Disc #6, flip direction 1. */
        /* Disc #6, flip direction 6. */
        movq mm5, mm4
        movq mm7, mm6
        or ebx, eax
        or ecx, edx
        psllq mm5, 8
        psrlq mm7, 8
        mov eax, ebx
        mov edx, ecx
        pand mm5, mm3
        pand mm7, mm3
        shr eax, 1
        shr edx, 1
        por mm4, mm5
        por mm6, mm7
        and eax, edi
        and edx, esi
        psllq mm4, 8
        psrlq mm6, 8
        or ebx, eax
        or ecx, edx
        por mm2, mm4
        por mm2, mm6

        /* shift=-7 rowdelta=-1 coldelta= 1 */
        /* shift= 7 rowdelta= 1 coldelta=-1 */

        /* Disc #1, flip direction 2. */
        /* Disc #1, flip direction 5. */
        movq mm3, mm1
        movq mm4, mm0
        movq mm6, mm0
        pand mm3, dir_mask2 ; 0x007e7e7e7e7e7e00
        ; value of dir_mask2:
        ; 00000000
        ; 01111110
        ; 01111110
        ; 01111110
        ; 01111110
        ; 01111110
        ; 01111110
        ; 00000000
        psllq mm4, 7
        psrlq mm6, 7
        mov eax, ebx
        mov edx, ecx
        pand mm4, mm3
        pand mm6, mm3
        shr eax, 1
        shr edx, 1

        /* Disc #2, flip direction 2. */
        /* Disc #2, flip direction 5. */
        movq mm5, mm4
        movq mm7, mm6
        and eax, edi
        and edx, esi
        psllq mm5, 7
        psrlq mm7, 7
        or ebx, eax
        or ecx, edx
        pand mm5, mm3
        pand mm7, mm3
        mov eax, ebx
        mov edx, ecx
        por mm4, mm5
        por mm6, mm7
        shr eax, 1
        shr edx, 1

        /* Disc #3, flip direction 2. */
        /* Disc #3, flip direction 5. */
        movq mm5, mm4
        movq mm7, mm6
        and eax, edi
        and edx, esi
        psllq mm5, 7
        psrlq mm7, 7
        or ebx, eax
        or ecx, edx
        pand mm5, mm3
        pand mm7, mm3
        mov eax, ebx
        mov edx, ecx
        por mm4, mm5
        por mm6, mm7
        shr eax, 1
        shr edx, 1

        /* Disc #4, flip direction 2. */
        /* Disc #4, flip direction 5. */
        movq mm5, mm4
        movq mm7, mm6
        and eax, edi
        and edx, esi
        psllq mm5, 7
        psrlq mm7, 7
        or ebx, eax
        or ecx, edx
        pand mm5, mm3
        pand mm7, mm3
        mov eax, ebx
        mov edx, ecx
        por mm4, mm5
        por mm6, mm7
        shr eax, 1
        shr edx, 1

        /* Disc #5, flip direction 2. */
        /* Disc #5, flip direction 5. */
        movq mm5, mm4
        movq mm7, mm6
        and eax, edi
        and edx, esi
        psllq mm5, 7
        psrlq mm7, 7
        or ebx, eax
        or ecx, edx
        pand mm5, mm3
        pand mm7, mm3
        shr ebx, 1
        shr ecx, 1
        por mm4, mm5
        por mm6, mm7

        /* Serialize here: add horizontal shr flips. */

        movd mm5, ebx
        psllq mm5, 32
        movd mm7, ecx
        por mm5, mm7
        por mm2, mm5
        pop ebx

        /* Disc #6, flip direction 2. */
        /* Disc #6, flip direction 5. */
        movq mm5, mm4
        movq mm7, mm6
        psllq mm5, 7
        psrlq mm7, 7
        pop ecx
        pand mm5, mm3
        pand mm7, mm3
        pop edi
        por mm4, mm5
        por mm6, mm7
        pop esi
        psllq mm4, 7
        psrlq mm6, 7
        por mm2, mm4
        por mm2, mm6

        /* mm2 is the pseudo-feasible moves at this point. */
        /* Let mm7 be the feasible moves, i.e., mm2 restricted to empty squares. */

        movq mm7, mm0
        por mm7, mm1
        pandn mm7, mm2

        /* Count the moves, i.e., the number of bits set in mm7. */

        movq mm1, mm7
        psrld mm7, 1
        pand mm7, c55       ; c55 = 0x5555555555555555
        psubd mm1, mm7
        movq mm7, mm1
        psrld mm1, 2
        pand mm7, c33       ; c33 = 0x3333333333333333
        pand mm1, c33       ; c33 = 0x3333333333333333
        paddd mm7, mm1
        movq mm1, mm7
        psrld mm7, 4
        paddd mm7, mm1
        pand mm7, c0f       ; c0f = 0x0f0f0f0f0f0f0f0f

        movq mm1, mm7
        psrld mm7, 8
        paddd mm7, mm1
        movq mm1, mm7
        psrld mm7, 16
        paddd mm7, mm1
        movq mm1, mm7
        psrlq mm7, 32
        paddd mm7, mm1
        movd eax, mm7
        and eax, 63
        mov count, eax
        //
        emms
    }
    return count;
}
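The bit count at the end of the routine is a parallel sum: adjacent bits are added into 2-bit fields, then 4-bit fields, then bytes, and finally the bytes are added together. A 64-bit C sketch of the same idea, assuming a C99 compiler with <stdint.h>. The name popcount64 is made up for illustration. The assembly works on two 32-bit lanes and masks the result with 63, which is enough because a position can never have 64 legal moves. The sketch uses a single 64-bit word and masks with 127, so a fully set word also counts correctly.

```c
#include <stdint.h>

/* Number of set bits in x, computed with the same masks as c55, c33, c0f. */
int popcount64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);          /* 2-bit sums */
    x = (x & 0x3333333333333333ULL)
      + ((x >> 2) & 0x3333333333333333ULL);              /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;          /* byte sums  */
    x += x >> 8;                                         /* fold bytes */
    x += x >> 16;
    x += x >> 32;
    return (int)(x & 127);                               /* total in the low byte */
}
```

Each byte holds at most 8 after the third step, and the final total is at most 64, so none of the folding additions can carry into a neighbouring byte.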

When reposting, please cite the original address: http://www.9cbs.com/read-68182.html
