comp.programming

some sse exception

kenobi

8/20/2015 8:39:00 AM

As I said, I have this SSE code:

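#include <xmmintrin.h>  /* SSE: __m128, _mm_*_ps */
#include <emmintrin.h>  /* SSE2: __m128i, _mm_cvtps_epi32 */

/* Iterates z = z*z + c for four points at once and returns, per lane,
   the number of iterations for which |z|^2 < 4 held. */
__attribute__((force_align_arg_pointer))
__m128i mandelbrot_n_sse( __m128 cre, __m128 cim, int max_iter )
{
    __m128 re = _mm_setzero_ps();
    __m128 im = _mm_setzero_ps();

    __m128 _1 = _mm_set_ps1(1.);
    __m128 _4 = _mm_set_ps1(4.0);

    __m128 iteration_counter = _mm_set_ps1(0.);

    for(int n=0; n<=max_iter; n++)
    {
        __m128 re2 = _mm_mul_ps(re, re);
        __m128 im2 = _mm_mul_ps(im, im);
        __m128 radius2 = _mm_add_ps(re2,im2);

        /* all-ones in every lane where |z|^2 is still below 4 */
        __m128 compare_mask = _mm_cmplt_ps( radius2, _4);
        /* add 1.0 to the counter of those lanes only */
        iteration_counter = _mm_add_ps( iteration_counter, _mm_and_ps(compare_mask, _1) );
        /* stop once all four lanes have escaped */
        if (_mm_movemask_ps(compare_mask)==0) break;

        /* z = z*z + c, computed for all four lanes at once */
        __m128 ren = _mm_add_ps( _mm_sub_ps(re2, im2), cre);
        __m128 reim = _mm_mul_ps(re, im);
        __m128 imn = _mm_add_ps( _mm_add_ps(reim, reim), cim);

        re = ren;
        im = imn;
    }

    __m128i n = _mm_cvtps_epi32(iteration_counter);
    return n;
}
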
Yesterday I unmasked the invalid-operation exception
in MXCSR (cleared the mask bit, bit 7).
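
For reference, a minimal sketch of how the invalid-operation mask might be cleared using the standard xmmintrin.h intrinsics. The post does not show this step, so this is an assumption about how it was done, and the helper names unmask_sse_invalid and restore_mxcsr are hypothetical.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_MASK_INVALID */

/* Hypothetical helper: clear the invalid-operation mask (MXCSR bit 7) so
   that an invalid SSE operation traps instead of quietly producing NaN.
   Returns the previous MXCSR value so the caller can restore it later. */
static unsigned int unmask_sse_invalid(void)
{
    unsigned int old_csr = _mm_getcsr();
    _mm_setcsr(old_csr & ~_MM_MASK_INVALID);  /* _MM_MASK_INVALID == 0x0080 */
    return old_csr;
}

/* Hypothetical helper: put MXCSR back to a previously saved value. */
static void restore_mxcsr(unsigned int saved_csr)
{
    _mm_setcsr(saved_csr);
}
```

Saving and restoring the old value keeps the change local to the code under test, so other floating-point code in the program keeps running with its usual masked behaviour.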

With the exception masked the code works fine; with it unmasked I get
an exception. The basic exception info is:

exception code: c000 02b5 (STATUS_FLOAT_MULTIPLE_TRAPS)
flags: 0
address: 0040 d477
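
Since the code runs to completion with the exception masked, one possible way to narrow this down is to read the sticky status flags in MXCSR (bits 0-5) after a masked run, which shows which conditions were raised without trapping. A sketch under that assumption, with hypothetical helper names:

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_EXCEPT_* */

/* Hypothetical helper: clear the six sticky exception flags (MXCSR bits 0-5). */
static void clear_sse_exception_flags(void)
{
    _mm_setcsr(_mm_getcsr() & ~_MM_EXCEPT_MASK);
}

/* Hypothetical helper: return the sticky flags raised since the last clear.
   Test the result against _MM_EXCEPT_INVALID, _MM_EXCEPT_OVERFLOW, etc. */
static unsigned int read_sse_exception_flags(void)
{
    return _mm_getcsr() & _MM_EXCEPT_MASK;
}
```

Calling clear_sse_exception_flags() before mandelbrot_n_sse and read_sse_exception_flags() after it, with the exception still masked, would show whether _MM_EXCEPT_INVALID gets set and which other flags accompany it.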

The code as seen in OllyDbg is:

0040D450 $ 55 PUSH EBP
0040D451 . 89E5 MOV EBP,ESP
0040D453 . 83E4 F0 AND ESP,FFFFFFF0
0040D456 . 83EC 10 SUB ESP,10
0040D459 . 8B4D 08 MOV ECX,DWORD PTR SS:[EBP+8]
0040D45C . 0F290424 MOVAPS DQWORD PTR SS:[ESP],XMM0
0040D460 . 85C9 TEST ECX,ECX
0040D462 . 78 5B JS SHORT program.0040D4BF
0040D464 . 0F57C0 XORPS XMM0,XMM0
0040D467 . 31C0 XOR EAX,EAX
0040D469 . 0F28D0 MOVAPS XMM2,XMM0
0040D46C . 0F28E0 MOVAPS XMM4,XMM0
0040D46F . EB 1A JMP SHORT program.0040D48B
0040D471 > 0F59D4 MULPS XMM2,XMM4
0040D474 . 83C0 01 ADD EAX,1
0040D477 . 0F5CF5 SUBPS XMM6,XMM5
0040D47A . 0F282424 MOVAPS XMM4,DQWORD PTR SS:[ESP]
0040D47E . 39C1 CMP ECX,EAX
0040D480 . 0F58E6 ADDPS XMM4,XMM6
0040D483 . 0F58D2 ADDPS XMM2,XMM2
0040D486 . 0F58D1 ADDPS XMM2,XMM1
0040D489 . 7C 2E JL SHORT program.0040D4B9
0040D48B > 0F28F4 MOVAPS XMM6,XMM4
0040D48E . 0F28EA MOVAPS XMM5,XMM2
0040D491 . 0F283D 9008440>MOVAPS XMM7,DQWORD PTR DS:[440890]
0040D498 . 0F59F4 MULPS XMM6,XMM4
0040D49B . 0F59EA MULPS XMM5,XMM2
0040D49E . 0F28DE MOVAPS XMM3,XMM6
0040D4A1 . 0F58DD ADDPS XMM3,XMM5
0040D4A4 . 0FC21D 8008440>CMPLTPS XMM3,DQWORD PTR DS:[440880]
0040D4AC . 0F54FB ANDPS XMM7,XMM3
0040D4AF . 0F50D3 MOVMSKPS EDX,XMM3
0040D4B2 . 85D2 TEST EDX,EDX
0040D4B4 . 0F58C7 ADDPS XMM0,XMM7
0040D4B7 .^75 B8 JNZ SHORT program.0040D471
0040D4B9 > 66:0F5B ??? ; Unknown command
0040D4BC C0 DB C0
0040D4BD . C9 LEAVE
0040D4BE . C3 RETN
0040D4BF > 0F57C0 XORPS XMM0,XMM0
0040D4C2 .^EB F5 JMP SHORT program.0040D4B9
0040D4C4 8DB6 00000000 LEA ESI,DWORD PTR DS:[ESI]
0040D4CA 8DBF 00000000 LEA EDI,DWORD PTR DS:[EDI]

Could someone give me a hint as to why it is crashing?