Short Excursion

Published July 08, 2007
SSE Matrix Multiply

I don't know if you guys have ever played around with writing SSE assembly, but I have to say that it is really quite easy and can have a real performance impact. I took a short break from writing to implement a matrix multiply member function for my SSE matrix/vector classes. After about 30 minutes of work, I compared my normal class against the new one. The original function takes approximately 320 cycles to execute, while the new one takes approximately 250.
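
For readers who want to try something similar, here is a minimal sketch of a 4x4 matrix multiply. It is not the code from this entry: it uses compiler intrinsics from `<xmmintrin.h>` rather than hand-written assembly, and the `Mat4` type, the row-major layout, and the function names `mul_scalar` and `mul_sse` are all assumptions made for illustration.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (__m128, _mm_mul_ps, ...)

// Hypothetical 4x4 row-major matrix; not the author's class.
struct Mat4 {
    float m[16];
};

// Plain scalar version, used as a reference.
Mat4 mul_scalar(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k) {
                sum += a.m[i * 4 + k] * b.m[k * 4 + j];
            }
            r.m[i * 4 + j] = sum;
        }
    }
    return r;
}

// SSE version: each result row is a linear combination of the rows of b,
// weighted by the four elements of the matching row of a.
Mat4 mul_sse(const Mat4& a, const Mat4& b) {
    // Unaligned loads keep the sketch safe without alignment guarantees.
    const __m128 b0 = _mm_loadu_ps(b.m + 0);
    const __m128 b1 = _mm_loadu_ps(b.m + 4);
    const __m128 b2 = _mm_loadu_ps(b.m + 8);
    const __m128 b3 = _mm_loadu_ps(b.m + 12);

    Mat4 r;
    for (int i = 0; i < 4; ++i) {
        // Broadcast one element of a across all four lanes, then scale a row of b.
        __m128 row = _mm_mul_ps(_mm_set1_ps(a.m[i * 4 + 0]), b0);
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i * 4 + 1]), b1));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i * 4 + 2]), b2));
        row = _mm_add_ps(row, _mm_mul_ps(_mm_set1_ps(a.m[i * 4 + 3]), b3));
        _mm_storeu_ps(r.m + i * 4, row);
    }
    return r;
}
```

Keeping the scalar version around makes it easy to check the SSE path for correctness before timing the two against each other.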

I am certainly not an assembly or SSE expert, and have primarily used C++ for most of my programming. For the life of me, I can't understand why people aren't using SSE - I think both Intel and AMD support it now (although I don't know how far back AMD started supporting it) so there is a wide support base. Anyhow, I think I'll continue my little tests to see what else I can speed up...

Comments

sirob
Do you enable SSE optimizations when compiling your C++ code? If you don't, it won't be used. I seem to recall SSE optimization is disabled by default.
July 08, 2007 01:51 PM
superpig
Yeah, SSE is awesome fun. I've been meaning to write an article on it for ages.

I think that the compiler 'SSE Optimizations' flag doesn't do very much beyond using instructions like cvttss2si for things like float->int conversions. In any case, to get the best use out of SSE you need to design for it - store your data as structure-of-arrays rather than array-of-structures, etc.
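
To make the structure-of-arrays point concrete, here is a rough sketch. The `Vec3Array` type and the `length_squared` function are hypothetical names invented for this example, and it assumes the three component arrays always have the same length. Because each component is stored contiguously, four vectors can be processed per instruction without any shuffling.

```cpp
#include <xmmintrin.h>  // SSE intrinsics
#include <cstddef>
#include <vector>

// Array-of-structures would be: struct Vec3 { float x, y, z; }; std::vector<Vec3>.
// Structure-of-arrays keeps each component in its own contiguous array instead.
// Hypothetical type; assumes x, y and z have equal sizes.
struct Vec3Array {
    std::vector<float> x, y, z;
};

// Computes out[i] = x[i]^2 + y[i]^2 + z[i]^2, four vectors at a time.
void length_squared(const Vec3Array& v, std::vector<float>& out) {
    const std::size_t n = v.x.size();
    out.resize(n);

    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128 x = _mm_loadu_ps(&v.x[i]);
        const __m128 y = _mm_loadu_ps(&v.y[i]);
        const __m128 z = _mm_loadu_ps(&v.z[i]);
        const __m128 sum = _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y)),
            _mm_mul_ps(z, z));
        _mm_storeu_ps(&out[i], sum);
    }

    // Scalar tail for the last 0-3 elements.
    for (; i < n; ++i) {
        out[i] = v.x[i] * v.x[i] + v.y[i] * v.y[i] + v.z[i] * v.z[i];
    }
}
```

With an array-of-structures layout the same loop would first have to gather x, y and z out of interleaved memory, which eats into the gain.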
July 08, 2007 03:44 PM
jollyjeffers
Quote: I seem to recall SSE optimization is disabled by default.
This is a good thing [smile]

I forget the details, but a year or two back I sent an SSE compiled version of my 'HDR Pipeline' SDK sample to Simon who watched it explode via 'illegal operation' on his AMD machine. He did some digging and it seems that it was tripping over on an SSE instruction his CPU didn't support.

Maybe things have changed since then, or maybe this was more of a special case... but either way, I'm wary of that compiler flag [smile]
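
One common way to guard against this kind of crash is to check at runtime what the CPU supports and pick a code path accordingly. The sketch below is only an illustration under a few assumptions: it relies on the `__builtin_cpu_supports` and `__builtin_cpu_init` builtins, which are specific to GCC and Clang on x86, and the function names and the choice of SSE3 as the "newer" level are hypothetical, since the instruction that failed in the case above isn't known.

```cpp
#include <string>

// GCC/Clang on x86 only: these builtins query the CPU's feature flags.
// The feature name must be a string literal.
bool cpu_has_sse()  { return __builtin_cpu_supports("sse") != 0; }
bool cpu_has_sse3() { return __builtin_cpu_supports("sse3") != 0; }

// Hypothetical dispatcher: returns the name of the code path to use,
// preferring the newest instruction set the CPU reports.
std::string pick_code_path() {
    __builtin_cpu_init();  // make sure the feature data is initialised
    if (cpu_has_sse3()) return "sse3";
    if (cpu_has_sse())  return "sse";
    return "scalar";
}
```

In a real program the returned name would be replaced by a function pointer to the matching implementation, with the scalar version kept as the fallback.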



Cheers,
Jack
July 09, 2007 03:30 AM