TMS320C6000 Intrinsics and Assembly Instructions

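| Intrinsic | Assembly instruction | Brief description |
| --- | --- | --- |
| `int _abs (int src);` `int _labs (_int40_t src);` | ABS | Returns the absolute value of src. |
| `int _add2 (int src1, int src2);` | ADD2 | Adds the upper and lower 16 bits of src1 to the upper and lower 16 bits of src2 and places the sums in the upper and lower 16 bits of the result. |
| `ushort & _amem2 (void *ptr);` | LDHU / STH | Loads a halfword from memory into dst, or stores one. The pointer must be 2-byte aligned (load or store). |
| `const ushort & _amem2_const (const void *ptr);` | LDHU | The pointer must be 2-byte aligned (load only). |
| `unsigned & _amem4 (void *ptr);` | LDW / STW | The pointer must be 4-byte aligned (load or store). |
| `const unsigned & _amem4_const (const void *ptr);` | LDW | The pointer must be 4-byte aligned (load only). |
| `double & _amemd8 (void *ptr);` | LDW/LDW, STW/STW | The pointer must be 8-byte aligned (load or store). |
| `const double & _amemd8_const (const void *ptr);` | LDDW | The pointer must be 8-byte aligned (load only). |
| `unsigned _clr (unsigned src2, unsigned csta, unsigned cstb);` | CLR | Clears the specified field in src2. csta and cstb give the first and last bits to clear. |
| `unsigned _clrr (unsigned src2, int src1);` | CLR | Clears the specified field in src2. The first and last bits to clear are given by the lower 10 bits of src1. |
| `_int40_t _dtol (double src);` | | Reinterprets a double register pair as an _int40_t. |
| `long long _dtoll (double src);` | | Reinterprets a double register pair as a long long. |
| `int _ext (int src2, unsigned csta, unsigned cstb);` | EXT | Extracts the field of src2 specified by csta and cstb and sign-extends it to 32 bits. The field is extracted by a left shift followed by a signed right shift. |
| `int _extr (int src2, int src1);` | EXT | As _ext, except that the left and right shift amounts are given by the lower 10 bits of src1. |
| `unsigned _extu (unsigned src2, unsigned csta , unsigned cstb);` | EXTU | As _ext, except that the result is zero-extended to 32 bits. |
| `unsigned _extur (unsigned src2, int src1);` | EXTU | As _extu, except that the left and right shift amounts are given by the lower 10 bits of src1. |
| `unsigned _ftoi (float src);` | | Reinterprets the bits of a float as an unsigned. Example: _ftoi (1.0) = 1065353216U |
| `unsigned _hi (double src);` | | Returns the high (odd) register of a double register pair. |
| `unsigned _hill (long long src);` | | Returns the high (odd) register of a long long register pair. |
| `double _itod (unsigned src2, unsigned src1);` | | Builds a new double register pair from two unsigned values, where src2 is the high (odd) register and src1 is the low (even) register. |
| `float _itof (unsigned src);` | | Reinterprets the bits of an unsigned as a float. Example: _itof (0x3f800000) = 1.0 |
| `long long _itoll (unsigned src2, unsigned src1);` | | Builds a new long long register pair from two unsigned values, where src2 is the high (odd) register and src1 is the low (even) register. |
| `unsigned _lmbd (unsigned src1, unsigned src2);` | LMBD | Searches src2 for the leftmost 1 or 0; the LSB of src1 selects which. Returns the number of bits up to the bit change. |
| `unsigned _lo (double src);` | | Returns the low (even) register of a double register pair. |
| `unsigned _loll (long long src);` | | Returns the low (even) register of a long long register pair. |
| `double _ltod (_int40_t src);` | | Reinterprets an _int40_t register pair as a double register pair. |
| `double _lltod (long long src);` | | Reinterprets a long long register pair as a double register pair. |
| `int _mpy (int src1, int src2);` | MPY | Multiplies the 16 LSBs of src1 by the 16 LSBs of src2. Operands are signed by default. |
| `int _mpyus (unsigned src1, int src2);` | MPYUS | As _mpy, with src1 unsigned and src2 signed. The S in the mnemonic marks the signed operand when signed and unsigned operands are mixed. |
| `int _mpysu (int src1, unsigned src2);` | MPYSU | As _mpy, with src1 signed and src2 unsigned. |
| `unsigned _mpyu (unsigned src1, unsigned src2);` | MPYU | As _mpy, with both operands unsigned. |
| `int _mpyh (int src1, int src2);` | MPYH | As _mpy, but multiplies the 16 MSBs of src1 by the 16 MSBs of src2. |
| `int _mpyhus (unsigned src1, int src2);` | MPYHUS | |
| `int _mpyhsu (int src1, unsigned src2);` | MPYHSU | |
| `unsigned _mpyhu (unsigned src1, unsigned src2);` | MPYHU | |
| `int _mpyhl (int src1, int src2);` | MPYHL | As _mpy, but multiplies the 16 MSBs of src1 by the 16 LSBs of src2. |
| `int _mpyhuls (unsigned src1, int src2);` | MPYHULS | |
| `int _mpyhslu (int src1, unsigned src2);` | MPYHSLU | |
| `unsigned _mpyhlu (unsigned src1, unsigned src2);` | MPYHLU | |
| `int _mpylh (int src1, int src2);` | MPYLH | |
| `int _mpyluhs (unsigned src1, int src2);` | MPYLUHS | |
| `int _mpylshu (int src1, unsigned src2);` | MPYLSHU | |
| `unsigned _mpylhu (unsigned src1, unsigned src2);` | MPYLHU | |
| `void _nassert (int src);` | | Generates no code. Passes information to the optimizer. |
| `unsigned _norm (int src);` `unsigned _lnorm (_int40_t src);` | NORM | Returns the number of redundant sign bits of src. |
| `int _sadd (int src1, int src2);` `long _lsadd (int src1, _int40_t src2);` | SADD | Adds src1 to src2 and saturates the result. |
| `int _sat (_int40_t src2);` | SAT | Converts a 40-bit long to a 32-bit signed int, saturating the result if necessary. |
| `unsigned _set (unsigned src2, unsigned csta , unsigned cstb);` | SET | Sets the specified field in src2 to all 1s. The field is given by csta and cstb. |
| `unsigned _setr (unsigned src2, int src1);` | SET | |
| `int _smpy (int src1, int src2);` | SMPY | Multiplies the lower 16 bits of src1 by the lower 16 bits of src2, shifts the product left by 1, and saturates it. |
| `int _smpyh (int src1, int src2);` | SMPYH | As _smpy, using the upper 16 bits. |
| `int _smpyhl (int src1, int src2);` | SMPYHL | |
| `int _smpylh (int src1, int src2);` | SMPYLH | |
| `int _sshl (int src2, unsigned src1);` | SSHL | Shifts src2 left by the amount in src1 and saturates the result to 32 bits. |
| `int _ssub (int src1, int src2);` `_int40_t _lssub (int src1, _int40_t src2);` | SSUB | Subtracts src2 from src1 and saturates the result (src1 - src2). |
| `unsigned _subc (unsigned src1, unsigned src2);` | SUBC | Conditional subtract and left shift (commonly used for division). |
| `int _sub2 (int src1, int src2);` | SUB2 | Subtracts the upper and lower 16 bits of src2 from the upper and lower 16 bits of src1. A borrow from the lower half does not affect the upper half. |
| `int _abs2 (int src);` | ABS2 | Computes the absolute value of each 16-bit value. |
| `int _add4 (int src1, int src2);` | ADD4 | Adds the four pairs of 8-bit values in src1 and src2. No saturation is performed, and a carry does not affect the other 8-bit values. |
| `long long & _amem8 (void *ptr);` | LDDW / STDW | Loads and stores 8 bytes. The pointer must be 8-byte aligned. |
| `const long long & _amem8_const (const void *ptr);` | LDDW | Loads 8 bytes. The pointer must be 8-byte aligned. |
| `_float2_t & _amem8_f2(void * ptr);` | LDDW / STDW | Loads and stores 8 bytes. The pointer must be 8-byte aligned. c6x.h must be included. |
| `const _float2_t & _amem8_f2_const(void * ptr);` | LDDW | Loads 8 bytes. The pointer must be 8-byte aligned. c6x.h must be included. |
| `double & _amemd8 (void *ptr);` | LDDW / STDW | |
| `const double & _amemd8_const (const void *ptr);` | LDDW | |
| `int _avg2 (int src1, int src2);` | AVG2 | Computes the average of each pair of signed 16-bit values. |
| `unsigned _avgu4 (unsigned, unsigned);` | AVGU4 | Computes the average of each pair of unsigned 8-bit values. |
| `unsigned _bitc4 (unsigned src);` | BITC4 | Counts the bits set to 1 in each 8-bit value and writes each count to the corresponding position of the result. |
| `unsigned _bitr (unsigned src);` | BITR | Reverses the order of the bits. |
| `int _cmpeq2 (int src1, int src2);` | CMPEQ2 | Compares each pair of 16-bit values for equality. The results go in the 2 LSBs of dst. |
| `int _cmpeq4 (int src1, int src2);` | CMPEQ4 | Compares each pair of 8-bit values for equality. The results go in the 4 LSBs of dst: 1 if equal, otherwise 0. |
| `int _cmpgt2 (int src1, int src2);` | CMPGT2 | Signed comparison of each pair of 16-bit values: 1 if src1 > src2, otherwise 0. The results go in the 2 LSBs of dst. |
| `unsigned _cmpgtu4 (unsigned src1, unsigned src2);` | CMPGTU4 | Unsigned comparison of each pair of 8-bit values: 1 if src1 > src2, otherwise 0. The results go in the 4 LSBs of dst. |
| `unsigned _deal (unsigned src );` | DEAL | Extracts the odd and even bits of src and regroups them: the even bits go in the lower 16 bits and the odd bits in the upper 16 bits. |
| `int _dotp2 (int src1, int src2);` `_int40_t _ldotp2 (int src1, int src2);` | DOTP2 / DOTP2 | Dot product of the signed 16-bit pairs in src1 and src2. The result is written as a signed 32-bit int or sign-extended to 64 bits. |
| `int _dotpn2 (int src1, int src2);` | DOTPN2 | Product of the signed upper 16-bit values of src1 and src2 minus the product of the signed lower 16-bit values. |
| `int _dotpnrsu2 (int src1, unsigned src2);` | DOTPNRSU2 | Product of the upper 16-bit values of src1 and src2 minus the product of the lower 16-bit values. The values in src1 are treated as signed and those in src2 as unsigned. 2^15 is added, and the result is sign-shifted right by 16. |
| `int _dotprsu2 (int src1, unsigned src2);` | DOTPRSU2 | Product of the upper 16-bit values of src1 and src2 plus the product of the lower 16-bit values. The values in src1 are treated as signed and those in src2 as unsigned. 2^15 is added, and the result is sign-shifted right by 16. |
| `int _dotpsu4 (int src1, unsigned src2);` | DOTPSU4 | Multiplies each pair of 8-bit values in src1 and src2 and sums the products. The 8-bit values in src1 are treated as signed and those in src2 as unsigned. |
| `unsigned _dotpu4 (unsigned src1, unsigned src2);` | DOTPU4 | As _dotpsu4, with all values treated as unsigned. |
| `int _gmpy4 (int src1, int src2);` | GMPY4 | Galois field multiply of the four unsigned 8-bit values in src1 and src2. |
| `int _max2 (int src1, int src2);` | MAX2 | Compares the two signed 16-bit integers in src1 and src2 pairwise and returns the larger of each pair. |
| `int _min2 (int src1, int src2);` | MIN2 | Compares the two signed 16-bit integers in src1 and src2 pairwise and returns the smaller of each pair. |
| `unsigned _maxu4 (unsigned src1, unsigned src2);` | MAXU4 | Compares the four unsigned 8-bit integers in src1 and src2 pairwise and returns the larger of each pair. |
| `unsigned _minu4 (unsigned src1, unsigned src2);` | MINU4 | Compares the four unsigned 8-bit integers in src1 and src2 pairwise and returns the smaller of each pair. |
| `ushort & _mem2 (void * ptr);` | LDB/LDB, STB/STB | Loads and stores 2 bytes. No alignment required. |
| `const ushort & _mem2_const (const void * ptr);` | LDB/LDB | Loads 2 bytes. No alignment required. |
| `unsigned & _mem4 (void * ptr);` | LDNW / STNW | Loads and stores 4 bytes. No alignment required. |
| `const unsigned & _mem4_const (const void * ptr);` | LDNW | Loads 4 bytes. No alignment required. |
| `long long & _mem8 (void * ptr);` | LDNDW / STNDW | Loads and stores 8 bytes. No alignment required. |
| `const long long & _mem8_const (const void * ptr);` | LDNDW | Loads 8 bytes. No alignment required. |
| `double & _memd8 (void * ptr);` | LDNDW / STNDW | Loads and stores 8 bytes. No alignment required. |
| `const double & _memd8_const (const void * ptr);` | LDNDW | Loads 8 bytes. No alignment required. |
| `long long _mpy2ll (int src1, int src2);` | MPY2 | Multiplies the two signed 16-bit values in src1 by the corresponding values in src2 and writes the two 32-bit products to a long long. |
| `long long _mpyhill (int src1, int src2);` | MPYHI | Multiplies the upper 16 bits of src1, as a signed 16-bit value, by the signed 32-bit src2. The result is written to the lower 48 bits of a long long. |
| `long long _mpylill (int src1, int src2);` | MPYLI | Multiplies the lower 16 bits of src1, as a signed 16-bit value, by the signed 32-bit src2. The result is written to the lower 48 bits of a long long. |
| `int _mpyhir (int src1, int src2);` | MPYHIR | Multiplies the upper 16 bits of src1, as a signed 16-bit value, by the signed 32-bit src2. The product is rounded to 32 bits by adding 2^14 and then shifted right by 15. |
| `int _mpylir (int src1, int src2);` | MPYLIR | Multiplies the lower 16 bits of src1, as a signed 16-bit value, by the signed 32-bit src2. The product is rounded to 32 bits by adding 2^14 and then shifted right by 15. |
| `long long _mpysu4ll (int src1, unsigned src2);` | MPYSU4 | Multiplies the four signed 8-bit values in src1 by the four unsigned 8-bit values in src2, giving four signed 16-bit values packed into 64 bits. |
| `long long _mpyu4ll (unsigned src1, unsigned src2);` | MPYU4 | Multiplies the four unsigned 8-bit values in src1 and src2, giving four unsigned 16-bit values packed into 64 bits. |
| `int _mvd (int src2 );` | MVD | Moves the data in src2 to the return value through the multiplier pipeline (which adds delay). |
| `unsigned _pack2 (unsigned src1, unsigned src2);` | PACK2 | |
| `unsigned _packh2 (unsigned src1, unsigned src2);` | PACKH2 | |
| `unsigned _packh4 (unsigned src1, unsigned src2);` | PACKH4 | |
| `unsigned _packl4 (unsigned src1, unsigned src2);` | PACKL4 | |
| `unsigned _packhl2 (unsigned src1, unsigned src2);` | PACKHL2 | |
| `unsigned _packlh2 (unsigned src1, unsigned src2);` | PACKLH2 | |
| `unsigned _rotl (unsigned src1, unsigned src2);` | ROTL | Rotates the 32-bit value src1 left by the amount in the 5 LSBs of src2. Bits 5-31 of the rotate amount are ignored. |
| `int _sadd2 (int src1, int src2);` | SADD2 | Adds the two signed 16-bit values in src1 to those in src2, producing two saturated signed 16-bit results. |
| `int _saddus2 (unsigned src1, int src2);` | SADDUS2 | Adds the two unsigned 16-bit values in src1 to the two signed 16-bit values in src2, producing two unsigned 16-bit results. |
| `unsigned _saddu4 (unsigned src1, unsigned src2);` | SADDU4 | Adds the four unsigned 8-bit values in src1 to those in src2, with saturation. |
| `unsigned _shfl (unsigned src2);` | SHFL | Interleaves the upper 16 bits and the lower 16 bits of src2. |
| `unsigned _shlmb (unsigned src1, unsigned src2);` | SHLMB | Shifts src2 left by 1 byte and fills the vacated byte with the most significant byte of src1. |
| `unsigned _shrmb (unsigned src1, unsigned src2);` | SHRMB | Shifts src2 right by 1 byte and fills the vacated byte with the least significant byte of src1. |
| `int _shr2 (int src1, unsigned src2);` | SHR2 | Shifts each of the two signed 16-bit values in src1 right by the amount in the 5 LSBs of src2. Vacated bits are filled with the sign bit. |
| `unsigned _shru2 (unsigned src1, unsigned src2);` | SHRU2 | Shifts each of the two unsigned 16-bit values in src1 right by the amount in the 5 LSBs of src2. Vacated bits are filled with 0. |
| `long long _smpy2ll (int src1, int src2);` | SMPY2 | Multiplies the two signed 16-bit values in src1 by those in src2, shifts each product left by 1, and saturates. |
| `int _spack2 (int src1, int src2);` | SPACK2 | Saturates the signed 32-bit values src1 and src2 to signed 16 bits. The saturated src1 goes in the upper 16 bits of dst and the saturated src2 in the lower 16 bits. |
| `unsigned _spacku4 (int src1 , int src2);` | SPACKU4 | Saturates the four signed 16-bit values in src1 and src2 to unsigned 8-bit values. |
| `int _sshvl (int src2, int src1);` | SSHVL | Shifts the signed 32-bit value in src2 left or right by the number of bits given in src1, which lies in [-31, 31]. If src1 is positive, src2 is shifted left. If src1 is negative, src2 is shifted right by the absolute value of src1, with sign extension. |
| `int _sshvr (int src2, int src1);` | SSHVR | Shifts the signed 32-bit value in src2 left or right by the number of bits given in src1, which lies in [-31, 31]. If src1 is positive, src2 is shifted right with sign extension. If src1 is negative, src2 is shifted left by the absolute value of src1. |
| `int _sub4 (int src1, int src2);` | SUB4 | Subtracts the four 8-bit values in src2 from those in src1, without saturation. |
| `int _subabs4 (int src1, int src2);` | SUBABS4 | Subtracts the four unsigned 8-bit values in src1 and src2 and takes the absolute value of each difference. |
| `unsigned _swap4 (unsigned src);` | SWAP4 | Exchanges the two bytes within each 16-bit half of src. |
| `unsigned _unpkhu4 (unsigned src);` | UNPKHU4 | Unpacks with zero extension. |
| `unsigned _unpklu4 (unsigned src);` | UNPKLU4 | Unpacks with zero extension. |
| `unsigned _xpnd2 (unsigned src);` | XPND2 | Expands the 2 LSBs of src: bit 1 fills the upper 16 bits and bit 0 fills the lower 16 bits. |
| `unsigned _xpnd4 (unsigned src);` | XPND4 | Expands the 4 LSBs of src, one bit per byte. |
| `long long _addsub (int src1, int src2);` | ADDSUB | Performs two operations in parallel: (1) src1 + src2 -> dst_o; (2) src1 - src2 -> dst_e. |
| `long long _addsub2 (int src1, int src2);` | ADDSUB2 | Signed 16-bit. ADD2: upper and lower 16 bits of src1 + upper and lower 16 bits of src2 -> dst_o. SUB2: upper and lower 16 bits of src1 - upper and lower 16 bits of src2 -> dst_e. |
| `long long _cmpy (unsigned src1, unsigned src2);` | CMPY | Signed 16-bit complex multiply. (upper src1 × upper src2) - (lower src1 × lower src2) -> dst_o. Saturated ((upper src1 × lower src2) + (lower src1 × upper src2)) -> dst_e. |
| `unsigned _cmpyr (unsigned src1, unsigned src2);` | CMPYR | |
| `unsigned _cmpyr1 (unsigned src1, unsigned src2 );` | CMPYR1 | |
| `long long _ddotp4 (unsigned src1, unsigned src2);` | DDOTP4 | No saturation. |
| `long long _ddotph2 (long long src1, unsigned src2);` | DDOTPH2 | |
| `long long _ddotpl2 (long long src1, unsigned src2);` | DDOTPL2 | |
| `unsigned _ddotph2r (long long src1, unsigned src2);` | DDOTPH2R | |
| `unsigned _ddotpl2r (long long src1, unsigned src2);` | DDOTPL2R | |
| `long long _dmv (int src1, int src2);` | DMV | Moves two registers into one register pair in a single operation. |
| `long long _dpack2 (unsigned src1, unsigned src2);` | DPACK2 | |
| `long long _dpackx2 (unsigned src1, unsigned src2);` | DPACKX2 | |
| `_float2_t _fdmv_f2(float src1, float src2)` | DMV | |
| `unsigned _gmpy (unsigned src1, unsigned src2);` | GMPY | Galois field multiply. |
| `long long _mpy2ir (int src1, int src2);` | MPY2IR | Performs two 16-bit by 32-bit multiplies. The upper and lower 16 bits of src1 are treated as signed 16-bit values, and src2 as a signed 32-bit value. Each product is rounded to 32 bits by adding 2^14 and then shifted right by 15. The lower 32 bits of the two results are written to dst_o:dst_e. |
| `int _mpy32 (int src1, int src2);` | MPY32 | Signed 32-bit by signed 32-bit multiply. The lower 32 bits of the 64-bit result are written to dst. |
| `long long _mpy32ll (int src1, int src2);` | MPY32 | Signed 32-bit × signed 32-bit. The signed 64-bit result is written to dst. |
| `long long _mpy32su (int src1, int src2);` | MPY32SU | Signed 32-bit src1 × unsigned 32-bit src2 = signed 64-bit dst. |
| `long long _mpy32us (unsigned src1, int src2);` | MPY32US | Unsigned 32-bit src1 × signed 32-bit src2 = signed 64-bit dst. |
| `long long _mpy32u (unsigned src1, unsigned src2);` | MPY32U | Unsigned 32-bit src1 × unsigned 32-bit src2 = unsigned 64-bit dst. |
| `int _rpack2 (int src1, int src2);` | RPACK2 | |
| `long long _saddsub (unsigned src1, unsigned src2);` | SADDSUB | Performs two operations in parallel: (1) saturated (src1 + src2) -> dst_o; (2) saturated (src1 - src2) -> dst_e. |
| `long long _saddsub2 (unsigned src1, unsigned src2);` | SADDSUB2 | Performs SADD2 and SSUB2 in parallel. |
| `long long _shfl3 (unsigned src1, unsigned src2);` | SHFL3 | Shuffles the inputs into a long long result. |
| `int _smpy32 (int src1, int src2);` | SMPY32 | Signed 32-bit × signed 32-bit. The 64-bit product is shifted left by 1 and saturated, and the upper 32 bits of that result are written to dst. |
| `int _ssub2 (unsigned src1, unsigned src2);` | SSUB2 | Subtracts the two signed 16-bit values in src2 from the two signed 16-bit values in src1 and saturates the results. |
| `unsigned _xormpy (unsigned src1, unsigned src2);` | XORMPY | Galois field multiply. |
| `int _dpint (double src);` | DPINT | Converts a double to an int (with rounding). |
| `_int40_t _f2tol(_float2_t src);` | | Reinterprets a _float2_t as an _int40_t. |
| `long long _f2toll(_float2_t src);` | | Reinterprets a _float2_t as a long long. |
| `double _fabs (double src);` | ABSDP | Places the absolute value of src in dst. |
| `float _fabsf (float src);` | ABSSP | |
| `_float2_t _lltof2(long long src);` | | Reinterprets a long long as a _float2_t. |
| `_float2_t _ltof2(_int40_t src);` | | Reinterprets an _int40_t as a _float2_t. |
| `_float2_t & _mem8_f2(void * ptr);` | LDNDW / STNDW | Loads a 64-bit value from memory. |
| `const _float2_t & _mem8_f2_const(void * ptr);` | LDNDW / STNDW | |
| `long long _mpyidll (int src1, int src2);` | MPYID | src1 × src2 -> dst |
| `double _mpysp2dp (float src1, float src2);` | MPYSP2DP | src1 × src2 -> dst |
| `double _mpyspdp (float src1, double src2);` | MPYSPDP | src1 × src2 -> dst |
| `double _rcpdp (double src);` | RCPDP | Places the approximate reciprocal of a 64-bit double in dst. |
| `float _rcpsp (float src);` | RCPSP | Approximate reciprocal of a 32-bit float. |
| `double _rsqrdp (double src);` | RSQRDP | Approximate reciprocal square root of a 64-bit double. |
| `float _rsqrsp (float src);` | RSQRSP | Approximate reciprocal square root of a 32-bit float. |
| `int _spint (float);` | SPINT | Converts a float to an int. |
| | ADDDP | Adds two doubles. |
| | ADDSP | Adds two floats. |
| | AND | Bitwise AND. |
| | ANDN | Bitwise AND of one operand with the complement of the other. |
| | MPYSP | Multiplies two floats. |
| | OR | Bitwise OR. |
| | SUBDP | Subtracts two doubles. |
| | SUBSP | Subtracts two floats. |
| | XOR | Bitwise exclusive OR. |
| `_x128_t _ccmatmpy (long long src1, _x128_t src2);` | CCMATMPY | |
| `long long _ccmatmpyr1 (long long src1, _x128_t src2);` | CCMATMPYR1 | |
| `long long _ccmpy32r1 (long long src1, long long src2);` | CCMPY32R1 | |
| `_x128_t _cmatmpy (long long src1, _x128_t src2);` | CMATMPY | |
| `long long _cmatmpyr1 (long long src1, _x128_t src2);` | CMATMPYR1 | |
| `long long _cmpy32r1 (long long src1, long long src2);` | CMPY32R1 | |
| `_x128_t _cmpysp (_float2_t src1, _float2_t src2);` | CMPYSP | |
| `double _complex_conjugate_mpysp (double src1, double src2);` | CMPYSP, DSUBSP | |
| `double _complex_mpysp (double src1, double src2);` | CMPYSP, DADDSP | |
| `int _crot90 (int src);` | CROT90 | Rotates a complex number by 90 degrees. |
| `int _crot270 (int src);` | CROT270 | Rotates a complex number by 270 degrees. |
| `long long _dadd (long long src1, long long src2);` | DADD | Adds the two signed 32-bit values in src1 to the two signed 32-bit values in src2. |
| `long long _dadd2 (long long src1, long long src2);` | DADD2 | Four-way signed 16-bit addition. |
| `_float2_t _daddsp (_float2_t src1, _float2_t src2);` | DADDSP | Two-way float addition. |
| `long long _dadd_c (scst5 immediate src1, long long src2);` | DADD | |
| `long long _dapys2 (long long src1, long long src2);` | DAPYS2 | |
| `long long _davg2 (long long src1, long long src2);` | DAVG2 | Signed 16-bit. |
| `long long _davgnr2 (long long src1, long long src2);` | DAVGNR2 | Signed 16-bit, no rounding. |
| `long long _davgnru4 (long long src1, long long src2);` | DAVGNRU4 | Unsigned 8-bit, no rounding. |
| `long long _davgu4 (long long src1, long long src2);` | DAVGU4 | Unsigned 8-bit. |
| `long long _dccmpyr1 (long long src1, long long src2);` | DCCMPYR1 | |
| `unsigned _dcmpeq2 (long long src1, long long src2);` | DCMPEQ2 | 16-bit comparison: returns 1 if equal, 0 if not. |
| `unsigned _dcmpeq4 (long long src1, long long src2);` | DCMPEQ4 | 8-bit comparison: returns 1 if equal, 0 if not. |
| `unsigned _dcmpgt2 (long long src1, long long src2);` | DCMPGT2 | 16-bit comparison: returns 1 if src1 > src2, otherwise 0. |
| `unsigned _dcmpgtu4 (long long src1, long long src2);` | DCMPGTU4 | 8-bit comparison: returns 1 if src1 > src2, otherwise 0. |
| `_x128_t _dccmpy (long long src1, long long src2);` | DCCMPY | |
| `_x128_t _dcmpy (long long src1, long long src2);` | DCMPY | |
| `long long _dcmpyr1 (long long src1, long long src2);` | DCMPYR1 | |
| `long long _dcrot90 (long long src);` | DCROT90 | |
| `long long _dcrot270 (long long src);` | DCROT270 | |
| `long long _ddotp4h (_x128_t src1, _x128_t src2 );` | DDOTP4H | Performs two dotp4h operations, all values signed. |
| `long long _ddotpsu4h (_x128_t src1, _x128_t src2 );` | DDOTPSU4H | Performs two dotpsu4h operations, one operand signed and the other unsigned. |
| `_float2_t _dinthsp (int src);` | DINTHSP | Converts the signed 16-bit values in src to single-precision floats and places them in dst_e and dst_o. |
| `_float2_t _dinthspu (unsigned src);` | DINTHSPU | Converts the unsigned 16-bit values in src to single-precision floats and places them in dst_e and dst_o. |
| `_float2_t _dintsp(long long src);` | DINTSP | Converts the signed 32-bit values in src to single-precision floats and places them in dst_e and dst_o. |
| `_float2_t _dintspu(long long src);` | DINTSPU | Converts the unsigned 32-bit values in src to single-precision floats and places them in dst_e and dst_o. |
| `long long _dmax2 (long long src1, long long src2);` | DMAX2 | Compares the signed 16-bit values in src1 and src2 and places the larger of each pair in dst. |
| `long long _dmaxu4 (long long src1, long long src2);` | DMAXU4 | Compares the unsigned 8-bit values in src1 and src2 and places the larger of each pair in dst. |
| `long long _dmin2 (long long src1, long long src2);` | DMIN2 | Compares the signed 16-bit values in src1 and src2 and places the smaller of each pair in dst. |
| `long long _dminu4 (long long src1, long long src2);` | DMINU4 | Compares the unsigned 8-bit values in src1 and src2 and places the smaller of each pair in dst. |
| `_x128_t _dmpy2 (long long src1, long long src2);` | DMPY2 | Multiplies the signed 16-bit values in src1 and src2, giving signed 32-bit values placed in a 128-bit register. |
| `_float2_t _dmpysp (_float2_t src1, _float2_t src2);` | DMPYSP | |
| `_x128_t _dmpysu4 (long long src1, long long src2);` | DMPYSU4 | Multiplies the signed 8-bit values in src1 by the unsigned 8-bit values in src2, giving signed 16-bit results. |
| `_x128_t _dmpyu2 (long long src1, long long src2);` | DMPYU2 | Multiplies unsigned 16-bit values, giving 32-bit values placed in a 128-bit register. |
| `_x128_t _dmpyu4 (long long src1, long long src2);` | DMPYU4 | Multiplies unsigned 8-bit values, giving unsigned 16-bit results. |
| `long long _dmvd (long long src1, unsigned src2 );` | DMVD | Moves two registers into one register pair. The two moves are performed in sequence. Useful when handling many double words, because it relieves register pressure. |
| `int _dotp4h (long long src1, long long src2 );` | DOTP4H | Dot product of two sets of 16-bit values. |
| `long long _dotp4hll (long long src1, long long src2 );` | DOTP4H | As _dotp4h, with a different return type. |
| `int _dotpsu4h (long long src1, long long src2);` | DOTPSU4H | The values in src1 are treated as signed 16-bit and those in src2 as unsigned 16-bit. Gives a 32-bit result. |
| `long long _dotpsu4hll (long long src1, long long src2);` | DOTPSU4H | The values in src1 are treated as signed 16-bit and those in src2 as unsigned 16-bit. Gives a 64-bit result. |
| `long long _dpackh2 (long long src1, long long src2);` | DPACKH2 | |
| `long long _dpackh4 (long long src1, long long src2);` | DPACKH4 | Performs two PACKH4 operations in parallel. |
| `long long _dpacklh2 (long long src1, long long src2);` | DPACKLH2 | |
| `long long _dpacklh4 (unsigned src1, unsigned src2);` | DPACKLH4 | Performs PACKH4 and PACKL4 in parallel. |
| `long long _dpackl2 (long long src1, long long src2);` | DPACKL2 | |
| `long long _dpackl4 (long long src1, long long src2);` | DPACKL4 | Performs two PACKL4 operations in parallel. |
| `long long _dsadd (long long src1, long long src2);` | DSADD | Adds the two signed 32-bit values in src1 to the two signed 32-bit values in src2 and saturates the results. |
| `long long _dsadd2 (long long src1, long long src` | | |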
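The packed-arithmetic and reinterpretation entries above can be checked on a host machine with a small reference model. The sketch below is not TI code and does not use the C6000 compiler: the helper names `add2_ref`, `sub2_ref`, `ftoi_ref`, and `itof_ref` are hypothetical, and they only model the behavior the table describes for `_add2`, `_sub2`, `_ftoi`, and `_itof`, assuming a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of _add2: the two 16-bit halves are added
   independently, so a carry out of the lower half never reaches the upper. */
static uint32_t add2_ref(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0xFFFFu;
    uint32_t hi = (((a >> 16) + (b >> 16)) & 0xFFFFu) << 16;
    return hi | lo;
}

/* Hypothetical model of _sub2: a borrow in the lower half does not
   propagate into the upper half. */
static uint32_t sub2_ref(uint32_t a, uint32_t b)
{
    uint32_t lo = (a - b) & 0xFFFFu;
    uint32_t hi = (((a >> 16) - (b >> 16)) & 0xFFFFu) << 16;
    return hi | lo;
}

/* Hypothetical model of _ftoi: reinterpret the bits of a float. */
static uint32_t ftoi_ref(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* bit copy, no numeric conversion */
    return u;
}

/* Hypothetical model of _itof: reinterpret 32 bits as a float. */
static float itof_ref(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Small self-check mirroring the examples given in the table. */
static void packed_ref_check(void)
{
    assert(add2_ref(0x0001FFFFu, 0x00010001u) == 0x00020000u);
    assert(ftoi_ref(1.0f) == 1065353216u);
    assert(itof_ref(0x3f800000u) == 1.0f);
}
```

On the DSP itself the intrinsics would be called directly; a model like this is mainly useful for unit-testing algorithm code off-target.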

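The table notes that SUBC (conditional subtract and shift) is commonly used for division. A minimal host-side sketch of that idea follows. The names `subc_ref` and `udiv16_ref` are hypothetical, the step function is an assumed model of one SUBC operation using an unsigned comparison, and the routine covers only unsigned 16-bit operands with a nonzero divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of one SUBC step: if src1 >= src2, subtract, shift the
   difference left by one and set the LSB; otherwise shift src1 left by one. */
static uint32_t subc_ref(uint32_t src1, uint32_t src2)
{
    if (src1 >= src2)
        return ((src1 - src2) << 1) + 1u;
    return src1 << 1;
}

/* Unsigned 16-bit division built from 16 conditional-subtract steps.
   The divisor is pre-shifted left by 15; after 16 steps the quotient sits
   in the lower 16 bits and the remainder in the upper 16 bits.
   den must be nonzero. */
static void udiv16_ref(uint16_t num, uint16_t den,
                       uint16_t *quot, uint16_t *rem)
{
    uint32_t n = num;
    uint32_t d = (uint32_t)den << 15;
    for (int i = 0; i < 16; i++)
        n = subc_ref(n, d);
    *quot = (uint16_t)(n & 0xFFFFu);
    *rem  = (uint16_t)(n >> 16);
}
```

Each step produces one quotient bit, which is why a single-cycle conditional subtract makes a compact division loop on a processor without a hardware divider.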