Bicubic Interpolation and Its Optimization

Uploaded by: d*** | IP location: Tianjin | Upload date: 2022-09-19 | Format: DOCX | Pages: 97 | Size: ?82.04KB | Points: 118


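1. Mathematical model

For a destination pixel, the inverse transform maps its coordinates to a floating-point position (i+u, j+v) in the source image, where i and j are non-negative integers and u and v are floating-point numbers in the interval [0, 1). Bicubic interpolation considers the 16 neighbors surrounding the position (i+u, j+v). The destination pixel value f(i+u, j+v) is obtained from the following interpolation formula:

\[ f(i+u,\, j+v) = A \cdot B \cdot C \]

\[ A = \begin{bmatrix} S(u+1) & S(u+0) & S(u-1) & S(u-2) \end{bmatrix} \]

\[ B = \begin{bmatrix}
f(i-1, j-1) & f(i-1, j+0) & f(i-1, j+1) & f(i-1, j+2) \\
f(i+0, j-1) & f(i+0, j+0) & f(i+0, j+1) & f(i+0, j+2) \\
f(i+1, j-1) & f(i+1, j+0) & f(i+1, j+1) & f(i+1, j+2) \\
f(i+2, j-1) & f(i+2, j+0) & f(i+2, j+1) & f(i+2, j+2)
\end{bmatrix} \]

\[ C = \begin{bmatrix} S(v+1) \\ S(v+0) \\ S(v-1) \\ S(v-2) \end{bmatrix} \]

\[ S(x) = \begin{cases}
1 - 2|x|^2 + |x|^3, & 0 \le |x| < 1 \\
4 - 8|x| + 5|x|^2 - |x|^3, & 1 \le |x| \le 2
\end{cases} \]

S(x) is the interpolation kernel. It approximates sin(πx)/(πx), where π is the circle constant.

2. Computation flow

  1. Fetch the 16 neighboring points P1, P2, ..., P16 (row by row, so P1..P4 is row i-1 and P13..P16 is row i+2).
  2. Use the kernel formula S(x) to compute the kernel vectors Su and Sv for the two directions.
  3. Carry out the matrix product to obtain the interpolated value:

       iTemp1  = Su0 * P1 + Su1 * P5 + Su2 * P9  + Su3 * P13
       iTemp2  = Su0 * P2 + Su1 * P6 + Su2 * P10 + Su3 * P14
       iTemp3  = Su0 * P3 + Su1 * P7 + Su2 * P11 + Su3 * P15
       iTemp4  = Su0 * P4 + Su1 * P8 + Su2 * P12 + Su3 * P16
       iResult = Sv0 * iTemp1 + Sv1 * iTemp2 + Sv2 * iTemp3 + Sv3 * iTemp4

  4. The interpolated image showed "burr" artifacts, so we added a post-processing step. Let pSrc be the value of the corresponding pixel in the source image. If abs(iResult - pSrc) exceeds a threshold, we treat the interpolated value as likely to corrupt the image and use the source value pSrc instead.

3. Algorithm optimization

Computing one bicubic sample needs the 16 surrounding points and costs 20 multiplications and 15 additions (16 + 4 multiplications and 12 + 3 additions in the formulas above). This is expensive, so optimization is necessary. We chose Intel's SSE2 instructions, which are available only on Pentium 4 and later CPUs. Whether the current CPU supports SSE2 can be checked with the CPUID instruction:

    BOOL g_bSSE2 = FALSE;
    _asm
    {
        mov  eax, 1              ; CPUID function 1: feature flags
        cpuid
        test edx, 0x04000000     ; EDX bit 26 is the SSE2 flag
        jz   NotSupport
        mov  g_bSSE2, 1
    NotSupport:
    }

CPUs that support SSE2 provide eight 128-bit registers, so a single register can hold 4 pixels (RGB), which lends itself to parallel computation. The full code is in the function Optimize_Bicubic in Transform.cpp.

Problems encountered during optimization:

  1. Each pixel consists of R, G and B channels, and an SSE2 register holds 16 bytes. After loading 4 pixels, 4 bytes are wasted, and extra time is spent rearranging the data from BRGB | RGBR | GBRG | BRGB into 0RGB | 0RGB | 0RGB | 0RGB.
  2. When loading 16 bytes into a register, the image address is not guaranteed to be 16-byte aligned, so the slower MOVDQU instruction (6 or more clock cycles) must be used. If the address can be made 16-byte aligned, MOVDQA (1 clock cycle) can be used instead.
  3. To eliminate division and floating-point arithmetic, the weights are scaled by 256. Each kernel coefficient then needs 2 bytes, while the image data is 1 byte per channel. The pixel data must be widened to match before multiplying, which wastes half of each SSE2 register and lengthens the computation. Reducing the kernel precision so that a coefficient fits in 1 byte would lower the accuracy of the result considerably.
  4. We lack experience with the cycle counts of individual instructions and with whether consecutive instructions can be pipelined in parallel.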
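The mathematical model, the computation flow and the threshold post-processing described above can be sketched in plain C as a scalar reference, which is useful for checking an SSE2 version against. This is a minimal sketch under assumptions that are not in the original: the names `bicubic_kernel`, `clamp_int` and `bicubic_sample` are hypothetical, the image is single-channel 8-bit in row-major order, i is taken as the row index and j as the column index, neighbors outside the image are clamped to the border, and pSrc is taken to be the source pixel at (i, j).

```c
#include <assert.h>
#include <stdlib.h>

/* Interpolation kernel S(x) from the mathematical model. */
static double bicubic_kernel(double x)
{
    double a = x < 0.0 ? -x : x;
    if (a < 1.0)  return 1.0 - 2.0 * a * a + a * a * a;
    if (a <= 2.0) return 4.0 - 8.0 * a + 5.0 * a * a - a * a * a;
    return 0.0; /* outside the kernel support (assumption) */
}

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Sample the image at row y = i+u, column x = j+v (both non-negative).
 * If the result differs from the source pixel by more than `threshold`,
 * the source pixel is returned instead (the post-processing step). */
static unsigned char bicubic_sample(const unsigned char *img, int rows, int cols,
                                    double y, double x, int threshold)
{
    assert(img && rows > 0 && cols > 0 && y >= 0.0 && x >= 0.0);

    int i = (int)y, j = (int)x;      /* integer parts */
    double u = y - i, v = x - j;     /* fractional parts in [0, 1) */
    double Su[4], Sv[4];

    for (int k = 0; k < 4; ++k) {
        Su[k] = bicubic_kernel(u + 1.0 - k);  /* S(u+1), S(u), S(u-1), S(u-2) */
        Sv[k] = bicubic_kernel(v + 1.0 - k);  /* S(v+1), S(v), S(v-1), S(v-2) */
    }

    double result = 0.0;
    for (int c = 0; c < 4; ++c) {
        double temp = 0.0;           /* iTemp(c+1): one column weighted by Su */
        for (int r = 0; r < 4; ++r) {
            int rr = clamp_int(i - 1 + r, 0, rows - 1);
            int cc = clamp_int(j - 1 + c, 0, cols - 1);
            temp += Su[r] * img[rr * cols + cc];
        }
        result += Sv[c] * temp;      /* iResult accumulates Sv * iTemp */
    }

    /* Clamp to the 8-bit range, then round to nearest. */
    int res = result <= 0.0 ? 0 : (result >= 255.0 ? 255 : (int)(result + 0.5));

    int src = img[clamp_int(i, 0, rows - 1) * cols + clamp_int(j, 0, cols - 1)];
    if (abs(res - src) > threshold)
        res = src;                   /* fall back to the source pixel */
    return (unsigned char)res;
}
```

For example, on an image whose pixel value is 10 times the column index, sampling at row 2.0, column 2.5 gives 25 with a threshold of 10, and falls back to the source value 20 with a threshold of 3.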

Appendix: SSE2 instruction summary

Arithmetic instructions:

ADDPD - Packed Double-Precision Floating-Point Add (SSE2)
  Adds 2 doubles element-wise.
  ADDPD xmm0, xmm1/m128

ADDPS - Packed Single-Precision Floating-Point Add (SSE)
  Adds 4 floats element-wise.
  ADDPS xmm0, xmm1/m128

ADDSD - Scalar Double-Precision Floating-Point Add (SSE2)
  Adds 1 double (low element).
  ADDSD xmm0, xmm1/m64

ADDSS - Scalar Single-Precision Floating-Point Add (SSE)
  Adds 1 float (low element).
  ADDSS xmm0, xmm1/m32

PADDB/PADDW/PADDD - Packed Add
  Opcode | Instruction | Description
  0F FC /r | PADDB mm, mm/m64 | Add packed byte integers from mm/m64 and mm.
  66 0F FC /r | PADDB xmm1, xmm2/m128 | Add packed byte integers from xmm2/m128 and xmm1.
  0F FD /r | PADDW mm, mm/m64 | Add packed word integers from mm/m64 and mm.
  66 0F FD /r | PADDW xmm1, xmm2/m128 | Add packed word integers from xmm2/m128 and xmm1.
  0F FE /r | PADDD mm, mm/m64 | Add packed doubleword integers from mm/m64 and mm.
  66 0F FE /r | PADDD xmm1, xmm2/m128 | Add packed doubleword integers from xmm2/m128 and xmm1.

PADDQ - Packed Quadword Add
  Opcode | Instruction | Description
  0F D4 /r | PADDQ mm1, mm2/m64 | Add quadword integer mm2/m64 to mm1.
  66 0F D4 /r | PADDQ xmm1, xmm2/m128 | Add packed quadword integers xmm2/m128 to xmm1.

PADDSB/PADDSW - Packed Add with Saturation
  Opcode | Instruction | Description
  0F EC /r | PADDSB mm, mm/m64 | Add packed signed byte integers from mm/m64 and mm and saturate the results.
  66 0F EC /r | PADDSB xmm1, xmm2/m128 | Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results.
  0F ED /r | PADDSW mm, mm/m64 | Add packed signed word integers from mm/m64 and mm and saturate the results.
  66 0F ED /r | PADDSW xmm1, xmm2/m128 | Add packed signed word integers from xmm2/m128 and xmm1 and saturate the results.

PADDUSB/PADDUSW - Packed Add Unsigned with Saturation
  Opcode | Instruction | Description
  0F DC /r | PADDUSB mm, mm/m64 | Add packed unsigned byte integers from mm/m64 and mm and saturate the results.
  66 0F DC /r | PADDUSB xmm1, xmm2/m128 | Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results.
  0F DD /r | PADDUSW mm, mm/m64 | Add packed unsigned word integers from mm/m64 and mm and saturate the results.
  66 0F DD /r | PADDUSW xmm1, xmm2/m128 | Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results.

PMADDWD - Packed Multiply and Add
  Opcode | Instruction | Description
  0F F5 /r | PMADDWD mm, mm/m64 | Multiply the packed words in mm by the packed words in mm/m64. Add the 32-bit pairs of results and store in mm as doubleword.
  66 0F F5 /r | PMADDWD xmm1, xmm2/m128 | Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, and add the adjacent doubleword results.

PSADBW - Packed Sum of Absolute Differences
  Opcode | Instruction | Description
  0F F6 /r | PSADBW mm1, mm2/m64 | Absolute difference of packed unsigned byte integers from mm2/m64 and mm1; differences are then summed to produce an unsigned word integer result.
  66 0F F6 /r | PSADBW xmm1, xmm2/m128 | Absolute difference of packed unsigned byte integers from xmm2/m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two word integer results.

PSUBB/PSUBW/PSUBD - Packed Subtract
  Opcode | Instruction | Description
  0F F8 /r | PSUBB mm, mm/m64 | Subtract packed byte integers in mm/m64 from packed byte integers in mm.
  66 0F F8 /r | PSUBB xmm1, xmm2/m128 | Subtract packed byte integers in xmm2/m128 from packed byte integers in xmm1.
  0F F9 /r | PSUBW mm, mm/m64 | Subtract packed word integers in mm/m64 from packed word integers in mm.
  66 0F F9 /r | PSUBW xmm1, xmm2/m128 | Subtract packed word integers in xmm2/m128 from packed word integers in xmm1.
  0F FA /r | PSUBD mm, mm/m64 | Subtract packed doubleword integers in mm/m64 from packed doubleword integers in mm.
  66 0F FA /r | PSUBD xmm1, xmm2/m128 | Subtract packed doubleword integers in xmm2/m128 from packed doubleword integers in xmm1.

PSUBQ - Packed Subtract Quadword
  Opcode | Instruction | Description
  0F FB /r | PSUBQ mm1, mm2/m64 | Subtract quadword integer in mm2/m64 from mm1.
  66 0F FB /r | PSUBQ xmm1, xmm2/m128 | Subtract packed quadword integers in xmm2/m128 from xmm1.

PSUBSB/PSUBSW - Packed Subtract with Saturation
  Opcode | Instruction | Description
  0F E8 /r | PSUBSB mm, mm/m64 | Subtract signed packed bytes in mm/m64 from signed packed bytes in mm and saturate results.
  66 0F E8 /r | PSUBSB xmm1, xmm2/m128 | Subtract packed signed byte integers in xmm2/m128 from packed signed byte integers in xmm1 and saturate results.
  0F E9 /r | PSUBSW mm, mm/m64 | Subtract signed packed words in mm/m64 from signed packed words in mm and saturate results.
  66 0F E9 /r | PSUBSW xmm1, xmm2/m128 | Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results.

PSUBUSB/PSUBUSW - Packed Subtract Unsigned with Saturation
  Opcode | Instruction | Description
  0F D8 /r | PSUBUSB mm, mm/m64 | Subtract unsigned packed bytes in mm/m64 from unsigned packed bytes in mm and saturate result.
  66 0F D8 /r | PSUBUSB xmm1, xmm2/m128 | Subtract packed unsigned byte integers in xmm2/m128 from packed unsigned byte integers in xmm1 and saturate result.
  0F D9 /r | PSUBUSW mm, mm/m64 | Subtract unsigned packed words in mm/m64 from unsigned packed words in mm and saturate result.
  66 0F D9 /r | PSUBUSW xmm1, xmm2/m128 | Subtract packed unsigned word integers in xmm2/m128 from packed unsigned word integers in xmm1 and saturate result.

SUBPD - Packed Double-Precision Floating-Point Subtract
  Opcode | Instruction | Description
  66 0F 5C /r | SUBPD xmm1, xmm2/m128 | Subtract packed double-precision floating-point values in xmm2/m128 from xmm1.

SUBPS - Packed Single-Precision Floating-Point Subtract
  Opcode | Instruction | Description
  0F 5C /r | SUBPS xmm1, xmm2/m128 | Subtract packed single-precision floating-point values in xmm2/m128 from xmm1.

SUBSD - Scalar Double-Precision Floating-Point Subtract
  Opcode | Instruction | Description
  F2 0F 5C /r | SUBSD xmm1, xmm2/m64 | Subtract the low double-precision floating-point value in xmm2/m64 from xmm1.

SUBSS - Scalar Single-FP Subtract
  Opcode | Instruction | Description
  F3 0F 5C /r | SUBSS xmm1, xmm2/m32 | Subtract the low single-precision floating-point value in xmm2/m32 from xmm1.

PMULHUW - Packed Multiply High Unsigned
  Opcode | Instruction | Description
  0F E4 /r | PMULHUW mm1, mm2/m64 | Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1.
  66 0F E4 /r | PMULHUW xmm1, xmm2/m128 | Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.

PMULHW - Packed Multiply High Signed
  Opcode | Instruction | Description
  0F E5 /r | PMULHW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1.
  66 0F E5 /r | PMULHW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.

PMULLW - Packed Multiply Low Signed
  Opcode | Instruction | Description
  0F D5 /r | PMULLW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the low 16 bits of the results in mm1.
  66 0F D5 /r | PMULLW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1.

PMULUDQ - Multiply Doubleword Unsigned
  Opcode | Instruction | Description
  0F F4 /r | PMULUDQ mm1, mm2/m64 | Multiply unsigned doubleword integer in mm1 by unsigned doubleword integer in mm2/m64, and store the quadword result in mm1.
  66 0F F4 /r | PMULUDQ xmm1, xmm2/m128 | Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1.
  PMULUDQ instruction with 64-bit operands:
    DEST[63-0] ← DEST[31-0] * SRC[31-0];
  PMULUDQ instruction with 128-bit operands:
    DEST[63-0] ← DEST[31-0] * SRC[31-0];
    DEST[127-64] ← DEST[95-64] * SRC[95-64];

MULPD - Packed Double-Precision Floating-Point Multiply
  Opcode | Instruction | Description
  66 0F 59 /r | MULPD xmm1, xmm2/m128 | Multiply packed double-precision floating-point values in xmm2/m128 by xmm1.
    DEST[63-0] ← DEST[63-0] * SRC[63-0];
    DEST[127-64] ← DEST[127-64] * SRC[127-64];

MULPS - Packed Single-Precision Floating-Point Multiply
  Opcode | Instruction | Description
  0F 59 /r | MULPS xmm1, xmm2/m128 | Multiply packed single-precision floating-point values in xmm2/m128 by xmm1.
    DEST[31-0] ← DEST[31-0] * SRC[31-0];
    DEST[63-32] ← DEST[63-32] * SRC[63-32];
    DEST[95-64] ← DEST[95-64] * SRC[95-64];
    DEST[127-96] ← DEST[127-96] * SRC[127-96];

MULSD - Scalar Double-Precision Floating-Point Multiply
  Opcode | Instruction | Description
  F2 0F 59 /r | MULSD xmm1, xmm2/m64 | Multiply the low double-precision floating-point value in xmm2/m64 by the low double-precision floating-point value in xmm1.
    DEST[63-0] ← DEST[63-0] * xmm2/m64[63-0];
    * DEST[127-64] remains unchanged *;

MULSS - Scalar Single-FP Multiply
  Opcode | Instruction | Description
  F3 0F 59 /r | MULSS xmm1, xmm2/m32 | Multiply the low single-precision floating-point value in xmm2/m32 by the low single-precision floating-point value in xmm1.
    DEST[31-0] ← DEST[31-0] * SRC[31-0];
    * DEST[127-32] remains unchanged *;

DIVPD - Packed Double-Precision Floating-Point Divide
  DIVPD xmm0, xmm1/m128
    DEST[63-0] ← DEST[63-0] / SRC[63-0];
    DEST[127-64] ← DEST[127-64] / SRC[127-64];

DIVPS - Packed Single-Precision Floating-Point Divide
  DIVPS xmm0, xmm1/m128
    DEST[31-0] ← DEST[31-0] / SRC[31-0];
    DEST[63-32] ← DEST[63-32] / SRC[63-32];
    DEST[95-64] ← DEST[95-64] / SRC[95-64];
    DEST[127-96] ← DEST[127-96] / SRC[127-96];

DIVSD - Scalar Double-Precision Floating-Point Divide
  DIVSD xmm0, xmm1/m64
    DEST[63-0] ← DEST[63-0] / SRC[63-0];
    * DEST[127-64] remains unchanged *;

DIVSS - Scalar Single-Precision Floating-Point Divide
  DIVSS xmm0, xmm1/m32
    DEST[31-0] ← DEST[31-0] / SRC[31-0];
    * DEST[127-32] remains unchanged *;

PAVGB/PAVGW - Packed Average
  Opcode | Instruction | Description
  0F E0 /r | PAVGB mm1, mm2/m64 | Average packed unsigned byte integers from mm2/m64 and mm1, with rounding.
  66 0F E0 /r | PAVGB xmm1, xmm2/m128 | Average packed unsigned byte integers from xmm2/m128 and xmm1, with rounding.
  0F E3 /r | PAVGW mm1, mm2/m64 | Average packed unsigned word integers from mm2/m64 and mm1, with rounding.
  66 0F E3 /r | PAVGW xmm1, xmm2/m128 | Average packed unsigned word integers from xmm2/m128 and xmm1, with rounding.

PMAXSW - Packed Signed Integer Word Maximum
  Opcode | Instruction | Description
  0F EE /r | PMAXSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for maximum values.
  66 0F EE /r | PMAXSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for maximum values.

PMAXUB - Packed Unsigned Integer Byte Maximum
  Opcode | Instruction | Description
  0F DE /r | PMAXUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for maximum values.
  66 0F DE /r | PMAXUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for maximum values.

PMINSW - Packed Signed Integer Word Minimum
  Opcode | Instruction | Description
  0F EA /r | PMINSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for minimum values.
  66 0F EA /r | PMINSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for minimum values.

PMINUB - Packed Unsigned Integer Byte Minimum
  Opcode | Instruction | Description
  0F DA /r | PMINUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for minimum values.
  66 0F DA /r | PMINUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for minimum values.

RCPPS - Packed Single-Precision Floating-Point Reciprocal
  Opcode | Instruction | Description
  0F 53 /r | RCPPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the packed single-precision floating-point values in xmm2/m128.
    DEST[31-0] ← APPROXIMATE(1.0/(SRC[31-0]));
    DEST[63-32] ← APPROXIMATE(1.0/(SRC[63-32]));
    DEST[95-64] ← APPROXIMATE(1.0/(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/(SRC[127-96]));

RCPSS - Scalar Single-Precision Floating-Point Reciprocal
  Opcode | Instruction | Description
  F3 0F 53 /r | RCPSS xmm1, xmm2/m32 | Returns to xmm1 the approximation of the reciprocal of the low single-precision floating-point value in xmm2/m32.
    DEST[31-0] ← APPROX(1.0/(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

RSQRTPS - Packed Single-Precision Floating-Point Square Root Reciprocal
  Opcode | Instruction | Description
  0F 52 /r | RSQRTPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128.
    DEST[31-0] ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    DEST[63-32] ← APPROXIMATE(1.0/SQRT(SRC[63-32]));
    DEST[95-64] ← APPROXIMATE(1.0/SQRT(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/SQRT(SRC[127-96]));

RSQRTSS - Scalar Single-Precision Floating-Point Square Root Reciprocal
  Opcode | Instruction | Description
  F3 0F 52 /r | RSQRTSS xmm1, xmm2/m32 | Returns to xmm1 an approximation of the reciprocal of the square root of the low single-precision floating-point value in xmm2/m32.
    DEST[31-0] ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

SQRTPD - Packed Double-Precision Floating-Point Square Root
  Opcode | Instruction | Description
  66 0F 51 /r | SQRTPD xmm1, xmm2/m128 | Computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the results in xmm1.

SQRTPS - Packed Single-Precision Floating-Point Square Root
  Opcode | Instruction | Description
  0F 51 /r | SQRTPS xmm1, xmm2/m128 | Computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1.

SQRTSD - Scalar Double-Precision Floating-Point Square Root
  Opcode | Instruction | Description
  F2 0F 51 /r | SQRTSD xmm1, xmm2/m64 | Computes square root of the low double-precision floating-point value in xmm2/m64 and stores the results in xmm1.

SQRTSS - Scalar Single-Precision Floating-Point Square Root
  Opcode | Instruction | Description
  F3 0F 51 /r | SQRTSS xmm1, xmm2/m32 | Computes square root of the low single-precision floating-point value in xmm2/m32 and stores the results in xmm1.

Move instructions:

MASKMOVDQU - Mask Move of Double Quadword Unaligned
  MASKMOVDQU xmm0, xmm1

MASKMOVQ - Mask Move of Quadword
  MASKMOVQ mm0, mm1

MOVAPD - Move Aligned Packed Double-Precision Floating-Point Values
  MOVAPD xmm0, xmm1/m128
  MOVAPD xmm1/m128, xmm0

MOVAPS - Move Aligned Packed Single-Precision Floating-Point Values
  MOVAPS xmm0, xmm1/m128

MOVD - Move Doubleword
  Instruction | Description
  MOVD mm, r/m32 | Move doubleword from r/m32 to mm.
  MOVD r/m32, mm | Move doubleword from mm to r/m32.
  MOVD xmm, r/m32 | Move doubleword from r/m32 to xmm.
  MOVD r/m32, xmm | Move doubleword from xmm register to r/m32.

MOVDQ2Q - Move Quadword
  Instruction | Description
  MOVDQ2Q mm, xmm | Move low quadword from xmm to mmx register.

MOVQ2DQ - Move Quadword
  Opcode | Instruction | Description
  F3 0F D6 | MOVQ2DQ xmm, mm | Move quadword from mmx to low quadword of xmm.
    DEST[63-0] ← SRC[63-0];
    DEST[127-64] ← 00000000000000000H;

MOVDQA - Move Aligned Double Quadword
  Instruction | Description
  MOVDQA xmm1, xmm2/m128 | Move aligned double quadword from xmm2/m128 to xmm1.
  MOVDQA xmm2/m128, xmm1 | Move aligned double quadword from xmm1 to xmm2/m128.

MOVDQU - Move Unaligned Double Quadword
  Instruction | Description
  MOVDQU xmm1, xmm2/m128 | Move unaligned double quadword from xmm2/m128 to xmm1.
  MOVDQU xmm2/m128, xmm1 | Move unaligned double quadword from xmm1 to xmm2/m128.

MOVHLPS - Move Packed Single-Precision Floating-Point Values High to Low
  Instruction | Description
  MOVHLPS xmm1, xmm2 | Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1.
    DEST[63-0] ← SRC[127-64];
    * DEST[127-64] unchanged *;

MOVLHPS - Move Packed Single-Precision Floating-Point Values Low to High
  Instruction | Description
  MOVLHPS xmm1, xmm2 | Move two packed single-precision floating-point values from low quadword of xmm2 to high quadword of xmm1.

MOVHPD - Move High Packed Double-Precision Floating-Point Value
  Instruction | Description
  MOVHPD xmm, m64 | Move double-precision floating-point value from m64 to high quadword of xmm.
  MOVHPD m64, xmm | Move double-precision floating-point value from high quadword of xmm to m64.
  MOVHPD instruction for memory to XMM move:
    DEST[127-64] ← SRC;
    * DEST[63-0] unchanged *;
  MOVHPD instruction for XMM to memory move:
    DEST ← SRC[127-64];

MOVHPS - Move High Packed Single-Precision Floating-Point Values
  Instruction | Description
  MOVHPS xmm, m64 | Move two packed single-precision floating-point values from m64 to high quadword of xmm.
  MOVHPS m64, xmm | Move two packed single-precision floating-point values from high quadword of xmm to m64.

MOVLPD - Move Low Packed Double-Precision Floating-Point Value
  Instruction | Description
  MOVLPD xmm, m64 | Move double-precision floating-point value from m64 to low quadword of xmm register.
  MOVLPD m64, xmm | Move double-precision floating-point value from low quadword of xmm register to m64.

MOVLPS - Move Low Packed Single-Precision Floating-Point Values
  Opcode | Instruction | Description
  0F 12 /r | MOVLPS xmm, m64 | Move two packed single-precision floating-point values from m64 to low quadword of xmm.
  0F 13 /r | MOVLPS m64, xmm | Move two packed single-precision floating-point values from low quadword of xmm to m64.

MOVMSKPD - Extract Packed Double-Precision Floating-Point Sign Mask
  MOVMSKPD r32, xmm
    DEST[0] ← SRC[63];
    DEST[1] ← SRC[127];
    DEST[3-2] ← 00B;
    DEST[31-4] ← 0000000H;

MOVMSKPS - Extract Packed Single-Precision Floating-Point Sign Mask
  MOVMSKPS r32, xmm
    DEST[0] ← SRC[31];
    DEST[1] ← SRC[63];
    DEST[2] ← SRC[95];
    DEST[3] ← SRC[127];
    DEST[31-4] ← 000000H;

MOVNTDQ - Move Double Quadword Non-Temporal
  Opcode | Instruction | Description
  66 0F E7 /r | MOVNTDQ m12… | Move double quadword from xmm to m128, …
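The PMADDWD instruction listed in this appendix fits the fixed-point scheme from the optimization section, where kernel weights are scaled by 256 and stored as 16-bit values. The sketch below shows one way to compute a four-tap weighted sum that way using the SSE2 intrinsic `_mm_madd_epi16`, with a scalar fallback when SSE2 is not available at compile time. It is an illustration under assumptions, not the code of Optimize_Bicubic: the names `weighted_sum4_q8` and `HAVE_SSE2` are hypothetical, and only one channel of four pixels is processed.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Weighted sum of four 8-bit samples p[0..3] with four signed weights
 * w[0..3] that have been scaled by 256 (so the weights sum to 256).
 * Returns the sum scaled back by 256, rounded and clamped to 0..255. */
static uint8_t weighted_sum4_q8(const uint8_t p[4], const int16_t w[4])
{
    int32_t acc;

#if HAVE_SSE2
    /* Widen the bytes to 16-bit lanes: this is the "half of the register
     * is wasted" cost mentioned in the optimization notes. */
    __m128i px = _mm_setr_epi16(p[0], p[1], p[2], p[3], 0, 0, 0, 0);
    __m128i wt = _mm_setr_epi16(w[0], w[1], w[2], w[3], 0, 0, 0, 0);

    /* PMADDWD: 32-bit lanes become [p0*w0 + p1*w1, p2*w2 + p3*w3, 0, 0]. */
    __m128i prod = _mm_madd_epi16(px, wt);

    /* Shift lane 1 down to lane 0 and add to finish the horizontal sum. */
    __m128i hi = _mm_srli_si128(prod, 4);
    acc = _mm_cvtsi128_si32(_mm_add_epi32(prod, hi));
#else
    /* Scalar fallback with the same arithmetic. */
    acc = p[0] * w[0] + p[1] * w[1] + p[2] * w[2] + p[3] * w[3];
#endif

    if (acc < 0)
        return 0;                    /* negative overshoot clamps to 0 */
    acc = (acc + 128) >> 8;          /* undo the x256 scale with rounding */
    return (uint8_t)(acc > 255 ? 255 : acc);
}
```

With the kernel weights for a fractional offset of 0.5, which are -0.125, 0.625, 0.625 and -0.125, the scaled weights are -32, 160, 160 and -32. Applied to the samples 10, 20, 30, 40 the function returns 25, matching the floating-point result.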
