Bicubic Interpolation and Optimization
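1. Mathematical model

For a destination pixel, the inverse transform maps its coordinates to a floating-point position (i+u, j+v) in the source image, where i and j are non-negative integers and u and v are floating-point numbers in the interval [0, 1). Bicubic interpolation uses the 16 source pixels surrounding (i+u, j+v). The destination pixel value f(i+u, j+v) is given by:

$$f(i+u,\ j+v) = A \cdot B \cdot C$$

$$A = \begin{pmatrix} S(u+1) & S(u) & S(u-1) & S(u-2) \end{pmatrix}$$

$$B = \begin{pmatrix}
f(i-1,\ j-1) & f(i-1,\ j) & f(i-1,\ j+1) & f(i-1,\ j+2) \\
f(i,\ j-1) & f(i,\ j) & f(i,\ j+1) & f(i,\ j+2) \\
f(i+1,\ j-1) & f(i+1,\ j) & f(i+1,\ j+1) & f(i+1,\ j+2) \\
f(i+2,\ j-1) & f(i+2,\ j) & f(i+2,\ j+1) & f(i+2,\ j+2)
\end{pmatrix}$$

$$C = \begin{pmatrix} S(v+1) \\ S(v) \\ S(v-1) \\ S(v-2) \end{pmatrix}$$

$$S(x) = \begin{cases}
1 - 2|x|^2 + |x|^3, & 0 \le |x| < 1 \\
4 - 8|x| + 5|x|^2 - |x|^3, & 1 \le |x| < 2 \\
0, & \text{otherwise}
\end{cases}$$

S(x) is the interpolation kernel. It approximates sin(πx)/(πx), where π is the circle constant.

2. Calculation flow

1. Obtain the 16 neighbouring points P1, P2, ..., P16.
2. Use the kernel formula S(x) to compute the kernel vectors Su and Sv for the x and y directions.
3. Perform the matrix operation to obtain the interpolated result:

    iTemp1  = Su0 * P1 + Su1 * P5 + Su2 * P9  + Su3 * P13
    iTemp2  = Su0 * P2 + Su1 * P6 + Su2 * P10 + Su3 * P14
    iTemp3  = Su0 * P3 + Su1 * P7 + Su2 * P11 + Su3 * P15
    iTemp4  = Su0 * P4 + Su1 * P8 + Su2 * P12 + Su3 * P16
    iResult = Sv0 * iTemp1 + Sv1 * iTemp2 + Sv2 * iTemp3 + Sv3 * iTemp4

4. The interpolated image showed "burrs" (isolated spikes), so a post-processing step was added. Let pSrc be the value of the corresponding pixel in the source image. If abs(iResult - pSrc) exceeds a threshold, the interpolated point is considered likely to corrupt the image, and the source value pSrc is used instead.

3. Algorithm optimization

Bicubic interpolation needs the 16 surrounding points to compute a single pixel, with as many as 20 multiplications and 15 additions. The computational cost is high, so optimization is necessary. We chose Intel's SSE2 technology, which is only supported on Pentium 4 and later processors. Whether the current CPU supports SSE2 can be determined with the CPUID instruction:

    BOOL g_bSSE2 = FALSE;
    _asm
    {
        mov   eax, 1              ; CPUID function 1: feature flags
        cpuid
        test  edx, 0x04000000     ; EDX bit 26 is the SSE2 flag
        jz    NotSupport
        mov   g_bSSE2, 1
    NotSupport:
    }

CPUs that support SSE2 provide eight 128-bit registers, so one register can hold 4 pixels (RGB), which favours parallel computation. For the detailed code, see the function Optimize_Bicubic in Transform.cpp.

Problems encountered during optimization:

1. Each pixel consists of R, G and B channels. Because one SSE2 register holds 16 bytes, reading 4 pixels wastes 4 bytes, and extra time is spent realigning the data from BRGB | RGBR | GBRG | BRGB to 0RGB | 0RGB | 0RGB | 0RGB.
2. When 16 bytes are loaded into a register, the image address is not guaranteed to be 16-byte aligned, so the slower MOVDQU instruction (6 or more clock cycles) must be used. If the address can be made 16-byte aligned, MOVDQA (1 clock cycle) can be used instead.
3. To eliminate division and floating-point arithmetic, the weights are scaled by 256. Each kernel coefficient then needs 2 bytes, while the image data uses 1 byte per channel. Aligning the operands for multiplication therefore wastes half of each SSE2 register and lengthens the computation. Reducing the kernel precision so that a coefficient fits in 1 byte greatly lowers the accuracy of the result.
4. We lacked experience with the cycle counts of individual instructions and with whether a given sequence of instructions can be pipelined in parallel.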
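
As a point of comparison for the optimized version, the mathematical model of section 1 and the calculation flow of section 2 can be sketched in plain scalar C. This is a minimal sketch, not the code from Transform.cpp: the function names, the single-channel row-major image layout, and the edge handling (indices clamped to the image border) are all assumptions made for illustration.

```c
#include <stddef.h>

/* Interpolation kernel S(x) from section 1. */
static double bicubic_kernel(double x)
{
    double a = x < 0.0 ? -x : x;               /* |x| */
    if (a < 1.0) return 1.0 - 2.0 * a * a + a * a * a;
    if (a < 2.0) return 4.0 - 8.0 * a + 5.0 * a * a - a * a * a;
    return 0.0;
}

/* Clamp an index to [0, n-1]; edge replication is an assumption of this sketch. */
static int clamp_index(int k, int n)
{
    return k < 0 ? 0 : (k >= n ? n - 1 : k);
}

/* Evaluate f(i+u, j+v) = A * B * C on a single-channel, row-major image. */
static double bicubic_sample(const unsigned char *img, int rows, int cols,
                             int i, int j, double u, double v)
{
    double su[4], sv[4];
    double result = 0.0;
    int m, n;

    for (m = 0; m < 4; ++m) {
        su[m] = bicubic_kernel(u + 1.0 - m);   /* S(u+1), S(u), S(u-1), S(u-2) */
        sv[m] = bicubic_kernel(v + 1.0 - m);   /* S(v+1), S(v), S(v-1), S(v-2) */
    }
    for (n = 0; n < 4; ++n) {                  /* columns j-1 .. j+2 */
        double temp = 0.0;                     /* plays the role of iTemp1..iTemp4 */
        int c = clamp_index(j - 1 + n, cols);
        for (m = 0; m < 4; ++m) {              /* rows i-1 .. i+2 */
            int r = clamp_index(i - 1 + m, rows);
            temp += su[m] * img[(size_t)r * cols + c];
        }
        result += sv[n] * temp;
    }
    return result;
}

/* Step 4 post-processing: fall back to the source pixel on a large deviation,
   otherwise round and clamp the interpolated value to 0..255. */
static unsigned char suppress_burr(double result, unsigned char src, double threshold)
{
    double diff = result - src;
    if (diff < 0.0) diff = -diff;
    if (diff > threshold) return src;
    if (result < 0.0) return 0;
    if (result > 255.0) return 255;
    return (unsigned char)(result + 0.5);
}
```

For u = v = 0 the weights reduce to (0, 1, 0, 0), so `bicubic_sample` returns the source pixel f(i, j) unchanged. On a constant image every result equals that constant, because the four kernel weights sum to 1.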
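
The fixed-point scheme described in problem 3 (weights scaled by 256, stored in 2 bytes each) can be illustrated with a scalar sketch. The names `fixed_weights` and `fixed_bicubic` are hypothetical. Folding the rounding error into one tap so that the weights still sum to 256 is an assumption of this sketch, not something the original code is known to do.

```c
#include <stdint.h>

#define WEIGHT_SHIFT 8                       /* weights are scaled by 256 */
#define WEIGHT_ONE   (1 << WEIGHT_SHIFT)

/* Interpolation kernel S(x) from section 1. */
static double kernel_s(double x)
{
    double a = x < 0.0 ? -x : x;
    if (a < 1.0) return 1.0 - 2.0 * a * a + a * a * a;
    if (a < 2.0) return 4.0 - 8.0 * a + 5.0 * a * a - a * a * a;
    return 0.0;
}

/* Quantise the four kernel weights for fractional offset t into 16-bit integers. */
static void fixed_weights(double t, int16_t w[4])
{
    int k, sum = 0;
    for (k = 0; k < 4; ++k) {
        double s = kernel_s(t + 1.0 - k) * WEIGHT_ONE;
        w[k] = (int16_t)(s < 0.0 ? s - 0.5 : s + 0.5);   /* round to nearest */
        sum += w[k];
    }
    /* Assumption: push the rounding error into the tap at offset 0
       so that the four weights still sum to exactly 256. */
    w[1] = (int16_t)(w[1] + (WEIGHT_ONE - sum));
}

/* Integer-only bicubic result for a 4x4 patch p (row-major, P1..P16). */
static uint8_t fixed_bicubic(const uint8_t p[16],
                             const int16_t su[4], const int16_t sv[4])
{
    int32_t acc = 0;
    int m, n;
    for (n = 0; n < 4; ++n) {
        int32_t temp = 0;                    /* iTemp for column n, scaled by 256 */
        for (m = 0; m < 4; ++m)
            temp += su[m] * p[4 * m + n];
        acc += sv[n] * temp;                 /* now scaled by 256 * 256 */
    }
    acc += 1 << (2 * WEIGHT_SHIFT - 1);      /* rounding bias before the shift */
    if (acc < 0) return 0;                   /* clamp negative overshoot */
    acc >>= 2 * WEIGHT_SHIFT;
    return (uint8_t)(acc > 255 ? 255 : acc);
}
```

The 16-bit weights are what force the image bytes to be widened to 16 bits before a packed multiply, which is the register-space cost described in problem 3.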
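
Problems 1 and 2 concern data layout rather than arithmetic, and a scalar sketch makes the cost visible. The helpers below are hypothetical. They assume that each pixel occupies three consecutive bytes and that the padding byte becomes the top byte of each 32-bit lane; the channel order inside the lane is not specified by the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand four packed 3-byte pixels (12 bytes) into four 32-bit lanes
   whose top byte is zero, i.e. the "0RGB | 0RGB | 0RGB | 0RGB" layout. */
static void unpack_rgb24_x4(const uint8_t *src, uint32_t dst[4])
{
    int k;
    for (k = 0; k < 4; ++k) {
        const uint8_t *p = src + 3 * k;
        dst[k] = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    }
}

/* True when ptr is suitable for an aligned 16-byte load (MOVDQA). */
static int is_aligned_16(const void *ptr)
{
    return ((uintptr_t)ptr & 15u) == 0;
}

/* Number of bytes to handle one at a time before ptr reaches
   the next 16-byte boundary (0 if it is already aligned). */
static size_t bytes_to_alignment(const void *ptr)
{
    return (size_t)((16u - ((uintptr_t)ptr & 15u)) & 15u);
}
```

One common way to use such helpers is to process the first `bytes_to_alignment(row)` bytes of a row with scalar code and switch to aligned loads afterwards. Whether that pays off for 3-byte pixels depends on the row stride and would need to be measured.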
Appendix: SSE2 instruction summary

Arithmetic instructions:

| Instruction | Extension | Syntax | Description |
| --- | --- | --- | --- |
| ADDPD - Packed Double-Precision Floating-Point Add | SSE2 | ADDPD xmm0, xmm1/m128 | Adds the 2 packed doubles element-wise |
| ADDPS - Packed Single-Precision Floating-Point Add | SSE | ADDPS xmm0, xmm1/m128 | Adds the 4 packed floats element-wise |
| ADDSD - Scalar Double-Precision Floating-Point Add | SSE2 | ADDSD xmm0, xmm1/m64 | Adds the low double only |
| ADDSS - Scalar Single-Precision Floating-Point Add | SSE | ADDSS xmm0, xmm1/m32 | Adds the low float only |

PADDB/PADDW/PADDD - Packed Add

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F FC /r | PADDB mm, mm/m64 | Add packed byte integers from mm/m64 and mm. |
| 66 0F FC /r | PADDB xmm1, xmm2/m128 | Add packed byte integers from xmm2/m128 and xmm1. |
| 0F FD /r | PADDW mm, mm/m64 | Add packed word integers from mm/m64 and mm. |
| 66 0F FD /r | PADDW xmm1, xmm2/m128 | Add packed word integers from xmm2/m128 and xmm1. |
| 0F FE /r | PADDD mm, mm/m64 | Add packed doubleword integers from mm/m64 and mm. |
| 66 0F FE /r | PADDD xmm1, xmm2/m128 | Add packed doubleword integers from xmm2/m128 and xmm1. |

PADDQ - Packed Quadword Add

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F D4 /r | PADDQ mm1, mm2/m64 | Add quadword integer mm2/m64 to mm1. |
| 66 0F D4 /r | PADDQ xmm1, xmm2/m128 | Add packed quadword integers xmm2/m128 to xmm1. |

PADDSB/PADDSW - Packed Add with Saturation

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F EC /r | PADDSB mm, mm/m64 | Add packed signed byte integers from mm/m64 and mm and saturate the results. |
| 66 0F EC /r | PADDSB xmm1, xmm2/m128 | Add packed signed byte integers from xmm2/m128 and xmm1 and saturate the results. |
| 0F ED /r | PADDSW mm, mm/m64 | Add packed signed word integers from mm/m64 and mm and saturate the results. |
| 66 0F ED /r | PADDSW xmm1, xmm2/m128 | Add packed signed word integers from xmm2/m128 and xmm1 and saturate the results. |

PADDUSB/PADDUSW - Packed Add Unsigned with Saturation

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F DC /r | PADDUSB mm, mm/m64 | Add packed unsigned byte integers from mm/m64 and mm and saturate the results. |
| 66 0F DC /r | PADDUSB xmm1, xmm2/m128 | Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results. |
| 0F DD /r | PADDUSW mm, mm/m64 | Add packed unsigned word integers from mm/m64 and mm and saturate the results. |
| 66 0F DD /r | PADDUSW xmm1, xmm2/m128 | Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results. |

PMADDWD - Packed Multiply and Add

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F F5 /r | PMADDWD mm, mm/m64 | Multiply the packed words in mm by the packed words in mm/m64. Add the 32-bit pairs of results and store in mm as doublewords. |
| 66 0F F5 /r | PMADDWD xmm1, xmm2/m128 | Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, and add the adjacent doubleword results. |

PSADBW - Packed Sum of Absolute Differences

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F F6 /r | PSADBW mm1, mm2/m64 | Absolute difference of packed unsigned byte integers from mm2/m64 and mm1; differences are then summed to produce an unsigned word integer result. |
| 66 0F F6 /r | PSADBW xmm1, xmm2/m128 | Absolute difference of packed unsigned byte integers from xmm2/m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two word integer results. |

PSUBB/PSUBW/PSUBD - Packed Subtract

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F F8 /r | PSUBB mm, mm/m64 | Subtract packed byte integers in mm/m64 from packed byte integers in mm. |
| 66 0F F8 /r | PSUBB xmm1, xmm2/m128 | Subtract packed byte integers in xmm2/m128 from packed byte integers in xmm1. |
| 0F F9 /r | PSUBW mm, mm/m64 | Subtract packed word integers in mm/m64 from packed word integers in mm. |
| 66 0F F9 /r | PSUBW xmm1, xmm2/m128 | Subtract packed word integers in xmm2/m128 from packed word integers in xmm1. |
| 0F FA /r | PSUBD mm, mm/m64 | Subtract packed doubleword integers in mm/m64 from packed doubleword integers in mm. |
| 66 0F FA /r | PSUBD xmm1, xmm2/m128 | Subtract packed doubleword integers in xmm2/m128 from packed doubleword integers in xmm1. |

PSUBQ - Packed Subtract Quadword

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F FB /r | PSUBQ mm1, mm2/m64 | Subtract quadword integer in mm2/m64 from mm1. |
| 66 0F FB /r | PSUBQ xmm1, xmm2/m128 | Subtract packed quadword integers in xmm2/m128 from xmm1. |

PSUBSB/PSUBSW - Packed Subtract with Saturation

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F E8 /r | PSUBSB mm, mm/m64 | Subtract signed packed bytes in mm/m64 from signed packed bytes in mm and saturate results. |
| 66 0F E8 /r | PSUBSB xmm1, xmm2/m128 | Subtract packed signed byte integers in xmm2/m128 from packed signed byte integers in xmm1 and saturate results. |
| 0F E9 /r | PSUBSW mm, mm/m64 | Subtract signed packed words in mm/m64 from signed packed words in mm and saturate results. |
| 66 0F E9 /r | PSUBSW xmm1, xmm2/m128 | Subtract packed signed word integers in xmm2/m128 from packed signed word integers in xmm1 and saturate results. |

PSUBUSB/PSUBUSW - Packed Subtract Unsigned with Saturation

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F D8 /r | PSUBUSB mm, mm/m64 | Subtract unsigned packed bytes in mm/m64 from unsigned packed bytes in mm and saturate result. |
| 66 0F D8 /r | PSUBUSB xmm1, xmm2/m128 | Subtract packed unsigned byte integers in xmm2/m128 from packed unsigned byte integers in xmm1 and saturate result. |
| 0F D9 /r | PSUBUSW mm, mm/m64 | Subtract unsigned packed words in mm/m64 from unsigned packed words in mm and saturate result. |
| 66 0F D9 /r | PSUBUSW xmm1, xmm2/m128 | Subtract packed unsigned word integers in xmm2/m128 from packed unsigned word integers in xmm1 and saturate result. |

SUBPD/SUBPS/SUBSD/SUBSS - Floating-Point Subtract

| Opcode | Instruction | Description |
| --- | --- | --- |
| 66 0F 5C /r | SUBPD xmm1, xmm2/m128 | Packed double-precision subtract: subtract packed double-precision floating-point values in xmm2/m128 from xmm1. |
| 0F 5C /r | SUBPS xmm1, xmm2/m128 | Packed single-precision subtract: subtract packed single-precision floating-point values in xmm2/m128 from xmm1. |
| F2 0F 5C /r | SUBSD xmm1, xmm2/m64 | Scalar double-precision subtract: subtract the low double-precision floating-point value in xmm2/m64 from xmm1. |
| F3 0F 5C /r | SUBSS xmm1, xmm2/m32 | Scalar single-precision subtract: subtract the low single-precision floating-point value in xmm2/m32 from xmm1. |

PMULHUW - Packed Multiply High Unsigned

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F E4 /r | PMULHUW mm1, mm2/m64 | Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
| 66 0F E4 /r | PMULHUW xmm1, xmm2/m128 | Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |

PMULHW - Packed Multiply High Signed

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F E5 /r | PMULHW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1. |
| 66 0F E5 /r | PMULHW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1. |

PMULLW - Packed Multiply Low Signed

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F D5 /r | PMULLW mm, mm/m64 | Multiply the packed signed word integers in mm1 register and mm2/m64, and store the low 16 bits of the results in mm1. |
| 66 0F D5 /r | PMULLW xmm1, xmm2/m128 | Multiply the packed signed word integers in xmm1 and xmm2/m128, and store the low 16 bits of the results in xmm1. |

PMULUDQ - Multiply Doubleword Unsigned

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F F4 /r | PMULUDQ mm1, mm2/m64 | Multiply unsigned doubleword integer in mm1 by unsigned doubleword integer in mm2/m64, and store the quadword result in mm1. |
| 66 0F F4 /r | PMULUDQ xmm1, xmm2/m128 | Multiply packed unsigned doubleword integers in xmm1 by packed unsigned doubleword integers in xmm2/m128, and store the quadword results in xmm1. |

PMULUDQ instruction with 64-bit operands:

    DEST[63-0] ← DEST[31-0] * SRC[31-0];

PMULUDQ instruction with 128-bit operands:

    DEST[63-0]   ← DEST[31-0]  * SRC[31-0];
    DEST[127-64] ← DEST[95-64] * SRC[95-64];

MULPD - Packed Double-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
| --- | --- | --- |
| 66 0F 59 /r | MULPD xmm1, xmm2/m128 | Multiply packed double-precision floating-point values in xmm2/m128 by xmm1. |

    DEST[63-0]   ← DEST[63-0]   * SRC[63-0];
    DEST[127-64] ← DEST[127-64] * SRC[127-64];

MULPS - Packed Single-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F 59 /r | MULPS xmm1, xmm2/m128 | Multiply packed single-precision floating-point values in xmm2/m128 by xmm1. |

    DEST[31-0]   ← DEST[31-0]   * SRC[31-0];
    DEST[63-32]  ← DEST[63-32]  * SRC[63-32];
    DEST[95-64]  ← DEST[95-64]  * SRC[95-64];
    DEST[127-96] ← DEST[127-96] * SRC[127-96];

MULSD - Scalar Double-Precision Floating-Point Multiply

| Opcode | Instruction | Description |
| --- | --- | --- |
| F2 0F 59 /r | MULSD xmm1, xmm2/m64 | Multiply the low double-precision floating-point value in xmm2/m64 by the low double-precision floating-point value in xmm1. |

    DEST[63-0] ← DEST[63-0] * xmm2/m64[63-0];
    * DEST[127-64] remains unchanged *;

MULSS - Scalar Single-FP Multiply

| Opcode | Instruction | Description |
| --- | --- | --- |
| F3 0F 59 /r | MULSS xmm1, xmm2/m32 | Multiply the low single-precision floating-point value in xmm2/m32 by the low single-precision floating-point value in xmm1. |

    DEST[31-0] ← DEST[31-0] * SRC[31-0];
    * DEST[127-32] remains unchanged *;

DIVPD - Packed Double-Precision Floating-Point Divide

    DIVPD xmm0, xmm1/m128

    DEST[63-0]   ← DEST[63-0]   / SRC[63-0];
    DEST[127-64] ← DEST[127-64] / SRC[127-64];

DIVPS - Packed Single-Precision Floating-Point Divide

    DIVPS xmm0, xmm1/m128

    DEST[31-0]   ← DEST[31-0]   / SRC[31-0];
    DEST[63-32]  ← DEST[63-32]  / SRC[63-32];
    DEST[95-64]  ← DEST[95-64]  / SRC[95-64];
    DEST[127-96] ← DEST[127-96] / SRC[127-96];

DIVSD - Scalar Double-Precision Floating-Point Divide

    DIVSD xmm0, xmm1/m64

    DEST[63-0] ← DEST[63-0] / SRC[63-0];
    * DEST[127-64] remains unchanged *;

DIVSS - Scalar Single-Precision Floating-Point Divide

    DIVSS xmm0, xmm1/m32

    DEST[31-0] ← DEST[31-0] / SRC[31-0];
    * DEST[127-32] remains unchanged *;

PAVGB/PAVGW - Packed Average

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F E0 /r | PAVGB mm1, mm2/m64 | Average packed unsigned byte integers from mm2/m64 and mm1, with rounding. |
| 66 0F E0 /r | PAVGB xmm1, xmm2/m128 | Average packed unsigned byte integers from xmm2/m128 and xmm1, with rounding. |
| 0F E3 /r | PAVGW mm1, mm2/m64 | Average packed unsigned word integers from mm2/m64 and mm1, with rounding. |
| 66 0F E3 /r | PAVGW xmm1, xmm2/m128 | Average packed unsigned word integers from xmm2/m128 and xmm1, with rounding. |

PMAXSW - Packed Signed Integer Word Maximum

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F EE /r | PMAXSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for maximum values. |
| 66 0F EE /r | PMAXSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for maximum values. |

PMAXUB - Packed Unsigned Integer Byte Maximum

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F DE /r | PMAXUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for maximum values. |
| 66 0F DE /r | PMAXUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for maximum values. |

PMINSW - Packed Signed Integer Word Minimum

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F EA /r | PMINSW mm1, mm2/m64 | Compare signed word integers in mm2/m64 and mm1 for minimum values. |
| 66 0F EA /r | PMINSW xmm1, xmm2/m128 | Compare signed word integers in xmm2/m128 and xmm1 for minimum values. |

PMINUB - Packed Unsigned Integer Byte Minimum

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F DA /r | PMINUB mm1, mm2/m64 | Compare unsigned byte integers in mm2/m64 and mm1 for minimum values. |
| 66 0F DA /r | PMINUB xmm1, xmm2/m128 | Compare unsigned byte integers in xmm2/m128 and xmm1 for minimum values. |

RCPPS - Packed Single-Precision Floating-Point Reciprocal

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F 53 /r | RCPPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the packed single-precision floating-point values in xmm2/m128. |

    DEST[31-0]   ← APPROXIMATE(1.0/(SRC[31-0]));
    DEST[63-32]  ← APPROXIMATE(1.0/(SRC[63-32]));
    DEST[95-64]  ← APPROXIMATE(1.0/(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/(SRC[127-96]));

RCPSS - Scalar Single-Precision Floating-Point Reciprocal

| Opcode | Instruction | Description |
| --- | --- | --- |
| F3 0F 53 /r | RCPSS xmm1, xmm2/m32 | Returns to xmm1 the approximation of the reciprocal of the low single-precision floating-point value in xmm2/m32. |

    DEST[31-0] ← APPROX(1.0/(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

RSQRTPS - Packed Single-Precision Floating-Point Square Root Reciprocal

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F 52 /r | RSQRTPS xmm1, xmm2/m128 | Returns to xmm1 the packed approximations of the reciprocals of the square roots of the packed single-precision floating-point values in xmm2/m128. |

    DEST[31-0]   ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    DEST[63-32]  ← APPROXIMATE(1.0/SQRT(SRC[63-32]));
    DEST[95-64]  ← APPROXIMATE(1.0/SQRT(SRC[95-64]));
    DEST[127-96] ← APPROXIMATE(1.0/SQRT(SRC[127-96]));

RSQRTSS - Scalar Single-Precision Floating-Point Square Root Reciprocal

| Opcode | Instruction | Description |
| --- | --- | --- |
| F3 0F 52 /r | RSQRTSS xmm1, xmm2/m32 | Returns to xmm1 an approximation of the reciprocal of the square root of the low single-precision floating-point value in xmm2/m32. |

    DEST[31-0] ← APPROXIMATE(1.0/SQRT(SRC[31-0]));
    * DEST[127-32] remains unchanged *;

SQRTPD/SQRTPS/SQRTSD/SQRTSS - Floating-Point Square Root

| Opcode | Instruction | Description |
| --- | --- | --- |
| 66 0F 51 /r | SQRTPD xmm1, xmm2/m128 | Packed double-precision square root: computes square roots of the packed double-precision floating-point values in xmm2/m128 and stores the results in xmm1. |
| 0F 51 /r | SQRTPS xmm1, xmm2/m128 | Packed single-precision square root: computes square roots of the packed single-precision floating-point values in xmm2/m128 and stores the results in xmm1. |
| F2 0F 51 /r | SQRTSD xmm1, xmm2/m64 | Scalar double-precision square root: computes the square root of the low double-precision floating-point value in xmm2/m64 and stores the result in xmm1. |
| F3 0F 51 /r | SQRTSS xmm1, xmm2/m32 | Scalar single-precision square root: computes the square root of the low single-precision floating-point value in xmm2/m32 and stores the result in xmm1. |

Move instructions:

MASKMOVDQU - Mask Move of Double Quadword Unaligned

    MASKMOVDQU xmm0, xmm1
MASKMOVQ - Mask Move of Quadword

    MASKMOVQ mm0, mm1

MOVAPD - Move Aligned Packed Double-Precision Floating-Point Values

    MOVAPD xmm0, xmm1/m128
    MOVAPD xmm1/m128, xmm0

MOVAPS - Move Aligned Packed Single-Precision Floating-Point Values

    MOVAPS xmm0, xmm1/m128

MOVD - Move Doubleword

| Instruction | Description |
| --- | --- |
| MOVD mm, r/m32 | Move doubleword from r/m32 to mm. |
| MOVD r/m32, mm | Move doubleword from mm to r/m32. |
| MOVD xmm, r/m32 | Move doubleword from r/m32 to xmm. |
| MOVD r/m32, xmm | Move doubleword from xmm register to r/m32. |

MOVDQ2Q - Move Quadword

| Instruction | Description |
| --- | --- |
| MOVDQ2Q mm, xmm | Move low quadword from xmm to mmx register. |

MOVQ2DQ - Move Quadword

| Opcode | Instruction | Description |
| --- | --- | --- |
| F3 0F D6 | MOVQ2DQ xmm, mm | Move quadword from mmx to low quadword of xmm. |

    DEST[63-0]   ← SRC[63-0];
    DEST[127-64] ← 0000000000000000H;

MOVDQA - Move Aligned Double Quadword

| Instruction | Description |
| --- | --- |
| MOVDQA xmm1, xmm2/m128 | Move aligned double quadword from xmm2/m128 to xmm1. |
| MOVDQA xmm2/m128, xmm1 | Move aligned double quadword from xmm1 to xmm2/m128. |

MOVDQU - Move Unaligned Double Quadword

| Instruction | Description |
| --- | --- |
| MOVDQU xmm1, xmm2/m128 | Move unaligned double quadword from xmm2/m128 to xmm1. |
| MOVDQU xmm2/m128, xmm1 | Move unaligned double quadword from xmm1 to xmm2/m128. |

MOVHLPS - Move Packed Single-Precision Floating-Point Values High to Low

| Instruction | Description |
| --- | --- |
| MOVHLPS xmm1, xmm2 | Move two packed single-precision floating-point values from high quadword of xmm2 to low quadword of xmm1. |

    DEST[63-0] ← SRC[127-64];
    * DEST[127-64] unchanged *;

MOVLHPS - Move Packed Single-Precision Floating-Point Values Low to High

| Instruction | Description |
| --- | --- |
| MOVLHPS xmm1, xmm2 | Move two packed single-precision floating-point values from low quadword of xmm2 to high quadword of xmm1. |

MOVHPD - Move High Packed Double-Precision Floating-Point Value

| Instruction | Description |
| --- | --- |
| MOVHPD xmm, m64 | Move double-precision floating-point value from m64 to high quadword of xmm. |
| MOVHPD m64, xmm | Move double-precision floating-point value from high quadword of xmm to m64. |

MOVHPD instruction for memory to XMM move:

    DEST[127-64] ← SRC;
    * DEST[63-0] unchanged *;

MOVHPD instruction for XMM to memory move:

    DEST ← SRC[127-64];

MOVHPS - Move High Packed Single-Precision Floating-Point Values

| Instruction | Description |
| --- | --- |
| MOVHPS xmm, m64 | Move two packed single-precision floating-point values from m64 to high quadword of xmm. |
| MOVHPS m64, xmm | Move two packed single-precision floating-point values from high quadword of xmm to m64. |

MOVLPD - Move Low Packed Double-Precision Floating-Point Value

| Instruction | Description |
| --- | --- |
| MOVLPD xmm, m64 | Move double-precision floating-point value from m64 to low quadword of xmm register. |
| MOVLPD m64, xmm | Move double-precision floating-point value from low quadword of xmm register to m64. |

MOVLPS - Move Low Packed Single-Precision Floating-Point Values

| Opcode | Instruction | Description |
| --- | --- | --- |
| 0F 12 /r | MOVLPS xmm, m64 | Move two packed single-precision floating-point values from m64 to low quadword of xmm. |
| 0F 13 /r | MOVLPS m64, xmm | Move two packed single-precision floating-point values from low quadword of xmm to m64. |

MOVMSKPD - Extract Packed Double-Precision Floating-Point Sign Mask

    MOVMSKPD r32, xmm

    DEST[0]    ← SRC[63];
    DEST[1]    ← SRC[127];
    DEST[3-2]  ← 00B;
    DEST[31-4] ← 0000000H;

MOVMSKPS - Extract Packed Single-Precision Floating-Point Sign Mask

    MOVMSKPS r32, xmm

    DEST[0]    ← SRC[31];
    DEST[1]    ← SRC[63];
    DEST[2]    ← SRC[95];
    DEST[3]    ← SRC[127];
    DEST[31-4] ← 0000000H;

MOVNTDQ - Move Double Quadword Non-Temporal

| Opcode | Instruction | Description |
| --- | --- | --- |
| 66 0F E7 /r | MOVNTDQ m12 [truncated in source] | Move double quadword from xmm to m128, [truncated in source] |
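
The per-lane behaviour of a few packed-integer instructions from the appendix can be made concrete with scalar C. This is a sketch of the arithmetic in a single lane only, not of the instructions themselves: the function names are hypothetical, and a real SSE2 instruction applies the same operation to every lane of the register at once.

```c
#include <stdint.h>

/* PADDUSB lane: unsigned byte add, clamped to 255 instead of wrapping. */
static uint8_t paddusb_lane(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    return (uint8_t)(sum > 255u ? 255u : sum);
}

/* PADDSW lane: signed word add, clamped to the range -32768..32767. */
static int16_t paddsw_lane(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + b;
    if (sum > 32767) sum = 32767;
    if (sum < -32768) sum = -32768;
    return (int16_t)sum;
}

/* PMADDWD lane: two signed 16x16 products added into one 32-bit result.
   The sum fits in 32 bits except when all four inputs are -32768;
   that single corner case is not handled by this sketch. */
static int32_t pmaddwd_lane(int16_t a0, int16_t b0, int16_t a1, int16_t b1)
{
    return (int32_t)a0 * b0 + (int32_t)a1 * b1;
}

/* PAVGB lane: unsigned byte average, rounded up on ties. */
static uint8_t pavgb_lane(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b + 1u) >> 1);
}

/* PMULHW lane: high 16 bits of the signed 16x16 product.
   Division by 65536 with floor rounding avoids shifting a negative value. */
static int16_t pmulhw_lane(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * b;
    int32_t high = prod / 65536;
    if (prod % 65536 < 0) --high;            /* round toward negative infinity */
    return (int16_t)high;
}
```

PMADDWD is a natural fit for the interpolation sums in section 2: with 16-bit weights and pixel values widened to 16 bits, one lane computes a term pair such as Su0 * P1 + Su1 * P5.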