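Bilingual Teaching Practice in Introduction to High Performance Computing
Xiaoge Wang (王小鴿)
Department of Computer Science and Technology; Common Platform Division, National Laboratory
2008 Advanced Workshop on Teaching Reform for Computer Science Programs in Higher Education (May 17-18, Beijing)

Outline
  • General overview
  • Courseware examples
  • Experience and reflections
  • Looking ahead

General Overview: Three Stages of the Introduction to High Performance Computing Course
  • Specialized course taught in English (required course, 1998-2002)
      • Positioning: replaces the Specialized English course.
      • Characteristics: teaching is organized around high performance computing as the main thread, with the use of English as the primary goal.
  • Specialized course taught in English (elective course, 2003-2006)
      • Positioning: teaching subject knowledge is the goal; English is the means.
      • Characteristics: equal emphasis on subject knowledge and English training.
  • Specialized course taught bilingually (elective course, 2007- )
      • Positioning: teaching subject knowledge is the goal; bilingual instruction is the means.
      • Characteristics: equal emphasis on subject knowledge and English training, with more attention paid to how effectively the subject knowledge is taught.

General Overview: Course Organization
  • Textbook: an original English-language edition
      • A textbook rather than a monograph
      • A reprint edition is available in China
  • Class hours: 32 hours per semester
  • Format: lectures + discussion + labs + homework + exam

General Overview: Bilingual Requirements
  • For the teacher
      • Preview: announce upcoming material in advance
      • Lecturing: English + Chinese explanation where appropriate
      • Setting questions: English
  • For students
      • Preview reading before class
      • Speaking in class: English + Chinese explanation where appropriate
      • Homework: English is encouraged (bonus points)
      • Exam: open book, dictionaries and notes allowed; English + a small amount of Chinese annotation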

Courseware Examples
  • Course introduction
  • Lecture notes example
  • Homework example
  • Exam questions and answer sheet examples

Introduction to High Performance Computing
Xiaoge Wang

Course Syllabus
Textbook:
  [1] Ian Foster, “Designing and Building Parallel Programs” (Posts & Telecom Press, English edition)
References:
  [1] Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar, “Introduction to Parallel Computing” (China Machine Press, Chinese and English editions)
  [2] Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill, “Patterns for Parallel Programming” (Tsinghua University Press, translated edition)
  [3] Michael Quinn, “Parallel Programming in C with MPI and OpenMP” (Tsinghua University Press, reprint edition)

Course Syllabus
Objectives:
  • Answer the questions:
      • What is HPC?
      • Why HPC?
      • How to do HPC?
  • Learn some basic concepts, algorithms and tools of HPC.
  • Improve English skills.
      • Instruction, discussion, homework, presentation

[Diagram: HPC divided into concepts (task partition, scheduling, performance model), tools (MPI, OpenMP, HPF) and algorithms (linear algebra, search, sort)]

Course Syllabus
Grade Policy:
  • Homework: 45%
  • Classroom performance: 15%
  • Final exam: 40%
  • No tolerance for cheating
Office Hour (English corner): Tuesday, 8-9 pm, FIT Building, room 3-412

Lesson One: Introduction

Introduction
  • What is HPC?
  • Current development of HPC
  • Overview of concepts

What is HPC?
  • Definition
  • Components
  • Applications

What is HPC? - Definitions
Definitions of High Performance Computing on the Web:
  • A branch of computer science that concentrates on developing supercomputers and software to run on supercomputers. A main area of this discipline is developing parallel processing algorithms and software: programs that can be divided into little pieces so that each piece can be executed simultaneously by separate processors.
    /anime3/internet/programming.htm
  • The field of high performance computing (HPC) comprises computing applications on (parallel) supercomputers and computer clusters. Most ideas for the new wave of grid computing were originally borrowed from HPC.
    /wiki/High_Performance_Computing

What is HPC? - Components
  • Hardware: supercomputer, cluster, switch, network
  • Software: OS, shared/distributed memory management, file systems, parallel programming tools
  • Algorithm: parallel/distributed algorithm design

What is HPC? - Applications
  • Modern science and engineering
      • Grand challenges: quantum chemistry, cosmology, astrophysics, CFD, material design, biology, genome sequencing, global weather and environmental modeling, ...
  • Information technology
      • Web services, data mining, search engine, information retrieval, ...

Current Development of HPC
  • Trends in computer design
  • Trends in networking
  • Trends in software design

Current Development of HPC - Trends in Computer Design
  • High performance is still an important goal.
  • Multicore technology is maturing.
  • The multiprocessor is still the main architecture.
  • The multicomputer is becoming the foundation of large-scale Cyber-Infrastructure (CI).

[Figure. From /]

Earth Simulator
  • Based on the NEC SX architecture: 640 nodes, each node with 8 vector processors (8 Gflop/s peak per processor), 2 ns cycle time, 16 GB shared memory.
  • A total of 5120 processors, 40 TFlop/s peak, and 10 TB memory.
  • A single-stage crossbar (1800 miles of cable), 83,000 copper cables, 16 GB/s cross-section bandwidth.
  • 700 TB disk space
  • 1.6 PB mass store
  • Area of computer = 4 tennis courts, 3 floors

BlueGene/L
  • Site: DOE/NNSA/LLNL
  • System Model: eServer Blue Gene Solution
  • Vendor: IBM
  • Application area: Research
  • Main Memory: 32768 GB
  • Installation Year: 2005
  • Operating System: CNK/SLES 9
  • Interconnect: Proprietary
  • Processor: PowerPC 440 700 MHz (2.8 GFlops)

BlueGene/L
BlueGene/L boasts a peak speed of over 360 teraFLOPS, a total memory of 32 tebibytes, total power of 1.5 megawatts, and machine floor space of 2,500 square feet. The full system has 65,536 dual-processor compute nodes. Multiple communications networks enable extreme application scaling:
  • Nodes are configured as a 32 x 32 x 64 3D torus; each node is connected in six different directions for nearest-neighbor communications.
  • A global reduction tree supports fast global operations such as global max/sum in a few microseconds over 65,536 nodes.
  • Multiple global barrier and interrupt networks allow fast synchronization of tasks across the entire machine within a few microseconds.
  • 1,024 gigabit-per-second links to a global parallel file system support fast input/output to disk.

[Figure: BlueGene/L by IBM]

HPC in China
According to the statistics of the HPC Top500 (June 2007):

Countries          Count   Share %   Rmax Sum (GF)   Rpeak Sum (GF)   Processor Sum
US                   281    56.20%         3079489          4444621          816680
China (mainland)      13     2.60%           96403           174954           27660

The top system installed in China is listed as the 43rd of the Top500 (IBM).
See also: / for the China Top 100 supercomputer list.

HPC Facilities in Tsinghua
TH-Discovery 3
  • Architecture: cluster with 128 nodes, 256 CPUs, 1.3 TFLOP/s peak performance.
  • Node: HP Server rx2600, 4 GB PC2100 DDR-SDRAM memory quad (4x1GB DIMMs)
  • Storage: 200 TB
  • Software:
      • Red Hat Linux AS 3.0 ia64, kernel 2.4.21-20.EL
      • LSF Job Management System
      • MPI for parallel programming
      • Mathematical libraries
      • ChinaGrid Monitor

Current Development of HPC - Networking
  • High-speed interconnections
      • Proprietary (IBM, Cray)
      • Commercial products: InfiniBand, Ethernet, Myrinet, Quadrics
  • Internet
      • Internet usage in China

Internet Users in China
Data source: Statistical Report on the Development of the Internet in China (中國互聯網絡發展狀況統計報告), January 2007, the annual report on China's Internet development status.
Compared with two other sets of data (from the CIA World Factbook):
  • World: 1,018,057,389 / 6,602,224,175 = 15.4%
  • US: 208,824,428 / 301,139,947 = 69.3%
  • China: 137,000,000 / 1,321,851,888 = 10.4%
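The three shares quoted for Internet users can be rechecked with a few lines of arithmetic. The sketch below only reuses the user and population counts from the slide; the function and variable names are illustrative, not part of the original material.

```python
def share_percent(users, population):
    """Return users as a percentage of population, rounded to one decimal."""
    return round(100.0 * users / population, 1)

# (Internet users, total population) as quoted on the slide.
figures = {
    "World": (1_018_057_389, 6_602_224_175),
    "US": (208_824_428, 301_139_947),
    "China": (137_000_000, 1_321_851_888),
}

shares = {region: share_percent(users, population)
          for region, (users, population) in figures.items()}

for region, share in shares.items():
    print(f"{region}: {share}%")
```

The rounded results agree with the 15.4%, 69.3% and 10.4% figures given on the slide.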

Internet Users in China
Facts:
  • The rate of increase in the number of users has slowed down.
  • Internet users are only 10.4% of the total population (8.5% the previous year).

Internet Machines*
  • *Internet hosts: China (43rd): 232,780 (2006) vs. US (1st): 195,139,000 (2005)
    Source: /cia/publications/factbook/fields/2184.html

Internet Machines
Facts:
  • The growth rate increased slightly.
  • Dial-in and dedicated-line connections are decreasing, while broadband connections are increasing.

Bandwidth for International Links
  • Total international bandwidth reached 256,696 Mbps, an increase of 120,590 Mbps.
  • The growth rate is 88.6%.

Current Development of HPC - Grid Applications
  • Image processing Grid
      • Medical images for diagnosis
      • Remote sensing image processing and application
      • Digital human
  • Bio-Informatics Grid
      • Resource (computation power) sharing
  • Online-Courses Grid
      • Online courseware sharing
      • Online course broadcast
  • Computational Fluid Dynamics Grid
      • Software simulation tools sharing
  • Information Processing Grid
      • Digital Museum

History of Computing in Tsinghua
  • First computer degree program: 1956
  • Establishment of the Department of Computer Science and Technology: 1978
  • Establishment of the Computing Center: 1975
      • Single-user computer: DJS130
      • Imported mainframes: Honeywell, Fujitsu, IBM
      • PC labs and campus information systems
  • Establishment of the Common Platform Division in TNLIST: 2004
      • TH-Discovery3 (2005)

Computer Systems Research
Computers made in Tsinghua:
  • 1959-1964: J-911, vacuum tube
  • 1960: Analog computer
  • 1966: J-112, transistor
  • 1972-1974: J-724, a real-time computer
  • 1974: DJS-100 Series, integrated circuit
  • 1987: THUDS, concurrent computer, transputer
  • 1993: RISC processor
  • 1998: Linux cluster, peak 32 Gflops
  • 2003: TH-MANS, a massive storage networked system
  • 2005: TH-Discovery3, 25th in the China Top100

HPC Activities
Computational Science and Engineering Research (23 ongoing projects as of Jan. 2006):
  • Recombination rate estimation and hotspot detection in the human genome
  • The molecular evolution of microRNAs
  • Microstructures and thermo-physical properties of alloy melts and their effects on solidification structures
  • Parallel computing of the fast multipole boundary element method
  • Investigation of the integral equation method for the numerical computation of electromagnetic fields
  • DNS of multiphase flow with a mesh-less method
  • Efficient sub-graph mining algorithm and its applications
  • Theoretical study of the catalytic dissociation of hydrogen on Ni-Fe alloy surfaces
  • Investigation of the unfolding dynamics of the smallest protein
  • Pattern recognition and molecular validation of alternatively spliced genes
  • Computational optimization of a thermo-acoustic engine

Overview of Concepts
  • Parallel machine models
  • Parallel programming models
  • Parallel algorithm examples

Parallel Machine Models: The Requirements
  • General: allow the study of algorithms and programming languages to be independent of improvements in architecture.
  • Simple: facilitate understanding and programming.
  • Realistic: ensure that programs developed for the model execute with reasonable efficiency on real computers.

The Von Neumann Computer
  • A central processing unit (CPU)
  • A storage unit (memory)
  • A control unit
  • An I/O unit

The Multiplicity - From the Von Neumann Machine to Modern Parallel Machines
  • Multiple computers
  • Multiple CPUs
  • Multiple function units
  • Multiple instruction execution
  • Multiple levels of cache
  • Multiple

Flynn's Taxonomy

                               Data stream: single    Data stream: multiple
Instruction stream: single     SISD                   SIMD
Instruction stream: multiple   MISD                   MIMD

  • SISD: uniprocessor
  • SIMD
      • Processor array
      • Pipelined vector processors
  • MISD
      • Systolic array
  • MIMD
      • Multiprocessors
      • Multicomputers

Parallel Programming Models

Additional Properties of Parallel Software
  • Concurrency: each node executes its own program.
  • Scalability: the number of nodes could vary.
  • Locality: the cost of accesses to local memory is less than the cost of accesses to remote memory.

Parallel Program Requirements
A good parallel program has:
  • Concurrency: the ability to perform many actions simultaneously.
  • Locality: a high ratio of local memory access to remote memory access.
  • Scalability: resilience to increasing processor counts.

Example
Bridge construction: a bridge is to be assembled from girders being constructed at a foundry.
[Figure: (a) the foundry sends girders to the bridge; (b) the foundry sends girders to the bridge, and the bridge sends requests to the foundry]
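As a rough illustration of variant (a) of the bridge example, the foundry and the bridge can be modeled as two concurrent tasks joined by a message queue. This is only a sketch under stated assumptions: Python threads and `queue.Queue` stand in for tasks and the girder channel, and the names `foundry`, `bridge`, `build_bridge` and the `None` end-of-stream marker are illustrative choices, not part of the course material.

```python
import queue
import threading

def foundry(girders_out, count):
    """Producer task: builds girders and sends each one down the channel."""
    for i in range(count):
        girders_out.put(f"girder-{i}")
    girders_out.put(None)  # assumed end-of-stream marker: no more girders

def bridge(girders_in, assembled):
    """Consumer task: receives girders and assembles them in arrival order."""
    while True:
        girder = girders_in.get()  # blocks until a girder arrives
        if girder is None:
            break
        assembled.append(girder)

def build_bridge(count):
    """Run both tasks concurrently and return the assembled girders."""
    channel = queue.Queue()  # the single foundry-to-bridge channel
    assembled = []
    tasks = [
        threading.Thread(target=foundry, args=(channel, count)),
        threading.Thread(target=bridge, args=(channel, assembled)),
    ]
    for task in tasks:
        task.start()
    for task in tasks:
        task.join()
    return assembled

print(build_bridge(5))
```

Variant (b), where the bridge sends requests back to the foundry, would need a second queue in the opposite direction; giving the queue a `maxsize` is a related but different way to keep the foundry from running ahead of the bridge.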

A Parallel Programming Model
Tasks and channels:
  • One or more tasks, which may execute concurrently.
  • A task encapsulates a sequential program, local memory, and an interface to its environment (in-ports and out-ports).
  • A task can perform four additional actions: send messages, receive messages, create new tasks, and terminate.
  • Channels: message queues connecting in-port/out-port pairs.
  • The mapping (of tasks to physical processors) does not affect the semantics of a program.

Other Models
  • Message-passing: similar to the tasks and channels model.
  • Shared-memory
  • Data parallel: A+B, 2*A, ...
  • Other models: PRAM, BSP, C3, LogP, ...
[Figure: foundry, storage, bridge]

Parallel Algorithm Examples

Scientific Computing
  • Mathematical model of real-world problems: PDE, ODE, etc.
  • Numerical solution of mathematical problems:
      • Discrete methods: finite difference, finite elements, etc.
      • Solving linear equations: direct method or iterative method.
  • Implementation of numerical methods.

Finite Differences
To solve the equation $f''(x) = 0$, use the finite difference method:

$$f'(x + h/2) = \frac{f(x+h) - f(x)}{h} + O(h^2)$$

$$f'(x - h/2) = \frac{f(x) - f(x-h)}{h} + O(h^2)$$

$$f''(x) = \frac{[f(x+h) - f(x)] - [f(x) - f(x-h)]}{h^2} + O(h^2) = \frac{f(x-h) - 2f(x) + f(x+h)}{h^2} + O(h^2)$$

Discretize:

$$f(x_{i-1}) - 2f(x_i) + f(x_{i+1}) = 0, \quad i = 0, 1, \ldots, n-1$$

Use an iterative method to solve the equations:

$$f(x_i)^{(t+1)} = \frac{f(x_{i-1})^{(t)} + 2f(x_i)^{(t)} + f(x_{i+1})^{(t)}}{4}, \quad t = 1, 2, \ldots, T; \; i = 0, 1, \ldots, n-1$$
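The iterative update can be tried out sequentially before thinking about parallel tasks. The sketch below is a minimal version under one loud assumption: the two end points are held fixed as boundary values, which the slide does not state. The function name `relax` is illustrative.

```python
def relax(values, steps):
    """Repeat x_i <- (x_{i-1} + 2*x_i + x_{i+1}) / 4 on the interior points.

    Assumption (not stated in the slides): the first and last entries are
    boundary values and are never updated.
    """
    current = list(values)
    for _ in range(steps):
        nxt = current[:]  # boundary values are copied unchanged
        for i in range(1, len(current) - 1):
            # Every new value depends only on values from the previous step,
            # so all interior points could be updated concurrently.
            nxt[i] = (current[i - 1] + 2 * current[i] + current[i + 1]) / 4
        current = nxt
    return current

# Start from a poor initial guess between the boundary values 0 and 4.
result = relax([0, 0, 0, 0, 4], steps=200)
print([round(v, 6) for v in result])
```

With fixed end points the iteration settles on a straight line between them, which is what $f''(x) = 0$ requires; a profile that is already linear is left unchanged by the update.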

Finite Differences
A vector X is used to contain N points of f(x) on the problem domain:
  • Create N tasks, one for each point.
  • Each task is given the initial value $f(x_i)^{(0)}$ and computes $f(x_i)^{(t)}$, $t = 1, 2, \ldots, T$. At each step, it:
      • Sends its data $f(x_i)^{(t)}$ on its left and right outports.
      • Receives $f(x_{i-1})^{(t)}$ and $f(x_{i+1})^{(t)}$ from its left and right inports.
      • Uses these values to compute $f(x_i)^{(t+1)}$.

Pair-wise Interactions
The computation of all N(N-1) pair-wise interactions $I(X_i, X_j)$, $i \neq j$, between N data $X_0, X_1, \ldots, X_{N-1}$.
Parallel algorithm:
  • Create N tasks.
  • Task i is given $X_i$ and is responsible for computing the interactions $I(X_i, X_j)$, $i \neq j$.
Q: How many communication channels are needed?

Pair-wise Interactions
Answer #1: N(N-1) channels. Task i sends $X_i$ to its N-1 outports and receives $X_j$, $j \neq i$, from its N-1 inports.
[Figure: tasks 0-7]
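Answer #1 can be simulated sequentially to confirm the channel count. In this sketch a channel is modeled as a plain Python list keyed by a (sender, receiver) pair; that representation and the names `fully_connected_interactions` and `interact` are assumptions made for illustration only.

```python
def fully_connected_interactions(data, interact):
    """Simulate Answer #1: one channel for every ordered pair of tasks."""
    n = len(data)
    # Channel (i, j) connects an outport of task i to an inport of task j.
    channels = {(i, j): [] for i in range(n) for j in range(n) if i != j}

    # Send phase: task i puts X_i on each of its N-1 outports.
    for (src, _dst), chan in channels.items():
        chan.append(data[src])

    # Receive phase: task j reads X_i from each inport and computes I(X_j, X_i).
    results = {j: {} for j in range(n)}
    for (src, dst), chan in channels.items():
        results[dst][src] = interact(data[dst], chan.pop(0))

    return len(channels), results

count, results = fully_connected_interactions([1, 2, 4, 8], lambda a, b: a - b)
print(count)       # number of channels used for N = 4
print(results[2])  # interactions computed by task 2
```

For N = 4 the simulation creates 12 channels, matching N(N-1), and each task ends up holding N-1 interaction values.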

Pair-wise Interactions
Answer #2: N channels. Each task sends the most recently received data to its outport. Repeat N-1 times.
[Figure: tasks 0-7]

Pair-wise Interactions
Answer #3 (symmetric case): N+N channels. Each task sends the most recently received data and the associated accumulator to its outport. Repeat (N-1)/2 times.
[Figure: tasks 0-7]

Search

    procedure search(A)
    begin
      if (solution(A)) then
        score = eval(A)
        report solution and score
      else
        foreach child A(i) of A
          search(A(i))
        endfor
      endif
    end
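The search procedure translates almost line for line into a sequential Python function. This is a hedged sketch: the callback parameters `is_solution`, `evaluate` and `children`, and the toy bit-string problem used to exercise it, are hypothetical stand-ins for whatever problem the lecture had in mind.

```python
def search(node, is_solution, evaluate, children, found=None):
    """Depth-first tree search that collects (solution, score) pairs."""
    if found is None:
        found = []
    if is_solution(node):
        # Corresponds to "report solution and score" in the pseudocode.
        found.append((node, evaluate(node)))
    else:
        for child in children(node):
            search(child, is_solution, evaluate, children, found)
    return found

# Hypothetical toy problem: nodes are bit strings, leaves have length 3,
# and the score of a leaf is its number of 1 bits.
def is_leaf(bits):
    return len(bits) == 3

def count_ones(bits):
    return bits.count("1")

def extend(bits):
    return [bits + "0", bits + "1"]

solutions = search("", is_leaf, count_ones, extend)
print(solutions)
```

Each recursive call works on an independent subtree and only reports results upward, which is why the calls are natural candidates for separate tasks in a parallel version.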

Search
  • A single task is created for the root of the tree.
  • Create a new task for each search call.
  • Create a channel for each new task to return to its parent any solutions located in its sub-tree.
Q: Can the search be terminated completely when a solution is found?

Search
[Figure]

Parameter Study
  • A range of different input parameters is read from an input file.
  • The same computation is performed using
