lecture1-Introduction_第1頁
lecture1-Introduction_第2頁
lecture1-Introduction_第3頁
lecture1-Introduction_第4頁
lecture1-Introduction_第5頁
已閱讀5頁,還剩74頁未讀, 繼續(xù)免費(fèi)閱讀

下載本文檔

版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報或認(rèn)領(lǐng)

文檔簡介

1、12010 Xiaojie YuanYuan XiaojieDept. of Computer Science and technology, Nankai UniversitySlides adapted from material by Profs. Jeff Ullman (Stanford) and Art Keller (UCSC)22010 Xiaojie Yuan cover mostly relational databases how to design and create such databases how to use them (via SQL query lang

2、uage) how to implement them (only briefly) will touch on some advanced issues XML data models, data warehouse, data mining32010 Xiaojie Yuan Must have data structure and algorithm background Good at C+,Java,C# project will require lot of programming need C+ or Java or C# to do a good job at talking

3、with databases you or your project group picks the language Knowing only C will require more work more difficult to talk in C to databases42010 Xiaojie YuanDatabase Systems: The Complete Book, Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom, Prentice Hall, 2002.定價:65.00元北方區(qū)經(jīng)理:余勇電話:010-68

4、995264.88379625七折,送書上門)遺憾:沒有英文影印板發(fā)行52010 Xiaojie Yuan數(shù)據(jù)庫系統(tǒng)概論薩師煊、王珊 高等教育出版社數(shù)據(jù)庫系統(tǒng)原理王能斌 編著 電子工業(yè)出版社62010 Xiaojie Yuan“A First Course in Database System”, Jeffrey D.Ullman, Jennifer Widom 翻譯版,數(shù)據(jù)庫系統(tǒng)基礎(chǔ)教程, 清華大學(xué)出版社“Database System Implementation”,Hector Garcia-Molina, Jeffrey D.Ullman, 翻譯版,數(shù)據(jù)庫系統(tǒng)

5、實(shí)現(xiàn), 機(jī)械工業(yè)出版社 “Database System Concepts”,Third Edition, Fourth Edition, Abraham Silberschatz Henry F. Korth S. Sudarshan, 機(jī)械工業(yè)出版社 72010 Xiaojie YuanCS 145CS 245CS 346CS 345CS 347CS 395CS 545Fall, SpringWinterDB Systems ImplementationAdvanced TopicsTP + DDBsIndependent DB ProjectDB SeminarFallSpringSpr

6、ingAllFall, SpringA First Course in Database SystemDatabase System Principles82010 Xiaojie Yuan For all students two 70-min lectures / week (If you have conflicts, do let us know in advance, 5% ) 4 homeworks(Will be collected at the beginning of class on the due date,No late homework will be accepte

7、d, 10%) projects (10%) Midterm Examination( 15%, SQL, Computer) Final Examination Closed Book (60%) 92010 Xiaojie Yuan 講稿放在學(xué)院服務(wù)器上 Instructor: yuan xiaojie Room 309, 伯苓樓伯苓樓 Email: Office hours: wed. 10:40-11:30 (after lecture) TAs: 林偉堅(jiān) cn安誠 官瑩 102010 Xiaojie Yuan Suppose we a

8、re building a system to store the information about: students courses professors who takes what, who teaches what定義:定義:CTypedPtrList m_pDataList; 1 1 0 0 0 0 1 1 z z h h a a n n g g 1 1 0 0 0 0 2 2 w w a a n n g g 1 1 0 0 0 0 3 3 l l i i 1 1 0 0 0 0 4 4 z z h h a a o o m m _ _ p p D D a a t t a a L

9、L i i s s t t112010 Xiaojie Yuan store the data for a long period of time large amounts (100s of GB) protect against crashes protect against unauthorized use allow users to query/update: who teaches “CS 173” enroll “Mary” in “CS 311” allow several (100s, 1000s) users to access the data simultaneousl

10、y allow administrators to change the schema add information about TAs122010 Xiaojie Yuan Why Direct Implementation Wont Work: Storing data: file system is limited size less than 4GB (on 32 bits machines) when system crashes we may loose data password-based authorization insufficient Query/update: ne

11、ed to write a new C+/Java program for every new query need to worry about performance132010 Xiaojie Yuan Concurrency: limited protection need to worry about interfering with other users need to offer different views to different users (e.g. registrar, students, professors) Schema change: entails cha

12、nging file formats need to rewrite virtually all applications Better let a database system handle it142010 Xiaojie Yuan Data Definition Language - DDL Data Manipulation Language - DML query language Storage management Transaction Management concurrency control recovery152010 Xiaojie Yuan Requirement

13、s modeling (conceptual, pictures) Decide what entities should be part of the application and how they should be linked. Schema design and implementation Decide on a set of tables, attributes. Define the tables in the database system. Populate database (insert tuples). Write application programs usin

14、g the DBMS way easier now that the data management is taken care of.162010 Xiaojie YuanaddressnamefieldProfessorAdvisesTakesTeachesCourseStudentnamecategoryquarternamessncid172010 Xiaojie Yuan Tables: Separates the logical view from the physical view of the data.SSNNameCategory123-45-6789Charles und

15、ergrad234-56-7890DangradSSNCID123-45-6789CSE444123-45-6789CSE444234-56-7890CSE142Students:Takes:CIDNameQuarterCSE444DatabasesfallCSE541Operating systemswinterCourses:182010 Xiaojie Yuan Find all courses that “Mary” takes S(tructured) Q(uery) L(anguage) Query processor figures out how to answer the q

16、uery efficiently. select C.namefrom Students S, Takes T, Courses Cwhere S.name = “Mary” and S.ssn = T.ssn and T.cid = C.cid192010 Xiaojie YuanImperative query execution plan:select C.namefrom Students S, Takes T, Courses Cwhere S.name=“Mary” and S.ssn = T.ssn and T.cid = C.cidDeclarative SQL queryPl

17、an: tree of Relational Algebra operators, choice of algorithms at each operatorGoal:StudentsTakessid=sidsnamename=“Mary” cid=cidCourses202010 Xiaojie Yuan 實(shí)體聯(lián)系數(shù)據(jù)模型 關(guān)系數(shù)據(jù)模型與關(guān)系模式設(shè)計(jì) 數(shù)據(jù)庫語言SQL(含觸發(fā)器和存儲過程) 關(guān)系代數(shù) 數(shù)據(jù)存儲與索引 查詢執(zhí)行與優(yōu)化 系統(tǒng)故障對策 并發(fā)控制212010 Xiaojie Yuan Relational databases are a great success of theore

18、tical ideas. Big DBMS companies are among the largest software companies in the world. Oracle IBM (with DB2) Microsoft (SQL Server, Microsoft Access) Others $20B industry.222010 Xiaojie Yuan 1973年,Charles W.Bachman,”網(wǎng)狀數(shù)據(jù)庫之父”主持設(shè)計(jì)與開發(fā)了最早的網(wǎng)狀數(shù)據(jù)庫系統(tǒng)IDS推動和促成了數(shù)據(jù)庫標(biāo)準(zhǔn)的制定 1981年,Edgar Frank Codd,”關(guān)系數(shù)據(jù)庫之父” A relat

19、ional Model of Data for Large Shared Data Banks 1998年,James Gray或Jim Gray, 數(shù)據(jù)庫技術(shù)和“事務(wù)處理”專家 完整性、安全性、并發(fā)性、故障恢復(fù)232010 Xiaojie Yuan Several aspects: Modeling and design of databases Database programming: querying and update operations Database implementation DBMS study cuts across many fields of Computer

20、Science: OS, languages, AI, Logic, multimedia, theory.242010 Xiaojie Yuan Suppose you are working on database project Step 0: pick an application domain we will talk about this later Step 1: conceptual design discuss with your team mates what to model in the application domain need a modeling langua

21、ge to express what you want ER model is the most popular such language output: an ER diagram of the app. domain252010 Xiaojie Yuan Step 2: pick a type of DBMS relational DBMS is most popular and is our focus Step 3: translate ER design to a relational schema use a set of rules to translate from ER t

22、o rel. schema use a set of schema refinement rules to transform the above rel. schema into a good rel. schema At this point you have a good relational schema on paper262010 Xiaojie Yuan Subsequent steps include implement your relational DBMS using a database programming language called SQL ordinary

23、users cannot interact with the database directly and the database also cannot do everything you want hence write your application program in C+, Java, ASP, etc to handle the interaction and take care of things that the database cannot do So, the first thing we should start with is to learn ER model

24、.272010 Xiaojie Yuan Gives us a language to specify what information the db must hold what are the relationships among components of that information Proposed by Peter Chen in 1976 What we will cover basic stuff constraints weak entity sets design principles282010 Xiaojie Yuan Entity = “thing” or ob

25、ject. Entity set = collection of similar entities. Similar to a class in object-oriented languages. Attribute = property of an entity set. Generally, all entities in a set have the same properties. Attributes are simple values, e.g. integers or character strings.292010 Xiaojie Yuan In an entity-rela

26、tionship diagram, each entity set is represented by a rectangle. Each attribute of an entity set is represented by an oval, with a line to the rectangle representing its entity set.302010 Xiaojie Yuan Entity set Beers has two attributes, name and manf (manufacturer). Each Beer entity has values for

27、these two attributes, e.g. (Bud, Anheuser-Busch)Beersnamemanf312010 Xiaojie Yuan A relationship connects two or more entity sets. It is represented by a diamond, with lines to each of the entity sets involved.322010 Xiaojie YuanDrinkersaddrnameBeersmanfnameBarsnamelicenseaddrSellsBars sell somebeers

28、.LikesDrinkers likesome beers.FrequentsDrinkers frequentsome bars.332010 Xiaojie Yuan The current “value” of an entity set is the set of entities that belong to it. Example: the set of all bars in our database. The “value” of a relationship is a set of lists of currently related entities, one from e

29、ach of the related entity sets.342010 Xiaojie Yuan For the relationship Sells, we might have a relationship set like:BarBeerJoes BarBudJoes BarMillerSues BarBudSues BarPetes AleSues BarBud Lite352010 Xiaojie Yuan A mathematical definition: if A, B are sets, then a relation R is a subset of A x B A=1

30、,2,3, B=a,b,c,d, R = (1,a), (1,c), (3,b)makes is a subset of Product x Company:123abcdA=B=makesCompanyProduct362010 Xiaojie YuanaddressnamessnPersonbuysmakesemploysCompanyProductnamecategorystockpricenameprice372010 Xiaojie Yuan one-one: many-one many-many123abcd123abcd123abcdmakesCompanyProduct3820

31、10 Xiaojie Yuan Show a many-one relationship by an arrow entering the “one” side. Show a one-one relationship by arrows entering both entity sets. In some situations, we can also assert “exactly one,” i.e., each entity of one set must be related to exactly one entity of the other set. To do so, we u

32、se a rounded arrow.392010 Xiaojie Yuan Consider Best-seller between Manfs and Beers. Some beers are not the best-seller of any manufacturer, so a rounded arrow to Manfs would be inappropriate. But a manufacturer has to have a best-seller (we assume they are beer manufacturers).ManfsBeersBest-seller4

33、02010 Xiaojie YuanHow do we model a purchase relationship between buyers, products and stores?PurchaseProductPersonStoreCan still model as a mathematical set (how ?)412010 Xiaojie YuanQ: what does the arrow mean ?A: if I know the store, person, invoice, I know the movie tooRentalVideoStorePersonMovi

34、eInvoice422010 Xiaojie YuanQ: what do these arrow mean ?A: store, person, invoice determines movie and store, invoice, movie determines personRentalVideoStorePersonMovieInvoice432010 Xiaojie YuanPurchaseWhat if we need an entity set twice in one relationship?ProductPersonStoresalespersonbuyerPerson4

35、42010 Xiaojie YuanPurchaseWhat if we need an entity set twice in one relationship?ProductPersonStoresalespersonbuyer452010 Xiaojie YuanPurchaseProductPersonStoredate462010 Xiaojie YuanPurchasePersonStoreProductStoreOfProductOfBuyerOfdate 472010 Xiaojie Yuan Modeled as a mathematical set Binary and m

36、ultiway relationships Converting a multiway one into many binary ones Constraints on the degree of the relationship many-one, one-one, many-many limitations of arrows Attributes of relationships not necessary, but useful482010 Xiaojie YuanProductnamecategorypriceisaisaEducational ProductSoftware Pro

37、ductAge Groupplatforms492010 Xiaojie Yuan Subclass = special case = fewer entities = more properties. Example: Ales are a kind of beer. Not every beer is an ale, but some are. Let us suppose that in addition to all the properties (attributes and relationships) of beers, ales also have the attribute

38、color.502010 Xiaojie Yuan Assume subclasses form a tree. I.e., no multiple inheritance. Isa triangles indicate the subclass relationship. Point to the superclass.512010 Xiaojie YuanBeersAlesisanamemanfcolor522010 Xiaojie Yuan In the object-oriented world, objects are in one class only. Subclasses in

39、herit properties from superclasses. In contrast, E/R entities have components in all subclasses to which they belong. Matters when we convert to relations.532010 Xiaojie YuanFinding constraints is part of the modeling process. Commonly used constraints: Keys: social security number uniquely identifi

40、es a person. Single-value constraints: a person can have only one father. Referential integrity constraints: if you work for a company, it must exist in the database. Domain constraints: peoples ages are between 0 and 150. General constraints: all others (at most 50 students enroll in a class)542010

41、 Xiaojie Yuan A key is a set of attributes for one entity set such that no two entities in this set agree on all the attributes of the key. It is allowed for two entities to agree on some, but not all, of the key attributes. We must designate a key for every entity set.552010 Xiaojie Yuan Underline

42、the key attribute(s). In an Isa hierarchy, only the root entity set has a key, and it must serve as the key for all entities in the hierarchy.BeersAlesisanamemanfcolor562010 Xiaojie YuanCoursesdeptnumberhoursroom Note that hours and room could also serve as a key, but we must select only one key.572

43、010 Xiaojie Yuan Every entity set must have a key why? A key can consist of more than one attribute There can be more than one key for an entity set one key will be designated as primary key Requirement for key in an isa hierarchy see book582010 Xiaojie Yuan Occasionally, entities of an entity set n

44、eed “help” to identify them uniquely. Entity set E is said to be weak if in order to identify entities of E uniquely, we need to follow one or more many-one relationships from E and include the key of the related entities from the connected entity sets.592010 Xiaojie Yuan name is almost a key for fo

45、otball players, but there might be two with the same name. number is certainly not a key, since players on two teams could have the same number. But number, together with the Team related to the player by Plays-on should be unique.602010 Xiaojie YuanPlayersTeamsPlays-onnamenamenumber Double diamond

46、for supporting many-one relationship. Double rectangle for the weak entity set.612010 Xiaojie Yuan A weak entity set has one or more many-one relationships to other (supporting) entity sets. Not every many-one relationship from a weak entity set need be supporting. The key for a weak entity set is i

47、ts own underlined attributes and the keys for the supporting entity sets. E.g., player-number and team-name is a key for Players in the previous example.622010 Xiaojie YuanofSectionofSection #Depts Key: (DeptCode)Courses Key: (Course #, DeptCode)Sections key: (Section #, Course #, DeptCode)Course #D

48、eptDeptCodeof inNameCourse632010 Xiaojie Yuan1. Be Faithful Design should reflect your (possibly vague) ideas of the data.2. Avoid redundancy.3. Limit the use of weak entity sets.4. Dont use an entity set when an attribute will do.642010 Xiaojie YuanPurchaseProductPersonPresidentPersonCountryTeaches

49、CourseInstructor652010 Xiaojie Yuan Redundancy occurs when we say the same thing in two different ways. Redundancy wastes space and (more importantly) encourages inconsistency. The two instances of the same fact may become inconsistent if we change one and forget to change the other, related version

50、.662010 Xiaojie YuanBeersManfsManfBynameThis design gives the address of each manufacturer exactly addr672010 Xiaojie YuanBeersManfsManfBynameThis design states the manufacturer of a beer twice: as an attribute and as a related manfaddr682010 Xiaojie YuanBeersnameThis design repe

51、ats the manufacturers address once for each beer; loses the address if there are temporarily no beers for a manufacturer.manfmanfAddr692010 Xiaojie YuanMoviesDirectorsDirected-byYearDirs NameProfileNameGoodMoviesYearDirs NameDirs ProfileNameBad702010 Xiaojie YuanAn entity set should satisfy at least

52、 one of the following conditions:It is more than the name of something; it has at least one nonkey attribute.orIt is the “many” in a many-one or many-many relationship.712010 Xiaojie YuanBeersManfsManfBynameManfs deserves to be an entity set because of the nonkey attribute addr.Beers deserves to be

53、an entity set because it is the “many” of the many-one relationship ManfBaddr722010 Xiaojie YuanBeersnameThere is no need to make the manufacturer an entity set, because we record nothing about manufacturers besides their name.manf732010 Xiaojie YuanBeersManfsManfBynameSince the manufacturer is nothing but a name, and is not at the “many” end of any relationship, it should not be an entity 742010 Xiaojie Yuan Beginning database d

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論