




已閱讀5頁,還剩1頁未讀, 繼續(xù)免費閱讀
版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進行舉報或認領(lǐng)
文檔簡介
基于.NET的大型Web站點StackOverflow架構(gòu)分析Stack Overflow網(wǎng)址:/當前訪問量:每月9500PV(每天300多萬PV)當前Alexa排名:149所用.NET技術(shù):C#、Visual Studio 2010 Team Suite、ASP.NET 4、ASP.NET MVC 3、Razor、LINQ to SQL+raw SQL下面是英文原文:A lot has happened since my first article on theStack Overflow Architecture(2009-8-5). Contrary to the theme of that last article, which lavished attention on Stack Overflows dedication to a scale-up strategy, Stack Overflow has both grown up and out in the last few years.自從2009年8月發(fā)布了第一篇關(guān)于“Stack Overflow 架構(gòu)”方面的文章,Stack Overflow已經(jīng)發(fā)生了很大的變化。那篇文章更多關(guān)注的是Stack Overflow如何解決網(wǎng)站的擴展性(scale-up)問題,而經(jīng)過幾年的發(fā)展,Stack Overflow已經(jīng)長大成人,成長為了大型網(wǎng)站。Stack Overflowhas grown up by more then doubling in size to over 16 million users and multiplying its number of page views nearly 6 times to 95 million page views a month. 現(xiàn)在與2009年相比,Stack Overflow每月獨立訪問用戶翻了一倍,超過1600萬;每月PV翻了近6倍,達到9500萬。Stack Overflow has grown out by expanding into theStack Exchange Network, which includes Stack Overflow, Server Fault, and Super User for a grand total of 43 different sites. Thats a lot of fruitful multiplying going on.Stack Overflow新增了很多站點,比如Server Fault, Super User等,共有43個不同站點組成了Stack Exchange Network,可謂碩果累累,迅猛增長。What hasnt changed is Stack Overflows openness about what they are doing. And thats what prompted this update. A recent series of posts talks a lot about how theyve been handling their growth:Stack Exchanges Architecture in Bullet Points,Stack Overflows New York Data Center,Designing For Scalability of Management and Fault Tolerance,Stack Overflow Search Now 81% Less,Stack Overflow Network Configuration,Does StackOverflow use caching and if so, how?,Which tools and technologies build the Stack Exchange Network?.Stack Overflow的變化翻天覆地,而不變的是他們開放的心態(tài),所以才有了這篇架構(gòu)分享的文章。最近,他們寫了一系列文章分享他們?nèi)绾螒?yīng)對這樣的快速增長。Some of the more obvious differences across time are:穿越時空,我們來看看有哪些明顯的變化? Just More. More users, more page views, more datacenters, more sites, more developers, more operating systems, more databases, more machines. Just a lotmore of more.更多:更多的用戶,更多的PV,更多的數(shù)據(jù)中心,更多的站點,更多的開發(fā)者,更多的操作系統(tǒng),更多的數(shù)據(jù)庫,更多的服務(wù)器. Linux. Stack Overflow was known for their Windows stack, now they are using a lot more Linux machines for HAProxy, Redis,Bacula,Nagios, logs, and routers. All support functions seem to be handled by Linux, which has required the development ofparallel release processes.Linux:Stack Overflow因使用Windows系統(tǒng)而著稱,現(xiàn)在他們使用越來越多的Linux服務(wù)器,比如HAProxy(負載均衡), Redis(NoSQL數(shù)據(jù)庫), Bacula(數(shù)據(jù)備份系統(tǒng)), Nagios(遠程監(jiān)控軟件), 日志, 路由器都運行于Linux系統(tǒng),幾乎所有需要并行處理的功能都是由Linux處理(這句話的翻譯可能不準確)。 Fault Tolerance. Stack Overflow is nowbeing served by two differentswitches on two different internet connections, theyve added redundant machines, and some functions have moved to a second datacenter.容錯:Stack Overflow使用了兩條不同的互聯(lián)網(wǎng)線路,增加了更多的冗余服務(wù)器,將一些網(wǎng)站服務(wù)運行于第二個數(shù)據(jù)中心。 NoSQL. Redis is now used as acaching layerfor the entire network. There wasnt a separate caching tier before so this a big change, as is using a NoSQL database on Linux.NoSQL:Redis作為整個網(wǎng)站的緩存層。這是一個巨大的改變,以前并沒有將緩存作為一個獨立的層分離出來。Redis運行于Linux。Unfortunately, I couldnt find any coverage on some of the open questions I had last time, like how they were going to deal with multi-tenancy across so many diffrent properties, but theres still plenty to learn from. Heres a roll up a few different sources:遺憾的是,一些我關(guān)注的問題并沒有從中找到答案,比如面對這么多不同的系統(tǒng),如何解決多租戶的問題(Multi-tenancy 是一種軟件體系結(jié)構(gòu),在這種體系結(jié)構(gòu)中軟件運行在 software as a service 服務(wù)商的服務(wù)器上,服務(wù)于多個客戶組織即 tenant)。但是,從中我們依然可以學到很多。下面是收集的一些數(shù)據(jù)列表:The Stats 95 Million Page Views a Month 800 HTTP requests a second 180 DNS requests a second 55 Megabits per second 16 Million Users -Traffic to Stack Overflow grew 131% in 2010, to 16.6 million global monthly uniques.Data Centers 1 Rack with Peak Internet in OR (Hosts our chat and Data Explorer) 2 Racks with Peer 1 in NY (Hosts the rest of the Stack Exchange Network)Hardware 10 Dell R610 IIS web servers (3 dedicated to Stack Overflow):o 1x Intel Xeon Processor E5640 2.66 GHz Quad Core with 8 threadso 16 GB RAMo Windows Server 2008 R2 2 Dell R710 database servers:o 2x Intel Xeon Processor X5680 3.33 GHzo 64 GB RAMo 8 spindleso SQL Server 2008 R2 2 Dell R610 HAProxy servers:o 1x Intel Xeon Processor E5640 2.66 GHzo 4 GB RAMo Ubuntu Server 2 Dell R610 Redis servers:o 2x Intel Xeon Processor E5640 2.66 GHzo 16 GB RAMo CentOS 1 Dell R610 Linux backup server running Bacula:o 1x Intel Xeon Processor E5640 2.66 GHzo 32 GB RAM 1 Dell R610 Linux management server for Nagios and logs:o 1x Intel Xeon Processor E5640 2.66 GHzo 32 GB RAM 2 Dell R610 VMWare ESXi domain controllers:o 1x Intel Xeon Processor E5640 2.66 GHzo 16 GB RAM 2 Linux routers 5 Dell Power Connect switchesDev Tools C#: Language Visual Studio 2010 Team Suite:IDE Microsoft ASP.NET (version 4.0):Framework ASP.NET MVC 3:Web Framework Razor:View Engine jQuery 1.4.2:Browser Framework: LINQ to SQL, some raw SQL:Data Access Layer MercurialandKiln:Source Control(分布式版本控制系統(tǒng)) Beyond Compare 3:Compare Tool(文件比較工具)Software and Technologies Used Stack Overflow uses aWISCstack viaBizSpark Windows Server 2008 R2 x64: Operating System SQL Server 2008 R2runningMicrosoft Windows Server 2008 Enterprise Edition x64: Database Ubuntu Server CentOS IIS 7.0: Web Server HAProxy: forload balancing(高性能的負載TCP/HTTP均衡器) Redis: used as the distributed caching layer.(作為分布式緩存層的NoSQL數(shù)據(jù)庫) CruiseControl.NET: for builds andautomated deployment(.NET平臺的持續(xù)集成工具) Lucene.NET: for search Bacula: for backups(開源的數(shù)據(jù)備份系統(tǒng)) Nagios: (withn2rrdand drraw plugins) for monitoring(監(jiān)視系統(tǒng)運行狀態(tài)和網(wǎng)絡(luò)信息的遠程監(jiān)控軟件) Splunk:for logs(日志分析工具) SQL Monitor:from Red Gate - for SQL Server monitoring Bind: for DNS Rovio:a little robot (a real robot) allowing remote developers to visit the office “virtually.” Pingdom:an external monitor and alert service.(網(wǎng)站監(jiān)控服務(wù)及網(wǎng)站速度測試工具)External BitsCode that is not included as part of the development tools: reCAPTCHA(用于驗證碼驗證,已被Google收購) DotNetOpenId(.NET 平臺上的 OpenID 實現(xiàn)方案) WMD - Now developed as open source. Seegithub network graph(輕量級所見即所得編輯器) Prettify(代碼高亮顯示) Google Analytics Cruise Control .NET HAProxy(負載均衡) Cacti(網(wǎng)絡(luò)流量監(jiān)測圖形分析工具) MarkdownSharp(Markdown文本處理器的C#實現(xiàn)) Flot(基于JQuery的純JavaScript實現(xiàn)的繪圖庫) Nginx(反向代理服務(wù)器) Kiln(分布式版本控制系統(tǒng)) CDN: none, all static content is served off , which is a fast, cookieless domain intended for static content delivered to the Stack Exchange family of websites.(沒有使用CDN,用一個專門的域名傳遞所有的靜態(tài)內(nèi)容)Developers and System Administrators 14 Developers 2 System AdministratorsContent License:Creative Commons Attribution-Share Alike 2.5 Generic Standards:OpenSearch, Atom Host:PEAK InternetMore Architecture and Lessons Learned HAProxy is used instead of Windows NLB because HAProxy ischeap, easy, free, works great as a 512MB VM “device” on a network via Hyper-V. It also works in front of the boxes so its completely transparent to them, and easier to troubleshoot as a different networking layer instead of being intermixed with all your windows configuration.用HAProxy取代了Windows NLB,HAProxy成本更低,更易于使用,通過Hyper-V可以很好地運行于512M內(nèi)存的虛擬機。它工作于服務(wù)器群的最前端,對所有的服務(wù)器都透明。相比于原來混雜在一起的Windows配置,它運行于一個獨立的網(wǎng)絡(luò)層,更易于維護與故障處理。 A CDN is not used because even “cheap” CDNs like Amazon one are very expensive relative to the bandwidth they get bundled into their existing hosts plan.The least they could pay is $1k/month based on Amazons CDN rates and their bandwidth usage.沒有使用CDN,即使使用像Amazon那樣與主機空間捆綁在一起的看起來“便宜”的CDN,實際的費用也是很高的,至少需要1000美元/月。 Backup is to disk for fast retrieval and to tape for historical archiving.備份方案有兩種,一種用于快速恢復(fù)的磁盤備份,一種用于歷史數(shù)據(jù)存檔的磁帶備份。 Full Text Search in SQL Server is very badly integrated, buggy, deeply incompetent, so they went to Lucene.SQL Server的全文索引是非常差勁的,所以他們用Lucene.NET。 Mostly interested in peak HTTP request figures as this is what they need to make sure they can handle.讓人很感興趣的是他們?nèi)绾翁幚碓L問高峰時的HTTP請求。 All properties now run on the same Stack Exchange platform. That means Stack Overflow, Super User, Server Fault, Meta, WebApps, and Meta Web Apps are all running on the same software.所有這些都運行于Stack Exchange平臺,那意味著Stack Overflow, Super User, Server Fault, Meta, WebApps, 和Meta Web Apps都運行于同一個軟件。 There are separate StackExchange sites because people have different sets of expertise that shouldnt cross over to different topic sites.You can be the greatest chef in the world, but that doesnt qualify you for fixing a server.也有一些獨立運行的StackExchange站點,服務(wù)于那些具有多個專業(yè)技能,又不想為了不同的話題在多個站點之間奔波的人。如果你能成為最偉大的主廚,不能因為給你安排了服務(wù)員的工作,你就安于現(xiàn)狀。 They aggressively cache everything.他們瘋狂地使用緩存。 All pages accessed by (and subsequently served to) annonymous users are cached viaOutput Caching.未登錄用戶訪問的所有頁面都通過Output Caching進行緩存。 Each site has 3 distinct caches: local, site, global.每個站點使用三種類型的緩存:本地、站點、全局。 local cache: can only be accessed from 1 server/site pair本地緩存:只能被當前站點的當前服務(wù)器訪問。o To limit network latency they use a local L1 cache, basically HttpRuntime.Cache, of recently set/read values on a server. This would reduce the cache lookup overhead to 0 bytes on the network.為了減少網(wǎng)絡(luò)延時,通常使用HttpRuntime.Cache作為一級緩存,這樣可以避免通過網(wǎng)絡(luò)在緩存服務(wù)器上查找的開銷。o Contains things like user sessions, and pending view count updates.緩存內(nèi)容包含用戶會話,視圖數(shù)的更新。o This resides purely in memory, no network or DB access.直接緩存在內(nèi)存中。 site cache: can be accessed by any instance (on any server) of a single site站點級緩存:能被同一個站點的所有服務(wù)器訪問。o Most cached values go here, things like hot question id lists and user acceptance rates are good examples大部分的緩存都在這一級,比如熱點問題ID列表,用戶支持率。o This resides in Redis (in a distinct DB, purely for easier debugging)緩存數(shù)據(jù)存儲在Redis數(shù)據(jù)庫中。o Redis is so fast that the slowest part of a cache lookup is the time spent reading and writing bytes to the network.Redis速度很快,緩存查找的開銷主要在網(wǎng)絡(luò)傳輸上。o Values are compressed before sending them to Redis. They have plenty of CPU and most of their data are strings so they get a great compression ratio.緩存數(shù)據(jù)發(fā)送至Redis之前會被壓縮。為什么要壓縮呢?因為CPU資源綽綽有余,而且大部分緩存數(shù)據(jù)是字符串,壓縮率會很高,何樂而不為呢。o The CPU usage on their Redis machines is 0%.Redis服務(wù)器上的CPU使用率是0%。 global cache: which is shared amongst all sites and servers全局緩存:被所有站點和服務(wù)器共享。o Inboxes, API usage quotas, and a few other truly global things live here緩存內(nèi)容包含收件箱,API使用限額,一些全局設(shè)置等。o This resides in Redis (in DB 0, likewise for easier debugging)緩存于Redis數(shù)據(jù)庫中。 Most items in the cache expire after a timeout period (a few minutes usually) and are never explicitly removed. When a specific cache invalidation is required they useRedis messagingto publish removal notices to the L1 caches.大部分緩存項目在超過緩存時間之后會自動過期(通常幾分鐘),不需要進行刪除操作。當需要讓一個特定的緩存失效,會通過Redis消息系統(tǒng)給一級緩存發(fā)送刪除通知。 Joel Spolsky is not aMicrosoft Loyalist, he doesnt make the technical decisions for Stack Overflow, and considersMicrosoft licensing a rounding error. Consider yourself correctedHacker News commentor.Joel Spolsky(Stack Overflow的創(chuàng)始人)并不是微軟的忠誠分子,他不負責技術(shù)決策,使用微軟軟件考慮的也只是性價比。Hacker News上一些評論者的說法需要糾正。 For their IO systemthey selecteda RAID 10 array ofIntel X25 solid state drives. The RAID array eased any concerns about reliability and the SSD drives performed really well in comparision to FusionIO at a much cheaper price.對于IO系統(tǒng),他們選擇的是Intel X25 solid state drives(SSD硬盤)的RAID 10磁盤陣列,
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
- 6. 下載文件中如有侵權(quán)或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 車輛抵押借款合同規(guī)范模板
- 司機安全生產(chǎn)責任書
- 交通運輸集團招聘
- 電工崗位安全生產(chǎn)責任
- 美術(shù)介紹畫家課件
- 企業(yè)安全生產(chǎn)三年行動實施方案
- 崗位安全責任
- 羅中立父親的課件
- 2025至2030中國壓瘡墊行業(yè)項目調(diào)研及市場前景預(yù)測評估報告
- 2025至2030中國史密斯機器行業(yè)產(chǎn)業(yè)運行態(tài)勢及投資規(guī)劃深度研究報告
- JJF 1255-2010 厚度表校準規(guī)范-(高清現(xiàn)行)
- 2022年混凝土攪拌站建設(shè)項目可行性研究報告
- 《覺醒年代》朗誦稿
- 2022年社會學概論考試重點廣東海洋
- 路基工程質(zhì)量通病及防治措施
- 福建省中小學教師職務(wù)考評登記表
- 咖啡文化PPT課件:咖啡配方及制作方法步驟
- 北京市中級專業(yè)技術(shù)資格評審申報表
- 工廠供電課程設(shè)計1
- 鼠害蟲害防治管理制度
- PLM_項目建議書_PTC
評論
0/150
提交評論