Dynamically Managing Large-Scale Spark Clusters on AWS
Technology innovation, transforming the future

- Founded in 2013, with headquarters in Mountain View, California, USA
- Strong financial backing of over $50M from Sequoia Capital, Genesis Capital, and GSR Ventures

Technology
- 7 patents pending
- Elite tech team of PhDs from top universities specializing in machine learning, big data, and security

Clients
- Partnered with global clients with an online or mobile presence in gaming, social, commerce, and finance
- 4B+ protected accounts globally
- 800B+ processed events to date
- 200M+ bad accounts detected to date

Detection approaches: Rules Engines, Supervised Machine Learning, Unsupervised Machine Learning, Reputation Lists

Reputation Lists: how it works
- Search reputation database
- Matches against lists
- Make decision
Diagram labels: Sender, Receiver, Reputation Database, "Is sender IP listed?" (YES / NO), Make decision
Examples: email, IP address, devices, credit card numbers, phone numbers

Rules Engines: how it works
- Check against rule lists
- Criteria with weights
- Combination rules with logic
Example rule: IF (user email = free email service) AND (comment character count 150 per sec), flag user account as spammer and mute commenting

| Rule | Weight |
|---|---|
| IP address is anonymous proxy | +800 |
| Account age 180 days | -500 |
| Email is private corporate domain name | -350 |
| Mismatch billing country and IP country | +450 |
| Phone number found on 3 accounts | +250 |

Supervised Machine Learning: what is it?
An algorithm that learns to perform a task from known examples (training data). An important requirement of using supervised learning is having the data to train the model.

Unsupervised Machine Learning: what is it?
An algorithm that learns to identify linkages and patterns in the data without prior knowledge of what to look for. Unsupervised machine learning does not require labeled training data.

Comparison Between Approaches (axes: time, effectiveness)
- Reputation List and Device Fingerprint: limited coverage and precision; attackers can use emulators to bypass device fingerprinting
- Rules Engine: rules need to be maintained and adapted constantly; poor against adaptive attacks
- Supervised Machine Learning: needs a large amount of labeled data; has difficulty detecting unknown attacks
- Unsupervised Machine Learning: auto-label generation; detection of unknown attacks; auto-rules generation

DataVisor Unsupervised Machine Learning

UML Engine Process Flow
- Step 1, dynamic feature extraction: generating a large set of features to describe each input account
- Step 2, unsupervised attack ring detection: performing correlation analysis across all accounts and identifying attack rings
- Step 3, result categorization & ranking: assigning a confidence score and categorizing each attack ring

Inputs
- Profile info
- Behaviors & activities
- Origins & digital fingerprints
- Contents & metadata
- Relationships among accounts
- Security event logs + labels
- Custom scoring + metadata
Verticals: social, gaming, e-commerce, finance

Dynamic event extraction
Data infrastructure (Hadoop, Spark)
Correlation engine of billions of users
Attributes: temporal, event sequence, velocity + frequency, spatial / geo
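
The RULE/WEIGHT table in the rules-engine slide above reads naturally as an additive score: each rule that matches contributes its weight, and the total drives the decision. The following is a minimal sketch of that reading, not DataVisor's implementation. The weights are copied from the table; the short rule keys, the `FLAG_THRESHOLD` value, and the function names are hypothetical, and the caller is assumed to have already decided which rules matched.

```python
# Weights copied from the RULE/WEIGHT table; the short keys are hypothetical names.
RULE_WEIGHTS = {
    "ip_is_anonymous_proxy": 800,
    "account_age_rule": -500,
    "email_is_private_corporate_domain": -350,
    "billing_country_ip_country_mismatch": 450,
    "phone_number_shared_across_accounts": 250,
}

# Hypothetical cut-off: the source does not state a decision threshold.
FLAG_THRESHOLD = 1000


def risk_score(fired_rules):
    """Sum the weights of the rules that matched for one account."""
    fired = set(fired_rules)
    unknown = fired - set(RULE_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown rules: {sorted(unknown)}")
    return sum(RULE_WEIGHTS[name] for name in fired)


def should_flag(fired_rules, threshold=FLAG_THRESHOLD):
    """Flag the account when its total score reaches the threshold."""
    return risk_score(fired_rules) >= threshold


if __name__ == "__main__":
    fired = ["ip_is_anonymous_proxy", "billing_country_ip_country_mismatch"]
    print(risk_score(fired), should_flag(fired))
```

Negative weights let trusted signals offset risky ones, which is also why such a table needs constant tuning as attackers adapt.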
Further attributes: domain, graph attributes

Diagram labels: raw input data, data processing layer, feature extraction, cross-event/time-series feature engineering

Our feature engineering was designed to operate across a very high dimensional feature space and be comprehensive in extracting fraud features.

DataVisor's Unsupervised Machine Learning
- User profile data
- Behavior data
- Derived features (frequency, velocity, correlation)
- A long vector describes the comprehensive profile and behavior of each user (user0001, user0002, user0003) and is updated dynamically
- Vector fields: Profile, Behavior, Freq., Corr., Velocity, Device, Seq., Geo

Global Intelligence Network (GIN)
All application-level events from multiple online service verticals (financial, e-commerce, social, mobile):
- 410 million+ IP addresses
- 3.6 million+ email domains
- 160,000+ device types
- 300,000+ OS versions
- 5.3 million+ user agent strings
- 700,000+ phone prefixes
From 4 billion+ global users, 800 billion+ events and growing.

Example: the Xiaomi Mi 5 is a phone that was released in 2016. 50% of its footprint is in Russia and China. When it appears in other regions, its fraud rate can be up to 51%.

Intel from GIN (IP, email domain, phone prefix, device info) is added to each user's vector (Profile, Behavior, Freq., Corr., Velocity, Device, Seq., Geo).

Diagram labels: user level (user0001, user0002, user0003, user0004, user0005, user9553), Cluster001

Dimension reduction based on:
- Statistical analysis
- Domain knowledge
- Feature correlation

Dynamic clustering based on combinations of:
- Feature dimensions (f)
- Feature weights (w)
- Linkage probability function (F)
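
The clustering slide names three ingredients: feature dimensions (f), feature weights (w), and a linkage probability function (F). The deck does not define any of them, so the sketch below is only a toy illustration of how such pieces could fit together. The feature names, the weights, the `threshold`, and the `linkage` formula (weighted share of features with equal values) are all assumptions, and union-find is a stand-in for whatever grouping method the real engine uses.

```python
from itertools import combinations

# Hypothetical feature weights (w); the source gives no values.
WEIGHTS = {"ip": 0.5, "device": 0.3, "email_domain": 0.2}


def linkage(a, b, weights=WEIGHTS):
    """Toy linkage function F: weighted share of features with equal values."""
    total = sum(weights.values())
    shared = sum(
        w for f, w in weights.items()
        if a.get(f) is not None and a.get(f) == b.get(f)
    )
    return shared / total


def cluster(users, threshold=0.7, weights=WEIGHTS):
    """Group users whose pairwise linkage reaches the threshold (union-find)."""
    parent = {uid: uid for uid in users}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in combinations(users, 2):
        if linkage(users[u], users[v], weights) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for uid in users:
        groups.setdefault(find(uid), []).append(uid)
    # Keep only groups with more than one member as candidate rings.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    users = {
        "user0001": {"ip": "192.0.2.1", "device": "Redmi 3S", "email_domain": "a.example"},
        "user0002": {"ip": "192.0.2.1", "device": "Redmi 3S", "email_domain": "b.example"},
        "user0003": {"ip": "198.51.100.7", "device": "Redmi 3S", "email_domain": "b.example"},
    }
    print(cluster(users))
```

The all-pairs loop is quadratic in the number of users, so at the scale described in this deck a real system would need to restrict comparisons to users that share at least one feature value and distribute the work, for example on Spark.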
Diagram labels: cluster level (Cluster002, Cluster551)

Based on key clustering features, the DataVisor engine outputs a reason code and corresponding categories.

Attack categorization and reason code
- Automated account opening fraud
- Mass account takeover password testing
- Manual transaction fraud

- Classification of fraud campaigns
- Analysis of fraud techniques
- Monitoring of fraud trends

- Stop fake account creation: prevent mass registration of fake account armies
- Prevent transaction fraud: reduce e-commerce and financial fraud 30%-50% more than traditional solutions
- Identify account takeovers: detect compromised users before damage to your customers or brand
- Block fake reviews & likes: maintain trust in your platform by reducing fake comments & voting
- Filter spam: prevent spammers from posting illicit or annoying content
- Discover fake app installs: save millions of dollars per year by flagging fake mobile app installs

The early detection view shows how DataVisor catches crime rings before damage is done.

Fake App Installs and Game Play Activity
- 15K+ installs all coming from the same device type: Redmi 3S running Android 5.1.1
- Fake retention activity within 7 days to 2 weeks following install
- Multiple app starts every few seconds or minutes
- Accounts go completely inactive after the faked app_start retention events
Chart annotations: all installs from Xiaomi Redmi 3S; followed by fake app starts to mimic retention; one major "wave" of fraudulent installs; fraud score 95

Derive granular user behavior information
- New user ratio
- Fraudulent user ratio
- First/last seen time
- Proxy/data center IP
- Geolocation

Deep Learning
Global Intelligence Network: financial, social, e-commerce, mobile

Pro:
- Unified engine (end-to-end solution)
- Simple API
- Speed
Con:
- Deep learning integration under development

Pro:
- Production ready (if done right)
- Extensive ML API for various tasks
Con:
- Limited data pre-processing support
- Not an end-to-end solution

Diagram labels: UDF, DataFrame, derived feature, origin feature

Pre-processing
- Load data into a DataFrame
- Each user-defined function (UDF) is built from a feature function
- Uniform API

Serving
- Every data point is pre-processed and then fed to the DL model for inference
- The same feature function is used to process data at serving time
Diagram labels: feature functions, serving data, modeling, inference

- Process 200+ GB/day/client
- 5000+ average peak QPS across clients
- Batch process runs multiple times per day
- Dynamically launch and destroy Spark clusters utilizing Spot Fleet
- Results are precomputed and written to each data store
Diagram labels: Original Data, Pipeline Module 1, Pipeline Module 2, Pipeline Module N, Pipeline Module N+1

- Moved to a 3 year convertible instance model
- Real-time cost tracking (Cloudability)
- Spot Fleet

SparkGen: Internal Spark Cluster Management Software
Diagram labels: Prod Job Scheduler, Spark Resource Manager, Prod Jobs, Dev Jobs, Developers
- Track pipeline dependency and run all jobs on Spot instances
Tip: Spot instances are 7 times cheaper than on-demand and 3 times cheaper than reserved instances.

- Single static cluster: one-time launch, low utilization, idle time
- Multiple static clusters: one-time launch, moderate utilization, idle time, limited concurrency
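
Tracking pipeline dependency, as mentioned for SparkGen above, amounts to ordering modules so that each one runs only after its inputs are ready. SparkGen's actual scheduler is not described in the deck, so what follows is only a small sketch of the general idea using Python's standard `graphlib` module. The module names and the dependency graph are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each module maps to the modules that must finish first.
PIPELINE = {
    "module_1": set(),
    "module_2": {"module_1"},
    "module_3": {"module_1"},
    "module_4": {"module_2", "module_3"},
}


def run_order(dependencies):
    """Return one valid execution order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


def run_waves(dependencies):
    """Group modules into waves whose members could run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished
    return waves


if __name__ == "__main__":
    print(run_order(PIPELINE))
    print(run_waves(PIPELINE))
```

Grouping into waves shows where concurrency is available: modules in the same wave have no dependency on each other, so a scheduler could launch a separate short-lived cluster for each and tear it down when the module finishes.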