




A Brief History of Statistics and Data Science
Yuan Wei, 2016.12.10, Zhongnan University of Economics and Law

Francis Bacon (England): "Histories make men wise."
Schlözer (Germany): "Statistics is static history, while history is dynamic statistics."

I. Early Beginnings
II. Mathematical Foundations
III. Modern Era

I. Early Beginnings (450 BC to the 15th century)

450 BC (use of the mean): Hippias of Elis uses the average length of a king's reign (the mean) to work out the date of the first Olympic Games, some 300 years before his time. Hippias, born at Elis in the north-western Peloponnese and a contemporary of Plato, is credited here as history's first historian of mathematics; his calculation placed the first Games in 776 BC.

431 BC (use of the mode): Attackers besieging Plataea in the Peloponnesian War calculate the height of the wall by counting the number of bricks. The count was repeated several times by different soldiers, and the most frequent value (the mode) was taken to be the most likely. Multiplying it by the height of one brick allowed them to calculate the length of the ladders needed to scale the walls.

400 BC (sampling inference): In the Indian epic the Mahabharata, King Rtuparna estimates the number of fruit and leaves (2095 fruit and 50,000,000 leaves) on two great branches of a vibhitaka tree by counting the number on a single twig, then multiplying by the number of twigs. The estimate is found to be very close to the actual number. This is the first recorded example of sampling, "but this knowledge is kept secret", says the account.

AD 2 (census): A Chinese census under the Han dynasty finds 57.67 million people in 12.36 million households. It is the first census from which data survives, and scholars still consider it to have been accurate.

AD 7 (census): A census by Quirinius, governor of the Roman province of Judea, is mentioned in Luke's Gospel as causing Joseph and Mary to travel to Bethlehem, the home of Joseph's ancestral house of David, to be taxed.

840 (frequency analysis): The Islamic mathematician Al-Kindi uses frequency analysis to break secret codes: the most common symbols in a coded message will stand for the most common letters. Al-Kindi also introduces Arabic numerals to Europe.

10th century (graphs): The earliest known graph, in a commentary on a book by Cicero, shows the movements of the planets through the zodiac. It is apparently intended for use in monastery schools.

1086 (official statistics): Domesday Book, a survey for William the Conqueror of the farms, villages and livestock in his new kingdom, marks the start of official statistics in England. England then had about 1.5 million people, 90% of them peasants.

1150 (random sampling): The Trial of the Pyx, an annual test of the purity of coins from the Royal Mint, begins. Coins are drawn at random, in fixed proportions to the number minted. It continues to this day.

1188 (population census): Gerald of Wales completes the first population census of Wales.

1303 (binomial coefficients): A Chinese diagram entitled "The Old Method Chart of the Seven Multiplying Squares" shows the binomial coefficients up to the eighth power. These numbers are fundamental to the mathematics of probability and appeared in the west, centuries later, as Pascal's triangle. In China the arrangement is known as Yang Hui's triangle (1261), and Jia Xian's version is earlier still.

1346 (population and trade statistics): Giovanni Villani, a Florentine historian of the day, gives statistical information on the population and trade of Florence in his Nuova Cronica.

II. Mathematical Foundations (16th century to the end of the 19th century)

1560 (beginnings of probability): Gerolamo Cardano, the Italian Renaissance scholar, calculates the probabilities of different dice throws for gamblers.

1570 (mean and error): The Danish astronomer Tycho Brahe uses the arithmetic mean to reduce errors in his estimates of the locations of stars and planets.

1644 (error graph): Michael van Langren draws the first known graph of statistical data that shows the size of possible errors. It plots different estimates of the distance between Toledo and Rome.

1654 (mathematical foundations of probability): Pascal and Fermat correspond about dividing stakes in gambling games and together create the mathematical theory of probability.

1657 (first book on probability): Huygens's On Reasoning in Games of Chance is the first book on probability theory. He also invented the pendulum clock.

1663 (population statistics): John Graunt uses parish records, such as baptisms, to estimate the population of London, and gives the first sex ratio for newborns, 52:48.

1693 (first mortality table): Edmund Halley prepares the first mortality tables statistically relating death rates to age, the foundation of life insurance. He also drew a stylised map of the path of a solar eclipse over England, one of the first data visualisation maps.

1713 (law of large numbers): Jacob Bernoulli's Ars conjectandi derives the law of large numbers: the more often you repeat an experiment, the more accurately you can predict the result.

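The law of large numbers in the 1713 entry can be illustrated with a small simulation. The sketch below is not part of the original slides; the function name `running_frequency`, the fair coin and the fixed seed are all illustrative assumptions.

```python
import random

def running_frequency(n_tosses: int, seed: int = 1713) -> float:
    """Share of heads in n_tosses simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so that every run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The observed share of heads drifts towards 0.5 as the number of tosses grows.
for n in (10, 100, 10_000, 1_000_000):
    print(n, running_frequency(n))
```

With only a few tosses the share of heads can be far from one half; with many tosses it settles close to it, which is the behaviour Bernoulli's result describes.
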
1728 (lottery statistics): Voltaire and his mathematician friend de la Condamine spot that a Paris bond lottery is offering more in prize money than the total cost of the tickets; they corner the market and win themselves a fortune.

1749 (the German word for "statistics" appears): Gottfried Achenwall coins the word "statistics" (in German, Statistik); he means the information you need to run a nation state.

1757 (national lottery): Casanova becomes a trustee of, and may have had a hand in devising, the French national lottery.

1761 (Bayes' theorem): The Rev. Thomas Bayes proves Bayes' theorem, the cornerstone of conditional probability and of the testing of beliefs and hypotheses.

1786 (charts of economic data): William Playfair introduces graphs and bar charts to show economic data.

1789 (climate statistics): Gilbert White and other clergymen-naturalists keep records of temperatures, dates of first snowdrops and cuckoos, etc.; the data is later useful for the study of climate change.

1790 (first US census): The first US census, taken by men on horseback directed by Thomas Jefferson, counts 3.9 million Americans.

1791 (the English word "statistics" appears): First use of the word "statistics" in English, by Sir John Sinclair in his Statistical Account of Scotland.

1805 (least squares): Adrien-Marie Legendre introduces the method of least squares for fitting a curve to a given set of observations.

1808 (normal distribution): Gauss, with contributions from Laplace, derives the normal distribution, the bell-shaped curve fundamental to the study of variation and error.

1833 (British statistical society): The British Association for the Advancement of Science sets up a statistics section. Thomas Malthus, who analysed population growth, and Charles Babbage are members. It later becomes the Royal Statistical Society.

1835 (application to social science): The Belgian Adolphe Quetelet's Treatise on Man introduces social science statistics and the concept of the "average man": his height, body mass index, and earnings.

1839 (American Statistical Association founded): The American Statistical Association is formed. Alexander Graham Bell, Andrew Carnegie and President Martin Van Buren will become members.

1840 (medical statistics): William Farr sets up the official system for recording causes of death in England and Wales. This allows epidemics to be tracked and diseases compared, the start of medical statistics.

1849 (early computer program): Charles Babbage designs his "difference engine", embodying the ideas of data handling and the modern computer. Ada Lovelace, Lord Byron's daughter, writes what is regarded as the world's first computer program, for Babbage's Analytical Engine.

1854 (founding of epidemiology): John Snow's "cholera map" pins down the source of an outbreak as a water pump in Broad Street, London, beginning the modern study of epidemics.

1859 (use of the pie chart): Florence Nightingale uses statistics of Crimean War casualties to influence public opinion and the War Office. She shows casualties month by month on a circular chart she devises, the "Nightingale rose", forerunner of the pie chart. She is the first woman member of the Royal Statistical Society and the first overseas member of the American Statistical Association.

1868 (the power of statistical graphics): Minard's graphic diagram of Napoleon's March on Moscow shows on one diagram the distance covered, the number of men still alive at each kilometre of the march, and the temperatures they encountered on the way. With its major battles, shrinking army and falling temperatures on the retreat from Moscow, the chart is a concise portrait of a whole war.

1877 (regression and correlation): Francis Galton, Darwin's cousin, describes regression to the mean. In 1888 he introduces the concept of correlation. At a "Guess the weight of an ox" contest in Devon he describes the "wisdom of crowds", as reflected in the average of many uninformed guesses.

1886 (poverty map): The philanthropist Charles Booth begins his survey of the London poor, to produce his "poverty map of London". Areas were coloured black, for the poorest, through to yellow for the upper-middle class and wealthy.

1894 (standard deviation): Karl Pearson introduces the term "standard deviation". If errors are normally distributed, 68% of samples will lie within one standard deviation of the mean. Later he develops chi-squared tests for whether two variables are independent of each other.

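The 68% figure in the 1894 entry follows directly from the normal distribution function. The sketch below checks it with Python's standard library; the helper name `share_within` is an illustrative assumption, not something taken from the slides.

```python
from statistics import NormalDist

def share_within(k: float) -> float:
    """Probability that a normal variable falls within k standard deviations of its mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

For one standard deviation the share is about 0.6827, which rounds to the 68% quoted above; two and three standard deviations give about 0.9545 and 0.9973.
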
1898 (Poisson distribution): Von Bortkiewicz's data on deaths of soldiers in the Prussian army from horse kicks shows that apparently rare events follow a predictable pattern, the Poisson distribution.

III. Modern Era (early 20th century to the present)

1900 (financial mathematics): The French mathematician Louis Bachelier shows that fluctuations in stock market prices behave in the same way as the random Brownian motion of molecules, the start of financial mathematics.

1908 (small-sample t statistic): William Sealy Gosset, chief brewer for Guinness in Dublin, describes the t-test. It uses a small number of samples to ensure that every brew tastes equally good.

1911 (pioneer of machine data processing): Herman Hollerith, inventor of punch-card devices used to analyse data in US censuses, merges his company to form what will become IBM, pioneers of machines to handle business data and of early computers.

1916 (statistics in the First World War): During the First World War the car designer Frederick Lanchester develops statistical laws to predict the outcomes of aerial battles: if you double their size, land armies are only twice as strong, but air forces are four times as powerful.

1924 (quality control chart): Walter Shewhart of Bell Labs invents the control chart to aid industrial production and management.

1935 (Zipf's law): George Zipf finds that many phenomena, such as river lengths, city populations and the frequencies of English words, obey a power law so that the largest is twice the size of the second largest, three times the size of the third, and so on. This is known as Zipf's law, and it is often mentioned alongside the popular "80/20 rule".

1935 (modern design of experiments): R. A. Fisher revolutionises modern statistics. His Design of Experiments gives ways of deciding which results of scientific experiments are significant and which are not.

1937 (confidence intervals): Jerzy Neyman introduces confidence intervals in statistical testing. His work leads to modern scientific sampling.

1940-45 (first programmable computer): Alan Turing at Bletchley Park cracks the German wartime Enigma code, using advanced Bayesian statistics and Colossus, the first programmable electronic computer.

1944 (statistics in the Second World War): The German tank problem: the Allies desperately need to know how many Panther tanks they will face in France on D-Day. Statistical analysis of the serial numbers on gearboxes from captured tanks indicates how many of each are being produced. Statisticians predict 270 a month; reports from intelligence sources predict many fewer. The total turned out to be 276. Statistics had outperformed spies.

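The idea behind the German tank problem can be sketched with a standard textbook estimator. This is an illustration only, not a reconstruction of the wartime analysis: it assumes serial numbers run 1, 2, ..., N and that captured items are a random sample without replacement. The function name `estimate_total` and the serial numbers are hypothetical.

```python
from typing import Sequence

def estimate_total(serials: Sequence[int]) -> float:
    """Estimate the largest serial number issued from a sample of observed serials.

    Uses the textbook estimator m + m/k - 1, where m is the largest serial
    seen and k is the number of serials observed.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    return m + m / k - 1

# Hypothetical serial numbers, invented for illustration (not wartime data).
observed = [19, 40, 42, 60]
print(estimate_total(observed))  # 60 + 60/4 - 1 = 74.0
```

The estimate sits a little above the largest serial seen, by roughly the average gap between observed serials, since the true maximum is unlikely to be in a small sample.
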
1946 (axioms of probability): Cox's theorem derives the axioms of probability from simple logical assumptions.

1948 (information theory and the bit): Claude Shannon introduces information theory and the "bit", fundamental to the digital age.

1948-53 (research on sexual behaviour): The Kinsey Report gathers objective data on human sexual behaviour. A large-scale survey of 5000 men and, later, 5000 women, it causes outrage.

1950 (smoking and lung cancer): Richard Doll and Bradford Hill establish the link between cigarette smoking and lung cancer. Despite fierce opposition the result is conclusively proved, to huge public health benefit.

1950s (Taguchi's design of experiments): Genichi Taguchi's statistical methods to improve the quality of automobile and electronics components revolutionise Japanese industry, which far overtakes western European rivals.

1958 (survival analysis): The Kaplan–Meier estimator gives doctors a simple statistical way of judging which treatments work best. It has saved millions of lives.

1972 (proportional hazards model and partial likelihood): David Cox introduces the proportional hazards model and the concept of partial likelihood.

1977 (exploratory data analysis): John Tukey introduces the box-plot or box-and-whisker diagram, which shows the quartiles, medians and spread of data in a single image, along with the stem-and-leaf plot.

1979 (the bootstrap): Bradley Efron of Stanford University introduces bootstrapping, a simple way to estimate the sampling distribution of a statistic from almost any sample of data.

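The bootstrap in the 1979 entry amounts to resampling the observed data with replacement and recomputing the statistic each time. A minimal sketch for the sample mean follows; the data values, the function names and the percentile interval are illustrative assumptions rather than Efron's own presentation.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=2000, seed=0):
    """Means of n_resamples samples drawn from data with replacement."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

def percentile_interval(values, level=0.95):
    """Central interval covering roughly `level` of the resampled values."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * len(ordered))
    hi_idx = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]

data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # made-up observations
means = bootstrap_means(data)
print(percentile_interval(means))
```

The spread of the resampled means approximates how much the sample mean would vary from sample to sample, without assuming any particular distribution for the data.
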
1982 (graphic visualisation): Edward Tufte of Yale University self-publishes The Visual Display of Quantitative Information, setting new standards for graphic visualisation of data.

1988 (responding to climate change): Margaret Thatcher becomes the first world leader to call for action on climate change.

1993 (the R language): The statistical programming language "R" is released, now a standard statistical tool. R grew out of the earlier S language developed in the United States.

1997 ("Big Data" first appears): The term "Big Data" first appears in print.

2002 (digital data takes the lead): The amount of information stored digitally surpasses non-digital.

2002 (statistics in sport): Paul DePodesta uses statistics, "sabermetrics", to transform the fortunes of the Oakland Athletics baseball team; the film Moneyball tells the story.

2004 (Significance magazine launched): Launch of Significance magazine.

2008 (a prediction about statistics): Hal Varian, chief economist at Google, says that statistics will be "the sexy profession of the next ten years".

2012 (the Higgs boson): The Large Hadron Collider confirms the existence of a Higgs boson-like particle with probability of five standard deviations, around one chance in 3.5 million that all they are seeing is coincidence.

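The "one chance in 3.5 million" figure in the Higgs entry corresponds to the one-sided tail of a normal distribution beyond five standard deviations. The sketch below reproduces it; the function name `one_sided_tail` is an illustrative assumption, and the one-sided normal tail is the usual convention rather than something stated in the slides.

```python
from statistics import NormalDist

def one_sided_tail(n_sigma: float) -> float:
    """Probability that a standard normal variable exceeds n_sigma."""
    return 1.0 - NormalDist().cdf(n_sigma)

p = one_sided_tail(5.0)
print(f"tail probability: {p:.3e}")
print(f"about 1 chance in {1 / p:,.0f}")
```

Strictly, this is the chance of a fluctuation at least this large if there were no new particle, which is the sense in which the result is described as unlikely to be coincidence.
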
2012 (successful US election forecast): Nate Silver, statistician, successfully predicts the result in all 50 states in the US Presidential election. He becomes a media star.

Early Beginnings: 5th century BC --