<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta name="generator" content="pdf2htmlEX">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="stylesheet" href="https://static.pudn.com/base/css/base.min.css">
<link rel="stylesheet" href="https://static.pudn.com/base/css/fancy.min.css">
<link rel="stylesheet" href="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/raw.css">
<script src="https://static.pudn.com/base/js/compatibility.min.js"></script>
<script src="https://static.pudn.com/base/js/pdf2htmlEX.min.js"></script>
<script>
try{
pdf2htmlEX.defaultViewer = new pdf2htmlEX.Viewer({});
}catch(e){}
</script>
<title>深入理解 Spark:核心思想与源码分析 (Understanding Spark in Depth: Core Ideas and Source Code Analysis)</title>
</head>
<body>
<div id="sidebar" style="display: none">
<div id="outline">
</div>
</div>
<div id="pf1" class="pf w0 h0" data-page-no="1"><div class="pc pc1 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/bg1.jpg"><div class="t m0 x1 h2 y1 ff1 fs0 fc0 sc0 ls0 ws0">Big Data Technology Series</div><div class="t m0 x2 h3 y2 ff2 fs1 fc0 sc0 ls1 ws1">Understanding <span class="ff3">Spark</span> in Depth:</div><div class="t m0 x3 h4 y3 ff1 fs1 fc0 sc0 ls1 ws1">Core Ideas and Source Code Analysis</div><div class="t m0 x4 h5 y4 ff4 fs2 fc0 sc0 ls1 ws1">By 耿嘉安 (Geng Jiaan)</div></div><div class="pi" data-data='{"ctm":[1.763889,0.000000,0.000000,1.763889,0.000000,0.000000]}'></div></div>
</body>
</html>
<div id="pf2" class="pf w0 h0" data-page-no="2"><div class="pc pc2 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/bg2.jpg"><div class="t m1 x5 h6 y5 ff5 fs3 fc0 sc0 ls2 ws2">Cataloging in Publication (CIP) Data</div><div class="t m1 x5 h7 y6 ff5 fs4 fc0 sc0 ls3 ws3">Understanding Spark in Depth: Core Ideas and Source Code Analysis (深入理解 Spark:核心思想与源码分析) / by 耿嘉安 (Geng Jiaan). Beijing: China Machine Press (机械工业出版社), 2015.12</div><div class="t m1 x6 h7 y7 ff5 fs4 fc0 sc0 ls3 ws3">(Big Data Technology Series)</div><div class="t m1 x5 h8 y8 ff6 fs4 fc0 sc0 ls3 ws1">ISBN 978-7-111-52234-8</div><div class="t m1 x5 h7 y9 ff6 fs4 fc0 sc0 ls4 ws4">I. <span class="ff5">深… </span>II. <span class="ff5">耿… </span>III. Data processing software <span class="ws1">IV. TP274</span></div><div class="t m1 x5 h7 ya ff5 fs4 fc0 sc0 ls3 ws3">Archives Library of Chinese Publications, CIP data record (2015) No. 280808</div><div class="t m1 x5 h9 yb ff5 fs2 fc0 sc0 ls5 ws5">Understanding Spark in Depth: Core Ideas and Source Code Analysis</div><div class="t m1 x5 ha yc ff5 fs5 fc0 sc0 ls6 ws6">Published and distributed by: China Machine Press (机械工业出版社)</div><div class="t m1 x7 hb yd ff5 fs6 fc0 sc0 ls7 ws7">(22 Baiwanzhuang Street, Xicheng District, Beijing; postal code: 100037)</div><div class="t m1 x5 ha ye ff5 fs5 fc0 sc0 ls6 ws6">Editor: 高婧雅 (Gao Jingya)   Proofreader: 董纪丽 (Dong Jili)</div><div class="t m1 x5 ha yf ff5 fs5 fc0 sc0 ls6 ws6">Printer:     Edition: 1st edition, 1st printing, January 2016</div><div class="t m1 x5 ha y10 ff5 fs5 fc0 sc0 ls6 ws6">Format: <span class="ff6">186mm</span><span class="ls1 ws1">×</span><span class="ff6">240mm</span></div><div class="t m1 x2 hc y11 ff5 fs7 fc0 sc0 ls1 ws1"> <span class="ff6 ls8 ws8">1/16</span></div><div class="t m1 x8 ha y10 ff6 fs5 fc0 sc0 ls1 ws1">Printed sheets: 30.25</div><div class="t m1 x5 ha y12 ff5 fs5 fc0 sc0 ls6 ws6">Book number: ISBN 978-7-111-52234-8   Price: 99.00 yuan</div><div class="t m1 x5 hd y13 ff5 fs8 fc0 sc0 ls9 ws9">Copies of this book with missing, misordered, or loose pages will be exchanged by the publisher's distribution department.</div><div class="t m1 x5 hd y14 ff5 fs8 fc0 sc0 ls9 ws9">Customer service hotline: (010) 88379426, 88361066   Submissions hotline: (010) 88379604</div><div class="t m1 x5 hd y15 ff5 fs8 fc0 sc0 ls9 ws9">Book orders hotline: (010) 68326294, 88379649, 68995259   Reader mailbox: hzit@hzbook.com</div><div class="t m1 x5 ha y16 ff5 fs5 fc0 sc0 ls6 ws6">All rights reserved. Infringement will be prosecuted.</div><div class="t m1 x5 hd y17 ff5 fs8 fc0 sc0 ls9 ws9">Copies without an anti-counterfeiting mark on the back cover are pirated.</div><div class="t m1 x5 hd y18 ff5 fs8 fc0 sc0 ls9 ws9">Legal counsel for this book: Beijing Dacheng Law Offices (北京大成律师事务所), 韩光 (Han Guang) / 邹晓东 (Zou Xiaodong)</div></div><div class="pi" data-data='{"ctm":[1.763889,0.000000,0.000000,1.763889,0.000000,0.000000]}'></div></div>
<div id="pf3" class="pf w0 h0" data-page-no="3"><div class="pc pc3 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/bg3.jpg"><div class="t m0 x9 he y19 ff7 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 xa hf y1a ff8 fs0 fc1 sc0 ls1 ws1">Preface</div><div class="t m0 xb h10 y1b ff9 fs0 fc0 sc0 ls1 ws1"> </div><div class="t m0 xc h11 y1c ffa fs0 fc0 sc0 lsa wsa">前<span class="_ _12"></span>  言</div><div class="t m0 xd h12 y1d ffb fs3 fc0 sc0 ls1 ws1">Why I Wrote This Book</div><div class="t m0 x9 he y1e ff7 fs9 fc0 sc0 lsb wsb">To answer this question, I need to start with my own experience. I am a little embarrassed to say that I first touched a computer in my final year of high school.</div><div class="t m0 xd he y1f ff7 fs9 fc0 sc0 lsc wsc">Back then I went to Internet cafes with everyone else to play CS, and learned how to “play” from the classmates around me. It was this kind of “play” that</div><div class="t m0 xd he y20 ff7 fs9 fc0 sc0 lsd wsd">showed me that a computer is not so mysterious: it is just a machine, and using one seemed hardly more effort than turning on a television.</div><div class="t m0 xd he y21 ff7 fs9 fc0 sc0 ls2 ws2">When I filled in my college entrance exam application, I chose computer science on instinct, without much thought. Only when I actually began the coursework</div><div class="t m0 xd he y22 ff7 fs9 fc0 sc0 ls1 ws1">did I discover that it is in fact very hard!</div><div class="t m0 x9 he y23 ff7 fs9 fc0 sc0 lse wse">Back in 2004, while still at school, I was like many of my classmates: I enjoyed watching Flash animations, talking about Flash, and even making</div><div class="t m0 xd he y24 ff9 fs9 fc0 sc0 ls1 ws1">Flash. Flash seemed to live up to its name and really did “flash”. In those years, far more people on campus had heard of Flash than</div><div class="t m0 xd he y25 ff7 fs9 fc0 sc0 ls1 ws1">of Java, which shows how popular Flash was at the time. Oracle had also become the leader among relational databases,</div><div class="t m0 xd he y26 ff7 fs9 fc0 sc0 ls1 ws1">and many people even felt that knowing Oracle was far more impressive than knowing Flash, Java, or any other database!</div><div class="t m0 x9 he y27 ff9 fs9 fc0 sc0 ls1 ws1">In 2007 I had only just started working. At that time Struts1, Spring, and Hibernate were practically the troika of the software companies</div><div class="t m0 xd he y28 ff7 fs9 fc0 sc0 ls1 ws1">that developed in Java. Soon Struts2 took the place of Struts1, and for the</div><div class="t m0 xd he y29 ff7 fs9 fc0 sc0 ls12 ws12">first time I realized how quickly technology in the IT field is replaced! As many traditional software companies turned themselves into Internet companies,</div><div class="t m0 xd he y2a ff9 fs9 fc0 sc0 ls1 ws1">even Hibernate struggled to hold its position, and iBATIS emerged!</div><div class="t m0 x9 he y2b ff9 fs9 fc0 sc0 ls1 ws1">In 2010 technical books about Hadoop poured into China. Many companies then used it only for data statistics, data</div><div class="t m0 xd he y2c ff7 fs9 fc0 sc0 ls14 ws14">mining, or search. At first, people's understanding and use of Hadoop were probably rather limited. Around 2011</div><div class="t m0 xd he y2d ff7 fs9 fc0 sc0 ls2 ws2">the concept of cloud computing was being hyped online. I was still doing Internet development, and what I knew about it was only</div><div class="t m0 xd he y2e ff7 fs9 fc0 sc0 ls15 ws15">hearsay. Later I borrowed a book on cloud computing from a colleague and read parts of it at home, but gained little and was left feeling</div><div class="t m0 xd he y2f ff7 fs9 fc0 sc0 ls1 ws1">at a loss! In the 1960s the US military network, the embryo of the Internet, already had much in common with some of the ideas</div><div class="t m0 xd he y30 ff7 fs9 fc0 sc0 ls16 ws16">in cloud computing. By the 1980s the Internet had, in a sense, already put cloud computing into practice, so why is the concept being raised again today?</div><div class="t m0 xd he y31 ff7 fs9 fc0 sc0 ls1 ws1">That is a question I probably cannot answer, so I will leave it to history.</div><div class="t m0 x9 he y32 ff9 fs9 fc0 sc0 ls17 ws17">In 2012 a big data craze swept China. From the government to the media, education, IT, and almost every other field,</div><div class="t m0 xd he y33 ff7 fs9 fc0 sc0 ls18 ws18">everyone was talking about big data. Among my relatives and friends, teachers, salespeople, and engineers alike could all offer</div><div class="t m0 xd he y34 ff7 fs9 fc0 sc0 ls19 ws19">their views on it. I too found some books on Hadoop to study, hoping to discover the secrets of big data in them.</div></div><div class="pi" data-data='{"ctm":[1.763889,0.000000,0.000000,1.763889,0.000000,0.000000]}'></div></div>
<div id="pf4" class="pf w0 h0" data-page-no="4"><div class="pc pc4 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/bg4.jpg"><div class="t m0 xe he y19 ff7 fs9 fc0 sc0 ls1a ws1a">At work I was fortunate to come into contact with Alibaba's Open Data Processing Service</div><div class="t m0 xf he y35 ff9 fs9 fc0 sc0 ls19 ws19">(ODPS), and to build on ODPS, together with my colleagues, Alibaba's big data business solution, Yushanfang (<span class="ff7">御膳房</span>).</div><div class="t m0 xf he y36 ff7 fs9 fc0 sc0 ls1b ws1c">On a business trip to Hangzhou I was lucky to meet 和仲 (He Zhong), who taught me something about Garuda, Alibaba's real-time multidimensional analysis platform,</div><div class="t m0 xf he y37 ff7 fs9 fc0 sc0 ls2 ws2">and Galaxy, its real-time computing platform. He Zhong recommended that I read the Spark source code to gain a deeper understanding of real-time</div><div class="t m0 xf he y38 ff7 fs9 fc0 sc0 ls1c ws1d">and stream computing. During the Spring Festival of 2015 I looked up Spark material online for the first time and began</div><div class="t m0 xf he y39 ff7 fs9 fc0 sc0 lsf wsf">studying the Spark source code. I remember that back then I was driven simply by a love of big data and a wish to improve my technical skills</div><div class="t m0 xf he y3a ff7 fs9 fc0 sc0 ls19 ws19">in this area.</div><div class="t m0 xe he y3b ff7 fs9 fc0 sc0 ls1d ws1e">Starting with the Hibernate source code and later moving on to Tomcat and Spring, I grew through studying source code,</div><div class="t m0 xf he y3c ff7 fs9 fc0 sc0 ls1e ws1f">and became more and more interested in reading it. As I read deeper into the Spark source code, I found that many questions</div><div class="t m0 xf he y3d ff7 fs9 fc0 sc0 ls14 ws14">had no answers online, so I had to grind through the code myself. As my notes piled up, I suddenly realized one day that</div><div class="t m0 xf he y3e ff7 fs9 fc0 sc0 ls2 ws2">what I had summarized might be enough for a book! From a flash (Flash) to a spark (Spark), a full 11 years</div><div class="t m0 xf he y3f ff7 fs9 fc0 sc0 ls20 ws21">had passed. Whether with Flash and Java or with Spring and iBATIS, I had always been a follower, absorbing what the books</div><div class="t m0 xf he y40 ff7 fs9 fc0 sc0 ls21 ws22">taught without ever giving anything back. Today I am a follower of Spark too, but this time I do not want only to take; I also want</div><div class="t m0 xf he y41 ff7 fs9 fc0 sc0 ls1 ws1">to give.</div><div class="t m0 xe he y42 ff7 fs9 fc0 sc0 ls22 ws23">Finally, 2016 is my 10th year working in IT, and this book is a special 10th-anniversary</div><div class="t m0 xf he y43 ff7 fs9 fc0 sc0 ls1 ws1">gift to myself.</div><div class="t m0 xf h12 y44 ffb fs3 fc0 sc0 ls1 ws1">Features of This Book</div><div class="t m0 x16 h13 y45 ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x17 h14 y46 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x18 he y45 ff7 fs9 fc0 sc0 ls22 ws23">Organized the way one analyzes source code: from the scripts to initialization to the core, and finally Spark's</div><div class="t m0 x18 he y47 ff7 fs9 fc0 sc0 ls1 ws1">extensions. Throughout, it moves from the simple to the deep, and from depth to breadth.</div><div class="t m0 x16 h13 y48 ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x17 h14 y49 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x18 he y48 ff7 fs9 fc0 sc0 ls1 ws1">Every topic covered comes with a corresponding example, to support readers in studying the source code in depth.</div><div class="t m0 x16 h13 y4a ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x17 h14 y4b ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x18 he y4a ff7 fs9 fc0 sc0 ls1 ws1">Wherever possible, principles are illustrated with diagrams so that readers can grasp the content faster.</div><div class="t m0 x16 h13 y4c ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x17 h14 y4d ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x18 he y4c ff7 fs9 fc0 sc0 ls23 ws24">Many of the implementations and principles explained here are worth borrowing, and can help readers improve their architecture design, programming,</div><div class="t m0 x18 he y4e ff7 fs9 fc0 sc0 ls1 ws1">and related skills.</div><div class="t m0 x16 h13 y4f ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x17 h14 y50 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x18 he y4f ff7 fs9 fc0 sc0 ls24 ws25">As much source code as possible is reproduced, so that beginners can read it with ease even in places</div><div class="t m0 x18 he y51 ff7 fs9 fc0 sc0 ls1 ws1">such as the subway or a bus.</div><div class="t m0 xf h12 y52 ffb fs3 fc0 sc0 ls1 ws1">Intended Audience</div><div class="t m0 xe he y53 ff7 fs9 fc0 sc0 ls25 ws26">Reading source code is hard work that costs a great deal of effort and time. For people unfamiliar with Spark or just starting</div><div class="t m0 xf he y54 ff7 fs9 fc0 sc0 lsc wsc">to learn it, the difficulty is easy to imagine. This book keeps as much source code as possible so that the analysis never skips a step;</div><div class="t m0 xf he y55 ff7 fs9 fc0 sc0 lsf wsf">the aim is to lower the barrier for most readers. If you are a newcomer with 1 to 3 years in IT, or someone who wants to learn Spark's</div><div class="t m0 xf he y56 ff7 fs9 fc0 sc0 ls1e ws1f">core concepts, this book suits you well. If you already know something about Spark or are already using it and want to</div><div class="t m0 xf he y57 ff7 fs9 fc0 sc0 ls1 ws1">improve further, it suits you even better.</div><div class="t m0 xe he y58 ff7 fs9 fc0 sc0 ls26 ws27">If you are new to development and not yet comfortable with fundamentals such as Java and Linux, this book may not be a good fit</div><div class="t m0 xf he y59 ff7 fs9 fc0 sc0 ls1 ws1">for you. If you have already studied Spark in depth, it may still serve as a reference.</div><div class="t m0 xf h15 y5a ffe fs9 fc0 sc0 ls1 ws1">IV</div></div><div class="pi" data-data='{"ctm":[1.763889,0.000000,0.000000,1.763889,0.000000,0.000000]}'></div></div>
<div id="pf5" class="pf w0 h0" data-page-no="5"><div class="pc pc5 w0 h0"><img class="bi x0 y0 w1 h1" alt="" src="https://static.pudn.com/prod/directory_preview_static/629b5a0da1ab4536adfe50a7/bg5.jpg"><div class="t m0 x19 h15 y5b ffe fs9 fc0 sc0 ls1 ws1">V</div><div class="t m0 x9 he y5c ff7 fs9 fc0 sc0 ls1 ws1">In general, this book is suitable for:</div><div class="t m0 x1a h13 y5d ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x1b h14 y5e ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x1c he y5d ff7 fs9 fc0 sc0 ls1 ws1">People who want to use Spark but do not understand how it is implemented and do not know where to start learning;</div><div class="t m0 x1a h13 y5f ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x1b h14 y60 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x1c he y5f ff7 fs9 fc0 sc0 ls1 ws1">Big data enthusiasts, and people who want to understand the internal implementation details of Spark;</div><div class="t m0 x1a h13 y61 ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x1b h14 y62 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x1c he y61 ff7 fs9 fc0 sc0 ls1 ws1">People with some experience using Spark who do not yet understand its internal implementation details;</div><div class="t m0 x1a h13 y63 ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x1b h14 y64 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x1c he y63 ff7 fs9 fc0 sc0 ls1 ws1">Engineers and architects at large Internet companies who are interested in performance tuning and deployment options;</div><div class="t m0 x1a h13 y65 ff9 fs9 fc0 sc0 ls1 ws1"> </div><div class="t m0 x1b h14 y66 ffc fs9 fc0 sc0 ls1 ws1">T</div><div class="t m0 x1c he y65 ff7 fs9 fc0 sc0 ls1 ws1">Open source enthusiasts. Those who enjoy studying source code can pick up some approaches and methods for reading it from this book.</div><div class="t m0 x9 he y67 ff7 fs9 fc0 sc0 ls27 ws28">This book will not teach you how to develop Spark applications; it only uses a few classic examples for demonstration. It briefly introduces</div><div class="t m0 xd he y68 ff9 fs9 fc0 sc0 ls1 ws1b">Hadoop MapReduce, Hadoop YARN, Mesos, Tachyon, ZooKeeper, HDFS, and Amazon S3, but</div><div class="t m0 xd he y69 ff7 fs9 fc0 sc0 ls29 ws2b">does not dwell on how to use these frameworks, because plenty of books on them are already available. Nor does it</div><div class="t m0 xd he y6a ff7 fs9 fc0 sc0 ls1 ws1">spend much time on Scala, Java, or Shell syntax; readers can choose a book on those topics that suits them.</div><div class="t m0 xd h12 y6b ffb fs3 fc0 sc0 ls1 ws1">How to Read This Book</div><div class="t m0 x9 he y6c ff7 fs9 fc0 sc0 ls1 ws1">This book has three main parts (not counting the appendices):</div><div class="t m0 x9 he y6d ff1 fs9 fc0 sc0 ls20 ws21">Preparation<span class="ff7"> (Chapters 1 to 2) briefly covers setting up a Spark environment and Spark's basic principles, giving readers some</span></div><div class="t m0 xd he y6e ff7 fs9 fc0 sc0 ls1 ws1">background.</div><div class="t m0 x9 he y6f ff1 fs9 fc0 sc0 ls16 ws16">Core Design<span class="ff7"> (Chapters 3 to 7) focuses on SparkContext initialization, the storage system, task submission and</span></div><div class="t m0 xd he y70 ff7 fs9 fc0 sc0 ls1 ws1">execution, the compute engine, and deployment modes, covering both principles and source code analysis.</div><div class="t m0 x9 he y71 ff1 fs9 fc0 sc0 ls2c ws2e">Extensions<span class="ff7"> (Chapters 8 to 11) covers extensions and applications built on the Spark core, including the SQL processing</span></div><div class="t m0 xd he y45 ff7 fs9 fc0 sc0 ls1 ws1">engine, Hive processing, the stream computing framework Spark Streaming, the graph computing framework GraphX, and the machine learning</div><div class="t m0 xd he y47 ff7 fs9 fc0 sc0 ls1 ws1">library MLlib.</div><div class="t m0 x9 he y48 ff7 fs9 fc0 sc0 ls19 ws19">The book closes with several appendices: Appendix A introduces Utils, the most commonly used utility class in Spark; Appendix</div><div class="t m0 xd he y4a ff9 fs9 fc0 sc0 ls1 ws1">B gives an overview of Akka and the utility class AkkaUtils; Appendix C gives an overview of Jetty and the utility class JettyUtils;</div><div class="t m0 xd he y4c ff7 fs9 fc0 sc0 ls2f ws31">Appendix D gives an overview of the Metrics library and its metric container MetricRegistry; Appendix E demonstrates the word count</div><div class="t m0 xd he y4e ff7 fs9 fc0 sc0 ls2e ws30">example from Hadoop 1.0; Appendix F describes commonly used methods of the utility class CommandUtils; Appendix G gives an overview of</div><div class="t m0 xd he y4f ff9 fs9 fc0 sc0 ls1 ws1">Netty and the utility class NettyUtils; and Appendix H lists the problems I ran into while compiling the Spark source code, along with</div><div class="t m0 xd he y51 ff7 fs9 fc0 sc0 ls1 ws1">their solutions.</div><div class="t m0 x9 he y72 ff7 fs9 fc0 sc0 ls25 ws26">To lower the barrier to reading and understanding the Spark source code, this book reproduces the source wherever possible, and I hope readers will</div><div class="t m0 xd he y73 ff7 fs9 fc0 sc0 ls30 ws32">bring a curious mind to it. Spark is very popular right now and new versions arrive quickly. This book is based mainly on Spark 1.2.3; interested</div><div class="t m0 xd he y74 ff7 fs9 fc0 sc0 ls1 ws1">readers can apply the same approach to the latest Spark source code.</div><div class="t m0 xd h12 y75 ffb fs3 fc0 sc0 ls1 ws1">Errata and Support</div><div class="t m0 x9 he y56 ff7 fs9 fc0 sc0 ls2c ws2e">This book covers a lot of ground and my own knowledge is limited, so errors are inevitable. At any time after publication,</div><div class="t m0 xd he y57 ff7 fs9 fc0 sc0 lse wse">if you have questions or comments about the book, you can reach me by email at beliefer@163.com or through my blog at</div><div class="t m0 xd he y58 ff9 fs9 fc0 sc0 ls1 ws1">http://www.cnblogs.com/jiaan-geng/ with your suggestions or ideas. I look forward to improving together with you.</div></div><div class="pi" data-data='{"ctm":[1.763889,0.000000,0.000000,1.763889,0.000000,0.000000]}'></div></div>