A Collection of Translated Classic Papers in Distributed Systems


Data Analysis  Author: 110110AAAA  Date: 2012-06-15 14:28:30





1.      Translator's preface to the Google paper series

2.      The anatomy of a large-scale hypertextual Web search engine

3.      Web Search for a Planet: The Google Cluster Architecture

4.      GFS: The Google File System

5.      MapReduce: Simplified Data Processing on Large Clusters

6.      Bigtable: A Distributed Storage System for Structured Data

7.      Chubby: The Chubby lock service for loosely-coupled distributed systems

8.      Sawzall: Interpreting the Data: Parallel Analysis with Sawzall

9.      Pregel: A System for Large-Scale Graph Processing

10.  Dremel: Interactive Analysis of Web-Scale Datasets

11.  Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications

12.   Megastore: Providing Scalable, Highly Available Storage for Interactive Services

13.   Case Study GFS: Evolution on Fast-forward

14.   Google File System II: Dawn of the Multiplying Master Nodes

15.   Tenzing - A SQL Implementation on the MapReduce Framework

16.   F1 - The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business



00.    Appraising Two Decades of Distributed Computing Theory Research

0.      How to Build a Highly Available System Using Consensus

1.      Translator's preface to the distributed theory series

2.      A Brief History of Consensus, 2PC and Transaction Commit

3.      The Byzantine Generals Problem --Leslie Lamport

4.      Impossibility of distributed consensus with one faulty process

5.      Leases: the lease mechanism

6.      Paxos Made Simple

7.     The Part-Time Parliament --Leslie Lamport

8.     Fast Paxos --Leslie Lamport

9.     Paxos Made Live - An Engineering Perspective

10.    Uniform consensus is harder than consensus

11.    The Transaction Concept: Virtues and Limitations --Jim Gray

12.    2PC (two-phase commit): Notes on Data Base Operating Systems --Jim Gray


14.    Life beyond Distributed Transactions: an Apostate’s Opinion

15.    A Comparison of the Byzantine Agreement Problem and the Transaction Commit Problem --Jim Gray

16.    Consensus on Transaction Commit --Jim Gray & Leslie Lamport

21.    Time, Clocks, and the Ordering of Events in a Distributed System --Leslie Lamport

22.    Distributed Snapshots: Determining Global States of a Distributed System --K. Mani Chandy & Leslie Lamport

23.    Virtual Time and Global States of Distributed Systems

24.    Timestamps in Message-Passing Systems That Preserve the Partial Ordering

25.    Fundamentals of Distributed Computing: A Practical Tour of Vector Clock Systems


0.      Towards Robust Distributed Systems: Brewer's 2000 PODC keynote

1.      CAP theory

2.      Harvest, Yield, and Scalable Tolerant Systems

3.      On CAP

4.      The BASE model: BASE: An ACID Alternative

5.      Eventual consistency

6.      Scalability design patterns

7.      Scalability principles

8.      The NoSQL ecosystem

9.      scalability-availability-stability-patterns

10.    The 5 Minute Rule and the 5 Byte Rule

11.    The Five-Minute Rule 20 Years Later (and How Flash Memory Changes the Rules)

12.    The debate over MapReduce

13.    MapReduce: A Major Step Backwards

14.    MapReduce: A Major Step Backwards (II)

15.    MapReduce and parallel databases: friends or foes? (repost)

16.    MapReduce and Parallel DBMSs: Friends or Foes? (translation)

17.    MapReduce: A Flexible Data Processing Tool (translation)

18.    A Comparison of Approaches to Large-Scale Data Analysis (translation)

19.    Can MapReduce no longer hold up? (repost)

20.    Beyond MapReduce: an overview of graph computation


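1.      A summary of methods for processing large, massive-scale data

2.      A summary of methods for processing large, massive-scale data (continued)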

3.     Consistent Hashing And Random Trees

4.    Merkle Trees

5.    Scalable Bloom Filters

6.    Introduction to Distributed Hash Tables

7.    B-Trees and Relational Database Systems

8.    The log-structured merge-tree

9.    lock free data structure

10.    Data Structures for Spatial Database

11.    Gossip

12.    lock free algorithm

13.    The Graph Traversal Pattern


1.    The data structures and algorithmic principles behind MySQL indexes

2.    Dynamo: Amazon’s Highly Available Key-value Store

3.    Cassandra - A Decentralized Structured Storage System

4.    PNUTS: Yahoo!’s Hosted Data Serving Platform

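5.    An introduction to and reflections on PNUTS, Yahoo!'s distributed data platform (repost)

6.    LevelDB: a fast and lightweight key-value storage library (translation)

7.    LevelDB: implementation (translation)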

8.    Megastore: Providing Scalable, Highly Available Storage for Interactive Services

9.     Designs, Lessons and Advice from Building Large Distributed Systems --Jeff Dean

10.     Challenges in Building Large-Scale Information Retrieval Systems --Jeff Dean


1.    The Ganglia Distributed Monitoring System: Design, Implementation, and Experience

2.    Chukwa: A large-scale monitoring system

3.    Scribe : a way to aggregate data and why not, to directly fill the HDFS?

4.    Benchmarking Cloud Serving Systems with YCSB

5.    A brief overview of Dynamo, Dremel, ZooKeeper and Hive

VII.   Hadoop-related

0.     Hadoop Reading List

1.     The Hadoop Distributed File System (translation)

2.     HDFS scalability: the limits to growth (translation)

3.     Name-node memory size estimates and optimization proposal.

4.     HBase Architecture (translation)

5.     HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs

6.     HFile V2

7.     Hive - A Warehousing Solution Over a Map-Reduce Framework

8.    Hive – A Petabyte Scale Data Warehouse Using Hadoop

9.    Hive RCFile: an efficient storage structure

10.   ZooKeeper: Wait-free coordination for Internet-scale systems

11.    The life and times of a zookeeper

12.    Avro: a data format for big data (repost)

13.     Apache Hadoop Goes Realtime at Facebook

14.     A survey of Hadoop platform optimization

15.    The Anatomy of Hadoop I/O Pipeline

16.     Hadoop Fair Scheduler guide

17.    The next generation of Apache Hadoop MapReduce

18.    Apache Hadoop 0.23


Reflections on Trusting Trust --Ken Thompson

Who Needs an Architect?

Go To Statement Considered Harmful --Edsger W. Dijkstra

No Silver Bullet: Essence and Accidents of Software Engineering --Frederick P. Brooks

When reposting, please credit the author: phylips@bmy 2011-4-30



