Basic Aggregation in MongoDB 2.1 with Python

Original post · Category: Linux operating systems · Author: jieforest · Posted: 2012-06-10 07:40:04

Why a new framework?

If you've been following along with this article series, you've been introduced to MongoDB's mapreduce command, which up until MongoDB 2.1 was the go-to aggregation tool for MongoDB. (There's also the group() command, but it's really no more than a less capable, un-shardable version of mapreduce(), so we'll ignore it here.)

So if you already have mapreduce() in your toolbox, why would you ever want something else?

Mapreduce is hard; let's go shopping

The first motivation behind the new framework is that, while mapreduce() is a flexible and powerful abstraction for aggregation, it's really overkill in many situations, as it requires you to re-frame your problem into a form that's amenable to calculation using mapreduce().

For instance, when I want to calculate the mean value of a property in a series of documents, trying to break that down into appropriate map, reduce, and finalize steps imposes some extra cognitive overhead that we'd like to avoid. So the new aggregation framework is (IMO) simpler.
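To make that overhead concrete, here is a minimal pure-Python sketch of the three steps needed just to compute a mean. It only emulates the shape of map, reduce, and finalize; it is not MongoDB's API, and the `value` field, the sample documents, and the helper names are all made up for illustration. The last line shows the same question written as an aggregation pipeline, which with PyMongo would typically be passed to `collection.aggregate(pipeline)`.

```python
from collections import defaultdict

# Hypothetical sample documents; "value" is an illustrative field name.
docs = [{"value": 2.0}, {"value": 4.0}, {"value": 9.0}]


def map_fn(doc):
    # Emit one (key, partial) pair per document. The partial carries both a
    # running total and a count so that partials can be merged later.
    yield None, {"total": doc["value"], "count": 1}


def reduce_fn(key, partials):
    # Returns the same shape it consumes, because a reduce step may be
    # re-applied to its own output.
    return {
        "total": sum(p["total"] for p in partials),
        "count": sum(p["count"] for p in partials),
    }


def finalize_fn(key, reduced):
    # Only here does the mean actually get computed.
    return reduced["total"] / reduced["count"]


def run_mapreduce(documents):
    # Group emitted partials by key, then reduce and finalize each group.
    groups = defaultdict(list)
    for doc in documents:
        for key, partial in map_fn(doc):
            groups[key].append(partial)
    return {
        key: finalize_fn(key, reduce_fn(key, partials))
        for key, partials in groups.items()
    }


mean_by_key = run_mapreduce(docs)

# The same question expressed as an aggregation pipeline (plain data here;
# no server is contacted in this sketch).
pipeline = [{"$group": {"_id": None, "mean": {"$avg": "$value"}}}]
```

Three functions and a carefully chosen intermediate shape, versus a single `$group` stage with `$avg`: that difference is the cognitive overhead in question.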

The JavaScript global interpreter lock is evil

The MapReduce algorithm, the basis of MongoDB's mapreduce() command, is a great approach to solving Embarrassingly Parallel problems.

Each invocation of map, reduce, and finalize is completely independent of the others (though the map/reduce/finalize phases are order-dependent), so we should be able to dispatch these jobs to run in parallel without any problems.

Unfortunately, due to MongoDB's use of the SpiderMonkey JavaScript engine, each mongod process is restricted to running a single JavaScript thread at a time.

So in order to get any parallelism with a MongoDB mapreduce(), you must run it on a sharded cluster, and on a cluster with N shards, you're limited to N-way parallelism.
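The N-way limit can be pictured with a small pure-Python sketch. Everything here is an assumption made for illustration: the round-robin placement, the `shard_job` and `sharded_mean` names, and the use of one worker thread as a stand-in for one mongod process. It does not talk to MongoDB and does not model how a real cluster distributes data.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_job(shard_docs):
    # Map + reduce on a single shard: one partial result per shard.
    return {
        "total": sum(d["value"] for d in shard_docs),
        "count": len(shard_docs),
    }


def sharded_mean(docs, n_shards):
    # Hypothetical round-robin placement of documents onto shards.
    shards = [docs[i::n_shards] for i in range(n_shards)]

    # One worker per shard: at most n_shards jobs can run at the same time,
    # mirroring the one-JavaScript-thread-per-mongod restriction.
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        partials = list(pool.map(shard_job, shards))

    # Final re-reduce and finalize over the per-shard partials.
    total = sum(p["total"] for p in partials)
    count = sum(p["count"] for p in partials)
    return total / count


docs = [{"value": float(v)} for v in range(1, 11)]
result = sharded_mean(docs, n_shards=3)
```

With `n_shards=1` the same code degenerates to a single sequential job, which is the unsharded case described above.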

Source: ITPUB Blog, http://blog.itpub.net/301743/viewspace-732370/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
