Is it better to save multiple documents or fewer documents with large objects in MongoDB?
I'm using MongoDB to store analytics for multiple websites. The sites have millions of visits a day and thousands of different URLs per day. I count the number of visits each URL has.
Right now, each day I'll need to get the data for the previous day.
Is it better to store each URL in its own document, or all the URLs under one object in one document?
Inevitably, everyone who uses MongoDB has to choose between using multiple collections with id references or embedded documents. Both solutions have their strengths and weaknesses. Learn to use both:
Use separate collections

    db.posts.find();
    {_id: 1, title: 'unicorns are awesome', ...}

    db.comments.find();
    {_id: 1, post_id: 1, title: 'i agree', ...}
    {_id: 2, post_id: 1, title: 'they kill vampires too!', ...}

- or -

Use embedded documents

    db.posts.find();
    {_id: 1, title: 'unicorns are awesome', ..., comments: [
      {title: 'i agree', ...},
      {title: 'they kill vampires too!', ...}
    ]}
Separate collections offer the greatest querying flexibility

    // sort comments however you want
    db.comments.find({post_id: 3}).sort({votes: -1}).limit(5)

    // pull out one or more specific comment(s)
    db.comments.find({post_id: 3, user: 'leto'})

    // get all of a user's comments, joining the posts to get the title
    var comments = db.comments.find({user: 'leto'}, {post_id: true})
    var postids = comments.map(function(c) { return c.post_id; });
    db.posts.find({_id: {$in: postids}}, {title: true});
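Selecting embedded documents is more limited

    // you can select a range (useful for paging),
    // but you can't sort, so you are limited to the insertion order
    db.posts.find({_id: 3}, {comments: {$slice: [0, 5]}})

    // you can select the post without any comments
    // (0 excludes a field; a non-zero value would include it instead)
    db.posts.find({_id: 54}, {comments: 0})

    // the positional operator ($) in a field selection returns only the
    // first matching comment (MongoDB 2.2+; older versions did not support it)
    db.posts.find({'comments.user': 'leto'}, {title: 1, 'comments.$': 1})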
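Because a `$slice` projection only follows insertion order, any other ordering of embedded comments has to happen in application code after the post is fetched. The following is a minimal sketch of that idea in plain JavaScript, not part of the original answer: the helper name `topComments` and the sample data are hypothetical, and the `post` object stands in for whatever `db.posts.findOne({_id: 3})` would return.

```javascript
// Hypothetical helper: return the n highest-voted embedded comments of a
// post document that has already been fetched from the database.
function topComments(post, n) {
  // Copy the array before sorting so the fetched document is not mutated.
  return (post.comments || [])
    .slice()
    .sort(function (a, b) { return (b.votes || 0) - (a.votes || 0); })
    .slice(0, n);
}

// Sample data standing in for the result of db.posts.findOne({_id: 3}).
var post = {
  _id: 3,
  title: 'unicorns are awesome',
  comments: [
    {title: 'i agree', votes: 2},
    {title: 'they kill vampires too!', votes: 7},
    {title: 'meh', votes: 0}
  ]
};

var best = topComments(post, 2);
console.log(best.map(function (c) { return c.title; }));
```

This works, but the whole comments array travels over the wire first, which is one reason large comment lists tend to fit a separate collection better.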
A document, including all of its embedded documents and arrays, cannot exceed 16MB.

Separate collections require more work

    // finding a post + its comments is 2 queries and requires extra work
    // in your code to make it all pretty (or your ODM might do it for you)
    db.posts.find({_id: 9001});
    db.comments.find({post_id: 9001})
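The "extra work" is mostly stitching the two result sets back together. Here is a rough sketch of that step in plain JavaScript, assuming both queries have already run; the helper name `attachComments` and the sample objects are made up for illustration and are not part of any MongoDB API.

```javascript
// Hypothetical helper: combine the results of the two queries into one
// object shaped like the embedded-document version of the post.
function attachComments(post, comments) {
  if (!post) { return null; }
  // Shallow copy so the fetched post object is left untouched.
  var result = Object.assign({}, post);
  // Keep only the comments that reference this post.
  result.comments = comments.filter(function (c) {
    return c.post_id === post._id;
  });
  return result;
}

// Sample data standing in for the results of the two find() calls.
var fetchedPost = {_id: 9001, title: 'unicorns are awesome'};
var fetchedComments = [
  {_id: 1, post_id: 9001, title: 'i agree'},
  {_id: 2, post_id: 9001, title: 'they kill vampires too!'}
];

var combined = attachComments(fetchedPost, fetchedComments);
console.log(combined.title, combined.comments.length);
```

An ODM typically hides this kind of glue code, but the two round trips to the database remain.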
Embedded documents are easy and fast (single seek)

    // finding a post + its comments
    db.posts.find({_id: 9001});
No big differences in inserts and updates

    // separate collection insert and update
    db.comments.insert({post_id: 43, title: 'i hate unicrons', user: 'dracula'});
    db.comments.update({_id: 4949}, {$set: {title: 'i hate unicorns'}});

    // embedded document insert and update
    // ($push needs the name of the array field, here comments)
    db.posts.update({_id: 43}, {$push: {comments: {title: 'lol @ emo vampire', user: 'paul'}}})

    // this specific update requires that you store an _id with each comment
    db.posts.update({'comments._id': 4949}, {$inc: {'comments.$.votes': 1}})
So, separate collections are good if you need to select individual documents, need more control over querying, or have huge documents. Embedded documents are good when you want the entire document, the document with a $slice of comments, or the document with no comments at all. As a general rule, if you have a lot of "comments" or if they are large, a separate collection might be best. Smaller and/or fewer documents tend to be a natural fit for embedding.
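Applied to the analytics question, with thousands of URLs per site per day, that rule of thumb leans toward one small document per URL per day rather than one large document per day. The sketch below shows one possible way to build such a counter update in plain JavaScript. The helper `visitCounterOp`, the collection name `visits`, and the field names `site`, `url`, `day`, and `visits` are all assumptions made for illustration, not something the question specifies.

```javascript
// Hypothetical helper: build the filter and update for counting one visit.
// Field names (site, url, day, visits) are assumptions, not a fixed schema.
function visitCounterOp(site, url, date) {
  // Use the UTC calendar day ("YYYY-MM-DD") as the per-day bucket key.
  var day = date.toISOString().slice(0, 10);
  return {
    filter: {site: site, url: url, day: day},
    update: {$inc: {visits: 1}}
  };
}

var op = visitCounterOp('example.com', '/pricing', new Date('2024-03-05T14:30:00Z'));
console.log(JSON.stringify(op));

// In the mongo shell this could be applied as an upsert, so the first visit
// of the day creates the document and later visits increment it:
//   db.visits.updateOne(op.filter, op.update, {upsert: true});
// Reading the previous day's counts for a site is then a single query:
//   db.visits.find({site: 'example.com', day: '2024-03-04'});
```

With this shape, each counter stays tiny, individual URLs can be sorted and filtered with ordinary queries, and the 16MB document limit is not a concern.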
Remember, you can always change your mind. Trying both is the best way to learn.