Is it better to save multiple documents or fewer documents with large objects in MongoDB?


I'm using MongoDB to store analytics for multiple websites. The sites have millions of visits a day and thousands of different URLs per day. I count the number of visits each URL has.

Right now, each day I need to retrieve the data for the previous day.

Is it better to store each URL in its own document, or all URLs under one object in one document?


Inevitably, everyone who uses MongoDB has to choose between using multiple collections with id references or embedded documents. Both solutions have their strengths and weaknesses. Learn to use both:

Use separate collections

    db.posts.find();
    {_id: 1, title: 'unicorns awesome', ...}

    db.comments.find();
    {_id: 1, post_id: 1, title: 'i agree', ...}
    {_id: 2, post_id: 1, title: 'they kill vampires too!', ...}

- or -

Use embedded documents

    db.posts.find();
    {_id: 1, title: 'unicorns awesome', ..., comments: [
      {title: 'i agree', ...},
      {title: 'they kill vampires too!', ...}
    ]}

Separate collections offer the greatest querying flexibility

    // sort comments however you want
    db.comments.find({post_id: 3}).sort({votes: -1}).limit(5)

    // pull out one or more specific comment(s)
    db.comments.find({post_id: 3, user: 'leto'})

    // get all of a user's comments, joining the posts to get the title
    var comments = db.comments.find({user: 'leto'}, {post_id: true})
    var postids = comments.map(function(c) { return c.post_id; });
    db.posts.find({_id: {$in: postids}}, {title: true});

Selecting embedded documents is more limited

    // you can select a range (useful for paging),
    // but you can't sort, so you are limited to insertion order
    db.posts.find({_id: 3}, {comments: {$slice: [0, 5]}})

    // you can also select the post without any comments
    db.posts.find({_id: 54}, {comments: 0})

    // the positional operator ($) in a projection (MongoDB 2.2+) returns
    // only the first matching comment, not every match
    db.posts.find({'comments.user': 'leto'}, {title: 1, 'comments.$': 1})
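Because a projection cannot reorder an embedded array, one option is to sort in application code after fetching the post. The following is a minimal sketch in plain JavaScript, assuming each comment carries a numeric `votes` field as in the separate-collection queries; `topComments` is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical helper: return the n highest-voted embedded comments of a post.
// `post` is a plain object shaped like the embedded-document example.
function topComments(post, n) {
  var comments = post.comments || [];
  // slice() copies the array so the fetched document is not mutated
  return comments.slice().sort(function (a, b) {
    return (b.votes || 0) - (a.votes || 0);
  }).slice(0, n);
}

// Sample data invented for illustration
var post = {
  _id: 3,
  title: 'unicorns awesome',
  comments: [
    {title: 'i agree', votes: 2},
    {title: 'they kill vampires too!', votes: 7},
    {title: 'meh', votes: 0}
  ]
};

var best = topComments(post, 2);
```

This still transfers every embedded comment to the client, so it only makes sense while the array stays small.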

A document, including all its embedded documents and arrays, cannot exceed 16MB.
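To get a feel for what the 16MB limit means for the per-URL counters in the question, a back-of-the-envelope estimate can help. This is only a sketch: the average URL length and the per-entry overhead are assumed numbers, and `maxEmbeddedEntries` is a hypothetical helper.

```javascript
// MongoDB's maximum BSON document size: 16 MB
var MAX_DOC_BYTES = 16 * 1024 * 1024;

// Hypothetical estimate of how many {url, count} entries fit in one document.
// overheadBytes is an assumed per-entry cost (field names, type tags, the
// counter itself); measure real documents before relying on it.
function maxEmbeddedEntries(avgUrlBytes, overheadBytes) {
  return Math.floor(MAX_DOC_BYTES / (avgUrlBytes + overheadBytes));
}

// Assumed sizes: 70-byte URLs with roughly 30 bytes of overhead each
var capacity = maxEmbeddedEntries(70, 30);
```

Under these assumed sizes, a day with thousands of distinct URLs stays well below the limit, while a day with hundreds of thousands would not. In the legacy mongo shell, `Object.bsonsize(doc)` reports the actual size of a document.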

Separate collections require more work

    // finding a post + its comments is two queries and requires extra work
    // in your code to make it all pretty (or your ODM might do it for you)
    db.posts.find({_id: 9001});
    db.comments.find({post_id: 9001})

Embedded documents are easy and fast (single seek)

    // finding a post + its comments
    db.posts.find({_id: 9001});

No big differences in inserts and updates

    // separate collection insert and update
    db.comments.insert({post_id: 43, title: 'i hate unicrons', user: 'dracula'});
    db.comments.update({_id: 4949}, {$set: {title: 'i hate unicorns'}});

    // embedded document insert and update
    db.posts.update({_id: 43}, {$push: {comments: {title: 'lol @ emo vampire', user: 'paul'}}})
    // this specific update requires that we store an _id with each comment
    db.posts.update({'comments._id': 4949}, {$inc: {'comments.$.votes': 1}})

So, separate collections are good if you need to select individual documents, need more control over querying, or have huge documents. Embedded documents are good when you want the entire document, the document with a $slice of comments, or with no comments at all. As a general rule, if you have a lot of "comments" or if they are large, a separate collection might be best. Smaller and/or fewer documents tend to be a natural fit for embedding.
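Applied to the analytics case in the question, the separate-documents option could look roughly like the sketch below: one counter document per site, URL and day, incremented with an upsert. The collection name `visits`, the field names, and both helper functions are assumptions for illustration; the site, path and date are placeholders.

```javascript
// Hypothetical builder for a one-document-per-URL-per-day counter.
// Field names (site, url, day, visits) are assumptions for illustration.
function visitUpsert(site, url, day) {
  return {
    filter: {site: site, url: url, day: day},
    update: {$inc: {visits: 1}},
    // upsert creates the counter document on the first visit of the day
    options: {upsert: true}
  };
}

// Hypothetical query for everything one site recorded on a given day.
function dayQuery(site, day) {
  return {site: site, day: day};
}

var op = visitUpsert('example.com', '/pricing', '2015-06-01');
// In the shell this would be applied as:
//   db.visits.update(op.filter, op.update, op.options)
// and the previous day's data read with:
//   db.visits.find(dayQuery('example.com', '2015-06-01'))
```

Whether this beats a single per-day document with embedded counters depends on how many URLs a day produces and how the data is read back, so it is worth trying both.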

Remember, you can always change your mind. Trying both is the best way to learn.

