
In MongoDB's model, schema design lets you keep the data you would otherwise join together in a single document, so you can grab it all at the same time. This obviously has limitations for many-to-many relationships where the access pattern isn't clear, but often these tradeoffs can work.



You would (should?) have a well defined schema with MongoDB. I know some people see this as one of the values of it, but personally it's not why I use it.

It may also mean poor schema design. You can absolutely model relational data in MongoDB without relying on joins. If you want to normalize your data don't pick Mongo.

I honestly don't think I've ever seen a valid use case for Mongo. If you're going to query your data, you have to know what fields you're looking for, right? So why not create a schema that has those fields?

Really, the #1 reason to use MongoDB (if you're me, anyway) is to save development time associated with making your relational schema start small and change as your new app progresses. I feel a smug sense of joy every time I add a field somewhere, or delete another, or create some kind of nested document. It's taken me a while to really understand how many compromises I used to make because changing schemas is a pain in the ass.

Simplified queries, though, are a knock against Mongo. Joins are great and I would like to do joins on my Mongo documents, but I end up having to replicate a lot of that in code. Sure, it's nice that a document can be more complex and you don't spend a lot of time moving things into tables that are really part of the same record. It's nice because it's not forced, though, not because keeping data in different tables is always the wrong way to do things.
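For anyone wondering what "replicating a join in code" looks like, here is a minimal sketch in plain Python. It uses lists of dicts in place of real collections, and the `users`/`posts` shapes and the `join_posts_with_authors` helper are made up for illustration, not taken from any driver.

```python
# Two hypothetical "collections", represented as lists of dicts.
users = [
    {"_id": 1, "name": "Ada"},
    {"_id": 2, "name": "Grace"},
]
posts = [
    {"_id": 10, "author_id": 1, "title": "On schemas"},
    {"_id": 11, "author_id": 2, "title": "On joins"},
    {"_id": 12, "author_id": 1, "title": "On documents"},
]


def join_posts_with_authors(posts, users):
    """Hand-written hash join: index users by _id, then attach each author."""
    users_by_id = {user["_id"]: user for user in users}
    return [
        # .get() leaves "author" as None when the reference dangles,
        # which is the integrity check a database would otherwise do.
        {**post, "author": users_by_id.get(post["author_id"])}
        for post in posts
    ]


joined = join_posts_with_authors(posts, users)
```

Against a real database this would be two queries plus the merge step, which is roughly the work a SQL join does for you.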


If you are doing a lot of joins in Mongo - you're doing it wrong. The whole purpose of a document DB is storing your model with the related data as one document.
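To make "related data as one document" concrete, here is a rough sketch using a plain Python dict rather than a real MongoDB driver. The `order` shape, its field names, and the `order_total` helper are hypothetical.

```python
# Hypothetical order document: the customer and line items live inside
# the order, so a single fetch returns everything needed to render it.
order = {
    "_id": 1,
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "A-1", "qty": 2, "price_cents": 500},
        {"sku": "B-7", "qty": 1, "price_cents": 1250},
    ],
}


def order_total(doc):
    """Sum the embedded line items; no second lookup is needed."""
    return sum(item["qty"] * item["price_cents"] for item in doc["items"])


print(order_total(order))  # 2250
```

The cost is duplication: if the same customer appears in many orders, their details are copied into each one.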

The couple of times I've made something with Mongo, I've always just found myself defining a schema in the source code anyway. As long as some code eventually reads the data there's no such thing as "no schema", there's just "multiple" or "less precise" schemas: https://utcc.utoronto.ca/~cks/space/blog/programming/Databas...

All this is to say: from a software engineering standpoint I like defining my schema rigidly at the DB level. The added flexibility that NoSQL provides is not something I actually find desirable. That said, I've never had to share one of these DBs with other programmers.
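A small sketch of what "defining a schema in the source code anyway" tends to look like, assuming Python and a made-up `User` shape. Nothing here is MongoDB-specific; the raw document is just a dict.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    """Hypothetical application-level schema for a stored user document."""
    name: str
    email: str
    tags: List[str] = field(default_factory=list)


def user_from_document(doc):
    """Build a User from a raw dict, rejecting documents missing required fields."""
    missing = [key for key in ("name", "email") if key not in doc]
    if missing:
        raise ValueError(f"document missing fields: {missing}")
    return User(name=doc["name"], email=doc["email"], tags=list(doc.get("tags", [])))
```

The database never sees this definition, so every program that reads the collection needs its own copy of it, or at least agreement with it.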


In a word the reason MongoDB was and is popular is "objects." Objects are hard to store and retrieve in relational databases, despite countless attempts to make it work.

In MongoDB you just pop them into a table and fetch them back as needed. You don't need to define schema. In fact, you don't even need to define tables. MongoDB creates them automatically. It's still surprising to me how many people don't see how great this is for full stack developers. It makes up for a lot of other sins.


Wouldn't you just use nested documents in mongo though?

I'll give you a reason not to use Mongo that not many people mention. Schema definition acts as a form of documentation for your database. As someone who has come into a legacy project built on Mongo, it's a nightmare trying to work out the structure of the database, especially as there are redundant copies of some data in different collections.

The reality of MongoDB is that it's a very specific use case of nonrelational data, which is very rare these days. I'm sad that so many people get looped into Mongo with a MEAN/MERN stack when those apps are almost always the CRUD apps that would benefit from SQL. Why force yourself to maintain a schema implicitly for relational data when you can get error messaging and explicit schemas?

I think MongoDB is a great fit for the right problem. I've been coding for 4 years and have yet to find a problem that fit Mongo best.

As far as quick prototyping goes, this article articulates well why that really isn't the case. Going back to the author's post here on HN, that's a point I would really disagree with the CTO on as well.


I think there are two separate aspects that get conflated into one.

1) Document database - rather than a strict, rigid schema, you can store nested JSON documents in tables/collections. Or the idea of a soft schema, where the whole database doesn't need to be blocked for a schema change and you have some leeway in integrity.

2) Relational database - the ability to make complex SQL queries that join data from multiple tables.

MongoDB has some support for joining, but it doesn't have a SQL variant. If your data is mostly a key:val store then it's great. You can shard it and have replicas. It's easy to make a fast, reliable backend with MongoDB. Many popular sites run on a MongoDB backend.

However, with the new JSON types in MySQL and Postgres, they too support inserting documents and querying subkeys. They can be sharded and replicated (albeit with a bit more configuration).

Couchbase, which is like Mongo in its document store capabilities, has N1QL, which offers the agility of SQL and the flexibility of JSON.

So like any tool, it has its tradeoffs.

Then again, kudos to the author for evoking our reptilian brains: "Never use MongoDB" incites emotions and gets you to the top of HN. If it had been called "When to use MongoDB", it wouldn't have gotten the same reaction.


Awesome, thanks.

I'm actually in the opposite boat from the author; I'm trying to find a good excuse to start using Mongo. I need to get better at modeling data without schemas... I don't feel comfortable using it until I get better at the design aspect.


My experience has been that many applications (web apps in particular) use a relational database to break up documents into an SQL schema so they can be indexed, then assemble them again when needed. Mongo really ratchets down the friction on that operation. Instead of spending time building a schema and writing lots and lots of insert and update statements, you just build a JSON object and send it over.
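Here is a minimal sketch of the "break up, then assemble again" round trip, using Python's built-in `sqlite3` with an in-memory database. The `posts`/`post_tags` tables and the `load_post` helper are invented for the example.

```python
import sqlite3

doc = {"id": 1, "title": "Hello", "tags": ["mongo", "sql"]}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE post_tags (post_id INTEGER, tag TEXT)")

# "Break up" the document into rows across two tables.
conn.execute("INSERT INTO posts (id, title) VALUES (?, ?)", (doc["id"], doc["title"]))
conn.executemany(
    "INSERT INTO post_tags (post_id, tag) VALUES (?, ?)",
    [(doc["id"], tag) for tag in doc["tags"]],
)


def load_post(conn, post_id):
    """Reassemble the document from its rows."""
    row = conn.execute(
        "SELECT id, title FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    tags = [
        tag
        for (tag,) in conn.execute(
            "SELECT tag FROM post_tags WHERE post_id = ? ORDER BY rowid", (post_id,)
        )
    ]
    return {"id": row[0], "title": row[1], "tags": tags}
```

With a document store the whole `doc` would be written and read back as is, which is the friction being described. What the relational version buys you is that `post_tags` can be indexed and queried on its own.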

I have extensive experience with Lotus Domino and CouchDB.

I know enough about MongoDB to know it's not all that much different as far as design and usage patterns go. This link [1] tells me that Mongo's data structure is indeed basically Key/Value.

The schema you are talking about is not the schema I mean. What I was talking about is that for any nontrivial Key/Value based database system, it would be prudent to keep a recipe of how to normalize the data to 3rd Normal Form. Keeping this 3rd Normal Form schema around would greatly ease many troubles that arise from using NoSQL databases.

So what were you thinking of when you said "schema": is it a "relational" (normalized) schema, or is it just the recipe that tells you what particular fields are for?

[1]: http://www.mongodb.org/display/DOCS/Schema+Design


Check my comment history, I've given several use cases. Basically, Mongo is nice if you don't need a lot of relational stuff but do have lots of arbitrary data to store. A good example is time series data where the format changes over time -- often it's not a good idea to go back and convert old data (sometimes it's not even possible). Mongo makes it really easy to support multiple schemas if your business requires it, rather than having to maintain arbitrary numbers of different tables.

> I'll admit I'm a non-believer, but every time I see "Schemaless" in MongoDB, I think "oh, so you're implementing schema in your application?"

I think that is arguably one of the selling points of MongoDB. Yes, you do implement schema in your application, but you should be doing that knowingly and embracing both the costs and benefits.

The benefit is that you can very quickly change your "schema" since it's just however you're choosing to represent data in memory (through your objects or whatever). It also allows you to have part of your application running on one "version" of the schema while another part of the application catches up.

The tradeoff is that you have to manage all this yourself. MongoDB does not know about your schema, nor does it want to, nor should it. It affords you a lot of power, but then you have to use it responsibly and understand what safety nets are not present (which you may be used to from a traditional RDBMS).

To your point about migrations, there are a few different strategies. You can do a "big bang" migration where you change all documents at once. (I would argue that if you require all documents to be consistent at all times, you should not be using MongoDB). A more "Mongo" approach is to migrate data as you need to; e.g. you pull in an older document and at that time add any missing fields.

So yes, in MongoDB, the schema effectively lives in your application. But that is by design and to use MongoDB effectively that's something you have to embrace.
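The migrate-as-you-read strategy mentioned above can be sketched roughly like this. It assumes Python, a plain dict standing in for a collection, and a made-up `schema_version` field with a hypothetical v2 change; none of these names come from MongoDB itself.

```python
CURRENT_VERSION = 2


def upgrade(doc):
    """Return a copy of doc brought up to CURRENT_VERSION."""
    upgraded = dict(doc)
    version = upgraded.get("schema_version", 1)  # unversioned documents count as v1
    if version < 2:
        # Hypothetical v2 change: split "name" into first and last name.
        first, _, last = upgraded.pop("name", "").partition(" ")
        upgraded["first_name"] = first
        upgraded["last_name"] = last
        upgraded["schema_version"] = 2
    return upgraded


def load_user(store, user_id):
    """Read a document, upgrading it and writing it back if it is an old version."""
    doc = store[user_id]
    if doc.get("schema_version", 1) < CURRENT_VERSION:
        doc = upgrade(doc)
        store[user_id] = doc  # persist the migrated shape
    return doc
```

Documents that are never read stay in the old shape indefinitely, so the application has to keep every upgrade step around, or run a one-off sweep before retiring one.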


> agreed, you can have reference fields to other collections and filter by them or query related data.

Yes, it has that, but that's an anti-pattern. What I'm referring to is schema design. In Mongo you denormalize and store relational data in the same document as references. In SQL a record might be split up into 3 tables; in Mongo that should all live in the same document. If you can't embed and need lookups, you should avoid Mongo and other NoSQL DBs.


The whole thing sounds like a terrible idea. Why would you want to put all the data from at least 65 different applications in the same table? In MongoDB? To save cost?

I simply don't get it. It doesn't make sense to me, but perhaps they know something we don't.


An excellent and practical article. I do want to emphasize one thing, though, since I feel like the article almost seemed to downplay its significance:

> MongoDB does not support joins; If you need to retrieve data from more than one collection you must do more than one query ... you can generally redesign your schema ... you can de-normalize your data easily.

This is a much larger issue than it seems. Nested collections aren't first-class objects in MongoDB: the $ operator for querying into arrays only goes one level deep, amongst its other issues, meaning that oftentimes you must break things out into separate collections. This doesn't work either, though, as there are no cross-collection transactions, so if you need to break things into separate collections, you can't guarantee a write to each collection will go through properly. (Though, I suppose if you're using the latest version, you could lock your whole database.)
