MongoDB Basics: Documents, Collections & CRUD Operations
Prerequisites: Understanding of NoSQL concepts. See NOSQL 01 Introduction.
This guide covers MongoDB’s core concepts by comparing them to SQL Server equivalents you already know.
[!NOTE] Environment: All commands in this article are executed in MongoDB Shell (mongosh) — the official CLI tool. You can also run these commands in MongoDB Compass’s built-in shell. The syntax is identical.
Part A: Core Concepts
1. Terminology Mapping
| SQL Server | MongoDB | Description |
|---|---|---|
| Database | Database | Container for collections |
| Table | Collection | Container for documents |
| Row | Document | Single data record (JSON) |
| Column | Field | Key in a document |
| Primary Key | _id (ObjectId) | Unique identifier |
| JOIN | $lookup / Embedding | Relating data |
| Index | Index | Same concept! |
2. Document: The Building Block
A Document is a JSON object (technically BSON — Binary JSON).
{
"_id": ObjectId("507f1f77bcf86cd799439011"),
"name": "Alice Chen",
"age": 28,
"email": "alice@example.com",
"address": {
"city": "Taipei",
"zip": "10617"
},
"hobbies": ["reading", "hiking", "coding"],
"createdAt": ISODate("2024-03-15T10:30:00Z")
}
Why Documents Beat Rows
graph LR
subgraph "SQL: Multiple Tables + JOIN"
U[Users Table]
A[Addresses Table]
H[Hobbies Table]
U --> A
U --> H
end
subgraph "MongoDB: Single Document"
D["{ user, address, hobbies[] }"]
end
style D fill:#27ae60,color:#fff
| Feature | SQL Row | MongoDB Document |
|---|---|---|
| Nested objects | ❌ Need separate table | ✅ Embed directly |
| Arrays | ❌ Need separate table | ✅ Native support |
| Schema | Fixed (must ALTER TABLE) | Flexible |
3. Collection: The Container
A Collection is a group of documents, like a table but without an enforced schema by default.
// Two documents in same collection with different structures
db.products.insertMany([
{ name: "Laptop", cpu: "i7", ram: "16GB" }, // Electronics
{ name: "T-Shirt", size: "L", color: "Blue" } // Clothing
]);
// This is VALID in MongoDB!
4. ObjectId: Distributed Primary Key
Every document has a unique _id field. By default, MongoDB generates an ObjectId.
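507f1f77 bcf86cd799 439011
|______| |________| |____|
Timestamp   Random   Counter
(4 bytes) (5 bytes) (3 bytes)
The 5-byte random value is generated once per process. Older descriptions split it into a 3-byte machine ID and a 2-byte process ID.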
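Because the leading 4 bytes are a Unix timestamp in seconds, the creation time can be read straight out of the hex string. The sketch below is plain JavaScript and needs no database connection; `objectIdToDate` is a hypothetical helper name, not part of MongoDB.

```javascript
// Hypothetical helper: extract the creation time embedded in an ObjectId.
// Assumes `hex` is the 24-character hexadecimal form of the id.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // First 8 hex characters = 4 bytes = seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdToDate("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

In mongosh, the built-in `ObjectId("507f1f77bcf86cd799439011").getTimestamp()` should report the same instant.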
Why not auto-increment (1, 2, 3)?
| Auto-Increment | ObjectId |
|---|---|
| Needs central coordinator | Each server generates independently |
| Bottleneck at scale | No coordination needed |
| IDs collide when databases are merged | Unique across servers with very high probability |
Part B: CRUD Operations
5. Create (Insert)
Insert One Document
db.users.insertOne({
name: "Bob",
email: "bob@example.com",
age: 30
});
// Returns: { acknowledged: true, insertedId: ObjectId("...") }
Insert Multiple Documents
db.users.insertMany([
{ name: "Carol", age: 25 },
{ name: "Dave", age: 35 },
{ name: "Eve", age: 28 }
]);
// Returns: { acknowledged: true, insertedIds: { 0: ObjectId(...), 1: ObjectId(...), 2: ObjectId(...) } }
SQL Equivalent:
INSERT INTO users (name, email, age) VALUES ('Bob', 'bob@example.com', 30);
6. Read (Find)
[!NOTE] Cursor vs Array: In the shell, find() automatically prints the first 20 documents. Technically, though, it returns a Cursor (a pointer to the results), not an array. In driver code (Node.js, Python), you must iterate the cursor, for example with .toArray() in Node.js or a loop, to access the data.
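As a rough analogy for the cursor behaviour described in the note, the sketch below uses a plain JavaScript generator. `fakeCursor` is a hypothetical stand-in written for illustration; it is not a MongoDB API and it skips batching and network access entirely.

```javascript
// Hypothetical stand-in for a driver cursor: a generator that hands out
// documents one at a time instead of building the whole result up front.
function* fakeCursor(docs) {
  for (const doc of docs) {
    yield doc;
  }
}

const cursor = fakeCursor([{ name: "Carol" }, { name: "Dave" }, { name: "Eve" }]);

const isArray = Array.isArray(cursor); // false: a cursor is not an array
const first = cursor.next().value;     // pull a single document
const remaining = [...cursor];         // drain what is left into a real array

console.log(isArray, first.name, remaining.length);
```

With the Node.js driver, the corresponding steps would be `for await (const doc of cursor)` or `await cursor.toArray()`.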
Find All
db.users.find(); // SELECT * FROM users
db.users.find().pretty(); // Formatted output (mongosh already pretty-prints by default)
Find with Filter (WHERE)
db.users.find({ age: 30 }); // WHERE age = 30
db.users.find({ age: { $gt: 25 } }); // WHERE age > 25
db.users.find({ age: { $gte: 25, $lte: 35 } }); // WHERE age BETWEEN 25 AND 35
db.users.find({ name: /^A/ }); // WHERE name LIKE 'A%'
Query Operators
[!WARNING] Case Sensitivity: MongoDB queries are case-sensitive! { name: "bob" } will NOT match "Bob". Use $regex with $options: "i" for case-insensitive search, or store normalized lowercase values.
| MongoDB | SQL | Example |
|---|---|---|
| $eq | = | { age: { $eq: 30 } } |
| $ne | != | { status: { $ne: "inactive" } } |
| $gt | > | { age: { $gt: 25 } } |
| $gte | >= | { age: { $gte: 25 } } |
| $lt | < | { price: { $lt: 100 } } |
| $lte | <= | { price: { $lte: 100 } } |
| $in | IN | { status: { $in: ["active", "pending"] } } |
| $nin | NOT IN | { status: { $nin: ["deleted"] } } |
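The case-sensitivity warning above can be checked without a database. This sketch is plain JavaScript: it builds the filter shape a query would use and evaluates the pattern with a JavaScript RegExp, on the assumption that this simple anchored pattern behaves the same way in MongoDB's regex engine.

```javascript
// The filter you would pass to find() for a case-insensitive exact match.
const filter = { name: { $regex: "^bob$", $options: "i" } };

// Evaluate the same pattern locally with a JavaScript RegExp (an approximation).
const pattern = new RegExp(filter.name.$regex, filter.name.$options);

const stored = "Bob";
const exactMatch = stored === "bob";     // plain equality is case-sensitive
const regexMatch = pattern.test(stored); // the "i" option ignores case

console.log({ exactMatch, regexMatch });
```

In mongosh the same object would be used as `db.users.find(filter)`. Case-insensitive regex queries generally cannot use an index efficiently, which is one reason to prefer a stored lowercase copy of the field on large collections.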
Projection (SELECT specific fields)
db.users.find(
{ age: { $gt: 25 } }, // Filter
{ name: 1, email: 1, _id: 0 } // Projection: include name, email; exclude _id
);
// SQL: SELECT name, email FROM users WHERE age > 25
[!NOTE] Projection Rule: Except for _id, you cannot mix include (1) and exclude (0) in the same projection. For example, { name: 1, age: 0 } is invalid and will throw an error. Either include only the fields you want, or exclude only the fields you don’t want.
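The projection rule can be expressed as a small check. The function below is a hypothetical plain-JavaScript sketch, not MongoDB's own validation; it assumes the projection uses only 0/1 values and names at least one field other than _id.

```javascript
// Hypothetical checker that mirrors the projection rule described above.
// Returns "include" or "exclude", or throws when the two styles are mixed.
function projectionMode(projection) {
  const flags = Object.entries(projection)
    .filter(([field]) => field !== "_id") // _id is the one allowed exception
    .map(([, value]) => Boolean(value));

  const hasInclude = flags.some((flag) => flag);
  const hasExclude = flags.some((flag) => !flag);

  if (hasInclude && hasExclude) {
    throw new Error("cannot mix inclusion and exclusion in one projection");
  }
  return hasInclude ? "include" : "exclude";
}

console.log(projectionMode({ name: 1, email: 1, _id: 0 })); // include
console.log(projectionMode({ password: 0, ssn: 0 }));       // exclude
```

The second call shows the exclude-only style: every field is returned except the ones listed.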
Sorting & Limiting
db.users.find().sort({ age: -1 }); // ORDER BY age DESC
db.users.find().sort({ age: 1 }).limit(5); // SELECT TOP 5 ... ORDER BY age ASC
db.users.find().skip(10).limit(5); // OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY (pagination)
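For page-based pagination, the skip value follows from the page number: skip = (page - 1) × pageSize. A minimal sketch in plain JavaScript, where `pageWindow` is a hypothetical helper name:

```javascript
// Hypothetical helper: convert a 1-based page number into skip/limit values.
function pageWindow(page, pageSize) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return { skip: (page - 1) * pageSize, limit: pageSize };
}

const { skip, limit } = pageWindow(3, 5);
console.log(skip, limit); // 10 5

// In mongosh the values would be used like this (sorting first keeps pages stable):
// db.users.find().sort({ _id: 1 }).skip(skip).limit(limit);
```

Sorting before skipping matters because, without an explicit sort, the order of results is not guaranteed and pages may overlap.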
Find One
db.users.findOne({ email: "bob@example.com" });
// Returns single document or null
7. Update
[!IMPORTANT] Why $set? In MongoDB, updateOne() requires update operators such as $set; a plain document without operators is rejected. replaceOne() (and update() in older shells and drivers) would instead replace the entire document, leaving only the fields you specified! Always use $set to modify specific fields safely.
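The difference between a $set update and a full replacement can be mimicked on an ordinary object. The two functions below are hypothetical plain-JavaScript stand-ins, not driver calls, and `applySet` handles top-level fields only (no dotted paths).

```javascript
const original = { _id: 1, name: "Bob", email: "bob@example.com", age: 30 };

// Mimics { $set: changes }: only the listed top-level fields are overwritten.
function applySet(doc, changes) {
  return { ...doc, ...changes };
}

// Mimics replaceOne(): the new body replaces everything except _id.
function applyReplace(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

const afterSet = applySet(original, { age: 31 });
const afterReplace = applyReplace(original, { age: 31 });

console.log(afterSet);     // name and email are still present
console.log(afterReplace); // only _id and age remain
```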
Update One Document
db.users.updateOne(
{ name: "Bob" }, // Filter (WHERE)
{ $set: { age: 31, city: "NYC" } } // Update
);
// SQL: UPDATE users SET age = 31, city = 'NYC' WHERE name = 'Bob'
Update Multiple Documents
db.users.updateMany(
{ age: { $lt: 25 } },
{ $set: { category: "young" } }
);
Update Operators
| Operator | Action | Example |
|---|---|---|
| $set | Set field value | { $set: { status: "active" } } |
| $unset | Remove field | { $unset: { tempField: "" } } |
| $inc | Increment number | { $inc: { views: 1 } } |
| $push | Add to array (allows duplicates) | { $push: { tags: "new" } } |
| $addToSet | Add to array (only if not exists) | { $addToSet: { tags: "new" } } |
| $pull | Remove from array | { $pull: { tags: "old" } } |
| $rename | Rename field | { $rename: { old: "new" } } |
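The three array operators in the table differ mainly in how they treat duplicates. The functions below are hypothetical plain-JavaScript equivalents for illustration, assuming arrays of primitive values such as strings.

```javascript
// Mimics $push: always appends, even if the value is already present.
function push(array, value) {
  return [...array, value];
}

// Mimics $addToSet: appends only when the value is not already present.
function addToSet(array, value) {
  return array.includes(value) ? array : [...array, value];
}

// Mimics $pull: removes every element equal to the value.
function pull(array, value) {
  return array.filter((item) => item !== value);
}

const tags = ["new", "old"];
console.log(push(tags, "new"));     // "new" now appears twice
console.log(addToSet(tags, "new")); // unchanged
console.log(pull(tags, "old"));     // "old" removed
```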
Upsert (Update or Insert)
db.users.updateOne(
{ email: "new@example.com" },
{ $set: { name: "New User", age: 20 } },
{ upsert: true } // Insert if not found
);
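To make the two upsert paths concrete, here is an in-memory sketch in plain JavaScript. `upsertOne` and its return shape are hypothetical; the sketch assumes an equality-only filter and skips _id generation.

```javascript
// Hypothetical in-memory stand-in for updateOne(filter, { $set }, { upsert: true }).
function upsertOne(collection, filter, setFields) {
  const matchesFilter = (doc) =>
    Object.entries(filter).every(([field, value]) => doc[field] === value);

  const existing = collection.find(matchesFilter);
  if (existing) {
    Object.assign(existing, setFields); // update path
    return { matchedCount: 1, upserted: false };
  }
  // Insert path: the new document combines the filter fields and the $set fields.
  collection.push({ ...filter, ...setFields });
  return { matchedCount: 0, upserted: true };
}

const users = [];
const firstRun = upsertOne(users, { email: "new@example.com" }, { name: "New User", age: 20 });
const secondRun = upsertOne(users, { email: "new@example.com" }, { age: 21 });

console.log(firstRun, secondRun, users);
```

The first call inserts because nothing matches; the second finds that document and updates it, so the collection still holds a single entry.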
8. Delete
Delete One
db.users.deleteOne({ name: "Bob" });
// SQL: DELETE FROM users WHERE name = 'Bob' (first match only)
Delete Many
db.users.deleteMany({ status: "inactive" });
// SQL: DELETE FROM users WHERE status = 'inactive'
Delete All (Dangerous!)
db.users.deleteMany({}); // DELETE FROM users
db.users.drop(); // DROP TABLE users
Part C: Schema Flexibility
9. Schema-less Doesn’t Mean No Schema
MongoDB allows flexible schemas, but best practice is to keep a consistent structure across documents.
Optional Schema Validation
db.createCollection("users", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email"],
properties: {
name: { bsonType: "string" },
email: { bsonType: "string", pattern: "^.+@.+$" },
age: { bsonType: "int", minimum: 0 }
}
}
}
});
10. Data Types
| Type | Example | Notes |
|---|---|---|
| String | "hello" | UTF-8 |
| Number | 42, 3.14 | int32, int64, double |
| Boolean | true, false | |
| Date | ISODate("2024-03-15") | UTC timestamp |
| Array | [1, 2, 3] | Any types allowed |
| Object | { nested: "doc" } | Embedded document |
| ObjectId | ObjectId("...") | 12-byte unique ID |
| Null | null | Explicit null value; a missing field is different, although { field: null } matches both |
Part D: Practical Examples
11. E-commerce Product Catalog
// Insert a product with variable attributes
db.products.insertOne({
name: "MacBook Pro 16",
category: "Electronics",
price: 2499,
specs: {
cpu: "M3 Pro",
ram: "18GB",
storage: "512GB SSD"
},
tags: ["laptop", "apple", "professional"],
stock: 50,
createdAt: new Date()
});
// Query products
db.products.find({
category: "Electronics",
price: { $lte: 3000 },
"specs.ram": "18GB"
}).sort({ price: 1 });
12. User with Embedded Addresses
db.users.insertOne({
name: "Alice",
email: "alice@example.com",
addresses: [
{ type: "home", city: "Taipei", zip: "10617" },
{ type: "work", city: "Hsinchu", zip: "30010" }
]
});
// Find users with Taipei address
db.users.find({ "addresses.city": "Taipei" });
// Update specific address
db.users.updateOne(
{ email: "alice@example.com", "addresses.type": "home" },
{ $set: { "addresses.$.zip": "10618" } }
);
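Dotted paths such as "addresses.city" walk into nested objects and fan out across array elements. The resolver below is a hypothetical plain-JavaScript sketch of that idea, not MongoDB's matching logic; it ignores numeric index steps like "addresses.0.city" and compares primitive values only.

```javascript
// Hypothetical resolver: collect every value a dotted path can reach.
// When a step lands on an array, the rest of the path is applied to each element.
function valuesAtPath(node, steps) {
  if (Array.isArray(node) && steps.length > 0) {
    return node.flatMap((item) => valuesAtPath(item, steps));
  }
  if (steps.length === 0) return [node];
  if (node === null || typeof node !== "object") return [];
  const [head, ...rest] = steps;
  return head in node ? valuesAtPath(node[head], rest) : [];
}

// True when any value reached by the path equals `expected`.
function matchesPath(doc, dottedPath, expected) {
  return valuesAtPath(doc, dottedPath.split(".")).includes(expected);
}

const alice = {
  name: "Alice",
  addresses: [
    { type: "home", city: "Taipei" },
    { type: "work", city: "Hsinchu" },
  ],
};
const laptop = { name: "MacBook Pro 16", specs: { ram: "18GB" } };

console.log(matchesPath(alice, "addresses.city", "Taipei")); // true
console.log(matchesPath(laptop, "specs.ram", "18GB"));       // true
console.log(matchesPath(alice, "addresses.city", "Tainan")); // false
```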
[!TIP] Dot Notation: The syntax "addresses.city" is called Dot Notation, MongoDB’s way to query nested objects or array fields. SQL users may mistake this for a simple string, but it’s a path expression. Always wrap dot notation keys in quotes (e.g., { "specs.ram": "18GB" }).
Summary
CRUD Cheat Sheet
| Operation | MongoDB | SQL |
|---|---|---|
| Create | insertOne() / insertMany() | INSERT |
| Read | find() / findOne() | SELECT |
| Update | updateOne() / updateMany() | UPDATE |
| Delete | deleteOne() / deleteMany() | DELETE |
Key Differences from SQL
| SQL | MongoDB |
|---|---|
| Fixed schema | Flexible schema |
| Tables + JOINs | Embedded documents |
| Auto-increment ID | ObjectId (distributed) |
| NULL for missing | Field simply doesn’t exist |
Best Practices
- Use consistent field names across documents
- Create indexes on frequently queried fields
- Embed related data when read together
- Use references when data changes independently
- Validate schemas in production applications
💡 Practice Questions
Conceptual
- Compare MongoDB Document vs SQL Row. What can a document do that a row cannot?
- Why does MongoDB use ObjectId instead of auto-increment integers for primary keys?
- What is the difference between updateOne() with $set vs replacing the entire document?
Hands-on
// Write a MongoDB query to:
// 1. Find all users older than 25, sorted by age descending, limit 10
// 2. Update user "Bob" to add a new hobby "swimming" to his hobbies array
💡 View Answer
// 1. Find users
db.users.find({ age: { $gt: 25 } }).sort({ age: -1 }).limit(10);
// 2. Update Bob's hobbies
db.users.updateOne(
{ name: "Bob" },
{ $push: { hobbies: "swimming" } }
);
Scenario (Schema Design)
Question: A developer stores only product_id inside each order document and references the product collection for price. When prices change, what problem occurs? How would you design it differently?
💡 View Answer
Problem: Historical order prices get retroactively changed! A customer who paid $100 would see $120 if the product price was updated later.
Solution — Snapshotting:
Copy the price and name into the order document at purchase time:
// ❌ Bad: Only reference (price changes affect history)
{ orderId: 1, product_id: ObjectId("...") }
// ✅ Good: Snapshot at order time (Embedding)
{
orderId: 1,
product_id: ObjectId("..."),
product_name: "MacBook Pro 16", // Copied at order time
price_at_purchase: 2499 // Frozen price
}
This is a key pattern covered in NOSQL 03 Schema Design — knowing when to embed vs reference.
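The snapshot step itself can be sketched in plain JavaScript. `buildOrder` is a hypothetical helper, and the string "p1" stands in for a real ObjectId.

```javascript
// Hypothetical helper: build an order that freezes product details at purchase time.
function buildOrder(orderId, product) {
  return {
    orderId,
    product_id: product._id,          // reference kept for lookups
    product_name: product.name,       // copied at order time
    price_at_purchase: product.price, // frozen price
  };
}

const product = { _id: "p1", name: "MacBook Pro 16", price: 2499 };
const order = buildOrder(1, product);

product.price = 2699; // a later price change in the catalog

console.log(order.price_at_purchase); // 2499
```

The order keeps the price the customer actually paid, while product_id still allows a lookup of the current catalog entry when needed.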