MongoDB: open-source, high-performance, document-oriented database
Jay Urbain, PhD
https://docs.mongodb.com/
Modern Databases
• Non-relational data stores (“NoSQL”)
  – Key/value
  – Horizontal scalability, no table joins
  – Hive, Dynamo, Bigtable, CouchDB, Redis, MongoDB, …
• Next-generation OLAP (OnLine Analytical Processing)
  – Dimensional data model
  – Column store
  – Vertica, Aster, Greenplum
• RDBMS OLTP (OnLine Transaction Processing)
  – Relational model
  – Transaction oriented
  – Oracle, MySQL, PostgreSQL
NoSQL Database Market
Non-relational Database Characteristics
• No joins and no complex transactions => horizontally scalable architectures.
• Transactions at the level of an individual table row or document.
• Flexible data models => schema on read.
• Many do not use SQL for queries; queries are expressed as, e.g., JSON objects.
• Many are moving to support some level of SQL.
• Improved ways to develop applications? Depends…
Non-relational Data Models
• Key/Value
  – Memcached, Amazon Dynamo
• Tabular
  – Google Bigtable, Impala
• Document oriented
  – MongoDB, CouchDB, other JSON stores
Relational vs. non-Relational
• BASE (Basically Available, Soft state, Eventual consistency) analysis of NoSQL.
• ACID (Atomicity, Consistency, Isolation, Durability) versus BASE.
• CAP theorem: get 2 of 3 of consistency, availability, partition tolerance.
Deciding Factors
• Use case
• Transactional support
• Ad hoc query
• Analytical processing
• Reliability
• Maintainability
• Ease of use
• Scalability
• Cost
Document-oriented database
• Document-oriented databases (DODs) are designed for storing, retrieving, and managing document-oriented information (semi-structured data).
• Popular with web applications.
• DODs are one of the main categories of NoSQL databases.
• DODs are a subclass of the key-value store NoSQL database.
• In a key-value store, the data is considered to be opaque to the database.
• A DOD relies on the internal structure of the document to extract metadata that the database engine uses for further optimization.
• Document databases contrast with traditional relational databases (RDBs).
• RDBs store data in separate tables that are defined a priori, and a single object may be spread across several tables.
• DODs store all information for a given object in a single instance in the database, and every stored object can be different from every other.
• This eliminates the need for object-relational mapping while loading data into the database.
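As a rough illustration of the contrast above, the sketch below assembles one document from rows that a relational design would keep in separate tables. The table and field names (`customers`, `orders`, `customer_id`) are hypothetical, and plain Python dicts stand in for both kinds of store.

```python
import json

# Hypothetical relational layout: one object spread across two tables.
customers = [{"customer_id": 1, "name": "Ada"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "monitor"},
]

def to_document(customer, order_rows):
    """Embed a customer's orders inside a single document."""
    return {
        "_id": customer["customer_id"],
        "name": customer["name"],
        "orders": [
            {"order_id": row["order_id"], "item": row["item"]}
            for row in order_rows
            if row["customer_id"] == customer["customer_id"]
        ],
    }

document = to_document(customers[0], orders)
print(json.dumps(document, indent=2))
```

Reading the customer back then needs no join: everything about the object travels in one document.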
MongoDB History
• Name: hu-mongo-us (from “humongous”).
• Most popular document-oriented database.
• Designed and developed by founders of DoubleClick, ShopWiki, Gilt Groupe, etc.
• GOAL: create a high-performance, fully consistent, horizontally scalable, general-purpose data store.
• MongoDB uses JSON-like documents with schemata.
• Coding started fall 2007.
• Open source (AGPL), written in C++.
• First production site March 2008.
• Current version: ~3.6.9 / 2018.
MongoDB
• MongoDB is a distributed database at its core, so high availability, horizontal scaling, and geographic distribution are built in and easy to use.
  – MongoDB scales horizontally using sharding.
  – MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system up and running in case of hardware failure.
• Stores data in flexible, JSON-like documents, meaning fields can vary from document to document and the data structure can be changed over time.
• The document model maps to the objects in your application code, making data easy to work with. No object-relational mapping.
• Ad hoc queries, indexing, and real-time aggregation provide ways to access and analyze your data.
• Field, range query, and regular expression searches.
• Fields in a MongoDB document can be indexed with primary and secondary indices.
• File storage
JSON-style Documents
• Represented as BSON: a binary-encoded serialization of JSON-like documents.
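To make the binary encoding concrete, here is a minimal sketch of a BSON encoder that handles only string and 32-bit integer fields. It follows the layout in the BSON specification (an int32 total length, typed elements, and a trailing null byte). The function names are illustrative; real applications would use the `bson` package that ships with PyMongo.

```python
import struct

def encode_element(name, value):
    """Encode one field as: type byte, null-terminated name, payload."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # 0x02 = UTF-8 string: int32 byte length (incl. null), then bytes.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, int):
        # 0x10 = 32-bit integer, little-endian; struct raises if out of range.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError(f"unsupported type: {type(value).__name__}")

def encode_document(doc):
    """Encode a flat dict as: int32 total size, elements, 0x00 terminator."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body

print(encode_document({"hello": "world"}))
# b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
```

The length prefixes are what let a database skip over fields without parsing them, which plain JSON text does not allow.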
Flexible Schemas
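A small sketch of what a flexible schema means in practice: the documents below could live in the same collection even though their fields differ. The field names are made up for illustration, and a plain list of dicts stands in for a collection.

```python
# Hypothetical documents from one collection; no two share the same fields.
people = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Grace", "languages": ["COBOL", "FORTRAN"]},
    {"_id": 3, "name": "Linus", "email": "linus@example.com", "age": 54},
]

# Schema on read: the reader decides how to treat a missing field.
emails = [p.get("email", "(none)") for p in people]
field_sets = {frozenset(p) for p in people}

print(emails)
print(len(field_sets), "distinct field sets")
```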
Replication
Auto-sharding
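A toy sketch of the idea behind sharding: each document is routed to one of several shards based on a hash of its shard key. This is not MongoDB's actual algorithm (MongoDB partitions data into chunks by a ranged or hashed shard key and balances those chunks across shards). The shard count and the `user_id` key are assumptions for illustration.

```python
import hashlib

NUM_SHARDS = 3  # assumed cluster size for this sketch

def shard_for(key_value, num_shards=NUM_SHARDS):
    """Map a shard key value to a shard index with a stable hash."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in range(1000):
    doc = {"user_id": user_id}
    shards[shard_for(doc["user_id"])].append(doc)

# Every document lands on exactly one shard.
print({i: len(docs) for i, docs in shards.items()})
```

A simple modulo scheme like this reshuffles most keys when the shard count changes, which is one reason real systems move data in chunks instead.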
Use Cases
• Good use cases
  – Scaling out
  – Caching
  – The Web
  – High volume
  – Simple data models
• Bad use cases
  – Highly transactional
  – Ad hoc business intelligence
  – Problems that require SQL
  – Complex relational data models
MongoDB Basics
• A collection is like a relational table.
• Collections contain documents.
• A document within a collection is like a record (row) within a table.
• Each document has an _id that is unique across all documents within a collection.
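The relationship between collections, documents, and `_id` can be sketched with a toy in-memory class. `ToyCollection` is hypothetical and only mimics the behavior described above; a real MongoDB collection generates an ObjectId when `_id` is missing, whereas this sketch uses a UUID string.

```python
import uuid

class ToyCollection:
    """In-memory stand-in for a collection: documents keyed by unique _id."""

    def __init__(self):
        self._docs = {}

    def insert_one(self, document):
        doc = dict(document)  # copy so the caller's dict is not modified
        doc.setdefault("_id", uuid.uuid4().hex)
        if doc["_id"] in self._docs:
            raise KeyError(f"duplicate _id: {doc['_id']!r}")
        self._docs[doc["_id"]] = doc
        return doc["_id"]

    def find_one(self, _id):
        return self._docs.get(_id)

tasks = ToyCollection()
first_id = tasks.insert_one({"title": "write slides"})
tasks.insert_one({"_id": "t2", "title": "review slides"})
print(tasks.find_one(first_id))
```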
JSON Documents
• Rich data models
• Seamlessly map to native programming language types
• Flexible for dynamic data
• Better data locality
Javascript Post - API
Find posts by author - API
Last ten posts - API
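The two queries named above can be sketched in plain Python over a list of post dicts. The `author` and `date` field names and the sample data are assumptions; the comments show the PyMongo calls that would express the same queries against a hypothetical `db.posts` collection.

```python
from datetime import datetime, timedelta

# Hypothetical posts; in MongoDB these would be documents in db.posts.
start = datetime(2018, 1, 1)
posts = [
    {"author": "jay" if i % 2 == 0 else "ann",
     "title": f"post {i}",
     "date": start + timedelta(days=i)}
    for i in range(25)
]

def find_by_author(docs, author):
    # PyMongo equivalent: db.posts.find({"author": author})
    return [d for d in docs if d["author"] == author]

def last_n(docs, n=10):
    # PyMongo equivalent: db.posts.find().sort("date", -1).limit(n)
    return sorted(docs, key=lambda d: d["date"], reverse=True)[:n]

print(len(find_by_author(posts, "jay")))
print([p["title"] for p in last_n(posts)][:3])
```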
RESTful Queries in MongoDB
• The Mongo model update function takes three arguments:
  – query: JSON object of matching properties that identifies the document to update
  – data: JSON object specifying the properties to update
  – callback: function that is called with the number of modified documents
• The data to update is retrieved from the request body, which is used to pass in larger chunks of data, often stored as a single JSON object.
• The JSON object passed in corresponds to the Mongo database schema defining the project documents and includes only the model properties to modify.
• Example: we can use curl to update a specific property, e.g., numberofsaves, in a specific project's data:
    $ curl -i -X PUT -H 'Content-Type: application/json' -d '{"numberofsaves": "272"}' http://localhost:3001/api/v1/projects/5593c8792fee421039c0afe6
• It sends a PUT request with JSON content to the project update endpoint.
• The -i option requests that the response headers be included in the output.
• The -X option specifies the HTTP method.
• The -H option adds a request header, here the content type.
• The -d option specifies the request body, i.e., the JSON object with the properties to modify.
• The routing URL includes the version number and ends with the Mongo database id of the project to update.
• curl prints the following response to this request:
    HTTP/1.1 202 Accepted
    Content-Type: text/plain; charset=utf-8
    Content-Length: 8
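The same PUT request can be built from Python with the standard library. The sketch below only constructs the request and inspects it; it does not send anything, since sending assumes the API server from the example is running on localhost:3001.

```python
import json
import urllib.request

url = "http://localhost:3001/api/v1/projects/5593c8792fee421039c0afe6"
body = json.dumps({"numberofsaves": "272"}).encode("utf-8")

request = urllib.request.Request(
    url,
    data=body,                                     # like curl -d
    method="PUT",                                  # like curl -X PUT
    headers={"Content-Type": "application/json"},  # like curl -H
)

print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read())
```

Note that `urllib` normalizes header names to a capitalized form, which is why the lookup uses `Content-type`.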
Add a new record
Javascript:
Python:
    tasks_results = mongo.db.tasks
    # insert_one replaces the older insert(), which PyMongo 4 removed
    _id = tasks_results.insert_one(task_).inserted_id
Delete a record
Javascript:
Python:
    tasks_results = mongo.db.tasks
    result = tasks_results.delete_one({"id": str(task_id)})
Demo MongoDB curl commands if time
JSON Serialization
json_util provides two helper methods, dumps and loads, that wrap the native JSON methods and provide explicit BSON conversion to and from JSON.
json_util – Tools for using Python’s json module with BSON documents
Example usage (serialization):
    >>> from bson import Binary, Code
    >>> from bson.json_util import dumps
    >>> dumps([{'foo': [1, 2]},
    ...        {'bar': {'hello': 'world'}},
    ...        {'code': Code("function x() { return 1; }", {})},
    ...        {'bin': Binary(b"\x01\x02\x03\x04")}])
    '[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'
Example usage (deserialization):
    >>> from bson.json_util import loads
    >>> loads('[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$scope": {}, "$code": "function x() { return 1; }"}}, {"bin": {"$type": "80", "$binary": "AQIDBA=="}}]')
    [{u'foo': [1, 2]}, {u'bar': {u'hello': u'world'}}, {u'code': Code('function x() { return 1; }', {})}, {u'bin': Binary('...', 128)}]
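To show roughly what these helpers do, the sketch below emulates the `$binary` convention using only Python's standard `json` module. It is not how `json_util` is implemented; `bson_default` and `bson_object_hook` are hypothetical names, and only `bytes` values are handled.

```python
import base64
import json

def bson_default(obj):
    """Encode bytes as a {"$binary": ..., "$type": ...} object."""
    if isinstance(obj, bytes):
        return {"$binary": base64.b64encode(obj).decode("ascii"), "$type": "00"}
    raise TypeError(f"cannot serialize {type(obj).__name__}")

def bson_object_hook(dct):
    """Turn {"$binary": ..., "$type": ...} objects back into bytes."""
    if "$binary" in dct:
        return base64.b64decode(dct["$binary"])
    return dct

text = json.dumps({"bin": b"\x01\x02\x03\x04"}, default=bson_default)
print(text)  # {"bin": {"$binary": "AQIDBA==", "$type": "00"}}

restored = json.loads(text, object_hook=bson_object_hook)
print(restored)
```

The `default` and `object_hook` hooks are the same extension points of the `json` module that a wrapper can use to carry types JSON itself lacks.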