Axibase Time Series Database
Axibase Time-Series Database (ATSD) is a clustered non-relational database for storing data generated by IT infrastructure. ATSD is specifically designed to store and analyze large amounts of statistical data collected at high frequency.
Prepared by Axibase

Database History
• 1970 – IBM introduced relational algebra for data processing.
• A Cambrian explosion of relational database management systems followed.
• 2000 – The first large-scale applications emerge, such as Google Search.
• 2004 – Google Bigtable, the first non-relational database built on a distributed file system.
• We are currently experiencing a Cambrian explosion of non-relational (a.k.a. NoSQL) databases.

Key Differences Between SQL and NoSQL
Features compared, SQL vs. NoSQL:
• High-level programming language (SQL)
• Transactions
• Query optimizer
• Non-key indexes

Key Differences Between SQL and NoSQL
• Scalability: SQL – TB; NoSQL – PB.
• Maximum cluster size: SQL – 48 (Oracle RAC); NoSQL – 1000+.
• Distributed
• Read time: SQL – depends on table size and indexes; NoSQL – linear.
• Write time: SQL – depends on table size and indexes; NoSQL – linear.
• Table schema: SQL – predetermined (column names, data types); NoSQL – raw bytes, with the schema determined by the application.

How Proven Is NoSQL Technology
NoSQL is the leading technology behind big data applications.
• Google – Search, Gmail, App Engine
• Yahoo/Microsoft – search
• Amazon – e-commerce, search, cloud computing (AWS DynamoDB)
• IBM BigInsights, Microsoft Azure HDInsight

Big Data Adoption
HBase behind Facebook Messages:
• 6+ billion messages per day
• 75+ billion R/W operations per day
• Peak throughput: 1.5 million R/W operations per second
• 2+ petabytes of data (6+ PB including replicas), with data growth of over 8 TB per day

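The figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted above, taken at their lower bounds; the variable names are illustrative, and decimal units (1 PB = 1000 TB) are an assumption.

```python
# Figures quoted on the slide, taken at their stated lower bounds.
OPS_PER_DAY = 75e9           # 75+ billion R/W operations per day
PEAK_OPS_PER_SEC = 1.5e6     # peak throughput, operations per second
DATA_PB = 2                  # 2+ PB of data
DATA_WITH_REPLICAS_PB = 6    # 6+ PB including replicas
GROWTH_TB_PER_DAY = 8        # data growth per day

SECONDS_PER_DAY = 24 * 60 * 60

# Average sustained throughput implied by the daily operation count.
avg_ops_per_sec = OPS_PER_DAY / SECONDS_PER_DAY

# How far the quoted peak sits above that average.
peak_to_avg = PEAK_OPS_PER_SEC / avg_ops_per_sec

# Number of stored copies implied by the two capacity figures.
replication_factor = DATA_WITH_REPLICAS_PB / DATA_PB

# Yearly growth, assuming decimal units (1 PB = 1000 TB).
growth_pb_per_year = GROWTH_TB_PER_DAY * 365 / 1000

print(f"average throughput: {avg_ops_per_sec:,.0f} ops/s")
print(f"peak / average:     {peak_to_avg:.2f}x")
print(f"replication factor: {replication_factor:.0f}")
print(f"growth:             {growth_pb_per_year:.2f} PB/year")
```

Under these assumptions the average load is roughly 868,000 operations per second, so the quoted peak is less than twice the average, and the capacity figures imply three stored copies of each record.
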
Big Data Adoption
IBM BigInsights behind Vestas:
• A wind energy company in Denmark is reducing the time to analyze petabytes of data from several weeks to 15 minutes to improve the accuracy of wind turbine placement.
• Stores 2.8 PB of company historical data together with over 178 external parameters: temperature, barometric pressure, humidity, precipitation, wind direction, wind velocity, etc.
• Stores precise data on weather over the past 11 years.
• Collects data from over 35,000 meteorological stations.

Big Data Adoption
HBase behind Explorys:
• Explorys uses HBase to enable search and analysis of patient populations, treatment protocols, and clinical outcomes.
• Stores over 275 billion clinical, financial, and operational data elements.
• Holds 48 million unique patient files.
• Collects data from over 340 hospitals and 300,000 healthcare providers.
• Pulls data from 22 integrated major healthcare systems.

Axibase Time Series Database
Scalability & Speed
• Collects billions of samples per day. Retains detailed data forever.
Features
• Combines database, rule engine, and visualization in one product.
Analytical Rule Engine
• Applies aggregate functions and filters on streaming data.
Integration
• Accepts data from any source based on industry-standard protocols.
Visualization
• Built-in portals with smart widgets.

Big Data for IT Monitoring
• Retain detailed data forever.
• Collect statistics at high frequency, for example every 15 seconds.
• Consolidate performance statistics from all systems into one database: facilities, network, storage, servers, applications, databases, transactions, service providers, user activity, etc.
• Monitor infrastructure based on abnormal deviations instead of manual thresholds.
• Apply statistical formulas to predict outages.
• Take advantage of a schema-less database to collect data from any source.

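As an illustration of monitoring by abnormal deviation rather than a fixed threshold, here is a minimal sketch that flags a value when it lies more than a chosen number of standard deviations from the recent mean. The function name, the 3-sigma default, and the sample data are assumptions made for the example; this is not ATSD's actual detection logic.

```python
from statistics import mean, stdev


def is_abnormal(history, value, max_sigma=3.0):
    """Return True if `value` deviates from the mean of `history`
    by more than `max_sigma` sample standard deviations."""
    if len(history) < 2:
        return False  # not enough data to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu  # flat history: any change is a deviation
    return abs(value - mu) / sigma > max_sigma


# Hypothetical CPU utilization samples (percent), e.g. one every 15 seconds.
cpu_history = [41.0, 43.5, 40.2, 42.8, 44.1, 39.9, 42.3, 41.7]

print(is_abnormal(cpu_history, 43.0))  # within the normal spread
print(is_abnormal(cpu_history, 95.0))  # far outside the normal spread
```

The point of the design is that no per-server threshold is maintained by hand: the same rule adapts to a server that normally runs at 40% and to one that normally runs at 85%.
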
Big Data for Developers
• Support for annotation-style instrumentation.
• Alternative to byte-code instrumentation and file logging.
• Collect detailed performance and usage statistics for reporting and analytics, without writing custom monitors.

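Annotation-style instrumentation can be approximated in Python with a decorator. The sketch below is a generic illustration of the idea, not an Axibase API: the `monitored` decorator, the in-memory `METRICS` sink, and the metric name are all hypothetical.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory sink; a real setup would forward samples to a metrics store.
METRICS = defaultdict(list)


def monitored(metric_name):
    """Decorator that records the wall-clock duration of each call under `metric_name`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even if the call raised an exception.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                METRICS[metric_name].append(elapsed_ms)
        return wrapper
    return decorator


@monitored("order.process.time_ms")
def process_order(order_id):
    time.sleep(0.01)  # stand-in for real work
    return f"order {order_id} processed"


print(process_order(42))
print(len(METRICS["order.process.time_ms"]), "sample(s) recorded")
```

The business method stays free of timing and logging code; the statistic is declared once, next to the method it measures.
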
Big Data for Operations
• Gather and analyze statistical data generated by the various systems and sensors.
• Analytics that can support decision control systems.
• Allows for better real-time operations decision support.
• Generate accurate forecasts of upcoming issues:
• Delays
• Scheduled maintenance based on product usage and sensor data instead of warranty periods
• Improved customer service times and standards

ATSD Architecture
• ATSD architecture combines database, analytics, and reporting tools into one complete product.
• Data locality makes analytics run faster.
• Application server layer is simplified to provide core shared services.

ATSD Components
• Pluggable driver provides support for different storage engines.
• Compute, persistence, and data collection layers are scaled independently.

Fault Tolerance
• ATSD is a distributed system with high fault tolerance.
• Each data sample is automatically replicated 3 times for recovery.

ATSD Scalability
• ATSD is a distributed, non-relational database with high throughput, fault tolerance, and read speed.
• ATSD can collect billions of metrics per day and store petabytes of data.
• ATSD supports millisecond resolution and sampling rates of up to several measurements per second. The data is stored without losing accuracy.
• Additional nodes can be added at runtime to handle increasing volumes. ATSD automatically distributes the table across active nodes.
• New nodes can be added in remote data centers to minimize network traffic.

Supported Data Types
• Two types of data ingestion: push and pull.
• ATSD supports numeric values, log messages, and properties (collections of key-value pairs).
• ATSD uses collectors for retrieving structured and unstructured data from remote sources.
• Support for standard protocols: Telnet, ICMP, CSV/TSV, FILE, JMX, HTTP, and JSON.

Data Collection
• Collection is agentless; data is pushed by external systems into ATSD.
• New metrics are auto-registered. No need to update the schema or restart any server components.
• Existing monitoring tools can be instrumented to stream data into ATSD.
• Each data sample can be tagged (key = value) at source for subsequent querying, aggregations, and roll-ups.

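To make tagging concrete, the sketch below models a tagged data sample and a roll-up grouped by one tag. The `Sample` fields, the `rollup` helper, and the entity, metric, and tag names are hypothetical and chosen for illustration; they do not reflect ATSD's storage or wire format.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Sample:
    """One measurement, tagged at the source with key = value pairs."""
    entity: str          # system that produced the sample
    metric: str          # name of the measured statistic
    timestamp_ms: int    # time of measurement, milliseconds since the epoch
    value: float
    tags: dict = field(default_factory=dict)


def rollup(samples, metric, tag_key, aggregate=mean):
    """Group samples of one metric by the value of `tag_key` and aggregate each group."""
    groups = defaultdict(list)
    for s in samples:
        if s.metric == metric and tag_key in s.tags:
            groups[s.tags[tag_key]].append(s.value)
    return {tag_value: aggregate(values) for tag_value, values in groups.items()}


ts = 1_700_000_000_000
samples = [
    Sample("web-01", "cpu_busy", ts, 40.0, {"location": "NYC", "priority": "high"}),
    Sample("web-02", "cpu_busy", ts, 60.0, {"location": "NYC", "priority": "low"}),
    Sample("db-01", "cpu_busy", ts, 20.0, {"location": "LON", "priority": "high"}),
    Sample("db-01", "disk_used", ts, 75.0, {"location": "LON"}),
]

print(rollup(samples, "cpu_busy", "location"))       # average per location
print(rollup(samples, "cpu_busy", "priority", max))  # maximum per priority
```

Because tags travel with each sample, a new grouping dimension requires no schema change: a source simply starts sending an additional key.
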
Data Storage
• Built-in data compression provides 70%-80% disk space savings over raw data.
• No data needs to be deleted. Seek time is almost linear regardless of the dataset size.
• Data storage is sparse and efficient. ATSD stores only what is collected instead of long rows with NULLs or zeros, as is the case in the relational model.
• VMware VMFS-attached disks are sufficient for small to medium clusters.
• Direct-attached disks with JBOD are recommended for larger clusters.
• JBOD alternatives to minimize node recovery time are available from leading storage vendors, such as NetApp E-Series.

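A back-of-the-envelope sizing estimate shows how the compression savings interact with replication. The 70%-80% savings figure comes from this slide and the three copies from the Fault Tolerance slide; the 16 bytes per raw sample and the one billion samples per day are assumptions chosen only to make the arithmetic concrete.

```python
def disk_per_day_gb(samples_per_day, raw_bytes_per_sample, compression_savings, replicas):
    """Estimate disk consumed per day, in decimal gigabytes, across all replicas."""
    raw_bytes = samples_per_day * raw_bytes_per_sample
    compressed_bytes = raw_bytes * (1.0 - compression_savings)
    return compressed_bytes * replicas / 1e9


SAMPLES_PER_DAY = 1_000_000_000   # assumption: one billion samples per day
RAW_BYTES_PER_SAMPLE = 16         # assumption: e.g. 8-byte timestamp + 8-byte value
REPLICAS = 3                      # each sample is replicated 3 times

# Evaluate both ends of the quoted savings range and its midpoint.
for savings in (0.70, 0.75, 0.80):
    gb = disk_per_day_gb(SAMPLES_PER_DAY, RAW_BYTES_PER_SAMPLE, savings, REPLICAS)
    print(f"savings {savings:.0%}: {gb:.1f} GB/day")
```

Under these assumptions, 16 GB of raw data per day occupies about 12 GB on disk at 75% savings once all three copies are counted, so compression more than offsets the cost of replication.
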
Built-in Instruments
Unlike conventional data warehouses, ATSD comes with a set of built-in tools for data analysis:
• Analytical Rule Engine
• Forecasting
• Visualization

Analytical Rule Engine
• Evaluates incoming data in memory based on statistical rules.
• Statistical rules are applied to the incoming data stream before data is stored on disk.
• As data is ingested by the ATSD server, a subset of samples that match rule queries is routed to the rule engine for processing.
• Rule Engine supports both time-based and count-based data windows.
• Rule expressions and filters can reference not just numeric values but also tags, such as system type, location, and priority, to ensure that alerts are raised only for critical issues.
• Multiple metrics and entities can be correlated within the same rule.

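The windowing and tag-filtering behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not ATSD's rule syntax or implementation: the class names, the "mean above threshold" expression, and the single shared window per rule are all hypothetical.

```python
from collections import deque
from statistics import mean


class CountWindow:
    """Count-based window: keeps the most recent `size` values."""
    def __init__(self, size):
        self.values = deque(maxlen=size)

    def add(self, timestamp_ms, value):
        self.values.append(value)

    def current(self):
        return list(self.values)


class TimeWindow:
    """Time-based window: keeps values newer than `span_ms` before the latest sample."""
    def __init__(self, span_ms):
        self.span_ms = span_ms
        self.samples = deque()

    def add(self, timestamp_ms, value):
        self.samples.append((timestamp_ms, value))
        cutoff = timestamp_ms - self.span_ms
        # Evict samples that have aged out of the window.
        while self.samples and self.samples[0][0] <= cutoff:
            self.samples.popleft()

    def current(self):
        return [value for _, value in self.samples]


class Rule:
    """Raises an alert when the window mean exceeds `threshold`,
    considering only samples whose metric and tags match the filter."""
    def __init__(self, metric, window, threshold, required_tags=None):
        self.metric = metric
        self.window = window
        self.threshold = threshold
        self.required_tags = required_tags or {}

    def process(self, metric, tags, timestamp_ms, value):
        """Evaluate one incoming sample in memory; return True if an alert is raised."""
        if metric != self.metric:
            return False
        if any(tags.get(k) != v for k, v in self.required_tags.items()):
            return False  # filtered out, e.g. a non-critical system
        self.window.add(timestamp_ms, value)
        return mean(self.window.current()) > self.threshold


rule = Rule("cpu_busy", CountWindow(3), threshold=80.0, required_tags={"priority": "high"})
stream = [
    ("cpu_busy", {"priority": "high"}, 0, 70.0),
    ("cpu_busy", {"priority": "low"}, 15_000, 99.0),   # ignored by the tag filter
    ("cpu_busy", {"priority": "high"}, 30_000, 85.0),
    ("cpu_busy", {"priority": "high"}, 45_000, 95.0),
]
alerts = [rule.process(*sample) for sample in stream]
print(alerts)
```

Swapping `CountWindow(3)` for `TimeWindow(60_000)` changes the rule from "the last three samples" to "the last minute" without touching the rule expression. A fuller engine would keep a separate window per entity; this sketch shares one window for brevity.
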