Survey of the Azure Data Landscape Ike Ellis Wintellect Core - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Ike Ellis

Survey of the Azure Data Landscape

slide-2
SLIDE 2

Wintellect Core Services

Consulting

Custom software application development and architecture

Instructor Led Training

Microsoft’s #1 training vendor for over 14 years having trained more than 50,000 Microsoft developers

On-Demand Training

World class, subscription based online training

slide-3
SLIDE 3

Industry Influencers

We wrote the book (over 30 of them)

slide-4
SLIDE 4

Some Microsoft Related Highlights

Gold Azure Partner

2016 IAMCP Gold Partner of the Year for the U.S. announced at WPC

CEO is the Microsoft Regional Director (RD) for Atlanta

DevOps competency partner

  • Multiple ALM Rangers

Software Development competency partner

Xamarin Premier Consulting Partner

  • Multiple Xamarin Certified Engineers
  • Chosen to teach the 2-day Xamarin University pre-con at Evolve 2016

Other: Visual Studio Integration Partner, Azure Circle Partner, ALM Inner Circle Partner, MVP of the Year, and more…

slide-5
SLIDE 5

Agenda

  • Azure Blob Storage
  • Azure Table Storage
  • Azure CosmosDB
  • Azure SQL Database
  • Azure SQL in a VM
  • Azure SQL Data Warehouse
  • Azure Data Lake
  • Lots of other things supported: PostgreSQL, MySQL, MongoDB, Redis
slide-6
SLIDE 6

Topic Agenda

  • What is it?
  • How is it used?
  • What are the competitors?
  • DEMO!
slide-7
SLIDE 7

Azure Blob Storage

slide-8
SLIDE 8

Azure Blob Storage

  • Blobs are files (PDFs, JPGs, DOCs, etc.)
  • Highly durable, massively scalable
  • More than 40 trillion stored objects
  • 3.5+ Million requests/second
  • Exposed via REST APIs
  • Use them in .NET, C++, Java, Node.JS, Android…
  • AzCopy, PowerShell
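
Because blobs are exposed over REST, every blob has an HTTPS address of the form account / container / blob name. The following is a minimal sketch of how that address is put together. The account name `examplestore`, the container `invoices`, and the helper `blob_url` are hypothetical names used only for illustration, and a private blob would additionally need authorization (for example a shared access signature), which is not shown.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS address of a blob from its three parts."""
    # quote() leaves '/' alone, so "virtual folder" prefixes in the blob
    # name survive, while spaces and similar characters are escaped.
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{quote(blob_name)}"
    )


# Hypothetical account, container, and blob name.
url = blob_url("examplestore", "invoices", "2016/invoice 42.pdf")
print(url)
```

Any HTTP client, AzCopy, or one of the language SDKs ultimately talks to addresses shaped like this one.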
slide-9
SLIDE 9

Blob Storage Fault Tolerance & Scalability

slide-10
SLIDE 10

What kind of blobs can I have?

  • Share files with clients
  • Offload static content from web servers (invoices, contracts, resumes)
  • Azure Websites – Platform as a Service – no files on a webserver

  • SQL BAK Files
  • VM Hard Drives
slide-11
SLIDE 11

Competitors?

  • On-premises SANs and arrays
  • Amazon S3
slide-12
SLIDE 12

Azure Blob Storage

  • Demo

slide-13
SLIDE 13

Azure Table Storage

  • Much of it similar to Azure Blob Storage
  • Same scalability & redundancy
  • Affordable price
  • Very, very fast
  • NoSQL key value pair solution
  • Quick data retrieval, little configuration
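
Table Storage addresses each entity by a PartitionKey and a RowKey, which is what makes point lookups fast. The sketch below imitates that key-value model with an in-memory dictionary. `TinyTable` and the sample entity are hypothetical stand-ins for illustration, not the Azure SDK.

```python
from typing import Dict, Optional, Tuple

Entity = Dict[str, object]


class TinyTable:
    """In-memory imitation of a key-value table keyed by two strings."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Entity] = {}

    def upsert(self, partition_key: str, row_key: str, **props: object) -> None:
        # The (PartitionKey, RowKey) pair is the only index; properties
        # are schemaless, so each entity can carry different fields.
        entity: Entity = {"PartitionKey": partition_key, "RowKey": row_key}
        entity.update(props)
        self._rows[(partition_key, row_key)] = entity

    def get(self, partition_key: str, row_key: str) -> Optional[Entity]:
        # A point lookup is a single dictionary access, no scan required.
        return self._rows.get((partition_key, row_key))


customers = TinyTable()
customers.upsert("WA", "cust-001", name="Thomas Andersen", city="Seattle")
found = customers.get("WA", "cust-001")
missing = customers.get("WA", "cust-999")
print(found, missing)
```

Lookups that do not supply both keys have no index to lean on, which is why key design matters in this kind of store.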
slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16

Competitors

  • Amazon DynamoDB
slide-17
SLIDE 17

Azure Table Storage

  • Demo

slide-18
SLIDE 18

JSON Document

  • Standard for passing data between a server and a web application
  • Replacement for XML
  • Hierarchical
  • Terse
  • Simple data types

slide-19
SLIDE 19

Modeling in CosmosDB

{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "addresses": [
    {
      "line1": "100 Some Street",
      "line2": "Unit 1",
      "city": "Seattle",
      "state": "WA",
      "zip": 98012
    }
  ],
  "contactDetails": [
    {"email": "thomas@andersen.com"}
  ]
}

  • Reading is one operation
  • Writing is one operation
  • No assembly or disassembly
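
To make the "one operation" point concrete, here is a small sketch using only Python's standard `json` module as a stand-in for a document store. The whole customer, including its addresses, comes back in a single parse and goes back in a single serialization, with no joins across tables. The variable names are illustrative assumptions, and no CosmosDB client is involved.

```python
import json

# The customer document from the slide, stored as one unit.
raw = """
{
  "id": "1",
  "firstName": "Thomas",
  "lastName": "Andersen",
  "addresses": [
    {"line1": "100 Some Street", "line2": "Unit 1",
     "city": "Seattle", "state": "WA", "zip": 98012}
  ],
  "contactDetails": [
    {"email": "thomas@andersen.com"}
  ]
}
"""

# Reading: one parse yields the customer and every nested address.
customer = json.loads(raw)
city = customer["addresses"][0]["city"]

# Writing: change a nested value and serialize the whole document once.
customer["addresses"][0]["line2"] = "Unit 2"
stored = json.dumps(customer)

print(city)
```

In a normalized relational model the same read would touch a customer table, an address table, and a contact table before the object could be assembled.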

slide-20
SLIDE 20

Query Playground

http://www.documentdb.com/sql/demo

slide-21
SLIDE 21

Competitors

  • MongoDB
  • Amazon DynamoDB

slide-22
SLIDE 22
slide-23
SLIDE 23

Azure SQL Database

  • Platform as a Service
  • All data is backed up for you
  • Point in time restore
  • Can be geo-redundant
  • Scalable both in performance and in data size
  • Up to 1TB
  • Not at feature parity with SQL Server in a VM
slide-24
SLIDE 24

Database Replicas

slide-25
SLIDE 25

Azure SQL Database Unsupported Features

https://azure.microsoft.com/en-us/documentation/articles/sql-database-transact-sql-information/

slide-26
SLIDE 26

You can also make it scale up!

slide-27
SLIDE 27

Competitors

  • Amazon RDS

slide-28
SLIDE 28

Azure SQL Database

  • Demo

slide-29
SLIDE 29

Azure SQL Server in a VM

  • You manage backups
  • You create fault tolerant options
  • You manage disk space
  • You manage patching
  • You don’t manage hardware failure
  • You don’t manage purchasing hardware
  • You don’t manage networking infrastructure

slide-30
SLIDE 30

Performance Considerations

  • Use Premium Storage.
  • Use a VM size of DS3 or higher for SQL Enterprise edition and DS2 or higher for SQL Standard edition.
  • Use a minimum of 2 P30 disks (1 for log files; 1 for data files and TempDB).
  • Keep the storage account and SQL Server VM in the same region.
  • Disable Azure geo-redundant storage (geo-replication) on the storage account.
  • Avoid using operating system or temporary disks for database storage or logging.

slide-31
SLIDE 31

Backups & Fault Tolerance

  • Back up to Azure Blob Storage
  • Use Always On Availability Groups and Windows Server Failover Clustering (WSFC) for fault tolerance
  • Can use mirroring or log shipping too
  • Can also mix in on-premises servers

slide-32
SLIDE 32

Competitors

  • Amazon EC2 – VMs in the cloud

slide-33
SLIDE 33

Azure SQL Data Warehouse

  • Elastic massively parallel processing (MPP) system
  • Use T-SQL to query across relational and non-relational data
  • Up to petabyte volumes of data
  • Scale compute separately from data
  • When paused, you only pay for storage
  • Deploys in seconds

slide-34
SLIDE 34

Azure SQL Data Warehouse

  • Supports 32 concurrent queries
  • Used for fanning out queries over multiple machines for processing/aggregation/analytics
  • Performance becomes far more predictable than with just straight SQL Server
  • Not used in OLTP environments

slide-35
SLIDE 35

What is a DWU (Data Warehouse Unit)?

  • A unit of scale that determines how much hardware backs the warehouse, and therefore how well it performs
  • Done in increments of 100 (mostly)
  • How many DWUs?
      • Start small
      • Monitor
      • Change as needed. It’s instant.

ALTER DATABASE MySQLDW MODIFY (SERVICE_OBJECTIVE = 'DW1000');
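
Since scaling is a single `ALTER DATABASE` statement, it is easy to generate from a script. The sketch below builds that statement for a given Data Warehouse Unit level. The helper name `scale_statement` is hypothetical, and the "positive multiple of 100" check is a simplifying assumption based on the "increments of 100 (mostly)" bullet, since the service only accepts specific levels.

```python
import re


def scale_statement(database: str, dwu: int) -> str:
    """Return the T-SQL that moves a warehouse to the requested level."""
    # Simplified validation: the real service accepts a fixed list of
    # levels, so treat this check as an approximation.
    if dwu <= 0 or dwu % 100 != 0:
        raise ValueError(f"unexpected level: {dwu}")
    # Only allow plain identifiers so the name can be embedded safely.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unexpected database name: {database!r}")
    return (
        f"ALTER DATABASE {database} "
        f"MODIFY (SERVICE_OBJECTIVE = 'DW{dwu}');"
    )


statement = scale_statement("MySQLDW", 1000)
print(statement)
```

A scheduler could call this with a small level overnight and a larger one before the morning reporting load, following the "start small, monitor, change as needed" advice.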

slide-36
SLIDE 36

Partitioning Data

Two choices:

  • Hash distribution: distribute data based on hashing values from a single column
      • Good if clusters of tables will be joined and are related
  • Round-robin distribution: distribute data evenly but randomly
      • Fail-safe method
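
The two choices can be sketched in a few lines. Azure SQL Data Warehouse spreads each table across 60 distributions. In the sketch below SHA-256 is only a stand-in for the service's internal hash function, which is not described in this deck, and the customer and order values are made up for illustration.

```python
import hashlib
from collections import Counter
from itertools import cycle

DISTRIBUTIONS = 60  # each table is spread across 60 distributions


def hash_distribution(key: str) -> int:
    """Pick a distribution from the value of one column."""
    # SHA-256 is a stand-in hash; equal keys always land together.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


def round_robin(rows):
    """Deal rows out evenly, ignoring their content."""
    slots = cycle(range(DISTRIBUTIONS))
    return [(next(slots), row) for row in rows]


# Hash: both orders for cust-1 land in the same distribution, so a join
# on the customer column would not need to move data between nodes.
orders = [("cust-1", "order-a"), ("cust-2", "order-b"), ("cust-1", "order-c")]
placed = [(hash_distribution(cust), order) for cust, order in orders]
same_home = placed[0][0] == placed[2][0]

# Round-robin: 120 rows over 60 distributions gives exactly 2 each.
spread = Counter(slot for slot, _ in round_robin(range(120)))

print(same_home, set(spread.values()))
```

The trade-off shows up directly: hashing keeps related rows together but can skew if one key dominates, while round-robin never skews but gives joins nothing to rely on.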

slide-37
SLIDE 37
slide-38
SLIDE 38

Non-supported data types

  • geometry, use a varbinary type
  • geography, use a varbinary type
  • hierarchyid, CLR type not native
  • image, text, ntext: when text based, use varchar/nvarchar (the smaller the better)
  • nvarchar(max), use nvarchar(4000) or smaller for better performance
  • numeric, use decimal
  • sql_variant, split column into several strongly typed columns
  • sysname, use nvarchar(128)
  • table, convert to temporary tables
  • timestamp, re-work code to use datetime2 and CURRENT_TIMESTAMP function.
  • varchar(max), use varchar(8000) or smaller for better performance
  • uniqueidentifier, use varbinary(16)
  • user defined types, convert back to their native types where possible
  • xml, use a varchar(8000) or smaller for better performance - split across columns if needed
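
A list like this lends itself to a quick pre-migration check. The sketch below scans a table definition for a handful of the types above and reports the suggested replacement. The function `flag_unsupported`, the subset of types chosen, and the sample `dbo.Orders` table are illustrative assumptions, and the text match is deliberately naive (a column that happens to be named `xml` would also be flagged).

```python
import re

# A subset of the list above: unsupported type -> suggested replacement.
SUGGESTIONS = {
    "geometry": "varbinary",
    "geography": "varbinary",
    "numeric": "decimal",
    "sysname": "nvarchar(128)",
    "varchar(max)": "varchar(8000) or smaller",
    "xml": "varchar(8000) or smaller",
}


def flag_unsupported(ddl: str):
    """Return (type, suggestion) pairs for types found in the DDL text."""
    text = ddl.lower()
    hits = []
    for type_name, suggestion in SUGGESTIONS.items():
        # Word-boundary guards keep "varchar(max)" from matching inside
        # "nvarchar(max)".
        pattern = r"(?<![a-z0-9_])" + re.escape(type_name) + r"(?![a-z0-9_])"
        if re.search(pattern, text):
            hits.append((type_name, suggestion))
    return hits


# Hypothetical table definition used only to exercise the check.
ddl = """
CREATE TABLE dbo.Orders (
    OrderId int NOT NULL,
    Amount numeric(18, 2),
    Notes varchar(max),
    Payload xml
);
"""

hits = flag_unsupported(ddl)
for type_name, suggestion in hits:
    print(f"{type_name}: consider {suggestion}")
```

A real migration would parse the schema from the catalog views instead of matching text, but the mapping table is the useful part.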
slide-39
SLIDE 39

Unsupported Features

  • primary keys
  • foreign keys
  • check constraints
  • unique constraints
  • unique indexes
  • computed columns
  • sparse columns
  • user-defined types
  • indexed views
  • identities
  • sequences
  • triggers
  • synonyms

slide-40
SLIDE 40

Competitors

  • Amazon Redshift

slide-41
SLIDE 41

Azure SQL Data Warehouse

  • Demo

slide-42
SLIDE 42

Azure Data Lake

  • HDFS for the cloud
  • Can use tools like Spark, Storm, Flume, Sqoop, Kafka, etc.
  • No fixed limits on account size or file size

slide-43
SLIDE 43

What is a generic data lake?

  • An enterprise-wide repository of every type of data, collected in a single place
  • Data is stored prior to any formal definition of requirements or schema
      • Allows every type of data to be kept without discrimination
      • Organizations can then use Hadoop or advanced analytics to find patterns in the data
  • Serves as a repository for lower-cost data preparation prior to moving curated data into a data warehouse
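
Storing data before any schema exists means the schema is applied when the data is read, not when it is written. The sketch below illustrates that idea with two raw payloads kept exactly as they arrived and two small readers that impose structure on demand. The file names, field names, and reader functions are all hypothetical.

```python
import csv
import io
import json

# Raw payloads are kept as-is; nothing is validated on the way in.
raw_store = {
    "clicks.json": '{"user": "a1", "page": "/home"}\n'
                   '{"user": "b2", "page": "/cart"}',
    "sales.csv": "sku,qty\nX-1,3\nY-9,1",
}


def read_clicks(payload: str):
    """Apply a JSON-lines schema at read time."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def read_sales(payload: str):
    """Apply a CSV schema at read time, typing qty as an integer."""
    reader = csv.DictReader(io.StringIO(payload))
    return [{"sku": row["sku"], "qty": int(row["qty"])} for row in reader]


clicks = read_clicks(raw_store["clicks.json"])
sales = read_sales(raw_store["sales.csv"])
total_qty = sum(row["qty"] for row in sales)

print(len(clicks), total_qty)
```

Only the curated result of readers like these would move on to a data warehouse, where the schema is fixed up front.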

slide-44
SLIDE 44

Products

  • Azure Data Lake Store – built on HDFS
  • Azure Data Lake Analytics – built on YARN; introduces U-SQL/C#


slide-45
SLIDE 45

Competitors

  • A lot of Hadoop implementations, but nothing really quite like it


slide-46
SLIDE 46

More data options…

  • MongoDB
  • PostgreSQL
  • Redis
  • MySQL
  • Oracle


slide-47
SLIDE 47

Ike Ellis

  • Ike Ellis, MVP (blog.ikeellis.com)
  • Book: Developing Azure Solutions
  • Podcast guest: Talk Python to Me – Dec 2015, June 2016
  • .NET Rocks – Sept 2015, Sept 2016
  • SDTIG – www.sdtig.com