A Multi-Tenancy Cloud-Native Digital Library Platform (PowerPoint presentation)


  1. A Multi-Tenancy Cloud-Native Digital Library Platform
     Yinlin Chen, Jim Tuttle, William A. Ingram
     {ylchen, jim.tuttle, waingram}@vt.edu
     Information Technologies and Services, Virginia Tech Libraries

  2. Agenda
     • Cloud-native concept
     • Virginia Tech Digital Library Platform (VTDLP)
     • Design strategy
     • Architecture overview
     • Implementation overview
     • VTL experiences

  3. Cloud-native Concept
     • The entire infrastructure is deployed in the cloud (AWS)
     • The platform is composed of a suite of microservices and managed services
     • Focus on the business logic and workflow
     • Utilize the advantages provided by the cloud

  4. Virginia Tech Digital Library Platform (VTDLP)
     [Diagram labels: Preservation, Data Modeling, Presentation]
     • New services for the Digital Library Platform
       – ID Minting Service, Access Service, Metadata Service, …
     • Migrating legacy services to the Digital Library Platform
       – IAWA, VTechWorks, …

  5. VTDLP Overview
     [Diagram. Labels recoverable from the slide: Presentation, Preservation, staging, Serialization Service, Resolution Service, Metadata Service, ID Minting Service, Other Services, VTechWorks, ETDs, IAWA, BeyondVT, SW Virginia, Images, Batch, Metadata, Others, Storage, APTrust, Amazon S3]

  6. Design Strategy
     • Cloud native (AWS ecosystem)
     • Microservice/SOA (AWS Lambda)
     • Serverless (AWS managed services)
     • CI/CD pipeline
     • Caching as much as possible
       – Static files
       – Lambda functions
     • Automation as much as possible
       – Infrastructure as code
       – No manual provisioning or managing of servers

  7. AWS Ecosystem
     [Diagram of AWS services grouped by category.
     Categories: Network & Content Delivery, Compute & Database, Messaging, Security & Identity, Storage, Management, Services.
     Services shown: Amazon ES, Amazon DynamoDB, Amazon EC2, Amazon CloudFront, AWS Lambda, Amazon Route 53, AWS Certificate Manager, AWS Organizations, Amazon S3, Amazon Glacier, IAM, Amazon SQS, Amazon SNS, AWS CLI, AWS CloudFormation, Amazon Cognito, Amazon CloudWatch, AWS CloudTrail, Amazon API Gateway, Amazon Pinpoint, AWS Amplify]

  8. Software stacks
     [Diagram. Technologies: React, AWS Amplify, AWS AppSync, Node.js, Python. Component labels: Web App, Microservice (AWS Lambda)]

  9. Preservation Pipeline
     [Diagram labels: Checksums, Fixity, Virus Scan, AWS S3, Apache Airflow, APTrust, PREMIS]
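     The slide names checksums and fixity as pipeline steps but gives no detail. As a minimal sketch of what a fixity check usually involves, the following recomputes a file's checksum and compares it with a previously recorded value. The choice of SHA-256 and the function names are assumptions for illustration, not details from the presentation.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_hex, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded at ingest."""
    return file_checksum(path, algorithm) == expected_hex.lower()
```

     A pipeline task could call `fixity_ok` on each staged file and flag any mismatch before the content is sent on to long-term storage.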

  10. Lambda Example – Metadata file
      1. A file is uploaded to S3
      2. S3 triggers a Lambda function
      3. The Lambda function parses the file content and inserts or updates the record in DynamoDB
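      The three steps above can be sketched as a small Lambda handler. This is an illustration under stated assumptions, not the presenters' code: the metadata file is assumed to be CSV with a header row, and the table name `metadata` and the helper names are hypothetical. The I/O is injected so the parsing logic can be exercised without AWS access.

```python
import csv
import io
from urllib.parse import unquote_plus


def parse_metadata_csv(text):
    """Turn CSV text into a list of dicts, one per metadata record."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def objects_from_event(event):
    """Yield (bucket, key) pairs from an S3 event notification payload."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def handle(event, read_object, put_item):
    """Parse every uploaded file and store each record; return the record count."""
    count = 0
    for bucket, key in objects_from_event(event):
        for item in parse_metadata_csv(read_object(bucket, key)):
            put_item(item)  # put_item replaces any existing item with the same key
            count += 1
    return count


def lambda_handler(event, context):
    import boto3  # imported lazily so the helpers above run without boto3 installed

    s3 = boto3.client("s3")
    table = boto3.resource("dynamodb").Table("metadata")  # hypothetical table name

    def read_object(bucket, key):
        return s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

    return handle(event, read_object, lambda item: table.put_item(Item=item))
```

      Because `put_item` overwrites an item that has the same primary key, the same call covers both the insert and the update case mentioned in step 3.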

  11. Lambda Example – DynamoDB / ES
      1. Data modifications in DynamoDB trigger a Lambda function
      2. The Lambda function captures the changes and updates Amazon ES
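      A rough sketch of the change-capture step: the function below turns a DynamoDB Streams event into a list of index or delete actions that a handler could then send to the search cluster. The index name `records`, the key attribute `identifier`, and the action format are assumptions for illustration. `NewImage` is only present when the stream is configured to include new item images.

```python
def from_dynamodb(value):
    """Convert one DynamoDB-typed attribute (e.g. {"S": "abc"}) to plain Python."""
    (tag, raw), = value.items()
    if tag == "S":
        return raw
    if tag == "N":
        # Numbers are transmitted as strings in stream records.
        try:
            return int(raw)
        except ValueError:
            return float(raw)
    if tag == "BOOL":
        return raw
    if tag == "NULL":
        return None
    if tag == "L":
        return [from_dynamodb(item) for item in raw]
    if tag == "M":
        return {name: from_dynamodb(item) for name, item in raw.items()}
    raise ValueError(f"unsupported DynamoDB type: {tag}")


def to_search_actions(event, index="records", key_attr="identifier"):
    """Map stream records to index/delete actions for the search index."""
    actions = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        doc_id = from_dynamodb(change["Keys"][key_attr])
        if record["eventName"] == "REMOVE":
            actions.append({"op": "delete", "index": index, "id": doc_id})
        else:  # INSERT or MODIFY: re-index the full new item
            doc = {k: from_dynamodb(v) for k, v in change["NewImage"].items()}
            actions.append({"op": "index", "index": index, "id": doc_id, "doc": doc})
    return actions
```

      Re-indexing the whole item on every INSERT or MODIFY keeps the handler idempotent, so a retried stream batch leaves the index in the same state.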

  12. Presentation - Multi-Tenant Architecture
      [Diagram labels: App1, App2, AppN, Application Hub, DB, Search]

  13. AWS Cloud
      [Architecture diagram. Labels: Web App, Amazon S3, Amazon Elasticsearch Service, Amazon Route 53, AWS Lambda, Amazon API Gateway, Amazon CloudFront, AWS Certificate Manager, Amazon DynamoDB, Amazon Cognito]

  14. The International Archive of Women in Architecture
      • A level 0 compliant image server using Amazon S3 and Amazon CloudFront
      • Image tiles, manifest JSON files, etc.
      • Terabytes of scanned images to be processed
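      A level 0 IIIF image service only has to answer a fixed set of pre-generated requests, which is why static files on S3 behind CloudFront are enough. The sketch below lists the object paths for the full-resolution tiles of one image in IIIF Image API 2.x style (`{identifier}/{region}/{size}/{rotation}/{quality}.{format}`). The tile size of 512 and the function name are assumptions, and a real tile generator also produces scaled-down tile levels and the `info.json` file, which are omitted here.

```python
import math


def tile_paths(identifier, width, height, tile_size=512):
    """List static paths for the full-resolution tiles of one image."""
    paths = []
    for row in range(math.ceil(height / tile_size)):
        for col in range(math.ceil(width / tile_size)):
            x, y = col * tile_size, row * tile_size
            # Tiles on the right and bottom edges are clipped to the image bounds.
            w = min(tile_size, width - x)
            h = min(tile_size, height - y)
            paths.append(f"{identifier}/{x},{y},{w},{h}/{w},/0/default.jpg")
    return paths


for path in tile_paths("img1", 1000, 800):
    print(path)
# img1/0,0,512,512/512,/0/default.jpg
# img1/512,0,488,512/488,/0/default.jpg
# img1/0,512,512,288/512,/0/default.jpg
# img1/512,512,488,288/488,/0/default.jpg
```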

  15. Image processing workflow
      [Workflow diagram. Labels: Raw images (Amazon S3), Amazon CloudWatch Rule, AWS Lambda, AWS Batch, Amazon EC2, Batch Job – image set 1, Batch Job – image set 2, Batch Job – image set 3, Batch Job – image set N, Tiles & Manifest (Amazon S3), Amazon Elastic File System]

  16. Batch job - IIIF_S3
      • Command
      • Parameters
      • Environment variables
      • vCPUs
      • Memory
      [Diagram labels: IIIF_S3 Docker, AWS Batch, Amazon S3, IIIF Tiles & Manifest, Amazon Elastic File System]
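      The five settings listed above map directly onto the AWS Batch `SubmitJob` request. As a hedged sketch, the function below builds the keyword arguments for one image set. The command `generate-tiles`, the environment variable names, and the default resource sizes are hypothetical placeholders, not values from the presentation.

```python
def build_batch_job_request(image_set, job_queue, job_definition,
                            source_bucket, target_bucket,
                            vcpus=2, memory_mib=4096):
    """Build SubmitJob keyword arguments for tiling one image set."""
    return {
        "jobName": f"iiif-tiles-{image_set}",
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        # Parameters fill any Ref:: placeholders used in the job definition.
        "parameters": {"image_set": image_set},
        "containerOverrides": {
            "command": ["generate-tiles", image_set],  # hypothetical command
            "environment": [
                {"name": "SOURCE_BUCKET", "value": source_bucket},
                {"name": "TARGET_BUCKET", "value": target_bucket},
            ],
            # Batch expects resource values as strings; memory is in MiB.
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
            ],
        },
    }
```

      With boto3 installed and credentials configured, the result could be passed on as `boto3.client("batch").submit_job(**request)`, once per image set.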

  17. CI/CD with AWS
      [Pipeline diagram with steps numbered (1) to (7). Labels: Developers, AWS CodePipeline, AWS CodeBuild, Amazon S3, AWS CloudFormation, AWS Lambda, Amazon API Gateway]

  18. Cloud benefit - Backup examples
      • S3
        – Amazon S3 is designed for 99.999999999% durability and 99.99% availability.
        – With 10,000 objects stored, expect to lose a single object on average about once every 10 million years.
        – Cross-region replication
      • DynamoDB
        – Point-in-time recovery (last 35 days)
        – On-demand backup (stored in S3)
      • Amazon Elasticsearch Service
        – Daily snapshots (last 14 days)
        – On-demand backup (stored in S3)
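      The "10 million years" figure follows from the durability number, read as an annual per-object figure. A short worked calculation, assuming object losses are independent:

```latex
\begin{align*}
p &= 1 - 0.99999999999 = 10^{-11}
  && \text{(annual loss probability per object)} \\
N p &= 10^{4} \times 10^{-11} = 10^{-7}
  && \text{(expected objects lost per year for } N = 10{,}000\text{)} \\
\frac{1}{N p} &= 10^{7}\ \text{years}
  && \text{(average time between single-object losses)}
\end{align*}
```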

  19. VTL Experiences
      • The entire development team is AWS certified
      • One AWS Certification Subject Matter Expert (SME)
      • AWS training and conferences
      • Thinking about and implementing new ideas the cloud way

  20. Q & A Thank You!
