Where to store all the IoT Data? Piotr Robert Konopelko Business & Technical Support Manager, MooseFS Pro piotr.konopelko@moosefs.pro
Linux Servers + MooseFS = Storage: High Availability, High Performance, Petabytes, Long-term
What is MooseFS Pro? MooseFS Pro is a software product: a fault-tolerant, network-distributed file system. It distributes your data over many servers so that they act as one large disk.
Architecture
Advantages High Availability No Single Point of Failure (SPOF-less configuration). File system metadata is kept in two or more copies on physically redundant servers. User data is redundantly spread across the storage servers in the system. Mission Critical The MooseFS Software-Defined Storage is designed for mission-critical applications with high availability and high performance requirements.
Advantages Scalability • Adding new nodes and removing old ones on the fly, in minutes • New space available immediately • Storage can be extended up to 16 Exabytes with more than 2 billion files.
Advantages High Performance Designed to support high-performance I/O operations. User data is read and written in parallel, directly on many storage nodes at once, avoiding the bottleneck of a single central server or a single network connection. Big Data The MooseFS storage software is designed for Big Data support. MooseFS enables a virtually limitless pool of storage to support the most demanding distributed workloads.
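The parallel read path described on this slide can be pictured with a small sketch. It is an illustration only, not the MooseFS protocol or client API: the node names, the chunk map and the `fetch_chunk` helper are hypothetical in-memory stand-ins for storage servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for storage nodes: node name -> {chunk index: chunk bytes}.
NODES = {
    "node-a": {0: b"sensor-", 3: b"ream"},
    "node-b": {1: b"data-"},
    "node-c": {2: b"st"},
}

# Hypothetical chunk map: which node holds which chunk of one file.
CHUNK_MAP = {0: "node-a", 1: "node-b", 2: "node-c", 3: "node-a"}


def fetch_chunk(index: int) -> bytes:
    """Fetch one chunk directly from the node that stores it."""
    node = CHUNK_MAP[index]
    return NODES[node][index]


def read_file_parallel(chunk_map: dict) -> bytes:
    """Request all chunks concurrently and reassemble them in order."""
    indices = sorted(chunk_map)
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        # pool.map returns results in the order of the inputs,
        # so the chunks come back already ordered by index.
        parts = list(pool.map(fetch_chunk, indices))
    return b"".join(parts)


if __name__ == "__main__":
    print(read_file_parallel(CHUNK_MAP))
```

Because each chunk is requested from the node that holds it, no single server has to carry the whole transfer, which is the point the slide makes about avoiding a central bottleneck.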
Advantages Hardware / Platform Independence All-Flash and Hybrid storage setups are supported. Disks and servers from different manufacturers may be used in one storage system. Older and newer technology may be mixed if required. MooseFS can be run on a wide range of clusters, from Raspberry Pi to Enterprise Servers, on many architectures.
Advantages Commodity Hardware Components MooseFS-based storage is built with commodity components from virtually any manufacturer. It supports all major disk and disk interface types: SATA / SAS, SSD / HDD. Manufacturer Support The support you get comes directly from the software manufacturer. We know each line of the source code, so we can solve each issue.
Advantages Lifetime Usability Open source software with commodity components and operating systems makes MooseFS safe and usable for its entire lifetime. Longer Hardware Life Tiering and rolling upgrade features extend hardware life, as older, smaller and slower servers and/or disks may be used in less intensive tiers.
Use Cases • Clustering • Hosting, Virtualization, HCI • Big Data • Supercomputing • Backup and Archiving • IoT Data (Sensors, Control Systems, Data Acquisition Equipment) • Surveillance • Medical Data • Scientific Data
Success Stories Success Story 1 Country: Warsaw, Poland Market: Media & Entertainment, Market analysis Company: Medium company with 20+ European offices Purpose of use: MooseFS is used as primary storage for Internet traffic measurement data, which is the core business data for this company. MooseFS is installed on a few separate storage clusters, where heavy calculations are performed alongside data storage. The clusters are used concurrently by 300+ online users and many background data processing applications, and receive hundreds of thousands of new records per second. They have been online since 2005 and are built with over 150 servers storing a few petabytes in total.
Success Stories Success Story 2 Country: Boston, MA, USA Market: Healthcare, Education, Research Company: One of the prominent Ivy League Universities Purpose of use: MooseFS is used on two clusters in the university labs: the first is designed for medical data storage and processing, while the second is used to store VM disk images. A few dedicated custom features were added to the system for this customer. The clusters have been online since 2013, storing half a petabyte of crucial data.
Success Stories Success Story 3 Country: New York, USA Market: Research, Education Company: One of the prominent Ivy League Universities Purpose of use: MooseFS is used as a storage solution for human genome research activities. An interesting use of MooseFS's atomic snapshot feature: creating and discarding genome data for different proposed research algorithms. More than 10 petabytes of data are stored on a single cluster.
Success Stories Success Story 4 Country: Sweden Market: Cloud & Hosting Company: Cloud Provider Purpose of use: MooseFS is used as storage for on-premise cloud solutions for their customers. Around 30 servers and 50 end-application connections form a storage cluster of several petabytes. The solution includes storage for surveillance systems.
Success Stories Success Story 5 Country: France Market: Research Company: European oceanographic research institute Purpose of use: MooseFS is used as primary storage for research and data analysis for the exploration of the sea. Computations are moved close to the data and run on the same machines where the data is stored. Nearly half a billion files, mostly satellite images, take up around 4 petabytes, stored on over a hundred servers and serving a few hundred computing applications.
Success Stories Success Story 6 Country: Poland Market: Cloud & Hosting Company: Data Center Storage, CDN Purpose of use: MooseFS is used as primary distributed storage in one of their Data Centers in Poland. There are two clusters: one is used as backend storage for internal projects, on-premise hosting and a Content Delivery Network (mainly images and videos) for the company's customers; the second is based on MooseFS + Proxmox integration and provides VPS hosting. They store a few petabytes of data in total and counting.
Success Stories Success Story 7 Country: Middle East Market: Cloud & Hosting Company: CDN, Cloud & Hosting Purpose of use: MooseFS is used as a core backend storage solution for CDN clusters (images, ringtones, and videos). A few hundred million files hold a couple of petabytes of data in total. The company is a leading Content Delivery Network Provider.
Success Stories Success Story 8 Country: Warsaw, Poland Market: Media & Entertainment, Internet television Company: A division of the biggest Polish media group Purpose of use: MooseFS is used as the backend of a Content Delivery Network (CDN) service for internet TV serving Video on Demand content. An interesting case of transcoding video on the same servers where the data is stored. Over 30 servers are used to store a few petabytes of video and image content.
Technical Features Redundancy All system components are redundant and in case of failure there's an automatic failover mechanism that is transparent to the users. Compute on Nodes Support for scheduling computation on data nodes for better overall system TCO by utilizing idle CPU and memory resources.
Technical Features Rolling Upgrades Ability to upgrade the system one node at a time without service disruption, including hardware replacement and additions. This feature allows the hardware platform to be kept up to date with no downtime. Tiered Storage All system components are redundant and in case of failure there's an automatic failover mechanism that is transparent to the users.
Technical Features Fast Disk Recovery Ability to upgrade the system one node at a time without service disruption, including hardware replacement and additions. This feature allows the hardware platform to be kept up to date with no downtime. Atomic Snapshots Instant, uninterrupted provisioning of the state of the filesystem at a particular point in time. This feature is ideal for on-line backup solutions.
Technical Features POSIX Compliance Support for the family of standards, specified by the IEEE, that clarify and make uniform the application programming interfaces (and ancillary issues, such as command line shell utilities) provided by Unix-like operating systems. Erasure Coding ("Distributed RAID") Ensures data redundancy using error correction code algorithms with up to 9 parities. It saves raw storage space compared to ordinary data duplication.
Technical Features Global Trash A virtual, global space for deleted objects, configurable for each individual file and directory. A very useful feature for recovering data deleted accidentally through human error. Quota Limits Limits set by a system administrator to restrict certain aspects of file system usage and to allocate limited disk space in a reasonable way, e.g. the number of inodes or the capacity at directory level.
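The space saving claimed for erasure coding can be checked with simple arithmetic. The sketch below compares raw space used per byte of user data for plain copies versus data parts plus parities. The figure of 8 data parts is an assumption made for illustration; the slide only states the parity limit of 9.

```python
from fractions import Fraction

DATA_PARTS = 8  # assumption for illustration; not stated on the slide


def raw_space_factor_replication(copies: int) -> Fraction:
    """Raw bytes stored per user byte when keeping full copies."""
    return Fraction(copies)


def raw_space_factor_erasure(data_parts: int, parity_parts: int) -> Fraction:
    """Raw bytes stored per user byte with data parts plus parity parts."""
    return Fraction(data_parts + parity_parts, data_parts)


def tolerated_losses_replication(copies: int) -> int:
    """Copies that can be lost while one copy still survives."""
    return copies - 1


def tolerated_losses_erasure(parity_parts: int) -> int:
    """Parts that can be lost: one per parity part."""
    return parity_parts


if __name__ == "__main__":
    # Both layouts below survive the loss of two copies or parts.
    rep = raw_space_factor_replication(3)
    ec = raw_space_factor_erasure(DATA_PARTS, 2)
    print(f"3 copies:        {float(rep):.2f}x raw space")
    print(f"{DATA_PARTS}+2 erasure code: {float(ec):.2f}x raw space")
```

Under that assumption, tolerating two losses costs 3x raw space with copies but 1.25x with erasure coding, at the price of extra computation to encode and rebuild data.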
Technical Features Access Control Access to files and directories is based on a standard Unix access control model enhanced with standard Access Control Lists. Native Clients For performance reasons there is a dedicated client component for Linux, FreeBSD and Mac OS X systems.
Technical Features Parallelism Performs all I/O operations in parallel threads of execution to deliver good read/write performance. Ethernet Standard 1-200 Gbps Ethernet is used for all communication, with support for LACP configurations.
Technical Features Hardware Independence Works out of the box on almost any hardware platform that can run a POSIX-compliant operating system such as Linux, macOS or FreeBSD.
Thank you peter@moosefs.pro moosefs.pro