IoT Visualization
Amirhosein Abbasi
Department of Electrical and Computer Engineering
October 2019

Why IoT
• Growing fast, with an impact on our lives
• Industries are putting in effort: Amazon, Microsoft, Intel,…
• 450 IoT platforms, thousands of individual applications
• Different criteria: smart home/city/transportation/…
• IoT is not growing as fast as it should be! Users are not comfortable with it yet.

IoT Domain
(figure)

IoT Data Characteristics
• Massive data: 20.4 billion connected things by 2020 (Data Volume)
• Real-time integration of devices (Data Velocity)
• Different criteria = different types of data (Data Variety)
• The famous "VVV": Volume, Velocity, Variety

IoT Platforms Requirements
• A trend in the IoT industry: 450 active IoT platforms are available.
• Lots of standards and protocols. (Solution: using the Web of Things)
• Managing things and users.
• Data visualization: a responsibility.

Location Issue in IoT
• The number of smart things per user is increasing: how to keep track of all of them at once?
• Smart things are finding their way into every aspect of our lives: how to visually classify them?
• Things' time/location issue: for some devices temporal attributes are important, while for others the location is critical.
• For example: location matters less for a coffee maker than for a smart car. Also, time is more valuable for a smart street light than for a car.
(Figure labels: Home, Car, Bus, User with wearable devices)

Our Scope
• IoT is a vast scope.
• Visualizing data of a specific IoT application (like visualizing healthcare data)? Good, but it does not solve the broad issue of IoT today.
• Solution: narrow down the problem to IoT platforms.
To Be Done…
• Finding ways to solve the time/location issue
• Visualizing the hierarchical Map of Things:
• /agent(i)/thing(i) : CSdepartment/Room101/light2
• Visualizing the smart things of a single user in a way that lets the user keep track of all devices while having a sense of each device's position in the hierarchy.

A typical IoT environment
(figure)

InsightVis
For CPSC 310
By Lucas Zamprogno and Syed Ishtiaque Ahmad

Background - The class
● CPSC 310 is a project-heavy course, and a requirement of the Computer Science Major
● Roughly 180 or 360 students per term
● Students work in pairs, meaning we have 90 to 180 teams

Background - The project
● Students are tasked to build a simple data storage and query language system
● Project is divided up into a few segments of related work called deliverables
● Each deliverable is marked by the project's ability to pass a suite of automated tests (the details of which are not entirely known by the students)

Background - The data
● We have records of test results for all the students' commits (100MB for one term)
● We also have their git repositories, which means entire project histories (separately on GitHub)
● These will both take a lot of preprocessing to extract only the data we need, and to derive new data by combining sources

Possible questions we want to answer?
● Relationships between test cases
● Difficulty of tests
● Can we find struggling teams / strong teams
● Bad team dynamics / Unequal contributions
● Visualize technical debt
● Time when teams are most active

Test View
(figure)
Team View
(figure)

Team Activity Vis
(figure)

THANK YOU!

Visualizing Protein-protein interaction networks in Pseudomonas Aeruginosa
CPSC 547 Project Pitch
Javier J. Castillo-Arnemann
October 8, 2019

Background: PaIntDB
● Pseudomonas Interaction DataBase
● Protein-protein and protein-metabolite interactions in Pseudomonas aeruginosa strains PAO1 and PA14. (157,427 interactions)
● P. aeruginosa is a multi-drug resistant pathogen involved in cystic fibrosis and other diseases. Antibiotic resistance has gotten worse and will continue to do so.
● Systems-level understanding of biological function (looking at groups of genes instead of individual genes).
● Helps visualize and interpret RNASeq Differentially Expressed genes, TnSeq phenotypically important genes, or any kind of gene list.

PaIntDB pipeline
1. Run experiment (gene knockouts, antibiotic treatment, temperature...)
2. Perform RNASeq/TnSeq.
3. Perform statistical analyses to determine genes of interest.
4. Analyze and interpret list of genes of interest.
5. Upload list to PaIntDB and generate a network of interactions between these genes.

PaIntDB
Input: List of genes with optional expression data.
Output: Network showing interactions between these genes.

PaIntDB
Three network classes:
1. BioNetwork: basic PPI networks, no experimental data, just database info.
2. DENetwork: contains attributes and methods to handle differential expression data. (log2foldchange, adjusted p-values for every gene)
3. Combined network: additional attributes and methods to combine DE gene lists and TnSeq gene lists.

Attribute types
Network Class / Categorical / Ordered
BioNetwork / Location, Type / Node degree (quantitative)
DENetwork / (none) / Log2FoldChange (quantitative, divergent), P-value (quantitative, sequential)
Combined network / Source of interest / (none)

Issues
Hairball effect.
One solution: Generate sub-networks out of functional enrichment.
(Figure: tRNA processing genes sub-network)

Project Goals
● Implement node clustering and expand on-demand for node-link views.
○ Cluster by network topology or by expression values? Both?
● Develop matrix view for large networks to complement the node-link view?
○ How to order the nodes in the table?

Implementation
Done:
● Python back-end for generating networks and statistical analyses.
In progress:
● Dash front-end for GUI.
For the project:
● Dash.Cytoscape library for interactive node-link network visualization.
● D3.js for matrix view?

China Multi-Generational Panel Dataset, Shuangcheng, 1866-1913
Margot Chen
● 1.3 million annual observations of
● over 100,000 unique individuals descended from families,
● including ethnicity, life event, occupation, landholding...
● in Northeastern China, for the period 1866 - 1913

What: Networks & Tables
Why: Present inequality over generations; Discover other socioeconomic patterns.
How: Filtering, aggregation, and navigation for networks; Streamgraph to show trends.
Time-based Restaurant Map
Kevin Chow
CPSC 547
Now recruiting!
(Figure labels: 10:00 AM, 6:30 PM)

Data:
• Google Maps API
• Yelp Open Dataset/API

Tech:
• Leaflet
• Polymaps
• ….

TraViz: Visualization of Distributed Traces
- Matheus Stolet
- Vaastav Anand

What are Distributed Systems?
▶ "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." - Leslie Lamport

Distributed Systems are everywhere
▶ Distributed systems are widely deployed [1]
● Graph processing
● Stream processing
● Distributed databases
● Failure detectors
● Cluster schedulers
● Version control
● ML frameworks
● Blockchains
● KV stores
● ...
[1] Mark Cavage. 2013. There's Just No Getting around It: You're Building a Distributed System. Queue 11, 4, Pages 30 (April 2013)

Need for Observability: Ability to answer questions
● Which nodes/services did the request go through?
● Where were the bottlenecks for the request?
● What happened at every node/service to process the request?
● Where did the errors happen?
● Root cause analysis
Distributed tracing can answer these questions

What is Distributed Tracing?
● Each trace represents path of 1 request through the system
● Trace collects and contains timing info, events across nodes, processes, and threads.
● Depending on verbosity, may also contain stack traces.
"Story of a request through a system"

Datasets
● 2 Trace Datasets & respective source code
○ DeathStarBench : https://github.com/delimitrou/DeathStarBench (Modified Version : https://gitlab.mpi-sws.org/cld/systems/deathstarbench)
○ Hadoop : https://gitlab.mpi-sws.org/cld/systems/hadoop
● DSB : 22390 traces
● Hadoop : 72030 traces

● How different was the execution of 1 request?
● How do different groups of requests differ?
● Axes for differences
○ Structural
○ Performance
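
As a rough illustration of how a trace can answer the observability questions above (which services a request went through, and where the bottleneck was), here is a minimal Python sketch. The `Span` schema, the service names, and the timings are hypothetical and are not the format of the DeathStarBench or Hadoop datasets. The self-time calculation assumes that child spans run sequentially inside their parent's interval.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One timed unit of work inside a trace (hypothetical schema)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the request
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def services_visited(spans: List[Span]) -> List[str]:
    """Services the request went through, ordered by first start time."""
    seen: List[str] = []
    for span in sorted(spans, key=lambda s: s.start_ms):
        if span.service not in seen:
            seen.append(span.service)
    return seen


def self_time_ms(span: Span, spans: List[Span]) -> int:
    """Time spent in the span itself, excluding its direct children.

    Assumes children do not overlap each other (sequential execution).
    """
    child_total = sum(c.duration_ms for c in spans if c.parent_id == span.span_id)
    return span.duration_ms - child_total


def bottleneck(spans: List[Span]) -> Span:
    """The span with the largest self time: a simple bottleneck heuristic."""
    return max(spans, key=lambda s: self_time_ms(s, spans))


# A made-up trace of one request through four services.
trace = [
    Span("a", None, "frontend", 0, 100),
    Span("b", "a", "search", 10, 40),
    Span("c", "a", "storage", 45, 95),
    Span("d", "c", "cache", 50, 60),
]

print(services_visited(trace))
print(bottleneck(trace).service)
```

Comparing these per-span values across many traces is one way to approach the structural (which spans exist) and performance (how long they take) axes of difference listed above.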