
Application-specific Compression for Remote Visualization of Genomics Applications



  1. Application-specific Compression for Remote Visualization of Genomics Applications
     Lars Ailo Bongo, Kai Li, Olga Troyanskaya, Tore Larsen and Grant Wallace

  2. Outline
     - Motivation
       - Genomics applications
       - WAN challenges
     - Compression
     - Methodology
     - Compression results
     - System
     - Conclusion and future work

  3. Functional Genomics
     - Describe the function and interaction of genes.
     - Search for patterns in microarray data.
     - Hundreds of measurements for thousands of genomes.
     - Visualizations important.

  4. Genomic Applications
     - Example screenshots

  5. Remote Collaboration
     - Important.
       - "The sequence of the human genome," by J. Craig Venter and 284 others, Science, 291(5507):1304-51, 16 February 2001.
     - Challenges:
       - Performance: bandwidth and latency.
       - Privacy.
       - Security.
       - Ease of use.

  6. Goal
     [Diagram; only the label "WAN" survives.]

  7. Thin-client Remote Visualization
     - Only share pixels
       - Put rectangle of pixel data at a given x, y position
     - Examples
       - VNC
       - Microsoft Remote Desktop
       - Microsoft Livemeeting
     - Advantages:
       - Only share visualizations, not raw data.
       - Very simple clients → portable
       - Thick servers → easy data management
     - Disadvantage
       - Low performance
       - High bandwidth requirements
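The "put rectangle of pixel data at a given x, y position" primitive can be illustrated with a minimal sketch. The `RectUpdate` type and `apply_update` function are hypothetical names chosen for this example; they are not the message format of VNC or any other system named on the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RectUpdate:
    """Hypothetical update message: a block of pixels and where to put it."""
    x: int
    y: int
    width: int
    height: int
    pixels: List[List[int]]  # height rows, each with width pixel values


def apply_update(framebuffer: List[List[int]], update: RectUpdate) -> None:
    """Copy the rectangle into the client's framebuffer at (x, y)."""
    for row in range(update.height):
        for col in range(update.width):
            framebuffer[update.y + row][update.x + col] = update.pixels[row][col]


# A 6x4 client framebuffer, initially all zeros.
framebuffer = [[0] * 6 for _ in range(4)]
apply_update(
    framebuffer,
    RectUpdate(x=2, y=1, width=3, height=2, pixels=[[7, 7, 7], [8, 8, 8]]),
)
```

The client needs nothing beyond this copy loop, which is why thin clients are simple and portable; the cost is that every changed pixel has to cross the network.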

  8. Bandwidth Requirements

  9. Outline
     - Motivation
     - Compression
       - Lossless compression algorithms
       - Rabin fingerprints
       - 2D anchoring schemes
     - Methodology
     - Compression results
     - System
     - Conclusion and future work

  10. Compression
     - Lossy
       - Update frequency reduction
       - Color reduction
       - JPEG (frequency downsampling)
     - Lossless
       - Diff
       - Run-length encoding (RLE)
       - Fingerprinting (FP)
     - Our approach: diff + FP + RLE

  11. Diff
     - Only send what has been changed since last update
     - VNC does this: works well for text editing

  12. Diff (2)
     - Problem: how to detect what has been changed?
     - Scrolling only moves pixels.
     - Scrolling important for Genomics applications.
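A small sketch shows why a position-based diff handles a local edit well but not scrolling. The row-level comparison below is an assumption made for brevity; it is not how VNC or the authors' system detects changes.

```python
def changed_rows(previous, current):
    """Return indices of rows that differ at the same screen position."""
    return [i for i, (old, new) in enumerate(zip(previous, current)) if old != new]


# A toy 6-row screen in which every row has distinct pixel values.
screen = [[r * 10 + c for c in range(4)] for r in range(6)]

# Local edit (like typing in a text editor): one pixel changes.
edited = [row[:] for row in screen]
edited[2][1] = 999

# Scroll up by one row: old pixels move up, one new row appears at the bottom.
scrolled = screen[1:] + [[60, 61, 62, 63]]

edit_diff = changed_rows(screen, edited)      # only row 2 differs
scroll_diff = changed_rows(screen, scrolled)  # every row differs
```

After the scroll only one row of pixels is actually new, yet the positional diff reports the whole screen as changed.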

  13. Bits Changed Between Synchronization Events

  14. Run-length encoding (RLE)
     - Zlib
       - DEFLATE = LZ77 + Huffman
     - Example:
       - AAAAABBBBCCCDDE → 5*A 4*B 3*C DDE
       - A = 10001001 → A = 01
     - VNC Hextile
       - Raw pixels and encoded pixels
       - Rectangles with a single color
       - Rectangles with same color as previous
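The run-length example on this slide can be reproduced with a short sketch. The `min_run=3` threshold is an assumption inferred from the example, where the two-character run `DD` stays literal; the `count*symbol` token format is likewise only the slide's notation, not a real wire format. The second half uses Python's standard `zlib` module, which implements DEFLATE.

```python
import zlib
from itertools import groupby


def rle_encode(text: str, min_run: int = 3) -> str:
    """Encode runs of at least min_run symbols as 'count*symbol'."""
    tokens = []
    literal = ""
    for symbol, group in groupby(text):
        run = len(list(group))
        if run >= min_run:
            if literal:  # flush any pending literal symbols first
                tokens.append(literal)
                literal = ""
            tokens.append(f"{run}*{symbol}")
        else:
            literal += symbol * run  # short runs are kept as-is
    if literal:
        tokens.append(literal)
    return " ".join(tokens)


encoded = rle_encode("AAAAABBBBCCCDDE")

# DEFLATE (LZ77 + Huffman) on a highly repetitive byte string.
raw = b"AAAAABBBBCCCDDE" * 100
packed = zlib.compress(raw)
unpacked = zlib.decompress(packed)
```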

  15. Fingerprinting - Example
     - All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy. All work and no play makes Jack a dull boy.

  16. Select Anchor Points
     - [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy. [All] work and no play makes Jack a dull boy.
     (Brackets stand in for the slide's highlighting of the anchor points.)

  17. Calculate Hash for Regions
     - hash("All work and no play makes Jack a dull boy. ") → 0x12ad82b3
     - Send:
       - (0x12ad82b3, "All work and no play makes Jack a dull boy. ")
       - 0x12ad82b3
       - 0x12ad82b3
       - …
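The send sequence above (full data the first time, hash only afterwards) can be sketched as a pair of caches. The function names, the message tuples, and the use of a truncated SHA-1 digest as the region hash are all assumptions for illustration; the slide does not say which hash produced `0x12ad82b3`.

```python
import hashlib


def region_key(region: bytes) -> str:
    """Hypothetical region hash: first 8 hex digits of SHA-1."""
    return hashlib.sha1(region).hexdigest()[:8]


def encode(regions, sender_cache: set):
    """Send (key, data) for unseen regions and (key, None) for repeats."""
    messages = []
    for region in regions:
        key = region_key(region)
        if key in sender_cache:
            messages.append((key, None))
        else:
            sender_cache.add(key)
            messages.append((key, region))
    return messages


def decode(messages, receiver_cache: dict):
    """Rebuild the regions, filling repeats from the receiver's cache."""
    regions = []
    for key, data in messages:
        if data is not None:
            receiver_cache[key] = data
        regions.append(receiver_cache[key])
    return regions


sentence = b"All work and no play makes Jack a dull boy. "
regions = [sentence] * 4
messages = encode(regions, set())
restored = decode(messages, {})
```

Only the first message carries the 44-byte sentence; the other three carry just the key.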

  18. Rabin Fingerprints
     - Fast sliding window algorithm
     - All work and no play makes Jack a dull boy.
       0x83af

  19. Rabin Fingerprints
     - Fast sliding window algorithm
     - All work and no play makes Jack a dull boy.
       0x83af, 0x3241

  20. Rabin Fingerprints
     - Fast sliding window algorithm
     - All work and no play makes Jack a dull boy.
       0x83af, 0x3241, 0x31fa
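The sliding-window idea on the last three slides can be sketched with a rolling hash. Note the substitution: true Rabin fingerprints use polynomial arithmetic over GF(2), whereas this sketch uses a simpler Rabin-Karp style polynomial hash modulo a prime. The `base`, `modulus`, and window size are arbitrary assumptions. What carries over is the key property: each new fingerprint is derived from the previous one in constant time.

```python
def direct_fingerprint(chunk: bytes, base: int = 257,
                       modulus: int = (1 << 61) - 1) -> int:
    """Hash one window from scratch (used here only to check the rolling version)."""
    fp = 0
    for byte in chunk:
        fp = (fp * base + byte) % modulus
    return fp


def rolling_fingerprints(data: bytes, window: int, base: int = 257,
                         modulus: int = (1 << 61) - 1) -> list:
    """Return one fingerprint per window position, updating incrementally."""
    if len(data) < window:
        return []
    top = pow(base, window - 1, modulus)  # weight of the byte leaving the window
    fp = direct_fingerprint(data[:window], base, modulus)
    fingerprints = [fp]
    for i in range(window, len(data)):
        fp = (fp - data[i - window] * top) % modulus  # drop the oldest byte
        fp = (fp * base + data[i]) % modulus          # shift in the newest byte
        fingerprints.append(fp)
    return fingerprints


text = b"All work and no play makes Jack a dull boy. " * 2
fps = rolling_fingerprints(text, window=8)
```

Because a fingerprint depends only on the bytes inside its window, the repeated sentence yields the same fingerprints at the same offsets within each repetition.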

  21. Rabin Fingerprints (2)
     - All work and no play makes Jack a dull boy.
       0x83af, 0x3241, 0x31fa, 0x32ab, 0x3210, 0x9421, 0xab21, 0x32da, 0x31ab
     - Fixed window size [Spring and Wetherall]:
       - Use Rabin fingerprints as cache index
       - Select fingerprints based on k least significant bits.
     - Variable window size [Manber]:
       - Use Rabin fingerprints as anchor points
       - Select fingerprints based on k least significant bits.
       - Calculate SHA-1 hash value for region between anchor points
       - Use SHA-1 value as cache index
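The variable-window scheme can be sketched as follows: a position becomes an anchor when the k least significant bits of its window fingerprint are zero, and the region between consecutive anchors is indexed by its SHA-1 digest. As assumptions for this sketch, the fingerprint is a polynomial rolling hash modulo a prime rather than a true Rabin fingerprint, "selected" is taken to mean "low bits all zero", and the window size, `k`, and constants are arbitrary.

```python
import hashlib

BASE = 257
MODULUS = (1 << 61) - 1


def anchor_offsets(data: bytes, window: int = 8, k: int = 4) -> list:
    """Offsets just past each window whose fingerprint has k zero low bits."""
    if len(data) < window:
        return []
    mask = (1 << k) - 1
    top = pow(BASE, window - 1, MODULUS)
    fp = 0
    for byte in data[:window]:
        fp = (fp * BASE + byte) % MODULUS
    anchors = [window] if fp & mask == 0 else []
    for i in range(window, len(data)):
        # Slide the window one byte: remove data[i - window], add data[i].
        fp = ((fp - data[i - window] * top) * BASE + data[i]) % MODULUS
        if fp & mask == 0:
            anchors.append(i + 1)
    return anchors


def split_regions(data: bytes, window: int = 8, k: int = 4) -> list:
    """Cut the data at every anchor; the tail after the last anchor is kept."""
    regions, start = [], 0
    for end in anchor_offsets(data, window, k):
        regions.append(data[start:end])
        start = end
    if start < len(data):
        regions.append(data[start:])
    return regions


def cache_index(region: bytes) -> str:
    """SHA-1 of a region, used as its cache key."""
    return hashlib.sha1(region).hexdigest()


data = b"All work and no play makes Jack a dull boy. " * 8
regions = split_regions(data)
keys = [cache_index(region) for region in regions]
```

Since anchors depend only on window contents, prepending bytes shifts the anchors along with the data instead of invalidating them, which is what makes this approach robust to insertions and, by extension, to scrolling.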

  22. 2D Data
     - Previous work mostly for 1D bytestreams.
     - How to vectorize 2D array?
     - Our approach: anchor then vectorize

       A B C D E F
       G H I J K L
       M N O P Q R
       S T U V W X
       Y Z 0 1 2 3
       4 5 6 7 8 9
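To make "vectorize" concrete, the sketch below flattens the slide's 6x6 grid in two ways: the whole array row by row, and a rectangular region at a time. The function names and the choice of three-column strips are assumptions for illustration; the slides do not specify how the authors' schemes order pixels.

```python
GRID = [
    "ABCDEF",
    "GHIJKL",
    "MNOPQR",
    "STUVWX",
    "YZ0123",
    "456789",
]


def vectorize_rows(grid) -> str:
    """Flatten the whole 2D array row by row into one 1D stream."""
    return "".join(grid)


def vectorize_region(grid, top: int, left: int, height: int, width: int) -> str:
    """Flatten only a rectangular region, row by row within the region."""
    return "".join(row[left:left + width] for row in grid[top:top + height])


def vectorize_columns(grid, column_width: int) -> list:
    """Flatten the array as a list of vertical strips of the given width."""
    return [
        vectorize_region(grid, 0, left, len(grid), column_width)
        for left in range(0, len(grid[0]), column_width)
    ]


whole = vectorize_rows(GRID)
patch = vectorize_region(GRID, top=1, left=2, height=2, width=3)
strips = vectorize_columns(GRID, column_width=3)
```

In the row-by-row stream, vertically adjacent symbols such as `A` and `G` end up a full row width apart, so the choice of region shape determines which 2D neighbours stay close together in the 1D stream that gets fingerprinted.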

  23. Anchoring Schemes
     - Fixed to glass (VNC hextile with caching)
     - Splatter (fixed window size)
     - (Supertile)
     - Supercolumn

  24. Fixed to Glass

  25. Splatter

  26. Supercolumn

  27. Outline
     - Motivation
     - Compression
     - Methodology
       - Trace capturing and playback
     - Compression results
     - System
     - Conclusion and future work
