
Malware Analysis Using Visualized Image Matrices - Tzu-Ming Huang



  1. Malware Analysis Using Visualized Image Matrices Tzu-Ming Huang CISC850 Cyber Analytics

  2. CISC850 Cyber Analytics Overview • Malware visual analysis method – convert binary files into images • Reduce computation – major block – similarity calculation method between these images

  3. Method Overview

  4. Extract opcode sequences from binary

  5. Repetition Filtering

  6. Extract opcode sequences from binary

  7. Major Block Selection • Not all of the basic blocks are used (file header and meaningless blocks are left out) • Target suspicious behavior • Major blocks are the blocks that include a “CALL” instruction

  8. Major Block Selection
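The selection rule on the slides (keep only blocks that include a CALL instruction) can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: it assumes the binary has already been disassembled and split into basic blocks, each represented as a list of opcode mnemonics. The names `BasicBlock` and `select_major_blocks` are hypothetical.

```python
from typing import List

# Assumption: one basic block = the list of its opcode mnemonics, in order.
BasicBlock = List[str]


def select_major_blocks(blocks: List[BasicBlock]) -> List[BasicBlock]:
    """Keep only the basic blocks that contain at least one CALL instruction."""
    return [
        block
        for block in blocks
        if any(op.upper() == "CALL" for op in block)
    ]


blocks = [
    ["PUSH", "MOV", "SUB"],            # no CALL: dropped
    ["PUSH", "PUSH", "CALL", "TEST"],  # has CALL: kept as a major block
    ["XOR", "CALL", "RET"],            # has CALL: kept as a major block
]
major_blocks = select_major_blocks(blocks)
print(len(major_blocks))
```

Filtering this early shrinks the input to every later step, which is where the reduced computation mentioned in the overview would come from.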

  9. Extract opcode sequences from binary

  10. Parsing Opcode Sequence • Keep only the first three characters of each opcode – 41.4% of opcodes have 3 characters – Meaning is maintained – E.g. PUSH -> PUS; CALL -> CAL; OR? • These three-character opcodes are concatenated together

  11. Parsing Opcode Sequence
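The truncate-and-concatenate step can be sketched as follows. The slides leave open how opcodes shorter than three characters (such as OR) are handled; this sketch simply keeps them unchanged, which is an assumption. The function name `parse_opcode_sequence` is hypothetical.

```python
from typing import List


def parse_opcode_sequence(opcodes: List[str], width: int = 3) -> str:
    """Truncate each opcode to its first `width` characters and concatenate.

    Assumption: opcodes shorter than `width` (e.g. OR) are kept as they are.
    """
    return "".join(op.upper()[:width] for op in opcodes)


sequence = parse_opcode_sequence(["PUSH", "MOV", "CALL", "OR"])
print(sequence)  # PUSMOVCALOR
```

Because short opcodes are not padded, the resulting string cannot always be split back into individual opcodes; that is acceptable here since the string is only used as hash input.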

  12. Generate Image Matrix • Use a hash function (SimHash) to decide the X-Y coordinate and the RGB color of each pixel • Length and width of the matrix are 2^n (8) • If several hashes fall on the same X-Y coordinate, simply sum their RGB color values

  13. Generate Image Matrix
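A rough sketch of the pixel-placement idea is given below. It is not the method from the slides: MD5 from the standard library stands in for SimHash, the way the digest bytes are split into coordinates and colors is made up for illustration, and the default side length of 2^8 is only one reading of the slide. The names `block_to_pixel` and `build_image_matrix` are hypothetical.

```python
import hashlib
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def block_to_pixel(opcode_string: str, n_bits: int = 8) -> Tuple[int, int, Pixel]:
    """Map one opcode string to an (x, y) coordinate and an RGB color.

    Assumption: MD5 replaces SimHash, and the byte layout is illustrative.
    """
    digest = hashlib.md5(opcode_string.encode("utf-8")).digest()
    side = 1 << n_bits  # matrix is side x side, with side = 2^n
    x = int.from_bytes(digest[0:2], "big") % side
    y = int.from_bytes(digest[2:4], "big") % side
    rgb = (digest[4], digest[5], digest[6])
    return x, y, rgb


def build_image_matrix(opcode_strings: List[str], n_bits: int = 8) -> List[List[List[int]]]:
    """Build a side x side matrix of [R, G, B] cells, summing colors on collisions."""
    side = 1 << n_bits
    matrix = [[[0, 0, 0] for _ in range(side)] for _ in range(side)]
    for opcode_string in opcode_strings:
        x, y, (r, g, b) = block_to_pixel(opcode_string, n_bits)
        cell = matrix[y][x]
        cell[0] += r
        cell[1] += g
        cell[2] += b
    return matrix


# Two identical major blocks hash to the same pixel, so their colors are summed.
matrix = build_image_matrix(["PUSMOVCAL", "PUSMOVCAL", "XORCALRET"])
print(len(matrix), len(matrix[0]))
```

Summed channel values can exceed 255, so a real renderer would need to clamp or normalize them before displaying the matrix as an image.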

  14. Choose Representative Image Matrix

  15. Similarity Calculation Using Image Matrix • Faster performance than opcode string comparison • Finding pairs in string: O(n^2) • SimHash and calculate similarity in image: O(n)

  16. Similarity Calculation Using Image Matrix

  17. Similarity Calculation Using Image Matrix • Vector angular-based distance measurement algorithm – Each pixel is viewed as a 3D vector (R, G, B)

  18. Similarity Calculation Using Image Matrix
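One way to read "pixels viewed as 3D vectors" with an angle-based measure is cosine similarity between the RGB vectors at each coordinate. The slides do not give the exact formula, so the sketch below is an assumption: it averages per-pixel cosine similarity over every coordinate that is set in either image, treating a missing pixel as black. Images are stored sparsely as a dict from (x, y) to (R, G, B). The names `pixel_cosine` and `image_similarity` are hypothetical.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[int, int]
Pixel = Tuple[int, int, int]
SparseImage = Dict[Coord, Pixel]  # only non-black pixels are stored


def pixel_cosine(p: Pixel, q: Pixel) -> float:
    """Cosine of the angle between two RGB vectors; 0.0 if either is black."""
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (norm_p * norm_q)


def image_similarity(img_a: SparseImage, img_b: SparseImage) -> float:
    """Average per-pixel cosine similarity over coordinates set in either image.

    Assumption: this averaging rule is illustrative, not the slides' formula.
    """
    coords = set(img_a) | set(img_b)
    if not coords:
        return 0.0
    black = (0, 0, 0)
    total = sum(
        pixel_cosine(img_a.get(c, black), img_b.get(c, black)) for c in coords
    )
    return total / len(coords)


img_a = {(0, 0): (1, 0, 0), (1, 1): (0, 1, 0)}
img_b = {(0, 0): (2, 0, 0)}
print(image_similarity(img_a, img_b))  # 0.5
```

The cost is one pass over the populated pixels, which matches the linear-time comparison the slides contrast with quadratic string matching.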

  19. Experiment: Major Blocks Selection?

  20. Experiment: Major Blocks Selection?

  21. Experiment: Feasibility

  22. Experiment: Feasibility • Similarity of malware samples from the same family: 0.19 ~ 0.36 • Similarity of malware samples from different families: < 0.05 • Classification accuracy = 0.9896
