
Data Visualizations of HYIP Dataset



  1. Data Visualizations of HYIP Dataset
     Quantifying the World, April 23, 2012
     Jie Han

  2. Financial Cryptography 2012
     This could be you!!!
     http://fc12.ifca.ai/pre-proceedings/paper_27.pdf

  3. Overview
     1. What's an HYIP?
     2. Dataset
     3. Processes
     4. R graph examples
     5. Google Chart examples
     6. Some helpful hints

  4. High Yield Investment Programs (HYIPs)
     ● Also known as a Ponzi or pyramid scheme
     ● Promise high returns on investment
     ● Pay existing investors with revenue from new investors
     ● Unsustainable in the long run

  5. Why are HYIPs a problem?
     ● Advertised as legitimate investments
     ● Sophisticated online ecosystem in support of the schemes

  6. HYIP Website

  7. HYIP Aggregator Websites

  8. HYIP Variables

  9. HYIP Lifetime
     Typical life cycle of an HYIP:

  10. About the Data
     ● Collection started 11/17/2010 and is still running
     ● Collected data from nine "aggregator" websites
     ● Total observations: 141k+
     ● Total HYIPs observed: 1,576+

  11. Process
     ● Data collection (Python, crontab, mongoDB)
     ● Preliminary analysis (Python, R)
     ● Continue data collection, work on parsing all aggregators (Python)
     ● Look at what we have, decide on what we want (R)
     ● Difficulties in analyzing data -> create interactive data visualizations (Python, Google Charts, JS, HTML)
     ● Use new tools to look for patterns (browser & eyes)

  12. How an R Chart Gets Generated
     Background scripts
     ● Data collection (Python)
     ● Parse data & insert into db (Python, mongoDB)
     Back End
     ● Fetch & manipulate data (Python, mongoDB, R)
     ● Output a .pdf image to server
     Front End
     ● User interacts with data in browser
     ● New user input (HTML forms)
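The back-end step on this slide (Python hands data to R, R writes a .pdf to the server) could be wired together roughly as follows. This is a minimal sketch, not the presenter's code: the script name `plot_lifetimes.R`, the file names, and the argument order are all hypothetical, and the R script is assumed to read its arguments with `commandArgs(trailingOnly = TRUE)` and open the output with `pdf()`.

```python
import subprocess
from pathlib import Path


def build_rscript_command(script, input_csv, output_pdf):
    """Build the argv list for running an R plotting script non-interactively.

    Assumption: the (hypothetical) R script takes the input CSV path and the
    output PDF path as its two trailing arguments.
    """
    return ["Rscript", "--vanilla", str(script), str(input_csv), str(output_pdf)]


def render_chart(script, input_csv, output_pdf):
    """Run the R script and return the path of the PDF it is expected to write.

    check=True raises CalledProcessError if R exits with a non-zero status,
    so a failed plot does not silently leave a stale PDF on the server.
    """
    subprocess.run(build_rscript_command(script, input_csv, output_pdf), check=True)
    return Path(output_pdf)
```

Passing the command as a list rather than a shell string avoids quoting problems when file names come from user input in the HTML form.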

  13. How Can We Trust Aggregator Data?
     CDF of Standard Deviations of HYIP Lifetimes
     ● Aggregators agree 80% of the time
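One way to read the chart on this slide: for each HYIP, take the lifetimes reported by the different aggregators, compute their standard deviation, and then look at the cumulative distribution of those values. The sketch below illustrates that idea under assumptions: the input layout, the function names, and the choice of population standard deviation are not taken from the slides, and the sample values are made up.

```python
from statistics import pstdev


def lifetime_std_by_hyip(observations):
    """Map each HYIP to the standard deviation of its reported lifetimes.

    observations: {hyip_name: [lifetime_in_days reported by each aggregator]}
    HYIPs tracked by fewer than two aggregators are skipped, since there is
    nothing to compare.
    """
    return {
        name: pstdev(days)
        for name, days in observations.items()
        if len(days) >= 2
    }


def fraction_at_or_below(values, threshold):
    """Empirical CDF evaluated at one point: share of values <= threshold."""
    values = list(values)
    if not values:
        raise ValueError("no values to summarize")
    return sum(1 for v in values if v <= threshold) / len(values)


# Made-up sample: two aggregators agree exactly on "a", disagree on "b".
sample = {"a": [10, 10, 10], "b": [5, 7], "c": [30]}
spread = lifetime_std_by_hyip(sample)
share_in_agreement = fraction_at_or_below(spread.values(), 0.5)
```

Evaluating the CDF at a small threshold gives the share of HYIPs on which the aggregators roughly agree.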

  14. How Long Do HYIPs Last Before Collapsing?
     Survival function of HYIP Lifetimes
     ● Most HYIPs collapse within a few weeks
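Because data collection is still running, some HYIPs have not collapsed yet, so their lifetimes are only known to be "at least this long". A common estimator for a survival function with such censored observations is Kaplan-Meier. The slides do not say which estimator was used, so the following is an illustrative pure-Python sketch with made-up inputs.

```python
def kaplan_meier(durations, collapsed):
    """Kaplan-Meier estimate of the survival function.

    durations: observed lifetime in days for each HYIP
    collapsed: True if the HYIP was seen to collapse, False if it was still
               running when last observed (right-censored)
    Returns [(t, S(t))] for each time t at which a collapse was observed.
    """
    events = sorted(zip(durations, collapsed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        leaving = 0
        # Group all HYIPs whose observation ends at the same time t.
        while i < len(events) and events[i][0] == t:
            if events[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Collapsed and censored HYIPs both leave the risk set after t.
        at_risk -= leaving
    return curve


# Made-up example: four HYIPs, one still running at day 10.
km_curve = kaplan_meier([5, 10, 10, 20], [True, True, False, True])
```

Each point in the result gives the estimated probability that an HYIP survives beyond that day.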

  15. What Factors Lead to Collapse?
     Factors that lead to shorter HYIP lifespans:
     ● Higher advertised rates of return
     ● Shorter mandatory investment terms
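A quick first look at whether a factor goes with shorter lifespans is to split the HYIPs at a threshold and compare median lifetimes on each side. The slides do not state which method produced their finding; the sketch below is only a rough exploratory check, and the field names, threshold, and numbers are hypothetical.

```python
from statistics import median


def median_lifetime_by_group(records, key, threshold):
    """Compare median lifetime for records at/below vs. above a threshold.

    records: list of dicts with a "lifetime_days" field plus the factor `key`
    Returns {"low": median or None, "high": median or None}.
    """
    low = [r["lifetime_days"] for r in records if r[key] <= threshold]
    high = [r["lifetime_days"] for r in records if r[key] > threshold]
    return {
        "low": median(low) if low else None,
        "high": median(high) if high else None,
    }


# Made-up records for illustration only; these are not values from the dataset.
example_records = [
    {"daily_rate_pct": 1.0, "lifetime_days": 90},
    {"daily_rate_pct": 1.5, "lifetime_days": 60},
    {"daily_rate_pct": 4.0, "lifetime_days": 14},
    {"daily_rate_pct": 6.0, "lifetime_days": 7},
]
by_rate = median_lifetime_by_group(example_records, "daily_rate_pct", 2.0)
```

The same function can be pointed at a mandatory-term field to examine the second factor on the slide.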

  16. R vs. Google Charts
     R
     ● Useful if familiar with the dataset
     ● Good at presenting aggregate summaries
     ● Large learning curve, especially when you want to do something specific
     ● More customizable
     ● Most analysis techniques are available
     Google Charts
     ● Anyone can view & interact with the data
     ● See a complete data distribution
     ● Learning curve isn't bad
     ● Not as customizable
     ● Have to wait for updates for more functionality, or write your own

  17. How a Google Chart Gets Generated
     Background scripts
     ● Data collection (Python)
     ● Parse data & insert into db (Python, mongoDB)
     Back End
     ● Fetch & manipulate data (Python, mongoDB, R)
     ● Write JS & HTML page (Python, JS, HTML, CSS)
     Front End
     ● User interacts with data in browser
     ● New user input (HTML forms)
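The "Write JS & HTML page" step can be sketched as a Python function that serializes rows to JSON and drops them into a page template. This is an assumption-laden illustration rather than the presenter's code: it uses the current Google Charts loader (`loader.js`, `google.charts.load`), which may differ from what was available in 2012, and the function name, chart type, and sample data are hypothetical.

```python
import html
import json
from string import Template

# string.Template uses $name placeholders, so the braces in the JS are left alone.
PAGE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>$title</title>
  <script src="https://www.gstatic.com/charts/loader.js"></script>
  <script>
    google.charts.load('current', {packages: ['corechart']});
    google.charts.setOnLoadCallback(function () {
      var data = google.visualization.arrayToDataTable($rows);
      var chart = new google.visualization.ScatterChart(
          document.getElementById('chart'));
      chart.draw(data, {title: $title_js});
    });
  </script>
</head>
<body><div id="chart" style="width: 800px; height: 500px;"></div></body>
</html>
""")


def render_chart_page(header, rows, title):
    """Return an HTML page that draws `rows` as a Google Charts scatter plot."""
    table = [list(header)] + [list(r) for r in rows]
    # Escape "</" so data can never close the surrounding <script> element.
    rows_json = json.dumps(table).replace("</", "<\\/")
    title_json = json.dumps(title).replace("</", "<\\/")
    return PAGE.substitute(
        title=html.escape(title), rows=rows_json, title_js=title_json
    )


# Made-up data for illustration.
page = render_chart_page(
    ["Daily rate (%)", "Lifetime (days)"],
    [[1.5, 40], [3.0, 12]],
    "Rate vs. lifetime",
)
```

Writing the returned string to a file under the web server's document root would make the chart available in the browser.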

  18. Distribution of HYIPs Around the World Link

  19. Motion Charts Link

  20. Variable Changes Over Time: cherryshares.com, aggregator rating Link

  21. Relationships Between Two Variables Link

  22. Multi-Dimensional Scatterplot Link

  23. Multi-Dimensional Scatterplot Link

  24. General Programming Tips
     ● Spend time on data quality
     ● Organize your code, variable names, and files
     ● Keep records of working examples
     ● Plan out your code to maximize pattern capture
     ● Error-catching, browser consoles, and regexes are friends
     ● Test out chunks of code before putting them together
     ● Google Tables take a while to load for large datasets
     ● Google Charts Playground allows you to test code in their environment
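The tips on regexes and error-catching can be combined in a small parser that extracts fields from a listing line and sets aside anything it cannot read, rather than crashing mid-run. The line format below is invented for illustration; real aggregator pages will look different, and the field names are hypothetical.

```python
import re

# Hypothetical listing format: "site | status | N% daily | M days"
LISTING_RE = re.compile(
    r"^(?P<site>[\w.-]+)\s*\|\s*(?P<status>[A-Za-z ]+?)\s*\|\s*"
    r"(?P<rate>\d+(?:\.\d+)?)%\s*daily\s*\|\s*(?P<term>\d+)\s*days?$"
)


def parse_listing(line):
    """Parse one listing line into a dict, or raise ValueError if it is malformed."""
    match = LISTING_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized listing line: {line!r}")
    return {
        "site": match["site"],
        "status": match["status"].lower(),
        "daily_rate_pct": float(match["rate"]),
        "term_days": int(match["term"]),
    }


def parse_all(lines):
    """Parse every line, collecting failures separately for later inspection."""
    records, rejects = [], []
    for line in lines:
        try:
            records.append(parse_listing(line))
        except ValueError:
            rejects.append(line)
    return records, rejects


records, rejects = parse_all([
    "example-hyip.com | Paying | 2.5% daily | 14 days",
    "this line does not match",
])
```

Keeping the rejected lines makes it easy to spot when an aggregator changes its page layout, which ties back to the tip about spending time on data quality.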

  25. Future Work
     ● Create an interactive web-based visualization for our dataset (some examples I made)
     ● Link scams together
     ● Explore larger dataset

  26. Thanks!
