Who? Networks of social entities Max Kemman University of Luxembourg December 13, 2016 Doing Digital History: Introduction to Tools and Technology
Today • Final assignment • Preparing the data with Palladio • (Cleaning the date column with Google Spreadsheets) • Visualising with Palladio • Next time
Final assignment Some additional info about the final assignment. The computers in the TIC-Lab are powerful enough to work with all mails in Google Spreadsheets. (You may also use Excel if you prefer, but it is more difficult for me to help when you're stuck.) Create a selection and argue why you chose this selection. Deadline: 20 January 2017, 23:59. You will receive your grades on Friday 27 January 2017.
Final assignment data All data is in Moodle in folder Final Assignment : • allmails-metadata.csv & allmails-metadata.ods • allmails-ner.csv & allmails-metadata.ods (including mentioned people, organisations, locations) • allmails-geocoded.csv (about 108k locations) • Folder with text files per 1k
Preparing the data with Palladio To visualize the coded data, we will use Palladio: http://hdlab.stanford.edu/palladio/ First we need to prepare the data for Palladio
Loading the data Click Start. We will use the 1000mails-cleandate.csv file (from Moodle, in the Who folder). Drag the CSV file onto the text input field. Click Load.
Preparing the data You will get a list of the columns from the spreadsheet. You can already give your project and your data table a title. Do not close or refresh this tab, or you will have to start over! Let's look at several columns.
From Sort the values by Frequency Check the data type Click Close
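Sorting a column by frequency, as Palladio does here, can also be checked outside the tool. The following is a minimal sketch that counts the values in the From column; the sample rows and addresses are hypothetical stand-ins for the real CSV file.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the real file has more columns and rows.
SAMPLE_CSV = """From,To
alice@example.org,bob@example.org
alice@example.org,carol@example.org
bob@example.org,alice@example.org
"""


def sender_frequencies(csv_text):
    """Count how often each value occurs in the From column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["From"] for row in reader)


frequencies = sender_frequencies(SAMPLE_CSV)

# most_common() lists the values from most to least frequent.
for sender, count in frequencies.most_common():
    print(sender, count)
```

To run this on a real file, you could replace io.StringIO(csv_text) with an opened file handle.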
Date To set the data type to date, the values need the format YYYY-MM-DD. In our original CSV the format also included the time of day, but here the data is already in the right format, so it is recognised automatically. See the next section for how to clean the date. Click Close.
People This contains the named entities per email To separate multiple people in an email, enter the delimiter | in the Multiple values box Click Close
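What the Multiple values setting does with the | delimiter can be sketched in a few lines. This is only an illustration of the idea, not Palladio's own code; the function name and the example names are made up.

```python
def split_multiple_values(cell, delimiter="|"):
    """Split one cell into a list of values, dropping empty entries."""
    return [part.strip() for part in cell.split(delimiter) if part.strip()]


# Hypothetical cell from the People column with three names.
people = split_multiple_values("Jane Doe|John Smith | Max Example")
print(people)
```

Without the delimiter, the whole cell would be treated as one single person with a very long name.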
Cleaning the date column with Google Spreadsheets Here we used Google Spreadsheets, but this is also possible in Excel and LibreOffice. You can skip this for now, but it is important for the final assignment.
Cleaning the Date field Select the Date column, and go to Format > Number > More Formats > More date and time formats
Cleaning the Date field Select the appropriate option YYYY-MM-DD and click Apply
Cleaning the Date field The Date column will now have the appropriate format
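The same cleaning step can be sketched in code. The slides only say that the original dates "included the clock", so the input format below (date followed by hours, minutes and seconds) is an assumption; adjust it to whatever your file actually contains.

```python
from datetime import datetime

# Assumed input format, e.g. "2016-12-13 09:30:00". This is a guess;
# check the real Date column and change the format string if needed.
ASSUMED_INPUT_FORMAT = "%Y-%m-%d %H:%M:%S"


def clean_date(value, input_format=ASSUMED_INPUT_FORMAT):
    """Drop the time of day and return the date as YYYY-MM-DD."""
    parsed = datetime.strptime(value.strip(), input_format)
    return parsed.strftime("%Y-%m-%d")


cleaned = clean_date("2016-12-13 09:30:00")
print(cleaned)
```

A value that does not match the assumed format raises a ValueError, which is a useful signal that the column contains something unexpected.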
Exporting the CSV Click File > Download as > Comma-separated values (.csv, current sheet)
Visualising with Palladio Now let's look at the network by selecting Graph in the top bar. As a source, choose From and close the popup. As a target, choose To and close the popup. Wait and watch the result!
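The graph treats every email as a link from the source (From) to the target (To). A minimal sketch of that idea, assuming one recipient per row and using hypothetical addresses:

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows, assuming a single recipient in the To column.
SAMPLE_CSV = """From,To
alice@example.org,bob@example.org
alice@example.org,bob@example.org
bob@example.org,carol@example.org
"""


def build_edges(csv_text):
    """Count the emails for each (source, target) pair."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter((row["From"], row["To"]) for row in reader)


edges = build_edges(SAMPLE_CSV)

# The nodes are all people who appear as a source or a target.
nodes = {person for pair in edges for person in pair}

for (source, target), weight in edges.items():
    print(source, "->", target, weight)
```

Each pair is one line in the graph, and the count shows how many emails went along that line.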
Palladio Graph Settings Try the two Highlighting check-boxes Try Size nodes What can we learn from this graph?
Facet To filter for certain attributes, select Facet in the lower-left corner. As a Dimension, select From and close the popup. Now you can filter for emails from only one person. Alternatively, you could filter for emails mentioning a specific person, location, or organisation. To refine even further, we can add more facets by selecting Dimension again and choosing more options. To remove a facet, click the red trashcan in the lower-right corner.
Facet selection from From column
Facet selection from People column
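A facet is essentially a filter on one column. The sketch below keeps only the rows where the chosen dimension contains the selected value; the rows and names are hypothetical, and the | delimiter is the one used for the People column.

```python
# Hypothetical rows standing in for the loaded email table.
emails = [
    {"From": "alice@example.org", "People": "Jane Doe|John Smith"},
    {"From": "bob@example.org", "People": "John Smith"},
    {"From": "alice@example.org", "People": ""},
]


def facet(rows, dimension, selected, delimiter="|"):
    """Keep rows whose dimension column contains the selected value."""
    return [
        row for row in rows
        if selected in row[dimension].split(delimiter)
    ]


from_alice = facet(emails, "From", "alice@example.org")
mentioning_john = facet(emails, "People", "John Smith")
print(len(from_alice), len(mentioning_john))
```

Applying facet() to the result of an earlier facet() call corresponds to refining with more than one facet.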
Timeline We can also create a timeline of the emails by clicking Timeline. Drag the mouse in the timeline to create a bar that acts as a filter. Then drag the bar to move it around, so you can see how the network develops: you could compare months or years. To remove the timeline filter, click the red trashcan in the lower-right corner.
Timeline
Filtering one part of the timeline
Filtering another part of the timeline
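The timeline bar works like a date range filter. A minimal sketch, assuming the Date column is already cleaned to YYYY-MM-DD and using hypothetical rows:

```python
from datetime import date

# Hypothetical rows with dates already in YYYY-MM-DD format.
emails = [
    {"From": "alice@example.org", "Date": "2009-01-15"},
    {"From": "bob@example.org", "Date": "2009-02-03"},
    {"From": "carol@example.org", "Date": "2010-01-20"},
]


def filter_by_period(rows, start, end):
    """Keep rows whose Date falls between start and end (inclusive)."""
    return [
        row for row in rows
        if start <= date.fromisoformat(row["Date"]) <= end
    ]


january_2009 = filter_by_period(emails, date(2009, 1, 1), date(2009, 1, 31))
year_2009 = filter_by_period(emails, date(2009, 1, 1), date(2009, 12, 31))
print(len(january_2009), len(year_2009))
```

Moving the bar along the timeline corresponds to calling filter_by_period() with a different start and end.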
Why filtering? The network can become quite large when you have more emails, or when you select one of the people, locations, or organisations columns in the graph. Filtering helps to keep the spaghetti/graph readable. See the next slide for an example of a spaghetti ball (trying to do this might make your computer quite slow).
Sharing To export a graph, click the Download button in the settings (the lower one). This will export an SVG file that you can embed in your HTML report with <img src="Palladio Graph.svg" alt="graph">
To export the entire workspace, click the upper Download button. This will export a JSON file that you can load next time (see next slide)
If you previously exported your workspace, you can load it in by selecting "Load an existing project" and choosing the JSON file. Also useful to share with project partners
For next time 20 December Wrap-up