Computational Tools and Strategies in Enhancing Access to Cultural Big Data Collections

  1. Computational Tools and Strategies in Enhancing Access to Cultural Big Data Collections
     Richard MARCIANO & Ryan COX
     marciano@umd.edu & ryan.cox@maryland.gov
     Director & Research Archivist
     Digital Curation Innovation Center (DCIC) & Maryland State Archives (MSA)
     http://DCIC.umd.edu & http://slavery.msa.maryland.gov
     Tuesday, April 30, 2019
     Columbus Metropolitan Library, Main Library, Columbus, Ohio

  2. What are Computational Strategies?
     Based on the concepts of Computational Archival Science (CAS), both case studies use visualization and analytical tools to connect and display data in new ways. The goal is to create transparency in cultural Big Data.
     Case studies involve:
     • automating the detection of personally identifiable information (PII) in Japanese-American World War II Incarceration Camps
     • penetrating the complex pre-Civil War slave system in Maryland

  3. What is CAS?
     GOAL: Explore computational treatments of archival and cultural content
     PORTAL: http://dcicblog.umd.edu/cas/
     GOOGLE GROUP: computational-archival-science@googlegroups.com
     A transdisciplinary field concerned with the application of:
     • computational methods and resources to large-scale records / archives:
     • processing, analysis, storage, long-term preservation, and access,
     • with the aim of improving efficiency, productivity and precision
     • in support of appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival materials.
     Foundational Book Chapter: May 2018
     Book: “Advances in Librarianship – Re-Envisioning the MLIS: Perspectives on the Future of Library and Information Science Education”
     Book Chapter: “Archival Records and Training in the Age of Big Data”

  4. The Emergence of Computational XXX’s
     • XXX = Social Science
       “Investigating social and behavioral relationships and interactions through: social simulation, modeling, network analysis, and media analysis”, Wikipedia
     • XXX = Biology
       “The science of using biological data to develop algorithms or models to better understand biological systems”, Wikipedia
     • XXX = Journalism
       “Finding and telling news stories, WITH, BY, or ABOUT algorithms”, Nick Diakopoulos
     • XXX = Archival Science?
       The focus of this seminar

  5. The DCIC is pioneering advances in computational treatments of archival and cultural content. See our CAS portal for the latest developments: http://dcicblog.umd.edu/cas/
     http://dcic.umd.edu
     Motto: “Integrating Education and Research”
     Mission: Be a leader in the digital curation research and educational fields, and foster interdisciplinary partnerships using Big Records and Archival Analytics through public / industry / government collaborations.
     Goals: Sponsor interdisciplinary projects that explore the integration of archival research data, user-contributed data, and technology to generate new forms of analysis and historical research engagement, particularly in the arenas of social justice, human rights, and cultural heritage.
     What is CAS? An interdisciplinary field concerned with the application of computational methods and resources to large-scale records / archives processing, analysis, storage, long-term preservation, and access, with the aim of improving efficiency, productivity and precision in support of appraisal, arrangement and description, preservation and access decisions, and engaging and undertaking research with archival materials.
     CAS Founding Partners:
     • Richard Marciano, U. Maryland
     • Mark Hedges, King’s College London (UK)
     • Vicki Lemieux, U. British Columbia (Canada)
     • Maria Esteva, Texas Advanced Computing Center (TACC)
     • Michael Kurtz, U. Maryland
     • Bill Underwood, U. Maryland
     • Greg Jansen, U. Maryland
     • Mark Conrad, National Archives and Records Administration (NARA)
     Hornbake South labs:
     • Digital lab for group learning, collaborative design, and hands-on digital curation project development.
     • Document scanning, image manipulation, and archival ingestion facility for group projects.
     Atlantic Building: On-campus virtual machine farm for research data processing, storage, and hosting (Dell servers, VMWare-powered).
     Amazon Cloud: Dashboard-enabled virtual computing lab in the cloud for creating Windows/Ubuntu instances using Amazon Web Services (AWS).
     UMD Cyberinfrastructure Center at the Rivertech Bldg: DRAS-TIC, Digital Repository At Scale That Invites Computation (To Improve Collections): a petascale archival storage and preservation repository (based on the open-source NoSQL Cassandra database) and computational infrastructure (Dell nodes).

  6. An example: Mapping Inequality, a focus on Big Data [Racial Zoning]
     UMD Student Team: Mary Kendig, Myeong Lee, Sydney Vaile, Maddie Allen, Martin Almirón, Jhon De La Cruz, Shaina Destine, Erin Durham, Darlene Reyes, Benjamin Sagay, Richard Bool

  7. Historical Context
     • Home Owners Loan Corporation, 1930’s - 1940’s
     • Rated neighborhoods by racial makeup
     • Areas without loans fell apart
     • 1950’s Urban Renewal targeted areas for clearance
     • Result: Mass displacement
     • RG195: Federal Home Loan Bank Board, HOLC, 1933 - 1951
     • Contains Maps, Neighborhood Surveys, Loan Information
     • NYTimes, Aug. 24, 2017: “Self-fulfilling prophecies: How redlining’s racist effects lasted for decades”

  8. Mapping Inequality
     Documents
     • Each survey corresponded to city map
     • Green: White / Wealthy = Best
     • Blue: White / Working = Still Desirable
     • Yellow: Foreign / Increase in PoC = Declining
     • Red: Black and Hispanic = Hazardous
     Collection Statistics
     • 150 Boxes
     • Over 10,000 surveys alone
     • 250 cities
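
     The color legend above is the kind of small controlled vocabulary that a computational treatment of the surveys would start from. The sketch below is a hypothetical illustration, not the Mapping Inequality project's code: it assumes area IDs take the form "letter-number" (as in "D-53") and that the letters A through D correspond to the green, blue, yellow, and red grades. The names HOLC_GRADES, grade_of, and tally_colors are invented for this example.

```python
from collections import Counter

# Hypothetical lookup table for the four survey grades named on the slide.
# Assumption: letters A-D map to green/blue/yellow/red; the slide itself
# lists only the colors and their labels.
HOLC_GRADES = {
    "A": {"color": "green", "label": "Best"},
    "B": {"color": "blue", "label": "Still Desirable"},
    "C": {"color": "yellow", "label": "Declining"},
    "D": {"color": "red", "label": "Hazardous"},
}


def grade_of(area_id: str) -> str:
    """Return the grade letter from an area ID such as 'D-53'."""
    letter = area_id.strip().upper().split("-")[0]
    if letter not in HOLC_GRADES:
        raise ValueError(f"unrecognized area ID: {area_id!r}")
    return letter


def tally_colors(area_ids):
    """Count how many of the given areas fall under each map color."""
    return Counter(HOLC_GRADES[grade_of(a)]["color"] for a in area_ids)


if __name__ == "__main__":
    # Made-up area IDs, for illustration only.
    sample = ["A-1", "B-7", "D-53", "D-12"]
    for color, count in tally_colors(sample).items():
        print(color, count)
```

     Keeping the legend in a single table means a transcription error in an area ID surfaces as an exception rather than as a silently miscounted neighborhood.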

  9. Neighborhood description for: Boyle Heights in L.A.
     • Area D-53
     • “Red” area
     “Boyle Heights remained one of the most heterogeneous neighborhoods in the city for decades and it was a center of Jewish, Mexican and Japanese immigrant life in the early 20th century, and also hosted large Yugoslav and Russian populations.” Wikipedia, 6/17/2011

  11. Mapping Inequality: http://mappinginequality.net
     U. Richmond, Virginia Tech, Johns Hopkins, U. Maryland

  12. A. Historical Lab Notebooks
     Paper-based Lab Notebooks:
     • Used in science research
     • Represent a record of: observations, experiments, ideas, notes, formulas, data
     Electronic Lab Notebooks:
     • patient medical records

  13. Learning Goals
     • Archival Practices
     • Computational Thinking Practices
     • Ethics and Values Considerations

  14. B. Computational Framework for Library & Archival Education
     https://dcicblog.umd.edu/ComputationalFrameworkForArchivalEducation/
     April 4 / 5, 2019
     Motivation for Introducing Computational Thinking into Library and Archival Studies Curriculum:
     • A basic understanding of the characteristics of digital materials is important for future librarians & archivists.
     • Archival collections are increasingly composed of digital materials.
     • The tools and practices associated with archival activities are increasingly dependent on computing.
     • The way users interact with archival collections reflects the increasingly computationally-mediated nature of our world.
     • For today’s learners to succeed in future archival tasks, it is essential that computational thinking is included as part of their training.
     Approach:
     • Develop computational-thinking-enhanced lesson plans for archival topics that could be used by iSchool faculty to introduce computational thinking into their courses.
     • Build an international networked community of iSchool faculty and Library and Archives practitioners to engender these capabilities.


