Large-scale citation analysis -Academic Landscape-
J. Mori, Y. Kajikawa, and I. Sakata
Innovation Policy Research Center, The University of Tokyo
Innovation Policy Research Center
● Established in 2008
● Part of Graduate School of Engineering
● Mission
  ● Analyzing data such as scientific papers, patents, Web, and governmental data using text mining and network analysis
  → Evidence-based policy making
    – Science and Technology policy
    – Innovation policy
Our citation analysis
● Goal
  ● Overview of a research field – “Academic Landscape”
    ● Energy
    ● Environment
    ● Aging ....
  ● Detection of emerging research fields → science and technology road maps
Our Citation Analysis
● Method overview
  Input “Query” which represents a particular field of interest → Citation network → Clustering → Visualization → Academic Landscape
Our Citation Analysis
● Method
  ● Clustering
    – Agglomerative hierarchical clustering algorithm
    ● Modularity Q as the quality of a division
      – A good division has dense internal connections between the nodes within modules but only sparse connections between different modules.
  ● Visualization
    – Large graph layout algorithm
    ● Spring model-based layout: each edge is considered to be a spring, and the node positions are chosen to minimize the global energy of the spring system.
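The clustering step above can be illustrated with a minimal sketch. It assumes an undirected, unweighted network and the standard definition Q = Σ_c (e_cc − a_c²), where e_cc is the fraction of edges inside cluster c and a_c is the fraction of edge ends attached to it. The function names and the toy graph are hypothetical, and the naive merge loop is only meant to show the idea of agglomeration guided by Q; it is not the implementation used for the landscapes in these slides and would be far too slow for networks of that size.

```python
from itertools import combinations


def modularity(edges, communities):
    """Modularity Q = sum_c (e_cc - a_c**2) of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: i for i, group in enumerate(communities) for node in group}
    inside = [0] * len(communities)  # edges with both ends in community i
    ends = [0] * len(communities)    # edge ends (degree sum) in community i
    for u, v in edges:
        ends[label[u]] += 1
        ends[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(i / m - (e / (2 * m)) ** 2 for i, e in zip(inside, ends))


def greedy_agglomerate(edges):
    """Start from single-node clusters and repeatedly apply the merge that
    raises Q the most; stop when no merge improves Q (naive, for illustration)."""
    nodes = sorted({n for edge in edges for n in edge})
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda t: t[0])
        if q <= best_q:
            break  # every remaining merge would lower the quality of the division
        best_q, communities = q, merged
    return communities, best_q


# Hypothetical toy network: two triangles joined by a single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters, q_max = greedy_agglomerate(EDGES)
print(sorted(sorted(c) for c in clusters), round(q_max, 3))
# → [[0, 1, 2], [3, 4, 5]] 0.357
```

The two triangles are recovered as separate clusters because each is densely connected inside and linked to the other by only one edge, which is the property the modularity bullet describes.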
Academic Landscape of Sustainability Science
Sustainability
From 29,391 papers (1970-2006, connected component = 9,973 papers)
Legend: # rank, cluster name; cluster size, average years after publication; keywords in the cluster; country focusing on the cluster
● #1 Agriculture: 1,584 papers, 7.1 years
● #2 Fisheries: 1,419 papers, 5.5 years
● #3 Ecological Economics: 1,135 papers, 5.5 years
● #4 Forestry (Agroforestry): 614 papers, 6.3 years
● #5 Forestry (Tropical Rain Forest): 450 papers, 6.5 years
● #6 Business: 450 papers, 5.5 years
● #7 Tourism: 423 papers, 6.5 years
● #8 Water: 361 papers, 5.5 years
● #9 Forestry (Biodiversity): 353 papers, 5.4 years
● #10 Urban Planning: 277 papers, 5.9 years
● #11 Rural Sociology: 271 papers, 6.6 years
● #12 Energy: 229 papers, 4.9 years
● #13 Health: 221 papers, 5.8 years
● #14 Soil: 208 papers, 5.5 years
● #15 Wild Life: 161 papers, 5.9 years
Academic Landscape of Solar Cell Research
Energy
From 16,199 papers (1959-2006, connected component = 13,682 papers)
Legend: # rank, cluster name; cluster size, average publication year
● #1 Silicon: 4,634 papers, 1995
  – #1.1 a-Si: 1,497 papers, 1997
  – #1.2 High-efficiency cells: 1,149 papers, 1997
  – #1.3 Modeling: 1,003 papers, 1985
  – #1.4 Polycrystalline: 1,497 papers, 1997
  – #1.5 Limitation and modification of efficiency: 369 papers, 2000
● #2 Compounds: 3,481 papers, 1998
  – #2.1 Cu(In,Ga)Se2: 888 papers, 2001
  – #2.2 CdS/CdTe: 873 papers, 1998
  – #2.3 Irradiation effects: 798 papers, 1993
  – #2.4 CuInS2: 316 papers, 2000
  – #2.5 Textured ZnO: 260 papers, 1999
● #3 Dye-sensitized: 2,267 papers, 2003
  – #3.1 Photosensitizer: 737 papers, 2002
  – #3.2 Electrolyte: 715 papers, 2004
  – #3.3 Modeling: 498 papers, 2003
  – #3.4 Fabrication: 205 papers, 2003
● #4 Organics: 1,390 papers, 2002
  – #4.1 Plastic solar cell: 448 papers, 2004
  – #4.2 Heterojunction: 373 papers, 2002
  – #4.3 Cyanine: 328 papers, 1997
  – #4.4 Conjugated polymer: 120 papers, 2004
Academic Landscape of Energy
Energy
From 152,514 papers (1970-2005, connected component = 53,033 papers)
Each entry: cluster size, average years after publication
● #1 Combustion: 12,128 papers, 9.3 years
● #2 Coal: 11,904 papers, 10.9 years
● #3 Battery: 8,123 papers, 7.1 years
● #4 Petroleum: 5,017 papers, 10.5 years
● #5 Fuel cell: 1,704 papers, 2.9 years
● #6 Wastewater: 1,619 papers, 5.7 years
● #7 Heat pump: 1,413 papers, 7.7 years
● #8 Engine: 1,204 papers, 6 years
● #9 Solar cell: 1,131 papers, 4.4 years
● #10 Power system: 813 papers, 8.3 years
Academic Landscape of Nanorisk research
Safe & Ease
Each entry: cluster size, average publication year
● #1 Nano risk (general): 1,617 papers, 2005.4
● #2 Drug delivery system (DDS): 1,412 papers, 2003.1
● #3 Dye-sensitized solar cells: 532 papers, 2003.3
● #4 Carbon nanotube as a sensing material: 362 papers, 2004.4
Sub-clusters listed in the figure:
  – #1 atmospheric nanoparticles: 511 papers, 2004.2
  – #2 nanoparticles used in imaging: 458 papers, 2005.4
  – #3 toxicity of manufactured nanomaterials: 363 papers, 2006.3
  – #4 carbon nanotube: 205 papers, 2006.3
  – #5 field work about atmospheric nanoparticles: 34 papers, 2005.1
  – #6 antibiotic nature of Ag nanoparticles: 22 papers, 2006.5
  – #7 engineering ethics and policy about nano: 18 papers, 2006.3
Figure annotation: No sub-cluster
Academic Landscape of Gerontology
Aging society
From 69,403 papers (1956-2008, connected component = 25,625 papers)
Legend: # rank, cluster name; cluster size, average publication year
3 main clusters
● #1 Functional disability: 5,468 papers; 1998.8
● #2 Emotion & social network: 4,966 papers; 1996.8
● #3 Nursing & care: 4,961 papers; 1995.8
● #4 Cognitive function: 3,254 papers; 1996.7
● #5 Effects of living environment: 1,305 papers; 1984.6
● #6 Geriatrics: 962 papers; 1991.3
● #7 Aging mechanism: 585 papers; 1994.9
● #8 Depression: 547 papers; 1996.9
Ex.1 Comparative study of link creating methods
Citation networks created by co-citation or bibliographic coupling are more random than those created by direct citation, which means that the similarity of papers within each cluster after clustering becomes low.
[Figure: Qmax (0 to 1) versus year (1990-2004) in three panels: (a) GaN, (b) CNW, (c) CNT. Markers: ● direct citation, △ co-citation, ■ bibliographic coupling.]
Shibata, N., Kajikawa, Y., Takeda, Y., & Matsushima, K. JASIST 2009.
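The three link creating methods compared above can be sketched as follows. This is a minimal illustration using the usual definitions: a direct citation links a paper to a paper it cites, co-citation links two papers cited together by the same paper, and bibliographic coupling links two papers that share at least one reference. The function names and the four-paper example are hypothetical and are not taken from the cited study.

```python
from itertools import combinations


def direct_links(cites):
    """Link each paper to every paper it cites."""
    return {frozenset((paper, ref)) for paper, refs in cites.items() for ref in refs}


def co_citation_links(cites):
    """Link two papers when some third paper cites both of them."""
    return {
        frozenset(pair)
        for refs in cites.values()
        for pair in combinations(sorted(set(refs)), 2)
    }


def bibliographic_coupling_links(cites):
    """Link two papers when their reference lists overlap."""
    return {
        frozenset((p, q))
        for p, q in combinations(sorted(cites), 2)
        if set(cites[p]) & set(cites[q])
    }


# Hypothetical corpus: paper -> list of papers it cites.
CITES = {"A": ["C", "D"], "B": ["C", "D"], "C": ["D"], "D": []}

for name, build in [
    ("direct", direct_links),
    ("co-citation", co_citation_links),
    ("bibliographic coupling", bibliographic_coupling_links),
]:
    print(name, sorted("-".join(sorted(link)) for link in build(CITES)))
```

On this toy corpus the same four papers give three different networks, so the clusters found afterwards can differ depending on which link type is chosen.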
Ex.2 Investigations on clustering quality
• Corpus size < 100 → small Qmax = network is nearly random
• A corpus size of about 100 is necessary to assure the quality of clustering
→ Minimum corpus size (~100) assuring the clustering quality
[Figure: Qmax (0 to 1) versus # of papers (log scale, 1 to 1E+06) for the corpora cvd, drug_ingo, energy, fuel_cell, nanobio, solar_cell, nano, sustainab. Annotation: Clustering quality is high.]
Takeda Y. & Kajikawa, Y. Scientometrics, in press.