
Filesystems and I/O Balance on the NERSC T3E, Tina Butler, NERSC



  1. National Energy Research Scientific Computing Center
     Filesystems and I/O Balance on the NERSC T3E
     Tina Butler, NERSC Systems Group
     This work was supported by the Director, Office of Advanced Scientific Computing Research, Division of Mathematical, Information, and Computational Sciences of the U.S. Department of Energy under contract number DE-AC03-76SF00098.

  2. What is NERSC?
     • National Energy Research Scientific Computing Center
       – Funded by DOE Office of Science
       – Located at Lawrence Berkeley National Lab
       – Provides computational resources to the following programs
         • Fusion Energy
         • High Energy and Nuclear Sciences
         • Basic Energy Sciences
         • Biology and Environmental Research
         • Computational and Environmental Research
       – Approximately 2500 users from major universities and government labs
       – Hardware: 696 PE T3E-900, 1 J90 SE system (32 CPUs) & 3 SV1 (64 processors)

  3. Mcurie - The NERSC T3E
     • T3E 900 with 696 PEs running UNICOS/MK 2.0.4.67
     • 644 APP PEs
     • 256 MB per PE
     • 22 Gigarings
     • 12 FCNs
     • 8 MPNs
     • 2 HPNs

  4. mcurie.nersc.gov
     [Figure: system configuration diagram. Cray T3E900 LC696-256; 174 GB/21.75 GW memory; 2.76 TB disk. Nodes shown: MPN0, MPN10, MPN11, MPN12, MPN22, MPN23, MPN24, MPN25; FCN01 to FCN06, FCN14 to FCN17, FCN20, FCN21; HPN07, HPN13. Each node is labeled with its disk count (8, 16, or 24 disks; 25 disks w/5 parity; 30 disks w/6 parity). Legend: Multipurpose Node, HIPPI Node, FibreChannel Node, Gigaring.]

  5. NERSC Job Mix - Application Mix
     • Applications from the fields of
       – Chemistry
       – Materials Science
       – Fusion Energy
       – Geophysics
       – Biology
       – High Energy Nuclear Physics
       – Climate Modeling
       – Astrophysics
       – Computational Fluid Dynamics
     • Mostly user-written codes

  6. NERSC Job Mix - Diverse and Dynamic

     App Size (PEs)    % of all Apps    % of PE Hours
     2 - 16            56               6
     17 - 64           38               56
     65 - 128          5                29
     129 - 512         1                9

     App Run Time      % of all Apps    % of PE Hours
     0 – 10 min        56               1
     10 – 30 min       23               10
     0.5 – 3.5 hr      17               49
     3.5 – 12.0 hr     4                40

     Mix of Development, Capacity and Capability computing
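The job-size table on this slide can be read as a resource-weight comparison: dividing each class's share of PE hours by its share of applications shows how much machine time a typical job in that class consumes relative to the average. The following is a minimal sketch of that arithmetic using only the percentages from the slide; the names `size_mix` and `relative_weight` are illustrative and do not come from the presentation.

```python
# Percentages copied from the "App Size" table on the slide:
# (size class in PEs, % of all apps, % of PE hours)
size_mix = [
    ("2-16", 56, 6),
    ("17-64", 38, 56),
    ("65-128", 5, 29),
    ("129-512", 1, 9),
]


def relative_weight(rows):
    """Return PE-hour share divided by application share for each class.

    A value above 1 means jobs in that class use more than an average
    job's share of the machine; below 1 means less.
    """
    return {label: pe_hours / apps for label, apps, pe_hours in rows}


weights = relative_weight(size_mix)
for label, weight in weights.items():
    print(f"{label:>8} PEs: {weight:.2f}x the average job's PE-hour share")
```

Read this way, the 1% of applications in the largest size class account for nine times their proportional share of PE hours, while the 56% in the smallest class account for roughly a tenth of theirs.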

  7. Mcurie Filesystems - performance
     • 68 Fibre Channel disk arrays
     • Striping of swap and checkpoint
     • pcache for metadata optimization on root, usr, opt
     • Primary/secondary partitions
     • Remote mount file servers

  8. Mcurie Filesystems - resiliency
     • Mirroring of primary partitions for homes and usrtmp
     • Alternate path for all arrays
     • Sized for feasible dump/restore

  9. Mcurie Filesystems - swap and checkpoint
     • NERSC uses both checkpointing and gang scheduling for system scheduling
     • Swap - 383 Gigabytes - 2.4 times APP memory
     • Checkpoint - 582 Gigabytes - 3.6 times APP memory
     • Filesystems have 5 logical partitions that are 5 or 6-way striped on FCN disk
     • 800 MB/sec observed on checkpoint
     • Full machine checkpoint regularly under 5 minutes
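The sizing figures on this slide can be cross-checked against the machine description (644 APP PEs at 256 MB each). The sketch below is a back-of-the-envelope calculation, not a description of how NERSC derived its numbers. It assumes binary units (1 GB = 1024 MB) and that a full-machine checkpoint writes all APP memory at the observed 800 MB/sec rate, ignoring any overhead.

```python
# Figures taken from the slides; unit convention is an assumption.
MB_PER_GB = 1024          # assumption: binary gigabytes

app_pes = 644             # application PEs
mb_per_pe = 256           # memory per PE, in MB
swap_gb = 383             # swap filesystem size
checkpoint_gb = 582       # checkpoint filesystem size
checkpoint_rate_mb_s = 800  # observed checkpoint bandwidth

# Total memory across application PEs.
app_memory_gb = app_pes * mb_per_pe / MB_PER_GB

# Filesystem sizes as multiples of APP memory.
swap_ratio = swap_gb / app_memory_gb
checkpoint_ratio = checkpoint_gb / app_memory_gb

# Idealized time to write all APP memory at the observed rate.
full_dump_minutes = app_memory_gb * MB_PER_GB / checkpoint_rate_mb_s / 60

print(f"APP memory:       {app_memory_gb:.0f} GB")
print(f"Swap ratio:       {swap_ratio:.1f}x")
print(f"Checkpoint ratio: {checkpoint_ratio:.1f}x")
print(f"Full dump time:   {full_dump_minutes:.1f} min")
```

Under these assumptions the ratios round to the 2.4 and 3.6 quoted on the slide, and the idealized dump time lands below the five-minute figure.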

  10. Mcurie Filesystems - homes
     • Multiple filesystems to distribute user load and risk
     • Configured for full mirroring
     • Six filesystems - 25 GB on MPN disks
     • Approximately 150 users per filesystem

  11. Mcurie Filesystems - homes
     [Figure: "File distribution on mcurie homes". 3-D chart of Number of files (log scale, 1 to 1,000,000) against File size (bytes), binned at 0, 8192, 32768, 131072, 524288, 2097152, 8388608, 33554432, 134217728, and 536870912, for each Filesystem (/u1 to /u6, and Total).]

  12. Mcurie Filesystems - /usr/tmp
     • Main area for user data files
     • 1.5 TB of FCN disk arrays
     • Primary/secondary partition configuration to allow mirroring of metadata

  13. Mcurie filesystems - space management
     • Hard quotas on user-writable filesystems
     • Home filesystems - 4 GB and 3500 inodes
     • /usr/tmp filesystem - 70 GB and 6000 inodes
     • Homes migrated to HPSS under Cray DMF control
     • /usr/tmp - purging of files inactive for 14 days
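The 14-day purge policy for /usr/tmp can be illustrated with a short sketch. This is not NERSC's purge tool, whose implementation the slides do not describe. It assumes "inactive" means that neither the access time nor the modification time falls within the last 14 days, and it only lists candidates rather than deleting anything. The function name `purge_candidates` and its parameters are hypothetical.

```python
import os
import time

PURGE_AGE_DAYS = 14  # policy value from the slide
SECONDS_PER_DAY = 86400


def purge_candidates(root, now=None, age_days=PURGE_AGE_DAYS):
    """Yield regular files under `root` that look inactive.

    Assumption: a file is inactive when both its last access time and
    its last modification time are older than `age_days`.
    """
    now = time.time() if now is None else now
    cutoff = now - age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if max(st.st_atime, st.st_mtime) < cutoff:
                yield path


# Example use (path is illustrative):
#     for path in purge_candidates("/usr/tmp"):
#         print(path)
```

Listing candidates separately from removing them makes it possible to review or log what a purge would touch before any file is deleted.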

  14. Mcurie Filesystems - homes
     [Figure: "mcurie home IO volume - combined". Y-axis: Megabytes (0 to 600,000). X-axis: Days (1 to 281). Series: u1, u2, u3, u4, u5, u6.]

  15. Mcurie Filesystems - homes
     [Figure: "mcurie home filesystems IO volume". Y-axis: Megabytes (0 to 800,000). X-axis: Days (1 to 280). Series: Read, Write.]

  16. Mcurie Filesystems - home /u4
     [Figure: "Average Daily Transfer Rate". Y-axis: 4K Blocks/sec (0 to 250). X-axis: 11/98 to 08/99. Series: Avg Write, Avg Read.]

  17. Mcurie filesystems - /usr/tmp
     [Figure: "mcurie /usr/tmp IO Volume". Y-axis: Megabytes (0 to 4,500,000). X-axis: Days (1 to 280). Series: Read, Write.]

  18. Mcurie Filesystems - DMF traffic
     [Figure: "mcurie DMF Monthly Volume - FY99". Y-axis: Megabytes (0 to 25,000). X-axis: Month, monthly totals from 199810 to 199909. Series: Puts, Gets.]

  19. Mcurie Filesystems - DMF traffic
     [Figure: "mcurie DMF Monthly Puts and Gets - FY99". Y-axis: Number of accesses (log scale, 1 to 10,000). X-axis: Month, monthly totals from 199810 to 199909. Series: Puts, Gets.]

  20. Mcurie Filesystems - HPSS traffic
     [Figure: "HPSS-mcurie Data Volume". Y-axis: Megabytes (0 to 70,000). X-axis: Date. Series: Puts - MB, Gets - MB.]

  21. Mcurie Filesystems - Conclusions
     • User home filesystems are well balanced in file distribution and transfer load
     • Data migration is a relief valve for homes, but not a critical resource yet
     • /usr/tmp filesystem buffers user intermediate data
     • HPSS is being used as a long-term archive resource for user data
     • NERSC's T3E storage resources are successful in supporting the growing utilization of the system
