SURFsara NOC Flash Talk
Erik Ruiter, Sr. Network Specialist, SURFsara
TF-NOC Meeting, Cambridge UK, 20-3-2014
Services
• National supercomputer: Cartesius (capability computing)
• National compute cluster: Lisa (capacity computing)
• Grid compute & storage: Gina (middleware services)
• HPC Cloud IaaS (Do-it-yourself)
• Hadoop – Data processing (map-reduce algorithm)
• GPU cluster (Computing on a video card)
• Collaboratorium (video wall)
• Remote Render cluster (Data visualization)
• Beehub / SURFDrive collaboration (Dropbox unlimited)
TF-NOC meeting Cambridge 2014 – NOC Flash Talk
Core Network: High-level overview
• Fully redundant topology
• Core routers: 2x Juniper MX960
• Internal firewall cluster: 2x Fortigate 311b + 5x Cisco 3750
• External firewall cluster: 2x Fortigate 3040 + 2x Juniper EX4550
Core Network: E-infra compute and storage network
SURFsara E-infra network infrastructure
• Connects the HPC environments in SURFsara
• High capacity (160 Gbps between QNodes)
• Used for east-west traffic
• Low latency
• High scalability (up to 786 x 10 Gbps)
• Easy scaling
• Single CLI management
• Based on Juniper QFabric
NOC Tools: Monitoring
Monitoring:
• Icinga
  • Currently monitoring 149 hosts, with 379 services
  • NConf is used for configuration (no manual text editing)
• Cacti
• Syslog daemon
  • Syslog-ng -> looking for a better solution (Logstash)
NOC Tools: NfSen
NetFlow is enabled on our 2 core routers.
NfSen is used for:
- Intrusion detection
  - Alert triggers fire when suspicious traffic patterns are detected (only a few rules in place at the moment)
- Traffic monitoring for our main uplinks:
  - SURFnet (10 Gbps)
  - LHCOPN (10 Gbps)
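The kind of rule behind an alert trigger can be sketched in a few lines of Python. This is only an illustration, not NfSen's own alert mechanism: the `Flow` record, the `port_scan_suspects` function and the `max_targets` threshold are hypothetical names, and the rule (one source touching many distinct destination/port pairs) is just one example of a "suspicious traffic pattern".

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Flow(NamedTuple):
    """Hypothetical, minimal flow record (not an NfSen data structure)."""
    src: str
    dst: str
    dst_port: int


def port_scan_suspects(flows: Iterable[Flow], max_targets: int = 100) -> Dict[str, int]:
    """Return sources that contacted more than max_targets distinct (dst, port) pairs."""
    targets = defaultdict(set)
    for flow in flows:
        targets[flow.src].add((flow.dst, flow.dst_port))
    # Keep only the sources whose target count exceeds the threshold.
    return {src: len(seen) for src, seen in targets.items() if len(seen) > max_targets}
```

In practice the threshold would need tuning against normal traffic on the uplinks; a value that is too low will flag legitimate services such as resolvers or monitoring hosts.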
NOC Tools: Management
Network management:
• All elements reachable over SSH via the management infrastructure
• All elements reachable through console port (on centralized console servers)
• All elements authenticated by TACACS+
• All elements have SNMP v2 read-only access (limited to a single IP address)
Configuration management:
• RANCID + SVNWEB
• Small in-house developed web interface to easily find configs
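The core of a config finder like the one mentioned above could look roughly like the following sketch. It is not the in-house tool itself: it assumes the saved configurations sit as plain text files, one per device, in a single directory, and `find_in_configs` is a hypothetical name.

```python
import re
from pathlib import Path
from typing import List, Tuple


def find_in_configs(config_dir: str, pattern: str) -> List[Tuple[str, int, str]]:
    """Search every file in config_dir and return (device, line number, line) per match."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(config_dir).iterdir()):
        if not path.is_file():
            continue  # skip version-control metadata and other subdirectories
        with path.open(errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits
```

A call such as `find_in_configs("/path/to/configs", r"vlan 42")` would list every device whose saved configuration mentions that VLAN; the path here is a placeholder.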
NOC Ticketing system
Ticketing system:
• Trac
  • Combined issue tracking and wiki system
  • Used for software development projects; can interface with SVN, Git, etc.
• All departments within SURFsara use Trac, each with its own Trac wiki and ticket queue
• Network access requests also have a separate Trac queue: a request is first validated by the Security team, then assigned to the NOC team
NOC Documentation
Documentation
• MediaWiki / Trac
  • Stores all project and operational information
  • Currently looking into moving all documentation from MediaWiki to Trac
  • Exporting is difficult (different markup language, Trac supports fewer functions)
• Racktables
  • Rack space
  • VLANs
  • IP space
  • Looking into storing cabling information
• Racktables custom added features:
  • Daily script that does reverse DNS lookups to determine IP subnet occupation
  • Daily script that reads IP information from routers (SNMP) to document ‘routed by’ information
  • Racktables API is used to create a lightweight IP / VLAN overview
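The reverse DNS occupation check could be sketched as follows. This is a minimal illustration, not the actual daily script: `reverse_lookup` and `subnet_occupation` are hypothetical names, the step that writes results into Racktables is omitted, and an address counts as occupied only if it has a PTR record.

```python
import ipaddress
import socket
from typing import Callable, Dict, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR name for ip, or None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def subnet_occupation(
    cidr: str,
    resolve: Callable[[str], Optional[str]] = reverse_lookup,
) -> Dict[str, str]:
    """Map every host address in cidr that has a reverse DNS name to that name."""
    occupied = {}
    for host in ipaddress.ip_network(cidr).hosts():
        name = resolve(str(host))
        if name is not None:
            occupied[str(host)] = name
    return occupied
```

The `resolve` parameter is there so the lookup can be replaced in tests or pointed at a specific resolver. Hosts without a PTR record will be reported as free, so the result is only as accurate as the reverse zones.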
NOC Structure
• Small team
  • 5 network engineers
  • 1 team leader
• All engineers work on support, implementation and innovation projects
• Rotating NOC duty days (once per week)
  • Answering mail
  • Small operational requests
  • Handling incidents
• Support during working hours only
NOC Frontend / Communication
• Customers
  • Mainly internal -> system administrators of HPC systems
  • No real SLAs; however, providing redundant connectivity is getting more attention
• Internal communications are mainly done through email and Trac tickets
• For external communications we have a mailing list: noc@surfsara.nl
• There is no helpdesk phone number
Erik Ruiter
Erik.Ruiter@surfsara.nl
www.surfsara.nl