Network design and data collection in an academic & research network Glen Turner 2010-02-08 10th workshop on data networks as formal objects (WODNAFO 10) The University of Adelaide AARNet Australia's Academic and Research Network
Our network's design goals, 1 ● Fast in the real world – Large number of interactive users – Small number of massive data transfers ● Available in the real world – 99.999% is a five minute outage per year. If that five minutes falls at the wrong time, we have failed – The definition of 'available' quickly becomes difficult
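The "five minute outage per year" figure follows directly from the arithmetic. A minimal sketch, assuming an average year of 365.25 days; the function name is illustrative and not from the talk:

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumption: average year, including leap days


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of outage per year permitted by an availability target in 0..1."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


# Compare "three nines", "four nines" and "five nines".
for target in (0.999, 0.9999, 0.99999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target:.3%} -> {minutes:.1f} minutes/year")
```

For 99.999% this gives a little over five minutes. The number says nothing about when those minutes fall, which is the slide's point.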
Our network's design goals, 2 ● Advanced – We want to discourage the waste of “test bed” networks which use commercially-available equipment – We want to encourage research networking equipment – IPv6, multicast, QoS, dynamic light paths ● Enabling – To campus networks – To related activities: authentication, video conferencing, storage – Increasingly to primary and secondary schools
AARNet is not a “typical” ISP ● Small team of experts – Historically 5, about 40 people now ● Bias to insourcing – If it is understood well enough to be in an outsourcing contract, then it isn't research ● Use our customers' rather odd combination of strengths – Large, accepting of risk, smart, cooperative, focus on the long-run result ● Transparent – Public outage notices, investment plans, etc
AARNet is not a typical NREN ● Funded from revenue, not from grant ● Commodity traffic for all researchers, teachers and students – not research into networks – was the reason for our creation ● Founded by the universities, not the higher education or industrial development bureaucracy
Net assets we own, sell and buy ● Rights of way and spectrum ● Ducts ● Fibres and radios ● Cores and transmitter channels ● WDM channels ● SDH circuits ● MPLS paths ● VLANs ● IP network
Example: inter-campus link ● Project management services for a fiber build totally paid for and owned by the customer ● Duct and hut access, customer owns fiber ● Customer leases fiber pair and hut access ● Customer leases WDM channel ● Managed ethernet service, aka MPLS or VLAN ● IP access, customer uses IPSec tunnel Typically, design modelling tools are confined to one of these alternatives Customers often make a poor initial choice
Goal: fast ● High bitrate, rather obviously ● Low latency is actually harder ● Low errors – Good engineering practices – Some alternatives are poor: wireless and undersea cable ● Big packets – Complicates provisioning: one more thing to agree on ● Traffic engineering
Fast → Security complications ● Fast for script kiddies too – A worst-effort class and classification of suspicious traffic – Avoid NOC overhead by using customer-facing BGP to signal miscreant traffic ● Fast for the receiver – SYN flooding
Fast → Customer engineering ● Worst-performing link sets performance for entire path ● Customers buy routers based on “sticker speeds” ● Customers' technicians often have poor practices – No fiber cleaning, no inspection – No power budget ● Customers don't measure their networks – Eg: Ethernet late collisions
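The "power budget" mentioned above is a short calculation that is easy to skip. A minimal sketch of an optical link margin check; every number below is a hypothetical placeholder, and real values should come from the transceiver and fiber datasheets:

```python
def link_margin_db(
    tx_power_dbm: float,
    rx_sensitivity_dbm: float,
    fiber_km: float,
    loss_db_per_km: float,
    connectors: int,
    connector_loss_db: float,
    splices: int,
    splice_loss_db: float,
) -> float:
    """Return the margin in dB left after subtracting path losses from the budget."""
    # Budget: how much loss the transmitter/receiver pair can tolerate.
    budget_db = tx_power_dbm - rx_sensitivity_dbm
    # Losses: fiber attenuation plus every connector and splice on the path.
    loss_db = (
        fiber_km * loss_db_per_km
        + connectors * connector_loss_db
        + splices * splice_loss_db
    )
    return budget_db - loss_db


# Hypothetical 40 km link; all figures are illustrative only.
margin = link_margin_db(
    tx_power_dbm=-3.0,
    rx_sensitivity_dbm=-20.0,
    fiber_km=40.0,
    loss_db_per_km=0.25,
    connectors=4,
    connector_loss_db=0.5,
    splices=8,
    splice_loss_db=0.1,
)
print(f"margin: {margin:.1f} dB")
```

A small or negative margin suggests the link may run with errors or not come up at all. Dirty connectors add loss, which is why fiber cleaning and inspection matter.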
Fast → Computer security politics ● Too fast for a lot of corporate firewalls – The very idea of “one firewall to rule all traffic” sounds wrong to network engineers, but it sounds right to security policy makers ● Difficult to change security policy until a project is at risk of failing ● Too much computer security focus is at the network layer, since this is easy to solve, and too little in applications – Result is applications that cannot be exposed to the Internet, and thus VPNs as a substitute for application authentication
Goal: Availability, 1 ● Diversity – Of path ● “As built” maps of fiber paths, ours and of all leased fiber and capacity ● Finding diversity can be difficult, especially at some pits and building ingress and risers – Of routing ● Ladder network topology – Of management ● Change management
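Given "as built" records, checking two paths for diversity reduces to looking for physical elements they share. A minimal sketch, assuming each path is recorded as a list of element identifiers; the pit, duct and riser names are hypothetical:

```python
def shared_risks(path_a: list[str], path_b: list[str]) -> list[str]:
    """Return the physical elements (pits, ducts, risers) that both paths use."""
    return sorted(set(path_a) & set(path_b))


# Hypothetical as-built records for two supposedly diverse paths.
primary = ["riser-bldg1", "pit-17", "duct-north", "pit-42", "riser-bldg9"]
backup = ["riser-bldg1", "pit-18", "duct-south", "pit-43", "riser-bldg9"]

common = shared_risks(primary, backup)
if common:
    print("not diverse; shared elements:", ", ".join(common))
else:
    print("no shared elements recorded")
```

In this made-up example the two paths differ in the street but share both building risers, which is the kind of ingress problem the slide describes. The check is only as good as the records, including those for leased fiber and capacity.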
Goal: Availability, 2 ● Configuration control is key – Misconfiguration avoidance ● Test facilities ● Strong preference for routers with virtualisation ● Approval of change – Misconfiguration recovery ● Automated source code configuration control – Source repository – Who, when, where, what logging ● Approval of change time – Database-driven networks are the aim ● That's my hope for algebra ● Vendors hate it
Goal: Availability, 3 ● Monitoring of outages and hazards – Real-time monitoring ● Out-of-band event notification ● Out-of-band notification of traffic flow changes – Notification of changes ● aka, “wtf just happened?” ● Planned outages and hazards calendar ● Fault ticket opened for all changes, closed when acceptance testing OK ● Capacity planning – And failure scenarios
Goal: Availability, 4 ● Fault handling – Clear responsibility, AARNet's approach ● Changes: You broke it, you tell the NOC, you fix it, you are honest about the cause ● Unexpected faults are dealt with or brokered by on-call person ● If the on-call person gives you the fault, it's yours to keep ● Proactive assistance from more experienced engineering staff – Communication with customers ● Their essential processes will have manual measures
Availability assists more availability ● If you have diverse paths, you can move traffic around to avoid planned maintenance ● If the customer can select the path, they can increase their availability too
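The gain from diverse paths can be estimated with the standard parallel-availability formula. A minimal sketch, assuming the paths fail independently, which holds only if they are genuinely diverse; the per-path figures are hypothetical:

```python
import math


def combined_availability(path_availabilities: list[float]) -> float:
    """Availability of a service that is up while at least one path is up.

    Assumption: path failures are statistically independent.
    """
    # The service is down only when every path is down at the same time.
    all_down = math.prod(1.0 - a for a in path_availabilities)
    return 1.0 - all_down


# Two hypothetical paths, each available 99.9% of the time.
print(f"{combined_availability([0.999, 0.999]):.6f}")
```

Any shared pit, riser or piece of equipment breaks the independence assumption and pulls the real figure back toward that of a single path.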
Avail. → Dynamic interior routing ● Often isn't – Hard-coded interior default route ● Firewall failover and OSPF don't cooperate ● Significant cost educating customer engineering staff – In the ways of BGP – In OSPF – In network engineering, generally
Goal: advanced ● Compatible equipment – IPv6 ● Moving goalposts – Inter-domain multicast, we're now at the 4th architecture ● Customer engineering – Why do I need that at all? – Researcher wants the feature, central networks doesn't want the risk – Light paths and OpenFlow switches are responses to this problem
Advanced → … ● “Advanced” means there is no commercial training available ● “Advanced” means we don't know the best practice ● “Advanced” usually means doing it twice – A Linux box running research software ● Don't care if paths are long, non-diverse, etc – Supported code on commercial router ● Want production quality ● “Advanced” means no monitoring solution
Advanced → Grant funding ● If we support a feature, then there is no grant funding for the researcher to use that feature – It's now “infrastructure”, which is to be totally funded by the university's operational budget – But IT at a university isn't going to fund an experimental activity ● But if the researcher builds a testbed VPN over our network… – So we have a product to solve that political funding problem
Enabling ● The network designer and operator can help solve other issues ● Federated authentication ● Experience exchange, aka conferences ● Purchasing – If you are purchasing switches by the pallet-load, we can't do better. But we can stop you paying retail price for a single unit – Allows “best of breed” purchasing ● Eg: switches from Vendor A, firewall from Vendor B
Enabling → Consortia ● In these areas the network is not the only important entity, nor does it have the depth of experience to have the answers ● Bring to the table – An operational environment ● Hosting, configuration control, monitoring, restoration – Economies of scale ● One more highly-available Linux service costs about $450 pa – Project management expertise
Enabling → Demo–production gap ● Takes about one elapsed year and three staff years to – Describe – Document – Training materials – Policy negotiation – Configuration control – Monitoring – Capacity planning ● Usually research funding ends at the demo :-(
A word about billing ● Billing can get very complex very quickly – The highest cost of a national phone call is the billing ● Simple billing is attractive ● Complexity from – Users, particularly those where charges are much greater than costs – Competition, particularly those aiming at a subset of your customers – Changes in costs, such as the cost of submarine capacity. Introduces a temporal element
Billing and data collection ● Accounting systems usually duplicate monitoring systems – Reconcile charges from others – Contestable charges for customers ● Accounting system then loads into invoicing system – Invoicing system usually holds some low-level aggregate, such as hourly traffic quantities – Invoicing system does heavy lifting for charging plans
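The "hourly traffic quantities" held by the invoicing system can be pictured as a simple roll-up of accounting samples. A minimal sketch, assuming samples arrive as (customer, timestamp, octets) tuples; the record layout and customer names are hypothetical, not AARNet's actual schema:

```python
from collections import defaultdict
from datetime import datetime


def hourly_totals(samples):
    """Sum octets per (customer, hour) from (customer, timestamp, octets) samples."""
    totals = defaultdict(int)
    for customer, when, octets in samples:
        # Truncate the timestamp to the start of its hour.
        hour = when.replace(minute=0, second=0, microsecond=0)
        totals[(customer, hour)] += octets
    return dict(totals)


# Hypothetical samples for two customers.
samples = [
    ("uni-a", datetime(2010, 2, 8, 9, 5), 1_000),
    ("uni-a", datetime(2010, 2, 8, 9, 55), 2_000),
    ("uni-a", datetime(2010, 2, 8, 10, 1), 4_000),
    ("uni-b", datetime(2010, 2, 8, 9, 30), 500),
]

for (customer, hour), octets in sorted(hourly_totals(samples).items()):
    print(customer, hour.isoformat(), octets)
```

Keeping the aggregate this coarse leaves the charging-plan logic in the invoicing system, while finer-grained records stay in the accounting system for reconciling and contesting charges.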
It's not what your network can do… ● We want to give you access to our data, on the same terms that medical researchers get access to medical histories or social scientists get access to Bureau of Statistics unit records ● But the Telecommunications Act 1997 (Cth) prevents this ● There may be sneaky ways around that ● But what we really need is – E-mails from you explaining what research this lack of access prevents – You raising this with Federal politicians and staff
Network design and data collection in an NREN www.gdt.id.au/~gdt/presentations/2010-02-08-wodnafo-aarnet/ Glen Turner Network engineer AARNet Australia's Academic and Research Network