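Summer Course Technion, Haifa, IL 2015

Implementing Longest Prefix Match

Entry   Destination            Port
1       Cambridge              1      (most specific)
2       Oxford                 2
3       Europe                 3
4       Asia                   4
5       Everywhere (default)   5      (least specific)

The table is searched from the most specific entry to the least specific one, and the first matching entry is used (in the slide's figure the search stops with "FOUND" at entry 3, Europe).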
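The lookup order in the table above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide uses place names rather than prefixes, so the prefixes in `TABLE` and the function name `longest_prefix_match` are hypothetical, and a real router does this in hardware rather than by scanning a sorted list.

```python
import ipaddress

def longest_prefix_match(table, destination):
    """Return the port of the most specific entry that contains destination."""
    addr = ipaddress.ip_address(destination)
    # Search from the most specific (longest) prefix to the least specific.
    for network, port in sorted(table, key=lambda e: e[0].prefixlen, reverse=True):
        if addr in network:
            return port
    return None  # only reached if the table has no default entry

# Hypothetical prefixes standing in for Cambridge, Oxford, Europe and the default.
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), 5),    # everywhere (default)
    (ipaddress.ip_network("10.0.0.0/8"), 3),   # least specific non-default entry
    (ipaddress.ip_network("10.1.0.0/16"), 2),
    (ipaddress.ip_network("10.1.2.0/24"), 1),  # most specific entry
]

if __name__ == "__main__":
    for dst in ("10.1.2.7", "10.1.9.9", "10.200.0.1", "192.0.2.1"):
        print(dst, "-> port", longest_prefix_match(TABLE, dst))
```

An address that falls inside several entries is always sent to the port of the longest matching prefix; the default entry only wins when nothing else matches.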
Basic components of an IP router
• Software (control plane): management & CLI, routing protocols, routing table
• Hardware (data plane, per-packet processing): forwarding table, switching, queuing

IP router components in NetFPGA
• Software: either SCONE, or Linux with Router Kit; each provides management & CLI, routing protocols and the routing table
• Hardware:
– Output Port Lookup (forwarding table)
– Input Arbiter (switching)
– Output Queues (queuing)
Section III: Example I

Operational IPv4 router
• Software (control plane): Java GUI and SCONE, providing management & CLI, routing protocols and the routing table
• Hardware (data plane, per-packet processing): reference router, providing forwarding table, switching and queuing
Streaming video
[Figure: a network of PCs, each containing a NetFPGA running the reference router. Video is streamed from the video server to the video client over the shortest path.]
Observing the routing tables
Columns:
• Subnet address
• Subnet mask
• Next hop IP
• Output ports

Example 1
Review
NetFPGA as IPv4 router:
• Reference hardware + SCONE software
• Routing protocol discovers topology
Demo:
• Ring topology
• Traffic flows over shortest path
• Broken link: traffic is automatically routed around the failure
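The demo behaviour reviewed above (shortest path on a ring, then rerouting after a link breaks) can be illustrated with a small sketch. It uses a plain hop-count breadth-first search as a stand-in; it is not the routing protocol that SCONE runs, and the node names and the `shortest_path` helper are hypothetical.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Return a minimum-hop path from src to dst, or None if unreachable."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    previous = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical five-router ring: A - B - C - D - E - A
RING = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]

if __name__ == "__main__":
    print(shortest_path(RING, "A", "C"))                      # short way round
    broken = [link for link in RING if link != ("A", "B")]    # link A-B fails
    print(shortest_path(broken, "A", "C"))                    # long way round
```

With all links up, traffic from A to C takes the two-hop side of the ring; once the A-B link is removed, the only remaining path goes the other way round.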
Section IV: Life of a Packet
Reference Switch Pipeline
• Five stages
– Input port
– Input arbitration
– Forwarding decision and packet modification
– Output queuing
– Output port
• Packet-based module interface
• Pluggable design
[Figure: four 10GE RxQ blocks and DMA feed the Input Arbiter, followed by Output Port Lookup and Output Queues, which feed four 10GE Tx blocks and DMA.]
Full System Components
[Figure: host software (network interfaces nf0 to nf3, ioctl) connects over the PCIe bus and AXI Lite to the NetFPGA. CPU RxQ/TxQ blocks and the 10GE ports, each with a MAC and Rx/Tx queues, attach to the user data path.]
Life of a Packet through the Hardware
[Figure: a host with MAC 00:0a:..:0X on Port 1 and a host with MAC 00:0a:..:0Y on Port 2.]

10GE Rx Queue
The Rx queue delivers each packet on two buses:
• TUSER: length, source port, destination port, user-defined fields
• TDATA: Ethernet header (Dst MAC, Src MAC) followed by the payload
Input Arbiter
[Figure: packets from Rx 0, Rx 1, … Rx 4 enter the Input Arbiter.]

Output Port Lookup
Output Port Lookup
1. Parse header: Src MAC, Dst MAC, Src port
2. Lookup next hop MAC & output port
3. Learn Src MAC & Src port
4. Update output port in TUSER
[Figure: TUSER carries length, Src port, Dst port and user-defined fields; TDATA carries the Ethernet header (Dst MAC = nextHop, Src MAC = port 4) and the payload.]
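The parse, lookup, learn and update steps listed above can be sketched in software for the simple learning-switch case. This is a behavioural model only, not the hardware module: the class name, the dictionary used as the MAC table, and flooding to all other ports on a lookup miss are assumptions made for illustration.

```python
class LearningSwitch:
    """Behavioural sketch of a MAC-learning output port lookup."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}  # learned mapping: MAC address -> port

    def process(self, src_mac, dst_mac, src_port):
        """Return the list of output ports for one parsed packet header."""
        # Step 2: look up the destination MAC.
        out_port = self.mac_table.get(dst_mac)
        # Step 3: learn which port the source MAC lives on.
        self.mac_table[src_mac] = src_port
        # Step 4: choose the output port(s); on a miss, flood (assumption).
        if out_port is None:
            return [p for p in range(self.num_ports) if p != src_port]
        return [out_port]

if __name__ == "__main__":
    sw = LearningSwitch(num_ports=4)
    x, y = "00:0a:00:00:00:01", "00:0a:00:00:00:02"  # hypothetical MACs
    print(sw.process(x, y, src_port=0))  # y unknown: flood
    print(sw.process(y, x, src_port=1))  # x was learned on port 0
    print(sw.process(x, y, src_port=0))  # y was learned on port 1
```

The first packet towards an unknown destination is flooded; replies teach the switch where each address lives, so later packets go out of a single port.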
Output Queues
[Figure: the Output Queues module with queues OQ0, OQ2 and OQ4.]

10GE Port Tx

MAC Tx Queue
[Figure: the Tx queue receives TUSER (length, Src port, Dst port, user-defined fields) and TDATA (Ethernet header with Dst MAC and Src MAC, then the payload) and hands the packet to the MAC.]
NetFPGA-Host Interaction
• Linux driver interfaces with hardware
– Packet interface via the standard Linux network stack
– Register reads/writes via the ioctl system call, with wrapper functions:
  rwaxi(int address, unsigned *data);
  e.g. rwaxi(0x7d4000000, &val);
NetFPGA-Host Interaction: NetFPGA to host packet transfer
1. Packet arrives; forwarding table sends it to the DMA queue
2. Interrupt notifies driver of packet arrival
3. Driver sets up and initiates DMA transfer
4. NetFPGA transfers packet via DMA (over the PCIe bus)
5. Interrupt signals completion of DMA
6. Driver passes packet to network stack
NetFPGA-Host Interaction: Host to NetFPGA packet transfers
1. Software sends packet via network sockets; packet is delivered to the driver
2. Driver sets up and initiates DMA transfer (over the PCIe bus)
3. Interrupt signals completion of DMA

NetFPGA-Host Interaction: Register access
1. Software makes ioctl call on network socket; ioctl is passed to the driver
2. Driver performs PCIe memory read/write
Section V: Infrastructure

Infrastructure
• Tree structure
• NetFPGA package contents
– Reusable Verilog modules
– Verification infrastructure
– Build infrastructure
– Utilities
– Software libraries

NetFPGA package contents
• Projects:
– HW: router, switch, NIC
– SW: router kit, SCONE
• Reusable Verilog modules
• Verification infrastructure:
– simulate designs (from AXI interface)
– run tests against hardware
– test data generation libraries (e.g. packets)
• Build infrastructure
• Utilities:
– register I/O
• Software libraries
Tree Structure (1)
NetFPGA-SUME
    projects            (including reference designs)
    contrib-projects    (contributed user projects)
    lib                 (custom and reference IP cores and software libraries)
    tools               (scripts for running simulations etc.)
    docs                (design documentation and user guides)
https://github.com/NetFPGA/NetFPGA-SUME-alpha

Tree Structure (2)
lib
    hw                  (hardware logic as IP cores)
        std             (reference cores)
        contrib         (contributed cores)
    sw                  (core-specific software drivers/libraries)
        std             (reference libraries)
        contrib         (contributed libraries)

Tree Structure (3)
projects/reference_switch
    bitfiles            (FPGA executables)
    hw                  (Vivado-based project)
        constraints     (contains user constraint files)
        create_ip       (contains files used to configure IP cores)
        hdl             (contains project-specific HDL code)
        tcl             (contains scripts used to run various tools)
    sw
        embedded        (contains code for MicroBlaze)
        host            (contains code for host communication etc.)
    test                (contains code for project verification)
Reusable logic (IP cores)

Category              IP Core(s)
I/O interfaces        Ethernet 10G Port, PCI Express, UART, GPIO
Output queues         BRAM based
Output port lookup    NIC, CAM based Learning switch
Memory interfaces     SRAM, DRAM, FLASH
Miscellaneous         FIFOs, AXIS width converter
Verification Infrastructure (1)
• Simulation and debugging
– built on the industry-standard Xilinx "xSim" simulator and "Scapy"
– Python scripts for stimuli construction and verification

Verification Infrastructure (2)
• xSim
– a Hardware Description Language (HDL) simulator
– performs functional and timing simulations for embedded, VHDL, Verilog and mixed designs
• Scapy
– a powerful interactive packet manipulation library for creating "test data"
– provides primitives for many standard packet formats
– allows addition of custom formats
Build Infrastructure (2)
• Build/Synthesis (using Xilinx Vivado)
– collection of shared hardware peripheral cores stitched together with AXI4-Lite and AXI4-Stream buses
– bitfile generation and verification using Xilinx synthesis and implementation tools

Build Infrastructure (3)
• Register system
– collates and generates addresses for all the registers and memories in a project
– uses integrated Python and Tcl scripts to generate HDL code (for hw) and header files (for sw)
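To make the idea of the register system concrete, the sketch below assigns addresses to a list of registers and emits C `#define` lines of the kind a software header might contain. It is not the actual NetFPGA register generation scripts: the function names, the 4-byte stride, the base address and the register names are all hypothetical.

```python
def assign_addresses(base, registers, stride=4):
    """Give each register a consecutive address starting at base."""
    return {name: base + index * stride for index, name in enumerate(registers)}

def emit_c_header(module, address_map):
    """Render the address map as C preprocessor definitions."""
    lines = [
        f"#define {module.upper()}_{name.upper()} 0x{address:08x}"
        for name, address in address_map.items()
    ]
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical module, base address and register names.
    regs = assign_addresses(0x44000000, ["id", "version", "pkt_count"])
    print(emit_c_header("crypto", regs))
```

Generating both the hardware and software views from one register list keeps the two sides from drifting apart, which is the point of collating the addresses in a single place.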
Section VI: Examples of using NetFPGA
Running the Reference Router
User-space development, 4x10GE line-rate forwarding
[Figure: on the host (CPU and memory), OSPF, BGP and "My Protocol" run in user space above the kernel routing table. Across PCI-Express, the routing table is mirrored into the FPGA's IPv4 router, which holds the forwarding table and packet buffer and has memory and four 10GbE ports.]

Enhancing Modular Reference Designs
1. Design
2. Simulate
3. Synthesize
4. Download
Languages: Verilog, System Verilog, VHDL, Bluespec…
EDA tools: Xilinx, Mentor, etc.
[Figure: the host (CPU and memory) runs PW-OSPF, an extensible Java GUI front panel and the NetFPGA driver, connected over PCI-Express. The FPGA, with memory and four 10GbE ports, contains In Q Mgmt, L2 Parse, L3 Parse, IP Lookup, "My Block" and Out Q Mgmt. Verilog modules are interconnected by FIFO interfaces.]

Creating new systems
1. Design
2. Simulate
3. Synthesize
4. Download
Languages: Verilog, System Verilog, VHDL, Bluespec…
EDA tools: Xilinx, Mentor, etc.
[Figure: the host (CPU and memory) runs the NetFPGA driver, connected over PCI-Express. The FPGA, with memory and four 10GbE ports, contains "My Design". The 10GE MAC is soft/replaceable.]
Contributed Projects

Platform   Project                  Contributor
1G         OpenFlow switch          Stanford University
1G         Packet generator         Stanford University
1G         NetFlow Probe            Brno University
1G         NetThreads               University of Toronto
1G         zFilter (Sp)router       Ericsson
1G         Traffic Monitor          University of Catania
1G         DFA                      UMass Lowell
10G        Bluespec switch          UCAM/SRI International
10G        Traffic Monitor          University of Pisa
10G        NF1G legacy on NF10G     Uni Pisa & Uni Cambridge
10G        High perf. DMA core      University of Cambridge
10G        BERI/CHERI               UCAM/SRI International
10G        OSNT                     UCAM/Stanford/GTech/CNRS
OpenFlow
• The most prominent NetFPGA success
• Has reignited the Software Defined Networking movement
• NetFPGA enabled OpenFlow
– A widely available open-source development platform
– Capable of line rate
• Until OpenFlow's commercial uptake, NetFPGA was its reference platform
Soft Processors in FPGAs
• Soft processors: processors in the FPGA fabric
• User uploads program to soft processor
• Easier to program software than hardware in the FPGA
• Could be customized at the instruction level
• CHERI: 64-bit MIPS soft processor, BSD OS
[Figure: an FPGA containing an Ethernet MAC, a DDR controller and one or more processors.]
100Gb/s Aggregation
• A development platform that can aggregate 100Gb/s for:
– Operating systems
– Protocols Testing
– Measurements
• NetFPGA SUME can:
– Aggregate 100Gb/s as Host Bus Adapter
– Be used to create large scale switches
[Figure: a non-blocking 300Gb/s switch built from three 100G links. Cost: ~$5000.]
Physical Interface Design
• A deployment and interoperability test platform
– Permits replacement of the physical layer
– Provides high-speed expansion interfaces with standardised interfaces
• Allows researchers to design custom daughterboards
• Permits closer integration

Power Efficient MAC
• A platform for 100Gb/s power-saving MAC design (e.g. lights-out MAC)
• Porting a MAC design to SUME permits:
– Power measurements
– Testing the protocol's response
– Reconsideration of power-saving mechanisms
– Evaluating suitability for complex architectures and systems

Interconnect
• Novel architectures with line-rate performance
– A lot of networking equipment
– Extremely complex
• NetFPGA SUME allows prototyping a complete solution
[Figure: N x N x N hyper-cube.]
How might we use NetFPGA?
Well, I'm not sure about you, but here is a list I created:
How might YOU use NetFPGA?
• Build an accurate, fast, line-rate NetDummy/nistnet element
• A flexible home-grown monitoring card
• Evaluate new packet classifiers
– (and application classifiers, and other neat network apps….)
• Prototype a full line-rate next-generation Ethernet-type
• Trying any of Jon Crowcroft's ideas (Sourceless IP routing for example)
• Demonstrate the wonders of Metarouting in a different implementation (dedicated hardware)
• Provable hardware (using a C# implementation and kiwi with NetFPGA as target h/w)
• Hardware supporting Virtual Routers
• Check that some brave new idea actually works, e.g. Rate Control Protocol (RCP), Multipath TCP,
• toolkit for hardware hashing
• MOOSE implementation
• IP address anonymization
• SSL decoding "bump in the wire"
• Xen specialist nic
• computational co-processor
• Distributed computational co-processor
• IPv6 anything
• IPv6 – IPv4 gateway (6in4, 4in6, 6over4, 4over6, ….)
• Netflow v9 reference
• PSAMP reference
• IPFIX reference
• Different driver/buffer interfaces (e.g. PFRING)
• or "escalators" (from gridprobe) for faster network monitors
• Firewall reference
• GPS packet-timestamp things
• High-Speed Host Bus Adapter reference implementations
– Infiniband
– iSCSI
– Myrinet
– Fiber Channel
• Smart Disk adapter (presuming a direct-disk interface)
• Software Defined Radio (SDR) directly on the FPGA (probably UWB only)
• Routing accelerator
– Hardware route-reflector
– Internet exchange route accelerator
• Hardware channel bonding reference implementation
• TCP sanitizer
• Other protocol sanitizer (applications… UDP DCCP, etc.)
• Full and complete Crypto NIC
• IPSec endpoint/ VPN appliance
• VLAN reference implementation
• metarouting implementation
• virtual <pick-something>
• intelligent proxy
• application embargo-er
• Layer-4 gateway
• h/w gateway for VoIP/SIP/skype
• h/w gateway for video conference spaces
• security pattern/rules matching
• Anti-spoof traceback implementations (e.g. BBN stuff)
• IPtv multicast controller
• Intelligent IP-enabled device controller (e.g. IP cameras or IP powerm
• DES breaker
• platform for flexible NIC API evaluations
• snmp statistics reference implementation
• sflow (hp) reference implementation
• trajectory sampling (reference implementation)
• implementation of zeroconf/netconf configuration language for route
• h/w openflow and (simple) NOX controller in one…
• Network RAID (multicast TCP with redundancy)
• inline compression
• hardware accelerator for TOR
• load-balancer
• openflow with (netflow, ACL, ….)
• reference NAT device
• active measurement kit
• network discovery tool
• passive performance measurement
• active sender control (e.g. performance feedback fed to endpoints for
• Prototype platform for NON-Ethernet or near-Ethernet MACs
– Optical LAN (no buffers)
Section VII: Example Project: Crypto Switch

Project: Cryptographic Switch
Implement a learning switch that encrypts upon transmission and decrypts upon reception
Cryptography
XOR function

A   B   A ^ B
0   0   0
0   1   1
1   0   1
1   1   0

XORing a value with itself always yields 0
XOR written as: ^ ⊻ ⨁
XOR is associative: (A ^ B) ^ C = A ^ (B ^ C)
Cryptography (cont.)
Simple cryptography:
– Generate a secret key
– Encrypt the message by XORing the message and key
– Decrypt the ciphertext by XORing with the key
Explanation:
(M ^ K) ^ K = M ^ (K ^ K)    (associativity)
            = M ^ 0          (A ^ A = 0)
            = M              (A ^ 0 = A)
Cryptography (cont.)
Example:
Message:               00111011
Key:                   10110001
Message ^ Key:         10001010
Key:                   10110001
Message ^ Key ^ Key:   00111011
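The worked example above can be checked directly with Python's `^` operator. The variable names are chosen here for illustration; the bit patterns are the ones from the slide.

```python
message = 0b00111011
key = 0b10110001

ciphertext = message ^ key   # encrypt: XOR the message with the key
recovered = ciphertext ^ key  # decrypt: XOR the ciphertext with the same key

print(f"Message ^ Key:       {ciphertext:08b}")
print(f"Message ^ Key ^ Key: {recovered:08b}")
```

Because K ^ K = 0, applying the key a second time cancels the first application and returns the original message.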
Cryptography (cont.)
Idea: Implement simple cryptography using XOR
– 32-bit key
– Encrypt every word in the payload with the key
[Figure: the packet header is left unchanged; each payload word is XORed (⨁) with a copy of the key.]
Note: XORing with a one-time pad of the same length as the message is secure/uncrackable. A repeated 32-bit key, as used here, is not a one-time pad. See: http://en.wikipedia.org/wiki/One-time_pad
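A software model of this idea, useful for generating expected packets when testing the hardware, might look like the sketch below. It rests on several assumptions not stated in the slides: the header is taken to be a 14-byte Ethernet header, the key is applied in big-endian byte order starting at the first payload byte, and a trailing partial word is XORed with the leading key bytes. The function name `xor_payload` is hypothetical.

```python
import struct

def xor_payload(frame: bytes, key: int, header_len: int = 14) -> bytes:
    """XOR every payload byte with the repeating 32-bit key; keep the header."""
    header, payload = frame[:header_len], frame[header_len:]
    key_bytes = struct.pack(">I", key)  # 32-bit key as 4 big-endian bytes
    transformed = bytes(b ^ key_bytes[i % 4] for i, b in enumerate(payload))
    return header + transformed

if __name__ == "__main__":
    frame = bytes(14) + b"hello world"       # dummy header plus payload
    encrypted = xor_payload(frame, 0xDEADBEEF)
    decrypted = xor_payload(encrypted, 0xDEADBEEF)
    print(encrypted[14:].hex())
    print(decrypted[14:])
```

The same function both encrypts and decrypts, which mirrors the project goal of encrypting on transmission and decrypting on reception with one shared key.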
Implementation goes wild…
What's a core?
• "IP Core" in Vivado
– Standalone module
– Configurable and reusable
• HDL (Verilog/VHDL) + TCL files
• Examples:
– 10G Port
– SRAM Controller
– NIC Output port lookup

HDL (Verilog)
• NetFPGA cores
– AXI-compliant
• AXI = Advanced eXtensible Interface
– Used in ARM-based embedded systems
– Standard interface
– AXI4/AXI4-Lite: control and status interface
– AXI4-Stream: data path interface
• Xilinx IPs and tool chains
– Mostly AXI-compliant

Scripts (TCL)
• Integrated into the Vivado toolchain
– Supports Vivado-specific commands
– Allows interactive queries to Vivado
• Has a large number of uses:
– Create projects
– Set properties
– Generate cores
– Define connectivity
– Etc.