Socket Programming CS457, FA ‘11
What is a socket?
● Basically, a socket is just a file descriptor that your application can read data from and write data to.
● Stream and datagram sockets provide an interface to the transport layer of the network stack, and correspond to TCP and UDP, respectively.
● There are other types of sockets, but for this class you won’t need them.
Configuration
● Back in my day...
  – We had to use the gethostbyname() library call and then manually load the information we got back into network data structures.
● But that’s a pain, so now we use getaddrinfo(), which does all the lookups for us as well as setting up our data structures!
  – This has the added advantage of being able to handle IPv4 as well as IPv6.
Configuration, cont’d
● To correctly use getaddrinfo(), you first need to set up struct addrinfo* hints, which contains a bunch of metadata about the connection you’re about to set up. Zero the structure first (e.g. with memset()); any field you don’t set must be 0.
  – hints->ai_family: Sets the address family you want to use. Use AF_INET for v4, AF_INET6 for v6, and AF_UNSPEC if you don’t care.
  – hints->ai_socktype: Sets the type of socket you want. Use SOCK_STREAM for TCP, or SOCK_DGRAM for UDP.
● Check the manpage for getaddrinfo() for more.
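The hints setup described above can be sketched as follows. This is a minimal illustration, not the course's reference code: the helper name lookup_tcp() is hypothetical, and it uses a stack-allocated struct addrinfo whose address is passed to getaddrinfo().

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: look up TCP endpoints for host:port.
 * Returns a list the caller must release with freeaddrinfo(), or NULL. */
struct addrinfo *lookup_tcp(const char *host, const char *port)
{
    struct addrinfo hints;
    struct addrinfo *results = NULL;

    memset(&hints, 0, sizeof hints);   /* fields we do not set must be zero */
    hints.ai_family   = AF_UNSPEC;     /* IPv4 or IPv6, we don't care */
    hints.ai_socktype = SOCK_STREAM;   /* TCP; SOCK_DGRAM would select UDP */

    int rc = getaddrinfo(host, port, &hints, &results);
    if (rc != 0) {
        /* getaddrinfo() reports errors via its return value, not errno */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return NULL;
    }
    return results;
}
```

A caller would walk the returned list through ai_next and call freeaddrinfo() when done.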
Create the Socket
● Now that you know what kind of socket you want, you need to create it.
  – Pretty convenient that there’s a call for that...
● The socket() system call takes parameters for the address family, socket type (stream/datagram), and protocol (TCP/UDP).
  – You can supply these manually, but it is better to use the values returned from getaddrinfo().
● socket() returns an int that represents the file descriptor, or -1 on failure.
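As a rough sketch of feeding getaddrinfo() results into socket(): the helper name make_tcp_socket() is an assumption made for illustration, and it simply tries each returned entry until one works.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: create a TCP socket suitable for host:port.
 * Returns the new fd, or -1 if the lookup or socket() failed. */
int make_tcp_socket(const char *host, const char *port)
{
    struct addrinfo hints, *results, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &results) != 0)
        return -1;

    /* Use the family, type, and protocol that getaddrinfo() filled in. */
    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd != -1)
            break;
    }
    freeaddrinfo(results);
    return fd;
}
```

The caller owns the returned fd and should close() it when finished.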
Bind the Socket
● When you write a server, you need to associate a socket (which now has an address family and protocol) with a specific local address and port number.
  – Clients don’t need to do this (but they can).
● This is accomplished by calling bind() and passing it the socket file descriptor (henceforth “fd”) and the address and port number obtained from getaddrinfo().
  – Don’t try to bind to a port below 1024! Those ports are privileged and normally require root.
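A server-side sketch of the bind step, under some assumptions: bind_server_socket() is a hypothetical name, and the AI_PASSIVE flag with a NULL host asks getaddrinfo() for a wildcard "any local address" suitable for bind().

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to `port` on any
 * local address. Returns the fd, or -1 on failure. */
int bind_server_socket(const char *port)
{
    struct addrinfo hints, *results, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;    /* wildcard address for a server */
    if (getaddrinfo(NULL, port, &hints, &results) != 0)
        return -1;

    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1)
            continue;                  /* try the next candidate */
        if (bind(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* bound successfully */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}
```

For example, bind_server_socket("8080") would return an fd ready to be passed to listen().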
Waiting for a Connection
● Once your server’s socket is bound, you need to tell it that it will be listening for and accepting connections from remote hosts.
  – Whaddya know, there are calls for those too...
● The listen() call takes the socket fd and the number of incoming connections it can queue up.
● The accept() call takes the socket fd, a pointer to a network data structure (which is used to store data about the incoming connection), and a pointer to that structure’s length. It returns a new fd for talking to that client; the original fd keeps listening.
Connecting to a Server
● In your client’s case, you don’t need to listen for connections.
  – You don’t even need to bind to a specific port; when you open a connection, your OS will choose a port for you, and the remote server learns it from the incoming packets.
● You do, however, need to open a TCP connection.
● Use the connect() call, which takes your socket fd and the destination’s address and port (both obtained from getaddrinfo()).
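A client-side sketch of the connect step. The helper name connect_to() is an assumption; it tries each address getaddrinfo() returns and keeps the first one that connects.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to host:port.
 * Returns a connected fd, or -1 if every candidate address failed. */
int connect_to(const char *host, const char *port)
{
    struct addrinfo hints, *results, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &results) != 0)
        return -1;

    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1)
            continue;
        /* No bind() here: the OS picks the local port for us. */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}
```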
Communication over the Connection
● Once the (TCP) connection is established, both processes use the same calls to send and receive data.
  – UDP uses different calls, which we’ll discuss shortly.
● send() and recv() need to know which fd to use, the buffer to send from or receive into, and how many bytes to transfer.
  – They both return how many bytes they actually sent or received, which may be less than what you asked for.
  – recv() returning 0 means the other side has closed the connection.
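Because send() and recv() may transfer fewer bytes than requested, a common pattern is to loop until everything has moved. The two helpers below are a sketch of that pattern; their names are hypothetical, and for brevity they do not retry on EINTR.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send() until all len bytes are sent.
 * Returns len on success, -1 on error. */
ssize_t send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(fd, p + sent, len - sent, 0);
        if (n == -1)
            return -1;
        sent += (size_t)n;             /* partial send: go around again */
    }
    return (ssize_t)sent;
}

/* Hypothetical helper: keep calling recv() until len bytes arrive or the
 * peer closes. Returns the byte count received, or -1 on error. */
ssize_t recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == -1)
            return -1;
        if (n == 0)
            break;                     /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A short return value from recv_all() therefore means the peer closed the connection before sending everything.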
Communication without a Connection
● When using UDP sockets, you don’t need to call connect(), so the socket itself doesn’t remember the server’s address and port.
  – You still look them up the same way though, with good old getaddrinfo().
● Instead of using send()/recv(), you call sendto()/recvfrom(), which are basically identical except that sendto() takes the address and port of the host you’re sending to, and recvfrom() fills in the address and port of the host the data came from.
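A sketch of the UDP calls, with some assumptions spelled out: the helper names are hypothetical, and udp_send_to() assumes the fd passed in is an IPv4 datagram socket, so it restricts the lookup to AF_INET.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one datagram from fd to host:port.
 * Assumes fd was created with socket(AF_INET, SOCK_DGRAM, 0). */
ssize_t udp_send_to(int fd, const char *host, const char *port,
                    const void *msg, size_t len)
{
    struct addrinfo hints, *dest;

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_INET;       /* must match the socket's family */
    hints.ai_socktype = SOCK_DGRAM;    /* UDP */
    if (getaddrinfo(host, port, &hints, &dest) != 0)
        return -1;

    /* The destination is supplied on every call; there is no connection. */
    ssize_t n = sendto(fd, msg, len, 0, dest->ai_addr, dest->ai_addrlen);
    freeaddrinfo(dest);
    return n;
}

/* Hypothetical helper: receive one datagram and record who sent it. */
ssize_t udp_recv_from(int fd, void *buf, size_t cap,
                      struct sockaddr_storage *peer, socklen_t *peer_len)
{
    *peer_len = sizeof *peer;
    return recvfrom(fd, buf, cap, 0, (struct sockaddr *)peer, peer_len);
}
```

The peer address filled in by udp_recv_from() can be passed straight back to sendto() to reply.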
Termination
● Once you are done with your socket, just like any other open file, you should close it.
● The close() call takes the fd and closes it.
  – Reading from or writing to this socket will now fail with an error.
● There is also a shutdown() call that takes an additional flag specifying whether to stop further reads, further writes, or both, without closing the fd.
  – You still have to call close(), so just skip shutdown().
Error Checking
● EVERY SINGLE ONE of these calls returns something if it fails, usually -1.
  – getaddrinfo() is the exception: it returns a nonzero error code that you can turn into a message with gai_strerror().
● You must check EVERY SINGLE ONE.
● For finer-grained debugging/error reporting, the calls that return -1 also set a global variable called errno, which you can print in readable form with the perror() call.
  – To use errno, #include <errno.h> (perror() itself is declared in <stdio.h>)
  – This is not required, but may help in debugging.
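The check-every-call habit can be sketched with a small wrapper. The name checked_socket() is hypothetical; the same if-it-returned--1 pattern applies to bind(), listen(), accept(), connect(), send(), and recv().

```c
#include <errno.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: create a socket and report any failure.
 * Returns the fd, or -1 with errno describing what went wrong. */
int checked_socket(int family, int type)
{
    int fd = socket(family, type, 0);
    if (fd == -1) {
        int saved = errno;             /* perror() may itself change errno */
        perror("socket");              /* prints "socket: <reason>" to stderr */
        errno = saved;
        return -1;
    }
    return fd;
}
```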
setsockopt()
● If you shut down your server and try to restart it, you may get an error saying that “the address is already in use”
  – This is because the OS keeps the old socket’s address and port reserved for a short while after it closes (connections lingering in the TIME_WAIT state).
● You can avoid this by setting the SO_REUSEADDR option with the setsockopt() function, which provides a bunch of other functionality as well.
  – Check Beej’s Guide or the manpage for details/examples.
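One common way to avoid the "address already in use" error is the SO_REUSEADDR socket option, set before bind(). The sketch below assumes that option is what is wanted here; the helper name allow_address_reuse() is hypothetical.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: let bind() reuse a recently used local address.
 * Call it after socket() and before bind(). Returns 0 or -1. */
int allow_address_reuse(int fd)
{
    int yes = 1;                       /* nonzero turns the option on */

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == -1) {
        perror("setsockopt");
        return -1;
    }
    return 0;
}
```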
Conversion Functions
● Some systems store the bytes of numbers backwards.
  – i.e. the hex number 0xfb68 might appear in memory as the bytes 68 fb (little-endian), or fb 68 (big-endian)
● In networking, we use big-endian representation, but can’t guarantee that the host systems themselves do.
● So we’ve got neat conversion library functions!
  – htons(), htonl(), ntohl(), ntohs()
● Rule of Thumb: Use hton_() before writing to the wire, and ntoh_() when reading off the wire.
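To make the rule of thumb concrete, here is a sketch that packs two numbers into a byte buffer in network order and reads them back. The 6-byte "header" layout and both function names are invented for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format: 2-byte port followed by 4-byte length,
 * both big-endian. Returns the number of bytes written (6). */
size_t pack_header(unsigned char *buf, uint16_t port, uint32_t length)
{
    uint16_t net_port   = htons(port);     /* host to network, 16-bit */
    uint32_t net_length = htonl(length);   /* host to network, 32-bit */

    memcpy(buf, &net_port, sizeof net_port);
    memcpy(buf + sizeof net_port, &net_length, sizeof net_length);
    return sizeof net_port + sizeof net_length;
}

/* Reverse of pack_header(): read the fields back into host order. */
void unpack_header(const unsigned char *buf, uint16_t *port, uint32_t *length)
{
    uint16_t net_port;
    uint32_t net_length;

    memcpy(&net_port, buf, sizeof net_port);
    memcpy(&net_length, buf + sizeof net_port, sizeof net_length);
    *port   = ntohs(net_port);             /* network to host, 16-bit */
    *length = ntohl(net_length);           /* network to host, 32-bit */
}
```

Whatever the host's byte order, packing the port 0xfb68 puts the byte fb first and 68 second in the buffer.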