Main / Protocols

Tunable TCP settings can be found under /proc/sys/net/ipv4:
http://unixfoo.blogspot.com/2008/09/linux-tunable-tcp-proc-parameters.html

TCP state diagram:
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.halu101/constatus.htm

How is TCP a "streaming" protocol?

TCP is stream-oriented because the receiver reassembles the data into one contiguous sequence of bytes. For example, suppose you send 4000 bytes. TCP divides them into segments, each carrying a sequence number: the first segment might cover bytes 1-1200, the second bytes 1201-2400, and so on. The IP datagrams that carry the segments may arrive out of order, but the receiver uses the sequence numbers to put them back in order, so the application sees a contiguous stream.
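To make the reordering step concrete, here is a minimal sketch of a reassembly buffer. It is a toy model, not how any real TCP stack is written: the class name Reassembler is hypothetical, and it assumes segments never overlap, are never duplicated, and that sequence numbers do not wrap.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy reassembly buffer: segments are stored under the sequence number of
// their first byte and released only once they are contiguous with the
// bytes already delivered.
class Reassembler {
public:
    explicit Reassembler(std::uint32_t first_seq) : next_seq_(first_seq) {}

    // Accept one segment (possibly out of order) and return whatever bytes
    // can now be delivered in order. Returns "" if there is still a gap.
    std::string on_segment(std::uint32_t seq, const std::string &data)
    {
        pending_[seq] = data;

        std::string out;
        auto it = pending_.find(next_seq_);
        while (it != pending_.end()) {
            out += it->second;
            next_seq_ += static_cast<std::uint32_t>(it->second.size());
            pending_.erase(it);
            it = pending_.find(next_seq_);
        }
        return out;
    }

private:
    std::uint32_t next_seq_;                        // next byte the reader expects
    std::map<std::uint32_t, std::string> pending_;  // segments waiting for a gap to fill
};
```

If the segment starting at byte 5 arrives before the one starting at byte 1, the first call returns nothing and the second call returns both segments joined in order.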

A byte stream is one continuous run of data with no message boundaries. With datagrams, each (smaller) chunk of data is sent and received as a whole. In practice, each send/write call on a datagram socket sends one packet and each read/recv call receives one packet, while on a stream socket the data can be split or combined arbitrarily between sender and receiver. For example, a sender can call send() ten times and the receiver may get all of that data from a single recv() call. With datagrams, ten send calls mean ten packets and ten receive calls.

For socket use, some have simply said:

  • datagram = A connectionless, message-oriented socket
  • stream = A stream-oriented socket with the concept of a connection
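The difference can be observed without a network. The sketch below is an illustration under one assumption: it uses a Unix-domain socketpair() as a stand-in for TCP and UDP, because that needs no server. The function name bytes_in_first_recv is hypothetical. It sends three 2-byte messages and then calls recv() once with a large buffer.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Send three 2-byte messages over a connected socket pair of the given type
// (SOCK_STREAM or SOCK_DGRAM), then do ONE recv() with a large buffer.
// Returns the number of bytes that single recv() delivered, or -1 on error.
ssize_t bytes_in_first_recv(int sock_type)
{
    int fds[2];
    if (socketpair(AF_UNIX, sock_type, 0, fds) == -1)
        return -1;

    const char *msgs[] = {"ab", "cd", "ef"};
    for (const char *m : msgs) {
        if (send(fds[0], m, 2, 0) != 2) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }

    char buf[64];
    ssize_t n = recv(fds[1], buf, sizeof(buf), 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

On Linux the stream pair is expected to hand back all 6 bytes in one call, while the datagram pair returns only the first 2-byte message. The stream API itself only promises "some bytes, in order", so code that reads from a stream socket should never rely on message boundaries.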

Socket Programming

A socket is a software endpoint for bidirectional communication between a server program and one or more client programs. The server's socket is bound to a specific port number on the machine where it runs, so any client program anywhere on the network can reach the server by connecting its own socket to that address and port.

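A socket can be set up so that the connect() call is either blocking or nonblocking. The mode is a property of the socket (for example O_NONBLOCK set with fcntl()), not a parameter to connect(). To get a connect timeout, make the socket nonblocking, call connect(), and then wait for the socket to become writable with select().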
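A minimal sketch of a connect with a timeout follows. It swaps in poll() for select(); either call works for waiting on a single socket. The function names connect_with_timeout and demo_loopback_connect are hypothetical, and the demo assumes the loopback interface is available.

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Connect fd to addr, waiting at most timeout_ms milliseconds.
// Returns 0 on success, or -1 with errno set (ETIMEDOUT if time ran out).
// The socket is left in non-blocking mode.
int connect_with_timeout(int fd, const struct sockaddr *addr, socklen_t len, int timeout_ms)
{
    // Blocking vs. non-blocking is a property of the socket itself.
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;               // connected immediately
    if (errno != EINPROGRESS)
        return -1;              // failed immediately

    // The handshake continues in the background; the socket becomes
    // writable when it finishes, whether it succeeded or failed.
    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    int n = poll(&pfd, 1, timeout_ms);
    if (n == 0)
        errno = ETIMEDOUT;
    if (n <= 0)
        return -1;

    // SO_ERROR reports how the connection attempt ended.
    int err = 0;
    socklen_t errlen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) == -1)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

// Hypothetical demo: listen on an ephemeral loopback port, then connect to it.
// Returns 0 if the connection was established within timeout_ms.
int demo_loopback_connect(int timeout_ms)
{
    struct sockaddr_in sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(0);     // let the kernel pick a free port
    socklen_t salen = sizeof(sa);

    int srv = socket(AF_INET, SOCK_STREAM, 0);
    if (srv == -1)
        return -1;

    int cli = -1;
    int rc = -1;
    if (bind(srv, reinterpret_cast<struct sockaddr *>(&sa), sizeof(sa)) == 0 &&
        listen(srv, 1) == 0 &&
        getsockname(srv, reinterpret_cast<struct sockaddr *>(&sa), &salen) == 0 &&
        (cli = socket(AF_INET, SOCK_STREAM, 0)) != -1)
        rc = connect_with_timeout(cli, reinterpret_cast<struct sockaddr *>(&sa), sizeof(sa), timeout_ms);

    if (cli != -1)
        close(cli);
    close(srv);
    return rc;
}
```

Checking SO_ERROR after the wait matters: the socket is reported writable both when the connection succeeds and when it fails, so writability alone does not mean "connected".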

socket/connect primer: https://idea.popcount.org/2014-04-03-bind-before-connect/
TCP flags: http://amits-notes.readthedocs.io/en/latest/networking/tcpdump.html

Whether the protocol is stateful or stateless, two clients can connect to the same server port because the server gets a separate socket for each connection. A TCP connection is identified by the 4-tuple (source IP, source port, destination IP, destination port), and different clients normally differ in source IP. The same client can also hold two connections to the same server port, since they differ in source port.

What is 0.0.0.0?

This means all IPv4 addresses on the local machine (all configured network interfaces). When you run a TCP server, bind to this address if you want to accept connections from other machines on any interface.

What is 127.0.0.1?

This is the loopback address on the local machine. A TCP server bound to this address only accepts connections from the local host, not from other machines.
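The choice between the two addresses is made at bind() time. Here is a small sketch; the helper names make_listener and bound_address are hypothetical, and port 0 is used so the kernel picks a free port.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unistd.h>

// Create a TCP listening socket bound to ip ("0.0.0.0" or "127.0.0.1")
// on an ephemeral port. Returns the descriptor, or -1 on error.
int make_listener(const char *ip)
{
    struct sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(0);   // port 0: let the kernel choose
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    if (bind(fd, reinterpret_cast<struct sockaddr *>(&addr), sizeof(addr)) == -1 ||
        listen(fd, 8) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

// Return the local address the socket is bound to, in dotted-quad form,
// or an empty string on error.
std::string bound_address(int fd)
{
    struct sockaddr_in addr = {};
    socklen_t len = sizeof(addr);
    if (getsockname(fd, reinterpret_cast<struct sockaddr *>(&addr), &len) == -1)
        return "";

    char text[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr.sin_addr, text, sizeof(text)) == nullptr)
        return "";
    return text;
}
```

A listener made with "0.0.0.0" shows up in netstat as 0.0.0.0:port and is reachable through every interface; one made with "127.0.0.1" is reachable only from the same host.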

Learning about established sockets

We can get file descriptor information from /proc/<PID>/fd/

ls -al /proc/26782/fd/
total 0
dr-x------    2 root     root             0 Mar  6 22:33 .
dr-xr-xr-x    7 root     root             0 Mar  1 02:51 ..
lr-x------    1 root     root            64 Mar  6 22:33 0 -> /dev/null
lrwx------    1 root     root            64 Mar  6 22:33 1 -> /dev/console
lrwx------    1 root     root            64 Mar  6 22:33 10 -> socket:[1009512]
lrwx------    1 root     root            64 Mar  6 22:33 11 -> socket:[246262]
lrwx------    1 root     root            64 Mar  6 22:33 2 -> /dev/console
lr-x------    1 root     root            64 Mar  6 22:33 3 -> /usr/bin/compile_time
lrwx------    1 root     root            64 Mar  6 22:33 4 -> /dev/fpga_mem
lr-x------    1 root     root            64 Mar  6 22:33 5 -> /etc/config/Config.ini
lrwx------    1 root     root            64 Mar  6 22:33 6 -> /dev/axi_fpga_dev
lrwx------    1 root     root            64 Mar  6 22:33 7 -> /dev/fpga_mem
lr-x------    1 root     root            64 Mar  6 22:33 8 -> /etc/config/Config.ini
lrwx------    1 root     root            64 Mar  6 22:33 9 -> socket:[18898]

FDs 0, 1, and 2 are the standard streams (stdin, stdout, stderr). The numbers in brackets after socket: are the socket inodes. How can we tell which ports these sockets use?

We need to go look at /proc/<PID>/net/tcp:

:~# cat /proc/26782/net/tcp | grep 1009512
  27: EE0010AC:0FBC 530010AC:C5EA 01 00000000:00000000 00:00000000 00000000     0        0 1009512 1 e5550ac0 20 0 0 10 -1
:~# cat /proc/26782/net/tcp | grep 246262
  66: EE0010AC:BCDA 040010AC:0D05 01 00000000:00000000 02:00000018 00000000     0        0 246262 2 e5406040 20 4 30 10 -1
:~# cat /proc/26782/net/tcp | grep 18898
   3: 00000000:0FBC 00000000:0000 0A 00000000:00000065 02:00000001 00000000     0        0 18898 2 e3949040 100 0 0 10 0

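Now we have hex-encoded IP:port pairs (local address first, then remote address) that identify the entries below from netstat -peanut. The address is the raw 32-bit value printed as hex, so on this little-endian machine the bytes appear reversed: EE0010AC reads as AC.10.00.EE = 172.16.0.238, and port 0FBC is 4028. The column after the two addresses is the connection state: 01 is ESTABLISHED and 0A is LISTEN.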

tcp      101      0 0.0.0.0:4028            0.0.0.0:*               LISTEN      26782/bmminer
tcp        0      0 172.16.0.238:4028       172.16.0.83:50666       ESTABLISHED 26782/bmminer
tcp        0      0 172.16.0.238:48346      172.16.0.4:3333         ESTABLISHED 26782/bmminer
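The hex fields can be decoded mechanically. The sketch below handles IPv4 only and assumes the file came from a little-endian host, as in the listings above. The function names are hypothetical; the state table follows the Linux kernel's TCP state numbering.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Decode one "ADDR:PORT" field from /proc/<PID>/net/tcp (IPv4 only),
// e.g. "EE0010AC:0FBC" -> "172.16.0.238:4028".
// Assumes a little-endian host, where the address bytes appear reversed.
// Returns an empty string if the field does not parse.
std::string decode_proc_net_tcp_endpoint(const std::string &field)
{
    unsigned int addr = 0;
    unsigned int port = 0;
    if (std::sscanf(field.c_str(), "%8X:%4X", &addr, &port) != 2)
        return "";

    char out[32];
    std::snprintf(out, sizeof(out), "%u.%u.%u.%u:%u",
                  addr & 0xFF, (addr >> 8) & 0xFF,
                  (addr >> 16) & 0xFF, (addr >> 24) & 0xFF, port);
    return out;
}

// Map the hex state column ("01", "0A", ...) to its name.
const char *tcp_state_name(const std::string &hex_state)
{
    static const char *const names[] = {
        "", "ESTABLISHED", "SYN_SENT", "SYN_RECV", "FIN_WAIT1", "FIN_WAIT2",
        "TIME_WAIT", "CLOSE", "CLOSE_WAIT", "LAST_ACK", "LISTEN", "CLOSING"};
    unsigned long st = std::strtoul(hex_state.c_str(), nullptr, 16);
    return (st >= 1 && st <= 11) ? names[st] : "UNKNOWN";
}
```

Applied to the three /proc lines above, this reproduces the address, port and state columns of the netstat output.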

Keep Alive

There isn't really an automatic way to know that the other end of your TCP connection has gone away, unless it shuts down gracefully and sends you a close (FIN). What you can do is adjust the timing of keep-alive probes. Here's a good write-up about it: https://stackoverflow.com/questions/5435098/how-to-use-so-keepalive-option-properly-to-detect-that-the-client-at-the-other-e
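As a sketch of what adjusting the timing looks like per socket, the function below enables keep-alive and sets the three timing knobs. It assumes Linux, where the options are named TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT; other systems may use different names. The function name enable_keepalive is hypothetical.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enable TCP keep-alive on fd and tighten its timing (Linux option names).
//   idle_s      seconds of idle time before the first probe   (TCP_KEEPIDLE)
//   interval_s  seconds between unanswered probes             (TCP_KEEPINTVL)
//   count       unanswered probes before the kernel gives up  (TCP_KEEPCNT)
// Returns 0 on success, -1 on error with errno set by setsockopt().
int enable_keepalive(int fd, int idle_s, int interval_s, int count)
{
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) == -1)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 10, 5, 3), a silent peer would be declared dead after roughly 10 + 3 * 5 = 25 seconds of inactivity. The system-wide defaults for the same three values live in /proc/sys/net/ipv4 (tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes).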

You can't "check" the status of the socket from userspace with keepalive. Instead, the kernel is simply more aggressive about forcing the remote end to acknowledge packets, and about determining whether the socket has gone bad. Once keepalive has determined that the remote end is down, the next read or write on the socket fails: on Linux typically with ETIMEDOUT first, and with EPIPE (plus a SIGPIPE, unless MSG_NOSIGNAL is used) on later writes.

There is a way though to use poll and recv on an open socket to infer the state of the server for the client. Here's an implementation with a reconnect method:

#ifdef __unix__

#include "TcpConnectionImpl.h"

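#include <string>
#include <exception>
#include <stdexcept>	// std::runtime_error
#include <chrono>	// std::chrono::duration_cast
#include <cmath>
#include <cerrno>
#include <cstdio>	// printf, fflush

#include <unistd.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/select.h>	// select, fd_set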

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netdb.h>

TcpConnectionImpl::TcpConnectionImpl(struct addrinfo const *remote)
	{
	m_socket = socket(remote->ai_family, remote->ai_socktype, remote->ai_protocol);
	if (m_socket == INVALID_SOCKET)
		throw std::runtime_error("unable to allocate socket");

	if (connect(m_socket, remote->ai_addr, remote->ai_addrlen) == -1)
		{
		int err = errno;
		printf("connect returned error %d\n",err);
		fflush(stdout);
		close(m_socket);
		throw std::runtime_error("unable to connect to TCP server");
		}

	// Make it non-blocking so we can control timeouts via select()/poll() calls
	int val = 1;	// FIONBIO takes a pointer to int
	if (ioctl (m_socket,FIONBIO,&val) < 0)
		{
		close(m_socket);
		throw std::runtime_error("unable to set socket to non-blocking");
		}

	}

TcpConnectionImpl::~TcpConnectionImpl()
	{
	if (m_socket != INVALID_SOCKET)
		close(m_socket);
	}


/*!
 * \return true if the device is "connected", for some device-specific definition of connected
 *
 * \remark
 * This routine is necessarily device specific.  Serial ports might return true if
 * the serial port is open and ready for business.  TCP streams can return the
 * connection state.
 */
#include <bsp.h>
bool TcpConnectionImpl::is_connected()
	{
        bool state = true;       

        if (m_socket == INVALID_SOCKET) {
            state = false;
        }
        else {
            // This procedure tests the connection to see if someone is on the other end

            // use the poll system call to be notified about socket status changes
            struct pollfd pfd;
            pfd.fd = m_socket;
            pfd.events = POLLIN | POLLHUP | POLLRDNORM;
            pfd.revents = 0;
            // call poll with a timeout of 500 ms
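            if (poll(&pfd, 1, 500) > 0) {
                // if result > 0, this means that there is either data available on the
                // socket, or the socket has been closed or is in an error state
                char buffer[32];
                ssize_t n = recv(m_socket, buffer, sizeof(buffer), MSG_PEEK | MSG_DONTWAIT);
                // recv returns zero when the peer has closed the connection and -1 on error;
                // EAGAIN/EWOULDBLOCK/EINTR are transient and do not mean "closed"
                bool transient = (n < 0) && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR);
                if (n <= 0 && !transient) {
                    //bsp::debug_printf(".");
                    state = false;
                    close(m_socket);
                    m_socket = INVALID_SOCKET;
                }
            }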

        } //else

        return state;
	}

/*!
 * \brief  Duplicate the connect process in the constructor in case the server has hung up
 *
 */
void TcpConnectionImpl::reconnect(struct addrinfo const *remote)
{
    // closing and invalidating the socket will keep read and write from mucking around with unready new socket
	if (m_socket != INVALID_SOCKET) {
		close(m_socket);
        m_socket = INVALID_SOCKET;
    }

    int new_socket = socket(remote->ai_family, remote->ai_socktype, remote->ai_protocol);
	if (new_socket == INVALID_SOCKET)
		throw std::runtime_error("unable to allocate socket");

	if (connect(new_socket, remote->ai_addr, remote->ai_addrlen) == -1)
		{
		int err = errno;
		printf("connect returned error %d\n",err);
		fflush(stdout);
		close(new_socket);
		throw std::runtime_error("unable to connect to TCP server");
		}

	// Make it non-blocking so we can control timeouts via select()/poll() calls
	int val = 1;	// FIONBIO takes a pointer to int
	if (ioctl (new_socket,FIONBIO,&val) < 0)
		{
		close(new_socket);
		throw std::runtime_error("unable to set socket to non-blocking");
		}

    // socket is now ready to use for read and write
    m_socket = new_socket;
}


/*!
 * Read at least one byte, and whatever else might be immediately available from the input stream
 *
 * \param [out]	bufp	Destination buffer for data bytes
 * \param [in]	nbytes	Maximum number of bytes to read
 * \param [in]	timeout	Maximum time to wait for the first byte to arrive
 *
 * \return	Number of bytes actually read
 */
size_t TcpConnectionImpl::read_at_least_one(void *bufp, size_t nbytes, rtos::steady_clock::duration timeout)
	{
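	int n;
	struct timeval tv;
	fd_set rfd;

	// check first: FD_SET must not be given an invalid descriptor
	if (m_socket == INVALID_SOCKET)
		return 0;

	FD_ZERO(&rfd);
	FD_SET(m_socket,&rfd);

	// NOTE: assumes timeout.count() is in milliseconds
	tv.tv_sec = timeout.count() / 1000;
	tv.tv_usec = (timeout.count() % 1000) * 1000;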

	n = select(m_socket+1,&rfd,NULL,NULL,&tv);
	if (n < 0)
		return 0;
	if (n == 0)
		return 0;	// no data to read

	// read whatever we have available
	//return recv(m_socket,reinterpret_cast<char *>(bufp),nbytes,0);

    // we don't want to send a -1 to the process method if there's an error
    // (recv returns a signed ssize_t; storing it in size_t would hide the -1)
    ssize_t readbytes = recv(m_socket,reinterpret_cast<char *>(bufp),nbytes,0);
    if (readbytes < 0) {
        int err = errno;
        printf("recv error %d\n",err);
        return 0;
    }

    return static_cast<size_t>(readbytes);
	}


/*!
 * Write data to the output stream
 *
 * \param [in]	bufp	Buffer with bytes to write
 * \param [in]	nbytes	How many bytes to write
 * \param [in]	timeout	Maximum time to wait for data to be written (total time, not inter-byte time)
 *
 * \return	Number of bytes actually written
 */
size_t TcpConnectionImpl::write(void const *bufp, size_t nbytes, rtos::steady_clock::duration timeout)
	{
	char const *src = static_cast<char const *>(bufp);
	size_t nwritten = 0;
	rtos::steady_countdown_timer timer(timeout);

    if (m_socket == INVALID_SOCKET)
        return 0;

	while (nwritten < nbytes)
		{
		struct pollfd pfd[1];
		pfd[0].fd = m_socket;
		pfd[0].events = POLLOUT;
		pfd[0].revents = 0;
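		auto ms_remaining = std::chrono::duration_cast<std::chrono::milliseconds>(timer.remaining()).count();
		if (ms_remaining < 0)
			ms_remaining = 0;	// a negative timeout would make poll() block indefinitely
		int npoll = poll(pfd,1,static_cast<int>(ms_remaining));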
		if (npoll <= 0)
			break;

        int n = send(m_socket, src+nwritten, nbytes-nwritten, MSG_NOSIGNAL);
		if (n <= 0)
			break;

		nwritten += n;

		if (timer.is_expired())
			break;
		}

	return nwritten;
	}

#endif // __unix__

SNMP

v3 introduced some essential security components: encryption (56-bit DES), message integrity, and authentication (MD5 or SHA)

Security

Encryption takes place first, followed by authentication/integrity, so the integrity check covers the bytes that actually go on the wire. Think of a parity bit on a channel: computing it is the last thing the sender does, and checking it is the first thing the receiver does.

IPsec operates at kernel level, so user-level code needs a kernel IPsec API to pass its parameters through.

Packet Synchronization

Usually sync markers are used to denote the beginning of a frame. Here's a basic scheme:

  • During packet processing at whatever level, when the end of a packet is complete per the given length, the next sync marker is sought
  • Non-sync bytes are ignored until sync is found
  • The sequence number is verified and the length field is checked for range if applicable
  • If these checks fail, the packet is “dropped” i.e. the bytes are tossed until another sync pattern is found
  • A bogus sync pattern found in the data of a bad packet will kick off new header processing, which proceeds until one of the header/trailer validation checks fails; then the sync pattern search restarts
  • A sync pattern found within the payload of a valid packet (i.e. before end reached per length field) is processed as data
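The scheme above can be sketched as a small buffer-based parser. Everything specific here is an assumption made for illustration: the frame layout (two sync bytes 0xAA 0x55, a 1-byte sequence number, a 1-byte length, then the payload), the length range of 1 to 16, the rule that a sequence number may run at most 7 ahead of the expected one, and the class name SyncFramer. There is no trailer or checksum in this toy layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical frame layout (illustration only):
//   [0xAA][0x55][seq:1][len:1][payload: len bytes], with 1 <= len <= 16.
constexpr std::uint8_t kSync0 = 0xAA;
constexpr std::uint8_t kSync1 = 0x55;
constexpr std::size_t  kHeaderLen = 4;
constexpr std::uint8_t kMaxPayload = 16;
constexpr std::uint8_t kSeqWindow = 8;   // accept seq up to 7 ahead of expected

class SyncFramer {
public:
    // Append received bytes, then return the payload of every complete,
    // valid frame found so far. Leftover bytes are kept for the next call.
    std::vector<std::vector<std::uint8_t>> feed(const std::vector<std::uint8_t> &bytes)
    {
        buf_.insert(buf_.end(), bytes.begin(), bytes.end());
        std::vector<std::vector<std::uint8_t>> frames;
        std::size_t pos = 0;

        while (true) {
            // Non-sync bytes are ignored until the sync pattern is found.
            while (pos + 1 < buf_.size() &&
                   !(buf_[pos] == kSync0 && buf_[pos + 1] == kSync1))
                ++pos;
            if (pos + kHeaderLen > buf_.size())
                break;                          // need more bytes for a header

            // Verify the sequence number and range-check the length field.
            std::uint8_t seq = buf_[pos + 2];
            std::uint8_t len = buf_[pos + 3];
            std::uint8_t gap = static_cast<std::uint8_t>(seq - expected_seq_);
            bool seq_ok = !have_seq_ || gap < kSeqWindow;
            bool len_ok = len >= 1 && len <= kMaxPayload;
            if (!seq_ok || !len_ok) {
                ++pos;                          // drop: resume the sync search one byte on
                continue;
            }

            if (pos + kHeaderLen + len > buf_.size())
                break;                          // valid header, payload still arriving

            // Payload bytes are taken as data even if they contain the sync pattern.
            frames.emplace_back(buf_.begin() + pos + kHeaderLen,
                                buf_.begin() + pos + kHeaderLen + len);
            expected_seq_ = static_cast<std::uint8_t>(seq + 1);
            have_seq_ = true;
            pos += kHeaderLen + len;            // next sync is sought after this frame
        }

        buf_.erase(buf_.begin(), buf_.begin() + pos);
        return frames;
    }

private:
    std::vector<std::uint8_t> buf_;   // bytes received but not yet consumed
    std::uint8_t expected_seq_ = 0;
    bool have_seq_ = false;           // the first frame may carry any sequence number
};
```

Resuming the search one byte after a rejected sync, rather than after the whole rejected header, is a design choice: it lets a genuine sync pattern that overlaps the bogus header still be found. A real protocol would normally add a checksum or trailer so that a bogus header which happens to pass the range checks is still caught.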

Page last modified on January 24, 2024, at 01:22 PM