Optimizing TCP for High-Performance Applications: An HFT Developer's Guide

Introduction

After spending over a decade in high-frequency trading (HFT) environments where microseconds can translate to millions in profit or loss, I've developed a deep appreciation for network optimization. Throughout my career building ultra-low-latency trading systems, I've collected, tested, and refined numerous TCP optimization techniques that have proven invaluable.

This guide consolidates years of practical experience tuning TCP connections in production environments where performance is non-negotiable. I'm sharing these techniques in the hope that they help others squeeze maximum performance from their networked applications.

What follows is a comprehensive collection of TCP socket optimizations for both Linux and Windows platforms, with concrete C++ examples, verification commands, and explanations of why each setting matters. Whether you're building HFT systems, game servers, IoT applications, or any performance-critical networked software, these techniques should provide tangible benefits.

Let's dive in!

Table of Contents

  1. Socket Options for Low Latency
  2. Network Interface Configuration
  3. Operating System TCP Tuning
  4. Application Design Considerations
  5. Performance Testing
  6. Complete Code Example

Socket Options for Low Latency

1. TCP_NODELAY (Disable Nagle's Algorithm)

Purpose:

Disables Nagle's algorithm, which coalesces small writes into larger packets to reduce per-packet overhead. That is efficient for bandwidth, but it adds latency because small messages sit in the send buffer waiting to be batched.

Check current setting:

Linux:

# TCP_NODELAY is per-socket state that Linux does not expose in /proc.
# Verify it programmatically with getsockopt() (shown below), or watch the
# option being set on a live process with PID 1234:
strace -e trace=setsockopt -p 1234

Windows:

# Use netstat to find connections, but Windows doesn't expose Nagle setting directly
netstat -ano | findstr <PORT>

Set option in C++17:

// Same for both Linux and Windows
#include <netinet/tcp.h>   // Linux
#include <winsock2.h>      // Windows

int flag = 1;
int result = setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, 
                        (char*)&flag, sizeof(flag));
if (result < 0) {
    // Handle error
    std::cerr << "Failed to set TCP_NODELAY: " << strerror(errno) << std::endl;
}

// To check if it's set
int optval;
socklen_t optlen = sizeof(optval);
getsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, (char*)&optval, &optlen);
std::cout << "TCP_NODELAY is " << (optval ? "enabled" : "disabled") << std::endl;

2. TCP_QUICKACK (Linux Only)

Purpose:

Enables quickack mode, so pending ACKs are sent immediately instead of being delayed. This reduces response time in request/response-style applications.

Check current setting:

# Quickack mode is dynamic, per-socket state with no system-wide view on
# mainline kernels (some RT/vendor kernels expose a tcp_delack_min knob).
# Verify on a specific socket with getsockopt(TCP_QUICKACK).

Set option in C++17:

#ifdef __linux__
    int flag = 1;
    int result = setsockopt(sockfd, IPPROTO_TCP, TCP_QUICKACK, 
                            &flag, sizeof(flag));
    if (result < 0) {
        std::cerr << "Failed to set TCP_QUICKACK: " << strerror(errno) << std::endl;
    }

    // Note: TCP_QUICKACK may need to be re-applied after each read operation
    // for persistent effect
#endif
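
Because the kernel can clear quickack mode on its own, I re-arm it around every read on latency-critical paths. A minimal sketch, assuming sockfd is a connected socket and the headers from the snippets above:

#ifdef __linux__
// Re-arm quickack immediately before blocking in recv(), so the ACK for
// the next incoming segment goes out without the delayed-ACK timer.
ssize_t recv_quickack(int sockfd, char* buf, size_t len) {
    int flag = 1;
    setsockopt(sockfd, IPPROTO_TCP, TCP_QUICKACK, &flag, sizeof(flag));
    return recv(sockfd, buf, len, 0);
}
#endif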

3. Buffer Sizes (SO_RCVBUF / SO_SNDBUF)

Purpose:

Configures socket buffer sizes to optimize for your specific network conditions and workload patterns.

Check current settings:

Linux:

# For system defaults
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/wmem_default
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/wmem_max

Windows:

# Windows doesn't expose these directly via command line
# Use netsh command to view global TCP parameters:
netsh int tcp show global

Set option in C++17:

// Set receive buffer size (both platforms)
int rcvbuf = 262144; // 256 KB
int result = setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, 
                        (char*)&rcvbuf, sizeof(rcvbuf));
if (result < 0) {
    std::cerr << "Failed to set SO_RCVBUF: " << strerror(errno) << std::endl;
}

// Set send buffer size (both platforms)
int sndbuf = 262144; // 256 KB
result = setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, 
                   (char*)&sndbuf, sizeof(sndbuf));
if (result < 0) {
    std::cerr << "Failed to set SO_SNDBUF: " << strerror(errno) << std::endl;
}

// To verify the settings
int actual_rcvbuf;
int actual_sndbuf;
socklen_t size = sizeof(int);
getsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, (char*)&actual_rcvbuf, &size);
getsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, (char*)&actual_sndbuf, &size);
std::cout << "Actual receive buffer: " << actual_rcvbuf << std::endl;
std::cout << "Actual send buffer: " << actual_sndbuf << std::endl;

// Note: Linux doubles the value you set (to account for bookkeeping overhead),
// so getsockopt() reports roughly twice what you requested. Also set buffer
// sizes before connect()/listen(): the window scale factor is negotiated
// during the handshake and cannot grow afterwards.

4. TCP_FASTOPEN (Linux)

Purpose:

Reduces connection setup latency by allowing data transfer during the initial handshake.

Check current setting:

cat /proc/sys/net/ipv4/tcp_fastopen
# Values: 0 (disabled), 1 (client), 2 (server), 3 (both)

Set option in C++17:

#ifdef __linux__
    // For server sockets
    int qlen = 5; // Maximum number of pending TFO connection requests
    int result = setsockopt(listen_fd, IPPROTO_TCP, TCP_FASTOPEN, 
                           &qlen, sizeof(qlen));
    if (result < 0) {
        std::cerr << "Failed to set TCP_FASTOPEN: " << strerror(errno) << std::endl;
    }

    // For client sockets, use sendto() with MSG_FASTOPEN in place of
    // connect()+send() (see the client-side sketch below)
#endif
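
On the client side, here's a hedged sketch of a Fast Open request; data, data_len, and server_addr are assumed to be set up already, and it falls back to a normal connect if the kernel or route doesn't support TFO:

#ifdef __linux__
    // sendto() with MSG_FASTOPEN performs the connect and, once the server
    // has handed out a TFO cookie, carries the payload in the SYN itself.
    ssize_t n = sendto(sockfd, data, data_len, MSG_FASTOPEN,
                       (struct sockaddr*)&server_addr, sizeof(server_addr));
    if (n < 0 && errno == EOPNOTSUPP) {
        // TFO unavailable: fall back to the classic two-step sequence
        connect(sockfd, (struct sockaddr*)&server_addr, sizeof(server_addr));
        send(sockfd, data, data_len, 0);
    }
#endif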

5. SO_REUSEADDR and SO_REUSEPORT

Purpose:

SO_REUSEADDR allows binding to a local port even if it's in TIME_WAIT state.
SO_REUSEPORT (Linux) allows multiple sockets to bind to the same port, enabling load distribution.

Check current setting:
These are per-socket options and not visible system-wide.

Set option in C++17:

// SO_REUSEADDR (both platforms)
int reuse = 1;
int result = setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, 
                       (char*)&reuse, sizeof(reuse));
if (result < 0) {
    std::cerr << "Failed to set SO_REUSEADDR: " << strerror(errno) << std::endl;
}

// SO_REUSEPORT (Linux only)
#ifdef __linux__
    int reuseport = 1;
    result = setsockopt(sockfd, SOL_SOCKET, SO_REUSEPORT, 
                       &reuseport, sizeof(reuseport));
    if (result < 0) {
        std::cerr << "Failed to set SO_REUSEPORT: " << strerror(errno) << std::endl;
    }
#endif
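
To show the load-distribution use case concretely, here's the pattern I use: one listening socket per worker thread, all bound to the same port, with the kernel hashing incoming connections across the group. make_listener is an illustrative helper, not a library call:

#ifdef __linux__
// Each worker owns its own listening socket on the shared port.
int make_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    bind(fd, (sockaddr*)&addr, sizeof(addr));
    listen(fd, 128);
    return fd;
}
// Usage: spawn one thread per core, each calling make_listener(port)
// and running its own accept() loop.
#endif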

6. SO_PRIORITY (Linux)

Purpose:

Sets the priority of the socket's traffic for Quality of Service.

Check current setting:
This is a per-socket setting not directly visible.

Set option in C++17:

#ifdef __linux__
    // Unprivileged processes may set 0-6 (6 = highest priority);
    // values above 6 require CAP_NET_ADMIN
    int priority = 6; // High priority for low-latency traffic
    int result = setsockopt(sockfd, SOL_SOCKET, SO_PRIORITY, 
                           &priority, sizeof(priority));
    if (result < 0) {
        std::cerr << "Failed to set SO_PRIORITY: " << strerror(errno) << std::endl;
    }
#endif

7. TCP_USER_TIMEOUT (Linux)

Purpose:

Specifies the maximum time that transmitted data may remain unacknowledged before the connection is forcibly closed.

Check current setting:

# There is no system-wide default for TCP_USER_TIMEOUT; it is per-socket only.
cat /proc/sys/net/ipv4/tcp_retries2
# tcp_retries2 governs the default retransmission give-up behavior instead

Set option in C++17:

#ifdef __linux__
    // Set timeout to 30 seconds (in milliseconds)
    int timeout = 30000;
    int result = setsockopt(sockfd, IPPROTO_TCP, TCP_USER_TIMEOUT, 
                           &timeout, sizeof(timeout));
    if (result < 0) {
        std::cerr << "Failed to set TCP_USER_TIMEOUT: " << strerror(errno) << std::endl;
    }
#endif

8. TCP_CONGESTION (Linux)

Purpose:

Selects the congestion control algorithm. For local networks, more aggressive algorithms can improve performance.

Check current setting:

# Check system default
cat /proc/sys/net/ipv4/tcp_congestion_control

# List available algorithms
cat /proc/sys/net/ipv4/tcp_available_congestion_control

Set option in C++17:

#ifdef __linux__
    // BBR is often a good fit for high-speed networks. Note: unprivileged
    // processes can only select algorithms listed in
    // /proc/sys/net/ipv4/tcp_allowed_congestion_control
    char algo[16] = "bbr";
    int result = setsockopt(sockfd, IPPROTO_TCP, TCP_CONGESTION, 
                           algo, strlen(algo));
    if (result < 0) {
        std::cerr << "Failed to set TCP_CONGESTION: " << strerror(errno) << std::endl;
    }

    // To verify
    char current_algo[16];
    socklen_t optlen = sizeof(current_algo);
    getsockopt(sockfd, IPPROTO_TCP, TCP_CONGESTION, current_algo, &optlen);
    std::cout << "Current congestion algorithm: " << current_algo << std::endl;
#endif

9. TCP_KEEPALIVE Options

Purpose:

Maintains connections during periods of inactivity and detects dead peers more quickly.

Check current settings:

# Linux
cat /proc/sys/net/ipv4/tcp_keepalive_time
cat /proc/sys/net/ipv4/tcp_keepalive_intvl
cat /proc/sys/net/ipv4/tcp_keepalive_probes

# Windows
# Registry: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

Set option in C++17:

// Enable keepalive (both platforms)
int keepalive = 1;
int result = setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, 
                       (char*)&keepalive, sizeof(keepalive));
if (result < 0) {
    std::cerr << "Failed to set SO_KEEPALIVE: " << strerror(errno) << std::endl;
}

#ifdef __linux__
    // Set time before sending keepalive probes (75 seconds)
    int keepidle = 75;
    result = setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, 
                       &keepidle, sizeof(keepidle));

    // Set interval between keepalive probes (15 seconds)
    int keepintvl = 15;
    setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, 
              &keepintvl, sizeof(keepintvl));

    // Set number of probes before connection is considered dead
    int keepcnt = 3;
    setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, 
              &keepcnt, sizeof(keepcnt));
#endif

#ifdef _WIN32
    // Windows uses a single structure (struct tcp_keepalive and
    // SIO_KEEPALIVE_VALS are declared in <mstcpip.h>)
    struct tcp_keepalive keepalive_vals;
    keepalive_vals.onoff = 1;
    keepalive_vals.keepalivetime = 75000; // 75 seconds in milliseconds
    keepalive_vals.keepaliveinterval = 15000; // 15 seconds in milliseconds

    DWORD bytes_returned = 0;
    WSAIoctl(sockfd, SIO_KEEPALIVE_VALS, &keepalive_vals, 
            sizeof(keepalive_vals), NULL, 0, &bytes_returned, NULL, NULL);
#endif

Network Interface Configuration

1. Jumbo Frames (MTU)

Purpose:

Increases Maximum Transmission Unit size to reduce overhead for large data transfers.

Check current settings:

Linux:

ip link show | grep mtu
# Or for a specific interface
ip link show eth0 | grep mtu

Windows:

netsh interface ipv4 show subinterfaces

Change MTU:

Linux:

# Temporarily
sudo ip link set dev eth0 mtu 9000

# Permanently (Ubuntu/Debian)
# Edit /etc/network/interfaces
# Add: mtu 9000 to the interface configuration

# Permanently (RHEL/CentOS)
# Edit /etc/sysconfig/network-scripts/ifcfg-eth0
# Add: MTU=9000

Windows:

# Check if NIC supports Jumbo Frames in device properties first
netsh interface ipv4 set subinterface "Ethernet" mtu=9000 store=persistent

Note:

  • All devices on the network segment must support the same MTU
  • Switches and routers must be configured for jumbo frames
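
You can also confirm from inside the application that jumbo frames survive the whole path: on Linux, a connected TCP socket reports its discovered path MTU via the IP_MTU socket option. A quick sketch:

#ifdef __linux__
    // On a connected socket, IP_MTU returns the current path MTU.
    int path_mtu = 0;
    socklen_t len = sizeof(path_mtu);
    if (getsockopt(sockfd, IPPROTO_IP, IP_MTU, &path_mtu, &len) == 0) {
        // Expect ~9000 if jumbo frames hold end-to-end, 1500 otherwise
        std::cout << "Path MTU: " << path_mtu << std::endl;
    }
#endif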

2. Interrupt Coalescing

Purpose:

Controls how frequently the NIC generates interrupts. Disabling or reducing coalescing improves latency at the cost of CPU usage.

Check current settings:

Linux:

ethtool -c eth0

Windows:

# Check in Device Manager > Network Adapters > [Your adapter] > Properties > Advanced
# Look for "Interrupt Moderation" or similar setting

Change settings:

Linux:

# Disable coalescing for lowest latency
sudo ethtool -C eth0 rx-usecs 0 tx-usecs 0

# Or set a low value
sudo ethtool -C eth0 rx-usecs 16 tx-usecs 16

# Make permanent by adding to /etc/network/interfaces or similar

Windows:

  • Through Device Manager UI
  • Or programmatically via WMI/PowerShell

3. Receive Side Scaling (RSS)

Purpose:

Distributes network processing across multiple CPU cores.

Check current settings:

Linux:

ethtool -l eth0  # Shows channels
ethtool -x eth0  # Shows current RSS configuration

Windows:

Get-NetAdapterRss

Change settings:

Linux:

# Enable maximum number of channels
sudo ethtool -L eth0 combined <num_channels>

# For permanent changes, add to network configuration

Windows:

Set-NetAdapterRss -Name "Ethernet" -BaseProcessorNumber 0 -MaxProcessorNumber <num_cores-1>

4. Interrupt Affinity

Purpose:

Binds network card interrupts to specific CPU cores for better cache usage.

Check current settings:

Linux:

cat /proc/interrupts | grep eth

Change settings:

Linux:

# Find your NIC's IRQ number from /proc/interrupts
IRQ=<irq_number>

# Bind to core 1 (second core)
echo 2 > /proc/irq/$IRQ/smp_affinity

# Or use the CPU-list interface (takes core numbers, not a bitmask)
echo "1" > /proc/irq/$IRQ/smp_affinity_list

Windows:

  • Use the Interrupt Affinity Policy Tool from Microsoft or the network adapter UI
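
On the application side, the complement to IRQ affinity is pinning your hot network thread to a core that shares a cache with the IRQ core. A minimal Linux sketch (requires _GNU_SOURCE):

#ifdef __linux__
#include <pthread.h>
#include <sched.h>

// Pin the calling thread to `core`, chosen to share cache with the NIC IRQ.
void pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
#endif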

Operating System TCP Tuning

1. TCP Window Scaling and Buffer Sizes

Purpose:

Increase maximum buffer sizes to improve throughput, especially on high bandwidth-delay product networks.

Check current settings:

Linux:

sysctl net.core.rmem_max
sysctl net.core.wmem_max
sysctl net.ipv4.tcp_rmem
sysctl net.ipv4.tcp_wmem
sysctl net.ipv4.tcp_window_scaling

Windows:

netsh int tcp show global

Change settings:

Linux:

# Temporarily
sudo sysctl -w net.core.rmem_max=16777216
sudo sysctl -w net.core.wmem_max=16777216
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
sudo sysctl -w net.ipv4.tcp_window_scaling=1

# Permanently in /etc/sysctl.conf or /etc/sysctl.d/99-network-tuning.conf
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_window_scaling = 1

Windows:

# Enable auto-tuning (Windows does a good job automatically)
netsh int tcp set global autotuninglevel=normal

# For specific settings
netsh int tcp set global rss=enabled
netsh int tcp set global chimney=disabled
netsh int tcp set global dca=enabled

2. TCP Connection Setup Options

Purpose:

Tune parameters that affect connection establishment and teardown.

Check current settings:

Linux:

sysctl net.ipv4.tcp_fin_timeout
sysctl net.ipv4.tcp_tw_reuse
sysctl net.ipv4.tcp_max_syn_backlog

Change settings:

Linux:

# Faster connection reuse
sudo sysctl -w net.ipv4.tcp_fin_timeout=15
sudo sysctl -w net.ipv4.tcp_tw_reuse=1

# Larger backlog for busy servers
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=16384

3. IRQ Balance and NUMA Settings

Purpose:

Optimize interrupt handling and memory access patterns on multi-CPU systems.

Check current settings:

Linux:

# Check if irqbalance is running
ps aux | grep irqbalance

# Check NUMA configuration
numactl --hardware

Change settings:

Linux:

# For low-latency applications, you might want to disable irqbalance
sudo systemctl stop irqbalance
sudo systemctl disable irqbalance

# For NUMA-aware applications
# Use numactl to pin your application to the same NUMA node as your NIC
numactl --cpunodebind=0 --membind=0 ./your_application

Application Design Considerations

1. Persistent Connections

Rather than establishing new connections for each data exchange, maintain persistent connections:

// Example connection pool pattern
class ConnectionPool {
private:
    std::vector<int> sockets;
    std::mutex pool_mutex;

public:
    ConnectionPool(const std::string& host, int port, int pool_size) {
        for (int i = 0; i < pool_size; i++) {
            int sockfd = create_and_connect(host, port);
            apply_socket_options(sockfd);  // Apply all optimizations
            sockets.push_back(sockfd);
        }
    }

    int acquire_connection() {
        std::lock_guard<std::mutex> lock(pool_mutex);
        if (sockets.empty()) {
            return -1;  // No connection available
        }
        int sockfd = sockets.back();
        sockets.pop_back();
        return sockfd;
    }

    void release_connection(int sockfd) {
        std::lock_guard<std::mutex> lock(pool_mutex);
        sockets.push_back(sockfd);
    }

    // Other methods for connection management
};
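
Usage is a simple acquire/use/release cycle; in production I wrap the pair in an RAII guard so an exception can't leak a socket. A sketch with placeholder host and port:

ConnectionPool pool("10.0.0.5", 9000, 8);

int fd = pool.acquire_connection();
if (fd >= 0) {
    send(fd, "ORDER", 5, 0);
    pool.release_connection(fd);  // Always return the socket to the pool
}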

2. Memory Alignment

Align data structures to cache lines to reduce memory access overhead:

struct alignas(64) AlignedPacket {
    // 64 bytes is a common cache line size
    uint32_t header;
    char data[60];  // Total size: 64 bytes
};
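
A pair of compile-time checks keeps the layout honest if someone later adds a field:

static_assert(sizeof(AlignedPacket) == 64,
              "AlignedPacket must occupy exactly one cache line");
static_assert(alignof(AlignedPacket) == 64,
              "AlignedPacket must be cache-line aligned");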

3. Zero-Copy Techniques

Reduce memory copies to improve performance:

#ifdef __linux__
    // Using sendfile() for zero-copy transfer (Linux; declared in <sys/sendfile.h>)
    off_t offset = 0;
    sendfile(sockfd, filefd, &offset, file_size);

    // Or using splice() for pipe-to-socket transfers
    splice(pipefd[0], NULL, sockfd, NULL, length, SPLICE_F_MOVE);
#endif

#ifdef _WIN32
    // Windows has TransmitFile for zero-copy (<mswsock.h>, link Mswsock.lib);
    // note the file argument must be a Windows HANDLE, not a POSIX descriptor
    TransmitFile(sockfd, filefd, 0, 0, NULL, NULL, TF_USE_KERNEL_APC);
#endif

4. Optimizing Message Batching

Balance between latency and throughput:

class MessageBatcher {
private:
    std::vector<char> buffer;
    size_t threshold;
    std::chrono::steady_clock::time_point last_flush;
    std::chrono::milliseconds max_delay;
    int sockfd;

public:
    MessageBatcher(int sockfd, size_t threshold, std::chrono::milliseconds max_delay)
        : sockfd(sockfd), threshold(threshold), max_delay(max_delay) {
        buffer.reserve(threshold);
        last_flush = std::chrono::steady_clock::now();
    }

    void add_message(const char* data, size_t len) {
        // Check if we need to flush based on time
        auto now = std::chrono::steady_clock::now();
        if (now - last_flush > max_delay) {
            flush();
        }

        // Add to buffer
        buffer.insert(buffer.end(), data, data + len);

        // Check if we need to flush based on size
        if (buffer.size() >= threshold) {
            flush();
        }
    }

    void flush() {
        if (buffer.empty()) return;

        // Send the batch
        send(sockfd, buffer.data(), buffer.size(), 0);
        buffer.clear();
        last_flush = std::chrono::steady_clock::now();
    }
};
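
A usage sketch (the `update` buffer is a placeholder). Note that add_message only checks the clock when new data arrives, so a real feed handler should also call flush() from a timer to bound how long the last message can sit in the buffer:

// Batch up to 1400 bytes or 1 ms, whichever comes first
MessageBatcher batcher(sockfd, 1400, std::chrono::milliseconds(1));

batcher.add_message(update.data(), update.size());
// ... on shutdown, or before a latency-critical send:
batcher.flush();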

Performance Testing

1. Basic Throughput Testing

Network throughput:

Linux:

# Install iperf3
sudo apt install iperf3  # Debian/Ubuntu
sudo yum install iperf3  # RHEL/CentOS

# Server side
iperf3 -s

# Client side
iperf3 -c server_ip -P 4  # 4 parallel connections

Windows:

# Download and install iperf3 for Windows
# Run as server
iperf3.exe -s

# Run as client
iperf3.exe -c server_ip -P 4

2. Latency Testing

Linux:

# Basic ping test
ping -c 100 server_ip

# More detailed with qperf
qperf server_ip tcp_lat

Windows:

ping -n 100 server_ip

3. Full Application Testing

Build a simple test tool to measure round-trip times:

#include <iostream>
#include <chrono>
#include <cstring>
#include <sys/socket.h>   // Linux
#include <unistd.h>       // Linux
// #include <winsock2.h>  // Windows

void test_rtt(int sockfd, int iterations) {
    const char* test_data = "PING";
    char buffer[64];
    double total_ms = 0.0;

    for (int i = 0; i < iterations; i++) {
        auto start = std::chrono::high_resolution_clock::now();

        // Send data
        send(sockfd, test_data, 4, 0);

        // Receive response
        recv(sockfd, buffer, sizeof(buffer), 0);

        auto end = std::chrono::high_resolution_clock::now();
        auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
        double ms = duration.count() / 1000.0;

        total_ms += ms;
        std::cout << "RTT " << i << ": " << ms << " ms" << std::endl;
    }

    std::cout << "Average RTT: " << (total_ms / iterations) << " ms" << std::endl;
}
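
The test assumes the peer echoes each message back. Here's a minimal echo loop (Linux sketch, error handling elided) to run on the server side:

// Minimal echo loop for the RTT test's peer
void echo_loop(int client_fd) {
    char buf[64];
    for (;;) {
        ssize_t n = recv(client_fd, buf, sizeof(buf), 0);
        if (n <= 0) break;          // Peer closed or error
        send(client_fd, buf, n, 0); // Echo it straight back
    }
}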

Complete Code Example

Here's a complete C++17 example that incorporates many of the optimizations discussed:


#include <iostream>
#include <string>
#include <cstring>
#include <chrono>
#include <thread>
#include <vector>

#ifdef _WIN32
    #include <winsock2.h>
    #include <mstcpip.h>
    #pragma comment(lib, "ws2_32.lib")
    typedef int socklen_t;
    #define close closesocket
#else
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <netdb.h>
#endif

class OptimizedSocket {
private:
    int sockfd = -1;
    bool is_server = false;

    void apply_common_options() {
        // TCP_NODELAY (Disable Nagle's algorithm)
        int flag = 1;
        setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, (char*)&flag, sizeof(flag));

        // Set buffer sizes
        int buffer_size = 262144; // 256 KB
        setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, (char*)&buffer_size, sizeof(buffer_size));
        setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, (char*)&buffer_size, sizeof(buffer_size));

    #ifdef __linux__
        // Linux-specific options

        // TCP_QUICKACK
        setsockopt(sockfd, IPPROTO_TCP, TCP_QUICKACK, &flag, sizeof(flag));

        // Set high priority
        int priority = 6;
        setsockopt(sockfd, SOL_SOCKET, SO_PRIORITY, &priority, sizeof(priority));

        // Set timeout for unacknowledged data
        int timeout = 30000; // 30 seconds
        setsockopt(sockfd, IPPROTO_TCP, TCP_USER_TIMEOUT, &timeout, sizeof(timeout));

        // Set congestion algorithm if BBR is available
        char algo[16] = "bbr";
        if (setsockopt(sockfd, IPPROTO_TCP, TCP_CONGESTION, algo, strlen(algo)) < 0) {
            // Fall back to cubic if bbr not available
            strcpy(algo, "cubic");
            setsockopt(sockfd, IPPROTO_TCP, TCP_CONGESTION, algo, strlen(algo));
        }
    #endif

        // Enable keepalive
        int keepalive = 1;
        setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, (char*)&keepalive, sizeof(keepalive));

    #ifdef __linux__
        // Configure keepalive parameters on Linux
        int keepidle = 60; // Start probing after 60 seconds of inactivity
        int keepintvl = 10; // Probe interval of 10 seconds
        int keepcnt = 3; // 3 failed probes before declaring connection dead

        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &keepidle, sizeof(keepidle));
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &keepintvl, sizeof(keepintvl));
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &keepcnt, sizeof(keepcnt));
    #endif

    #ifdef _WIN32
        // Windows-specific keepalive settings
        struct tcp_keepalive keepalive_vals;
        keepalive_vals.onoff = 1;
        keepalive_vals.keepalivetime = 60000; // 60 seconds in milliseconds
        keepalive_vals.keepaliveinterval = 10000; // 10 seconds in milliseconds

        DWORD bytes_returned = 0;
        WSAIoctl(sockfd, SIO_KEEPALIVE_VALS, &keepalive_vals, 
                sizeof(keepalive_vals), NULL, 0, &bytes_returned, NULL, NULL);
    #endif
    }

public:
    OptimizedSocket() {
    #ifdef _WIN32
        // Initialize Winsock for Windows
        WSADATA wsaData;
        WSAStartup(MAKEWORD(2, 2), &wsaData);
    #endif
    }

    ~OptimizedSocket() {
        if (sockfd >= 0) {
            close(sockfd);
        }

    #ifdef _WIN32
        WSACleanup();
    #endif
    }

    bool create_server(int port, int backlog = 10) {
        sockfd = socket(AF_INET, SOCK_STREAM, 0);