Ajitabh Pandey's Soul & Syntax

Exploring systems, souls, and stories – one post at a time

Tag: Networking

  • From Cloud Abstraction to Bare Metal Reality: Understanding the Foundation of Hyperscale Infrastructure

    In today’s cloud-centric world, where virtual machines and containers seem to materialize on demand, it’s easy to overlook the physical infrastructure that makes it all possible. For the new generation of engineers, a deeper understanding of what it takes to build and manage the massive fleets of physical machines that host our virtualized environments is becoming increasingly critical. While the cloud offers abstraction and on-demand scaling, the reality is that millions of physical servers, networked and orchestrated with precision, form the bedrock of these seemingly limitless resources. One of the key technologies that enables the rapid provisioning of these servers is the Preboot Execution Environment (PXE).

    Unattended Setups and Network Booting: An Introduction to PXE

    PXE provides a standardized environment for computers to boot directly from a network interface, independent of any local storage devices or operating systems. This capability is fundamental for achieving unattended installations on a massive scale. The PXE boot process is a series of network interactions that allow a bare-metal machine to discover boot servers, download an initial program into its memory, and begin the installation or recovery process.

    The Technical Details of How PXE Works

    The PXE boot process is a series of choreographed steps involving several key components and network protocols:

    Discovery

    When a PXE-enabled computer is powered on, its firmware broadcasts a special DHCPDISCOVER packet that is extended with PXE-specific options. This packet is sent to port 67/UDP, the standard DHCP server port.
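
    On a Linux host attached to the same network you can watch this exchange happen; a quick sketch, assuming the capture interface is eth0, is:

    # Capture DHCP/PXE traffic (client port 68, server port 67) without name resolution
    $ sudo tcpdump -ni eth0 -vv 'udp port 67 or udp port 68'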

    Proxy DHCP

    A PXE redirection service (or Proxy DHCP) is a key component. If a Proxy DHCP receives an extended DHCPDISCOVER, it responds with an extended DHCPOFFER packet, which is broadcast to port 68/UDP. This offer contains critical information, including:

    • A PXE Discovery Control field to determine if the client should use Multicasting, Broadcasting, or Unicasting to contact boot servers.
    • A list of IP addresses for available PXE Boot Servers.
    • A PXE Boot Menu with options for different boot server types.
    • A PXE Boot Prompt (e.g., “Press F8 for boot menu”) and a timeout.
    Note that the Proxy DHCP service can run on the same host as a standard DHCP service, but on a different port (4011/UDP) to avoid conflicts.
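
    To make these fields concrete, here is a minimal sketch of a proxy DHCP plus TFTP configuration using dnsmasq; the subnet, menu text and file names are placeholders assumed for illustration, not values from any particular deployment:

    # dnsmasq.conf – proxy DHCP mode: another server hands out addresses, dnsmasq only adds PXE options
    dhcp-range=192.168.1.0,proxy
    # PXE boot prompt shown to the client, with a 10 second timeout
    pxe-prompt="Press F8 for boot menu",10
    # One boot menu entry for x86 BIOS clients; the NBP basename is pxelinux (served as pxelinux.0)
    pxe-service=x86PC,"Install Linux",pxelinux
    # Serve the NBP over TFTP from this directory
    enable-tftp
    tftp-root=/srv/tftp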

    Boot Server Interaction

    The PXE client, now aware of its boot server options, chooses a boot server and sends an extended DHCPREQUEST packet, either unicast to port 4011/UDP or broadcast to port 67/UDP. This request specifies the desired PXE Boot Server Type.

    Acknowledgement

    The PXE Boot Server, if configured for the client’s requested boot type, responds with an extended DHCPACK. This packet is crucial as it contains the complete file path for the Network Bootstrap Program (NBP) to be downloaded via TFTP (Trivial File Transfer Protocol).

    Execution

    The client downloads the NBP into its RAM using TFTP. Once downloaded and verified, the PXE firmware executes the NBP. The functions of the NBP are not defined by the PXE specification, allowing it to perform various tasks, from presenting a boot menu to initiating a fully automated operating system installation.
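
    To sanity-check the TFTP side of this step from another machine, a tool such as curl (when built with TFTP support) can fetch the NBP manually; the server address and file name below are assumed placeholders:

    # Download a hypothetical NBP over TFTP to confirm the boot server is serving it
    $ curl -o pxelinux.0 tftp://192.168.1.10/pxelinux.0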

      The Role of PXE in Modern Hyperscale Infrastructure

      While PXE has existed for years, its importance in the era of hyperscale cloud computing is greater than ever. In environments where millions of physical machines need to be deployed and managed, PXE is the first and most critical step in an automated provisioning pipeline. It enables:

      • Rapid Provisioning: Automating the initial boot process allows cloud providers to provision thousands of new servers simultaneously, dramatically reducing deployment time.
      • Standardized Deployment: PXE ensures a consistent starting point for every machine, allowing for standardized operating system images and configurations to be applied fleet-wide.
      • Remote Management and Recovery: PXE provides a reliable way to boot machines into diagnostic or recovery environments without requiring physical access, which is essential for managing geographically distributed data centers.

      Connecting the Virtual to the Physical

      For new engineers, understanding the role of technologies like PXE bridges the gap between the virtual world of cloud computing and the bare-metal reality of the hardware that supports it. This knowledge is not just historical; it is a foundation for:

      • Designing Resilient Systems: Understanding the underlying infrastructure informs the design of more scalable and fault-tolerant cloud-native applications.
      • Effective Troubleshooting: When issues arise in a virtualized environment, knowing the physical layer can be crucial for diagnosing and resolving problems.
      • Building Infrastructure as Code: The principles of automating physical infrastructure deployment are directly applicable to the modern practice of Infrastructure as Code (IaC).

      By appreciating the intricacies of building and managing the physical infrastructure, engineers can build more robust, efficient, and truly cloud-native solutions, ensuring they have a complete picture of the technology stack from the bare metal to the application layer.

    1. RPi – Static IP Address on Wifi

      There is a GUI tool on the desktop called “wpa_gui” which can be used to connect to the wireless network, provided you have a supported wireless card attached to the RPi. However, if I want to run a headless RPi I need a static IP address. Unfortunately “wpa_gui” does not provide a means of configuring a static IP address on the “wlan0” interface, and my ADSL router does not support associating a static IP address with a MAC address. This means I have to configure a static IP address on the RPi’s wireless interface myself and do it the old-fashioned way (read: I am loving it).

      Open up the “/etc/network/interfaces” file and make the following entries. The commented lines are the ones which were added by “wpa_gui” and we don’t need them. If you have even a little Debian experience, you will find these lines self-explanatory:

      auto lo
      iface lo inet loopback
      
      #auto eth0
      iface eth0 inet static
      address 192.168.1.52
      netmask 255.255.255.0
      gateway 192.168.1.1
      
      auto wlan0
      allow-hotplug wlan0
      iface wlan0 inet static
      address 192.168.1.51
      netmask 255.255.255.0
      gateway 192.168.1.1
      wpa-ssid "My SSID"
      wpa-passphrase "My Passphrase"

      I have also added a static IP address configuration for the eth0 interface, so that if someday I connect my RPi to a wired connection I can just bring the eth0 interface up and have an IP address. The reason I have commented out the “auto eth0” line is that when a physical interface is up, the system’s default route always goes through that physical interface, i.e. eth0 in this case. So, if my WiFi is up, I want my packets to go in and out through WiFi by default and not through “eth0” (by the way, it does not matter whether a cable is actually plugged in; if the physical interface is up, it is the default route out). Of course we could prevent that, but it would be a little complicated, so we simply comment out the auto line to make sure that “eth0” does not come up. It is also possible to run both “eth0” and “wlan0” simultaneously, but again that is a bit much for this post and I do not need it anyway.
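
      If you want to confirm which interface currently holds the default route, a quick check on Raspbian/Debian is:

      # Show the kernel routing table; the default entry should point at wlan0
      $ ip route show
      # or, using the older net-tools style
      $ route -n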

      Now you can restart the networking or reboot the RPi and your WiFi should come up with the static IP address.
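
      On a Debian-based RPi image either of the following should do it (a full reboot works too); this is a sketch, and exact service handling varies between releases:

      # Bounce just the wireless interface so it picks up the new static settings
      $ sudo ifdown wlan0 && sudo ifup wlan0
      # or restart networking as a whole
      $ sudo /etc/init.d/networking restart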

    2. Creating VPNs with OpenVPN

      Introduction

      A VPN is a set of tools which allow networks at different locations to be securely connected, using a public network as the transport. A VPN produces a virtual “dedicated circuit” over the Internet and uses cryptography to secure it.
      (more…)
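
      As a taste of how small the simplest case can be, here is a minimal point-to-point sketch using OpenVPN’s static-key mode; the host names, tunnel device and addresses are placeholders, not part of any real setup:

      # Generate a shared static key and copy it securely to the peer
      $ openvpn --genkey --secret static.key
      # On host A (peer is bob.example.com)
      $ openvpn --remote bob.example.com --dev tun1 --ifconfig 10.4.0.1 10.4.0.2 --secret static.key
      # On host B (peer is alice.example.com)
      $ openvpn --remote alice.example.com --dev tun1 --ifconfig 10.4.0.2 10.4.0.1 --secret static.key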

    3. Changing the IP Address of a Solaris System

      Use ifconfig to change the IP address immediately

      ifconfig <interface> <new-ip-address> netmask <netmask>

      If the new IP address calls for a different gateway then change it using the route command:

      route add default <new-gateway>
      route delete default <old-gateway>
       

      Change the host’s IP address in

      • /etc/hosts file to take effect after each reboot
      • /etc/inet/ipnodes (for Solaris 10)

      Change the host’s subnet mask in /etc/netmasks

      Change the host’s default gateway in /etc/defaultrouter
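
      For example, persisting a hypothetical address 192.168.10.25/24 with gateway 192.168.10.1 would mean entries roughly like these (all values are placeholders):

      # /etc/hosts (and /etc/inet/ipnodes on Solaris 10)
      192.168.10.25   myhost
      # /etc/netmasks
      192.168.10.0    255.255.255.0
      # /etc/defaultrouter
      192.168.10.1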

    4. Setting up SNMP

      SNMP is the Simple Network Management Protocol. It allows the operational statistics of a computer to be exposed as object identifiers (OIDs), which can then be remotely queried and changed. For any serious remote monitoring, SNMP is required; I generally prefer to monitor server performance remotely using Nagios and SNMP. This document describes the SNMP setup, which can then be used by any SNMP remote management software.

      As a security measure, one needs to know the passwords, or community strings, in order to query the OIDs. The read-only community strings allow the data to be queried only, while the read-write community strings allow the data to be changed.

      I will be describing the setup on an Ubuntu server, though the steps should apply to any Linux distribution.

      Install the SNMP daemon:

      $ sudo apt-get install snmpd

      Then add the following lines at the top of the configuration file /etc/snmp/snmpd.conf:

      $ sudo vi /etc/snmp/snmpd.conf
      # type of string   private/public  host-from-which-access-is-restricted
      rwcommunity        private         127.0.0.1
      rocommunity        public          127.0.0.1
      
      rwcommunity        ultraprivate    cms.unixclinic.net
      rocommunity        itsallyours     cms.unixclinic.net

      The first column is the type of community string, the second column is the community string itself, and the third column (optional) is the host that is allowed to use that community string.
      The first two lines specify that only localhost (127.0.0.1) is allowed to query the SNMP daemon using the given read-write and read-only community strings. The next two lines specify that only the host cms.unixclinic.net is allowed to query the SNMP daemon using its read-write and read-only strings.

      If I remove the hostname (cms.unixclinic.net) then basically any host can query the snmp daemon if it knows the right community strings.
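
      For example, a read-only line with no host given (the community string here is just a placeholder) would accept queries from anywhere:

      # No host/network restriction: any host that knows the string can read
      rocommunity   itsallyours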

      After making these changes, give the snmp daemon a restart and then test it using snmpwalk program:

      $ sudo invoke-rc.d snmpd restart
      Restarting network management services: snmpd.
      $ snmpwalk -v1 -c public localhost system
      SNMPv2-MIB::sysDescr.0 = STRING: Linux cms.unixclinic.net 2.6.17-10-generic #2 SMP Tue Dec 5 21:16:35 UTC 2006 x86_64
      SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
      DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1314) 0:00:13.14
      SNMPv2-MIB::sysContact.0 = STRING: Ajitabh Pandey <hostmaster (at) unixclinic (dot) net>
      SNMPv2-MIB::sysName.0 = STRING: cms.unixclinic.net
      .......
      .......

      As a result of snmpwalk, you should see the system details as reported by SNMP. The command executed above queries “localhost” for the “system” MIB, using SNMP version 1 and the community string “public”. Since this community string grants read-only access and is restricted to queries from the 127.0.0.1 IP address only, the query works fine.

      Further, if you now try to execute the following command over the network from host “cms.unixclinic.net” using the community string “itsallyours”, it should also work. In my case, however, a timeout is received instead:

      $ snmpwalk -v1 -c itsallyours cms.unixclinic.net system
      Timeout: No Response from cms.unixclinic.net

      Just for clarification, the current host from which snmpwalk is being run is also cms.unixclinic.net.

      This works on most distributions (RHEL 3, RHEL 4 and Debian Sarge all behave this way), but on Ubuntu “Edgy Eft” 6.10 it is not the case: the query fails because of the default settings of snmpd. Following is the output of the ps command from both an Edgy Eft machine and a Sarge machine:

      Ubuntu $  ps -ef|grep snmp|grep -v "grep"
      snmp      5620     1  0 11:39 ?        00:00:00 /usr/sbin/snmpd -Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1
      
      Debian $ ps -ef|grep snmp|grep -v "grep"
      root      2777     1  0  2006 ?        00:46:35 /usr/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd.pid

      If you look carefully, you will see that the Ubuntu 6.10 snmp daemon is by default restricted to 127.0.0.1, which means it is listening only on localhost. To change that and make it listen on all interfaces, we need to edit the /etc/default/snmpd file:

      Change the following line

      $ sudo vi /etc/default/snmpd
      .....
      SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid 127.0.0.1'
      .....

      to

      SNMPDOPTS='-Lsd -Lf /dev/null -u snmp -I -smux -p /var/run/snmpd.pid'

      and then restart SNMPD

      $ sudo invoke-rc.d snmpd restart
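
      After the restart, the earlier remote query from cms.unixclinic.net should succeed rather than time out; re-running it is a quick way to confirm:

      $ snmpwalk -v1 -c itsallyours cms.unixclinic.net system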