In previous articles we reviewed the overall topic of management interfaces to Taps and NPBs and in subsequent chapters took a deep dive into the topics of Fault Management and Configuration Management. In this chapter we will focus on Software Management and its related topics. Subsequent chapters will cover other management topics including accounting, performance monitoring, security and remote access.
Software Management
For the purposes of this document we have decided to include several topics together under the heading of software management:
Booting – How the device boots up from power on or power cycle
Factory Reset – How to reset the software to a factory configuration
Software Updates – the software components and how each is updated
Backup and restore – how to back up the software and its related data so it can be reloaded to the box in the event of a corruption or rollback
General Software Guidelines
- Use commercial, supported software components wherever possible. Open source is fine and may be preferred as long as there is forecast support in place for the lifetime of the product.
- Use standards-based behaviors (many of these are discussed below) to ensure compatibility and supportability. Standards-based interfaces and behaviors are also more secure since they have been tested and scrutinized by the larger community.
Security Considerations
- Proprietary software and proprietary interfaces alone are not a sustainable form of security or protection for a product – while an attacker may not have an off-the-shelf cracking tool for a proprietary interface, it usually does not take much effort to scan and reverse engineer a proprietary system in order to attack it.
- Software images and backups and configuration data backups should be version tagged, encrypted and include integrity verification data. All files imported to a box including software updates and backups need to be integrity checked before being processed. This is needed to prevent corruption or hacking of the software or data when stored outside of the box.
- These topics will be discussed in more detail in a later chapter dedicated to security considerations.
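The integrity check described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `verify_image` and the idea that each image ships with a SHA-256 digest and an HMAC tag are hypothetical, and a shared-key HMAC is used here as a stand-in for whatever authenticity mechanism (commonly an asymmetric signature) a real product would use.

```python
import hashlib
import hmac


def verify_image(image: bytes, expected_sha256: str, tag: bytes, key: bytes) -> bool:
    """Return True only if the image matches both its digest and its HMAC tag.

    Hypothetical format: the supplier publishes a hex SHA-256 digest and an
    HMAC-SHA256 tag alongside each image; the device holds the shared key.
    """
    # Integrity: detect corruption in transit or in external storage.
    digest = hashlib.sha256(image).hexdigest()
    if not hmac.compare_digest(digest, expected_sha256.lower()):
        return False
    # Authenticity: only someone holding the key could have produced the tag.
    expected_tag = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(expected_tag, tag)
```

A device would call a check like this on every imported file (software update, software backup or configuration backup) and refuse to process the file when it returns False.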
Booting
There are several methods for bootstrapping and configuring a device on the initial power cycle. The most common ones are:
- BOOTP – Bootstrap Protocol – assigns a device an initial IP address on bootup but also shares the address of the BOOTP server and a software image server that can be used to download the initial device software and/or configuration
- DHCP – Dynamic Host Configuration Protocol – is a newer and more sophisticated way to assign IP addresses to a box and can also communicate information about software image locations and configuration data to the device. DHCP can provide many network configuration parameters to a device, including the location of its configuration and software images as well as network service information such as time servers, name servers, routers, etc.
- TFTP – Trivial File Transfer Protocol – used alone or with BootP/DHCP is typically used for loading of boot images since it is supported at the firmware level of many devices (it is simple to implement and easy to configure in low level systems). Software images and boot configuration data downloaded over TFTP should still be checked for integrity and compatibility with the device.
- DHCP Option 66 – While there are many different ways to use DHCP, TFTP and other protocols to configure a device, a specific method known as Option 66 was defined to allow a device to discover a provisioning server and to download its configuration at boot time from the network. While this is rarely used in low level network devices, it can be used for deployment of large numbers of centrally configured edge devices such as cable/DSL modems, IP Phones, access gateways, etc.
- Thumb drive or SD card booting – This mechanism is common for development machines and PCs/laptops/tablets, but is not recommended for network devices. If the thumb drive or SD card is replaced with a modified image, the device can be used for hacking and spying on the network. As with all forms of software and configuration data loading, any image loaded from a thumb drive or SD card should be verified for security and integrity. It is not recommended that “boot from SD” or “boot from USB” be enabled on any network device.
- Internal flash/ROM storage with redundant images – Another mechanism is to store an initial factory software/configuration image in ROM or other non-volatile memory on the device. This allows for initial booting and for a failsafe recovery back to this image in the event of corruption. The device would have an ‘active’ image that is updated and patched over time and a ‘factory’ image that is protected from updates and can be used in an emergency. Some devices will have a 3rd, ‘inactive’ image storage area that is used during upgrades/patching to store the ‘next’ image. This 3rd software image also gives the user the ability to abort/roll back to the previous image easily by switching the ‘active’ and ‘inactive’ designation.
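To make the DHCP-based discovery above more concrete, the sketch below parses the options field of a DHCP reply and extracts the boot source. The code/length/value layout, the pad (0) and end (255) codes, and the use of option 66 for the TFTP server name and option 67 for the boot file name are standard DHCP; the function names are hypothetical and the sketch assumes the options bytes have already been separated from the rest of the packet.

```python
from typing import Optional

PAD, END = 0, 255
TFTP_SERVER_NAME, BOOTFILE_NAME = 66, 67


def parse_dhcp_options(data: bytes) -> dict[int, bytes]:
    """Split a DHCP options field into {option code: raw value}."""
    options: dict[int, bytes] = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == END:
            break
        if code == PAD:  # single-byte padding, carries no length field
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("option is missing its length byte")
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("option value is truncated")
        options[code] = value
        i += 2 + length
    return options


def boot_source(options: dict[int, bytes]) -> Optional[tuple[str, str]]:
    """Return (server, boot file) if both options are present, else None."""
    server = options.get(TFTP_SERVER_NAME)
    bootfile = options.get(BOOTFILE_NAME)
    if server is None or bootfile is None:
        return None
    return server.decode("ascii"), bootfile.decode("ascii")
```

The image fetched from the returned server would still need to pass the integrity and compatibility checks discussed earlier before the device boots it.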
Factory Reset
All products need to support a reset capability to return the unit to a baseline software image and configuration. This allows the user to clear a box of network/proprietary data before decommissioning it, repurposing it or returning it for repair. It also provides a fallback option if the configuration or the software becomes corrupted. This is different from a restore from a backup, which will be discussed later in this document.
Two levels of factory reset should be supported:
- Reset software to a reliable base version – this is usually the version of software that was shipped with the product initially. This will almost always necessitate a configuration data reset as well.
- Reset configuration to a new, out of the box configuration state – this ‘clean slate’ configuration may be different for different versions of software. The configuration reset option generally does not revert the software to an earlier version but instead only resets the configuration to the default state for the current software image.
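The two reset levels can be sketched as below. Everything here is hypothetical and for illustration only: the `Device` fields, the `ResetLevel` names and the `DEFAULTS` table of per-version default configurations are assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResetLevel(Enum):
    CONFIG_ONLY = "config"              # keep current software, wipe config
    SOFTWARE_AND_CONFIG = "software"    # revert to factory image, wipe config


# Hypothetical 'clean slate' configuration for each software version.
DEFAULTS = {
    "1.0.0": {"hostname": "tap", "snmp": False},
    "2.1.0": {"hostname": "tap", "snmp": False, "tls": True},
}


@dataclass
class Device:
    factory_version: str
    active_version: str
    config: dict = field(default_factory=dict)


def factory_reset(device: Device, level: ResetLevel) -> None:
    """Apply one of the two reset levels to the device in place."""
    if level is ResetLevel.SOFTWARE_AND_CONFIG:
        device.active_version = device.factory_version
    # In both cases the configuration returns to the defaults of whichever
    # software version is active after the reset.
    device.config = dict(DEFAULTS[device.active_version])
```

Note that the configuration defaults are looked up after the software version is settled, which reflects the point above that the clean-slate configuration can differ between software versions.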
Software Updates
When considering how software is updated we first need to understand the various types of software on a typical tap or packet broker box:
- BootLoader – when a device is powered on, this is the software that is initially executed by the processor to begin the startup and software loading process. The bootloader is usually stored in ROM or some form of programmable persistent storage (EPROM, EEPROM, etc.) or flash since it has to be accessible before anything is loaded into the volatile memory (RAM). Its first job is to initialize the memory (RAM) and other devices needed to load the operating system and then to initiate loading of the operating system.
- Firmware – While the bootloader is a type of firmware (the firmware for the computing infrastructure of the device), there may be other firmware components for specific sub-processors and devices attached to the computing system. In taps and packet brokers this includes the packet processors and network interface controllers (NICs). Firmware is also stored in ROM, EPROM, EEPROM or flash on the device but may be updated by the operating system. Firmware for these attached devices is initialized at power on or upon a signal from the bootloader.
- Operating System – The operating system is the main software system that will operate the system – in this case the Tap or Packet Broker. It manages all of the hardware and software resources on the unit and provides the environment for the application(s). The operating system handles storage, memory usage, processor allocation to tasks, access to devices and other common functions that are managed across applications.
- Application – The application software is responsible for performing the end user function(s). In the case of Taps and Packet Brokers this includes the management interface for the user to configure and monitor ports, packet filters, packet flows, statistics, state, etc. The application software also includes the actual packet manipulation (duplication, filtering, deduplication, routing, etc.) performed by a Tap or Packet Broker, but in most cases it depends heavily on lower level packet processors to do the actual packet manipulation (the exceptions are software-only Taps and Packet Brokers, which perform all of the packet handling within the application itself).
Loading/Updating these individual software components occurs differently in the initial configuration of the box vs. subsequent field updates:
- Pre-Deployment – In manufacturing the ROM/EPROM/EEPROM/flash images are loaded into the chips using customized hardware. This is usually performed after the unit is assembled but is driven by external hardware and/or special connectors. This hardware is not included in the product shipped to the customer. In some cases the chips are pre-loaded before being placed on the board. This establishes the baseline factory initialized capability. The non-volatile storage is also preloaded with the factory versions of firmware for other devices, operating systems and any default applications. In some environments the units are upgraded again before shipment using normal field updating methods.
- Post Deployment – In normal field operation there are several methods that can be used to update the software:
- Management system application – The typical method is to have the management system (either built into the operating system or a separate device specific management interface such as CLI or GUI) retrieve a new software image (bootloader, firmware, OS and/or application) from an outside source. The source is usually a downloaded image from the manufacturer but can also come from a network file or a file on removable storage such as a USB drive or SD card. The management system will then write it to the appropriate storage location and then reboot or reinitialize the box to utilize the new software. Best practice is to have at least 2 storage locations for each software component that is selectable at boot time – in this way the current operating version is not overwritten when the new version is being installed. If there is an issue with an update (incompatibility, corruption, defect, etc.) then the previous version can be re-selected. Additionally the user may manually select the previous version if they want to roll-back due to some functionality issue they discover.
- Load directly from external media – Some devices can boot an image or update a software component during boot directly from an SD card, USB drive or other removable storage. This method is similar to using a management system application but skips the user interaction portion.
In all cases, as stated earlier, it is best practice to have the incoming image checked for integrity and authenticity and to reject loading it if any issue is detected. Additionally, error handling code should be included so that a failure of the new software to properly initialize will cause the system to abort and revert to the previous version or to a factory default version. Absence of sufficient safeguards can enable a hacker to take control of the box or insert spyware or ransomware, or can leave the box in a corrupted configuration that cannot be recovered by the end user.
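The combination of two storage locations, an integrity check before installation and an automatic fallback on a failed start can be sketched as follows. This is a simplified model under stated assumptions: the `DualBank` and `Slot` names are hypothetical, a bare SHA-256 comparison stands in for a full integrity and authenticity check, and the `boots_ok` callback stands in for whatever health check a real device runs after rebooting into the new image.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Slot:
    version: Optional[str] = None
    image: bytes = b""


class DualBank:
    """Two image slots: the running image is never overwritten by an update."""

    def __init__(self, version: str, image: bytes) -> None:
        self.slots = [Slot(version, image), Slot()]
        self.active = 0

    @property
    def running(self) -> Optional[str]:
        return self.slots[self.active].version

    def install(self, version: str, image: bytes, sha256_hex: str,
                boots_ok: Callable[[bytes], bool]) -> bool:
        # Reject a corrupted or tampered image before touching any storage.
        if hashlib.sha256(image).hexdigest() != sha256_hex:
            return False
        standby = 1 - self.active
        self.slots[standby] = Slot(version, image)
        previous, self.active = self.active, standby
        # If the new image fails to initialize, revert to the previous one.
        if not boots_ok(image):
            self.active = previous
            return False
        return True

    def rollback(self) -> bool:
        """Manually re-select the other slot, if it holds an image."""
        standby = 1 - self.active
        if self.slots[standby].version is None:
            return False
        self.active = standby
        return True
```

Because the new image is always written to the standby slot, both the automatic fallback and a later manual rollback reduce to switching which slot is marked active.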
Types of software updates:
Upgrade – this is the typical software update that replaces the current version with a completely new version that may include bug fixes, new features, security enhancements, etc. Often upgrades are bundled together across multiple software components (software types) either to simplify the installation procedure or to coordinate a specific dependency (see dependency management below). Note that more complex computer systems may have many modules within a software component, each with their own version and with the ability to update individually.
Patch – A patch is a small update that modifies only a portion of a larger software component. This was common in the era of mainframe and large computer programs, particularly when data storage space was limited/expensive and transfer speeds were slow. In modern systems where storage is plentiful and transfer of even large files occurs in seconds, patches are seldom used and fixes are instead rolled into upgrades.
Rollback – A rollback is used to revert the system to an earlier version. This can be applied like an update (but with an older vs. newer version of the component), or, as discussed earlier, can be performed by abandoning the current software version and selecting an older version already installed but stored in an inactive location. A special version of a rollback is reset to factory default which would select the oldest version of software or may revert to a version stored in ROM.
Dependency Management:
When considering each of the software layers/components: boot loader, firmware(s) (for multiple components in the system), the operating system and all of its modules and the applications, it is important to understand how each layer depends on the layer below it. Dependencies between these components must be properly managed and validated to prevent errors and incompatibilities. Software architects will typically minimize the number of dependencies between software components to simplify this process. This is achieved by defining a limited set of externally accessible interface points (defined as Application Programming Interfaces (APIs)) between the components and by implementing those interfaces in such a way that they support backward compatibility. In more complicated software systems (operating systems or higher order applications), this upgradeability and dependency management capability can also be implemented as a library (the details of which are out of scope for this discussion).
By providing a limited set of interfaces with backward compatibility, a lower level component can simultaneously support higher layer components that are both older (utilizing the backward compatible interface(s)) and current (using the more recent capabilities of the module). The number of releases/amount of time that a component supports backward compatibility may vary but is typically a year or more, allowing time for all dependent components to be updated. As a result upgrades are almost always performed from the lowest level of the software stack to the highest. This also creates a behavior where the lowest level components are upgraded rarely and have the largest backward compatibility window while the upper level components (typically the OS and applications) have progressively shorter backward compatibility and receive more frequent updates. This can be seen in the desktop environment, where a given computer boot loader can last a decade or more and can support many operating systems and many versions of those operating systems, which may only perform major upgrades every few years, while higher in the stack the applications may upgrade every few months or even weeks in order to provide the newest and most secure capabilities to the users.
One important role of the software upgrade function, regardless of what level in the hierarchy is being upgraded, is to verify that version compatibility is not violated. This is why some upgrades will abort and require another lower level or dependent component to be upgraded first. This whole mechanism is generally simplified to a sequential version numbering system which is checked for each dependent component, rather than having the software upgrade function understand how to check each individual API. A given software upgrade file will usually contain meta-data which defines the list of components it is dependent on and the range of versions it is compatible with – which is used by the software upgrade function to verify the compatibility. This compatibility matrix is also typically included in release notes and documentation.
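The metadata-driven compatibility check described above can be sketched as follows. The metadata layout (a mapping from component name to an inclusive lowest/highest version range) and the function names are assumptions made for illustration; real upgrade files define their own formats.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def unmet_dependencies(requires: dict[str, tuple[str, str]],
                       installed: dict[str, str]) -> list[str]:
    """List every dependency of an upgrade that the device does not satisfy.

    requires:  component -> (lowest, highest) compatible versions, inclusive
    installed: component -> version currently on the device
    """
    problems = []
    for component, (lowest, highest) in requires.items():
        current = installed.get(component)
        if current is None:
            problems.append(f"{component}: not installed")
        elif not parse(lowest) <= parse(current) <= parse(highest):
            problems.append(f"{component}: {current} outside {lowest}..{highest}")
    return problems
```

An upgrade function would abort when the returned list is not empty and report which lower level component needs to be upgraded first.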
The best practice for this version management is to use a multi-part version number system with at least 3 and sometimes 4 tiers usually represented as a decimal separated number. For example 2.4.5.13 (Major release 2, Minor release 4, Patch level 5, Build 13):
- Major Release – the first number in the release numbering system indicates the major release. A major release indicates a major change in functionality and/or a break in the backward compatibility sequence. In most cases the major release is used to introduce changes to the behavior of the software that require higher level components (or the user, in the case of an operating system or application) to change how they use the component. Major releases are usually rare, with perhaps years between one major release and the next.
- Minor Release – the second number in the release numbering system typically indicates a minor release. A minor release usually provides backward compatibility to any dependency built upon the same major release. A minor release may be used to introduce limited incremental functionality, improvements to existing functionality and new functionality that might be needed by higher level components in the hierarchy. Minor releases are typically provided a few times per year depending on the speed of development by the software originator.
- Patch level – the third number usually represents the patch level or fix level of the software. Most developers elect not to introduce new functionality in patch level updates, but instead focus on bug fixes, resolution of software vulnerabilities and correction of any compatibility issues. Patch level updates may occur many times per year, perhaps with only days or weeks between updates if they are resolving compatibility or security issues. These first 3 levels are almost always sequential and continuous (no gaps in the numbering sequence).
- Build level – there is sometimes a 4th level version ID which indicates the software build that is used to create the update. Within the software development community there will be periodic software builds created (daily or weekly) that are used for internal validation (within the development team/community). Builds that don’t pass the testing are not released. Once a software build reaches a level of stability sufficient for release and in some cases only after it has a sufficient number of updates, then it is released. In most cases this just increments the patch level ID, but in some situations the software supplier may elect to provide multiple build increments to customers with the same patch level.
Note that the structure above is a common one but not the only designation used by a given software supplier. Definitions of what each digit in the software version means, its frequency of updates and even the version naming/numbering structure may vary (for example some suppliers use a letter suffix for the patch level or a date for the build level).
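A minimal sketch of the 3 or 4 tier numbering scheme is shown below. The `Version` class and its method names are hypothetical, and the compatibility rule encoded here (same major release, equal or newer version) is the common convention described above rather than a rule every supplier follows.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int
    patch: int
    build: int = 0  # optional 4th tier

    @classmethod
    def parse(cls, text: str) -> "Version":
        parts = [int(p) for p in text.split(".")]
        if not 3 <= len(parts) <= 4:
            raise ValueError(f"expected 3 or 4 tiers, got {text!r}")
        return cls(*parts)

    def is_backward_compatible_with(self, older: "Version") -> bool:
        """True if this release should still serve users of 'older'."""
        return self.major == older.major and self >= older
```

Comparing the tiers as integers matters: as plain strings "2.10.0" would sort before "2.9.7", whereas `Version.parse("2.10.0") > Version.parse("2.9.7")` is true.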
Data Upgrade
When a software component is upgraded, any data associated with the previous version of the component must also be updated at the same time so it is compatible with the new version of the software. This may include reformatting the data storage based on changes in how the software accesses the data, adding new data attributes and their defaults that are needed by the new system, converting data values to be compatible with the new software module and deleting attributes that are no longer needed. This requires that the upgrade process itself execute a data conversion procedure on the existing data after the new software is installed. The conversion algorithm needs to be able to update data from any of the “from” software versions that are supported by the upgrade.
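One common way to support conversion from several "from" versions is to chain single-step conversions, as sketched below. The schema numbers, attribute names (`mtu`, `speed_mbps`, `speed_gbps`) and function names are all hypothetical and chosen only to show the three kinds of change mentioned above: adding an attribute with a default, converting a value, and deleting an obsolete attribute.

```python
def _v1_to_v2(data: dict) -> dict:
    # Add a new attribute with its default, keeping any existing value.
    return {**data, "mtu": data.get("mtu", 1500)}


def _v2_to_v3(data: dict) -> dict:
    # Convert a value to the new unit and delete the obsolete attribute.
    out = dict(data)
    out["speed_gbps"] = out.pop("speed_mbps") / 1000
    return out


# Each entry converts data from schema N to schema N + 1.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def migrate(config: dict, target: int) -> dict:
    """Return a copy of config converted step by step to the target schema."""
    data = dict(config)
    if data["schema"] > target:
        raise ValueError("downgrading data is not supported by this sketch")
    while data["schema"] < target:
        step = MIGRATIONS.get(data["schema"])
        if step is None:
            raise ValueError(f"no conversion from schema {data['schema']}")
        data = step(data)
        data["schema"] += 1
    return data
```

Because each step only knows about its immediate predecessor, data from any supported older version reaches the current format by running the remaining steps in order.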
Backup and Restore
The last area of software management we will discuss is backup and restore. This topic overlaps somewhat with configuration data backup and restore, but there is an important dependency between the two which we will discuss.
Some, but not all, systems have the ability to create a backup of the running software load. A software backup may help to simplify disaster recovery or reloading of a system, or can be used to transfer the software from one hardware instance to another. Three characteristics differentiate a software backup from simply re-installing a vendor supplied software update:
a) The backup usually bundles all of the components into a single image for easy restoration – for example an operating system and its applications may consist of dozens of (or a hundred or more in the case of desktop/workstation or smartphone systems) different software components, perhaps from multiple different suppliers. Restoring this complex set of software components could take many hours of effort. By making a copy of and storing an image of that full set of software, it can be reinstalled quickly and without needing to re-acquire and install all of the individual updates.
b) The backup is created by the device itself and it is kept locally (on the device or on locally accessible storage media) for easy access to reload the device (i.e. it is not provided by the vendor). Most devices can store at least one or perhaps multiple backups made at different times to provide options as to which backup (point in time) to restore.
c) The software backup includes or is dependent on a specific configuration data backup. To accommodate this a data backup is almost always included within the software backup. This allows the software and its configuration data and any operational data to be stored in a snapshot and restored as a set.
The next chapter in this series will focus on the management and access to Performance and Operational Statistics from network taps and network packet brokers.