The active cyber defense cycle: A strategy to ensure oil and gas infrastructure cyber security
Oil and gas infrastructure is a prime target for extremists and nation states to inflict economic damage as well as to project their influence. Adversaries’ ability to leverage cyber capabilities to achieve this end adds complexity to an already diverse discussion on security. Regardless of the solution identified, protecting against cyber threats requires a strategy. Organizations must understand the purpose of their security strategy before it is developed and implemented. An overly broad goal of “security” or “defense” is not well suited to identify the varying approaches needed and the unique skill sets required. The three categories that can help articulate the needs related to cyber security are architecture, passive defense, and active defense. This five-part series will focus on active defense and how to implement a specific active defense strategy in operations and technology environments.
Cyber security is more than a software patch
The latest trends and buzz terms in the security industry often over-promise quick solutions and plug-n-play type security approaches. This emphasizes only the new and exciting and fails to recognize that security is a process that must be customized to each organization’s maturity and needs. Additionally, good security practices build on each other and fill gaps instead of attempting to entirely replace solutions. In this way, an active defense builds on an organization’s good architecture and passive defenses.
In this context, “architecture” is defined as, “Those processes and actions that contribute to and result in a system developed and maintained with security in mind.” This approach includes:
- Using the most secure implementation of protocols and systems where feasible
- Identifying and implementing network data flows to allow for proper monitoring of connections in and out of the network
- Maintaining patching to the best of the organization’s ability for all systems.
Proper security-minded architecture is a difficult challenge. However, investments in this area dramatically increase the effectiveness of passive and active defenses.
Passive defense
Passive defenses are software or hardware added to the architecture that increase security without consistent and direct interaction from personnel, even if updates and tuning are required over time. Systems, such as firewalls, anti-malware software, intrusion detection and prevention systems, and application whitelisting, are passive defenses. The operations technology environment introduces many challenges toward effectively implementing passive defenses, but even simple actions, such as limiting inbound and outbound connections, requiring authentication from remote locations, and maintaining firewalls with ingress and egress filtering, will prove to be invaluable.
Active defense
When an organization has properly invested in developing and maintaining architecture and passive defenses, it is effective to leverage an active defense. An “active defense” is “the process of security personnel taking an active and involved role in identifying and countering threats to the system.” The term is sometimes incorrectly associated with the idea of hacking back or counterstriking an adversary. This inappropriate use of the term has largely been due to poor translations of active defense theory in military strategies into the field of cyber security. Active defense emphasizes empowering security personnel to monitor an organization’s infrastructure, identify threats, and neutralize them internal to the network before they impact operations. It is never about accessing or impacting adversary networks.
The active cyber defense cycle (ACDC) consists of four phases that work together to maintain security, contributing to the safety and reliability of operations. The four phases are:
- Asset identification and network security monitoring
- Incident response
- Threat and environment manipulation
- Threat intelligence consumption.
The ACDC concept is not complicated:
- Understand the network topologies so they can be monitored for abnormalities and indications of compromise.
- Upon identifying a true threat, initiate an incident response to identify the scope of the infection, contain it, and eradicate it to maintain operations.
- In a safe environment, interact with the threat using skill sets such as malware analysis to gather information and make recommendations for logical or physical infrastructure changes that would aid security.
- Collect the information about the threat throughout the cycle and combine it with external information about threats or threat intelligence.
This information is fed back through the process, which helps security personnel develop over time and look at defense not as a series of single encounters with an adversary, but as a prolonged process where growth and innovation can take place. This cycle ensures that security personnel of various talents are contributing to the same strategy and are effectively working together. Ultimately, this ties into the organization’s business goals.
ACDC is one strategy for an active defense that has been implemented in industrial control system (ICS) environments in and out of the government with great success. There are many distinctive aspects about ICS that put security personnel in a unique position to effectively and efficiently perform this strategy.
Active Cyber Defense Cycle
Asset identification and network security monitoring
Understanding and monitoring networked infrastructure is the key to identifying cyber attacks. This is the reason that the first phase of the Active Cyber Defense Cycle (ACDC) is asset identification and network security monitoring. The first portion of this phase, asset identification, serves multiple purposes inside an industrial control system (ICS). Knowing the network infrastructure, where assets are, and what the network flows and topologies look like ensures that operations personnel know not only what must be secured but also what normal operations look like, which is useful for troubleshooting and network configuration. Identifying peak usage, detecting failing devices, and validating the authenticity or integrity of device reporting from the field all aid the availability and performance of oil and gas operations. Oil and gas networks are relatively static, especially compared to enterprise IT environments. There should not be thousands of users surfing the Internet, and changes to infrastructure are more tedious and happen over much longer product life cycles. This gives defenders an opportunity to truly understand the network and its assets. There are four basic ways to identify assets and their communications.
Rule one in security: Know thyself
The four basic approaches to performing asset identification are:
- Physical inspection
- Configuration file analysis
- Passive scanning
- Active scanning
Physical inspection of assets can be tedious, especially in a distributed SCADA environment, but it is useful and is the least impactful to operations. While the other methods of detecting assets can be helpful, there is always a risk that some assets will be missed. As an example, some legacy systems do not communicate via the network at all, or do so only rarely. Laying eyes on these devices and noting where they are can be the only way to positively identify them. Physical inspection should also be used to identify networked devices that have configuration files that can be reviewed, such as network switches. Additionally, physical inspection should be performed periodically when possible to validate the findings of the other methods.
Analyzing the configuration files on devices such as network switches can reveal registered devices. Every time a device connects to an Ethernet-based network it announces itself to the network, usually through an Address Resolution Protocol (ARP) request. Devices such as switches and routers record this in tables that map the Ethernet (MAC) address to the Internet Protocol (IP) address. Reviewing these tables together with the configuration files can show what is, or has been, on the network.
When devices such as network switches are managed infrastructure, there is often an option to capture the network traffic required for passive scanning. Mirror ports on network switches, taps at key points, or hubs on smaller networks are important ways to capture network traffic without impacting operations. The captured data can be reviewed in free and open source network tools such as tcpdump or Wireshark. Features in programs such as Wireshark, including the “endpoints” and “conversations” views, can identify assets and their normal communication patterns, which can be recorded and reviewed over time. Passive scanning, or traffic analysis, often has the best return on investment for quickly and efficiently performing asset identification.
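To make the idea of endpoint and conversation summaries concrete, the following is a minimal Python sketch of the kind of tally that such views provide. It is not how Wireshark or tcpdump work internally. The flow records, the addresses, and the `summarize` function are assumptions made for illustration; in practice the records would come from an export of captured traffic.

```python
from collections import Counter

# Hypothetical flow records exported from a packet capture:
# (source IP, destination IP, bytes transferred). All values are made up.
flows = [
    ("10.0.0.5", "10.0.0.20", 1200),
    ("10.0.0.20", "10.0.0.5", 800),
    ("10.0.0.5", "10.0.0.21", 300),
    ("10.0.0.5", "10.0.0.20", 500),
]


def summarize(flow_records):
    """Return (endpoints, conversations) byte counters for a list of flows."""
    endpoints = Counter()      # total bytes sent or received per address
    conversations = Counter()  # total bytes per unordered pair of addresses
    for src, dst, nbytes in flow_records:
        endpoints[src] += nbytes
        endpoints[dst] += nbytes
        # Sort the pair so A->B and B->A count as the same conversation.
        conversations[tuple(sorted((src, dst)))] += nbytes
    return endpoints, conversations


endpoints, conversations = summarize(flows)
for pair, nbytes in conversations.most_common():
    print(pair, nbytes)
```

Saving such summaries at regular intervals gives a simple record of which assets exist and who normally talks to whom.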
Active scanning on ICS networks should almost always be avoided. Active scanning means sending communications to devices and waiting for responses. It is difficult to say that it should never be employed, but the acceptable situations are few and far between. Interacting with sensitive devices in unexpected ways can impact operations or crash the assets. Additionally, network devices such as proxies and firewalls often, and understandably, block unsolicited communications, so active scanning returns incomplete maps of the network architecture. Lastly, sending scan traffic across the network distorts the communication patterns on it, making them difficult to accurately identify and baseline. This baseline of communications is vital for network security monitoring.
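As a rough illustration of how a communications baseline might be used, the Python sketch below compares a later observation with a stored baseline and reports devices and conversations that were not seen before. The `compare_to_baseline` function and all addresses are hypothetical, and the sketch assumes each conversation is stored as a (source, destination) pair in a consistent order.

```python
def compare_to_baseline(baseline_pairs, observed_pairs):
    """Return (new_devices, new_conversations) relative to the baseline.

    Both arguments are sets of (source, destination) address pairs.
    """
    known_devices = {addr for pair in baseline_pairs for addr in pair}
    seen_devices = {addr for pair in observed_pairs for addr in pair}
    new_devices = seen_devices - known_devices
    new_conversations = set(observed_pairs) - set(baseline_pairs)
    return new_devices, new_conversations


# Hypothetical baseline recorded during normal operations.
baseline = {("10.0.0.5", "10.0.0.20"), ("10.0.0.5", "10.0.0.21")}
# Hypothetical later observation: one asset now talks to an outside address.
observed = {("10.0.0.5", "10.0.0.20"), ("10.0.0.21", "203.0.113.9")}

new_devices, new_conversations = compare_to_baseline(baseline, observed)
print("New devices:", sorted(new_devices))
print("New conversations:", sorted(new_conversations))
```

Anything reported here is only a lead for an analyst to investigate; a new conversation may turn out to be authorized maintenance.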
Network security monitoring
Defenders can monitor the network to identify indications of malicious activity, such as anomalies or deviations from the normal operation of the network. Network security monitoring builds on a good understanding of the network and its assets to identify changes that occur over time. Spikes in bandwidth usage, new devices appearing on the network, assets communicating with unusual IP addresses, or an increase in security alerts from firewalls or intrusion detection systems are all causes for concern that must be investigated. Network security monitoring consists of three steps for detecting threats:
- Collect
- Detect
- Analyze
Defenders should use their knowledge of the network to collect important data. These data come in different types, such as full content data, statistical data, and alert data. As an example, full content data such as network packet captures reveal the activity on a network and its true usage. Anything an adversary does over the network is captured and can be investigated there. In an ICS environment, learning how to get the data the first time can be the most difficult challenge, but after initial collection it is a sustainable process. For example, logs in field devices are often available in the form of syslog on logic controllers, but logging is often disabled by default. Enabling it can be a long process, but once the data are being sent to a central location for collection, the process is manageable. In enterprise IT environments, the sheer volume of data can make it cumbersome to work with and difficult to store. In oil and gas networks, however, a relatively small amount of storage can be used to maintain vital data over long periods of time. The small and static nature of these networks, compared to traditional IT networks, is one of the defender’s best advantages.
Defenders can detect threats with the right collected data. Changes to the network, or breaks from the baseline, are the best way to detect threats. Traditional security systems often provide valuable alert data that, when collected and correlated, can reveal an adversary’s presence. For example, detecting failed logins on a human machine interface and then finding that there were intrusion detection system alerts a few hours before that event on another segment of the network might reveal an adversary moving through the environment. However, it is important to weed out false positives. A false positive occurs when activity is flagged as a threat but turns out not to be one. The activity may have looked malicious, such as new communications reaching outside the network, but may have been something mundane, such as a previously authorized diagnostics action. Analyzing the detected threats to confirm that they are real is important to ensure that defenders do not exhaust themselves or their management. Analysis by personnel ensures that false positives are disregarded while true positives, or accurately identified threats, are given proper attention. When network security monitoring personnel find true positives, it is often reason to begin incident response. This leads to the next phase of ACDC, which will be discussed in part three of this series.
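The correlation example above, failed logins on a human machine interface preceded by intrusion detection alerts on another segment, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the event records, host names, signature text, and the six-hour window are all hypothetical, and a real deployment would read such events from collected logs.

```python
from datetime import datetime, timedelta

# Hypothetical alert data gathered from two different sources.
ids_alerts = [
    {"time": datetime(2024, 1, 10, 8, 15), "segment": "field-lan",
     "signature": "suspicious scan"},
]
failed_logins = [
    {"time": datetime(2024, 1, 10, 11, 40), "host": "hmi-01"},
    {"time": datetime(2024, 1, 12, 9, 0), "host": "hmi-02"},
]


def correlate(logins, alerts, window=timedelta(hours=6)):
    """Pair each failed login with IDS alerts raised in the preceding window."""
    matches = []
    for login in logins:
        for alert in alerts:
            gap = login["time"] - alert["time"]
            # Keep only alerts that happened before the login, within the window.
            if timedelta(0) <= gap <= window:
                matches.append((login["host"], alert["signature"], gap))
    return matches


matches = correlate(failed_logins, ids_alerts)
for host, signature, gap in matches:
    print(f"{host}: failed login {gap} after IDS alert '{signature}'")
```

A match is a reason for an analyst to look closer, not proof of an intrusion; the window length is a tuning choice that trades missed links against false positives.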
Incident response
In the process of performing asset identification and network security monitoring, analysts eventually find threats in industrial control system networks. At this point, the team must be ready to execute incident response procedures. While there are many models to follow, such as the National Institute of Standards and Technology incident response methodology, it is vital to prevent incident responders from operating in a silo. Security analysts should continue monitoring networks for threats to help incident responders determine how widespread threats are and whether they are spreading on the network. Incident responders should move quickly and methodically to gather forensic evidence and identify instances of malware. This can be incredibly difficult because oil and gas networks are large, distributed, and full of sensitive systems. For this reason, the majority of the work required in incident response should be done prior to the incident.
Every organization should have an incident response plan. The plan should incorporate points of contact within operations technology (OT) and information technology (IT). The combination of IT and OT personnel with a common understanding of the environment and its needs, as well as a crossover of the skill sets required to work in each field, ensures success during incident response. The plan should also contain checklists for the procedures that responders must follow, and it should fit into the organization’s overall incident response plan, which includes events that are not cyber related. The plan should be rehearsed and available in multiple locations, including inside a jump kit. A jump kit is a bag that contains the tools, checklists, and procedures that responders will need. For example, the jump kit should include extra hard drives and CDs for data storage, converters and connectors for various interfaces, emergency contact phone numbers, and anything else that the responders will find valuable during an incident. In operations environments, it is a good idea to include a checklist item reminding responders to have access to personal protective equipment, such as hard hats and steel-toed boots. There are plenty of guides for creating an incident response plan.
Sampling the pre-incident environment
Security personnel should gather forensic evidence prior to an incident. Even if an organization does not have the personnel, time, or skills to analyze collected evidence for indications of malicious activity, the organization should have pre-incident samples. During an incident, responders can compare the pre-incident samples with post-incident evidence to identify abnormalities, allowing them to quickly rule out benign processes and files. Data collected as a baseline must never be assumed to be completely clean, but it is a starting point for incident responders that will save time. Time and money are directly related during incidents; saving time equates to saving money and resources. A tool that has been used extensively on the Microsoft Windows OT systems in oil and gas networks is Redline, from the cyber security company Mandiant. There are other tools in the community that work similarly, but Redline is used as the example in this article.
Redline is a free tool that can be installed on a nonproduction Windows system. After installation, personnel will be prompted with two options: analyze data or collect data. The analyze data option allows security personnel to review collected evidence in a graphical user interface to identify potentially malicious processes and files on the system. However, this requires the allocation of resources such as analyst time and training. For such training, consider the class ICS 515: Active Defense and Incident Response from SANS, a research and education organization. The easy win is the collect data option. Using this option, personnel can create a collector that can be saved to a USB drive. This USB drive can then be taken to a Windows-based system, such as a human machine interface, and used to collect data.
Using the collector is as simple as inserting the USB drive into the system and executing the RunAudit script. The script then collects the appropriate data, including system memory and system timeline information. Personnel can take samples from the environment using this method and then securely store them. Even if the data are never analyzed beforehand, they will be invaluable to incident responders during an incident.
This baseline information helps responders quickly rule out files that they do not need to analyze. Furthermore, it helps to identify abnormalities and to determine the window of time during which the adversary has been present in the environment. It is important to note, however, that the collector should always be tried on test systems before it is used in the production environment. Additionally, use in the production environment should take place at safe times, such as scheduled downtimes. USB drives also must be maintained and cleaned to ensure that any existing malware is not spread from system to system.
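The way a baseline lets responders rule out files can be illustrated with a short Python sketch that compares file hashes taken before and after an incident. This is a generic illustration and not a description of how Redline stores or compares its data. The file paths, contents, and the `diff_manifests` function are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of some file content."""
    return hashlib.sha256(data).hexdigest()


def diff_manifests(pre, post):
    """Classify post-incident files against a pre-incident manifest.

    Both arguments map a file path to its SHA-256 hex digest.
    """
    return {
        "unchanged": {p for p, h in post.items() if pre.get(p) == h},
        "changed": {p for p, h in post.items() if p in pre and pre[p] != h},
        "added": set(post) - set(pre),
        "missing": set(pre) - set(post),
    }


# Hypothetical manifests; real ones would be built from collected samples.
pre_incident = {
    "C:/hmi/app.exe": sha256_of(b"app v1"),
    "C:/hmi/config.ini": sha256_of(b"setpoint=10"),
}
post_incident = {
    "C:/hmi/app.exe": sha256_of(b"app v1"),
    "C:/hmi/config.ini": sha256_of(b"setpoint=99"),
    "C:/hmi/svch0st.exe": sha256_of(b"unknown"),
}

result = diff_manifests(pre_incident, post_incident)
for category in ("changed", "added", "missing"):
    print(category, sorted(result[category]))
```

Files in the unchanged set can be given lower priority, but because the baseline itself is not guaranteed to be clean, they are deprioritized rather than declared safe.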
Consider this sample process:
- Install Redline on a nonproduction Windows system.
- Create a collector on a forensically clean USB.
- Test the collector on a test environment to ensure it does not impact systems.
- Insert the USB into a Windows OT system during a safe period such as scheduled downtime.
- Execute the RunAudit script to collect data such as system memory and disk information.
- Store the USB in a secure location with documentation that identifies for which system it was used.
It is preferable for analysts to review this data. In the active cyber defense cycle (ACDC), incident responders should sample the environment and look for threats even before the network security monitoring analysts alert them to an incident. This process of sampling and analyzing can uncover threats not easily observable in the network. However, given real-world constraints, personnel should at least sample the environment and store the data for incident responders to use in the future. It is important to fully test any security measure, such as sampling the environment, before implementing it on production systems.
Preparing for the worst
After sampling the environment, it is useful to create a collector for remote sites. In oil and gas networks, there are often remote sites that are difficult to access during incident response. By creating a collector and pre-positioning it at those remote locations, even untrained personnel can insert the USB drive into a potentially infected system during an incident and run the script when directed. The USB drives can then be taken or shipped to a central point where incident responders are conducting analyses. The serial number of each USB drive should be tracked, and the drives should be securely stored and clearly marked as being for incident response scenarios only. Operators and engineers should not use the USB drives outside of that situation; otherwise, a drive that has been connected to an infected system may accidentally spread malware in the network. The incident response plan should also provide for approval to use USB drives that are identified, tracked, and cleaned during an incident. Management approval may be required ahead of time.
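One simple way to keep the tracking described above is a small register of collector drives. The Python sketch below is only an illustration under stated assumptions: the `CollectorUsb` class, its fields, and the serial number, site, and approver names are hypothetical, and a spreadsheet or paper log could serve the same purpose.

```python
from dataclasses import dataclass, field


@dataclass
class CollectorUsb:
    """Tracking record for one pre-positioned incident response USB drive."""
    serial: str
    site: str
    status: str = "clean"  # either "clean" or "used"
    history: list = field(default_factory=list)

    def record_use(self, system: str, approved_by: str) -> None:
        """Log a collection run; refuse reuse until the drive is cleaned."""
        if self.status != "clean":
            raise ValueError(f"USB {self.serial} must be cleaned before reuse")
        self.history.append((system, approved_by))
        self.status = "used"

    def mark_cleaned(self) -> None:
        """Mark the drive as wiped and ready for the next incident."""
        self.status = "clean"


# Hypothetical drive pre-positioned at a remote site.
usb = CollectorUsb(serial="SN-0001", site="remote-site-a")
usb.record_use(system="hmi-07", approved_by="ir-lead")
print(usb.status, usb.history)
```

Refusing a second use until the drive is cleaned mirrors the concern that a drive plugged into an infected system could carry malware to the next one.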
Redline is not the only tool that is free to use. Another option is FTK Imager. Other tools are acceptable as well; the most important thing is to have tools ready for quick and easy use in the event of an incident. For the purpose of the ACDC, incident responders should take care to identify samples of the threat, such as malware, that can be passed to the threat and environment manipulation team. That team will be able to analyze the threat and make recommendations for needed security changes.
Robert M. Lee, co-founder of Dragos Security LLC, shares his insight into the challenges of cyber security in the oil and gas industry with a five-part series on implementing the active cyber defense cycle. Dragos developed a passive asset discovery and visualization software tool. Lee is a PhD candidate at King’s College London researching control system cyber security. He is the course author of SANS ICS 515: Active Defense and Incident Response, the author of the book SCADA and Me, and a U.S. Air Force Cyber Warfare Operations Officer. Edited by Eric R. Eissler, editor-in-chief, Oil & Gas Engineering.