As cloud-native development continues to automate the consumption of upstream content providers, the ability for automation to make real-time, informed decisions becomes critical to keep the automation wheels spinning. The speed of development and consumption has reached the point where cracks in the system are always present, even if not immediately seen. The ability to apply new information to facts about artifacts already in our environments is a key aspect of analysis. This time-shifted analysis continually improves the hygiene of the artifacts we consume, build and distribute.
Signing, Systems Bill of Materials (SBoM), Security Scanning, and Registries play major roles in how you build, secure, distribute and consume artifacts. But what are the roles and responsibilities of each? What should be included in an SBoM, and what are the expectations of signing? If you have a signed SBoM, do you even need to run security scanning software? In this article, I’ll offer an opinion on the roles and responsibilities of each, as security is an ever-evolving standard with multiple lines of defense.
Side note: the SPDX and MITRE groups use Systems Bill of Materials, as opposed to Software Bill of Materials to reflect IoT and other hardware-based device scenarios.
Short and Long Version
As I shared the article for review, there was lots of great feedback and discussion. This led to more clarity, which made the article longer. As the saying goes, “I didn’t have time to write a short post,” and I didn’t want to exclude interesting discussions. I’ll follow up with more, in a series, as this is intended to be a reference point for continued discussion.
Systems Bill of Materials (SBoM) & Tacos
You’re planning a party for the end of COVID-19. Some of your friends are gluten-free and some are vegetarians. The menu is Tacos, ’cause who doesn’t love tacos?
You order gluten-free corn and flour tortillas, a variety of vegetables, and of course, avocados to make fresh guacamole. You even plan separate serving areas and utensils to not cross-contaminate.
How do you know what products are gluten-free, and which are made from non-vegetarian ingredients or packaging? Food products have a list of ingredients. The list doesn’t elaborate the full recipe, nor state an opinion of the quality of those ingredients. It factually states the ingredients for consumers to make educated decisions.
In addition to the factual perspective, where is the list of ingredients generated? The ingredients are a factual list of what went into assembling the product.
Can you post-analyze a completed food product and know what went into it? You might perform DNA analysis to understand amylase was in the final product. But was amylase an ingredient, or the result of mixing other ingredients, and it’s what formed when the product was complete? These are important distinctions, and both are valuable.
The list of ingredients is the equivalent of an SBoM. It states the factual aspects consumers need to know.
As of late December 2021, 13 people across six states were diagnosed with the outbreak strain of E. coli O157:H7. It seems the concern over corn and flour tortillas was less of an issue, compared to the bag of lettuce that was used.
The Simple Truth Organic Power Greens ingredients (SBoM) lists:
Organic Spinach, Organic Mizuna, Organic Chard, Organic Baby Kale. Mix May Vary by Season.
Not surprisingly, there’s no mention of E. coli O157:H7, as there was no intent to include E. coli, nor was it known at the time that some of the greens were contaminated.
At the time of packaging, and the time of purchase, everything was fine, or rather it was believed to be fine. Only later, when consumers were diagnosed with E. coli, was an analysis completed. The affected people, across six states, all consumed the same product, labeled “Best if used between Nov. 30, 2021, and Jan. 8, 2022”. The packaging wasn’t damaged, it remained sealed, and it wasn’t contaminated along the distribution supply chain. It was a failure in the preparation of the product (the build system).
In the Future, We Learn About Things of the Past
With the knowledge around E. coli contaminations, we can block the consumption of the specific product: Simple Truth Organic Power Greens with a “Best if used between Nov. 30, 2021, and Jan. 8, 2022” label. Is that enough? What can we learn from this event and other outbreaks that helps us prevent this vulnerability from making its way into future supply chains? If the risk and recurrence are high enough, manufacturers might choose to proactively test for E. coli before their food product is shipped. The industry might even require testing (scanning) for known vulnerabilities prior to distribution. A scanner may identify known ingredients that are subject to a type of vulnerability and perform further testing. If specific types of lettuce, from specific regions of the world, have proven to be more prone to E. coli contaminations, it might be worth more expensive testing. In the E. coli case, the food processor would need to test their greens as part of the assembly line. Manufacturers could attest they follow the certified process, just as manufacturers endorse products as gluten-free, non-GMO, Kosher, or USDA Organic.
The FDA’s Food Safety Modernization Act (FSMA) shifts the focus from responding to contamination to proactive testing measures. The FDA is applying historic knowledge to potential future situations, as the risk/reward is justified.
Scanning for Best Practices
In the above case, we explored the lack of information in an SBoM, and the need for scanning to test for potential risk. The SBoMs didn’t include E. coli, metal fragments, or other vulnerabilities. Products that are known to contain these vulnerabilities would be recalled, and removed from the supply chain. I’ll explore revocations a bit further down.
SBoMs can be used to assure best practices are applied. The inclusion of an item in an SBoM is factual information that enables a scanning policy to make an informed decision, based on what’s known at a point in time. Scanners may output endorsements (badges) when specific tests were validated as they implemented a best practice. However, a scan of factual information may yield a different decision as new information is applied to previous facts. (Time is a factor)
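The split between static facts and time-based opinions can be sketched in a few lines of Python. This is a minimal illustration, not a real scanner: the `Sbom` and `Advisory` types, the `scan` function, and the package names and dates are all hypothetical, chosen only to show that the same SBoM yields a different result as new information arrives.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sbom:
    """Facts recorded at build time; never edited afterwards."""
    artifact: str
    packages: frozenset  # names of the packages that went into the build


@dataclass(frozen=True)
class Advisory:
    """An opinion about a package, known from a given date onward."""
    package: str
    known_since: date


def scan(sbom, advisories, as_of):
    """Return the SBoM packages flagged by advisories known on `as_of`."""
    return sorted(
        a.package
        for a in advisories
        if a.known_since <= as_of and a.package in sbom.packages
    )


# Hypothetical data, for illustration only.
sbom = Sbom("net-monitor:v1", frozenset({"throttle-profile", "http-client"}))
advisories = [Advisory("throttle-profile", date(2021, 9, 10))]

early = scan(sbom, advisories, as_of=date(2021, 1, 12))
later = scan(sbom, advisories, as_of=date(2021, 10, 1))
print(early)  # []
print(later)  # ['throttle-profile']
```

The SBoM is frozen and never changes between the two calls; only the `as_of` date, and therefore the knowledge available to the scan, differs.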
Asbestos Fibers for Fire Suppression and Material Bonding
Between the 1950s and 1980s, Asbestos curtains were the standard for theaters. The curtains boldly advertised they were made with Asbestos, to give comfort to attendees in the case of an on-stage fire.
In the 1970s, Asbestos was preferred in some industries as risk mitigation. Contracts required Asbestos in the bill of materials. Flash forward to the 1980s. The same SBoM, for the exact same product, is now considered a high risk that requires abatement for the product that was previously considered a safety requirement.
Did the product change? It was our knowledge about the product that changed. Future information was applied to past factual data. If the SBoM of a theater stated an opinion the theater was safe from stage fires, without stating the facts as to how that requirement was met, the SBoM wouldn’t have the same long-term value. It was the static facts contained in the SBoM, at the time of creation, that enabled future opinions on the factual information.
Risk and Remediation
Asbestos is an example of a known risk, that is avoided in future products but doesn’t require immediate abatement for existing products. Existing products that contain Asbestos are only a risk under certain conditions.
Asbestos was so prevalent, most residential and commercial buildings built between 1950-1980 likely contain Asbestos floor tiles or Asbestos ceiling tiles. Are these buildings unsafe? Should all buildings that contain Asbestos be immediately shut down until the Asbestos has been removed, and replaced with what’s currently believed to be safe products? What is the risk (cost) and reward for such an undertaking?
When to Remediate
Identifying a risk may not necessarily require immediate abatement and remediation. It’s a reference point, to decide if the risk applies to the usage. As long as the Asbestos floor and ceiling tiles are left intact, the risk is managed. The costs of removing the tiles would entail shutting down the building and relocating the occupants, with an impact on the business. The tiles require special handling to remove, increasing the costs, and then the reconstruction must begin. When new construction (deployments) is begun, the new products are validated against known vulnerabilities. Construction of a theater in 1970 would require Asbestos curtains, or Asbestos tiles as they were cheap and sturdy. However, new construction (deployments) ensures Asbestos is not in the SBoM.
In the software supply chain, we might identify a known vulnerability, which exists in the registry and is currently deployed, but has no exploit vector. For example, Asbestos is not a risk, until it’s disturbed.
The fictitious Wabbit Networks produces network monitoring software, which is consumed by Acme Rockets. When the Wabbit Networks net-monitor product was initially built, a throttling profile package was used to provide usage insights, while also preventing denial of service attacks. Wabbit Networks advertised their software integrated with API management systems because it included the throttling profile package. Scanning software at Acme Rockets required the package as a best practice, integrating with their API management software.
Several months later, the profiler package was found to send PII data to the Vulgarian government. The security team at Acme Rockets had to evaluate whether they needed to shut down their operations, delay the launch of their road runner rocket, or proceed with operations and remediate the exploit at a later date.
While the exploit is considered a critical level risk, the usage within Acme Rockets is mitigated as production environments are locked down, and all nodes, with the vulnerable package, have no egress to send the PII data. As the launch commenced, their Kubernetes environment needed to scale up more instances to satisfy all the telemetry processing.
The security team evaluated the risk and determined they could proceed. They flagged the vulnerability in production deployed nodes as requiring mitigation within 5 days. The scanners were configured to block new deployments that contained the throttling package, but existing deployments were allowed to scale. To mitigate the risk of scaling to newly added nodes, a security rule was added requiring that nodes be in a private network with no egress. The data exfiltration exploit prompted a proactive best practice: locking down nodes by default.
The development team was given time to replace the package within 5 days, enabling the team to watch the afternoon launch. Of course, the roadrunner dodged the rocket, and Acme Rockets was back to the drawing board.
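The policy the Acme Rockets security team settled on can be written down as a small decision function. This is a sketch under stated assumptions: the `Request` type, the `admit` function, and the package name `throttle-profile` are hypothetical, and a real cluster would enforce this through its own admission and network controls rather than a standalone function.

```python
from dataclasses import dataclass

# Hypothetical name for the flagged throttling profile package.
BLOCKED_PACKAGES = {"throttle-profile"}


@dataclass(frozen=True)
class Request:
    image: str
    packages: frozenset     # taken from the image's SBoM
    already_deployed: bool  # True when scaling an existing deployment
    node_has_egress: bool   # True when the target node can reach outside


def admit(req):
    """Return (allowed, reason) for a deployment or scale-out request."""
    flagged = sorted(req.packages & BLOCKED_PACKAGES)
    if not flagged:
        return True, "no flagged packages"
    if not req.already_deployed:
        return False, f"new deployment contains {flagged}"
    if req.node_has_egress:
        return False, "scale-out requires a node with no egress"
    return True, "existing deployment, contained on a no-egress node"


vulnerable = frozenset({"throttle-profile", "http-client"})
scale_out = admit(Request("net-monitor:v1", vulnerable, True, False))
new_deploy = admit(Request("net-monitor:v1", vulnerable, False, False))
open_node = admit(Request("net-monitor:v1", vulnerable, True, True))
print(scale_out)   # (True, 'existing deployment, contained on a no-egress node')
print(new_deploy)  # (False, "new deployment contains ['throttle-profile']")
print(open_node)   # (False, 'scale-out requires a node with no egress')
```

The point of the sketch is that the decision depends on context (new versus existing, egress versus no egress), not only on whether the vulnerable package is present.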
Signing, Identity, Quality and Revocation
What does a signature imply? When an artifact is consumed, do you know who created it? The location from which you acquired the artifact should not be the sole source of trust, as the Best Practice for Consuming Public Content is to take possession of the artifacts you depend upon. In addition to the promotion of public content to private registries, the container community recognizes the challenges with consuming public content (see Docker Hub Rate Limiting). Alternate sources include the Amazon ECR Public Gallery, GitHub Package Registry, and the Microsoft Container Registry. Just as you can find the Mission gluten-free tortillas at several grocery stores, the same debian image may be available from multiple registries. The location you pull the artifact from should not reflect the identity. See Separating Identity from Location for more info.
The signed artifact could be the net-monitor container image, the SBoM of the image, or even a persisted scan result of the image. If we’re going to evaluate SBoMs as a source of our security decisions, don’t we need to know who created them? And, based on who created them, do we trust them?
Does a Signature Imply Quality?
A signature, by itself, is not intended to guarantee quality. There may be hundreds of releases of the net-monitor image that contain various levels of quality. Some builds may even have known or unknown exploits. Being able to identify the author is a piece of the secure supply chain puzzle, but not the entirety of the puzzle.
Using the favorite SolarWinds exploit as an example, it was the signature that provided insight that the exploit wasn’t a distribution hack. The signing key wasn’t stolen. The artifacts were signed by the SolarWinds build system, enabling auditors to focus on how the artifacts were built.
The next question is how are consumers notified of the exploit? Should the signing key be revoked, causing the signature validation to fail, because a particular build was found to be of lower quality, or contained a vulnerability? If we’re going to revoke signing keys because of vulnerabilities, what level of a vulnerability should constitute revocation? Is the revocation more damaging? Should the scaling of the Acme Rockets telemetry service fail, just as the launch commenced? Is revocation the right tool for the job, or a blunt hammer that can cause more damage and confusion than the initial exploit?
Key revocation is a debated and controversial topic. SolarWinds did revoke their signing key, which caused a wider range of problems.
…the code-signing certificate used by SolarWinds to sign the affected software versions was revoked March 8, 2021. This is industry-standard best practice for software that has been compromised.
Is it the “best practice”? Should it be the best practice? Asbestos was the best practice in the mid-1900s. It stopped fires, but it caused bigger problems. Is Asbestos still the best practice?
The SolarWinds Q&A goes on to note:
Regretfully, the same digital code-signing certificate used to sign our Orion Platform software affected by the SUNBURST vulnerability was also used to sign additional SolarWinds products not known to be affected by SUNBURST. While this does not mean all products are compromised, it does mean the day-to-day operation of any software signed by the compromised digital code-signing certificate may be impacted
Which prompted questions from users:
Can I just replace the revoked digital code-signing certificate with the new one and keep my software running?
Of course, replacing the code-signing certificate doesn’t mitigate the problem. It doesn’t automatically download and upgrade to remediated versions. Users must install a new version of the software, and, they must enable new versions of the signing certificates.
So, what is the best practice? What are the roles and responsibilities of signatures, SBoMs, and security scanners? When should you revoke a signing key? Is it the job of signing keys to enforce vulnerabilities or quality, when the vulnerability was distributed by the owner of the signing key?
Identity Theft = Revocation
Let’s compare to the identity of a person, Morgan. If Morgan does a bad thing, as a mistake or an intentional act, does that mean Morgan is no longer Morgan? What did Morgan do, and how do we know Morgan did it? Did Morgan do “it”, or was their identity stolen and someone is doing something bad on their behalf? If someone stole Morgan’s identity and was able to act on their behalf, we want to invalidate the instrument used to represent Morgan, but protect Morgan as an entity that does good things. If Morgan made a mistake, isn’t it that specific artifact that needs to be recalled? Do we want to revoke everything Morgan signed? Should Morgan be the only one that can revoke things they signed?
A signature is a representation of identity. The ability to sign something on behalf of someone is the thing that can be compromised. The identity should be maintained. When the signing instrument is stolen and used by bad actors, the signing instrument should be revoked, as it no longer accurately identifies Morgan. If your driver’s license, passport, or credit cards are stolen, you contact those authorities to have them canceled, and new documents and/or credit cards are issued. A power of attorney may be given to sign things on Morgan’s behalf. That power of attorney may be given for a period of time, but the things Morgan’s attorney signed aren’t time-bound. If the power of attorney wasn’t valid, this constitutes a stolen identity. Anyone attempting to use the stolen identities is blocked, as each of these systems has its own validation scheme. However, there’s no central list to track all driver’s licenses, passports, or credit cards. If you purchased a bad product or canceled a subscription, do you cancel the credit card, or do you return the product and get credit?
Security Scanning Identifies Vulnerabilities
Using the SolarWinds exploit, I’d suggest revoking the signing key wasn’t the most appropriate approach, at least not by today’s standards. It was the equivalent of using Asbestos. Well-intended, but didn’t “scale” so well. Revoking the signing key created a wider blast radius, causing collateral damage. And, it didn’t actually remediate the problem. It was a blunt hammer.
Security scanners were, and are, the appropriate tool for the job. Users didn’t know where the SolarWinds software was running. However, companies should deploy security scanning products, which monitor all environments. The security scanners look for known vulnerabilities, giving users visibility and tools for remediation. As users were notified of the problem by the security scanner, they were notified to remediate by deploying a new version. If the SolarWinds key had been maintained, the new version would install, as the SolarWinds signing key still accurately reflected that SolarWinds built the update. The security scanning software would approve the update, as it no longer contained the exploit. Unaffected SolarWinds products would continue to operate, as they were never impacted. If the SolarWinds signing key had been compromised, and someone else was building SolarWinds software, a revocation of that signing key would make sense, as who knows what was produced by a bad actor. That wasn’t the case here. It was quickly discovered what was affected, and a surgical knife would have been more accurate than a blunt hammer.
Security scanning can and should be done at each stage of the supply chain (build, distribute, consume). Imported artifacts for the build, staging, and production environments must be scanned. The source code and the patterns within the source must be scanned. The output of the build system (the collection of artifacts) must be scanned as the aggregate can surface new vulnerabilities.
Scanning takes inputs and generates outputs. As the saying goes, the output is only as good as the collection of inputs. Since we’re continually chaining the consumption of upstream artifacts, scanners are always at the mercy of what they can consume.
Scanning provides insight, based on what is known and what has been learned up to that point in time. It evaluates the information it has and states an opinion, based on that knowledge and the context in which it’s being used. It doesn’t predict the future, nor may it see the cracks just below the surface. Scanners provide a subjective opinion that will change over time, as they learn of new exploits and vulnerabilities.
Security Scanners and SBoMs = PB&J
Security scanners use information to make decisions, including the list of Common Vulnerabilities and Exposures (CVE) and the proprietary information that security companies maintain. They actively scan content for what they can see. Using the Tacos example, scanners could perform DNA analysis on the completed food product. Wouldn’t it be nice to know what went into the food product? If you knew Red Dye No. 3, which has been linked to thyroid cancer, was used in the creation of the food, you would know to avoid it. You would have insight into what went into the build, and wouldn’t be limited to after-the-fact DNA analysis.
If scanners knew, for a fact, a particular package was used to compile a native binary, they could be far more effective and efficient. When a new vulnerability is found in a particular package or a build configuration, scanners could query the list of SBoMs of software that’s actively deployed providing more insight. Scanners wouldn’t have to pull terabytes of container images across the network to scan for patterns in the compiled code.
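As a rough sketch of that query, assume each deployed artifact has an SBoM reduced to a set of package names. The `build_index` and `affected_artifacts` helpers and the sample data below are hypothetical, not part of any particular scanner; the idea is only that a lookup over small SBoM documents replaces pulling and inspecting every image.

```python
from collections import defaultdict


def build_index(deployed_sboms):
    """Map each package name to the set of deployed artifacts that list it."""
    index = defaultdict(set)
    for artifact, packages in deployed_sboms.items():
        for package in packages:
            index[package].add(artifact)
    return index


def affected_artifacts(index, package):
    """Return deployed artifacts whose SBoM lists `package`, sorted by name."""
    return sorted(index.get(package, set()))


# Hypothetical SBoMs for what is currently deployed.
deployed_sboms = {
    "net-monitor:v1": {"throttle-profile", "http-client"},
    "net-monitor:v2": {"http-client"},
    "telemetry:v7": {"throttle-profile"},
}

index = build_index(deployed_sboms)
hits = affected_artifacts(index, "throttle-profile")
print(hits)  # ['net-monitor:v1', 'telemetry:v7']
```

When a new vulnerability is announced, the index answers “where is this running?” without touching the images themselves. The answer is only as complete as the SBoMs that fed the index.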
If you had an SBoM for a building you purchased in 1970, you would require Asbestos curtains. Purchasing the same building in 2022 wouldn’t stop the purchase and usage. But, you would use that factual information to negotiate the price of abatement. Without an SBoM, you’re faced with lengthy and expensive inspections to check for Asbestos. The inspection may require non-destructive testing, concealing potential risks. The use of an SBoM, from an entity you trust, verified by a signature, makes the entire process efficient. When the signature is from a construction company that was found to do shady work, the SBoM may be considered invalid, and exploratory testing may be required before purchasing the building. Or the identity may represent a company so shady that you either walk away from the deal (block the deployment) or know the building must be torn down and replaced before you can continue. The identity of the signature is a piece of data that validates or invalidates the quality of the information provided by that identity.
Summing It Up
SBoMs Should Contain Factual, Non-opinionated Insight That Will Last Over Time
SBoMs represent information, frozen at the point they are created. They must provide information about what went into the creation, not after-the-fact, DNA-style analysis. They should provide information about the creation, the environment, and any other insight that can be used for future analysis by scanning systems. SBoMs should be easy to create if they’re created at the point of the build. There’s not much logic, as it simply captures factual information. You might argue a particular SBoM may not capture enough information, but the facts it captures should not be opinionated. If the scanner runs during the creation, validating the quality of the packages used, it might generate an SBoM with factual information about what went into the artifact. Inferences from DNA-style analysis of the consumed packages are opinions that may evolve. Listing Asbestos is a fact. Saying Asbestos is either Approved or High Risk is an opinion that may change over time.
Scanners Are Time Based Opinions
Scanners inform, based on what’s known at a point in time, providing oversight, where time is a factor of evolving information.
Scanners can use the factual information for what went into a build to be efficient and deeply informed. You may generate an SBoM and a scan result at the time of creation, as two separate pieces of information. One fact, one opinion based on what was known at the time.
If the build system knows it’s building an artifact that contains a known vulnerability, it should list the package in the SBoM and the opinion in the scan result. The opinion may change in the future, and a new scan can represent the newly informed opinion, based on the static facts of the SBoM. You should continue running scans on the same content, as the information improves over time, giving you more insight into older content. You will likely choose to keep the history of SBoMs and scan results, as you may be asked: “Why was foo:v123abc deployed on Jan 12, 2021, when we know it contains vul 456def?” To which you can demonstrate that on Jan 12, 2021, it had scanned clear of that vulnerability. It wasn’t until September 10, 2021 that vul 456def was even identified.
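Answering that audit question comes down to keeping scan results with their dates and looking up the one in effect on the day of deployment. The sketch below assumes a simple in-memory history for foo:v123abc; the `verdict_on` helper and the two scan dates are hypothetical, picked to sit on either side of the September 10, 2021 disclosure.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical scan history for foo:v123abc: (scan date, findings), oldest first.
history = [
    (date(2021, 1, 10), []),
    (date(2021, 9, 11), ["vul 456def"]),
]


def verdict_on(history, day):
    """Return the findings of the most recent scan on or before `day`, or None."""
    dates = [scanned for scanned, _ in history]
    position = bisect_right(dates, day)
    if position == 0:
        return None  # no scan had run yet
    return history[position - 1][1]


at_deploy = verdict_on(history, date(2021, 1, 12))
after_disclosure = verdict_on(history, date(2021, 9, 12))
print(at_deploy)         # []
print(after_disclosure)  # ['vul 456def']
```

An empty list means the artifact scanned clear with what was known at the time, which is different from `None`, where no scan existed to support the deployment decision.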
Signatures Are Identities, That Are Maintained as Long as the Signing Instrument Is Valid
Signing instruments (private keys) may be revoked, but I’d suggest they should only be revoked when the signing instrument has been compromised. Signatures that contain verifiable timestamps for when they were issued, may be invalidated from a point in time. If you believe an identity was stolen on April 15th, you should consider the confidence in the date of theft, increasing the margin of error. You may block signatures for days or weeks prior to the compromise date. Signatures should not be revoked as a blunt instrument to impose an opinion of quality. I’d suggest it is the overuse, or abuse of key revocation, and the central management of revocation that has led to the concerns of revocation.
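The point-in-time invalidation described above can be expressed as a single comparison. This is a sketch, not a verification routine: it assumes the signature’s timestamp has already been cryptographically verified elsewhere, and the `signature_trusted` function, the 14-day default margin, and the year 2022 are hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def signature_trusted(signed_at, compromised_at=None, margin=timedelta(days=14)):
    """Decide whether a timestamped signature falls outside the distrust window.

    Signatures made before (compromised_at - margin) stay trusted; anything
    signed inside the margin or after the compromise date is blocked.
    """
    if compromised_at is None:
        return True  # the signing instrument is not known to be compromised
    return signed_at < compromised_at - margin


compromised = datetime(2022, 4, 15)
old_release = signature_trusted(datetime(2022, 3, 1), compromised)
in_margin = signature_trusted(datetime(2022, 4, 10), compromised)
after_theft = signature_trusted(datetime(2022, 4, 20), compromised)
print(old_release, in_margin, after_theft)  # True False False
```

Widening `margin` reflects lower confidence in the date of theft. Signatures made well before the compromise keep identifying their author, so unaffected artifacts continue to validate.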
A Suite of Tools, Each With Their Own Purpose
Scanners are the tools for quality and opinions on security, which use identity and SBoM insight to inform that configurable opinion.
SBoMs are an excellent new tool, that will make future software validations more reliable and efficient. The ability to use these new tools is only as good as the quality that goes into them.
Details on what constitutes a signing instrument, how it scales, and how to highlight concerns around particular artifacts make for a great discussion, for another article.
If you’ve read this far, you obviously care about evolving the norm, improving the best practices, and finding a balance of scale and the separation of concerns. Systems should do one thing, do it well, and allow a suite of tools to solve a suite of problems.