Things should be made as simple as possible – but no simpler.
- sometimes attributed to Einstein
I believe the rule of thumb above stands on its own merit when it comes to software systems so the credibility of the attribution is not important (it’s also possible that we should not take software design advice from a physicist).
This post is about the PKI signing API provided by Securesystemslib and used by applications built with python-tuf. It’s an example of how keeping a thing too simple can actually make it more complex.
The original securesystemslib.keys module is based on the assumption that there are three distinct steps in the lifetime of a private-public keypair in a system like a TUF repository:
This all seems logical on paper, but in practice implementing signing for different underlying technologies (like online key vaults and Yubikeys) forces the API surface to grow linearly, and still requires applications to be aware of all the different signing technologies and their configuration. It was clear that something was wrong.
In reality there are four distinct events during the lifetime of a signing key. All of these steps can happen on different systems, with different operators and different access to the underlying signing system:
Securesystemslib 0.26 introduces an improved signer API that recognizes this process complexity – and in turn makes managing and signing with keys simpler in practical application development. There are three main changes, all in the securesystemslib.signer module that defines Signer and Key classes:
- `gcpkms:projects/python-tuf-kms/locations/global/keyRings/git-repo-demo/cryptoKeys/online/cryptoKeyVersions/1` (a Google Cloud KMS key)
- `file:/home/jku/keys/mykey?encrypted=true` (a key in an encrypted file)
- `hsm:` (a hardware security module like a Yubikey)

These examples are slightly simplified copies from my latest repository implementation and should represent any new application code using the python-tuf Metadata API in the future. Some things to note in these examples:
Here’s an example where the private key URI is stored in a custom field in the metadata (this makes sense for online keys). First, the setup code that imports a key from Google Cloud KMS – this code runs in a repository maintainer tool:
```python
def import_google_cloud_key() -> Key:
    gcp_key_id = input("Please enter the Google Cloud KMS key id")
    uri, key = GCPSigner.import_(gcp_key_id)
    # embed the uri in the public key metadata
    key.unrecognized_fields["x-online-uri"] = uri
    return key
```
Then signing with the same key – this code runs in the online repository component and only needs the public key as an argument since we embedded the private key URI in the public key metadata. It does require the cloudkms.signer role permissions on Google Cloud though:
```python
def sign_online(self, md: Metadata, key: Key) -> None:
    uri = key.unrecognized_fields["x-online-uri"]
    signer = Signer.from_priv_key_uri(uri, key)
    md.sign(signer)
```
This time we’re importing the maintainer’s Yubikey:
```python
def import_yubikey(config: ConfigParser) -> Key:
    input("Insert your HW key and press enter")
    uri, key = HSMSigner.import_()
    # store the uri in application configuration
    config["keyring"][key.keyid] = uri
    return key
```
Later we sign with the Yubikey:
```python
def sign_local(md: Metadata, key: Key, config: ConfigParser) -> None:
    uri = config["keyring"][key.keyid]
    signer = Signer.from_priv_key_uri(uri, key)
    md.sign(signer)
```
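The idea behind `Signer.from_priv_key_uri` is dispatch on the URI scheme: each signing technology registers a constructor keyed by its scheme prefix. Here is a minimal stand-alone sketch of that pattern; `EnvSigner` and `signer_from_uri` are made-up names for illustration, not securesystemslib classes:

```python
from urllib.parse import urlparse


class EnvSigner:
    """Illustrative signer that resolves its secret from a URI.

    Made up for this sketch -- not part of securesystemslib.
    """

    def __init__(self, secret: str):
        self.secret = secret

    @classmethod
    def from_uri(cls, uri: str) -> "EnvSigner":
        # "env:mysecret" parses with scheme "env" and path "mysecret"
        return cls(secret=urlparse(uri).path)

    def sign(self, data: bytes) -> str:
        return f"signed:{self.secret}:{len(data)}"


# Scheme-to-constructor registry: adding a new signing technology means
# adding a dict entry, not growing the API surface.
SIGNER_SCHEMES = {"env": EnvSigner.from_uri}


def signer_from_uri(uri: str):
    scheme, _, _ = uri.partition(":")
    return SIGNER_SCHEMES[scheme](uri)
```

The application code stays identical for every backend: it holds an opaque URI string and asks the registry for a signer.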
In February 2022 the python-tuf team released version 1.0. This release was the product of a significant refactoring effort: the code was rewritten from scratch to provide two new stable APIs:
Unifying both of these APIs is a focus on developer ergonomics and flexibility.
While the new python-tuf codebase is much leaner (a mere 1,400 lines of code at release, compared to the legacy code’s 4,700 lines) and builds on the lessons learned from development (and developers) of the prior versions of python-tuf, we were very conscious of the fact that our first major release of a security project was made up of newly authored code.
To improve our confidence in this newly authored code we engaged with the Open Source Technology Improvement Fund (OSTIF) to have an independent security assessment of the new python-tuf code. OSTIF connected us with the team at X41 D-Sec who performed a thorough source code audit, the results of which we are releasing today.
The report prepared by X41 included one medium-severity and three low-severity issues. Below we describe how we are addressing each of the reported items.
Private Key World-Readable (TUF-CR-22-01) – Medium
This vulnerability is not in any code called by python-tuf, but was included in demonstrative code the python-tuf team provided to the X41 team. The underlying issue is in securesystemslib, a utility library used by python-tuf which provides a consistent interface around various cryptography APIs and related functionality: files there were created with the default permissions of the running process.
We resolved this issue by adding an optional restrict parameter
to the storage.put() interface and in the corresponding filesystem
implementation of the interface ensuring that when restrict=True files are
created with octal permissions 0o600 (read and write for the user only).
This enhancement has been included in the recent release of securesystemslib 0.25.0.
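The essence of the fix can be sketched in plain Python. This is a minimal illustration of the `restrict` idea, not securesystemslib’s actual implementation; the key point is passing an explicit mode to `os.open` so the `0o600` permissions apply at creation time:

```python
import os


def put(path: str, data: bytes, restrict: bool = False) -> None:
    """Write data to path; with restrict=True the file is created 0o600."""
    if restrict:
        # O_CREAT with an explicit mode sets permissions at creation,
        # avoiding a window where the file is briefly world-readable.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
    else:
        # Default permissions of the running process (the reported issue)
        with open(path, "wb") as f:
            f.write(data)
```

Creating the file restrictively from the start matters: chmod-ing after an ordinary `open()` would leave a race window with looser permissions.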
Shallow Build Artifact Verification (TUF-CR-22-02) – Low
The verify_release script, run by python-tuf developers as part of the
release process and available to users to verify that a release on GitHub or
PyPI matches a build of source code from the repository, was only performing
a shallow comparison of files. That is, only the type, size, and modification
times were compared. We have modified the script to perform a deep comparison of the contents and attributes of files being
verified.
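The shallow-vs-deep distinction is easy to demonstrate with Python’s standard `filecmp` module (shown here as a self-contained illustration, not the `verify_release` code itself): with the default `shallow=True`, two files with matching type, size, and modification time compare equal without their contents ever being read.

```python
import filecmp
import os
import tempfile

d = tempfile.mkdtemp()
a, b = os.path.join(d, "a"), os.path.join(d, "b")
with open(a, "w") as f:
    f.write("release-build-1")
with open(b, "w") as f:
    f.write("release-build-2")  # same length, different content

# Give both files identical modification/access times
os.utime(a, (1000000, 1000000))
os.utime(b, (1000000, 1000000))

shallow = filecmp.cmp(a, b)              # compares os.stat signatures only
deep = filecmp.cmp(a, b, shallow=False)  # compares actual file contents
```

Here `shallow` is `True` even though the files differ, while `deep` is `False`; a release verifier must use the deep comparison.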
Quadratic Complexity in JSON Number Parsing (TUF-CR-22-03) – Low
This issue was not in python-tuf itself, rather the problem was in Python’s built-in json module.
Fortunately, we did not need to take any action for this issue as it was independently reported upstream and has been fixed in Python. Find more details in CVE-2020-10735: Prevent DoS by large int<->str conversions on Python’s issue tracker.
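The mitigation CPython shipped is a configurable cap on int-to-str and str-to-int conversion length. A small version-guarded sketch (the API exists only from Python 3.11, and 640 is the smallest cap the interpreter accepts):

```python
import sys

# Python 3.11+ caps int<->str conversion length to mitigate the
# quadratic-cost parsing behind CVE-2020-10735.
if hasattr(sys, "set_int_max_str_digits"):
    sys.set_int_max_str_digits(640)  # lower the cap for demonstration
    try:
        int("9" * 1000)  # parsing 1000 digits now fails fast
        capped = False
    except ValueError:
        capped = True
    # restore the interpreter default
    sys.set_int_max_str_digits(sys.int_info.default_max_str_digits)
else:
    capped = None  # pre-3.11 interpreter: no cap available
```

Applications parsing untrusted JSON on older interpreters do not get this protection, which is why the upstream fix mattered.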
Release Signatures Add No Protection (TUF-CR-22-04) – Low
python-tuf releases are built by GitHub Actions in response to a developer
pushing a tag. However, before those releases are published to the project’s
GitHub releases page and PyPI a developer must verify (using the
verify_release script discussed earlier) and approve the release. Part of the
approval includes creating a detached signature and including that in the
release artifacts. While these do not add any additional protection, we do
believe that the additional authenticity signal is worthwhile to users.
Furthermore, along with the above notice and the recommendations in the informational notes, we will continue to iterate on our build and release process to provide additional security for users of python-tuf.
We are extremely grateful to X41 for their thorough audit of the python-tuf code, to the Open Source Technology Improvement Fund (OSTIF) for connecting us with the X41 D-Sec GmbH team, and to the Cloud Native Computing Foundation (CNCF) for funding the source code audit – thank you all.
Read the full report here: Source Code Audit on The Update Framework for Open Source Technology Improvement Fund (OSTIF).
With the v1.0.0 release we can say that the current reference implementation is finally in a good place, though it wouldn’t be so trustworthy without all the test functionality it provides. Therein lie some interesting surprises, for the conformance tests reflect use cases and tricky details that wouldn’t easily come to mind. TUF, in fact, is capable of managing some tricky business!
Before looking into them, let’s first introduce the test functionality itself.
The test suite is heavily based on RepositorySimulator, which allows you to play with repository metadata: modifying it, signing and storing new role versions while serving older ones to the client test code. You can also simulate downloading new metadata from a remote without any file access or network connections, and manipulate expiry dates and time.
Even though RepositorySimulator hosts repositories purely in memory, you can supply the --dump flag to write their contents to a temporary directory on the local filesystem, with “/metadata/…” and “/targets/…” URL paths hosting metadata and targets respectively. This lets you inspect the “live” test repository state for debugging purposes.
Let’s walk through a specific example, testing expired metadata, to demonstrate the capability RepositorySimulator provides: simulating real repository chains of updates as suggested by the spec, rather than just modifying individual metadata.
More specifically, we would like to simulate a workflow in which the targets version is increased and a timestamp expiry date is changed. We will elaborate below on how this can be used to test the Updater programmatically. For now, let’s just focus on how to verify that the RepositorySimulator did what we expected.
Let’s assume we did the following:
- bumped targets to v2
- changed the timestamp v2 expiry date

We can verify that the metadata looks as expected, without the need to implement file access.
First, we need to find the corresponding temporary directory:
```
$ python3 test_updater_top_level_update.py TestRefresh.test_expired_metadata --dump
Repository Simulator dumps in /var/folders/pr/b0xyysh907s7mvs3wxv7vvb80000gp/T/tmpzvr5xah_
```
Once we know it, we can verify that the metadata has 2 cached versions:
```
$ tree /var/folders/pr/b0xyysh907s7mvs3wxv7vvb80000gp/T/tmpzvr5xah_/test_expired_metadata
/var/folders/pr/b0xyysh907s7mvs3wxv7vvb80000gp/T/tmpzvr5xah_/test_expired_metadata
├── 1
│   ├── 1.root.json
│   ├── snapshot.json
│   ├── targets.json
│   └── timestamp.json
└── 2
    ├── 2.root.json
    ├── snapshot.json
    ├── targets.json
    └── timestamp.json
```
And now we can also see that after bumping the version and moving timestamp v2 expiry date two weeks forward from v1, the v2 corresponding timestamp metadata has recorded that expiry date correctly:
Timestamp v1:
```
$ cat /var/folders/pr/b0xyysh907s7mvs3wxv7vvb80000gp/T/tmpzvr5xah_/test_expired_metadata/1/timestamp.json
{
  "signatures": [{...}],
  "signed": {
    "_type": "timestamp",
    "expires": "2022-03-30T00:18:31Z",
    "meta": {"snapshot.json": {"version": 1}},
    "spec_version": "1.0.28",
    "version": 1
  }
}
```
Timestamp v2:
```
$ cat /var/folders/pr/b0xyysh907s7mvs3wxv7vvb80000gp/T/tmpzvr5xah_/test_expired_metadata/2/timestamp.json
{
  "signatures": [{...}],
  "signed": {
    "_type": "timestamp",
    "expires": "2022-04-13T00:18:31Z",
    "meta": {"snapshot.json": {"version": 2}},
    "spec_version": "1.0.28",
    "version": 2
  }
}
```
As you can see, the first date is 30 Mar and the second is 13 Apr, exactly 14 days later. This is a great way to observe what the tests really do and check whether they do it successfully.
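That 14-day gap can also be checked mechanically. A small snippet parsing the two `expires` values from the dumps above:

```python
from datetime import datetime, timedelta

# The two "expires" values from timestamp v1 and v2 shown above
v1 = datetime.strptime("2022-03-30T00:18:31Z", "%Y-%m-%dT%H:%M:%SZ")
v2 = datetime.strptime("2022-04-13T00:18:31Z", "%Y-%m-%dT%H:%M:%SZ")

delta = v2 - v1  # should be exactly two weeks
```

This is handy when eyeballing dates across a month boundary is error-prone.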
Now, let’s take a closer look at two edge cases, using the capabilities RepositorySimulator provides:
Imagine that we have performed an update and stored metadata in a local cache, and the locally stored timestamp/snapshot has since expired. To perform an update from the remote we still need to verify signatures, and for that we must use the expired timestamp.
We can play with versions and expiry to verify that this scenario, not explicitly mentioned in the spec, works correctly and safely. Using the simulator, we can do the following:
- perform an initial update with updater.refresh()
- call refresh (with an updater.refresh() call) with the expired locally cached timestamp

This is a not-so-obvious use case to keep in mind when thinking about updates. You can see how it looks in practice in the reference implementation.
Now let’s see whether rollback attack protection still works when the local timestamp has expired. In this case we need at least two timestamp and snapshot versions, an expired older version of timestamp, and a verification that the rollback check is performed with the old version.
For a timestamp rollback, the case is pretty similar to the use of expired metadata. We can do the following:
- updater.refresh() on the very first day
- updater.refresh() somewhere between day 7 and day 21

A similar approach can be used when testing both timestamp and snapshot rollback protection. We just need to guarantee that after the last snapshot update, the snapshot version is not the latest, in order to verify that a rollback check is performed both with an expired timestamp and an older snapshot. Sounds complicated, but it’s pretty easy with the simulator, and this example illustrates it well.
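The core of a rollback check is simple: a newly fetched version number must never be lower than the trusted one, and the check must run even when the trusted copy has expired. An illustrative stand-alone check (not python-tuf’s actual code):

```python
def check_rollback(trusted_version: int, new_version: int) -> None:
    """Reject metadata whose version is lower than the trusted copy's.

    An expired local timestamp still anchors rollback protection: its
    expiry makes it unusable as fresh metadata, but its version number
    remains the floor that new metadata must meet.
    """
    if new_version < trusted_version:
        raise ValueError(
            f"rollback detected: new version {new_version} "
            f"< trusted version {trusted_version}"
        )
```

The tests above exercise exactly this: they arrange for an expired trusted timestamp and then assert that a lower remote version is rejected.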
One of the great things about a reference implementation is that one can learn a lot about the TUF specification by looking at the tests, which are full of examples that would hardly come to mind when reading the abstract, straightforward workflow explained in the spec. And those tests most likely do not cover everything…
Do you have a comment about the TUF spec or the cited examples? An idea? Please share it with us!
We recently wrote a new client, ngclient, in Python-TUF. This post explains why we ended up doing that when a client already existed.
The legacy code had a few problems that could be summarized as non-optimal abstractions: significant effort had been put into code reuse, but not enough attention had been paid to ensuring that the expectations and promises of that shared code were the same in all cases of reuse. Combined with Python’s type ambiguity, the use of dictionaries as “blob”-like data structures, and extensive use of global state, this meant touching the shared functions was a gamble: there was no way to be sure something wouldn’t break.
During the redesign, we really concentrated on finding abstractions that fit the processes we wanted to implement. It may be worth mentioning that in some cases this meant abstractions that have no equivalent in the TUF specification: some of the issues in the legacy implementation look like the result of mapping the TUF specification’s detailed client workflow directly into code.
Here are the core abstractions we ended up with (number of lines of code in parenthesis to provide a bit of context, alongside links to sources and docs):
- Metadata (900 SLOC, docs) handles everything related to individual pieces of TUF metadata: deserialization, signing, and verifying
- TrustedMetadataSet (170 SLOC) is a collection of local, trusted metadata. It defines rules for how new metadata can be added into the set and ensures that metadata in it is always consistent and valid: as an example, if TrustedMetadataSet contains a targets metadata, the set guarantees that the targets metadata is signed by trusted keys and is part of a currently valid TUF snapshot
- Updater (250 SLOC, docs) makes decisions on what metadata should be loaded into TrustedMetadataSet, both from the local cache and from a remote repository. While TrustedMetadataSet always raises an exception if a metadata is not valid, Updater considers the context and handles some failures as a part of the process and some as actual errors. Updater also handles persisting validated metadata and targets onto local storage and provides the user-facing API
- FetcherInterface (100 SLOC, docs) is the abstract file downloader. By default, a Requests-based implementation is used, but clients can use custom fetchers to tweak how downloads are done

No design is perfect, but so far we’re quite happy with the above split. It has dramatically simplified the implementation: the code is subjectively easier to understand, but it also has significantly lower code branching counts for the same operations.
A year ago we added TUF support into pip as a prototype: this revealed some design issues that made the integration more difficult than it needed to be. As the potential pip integration is a goal for Python-TUF we wanted to smooth those rough edges.
The main addition here was the FetcherInterface: it allows pip to keep doing all of the HTTP tweaks they have collected over the years.
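The pattern here is the classic abstract downloader: the client depends only on a small interface, and integrators supply their own transport. A stand-alone sketch of the idea (mimicking, not reproducing, python-tuf’s FetcherInterface; the class names here are invented for illustration):

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class Fetcher(ABC):
    """Abstract downloader: the updater only ever sees this interface."""

    @abstractmethod
    def fetch(self, url: str) -> Iterator[bytes]:
        """Yield the file at url as chunks of bytes."""


class InMemoryFetcher(Fetcher):
    """Test double serving canned responses. An integrator like pip could
    instead plug in an implementation carrying its accumulated HTTP tweaks
    (proxies, retries, certificates) without touching the updater."""

    def __init__(self, files: Dict[str, bytes]):
        self._files = files

    def fetch(self, url: str) -> Iterator[bytes]:
        yield self._files[url]
```

The same inversion is what makes RepositorySimulator-style testing possible: tests hand the updater a fetcher that never touches the network.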
There were a bunch of smaller API tweaks as well: as an example, legacy Python-TUF had not anticipated downloading target files from a different host than it downloads metadata from. This is the design that PyPI uses with pypi.org and files.pythonhosted.org.
Since we knew we had to break API with the legacy implementation anyway, we also fixed multiple paper cuts in the API:
In addition to the big-ticket items, the rewrite allowed loads of improvements in project engineering practices. Some highlights:
These are not ngclient features as such but we expect they will show in the quality of products built with it.
Python-TUF is the reference implementation of The Update Framework specification, an open source framework for securing content delivery and updates. It protects against various types of supply chain attacks and provides resilience to compromise.
For the past 7 releases the project has introduced new designs and implementations, which have gradually formed two new stable APIs:
- ngclient: A client API that offers a robust internal design providing implementation safety and flexibility to application developers.
- Metadata API: A low-level interface for both consuming and creating TUF metadata. Metadata API is a flexible and easy-to-use building block for any higher level tool or library.

Python-TUF 1.0.0 is the result of a comprehensive rewrite of the project, removing several hard to maintain modules and replacing them with safer and easier to use APIs:
With this foundation laid, Python-TUF developers are currently planning next steps. At the very least, you can expect improved repository-side tooling, but we’re also open to new ideas. Pop into #tuf on CNCF Slack or GitHub issues and let’s talk.