Introduction

This article explores Google Cloud Workload Identity in depth, covering how it works internally and which features matter most. We explain key concepts such as workload identity pools and providers, along with attribute mappings and attribute conditions, which play a fundamental role in precise identity management. We also look at service account impersonation: how it lets an external identity act as a Google Cloud service account, and how attribute conditions restrict who may do so.

We then walk through GitLab CI’s OpenID Connect (OIDC) integration and show how it secures interactions between GitLab CI/CD pipelines and Google Cloud resources. A step-by-step walkthrough with demo code gives you hands-on experience implementing the integration. By the end, you will understand how Workload Identity and GitLab CI’s OIDC fit together and be able to apply them in real-world cloud-native deployments.

Risks of Storing Google Cloud Service Account Keys in GitLab CI Variables

When working with cloud services like Google Cloud, a common practice is to authenticate applications and services with service account keys. These keys are long-lived credentials, and one of the most pressing concerns is that they leak to attackers. Storing them in GitLab CI variables does not offer sufficient protection, and it makes it hard to restrict or monitor who can access and modify those credentials.

Workload Identity concepts

Workload Identity (called Workload Identity Federation in Google Cloud’s documentation) is a mechanism that lets workloads running outside Google Cloud, such as GitLab CI jobs, authenticate to Google Cloud services. It eliminates the need for long-lived credentials (service account keys), which improves the overall security posture. Workload Identity relies on service accounts and grants external identities the right to impersonate them. Instead of a key, Google Cloud issues a short-lived token (valid for 1 hour by default) each time credentials are exchanged.


Graphic illustration showing GitLab's integration with Google Cloud, indicating a preference for short-lived tokens with a green check mark, and a red cross against long-lived Service Account Keys. The image features the GitLab logo on the left, pointing towards a cloud graphic with icons for Cloud SQL and Google Cloud on the right


Workload Identity Pools and Providers

In Google Cloud, a Workload Identity Pool is a container for external identities, that is, identities issued by systems outside Google Cloud such as GitLab or another OIDC-compatible identity provider. Service accounts do not live inside the pool; instead, IAM bindings connect identities in the pool to Google Cloud service accounts. Each pool belongs to a specific Google Cloud project.

A Workload Identity Provider, on the other hand, is the configuration that tells Google Cloud how to verify the identities of external workloads and how to translate their credentials into attributes that Google Cloud understands. It specifies which external identity provider to trust (for example, GitLab’s OIDC issuer) and the rules for accepting its tokens. Providers make sure that only the right external workloads can reach Google Cloud resources through the right service accounts, which makes access easier to manage and reduces the need for long-lived service account keys.


Diagram illustrating the integration between GitLab and Google Cloud. GitLab is depicted with an Identity Provider (IdP) and Workloads inside a green-bordered rectangle. Google Cloud is shown in a purple-bordered rectangle with Workload Identity Pool, Security Token Services (STS), Service Account, and Google Cloud Resources. The flow between these components is implied but not explicitly drawn


In essence, Workload Identity Pools are the containers where you organize and group external identities, while Workload Identity Providers are the configurations that specify how Google Cloud authenticates those identities and maps their attributes. Together, they enable secure identity management and access control for Google Cloud resources: the provider defines the authentication process, and the pool is the target of that authentication.

Attribute Mappings

Attribute mappings are rules that determine how attributes from an external identity source (such as claims in an OIDC token) translate to attributes within Google Cloud. You can map to the following Google Cloud attributes:

  • google.subject: (Required) It must be mapped so that Google Cloud knows which subject is making the request.

  • google.groups: (Optional) You can use this value to grant access to resources based on group membership. Like google.subject, it won’t impact our architecture unless we use the group to filter who has access to our service account.

  • attribute.NAME: (Optional) You can define up to 50 custom attributes to map any claim provided by the external application (GitLab) to a Google Cloud attribute.
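
To make the mapping idea concrete, here is a minimal Python sketch of what an attribute mapping does conceptually. It is not how Google Cloud implements mappings (the real service evaluates them as expressions on its side); the apply_mapping helper and the sample claim values are hypothetical, and only the simple assertion.NAME form is handled.

```python
# Conceptual model only: Google Cloud evaluates mappings server-side.
# This sketch handles just the simple "assertion.<claim>" form.


def apply_mapping(mapping: dict, claims: dict) -> dict:
    """Return the Google-side attributes derived from the token claims."""
    if "google.subject" not in mapping:
        raise ValueError("google.subject must be mapped")
    attributes = {}
    for target, source in mapping.items():
        prefix, _, claim = source.partition(".")
        if prefix != "assertion" or not claim:
            raise ValueError(f"unsupported source expression: {source}")
        if claim in claims:
            attributes[target] = claims[claim]
    return attributes


# Hypothetical claims from a GitLab CI ID token.
claims = {
    "sub": "project_path:my-group/my-repo:ref_type:branch:ref:main",
    "namespace_id": "112233",
    "project_id": "445566",
}
mapping = {
    "google.subject": "assertion.sub",
    "attribute.namespace_id": "assertion.namespace_id",
    "attribute.project_id": "assertion.project_id",
}

mapped = apply_mapping(mapping, claims)
print(mapped["attribute.project_id"])  # 445566
```

The mapped attributes, not the raw claims, are what IAM bindings refer to later on.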

Attribute conditions

An attribute condition is a rule that determines whether a credential is accepted, based on the attributes carried by the request. Google Cloud evaluates the condition, written as a Common Expression Language (CEL) expression, against those attributes and rejects any credential for which the condition is false. For example, you might allow access only to workloads with a specific attribute value, such as “namespace_id=sandbox” or “gitlab_project_id=112233”. Conditions operate on attributes defined by mappings such as the following:

  • google.subject (on Google) = assertion.sub: where assertion.sub is a claim provided by the identity provider during authentication. It identifies the subject (the user or service) being authenticated.

  • attribute.NAME = assertion.NAME: a custom Google Cloud attribute populated from the claim of the same name.
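
As an illustration, the following Python sketch mimics a condition that accepts only tokens from one GitLab group. Real attribute conditions are CEL expressions evaluated by Google Cloud, not Python; the namespace_condition helper and the ID values are hypothetical.

```python
def namespace_condition(allowed_namespace_id: str):
    """Build a predicate that mirrors the CEL condition
    assertion.namespace_id == "<allowed_namespace_id>"."""

    def condition(assertion: dict) -> bool:
        return assertion.get("namespace_id") == allowed_namespace_id

    return condition


accept = namespace_condition("112233")

# A token from the allowed GitLab group passes the condition.
print(accept({"namespace_id": "112233", "project_id": "445566"}))  # True

# A token from any other group is rejected before any IAM check happens.
print(accept({"namespace_id": "999999", "project_id": "445566"}))  # False
```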

The diagram below displays authentication steps between GitLab and Google Cloud.


Flowchart of GitLab to Google Cloud authentication sequence, with GitLab's Identity Provider and Workloads on the left, connected to Google Cloud's Identity Pool, STS, Service Account, and Resources on the right, numbered to show the flow of operations


  1. The workload (application) authenticates to the Identity Provider.

  2. The Identity Provider issues credentials.

  3. Workload passes credentials to Google Cloud Security Token Service (STS).

  4. STS checks the validity of credentials using information registered in the Workload Identity Pool.

  5. The Workload Identity Pool tells STS whether the credentials are valid.

  6. STS issues a temporary (short-lived) token that can be used to impersonate a service account.

  7. The workload impersonates a service account using the temporary token.

  8. The workload accesses Google Cloud resources.
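
Step 3 can be sketched as follows. This Python snippet only builds the form body of a token exchange request for the STS endpoint and sends nothing over the network. The field names follow the OAuth 2.0 Token Exchange convention; the project number and the token string are placeholders, and build_sts_exchange_body is a hypothetical helper.

```python
from urllib.parse import urlencode

STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_body(oidc_token: str, provider_resource: str) -> dict:
    """Build the form fields that exchange an external OIDC token
    for a federated Google Cloud access token."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        # The audience is the full resource name of the workload identity provider.
        "audience": f"//iam.googleapis.com/{provider_resource}",
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": oidc_token,
    }


# Placeholder values: replace with your own project number, pool, and provider.
provider = (
    "projects/123456789/locations/global/"
    "workloadIdentityPools/gitlab-pool/providers/main-provider"
)
body = build_sts_exchange_body("<gitlab-oidc-token>", provider)
encoded = urlencode(body)

print(STS_TOKEN_URL)
print(encoded)
```

In practice you rarely build this request by hand, because gcloud and the Google client libraries perform the exchange for you from a credential configuration file.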

Service Account

Steps 3 to 6 above form a token exchange flow that returns a federated access token. In step 7, this federated access token is used to impersonate the service account and obtain a short-lived OAuth 2.0 access token. The short-lived token obtained through impersonation carries the privileges and permissions of the impersonated service account. This means that the external identity can perform actions and make API calls within Google Cloud with the same access rights as the service account.
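
The impersonation call can be sketched in the same spirit. The Python snippet below assembles the URL, headers, and JSON body for the IAM Credentials generateAccessToken method without sending the request. The service account email and the token string are placeholders, and build_impersonation_request is a hypothetical helper.

```python
def build_impersonation_request(
    service_account_email: str,
    federated_token: str,
    lifetime_seconds: int = 3600,
):
    """Return (url, headers, body) for a generateAccessToken call."""
    url = (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{service_account_email}:generateAccessToken"
    )
    headers = {
        # The federated token from STS authenticates the impersonation call.
        "Authorization": f"Bearer {federated_token}",
        "Content-Type": "application/json",
    }
    body = {
        "scope": ["https://www.googleapis.com/auth/cloud-platform"],
        "lifetime": f"{lifetime_seconds}s",
    }
    return url, headers, body


# Placeholder values for illustration only.
url, headers, body = build_impersonation_request(
    "sa-gitlab-runner@my-project.iam.gserviceaccount.com",
    "<federated-access-token>",
)
print(url)
print(body["lifetime"])  # 3600s
```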

To enable service account impersonation, you need to grant the external identity the “Workload Identity User” role (roles/iam.workloadIdentityUser) on a specific service account. When adding this role, you specify the member that receives the right to impersonate the service account. The member format depends on the attribute you want to match:

+----------------+---------------------+-------------------------------------------------------------+
| Attribute      | Role binding to set | Use Case                                                    |
+----------------+---------------------+-------------------------------------------------------------+
| google.subject | principal://        | I want to grant it to a unique user.                        |
| google.groups  | principalSet://     | I want to grant access to all members of the group.         |
| attribute.NAME | principalSet://     | I want to grant access to all IDs with specific attributes. |
+----------------+---------------------+-------------------------------------------------------------+

In our case, attribute.NAME will be used to restrict access (more details in the “implementation” section below).
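
The three member formats from the table can be spelled out with a small Python sketch. The helper names and the sample IDs are hypothetical; the identifier layout follows the principal:// and principalSet:// formats used for workload identity pools.

```python
def _pool_path(project_number: str, pool_id: str) -> str:
    """Common prefix shared by every member identifier of a pool."""
    return (
        f"iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
    )


def subject_member(project_number: str, pool_id: str, subject: str) -> str:
    """Single identity, matched on google.subject."""
    return f"principal://{_pool_path(project_number, pool_id)}/subject/{subject}"


def group_member(project_number: str, pool_id: str, group: str) -> str:
    """All identities in a group, matched on google.groups."""
    return f"principalSet://{_pool_path(project_number, pool_id)}/group/{group}"


def attribute_member(project_number: str, pool_id: str, name: str, value: str) -> str:
    """All identities with a given custom attribute value."""
    return (
        f"principalSet://{_pool_path(project_number, pool_id)}"
        f"/attribute.{name}/{value}"
    )


# Placeholder project number and GitLab project ID.
member = attribute_member("123456789", "gitlab-pool", "project_id", "445566")
print(member)
```

The attribute_member form is the one used in the implementation section, with project_id as the attribute name.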

GitLab CI OIDC

Earlier, we learned about Google Cloud Workload Identity, which helps manage identities and access to Google Cloud services. Now, let’s explore how GitLab CI OIDC joins the mix.

GitLab’s OpenID Connect (OIDC) integration is like a bridge that helps GitLab communicate securely with external systems such as Google Cloud. OIDC is a method for verifying the identity of users and services. When a GitLab job needs to access resources in Google Cloud, it presents an OIDC token, and Workload Identity uses that token to make sure only the right jobs get in. In a nutshell, GitLab OIDC is the trusted pathway that ensures everything accessing Google Cloud through Workload Identity is authenticated and authorized.

The diagram below shows how GitLab communicates with Google Cloud Workload Identity:


Flowchart outlining the CI/CD authentication flow between GitLab and Google Cloud. It shows the process from a CI/CD job issuing an OIDC Token in GitLab, to Google Cloud creating roles and validating tokens, and finally returning a temporary credential for GitLab operations


  1. GitLab CI Job Initialization: Within a GitLab CI job, the process begins by issuing an OIDC token. The token’s audience (aud) claim is set to https://gitlab.com, and this value must match the allowed audiences configured on the Workload Identity Provider.

  2. Google Cloud OIDC Identity Provider and Conditional Roles: In Google Cloud, you have set up an OIDC Identity Provider and configured conditional roles. These configurations include attribute mappings and attribute conditions, which define how attributes from the OIDC token should be used to determine access rights.

  3. GitLab calls Google Cloud: Within the CI job, GitLab runs the gcloud iam workload-identity-pools create-cred-config command to generate a credential configuration, which gcloud then uses when calling Google Cloud. The job supplies the following information:
    a) The OIDC token generated in the earlier step.
    b) The service account email, specifying which Google Cloud service account should be used for the subsequent Google Cloud operations.
    c) The name of the Google Cloud Workload Identity Provider that has been set up.

  4. Google Cloud Validation: Google Cloud receives the API call from GitLab and proceeds to validate the information provided. This includes checking the OIDC token for its authenticity and ensuring that the service account email and Workload Identity Provider name match the configured conditions and mappings.

  5. Short-Lived Token Generation: Upon successful validation, Google Cloud generates a short-lived OAuth 2.0 access token. This token is securely generated based on the provided OIDC token and the specified service account.

  6. Token Sent Back to GitLab: Google Cloud sends the short-lived access token back to GitLab as a response to the API call made in step 3.

  7. GitLab Gains Google Cloud Access: GitLab receives the short-lived token and utilizes it to perform Google Cloud operations securely. This token grants GitLab the permissions and access rights associated with the specified Google Cloud service account, as determined by the attribute conditions and mappings.
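
When debugging this flow, it helps to look at the claims inside the OIDC token from step 1. The Python sketch below decodes a JWT payload without verifying the signature, so it is suitable for inspection only and never for authorization decisions. The sample token is built locally with hypothetical claim values, and its signature part is a dummy string.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated parts")
    payload = parts[1]
    # Restore the base64 padding that JWTs strip.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict the way JWT segments are encoded (test helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Locally built sample token with hypothetical claims and a dummy signature.
sample_token = ".".join(
    [
        _b64url({"alg": "RS256", "typ": "JWT"}),
        _b64url(
            {
                "iss": "https://gitlab.com",
                "aud": "https://gitlab.com",
                "sub": "project_path:my-group/my-repo:ref_type:branch:ref:main",
                "namespace_id": "112233",
                "project_id": "445566",
            }
        ),
        "dummy-signature",
    ]
)

token_claims = decode_jwt_payload(sample_token)
print(token_claims["aud"])  # https://gitlab.com
print(token_claims["namespace_id"])  # 112233
```

Comparing these claims with the attribute mapping and attribute condition is usually the fastest way to find out why a token was rejected.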

Implementation

It is time to roll up our sleeves and dive into the nitty gritty of making GitLab CI and Google Cloud Workload Identity play nice together.

At Astrafy, we have implemented a robust security framework with two layers of defense. Think of it as our digital fortress. The first layer is all about controlling who gets in. We’ve set up Workload Identity attribute conditions right on our GitLab organization (GitLab group). This means that only GitLab identities originating from that specific organization are granted access to our Workload Identity Pool. It’s like having a VIP list for our cloud resources.

But wait, there’s more! We didn’t stop there. That’s where our second layer of defense kicks in. Once an identity makes it through the first checkpoint, they’re handed a token by our Workload Identity Pool. But we don’t just let them roam freely. We’ve set up restrictions that specify which GitLab repository, or any other claim we choose, can impersonate which Google Cloud Service Account. It’s like giving out keys, but only to the rooms they’re supposed to be in. Double the protection, double the peace of mind!


Diagram showing integration between GitLab's CI and OIDC Provider with Google Cloud's STS and Workload Identity Pool, detailing the authentication and authorization flow, including security layers and restrictions for accessing Google Cloud Resources.


Assume that you have configured Terraform with your Google Cloud environment. Here is a snippet of our Terraform configuration for setting up a Google Cloud Workload Identity Pool and Provider.

resource "google_iam_workload_identity_pool" "this" {
  workload_identity_pool_id = "gitlab-pool"
  display_name              = "GitLab pool"
  description               = "Workload identity pool for GitLab"
}

resource "google_iam_workload_identity_pool_provider" "this" {
  workload_identity_pool_id          = google_iam_workload_identity_pool.this.workload_identity_pool_id
  workload_identity_pool_provider_id = "main-provider"
  display_name                       = "GitLab main provider"

  attribute_mapping = {
    "google.subject"         = "assertion.sub"
    "attribute.namespace_id" = "assertion.namespace_id"
    "attribute.project_id"   = "assertion.project_id"
  }

  attribute_condition = "assertion.namespace_id == \"your_gitlab_group_id\""

  oidc {
    issuer_uri        = "https://gitlab.com"
    allowed_audiences = ["https://gitlab.com"]
  }
}

Here, the ‘attribute_condition’ parameter is the critical component of our first layer of defense, as described in the architecture above. The ‘namespace_id’ claim corresponds to the ID of the GitLab group whose repositories authenticate through Google Cloud Workload Identity. The condition ensures that only jobs from that GitLab group are granted access to our resources.

We then create a service account and grant the “Workload Identity User” role on it (this is mandatory for impersonation). The IAM member points to a specific principal set, which indicates which GitLab repository is allowed to impersonate the service account.

resource "google_service_account" "sa_gitlab_runner" {
  project    = var.gke_project_id
  # Service account IDs may contain only lowercase letters, digits, and hyphens.
  account_id = "sa-gitlab-runner"
}

resource "google_service_account_iam_member" "iam_sa_gitlab_runner" {
  service_account_id = google_service_account.sa_gitlab_runner.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/attribute.project_id/your_gitlab_repo_id"
}

Replace PROJECT_NUMBER and POOL_ID with the Google Cloud project number and the ID of the workload identity pool you set up for Workload Identity Federation. Replace your_gitlab_repo_id with the ID of the GitLab repository that uses Google Cloud Workload Identity.

In our second layer of defense, we have established precise rules. These rules determine which GitLab repository, identified here by ‘attribute.project_id/your_gitlab_repo_id’ (the GitLab repository ID), is authorized to act as a specific Google Cloud service account. You can also choose other GitLab claims based on your architecture and design; see the list of claims in GitLab’s OIDC documentation.

One of our use cases with GitLab CI is deploying Cloud Functions. Here is a GitLab CI pipeline that achieves this:

deploy_cloud_function:
  stage: deploy
  image: google/cloud-sdk:alpine
  id_tokens:
    WORKLOAD_IDENTITY_TOKEN:
      aud: https://gitlab.com
  before_script:
    # Write the OIDC token issued by GitLab to a file that gcloud can read.
    - echo ${WORKLOAD_IDENTITY_TOKEN} > ${CI_PROJECT_DIR}/.ci_job_jwt_file
    # Generate a credential configuration for Workload Identity.
    - gcloud iam workload-identity-pools create-cred-config ${GCP_WORKLOAD_IDENTITY_PROVIDER}
      --service-account="${GCP_SERVICE_ACCOUNT}"
      --output-file=/tmp/.gcp_temp_cred.json
      --credential-source-file=${CI_PROJECT_DIR}/.ci_job_jwt_file
    - gcloud auth login --cred-file=/tmp/.gcp_temp_cred.json
    - export GOOGLE_APPLICATION_CREDENTIALS=/tmp/.gcp_temp_cred.json
  script:
    - *deploy_cf

*deploy_cf is a YAML alias, defined elsewhere in the pipeline file, that contains the gcloud commands to deploy Cloud Functions.

Please note that GCP_WORKLOAD_IDENTITY_PROVIDER should be in the following format (the location of a workload identity pool is always global):

projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$WORKLOAD_POOL_ID/providers/$PROVIDER_ID
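
For reference, the file written by create-cred-config is a small JSON document. The Python sketch below reproduces its approximate shape so you know what to expect in /tmp/.gcp_temp_cred.json. Treat it as an approximation: the exact set of fields can vary with the gcloud version, the project number and service account email are placeholders, and build_credential_config is a hypothetical helper.

```python
import json


def build_credential_config(
    provider_resource: str,
    service_account_email: str,
    token_file: str,
) -> dict:
    """Approximate the external-account credential configuration."""
    return {
        "type": "external_account",
        "audience": f"//iam.googleapis.com/{provider_resource}",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        "service_account_impersonation_url": (
            "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
            f"{service_account_email}:generateAccessToken"
        ),
        # The file holding the OIDC token written by the CI job.
        "credential_source": {"file": token_file},
    }


# Placeholder values for illustration only.
config = build_credential_config(
    "projects/123456789/locations/global/"
    "workloadIdentityPools/gitlab-pool/providers/main-provider",
    "sa-gitlab-runner@my-project.iam.gserviceaccount.com",
    "/builds/my-group/my-repo/.ci_job_jwt_file",
)
print(json.dumps(config, indent=2))
```

Note that the file contains no secret: it only tells the client where to find the OIDC token and which endpoints to call.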

We trigger the GitLab CI pipeline, and here is the result:


A terminal screenshot showing a series of commands used to authenticate with Google Cloud using workload identity federation from GitLab. It includes setting environment variables, creating credentials with 'gcloud iam', and listing authenticated accounts confirming a service account is active.


As you can see, the GitLab CI/CD job is properly authenticated and authorized as the service account created in the steps above, using a short-lived token received from Google Cloud.

Conclusion

Using Google Cloud Workload Identity to authenticate in GitLab CI pipelines has been a security game changer for our company. We have removed the worry of credential leaks in all our GitLab repositories and have simplified the authorization process of our GitLab CI pipelines. One should always strive for keyless authentication, as keys require quite some maintenance (rotation, audits, etc.) and are one of the main sources of attacks on cloud environments.


If you enjoyed reading this article, stay tuned as we regularly publish technical articles on Google Cloud and how best to secure it. Follow Astrafy on LinkedIn to be notified of the next article ;).

If you are looking for support on Data Stack or Google Cloud solutions, feel free to reach out to us at sales@astrafy.io.