Secure your data in BigQuery and Cloud Storage to prevent data leakages, data exfiltration, and risk from compromised accounts.

Introduction

Security around data is more important than ever, with data volumes and data regulations both growing constantly. It cannot be treated as a second-class citizen to be tackled at some point once the data engineering and analytics engineering work is done. It has to be taken into account from day 0 on any data project. The risks and consequences of a data leak are simply too big to be put aside.

If you are fortunate enough to be working on Google Cloud for your data, you can leverage several layers of security around it. The classic one is IAM bindings on your user, group, and service account identities. This security layer enforces access management based on identities, but it puts no constraints on the networking layer. This is where VPC Service Controls comes into play by securing your data at the networking layer. This article dives deep into the specifics of VPC Service Controls and why you should put this security layer in place from the very beginning for Google Cloud data products such as BigQuery and Google Cloud Storage (GCS).

By the end of this article, you will know how to implement VPC Service Controls in order to:

  • Prevent anyone from accessing BigQuery and GCS based on access control criteria, even if that user has the necessary IAM roles

  • Minimize the risk of people accessing your data with stolen credentials

  • Support regulatory compliance from standards such as GDPR and HIPAA

  • Secure communication between data products on different Google Cloud projects for a zero-trust architecture

  • Restrict access to development environments to prevent accidental deployment of sensitive code

  • Provide an extra layer of security by denying access from unauthorized networks, even if the data is exposed by misconfigured IAM policies

The following illustration taken from Google Cloud gives you an overall idea of how VPC Service Controls work:

A network security diagram highlighting VPC Service Controls. It shows allowed connections between trusted devices and internal resources like BigQuery and Storage, within a corporate network. It also illustrates blocked access attempts from unauthorized networks and partner projects to these resources, using red dashed lines


Disclaimer: VPC Service Controls is a global product that can be activated on a multitude of Google Cloud products. This article focuses on BigQuery and GCS, as those two services are the ones you want to protect first when working with data on Google Cloud. The concepts explained apply equally to other Google Cloud products.

Key Components of VPC Service Controls

The goal is obviously not to rewrite the Google Cloud documentation, but we need to put the different concepts used by VPC Service Controls into context in order to best grasp the use cases and mechanisms described later in this article.

Access Context Manager

This analogy from the Google Cloud documentation explains Access Context Manager perfectly in simple words:

Many companies rely on a perimeter security model — for example, firewalls — to secure internal resources. This model is similar to a medieval castle: a fortress with thick walls, surrounded by a moat, with a heavily guarded single point of entry and exit. Anything located outside the wall is considered dangerous. Anything inside is trusted.

Firewalls and the perimeter security model work well if there is a precise boundary around specific users and services. However, if a workforce is mobile, the variety of devices increases as users bring their own devices (BYOD) and utilize cloud-based services. This scenario results in additional attack vectors that are not considered by the perimeter model. The perimeter is no longer just the physical location of the enterprise, and what lies inside cannot be assumed as safe.

Access Context Manager lets you reduce the size of the privileged network and move to a model where endpoints do not carry ambient authority based on the network. Instead, you can grant access based on the context of the request, such as device type, user identity, and more, while still checking for corporate network access when necessary.

Access Context Manager thus lets you grant access based on the context of the request, such as device type, user identity, and IP address. Using access levels, you can start to organize tiers of trust.

It’s worth noting that Access Context Manager is an internal Google service used by VPC Service Controls. Therefore, there is no need to activate or create it in order to use VPC Service Controls (it is an integral part of the BeyondCorp product).

Access Policy

An access policy is an organization-wide container for access levels (which define the attributes required to use GCP services) and service perimeters (which define sets of services that can freely exchange data within a perimeter). As an image is worth a thousand words, this illustration should make things clear:

Illustrated diagram outlining an access policy structure. The top section is labeled 'Access Levels' and shows two cards: 'Access Level 1' and 'Access Level 2,' both listing criteria like IP Subnetworks and Geographic locations. Below, 'Security Perimeters' are displayed with two areas labeled 'Security Perimeter 1' and 'Security Perimeter 2,' each containing Google Cloud projects marked 'x' and 'y' with icons indicating protected services.


An access policy is globally visible within an organization, and the restrictions it specifies apply to all projects within an organization unless specific scopes are selected. Scoped policies are access policies that are scoped to specific folders or projects alongside an access policy that you can apply to the entire organization. You can use scoped policies to delegate administration of VPC Service Controls perimeters and access levels to folder-level and project-level administrators.

Access Levels

An access level is a classification of requests over the internet based on attributes such as:

  • Source IP range

  • User

  • User device

  • Geolocation (region level)

  • Operating system

  • Minimum allowed OS version

An access level can combine several filters at the same time. You can choose an “OR” strategy, which allows requests satisfying at least one of the filtering criteria, or an “AND” strategy, which requires all filtering criteria to be satisfied for a request to be allowed into the perimeter.

As depicted in the illustration above, an access level must be mapped to an access policy. Those access levels can then be used in one or several service perimeters in order to define who is authorized to access the perimeter.

Other features exist in the “Premium” tier of Access Context Manager (see more details here) — the premium tier is linked to BeyondCorp Enterprise.

Service Perimeter

A service perimeter is a networking defense around Google Cloud projects or VPC networks.

It allows free communication within the perimeter and, by default, blocks all communication across the perimeter boundary. The following illustration depicts a simple service perimeter blocking all connections coming from outside the service perimeter. With that configuration, only connections originating from within the perimeter are authorized.

Diagram of a service perimeter in a cloud environment. It features two authorized projects within a striped service perimeter, with one project allowing secure data flow to the other, indicated by a green check mark. Red crosses mark the boundary, showing denied access from unauthorized clients, an unauthorized VPC project, and an unauthorized project attempting various connections, including a VM to Google Cloud Service and inter-service access


In a nutshell, a service perimeter consists of a set of Google Cloud projects together with a list of restricted services (this article focuses on restricting BigQuery and Cloud Storage). In its simplest form, a service perimeter blocks every connection coming from the outside world and every connection leaving the perimeter towards the outside world. It is very unlikely you will want such strict constraints applied to your perimeters, so you can allow connections from the outside world into the perimeter, and from the perimeter to the outside world, via:

  • Access levels: by attaching an access level to a service perimeter, the access rules from this access level will be applied to the service perimeter.

  • Ingress rules: You can define ingress rules within the definition of the service perimeter to configure at a granular level what connections to allow from outside the perimeter.

  • Egress rules: Similar to the ingress rules, you can define egress rules to define what connections to allow from within your security perimeter towards the outside world.

Rules defined in access levels, ingress rules, and egress rules are additive: you can define generic rules in access levels and then add more granular rules at the ingress and egress level of each of your service perimeters.

Service Perimeter Bridge

A perimeter bridge allows projects in different service perimeters to communicate with each other. Perimeter bridges are bidirectional, allowing projects from each service perimeter equal access within the scope of the bridge.

Regular service perimeters cannot overlap: a single Google Cloud project can only belong to one regular service perimeter. Service perimeter bridges can contain only Google Cloud projects as members, and a single Google Cloud project may belong to multiple bridges. The following diagram depicts a use case for a perimeter bridge where the landing zone for data sits in one service perimeter and a project consuming this source data sits in another service perimeter.

Diagram showing cloud network access layers. It depicts authorized IP/device access from the internet with a green checkmark, leading into a DMZ Perimeter labeled as lower trust. Inside, there's a 'Bridge' area connecting a 'Source Project' with an ID of 12345 to a 'Sink Project' with an ID of 67890, within a private perimeter of higher trust

It’s worth noting that Google recommends using ingress and egress rules instead of service perimeter bridges, since they provide the same results as a perimeter bridge but with more granular control.
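If you do need a bridge nonetheless, it can be declared with the service perimeter resource and the bridge perimeter type. The sketch below is a minimal, hypothetical example; the policy number and the two project numbers (taken from the diagram above) are placeholders:

resource "google_access_context_manager_service_perimeter" "source_to_sink_bridge" {
  parent         = "accessPolicies/<policy-number>"
  name           = "accessPolicies/<policy-number>/servicePerimeters/source_to_sink_bridge"
  title          = "source_to_sink_bridge"
  perimeter_type = "PERIMETER_TYPE_BRIDGE"

  status {
    # Projects already protected by the two regular perimeters that the bridge connects
    resources = [
      "projects/12345", # source project
      "projects/67890", # sink project
    ]
  }
}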

VPC Service Controls and IAM permissions

You need to see VPC Service Controls and IAM permissions as two layers of defense around your Google Cloud products.

By leveraging VPC Service Controls, your security is now managed at two levels:

  • Layer 1 — networking: Access is checked against rules defined by the security perimeters (i.e. access levels, ingress & egress rules).

  • Layer 2 — Classical IAM bindings: if access is authorized by layer 1, the end user still needs to be granted access through classical IAM permissions.

Illustration of a layered network security concept. A 'Request' starts on the left, passing through 'Layer 1', represented by a brick wall labeled 'VPC Service Controls'. Next, it moves to 'Layer 2', another brick wall labeled 'IAM permissions', before proceeding to the right towards further security measures, indicated by an ellipsis and a security icon


VPC Service Controls act as a robust perimeter defense mechanism that secures your network at the infrastructure level. This network-level security is crucial because it provides the first layer of defense, ensuring that unauthorized network requests, regardless of the requester’s permissions, are blocked before they reach cloud resources.

On the other hand, Google Cloud IAM (Identity and Access Management) bindings function as the second layer of defense. After VPC Service Controls have verified that a request is coming from an allowed source and can pass through the network perimeter, IAM bindings then govern what actions an authenticated identity (be it a user, service account, or group) can perform on specific Google Cloud resources. IAM provides fine-grained access control, ensuring that even if a request originates from a permitted network location, the requester must still have the appropriate IAM permissions to interact with the targeted resources.

Together, VPC Service Controls and IAM bindings offer a comprehensive security model. VPC Service Controls secure the environment at a macro level by controlling network access, while IAM bindings provide micro-level security by managing individual permissions. This layered approach ensures a robust defense-in-depth security posture, where both network-level controls and access permissions work in tandem to protect cloud resources and data from unauthorized access and potential security threats.
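To make this second layer concrete, here is a minimal, hypothetical sketch of a classical IAM binding granting read access on a single BigQuery dataset; the project, dataset, and member are placeholders. Even a request that passes the VPC Service Controls checks is denied without a binding like this one:

# Layer 2: classical IAM binding, granting read-only access on one dataset
# (project, dataset, and member below are hypothetical placeholders)
resource "google_bigquery_dataset_iam_member" "analyst_reader" {
  project    = "<project-id>"
  dataset_id = "<dataset-id>"
  role       = "roles/bigquery.dataViewer"
  member     = "user:<user-email>"
}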

Implementation with Terraform

Done with the theory. Let’s now dive into how to implement these VPC Service Controls resources. First things first: always use an “Infrastructure as Code” tool (such as Terraform) to deploy these sensitive resources. The benefits are numerous, and you will also move faster than via the UI. It’s also important not to reinvent the wheel: Google provides official Terraform modules to deploy these resources.

Deploying VPC Service Controls may seem daunting, as you might lock yourself out of your own project, so we highly recommend testing it on isolated sandbox projects before deploying it to your production projects. You can also enable the “dry run” mode for service perimeters: it will not block requests that violate the service perimeter but only log them for you to review.

Resources to deploy:

→ Access policy: it all starts with the access policy, which is the parent container for access levels and service perimeters. This resource is quite simple: you normally define one at the organization level, and if your organization is more complex, you can scope additional policies at the folder or project level for more granular management.

Terraform code snippet:

resource "google_access_context_manager_access_policy" "access-policy" { 
 parent = "organizations/123456789" 
 title = "Org Access Policy" 
}
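If you want the folder- or project-scoped policies mentioned above, the same resource also accepts a scopes argument (to the best of our knowledge, currently limited to a single folder or project per policy). A minimal sketch, with a hypothetical folder ID:

resource "google_access_context_manager_access_policy" "scoped_policy_data" {
  parent = "organizations/123456789"
  title  = "Data Folder Access Policy"
  # Hypothetical folder ID: this policy only applies within that folder
  scopes = ["folders/987654321"]
}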

→ Access Level: this resource is a bit more complex, but it is simplified by the Google Cloud Terraform module. You basically define a set of rules allowing access; the resulting access level can then be referenced in the configuration of a service perimeter.

Terraform code snippet:

module "access_level_members" {
  source         = "terraform-google-modules/vpc-service-controls/google//modules/access_level"
  policy      = google_access_context_manager_access_policy.access_policy_main.name
  name        = "terraform_members"
  members = ["serviceAccount:<service-account-email>", "user:<user-email>"]
  ip_subnetworks = ["192.168.1.1/28"]
  regions = ["BE", "FR"]
}

More granular constraints can be set (see documentation on custom access level) but the constraints around members, IP subnetworks and regions should normally fulfill your security use cases.
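If you need the “OR” versus “AND” strategy described earlier, you can also declare the access level directly with the raw provider resource and an explicit combining function. The sketch below is a minimal, hypothetical example (the IP range, identity, and regions are placeholders) that allows a request satisfying either condition:

resource "google_access_context_manager_access_level" "trusted_requests" {
  parent = "accessPolicies/${google_access_context_manager_access_policy.access_policy_main.name}"
  name   = "accessPolicies/${google_access_context_manager_access_policy.access_policy_main.name}/accessLevels/trusted_requests"
  title  = "trusted_requests"

  basic {
    # "OR": a request only needs to satisfy one of the conditions below
    combining_function = "OR"

    # Condition 1: the request comes from the corporate IP range
    conditions {
      ip_subnetworks = ["192.168.1.0/24"]
    }

    # Condition 2: the request comes from an allowed identity in an allowed region
    conditions {
      members = ["user:<user-email>"]
      regions = ["BE", "FR"]
    }
  }
}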

→ Service Perimeter: this resource is the one that has the real impact on your end users and that puts the networking security layer in place. As with the access level, you should use the Google Cloud Terraform module to deploy service perimeters.

Terraform code snippet:

module "regular_service_perimeter_1" {
  source         = "terraform-google-modules/vpc-service-controls/google//modules/regular_service_perimeter"
  policy         = google_access_context_manager_access_policy.access_policy_main.name
  perimeter_name = "sandbox_perimeter"
  description    = "Perimeter around Sandbox projects"
  resources      = [LIST OF PROJECT NUMBERS]

  restricted_services = ["bigquery.googleapis.com", "storage.googleapis.com"]


  access_levels = [LIST OF ACCESS LEVELS]

  ingress_policies = [{
      "from" = {
        "sources" = {
          resources = [
            "projects/688789777678",
            "projects/557367936583"
          ],
        },
        "identity_type" = ""
        "identities"    = ["some_user_identity or service account"]
      }
      "to" = {
        "operations" = {
          "bigquery.googleapis.com" = {
            "methods" = [
              "BigQueryStorage.ReadRows",
              "TableService.ListTables"
            ],
            "permissions" = [
              "bigquery.jobs.get"
            ]
          }
          "storage.googleapis.com" = {
            "methods" = [
              "google.storage.objects.create"
            ]
          }
        }
      }
    },
  ]
  egress_policies = [{
       "from" = {
        "identity_type" = ""
        "identities"    = ["some_user_identity or service account"]
      },
       "to" = {
        "resources" = ["*"]
        "operations" = {
          "bigquery.googleapis.com" = {
            "methods" = [
              "BigQueryStorage.ReadRows",
              "TableService.ListTables"
            ],
            "permissions" = [
              "bigquery.jobs.get"
            ]
          }
          "storage.googleapis.com" = {
            "methods" = [
              "google.storage.objects.create"
            ]
          }
        }
      }
    },
  ]

  shared_resources = {
    all = [LIST OF PROJECT NUMBERS]
  }
}

This resource is extensive and might look overwhelming, but it’s actually quite simple. Here is an explanation of each important parameter:

  • resources: this list defines all the projects that are to be included in the service perimeter

  • restricted_services: this list defines the Google Cloud services to be protected by this service perimeter

  • access_levels: this list defines the different access levels that will be taken into account to define who can access the security perimeter from outside. Those access levels can then be supplemented by ingress policies.

  • ingress_policies: this list defines the different ingress rules for this service perimeter. Two main attributes define an ingress policy:

→ From: what identities or projects can access the service perimeter

→ To: what services within the perimeter those identities can reach and with which methods and permissions.

  • egress_policies: this list works similarly to the ingress policies, except that it defines the rules for reaching the outside world from within the perimeter.

  • shared_resources: this defines the list of projects within the service perimeter that will be shared if this service perimeter is included in a perimeter bridge. As mentioned earlier, it’s recommended to use ingress policies instead of perimeter bridges to grant access between two or more service perimeters, as they provide more granularity and control.

Note that once you have multiple access levels and multiple service perimeters, you will want to leverage Terraform “for_each” loops for these different VPC Service Controls resources, as sketched below.
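A minimal sketch of that pattern (it assumes Terraform 0.13 or later for for_each on modules; the perimeter names, project numbers, and access levels are placeholders):

locals {
  perimeters = {
    sandbox = {
      description   = "Perimeter around Sandbox projects"
      projects      = ["<sandbox-project-number>"]
      access_levels = []
    }
    production = {
      description   = "Perimeter around Production projects"
      projects      = ["<production-project-number>"]
      access_levels = ["<access-level-name>"]
    }
  }
}

module "service_perimeters" {
  source   = "terraform-google-modules/vpc-service-controls/google//modules/regular_service_perimeter"
  for_each = local.perimeters

  policy              = google_access_context_manager_access_policy.access_policy_main.name
  perimeter_name      = "${each.key}_perimeter"
  description         = each.value.description
  resources           = each.value.projects
  restricted_services = ["bigquery.googleapis.com", "storage.googleapis.com"]
  access_levels       = each.value.access_levels
}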

Use Cases for VPC Service Controls

A lot has been said in this article about VPC Service Controls, both conceptually and technically. By now you should have a clear understanding of how they work and of their potential use cases. Below we go into more detail on those use cases; if you ever encounter them in your company, VPC Service Controls will be your ally.

Security concerns split between networking and operational data teams

Instead of your data teams depending on your security team for IAM access requests, operational teams can now manage the IAM bindings while the security team manages the networking security layer. It’s a win-win: the security team manages the first layer of defense, and the operational teams have the freedom and flexibility to grant IAM roles when needed without having to create tickets and wait.

Prevent employees from downloading sensitive data

Preventing data exfiltration is key for classified data. However, some people need access to it. Be it engineers, auditors, or anyone else: if they can see the data, they can exfiltrate it with simple IAM bindings. VPC Service Controls lets you specify granular egress rules so that you can control what data can leave your Google Cloud projects. For instance, you could set up rules to forbid the service operations behind commands such as gsutil cp or bq mk. This greatly reduces the risk of insider exfiltration.
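As an illustration, an egress policy like the hypothetical sketch below (to be placed in the egress_policies list of the perimeter module shown earlier) only lets a single pipeline service account copy objects to one approved external project, while every other identity inside the perimeter is denied egress. The identity and project number are placeholders:

  egress_policies = [{
    "from" = {
      "identity_type" = ""
      "identities"    = ["serviceAccount:<pipeline-service-account-email>"]
    },
    "to" = {
      # Only this external project is reachable from inside the perimeter
      "resources" = ["projects/<external-project-number>"]
      "operations" = {
        "storage.googleapis.com" = {
          "methods" = ["google.storage.objects.create"]
        }
      }
    }
  }]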

Environment isolation

Minimizing errors in production saves headaches, time, and a lot of money. Creating separate perimeters for development and production environments prevents people from accidentally accessing one of them without satisfying its access requirements. This adds a crucial safety net that minimizes the risk of errors or vulnerabilities making it to production, ultimately protecting the stability and security of critical systems.

Design tips and caveats

VPC Service Controls is an advanced security product on Google Cloud, and you likely don’t need it if your organization is small or still at an early stage of maturity on Google Cloud. However, once your organization grows and, more importantly, needs to comply with data regulations, it’s highly recommended to start using VPC Service Controls to level up your security game.

We will cover some important tips and best practices that are fundamental when starting your journey with VPC Service Controls:

  1. Simplicity in Design: It’s advisable to keep the VPC Service Controls design as straightforward as possible. Complex designs involving multiple bridges, perimeter network projects, a DMZ perimeter, or intricate access levels should be avoided to reduce complexity and potential security risks.

  2. Unified Perimeter Approach: A single, large perimeter, known as a common unified perimeter, is recommended over multiple segmented perimeters. This approach not only simplifies management but also enhances protection against data exfiltration by allowing services and network resources within the perimeter to communicate freely, subject to necessary IAM and network control permissions.

  3. Consideration for Multiple Perimeters: In certain scenarios, like handling different types of data with varying sensitivity levels, multiple perimeters may be necessary. This approach can cater to specific compliance requirements or facilitate secure data sharing with external entities.

  4. Careful Enablement: Enabling VPC Service Controls without proper planning can disrupt existing applications. It’s crucial to have a detailed plan, involve stakeholders from both the VPC Service Controls operation team and the applications team, and allow ample time for testing and analysis.

  5. Documentation and Communication: Clearly document all valid access patterns and use cases at the start of the enablement process. Effective communication and coordination among the involved teams are key to identifying service perimeter violations and ensuring legitimate business workflows are not hindered.

  6. Dry Run Mode: Before fully enforcing new configurations, utilize the dry run mode. This mode allows for the identification of potential violations without impacting application functionality, thus enabling a smoother transition to enforced VPC Service Controls.
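With the raw provider resource, dry run mode corresponds to putting the configuration in the spec block and enabling use_explicit_dry_run_spec, instead of filling the enforced status block. A minimal sketch, with placeholder policy and project numbers:

resource "google_access_context_manager_service_perimeter" "dry_run_perimeter" {
  parent                    = "accessPolicies/<policy-number>"
  name                      = "accessPolicies/<policy-number>/servicePerimeters/dry_run_perimeter"
  title                     = "dry_run_perimeter"
  use_explicit_dry_run_spec = true

  # Dry-run configuration: violations are only logged, never blocked
  spec {
    resources           = ["projects/<project-number>"]
    restricted_services = ["bigquery.googleapis.com", "storage.googleapis.com"]
  }
}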

Last but certainly not least, service perimeters aren’t intended to replace or reduce the need for IAM controls. When your operational teams define access controls via IAM bindings, they must ensure that the principle of least privilege is followed and that IAM best practices are applied. Security practices cannot be softened on the assumption that VPC Service Controls provides an additional layer of security.

Conclusion

“In the world of data security, the smallest detail overlooked can lead to the largest breach.”

This quote underscores the vital role that tools like VPC Service Controls play in safeguarding Google Cloud environments against sophisticated threats and data exfiltration risks. As organizations grow and their data landscapes become increasingly complex, the need for robust, layered security mechanisms becomes paramount. VPC Service Controls, in conjunction with IAM bindings, offers a comprehensive defense strategy that addresses security at both the network and access levels, ensuring that sensitive data remains protected against unauthorized access and potential threats.

This exploration of VPC Service Controls, from its key components to practical implementation tips, highlights the importance of starting with a security-first mindset. The emphasis on simplicity in design, the strategic use of unified perimeters, and the careful planning required for successful enablement are critical takeaways for any organization looking to enhance its security posture on Google Cloud. The detailed examination of use cases and the discussion of how networking and operational data teams share responsibility further illuminate the practical applications and benefits of VPC Service Controls.

The journey towards securing cloud data is an ongoing and complex one, but with the right tools and strategies in place, organizations can significantly reduce their risk profile and protect their most valuable assets in the cloud.

Thank you

If you enjoyed reading this article, stay tuned as we regularly publish technical articles on Google Cloud and how to leverage it at best without compromising on security. Follow Astrafy on LinkedIn to be notified of the next article.

If you are looking for support on Modern Data Stack or Google Cloud solutions, feel free to reach out to us at sales@astrafy.io.