Free Google Cloud Architect Analyzing and optimizing technical and business processes Practice Test 2026 — GCP PCA Questions

Last updated: June 2026 · Aligned with the current Google Professional Cloud Architect exam · 15% of the exam

This free Google Cloud Architect Analyzing and optimizing technical and business processes practice test covers the software development lifecycle, CI/CD, testing, cost optimization, and stakeholder collaboration. Each question includes a detailed explanation with real Google Cloud context, which makes it well suited to GCP PCA exam prep.

10 Free Google Cloud Architect Analyzing and optimizing technical and business processes Practice Questions with Answers

Each question below includes 4 answer options, the correct answer, and a detailed explanation. These are real questions from the FlashGenius GCP PCA question bank for the Analyzing and optimizing technical and business processes domain (15% of the exam).

Sample Question 1 — Analyzing and optimizing technical and business processes

A global retail company runs a monolithic e-commerce application on-premises. They are migrating to Google Cloud to reduce operational overhead and improve release velocity. The application has the following characteristics and constraints:

- The checkout flow is latency-sensitive and must respond in under 300 ms for 95% of requests from North America and Europe.
- Traffic is highly variable, with 10x spikes during flash sales.
- The security team requires that payment processing components be isolated from other workloads and that PCI-DSS controls be enforceable and auditable.
- The finance team wants predictable monthly costs and visibility into cost drivers per business unit.
- The operations team has limited SRE experience and wants to minimize custom automation and complex runbooks.

You are asked to propose a target architecture and operational model that optimizes both technical and business processes. Which approach best meets these requirements?

  1. A. Refactor the monolith into multiple microservices and deploy them on a regional GKE cluster in each major geography. Use horizontal pod autoscaling for traffic spikes, separate Kubernetes namespaces for payment services, and custom Prometheus/Grafana stacks for observability and cost allocation. Implement PCI controls through network policies and custom admission controllers.
  2. B. Lift-and-shift the monolith into Managed Instance Groups behind a global external HTTP(S) Load Balancer with autoscaling. Place payment processing instances in a separate MIG and subnet with firewall rules and Cloud Armor policies. Use Cloud Monitoring and Cloud Logging for observability and export billing data to BigQuery for cost analysis by business unit.
  3. C. Refactor the application into a small set of domain-aligned services. Deploy stateless web and catalog services on Cloud Run with global HTTP(S) Load Balancing and Cloud CDN. Deploy payment processing as a separate service on Cloud Run in a dedicated PCI-focused project with VPC Service Controls and tighter IAM. Use Cloud Monitoring, Cloud Logging, and exported billing data to BigQuery with cost labels per service and business unit. (Correct answer)
  4. D. Deploy the monolith as a single containerized application on a multi-regional GKE cluster with cluster autoscaling. Use a service mesh for traffic management and mTLS, and configure network policies to isolate payment components. Use custom scripts to tag resources for cost allocation and build internal dashboards on top of exported logs.

Correct answer: C

Explanation: Option C best balances latency, security, cost visibility, and operational simplicity while aligning with Google Cloud’s Well-Architected Framework.

Analysis of Option C (Correct):

- **Latency and scalability:** Cloud Run with global HTTP(S) Load Balancing and Cloud CDN provides low-latency access for static and cacheable content to users in North America and Europe. Cloud Run autoscaling handles 10x traffic spikes without complex capacity planning.
- **Security and PCI isolation:** Placing payment processing in a **separate service and dedicated project** allows strong isolation boundaries. Using **VPC Service Controls**, tighter IAM, and separate networking for the PCI scope aligns with security-by-design and compliance requirements. This makes PCI-DSS controls more auditable and reduces blast radius.
- **Operational simplicity:** Cloud Run is fully managed, reducing the need for deep SRE or Kubernetes expertise. Autoscaling, health checks, and rolling updates are handled by the platform, minimizing custom automation and runbooks.
- **Cost predictability and visibility:** Exporting billing data to BigQuery and using **labels per service and business unit** enables detailed cost attribution. Cloud Run’s per-use pricing is transparent, and the ability to break down costs by service and BU supports finance’s needs.
- **Process optimization:** Refactoring into a small set of domain-aligned services (not an overly granular microservice mesh) improves release velocity and maintainability without incurring the full complexity of a large microservice architecture.

Why other options are suboptimal:

A) GKE microservices with custom observability and PCI controls

- Technically viable but **operationally complex** for a team with limited SRE experience. Managing GKE, HPA, Prometheus/Grafana, network policies, and admission controllers increases operational burden.
- PCI controls via custom Kubernetes constructs are more complex and less straightforward to audit than using project-level isolation and VPC Service Controls.
- This option optimizes for flexibility and control, but not for **operational simplicity** and process efficiency.

B) Lift-and-shift to Managed Instance Groups

- Simpler than GKE, but still requires **instance-level management**, OS patching, and capacity planning.
- While MIG autoscaling and global HTTP(S) Load Balancing can handle spikes, the monolith limits independent scaling of payment vs other components.
- PCI isolation via separate MIG and subnet is weaker than **project-level isolation** and VPC Service Controls; the PCI scope is not as cleanly separated.
- Does not significantly improve release velocity or long-term maintainability; it mostly replicates the existing operational model in the cloud.

D) Monolith on multi-regional GKE with service mesh

- Multi-regional GKE and service mesh (e.g., mTLS, traffic management) provide strong capabilities but at **high operational complexity**, contrary to the operations team’s constraints.
- A single monolith in a large cluster does not provide clean PCI isolation; network policies help but are more complex to manage and audit than separate projects and services.
- Custom scripts for tagging and internal dashboards add maintenance overhead compared to using labels and standard billing export to BigQuery.

Thus, Option C most effectively optimizes technical and business processes by using managed services, clear security boundaries, and built-in cost and operations tooling, while keeping the architecture and operations model manageable for the existing team.
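The cost-attribution point in Question 1 (labels per service and business unit on exported billing data) can be illustrated with a small sketch. This is not the BigQuery billing export schema: the row shape, the `business_unit` label key, and the sample figures are all assumptions made up for the example. It only shows the grouping logic that a per-business-unit cost report would apply.

```python
from collections import defaultdict

# Hypothetical rows, loosely modeled on exported billing data.
# Field names, label keys, and amounts are illustrative assumptions.
SAMPLE_ROWS = [
    {"service": "Cloud Run", "cost": 120.0,
     "labels": {"business_unit": "retail", "component": "catalog"}},
    {"service": "Cloud Run", "cost": 80.0,
     "labels": {"business_unit": "payments", "component": "checkout"}},
    {"service": "Cloud CDN", "cost": 40.0,
     "labels": {"business_unit": "retail"}},
    {"service": "Cloud Logging", "cost": 10.0, "labels": {}},
]


def cost_by_label(rows, label_key, unlabeled="unlabeled"):
    """Sum cost per value of label_key; rows without the label share one bucket."""
    totals = defaultdict(float)
    for row in rows:
        value = row.get("labels", {}).get(label_key, unlabeled)
        totals[value] += row["cost"]
    return dict(totals)


if __name__ == "__main__":
    for unit, total in sorted(cost_by_label(SAMPLE_ROWS, "business_unit").items()):
        print(f"{unit}: {total:.2f}")
```

The "unlabeled" bucket is worth keeping visible in any real report, since unlabeled spend is exactly what undermines per-business-unit visibility.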

Sample Question 2 — Analyzing and optimizing technical and business processes

A financial services company runs a portfolio risk analytics platform on Google Cloud. The platform consists of:

- A batch computation engine that runs nightly risk calculations on large datasets.
- An API that provides intraday risk snapshots to internal trading desks.

Current architecture:

- Batch jobs run on a large Compute Engine managed instance group that is overprovisioned to meet peak month-end workloads.
- The API runs on a separate managed instance group behind a regional HTTP(S) Load Balancer.
- All data is stored in a single regional Cloud SQL instance.

Business and technical constraints:

- Regulatory requirements mandate that all data and processing remain in a single region.
- The risk calculations must complete within a 4-hour nightly window, including month-end peaks.
- The API must meet a 99.9% availability SLO with p95 latency under 200 ms for internal users in the same region.
- The finance team reports that the environment is significantly underutilized outside the nightly batch window and wants to reduce costs without compromising SLOs.
- The operations team wants to reduce manual capacity planning and simplify incident response.

You are asked to redesign the architecture and operational model to optimize cost and operational efficiency while meeting the constraints. Which approach should you recommend?

  1. A. Move the batch computation engine to a regional GKE cluster with cluster autoscaling and use preemptible node pools for batch workloads. Keep the API on the existing managed instance group for stability. Retain Cloud SQL as the single data store. Use custom autoscaling policies to scale the cluster up before the nightly window and down afterward.
  2. B. Migrate both the batch engine and the API to a single regional GKE cluster. Use separate node pools: one with preemptible VMs for batch jobs and one with regular VMs for the API. Use horizontal pod autoscaling for the API and a Kubernetes job scheduler for batch. Keep Cloud SQL as the backend and enable high availability for the database.
  3. C. Move the batch computation engine to Cloud Dataflow using autoscaling and flexible resource scheduling. Keep the API on a smaller managed instance group with autoscaling based on request latency. Migrate the data from Cloud SQL to BigQuery for batch analytics while keeping a smaller Cloud SQL instance for transactional needs. Use Cloud Monitoring SLOs and alerting for both batch completion time and API latency. (Correct answer)
  4. D. Refactor the batch engine into Cloud Functions triggered by Cloud Scheduler and Pub/Sub, each processing a subset of the data. Move the API to Cloud Run with concurrency enabled for cost efficiency. Keep Cloud SQL as the single data store and enable read replicas to offload batch reads from the primary instance.

Correct answer: C

Explanation: Option C provides the best balance of cost optimization, performance, and operational simplicity while respecting regulatory and SLO constraints.

Analysis of Option C (Correct):

- **Cost optimization for batch:** Moving the batch computation to **Cloud Dataflow** leverages autoscaling and managed infrastructure. Dataflow can scale resources up during the nightly window (including month-end peaks) and scale down to zero afterward, eliminating overprovisioned Compute Engine instances.
- **Performance and batch window:** Dataflow is designed for large-scale batch processing and can be tuned to meet the 4-hour completion window. Autoscaling and parallelism can be adjusted based on job characteristics.
- **API SLOs:** Keeping the API on a **smaller managed instance group with autoscaling based on latency** maintains control over performance and availability (99.9% SLO, p95 < 200 ms) while reducing baseline capacity.
- **Data architecture optimization:** Moving analytical workloads to **BigQuery** separates analytical and transactional concerns. BigQuery is optimized for large-scale analytics, improving batch performance and simplifying schema evolution. Keeping a smaller Cloud SQL instance for transactional needs reduces cost and risk for the API.
- **Operational excellence:** Dataflow and BigQuery are fully managed, reducing operational overhead. Using **Cloud Monitoring SLOs and alerting** for both batch completion time and API latency aligns with the Well-Architected Framework’s reliability and operations pillars.
- **Regulatory constraint:** All services (Dataflow, BigQuery, Cloud SQL, MIG) can be configured to run in a single region, satisfying data residency requirements.

Why other options are suboptimal:

A) Batch on GKE with preemptible nodes, API on existing MIG

- GKE with preemptible nodes can reduce batch costs, but introduces **cluster management complexity** and requires careful handling of preemptions to meet the 4-hour window.
- Custom autoscaling policies and pre-scaling the cluster before the batch window add operational overhead and manual tuning.
- Retaining Cloud SQL as the single store for both transactional and heavy analytical workloads can become a bottleneck and limit scalability.

B) Both batch and API on a single GKE cluster

- Consolidating onto GKE can improve utilization, but significantly **increases operational complexity** (cluster operations, node pools, HPA, job scheduling) for the operations team.
- Running both latency-sensitive API and heavy batch workloads on the same cluster increases the risk of resource contention, which can threaten the API’s 99.9% SLO and latency targets unless carefully isolated and tuned.
- Still relies on Cloud SQL as the single backend, which is not ideal for large-scale analytics.

D) Batch via Cloud Functions and API on Cloud Run

- Cloud Functions are not ideal for **large, long-running batch computations** on large datasets due to execution time limits, concurrency model, and orchestration complexity.
- Orchestrating a large portfolio risk calculation via many functions and Pub/Sub increases complexity and risk of partial failures, making it harder to guarantee the 4-hour completion window.
- Keeping all analytics on Cloud SQL, even with read replicas, is not optimal for large-scale batch analytics and can still stress the database.

Therefore, Option C most effectively optimizes technical and business processes: it uses the right managed services for batch (Dataflow, BigQuery), keeps the API simple and autoscaled, reduces overprovisioning, and improves observability and SLO-based operations.
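The SLO figures in Question 2 (99.9% availability, p95 latency under 200 ms) are easier to reason about as numbers. The sketch below is plain arithmetic, not a Cloud Monitoring API call. The 30-day period and the nearest-rank percentile method are assumptions chosen for the example, and the function names are hypothetical.

```python
def error_budget_minutes(slo, period_days=30):
    """Minutes of unavailability an availability SLO tolerates over the period."""
    return (1.0 - slo) * period_days * 24 * 60


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latency samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_slo(samples, threshold_ms=200):
    """True when the p95 of the samples is strictly under the threshold."""
    return p95(samples) < threshold_ms


if __name__ == "__main__":
    print(f"99.9% over 30 days: {error_budget_minutes(0.999):.1f} minutes of budget")
    latencies_ms = [120] * 96 + [450] * 4  # hypothetical samples
    print("p95 (ms):", p95(latencies_ms))
    print("meets 200 ms target:", meets_latency_slo(latencies_ms))
```

A 99.9% target over 30 days leaves roughly 43 minutes of error budget, which is why the explanation treats shared-cluster contention between batch and API workloads as a real risk to the SLO.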

Sample Question 3 — Analyzing and optimizing technical and business processes

A global retail company runs a monolithic Java application on Compute Engine managed instance groups behind a global external HTTP(S) load balancer. The application handles both customer-facing traffic and nightly batch processing. The operations team reports that during peak shopping hours, batch jobs slow down the user-facing API, causing latency SLO violations. The CIO wants to improve performance and reliability without significantly increasing operational overhead. The company has a moderate budget and wants to avoid a full rewrite in the short term. What should you do to optimize both technical and business processes while meeting these constraints?

  1. A. Split the monolith into two separate instance groups: one for user-facing traffic and one for batch processing. Use separate backend services on the existing HTTP(S) load balancer and configure autoscaling policies tailored to each group. Schedule batch instance groups to scale down during peak hours and scale up during off-peak hours. (Correct answer)
  2. B. Migrate the entire monolithic application to a single regional GKE cluster with node autoscaling. Use Kubernetes HPA to scale pods based on CPU utilization and configure separate namespaces for batch and user-facing workloads.
  3. C. Move the batch processing logic to Cloud Functions triggered by Cloud Scheduler. Keep the user-facing monolith on the existing instance group and reduce the size of the instances to save costs, relying on autoscaling to handle peak loads.
  4. D. Containerize the monolith and deploy it to Cloud Run with maximum concurrency set to 1 to isolate requests. Use Cloud Scheduler to trigger batch endpoints and rely on Cloud Run autoscaling to handle both batch and user-facing workloads.

Correct answer: A

Explanation: Option A is best because it optimizes both technical and business processes with minimal disruption and operational complexity. By separating batch and user-facing workloads into different managed instance groups, you can:

- Apply different autoscaling policies: aggressive scaling for user-facing traffic based on request load and latency, and scheduled or CPU-based scaling for batch jobs.
- Protect customer-facing SLOs by limiting or scheduling batch capacity during peak hours.
- Reuse existing patterns (HTTP(S) load balancer, MIGs) without a full platform migration, reducing risk and time-to-value.
- Improve cost efficiency by scaling batch capacity down when not needed.

This aligns with the Well-Architected principles of reliability (isolation of workloads), performance (right-sizing and scaling policies), and operational excellence (incremental change, reuse of existing tooling).

Option B is technically valid but suboptimal. Moving to GKE introduces significant operational overhead (cluster management, Kubernetes expertise, observability changes) for a team that currently runs MIGs. While namespaces and HPA help, they do not inherently isolate resource contention between batch and user-facing pods unless you also configure resource requests/limits, node pools, and potentially separate clusters. This is a larger transformation than needed for the stated short-term goal.

Option C is risky and incomplete. Moving batch logic to Cloud Functions may require substantial refactoring of a monolith, which contradicts the desire to avoid a full rewrite. Cloud Functions are not ideal for long-running or complex batch jobs and may hit execution limits. Reducing instance size to save costs could worsen performance during peaks, and relying solely on autoscaling may not fully protect user-facing SLOs if batch and user traffic still compete for shared resources.

Option D is also suboptimal. Cloud Run is well-suited for stateless services, but lifting a monolithic app directly into Cloud Run often requires significant changes (e.g., startup time, filesystem assumptions, long-running tasks). Setting concurrency to 1 greatly increases cost and may not scale efficiently for batch workloads. Running both batch and user-facing workloads on the same Cloud Run service still risks contention and complicates cost control.

Therefore, Option A provides the best balance of improved reliability, performance, cost control, and low migration risk.
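The scheduled scaling described for the batch instance group in Question 3 comes down to a simple rule: shrink batch capacity during peak shopping hours and restore it off-peak. The sketch below models only that rule. The peak window (09:00 to 21:00) and the instance counts are hypothetical values, and a real deployment would express the rule through the instance group's own scaling configuration, not application code.

```python
def batch_target_size(hour, peak_start=9, peak_end=21,
                      off_peak_size=10, peak_size=0):
    """Return the desired batch instance count for a given hour (0-23).

    All defaults are illustrative assumptions, not recommended values.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    in_peak = peak_start <= hour < peak_end
    return peak_size if in_peak else off_peak_size


if __name__ == "__main__":
    for hour in (2, 9, 12, 20, 21):
        print(f"{hour:02d}:00 -> {batch_target_size(hour)} batch instances")
```

Keeping the user-facing group on its own load-driven policy, separate from this schedule, is what prevents batch work from competing with customer traffic.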

Sample Question 4 — Analyzing and optimizing technical and business processes

A financial services company processes loan applications through a web portal hosted on Google Cloud. The application stack consists of:

- Frontend: Cloud Run services
- Backend: A set of microservices on GKE
- Database: Cloud SQL for PostgreSQL (high availability)

The company must meet strict regulatory requirements:

- All customer data must remain within the EU.
- RPO of 5 minutes and RTO of 1 hour for the database.
- Monthly cost increases must be justified with measurable business value.

Recently, a regional outage caused several hours of downtime and data unavailability. The business wants to reduce the impact of regional failures while controlling costs. They are not ready to adopt a multi-region active/active architecture due to complexity. What should you recommend to optimize resilience and business continuity while respecting constraints?

  1. A. Enable cross-region read replicas for Cloud SQL in a second EU region and promote the replica manually during a regional outage. Update DNS records to point to services in the secondary region when needed.
  2. B. Migrate the database to Cloud Spanner in multi-region configuration and deploy GKE and Cloud Run services in two EU regions in active/active mode behind a global HTTP(S) load balancer.
  3. C. Keep the primary Cloud SQL instance in the current EU region and configure automated backups with point-in-time recovery. Create an on-demand Cloud SQL instance in another EU region only during a disaster and restore from the latest backup.
  4. D. Use Cloud SQL high availability in the current region and configure a cross-region read replica in another EU region. Document and test a failover runbook that includes promoting the replica and redeploying services in the secondary region during a regional outage. (Correct answer)

Correct answer: D

Explanation: Option D is best because it balances resilience, regulatory constraints, and cost. By using Cloud SQL HA in the primary region plus a cross-region read replica in another EU region, you:

- Keep all data within the EU, satisfying compliance.
- Achieve an RPO close to the replication lag (typically seconds to a few minutes), meeting the 5-minute RPO target.
- Maintain a relatively simple operational model: primary region is active, secondary region is warm-standby.
- Control costs better than a full active/active multi-region architecture, since the secondary region can run minimal capacity until failover.
- Improve operational excellence by defining and testing a clear failover runbook, aligning with the Well-Architected Framework.

Option A is partially correct but incomplete. Cross-region read replicas provide the necessary data redundancy, but the option does not address how application services will be deployed in the secondary region or how failover will be orchestrated. Simply updating DNS without a tested process for service deployment and configuration may lead to longer RTO and operational risk. The option also does not mention keeping HA in the primary region, which is what protects against zonal failures.

Option B is technically strong but likely overkill and misaligned with constraints. Cloud Spanner multi-region plus active/active services in two regions would significantly increase cost and operational complexity. The company explicitly stated they are not ready for multi-region active/active. This option may exceed the budget and change tolerance without clear immediate business justification.

Option C under-delivers on RPO and RTO. Restoring from backups to a newly created Cloud SQL instance in another region during a disaster will likely exceed the 1-hour RTO and may not meet the 5-minute RPO, depending on backup frequency and restore time. It also introduces high operational risk during a stressful event, since the environment is not pre-provisioned or regularly tested.

Therefore, Option D provides a pragmatic, cost-conscious improvement in resilience that meets regulatory and recovery objectives while keeping operational complexity manageable.
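The RPO/RTO reasoning in Question 4 can be written as a simple check of each recovery strategy against the stated objectives (5-minute RPO, 1-hour RTO). In the sketch below, the worst-case estimates attached to each strategy are hypothetical placeholders, not measurements or documented Cloud SQL figures. A real assessment would take them from failover drills.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    name: str
    worst_case_data_loss_min: float  # estimated achievable RPO
    worst_case_recovery_min: float   # estimated achievable RTO


def meets_objectives(option, rpo_min=5, rto_min=60):
    """True when both estimates fit within the required RPO and RTO."""
    return (option.worst_case_data_loss_min <= rpo_min
            and option.worst_case_recovery_min <= rto_min)


# Hypothetical estimates, for illustration only.
OPTIONS = [
    RecoveryOption("cross-region replica + tested runbook", 1, 45),
    RecoveryOption("restore from backup into a new instance", 30, 180),
]

if __name__ == "__main__":
    for option in OPTIONS:
        verdict = "meets" if meets_objectives(option) else "misses"
        print(f"{option.name}: {verdict} RPO/RTO")
```

The value of the tested runbook in Option D is that it turns the recovery-time estimate into something the team has actually observed.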

Sample Question 5 — Analyzing and optimizing technical and business processes

A media analytics company ingests clickstream events from multiple mobile apps worldwide. The current architecture is:

- Events are sent to Pub/Sub.
- A Dataflow streaming job enriches and aggregates events.
- Results are written to BigQuery for analytics and to Cloud Storage for long-term archival.

The business team complains that analytics dashboards are often delayed by 10–15 minutes, impacting their ability to react to live campaigns. The Dataflow job is configured with large window sizes and uses complex transformations. The CFO is concerned about rising Dataflow and BigQuery costs and wants to avoid a major rewrite of the pipeline. You are asked to optimize the architecture and processes to reduce end-to-end latency while keeping costs under control and minimizing operational complexity. What should you do?

  1. A. Reduce the Dataflow window size and watermarking delay to process events more frequently, and enable BigQuery streaming inserts from Dataflow. Use BigQuery table partitioning and clustering to optimize query performance and cost. (Correct answer)
  2. B. Replace the Dataflow job with a custom streaming application on GKE that writes directly to BigQuery using the Storage Write API. Use aggressive autoscaling on the GKE cluster to handle peak loads.
  3. C. Introduce Cloud Functions to pre-aggregate events from Pub/Sub and send summarized data to the existing Dataflow job at lower frequency, reducing Dataflow workload and BigQuery storage usage.
  4. D. Configure Dataflow to write intermediate results to Cloud Storage every minute and schedule a BigQuery load job every 5 minutes to import the latest data into partitioned tables for analytics.

Correct answer: A

Explanation: Option A is best because it directly addresses latency while leveraging existing managed services and minimizing architectural change. By reducing Dataflow window sizes and watermarking delays, you:

- Decrease the time Dataflow waits before emitting results, reducing end-to-end latency.
- Maintain the same overall pipeline structure, avoiding a costly rewrite.

Enabling BigQuery streaming inserts from Dataflow allows near-real-time data availability for dashboards. Using partitioned and clustered tables improves query performance and can reduce cost by:

- Scanning less data for time-bounded queries.
- Organizing data for common filter and aggregation patterns.

This approach aligns with the Well-Architected principles of performance efficiency and cost optimization while preserving operational simplicity.

Option B is technically feasible but suboptimal. Replacing Dataflow with a custom GKE-based streaming application increases operational overhead (cluster management, scaling logic, fault tolerance) and development effort. It contradicts the requirement to avoid a major rewrite and may not yield better cost or latency than a tuned Dataflow pipeline.

Option C adds complexity and may not significantly reduce latency. Introducing Cloud Functions for pre-aggregation creates another processing layer, increasing operational complexity and potential failure points. It may reduce some Dataflow and BigQuery costs but at the expense of flexibility and maintainability. It also does not directly address the core issue of large windows and watermark delays.

Option D still relies on batch-style loading into BigQuery, which inherently introduces latency due to the load job schedule and processing time. Writing intermediate results to Cloud Storage every minute and loading every 5 minutes will likely keep latency in the several-minute range and adds operational overhead for managing frequent load jobs.

Therefore, Option A provides the most balanced improvement in latency, cost control, and operational simplicity with minimal changes to the existing architecture.
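Why smaller windows and shorter watermark delays cut dashboard latency in Question 5 can be seen with a simplified model of fixed (tumbling) windows. This is not Apache Beam or Dataflow code. It assumes a result is emitted once the window closes and a fixed watermark delay has passed, and it ignores processing and write time. The window sizes used are hypothetical.

```python
def emit_time(event_ts, window_s, watermark_delay_s):
    """Time at which the window containing event_ts emits, in this simplified model."""
    window_start = (event_ts // window_s) * window_s
    return window_start + window_s + watermark_delay_s


def worst_case_wait(window_s, watermark_delay_s):
    """Longest wait: an event at the very start of a window."""
    return window_s + watermark_delay_s


if __name__ == "__main__":
    event_ts = 125  # seconds since an arbitrary origin
    for window_s, delay_s in ((600, 300), (60, 30)):
        wait = emit_time(event_ts, window_s, delay_s) - event_ts
        print(f"window={window_s}s delay={delay_s}s -> "
              f"waits {wait}s (worst case {worst_case_wait(window_s, delay_s)}s)")
```

Under this model a 10-minute window with a 5-minute delay can hold an event for up to 15 minutes, in line with the 10–15 minute delays the business team reports, while a 1-minute window with a 30-second delay caps the wait at 90 seconds.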

Sample Question 6 — Analyzing and optimizing technical and business processes

A healthcare SaaS provider hosts a multi-tenant patient management platform on Google Cloud. The architecture is:

- Frontend: Cloud Run services
- Backend: Microservices on GKE
- Data: A single multi-tenant Cloud SQL for MySQL instance with row-level tenant isolation

New enterprise customers require:

- Stronger data isolation between tenants for compliance.
- Transparent encryption of data at rest and in transit.
- Minimal performance impact and predictable costs.

The operations team is already stretched and wants to avoid managing a large number of separate databases. The business team wants to onboard new tenants quickly without complex provisioning workflows. What architectural change best balances compliance, operational simplicity, and cost while optimizing the company’s onboarding and management processes?

  1. A. Migrate from Cloud SQL to a single multi-tenant BigQuery dataset with row-level security and customer-managed encryption keys (CMEK). Use views to enforce tenant isolation and restrict access via IAM.
  2. B. Keep a single Cloud SQL instance but move to a schema-per-tenant model within the same instance. Use Cloud SQL IAM database authentication and CMEK for the instance. Implement a provisioning service that creates schemas and grants least-privilege access for each tenant.
  3. C. Provision a dedicated Cloud SQL instance per tenant with CMEK enabled. Automate instance creation and configuration through an internal provisioning portal and use VPC peering to connect all instances to the GKE cluster.
  4. D. Migrate to Cloud Spanner and create a separate database per tenant in a single Spanner instance. Use CMEK and IAM-based access control to isolate tenants and rely on Spanner’s scalability to manage many databases efficiently. (Correct answer)

Correct answer: D

Explanation: Cloud Spanner is the optimal architectural choice for this multi-tenant healthcare SaaS scenario. It provides a 'database-per-tenant' model within a single managed instance, which offers significantly stronger isolation than row-level or schema-level models, satisfying strict compliance requirements. Crucially, Cloud Spanner supports Customer-Managed Encryption Keys (CMEK) at the database level, allowing each enterprise tenant to have its own encryption key—a feature not available at the schema level in Cloud SQL (where CMEK is instance-wide). This approach balances operational simplicity (managing one Spanner instance instead of many Cloud SQL instances) with the need for strong isolation and scalability. Option C is disqualified due to the operational burden of managing many instances and the VPC peering limit (typically 25), which prevents scaling to many tenants. Option B provides only logical isolation and lacks per-tenant encryption keys. Option A is unsuitable as BigQuery is an analytical (OLAP) rather than a transactional (OLTP) database.

Sample Question 7 — Analyzing and optimizing technical and business processes

A global retail company runs a monolithic e-commerce application on-premises. They are migrating to Google Cloud to reduce operational overhead and improve release velocity. The application has the following characteristics:

- Web tier and API tier are tightly coupled and deployed together.
- A single relational database handles product catalog, orders, and user accounts.
- Traffic is highly variable, with large spikes during promotions.
- The business wants to start modernizing but must keep the current feature set stable for at least 12 months.

Constraints:

- The CTO wants to reduce operational burden quickly and avoid a large up-front refactor.
- The finance team wants predictable monthly costs and to avoid overprovisioning.
- The operations team has limited container orchestration experience.

You are asked to propose an initial target architecture on Google Cloud that optimizes for business and technical constraints while enabling future modernization. What should you do?

  1. A. Lift and shift the monolith into a managed instance group of Compute Engine VMs behind an external HTTP(S) load balancer. Use autoscaling based on CPU utilization and gradually refactor the application into separate services by deploying new components on separate instance groups.
  2. B. Containerize the monolith and deploy it on a regional GKE cluster with cluster autoscaling. Use a single Kubernetes Deployment for the entire application and plan to split it into multiple Deployments over time as you refactor.
  3. C. Containerize the monolith and deploy it to Cloud Run (fully managed) behind an external HTTP(S) load balancer. Use Cloud SQL for the relational database and plan to gradually extract specific functionalities into separate Cloud Run services as you modernize. (Correct answer)
  4. D. Refactor the monolith into separate microservices for web, API, catalog, orders, and users before migration. Deploy each microservice to Cloud Run and use Cloud SQL for each bounded context, ensuring full separation of concerns from day one.

Correct answer: C

Explanation: Option C best balances the stated constraints and aligns with the Well-Architected Framework.

Analysis:

- Business goals: reduce operational burden quickly, predictable costs, enable future modernization without a big-bang rewrite.
- Technical constraints: limited container orchestration experience, variable traffic, need to keep feature set stable for 12 months.

Why C is best:

- Cloud Run (fully managed) removes most infrastructure management (no cluster management, patching, or node scaling), directly addressing the CTO’s desire to reduce operational burden and the ops team’s limited orchestration experience.
- Cloud Run scales to zero and up automatically, which handles highly variable traffic and reduces overprovisioning, supporting the finance team’s desire for predictable and efficient costs.
- Containerizing the monolith is a relatively small change compared to a full refactor, supporting the requirement to keep the feature set stable for at least 12 months.
- Using Cloud SQL as the managed relational database reduces operational overhead and provides a clear path to later split schemas or databases as you extract services.
- Over time, specific functionalities can be extracted into separate Cloud Run services, enabling incremental modernization while keeping the core monolith running.

Why not A:

- Managed instance groups with autoscaling are valid, but they require more VM-level management (OS patching, capacity planning, image management) than Cloud Run.
- This approach does not significantly reduce operational burden compared to a container-based managed platform.
- It also doesn’t leverage serverless scaling as effectively for highly variable traffic, potentially leading to more overprovisioning and less cost efficiency.

Why not B:

- GKE is powerful but adds operational complexity: cluster management, node pools, upgrades, and Kubernetes primitives.
- The operations team has limited container orchestration experience, so GKE increases the learning curve and operational risk.
- While autoscaling helps with variable traffic, the cluster still needs baseline capacity and management, which is more overhead than Cloud Run.

Why not D:

- A full refactor into microservices before migration is high risk and contradicts the requirement to keep the current feature set stable for at least 12 months.
- It delays the benefits of moving to Google Cloud and significantly increases project scope and time-to-value.
- It also introduces more operational complexity (multiple services, multiple databases) before the team has gained experience with cloud-native operations.

Therefore, option C provides the best combination of low operational overhead, cost efficiency, and a pragmatic path to gradual modernization.

Sample Question 8 — Analyzing and optimizing technical and business processes

A financial services company processes loan applications using a set of internal microservices running on-premises. They want to move the processing pipeline to Google Cloud to improve scalability and reduce processing time. The pipeline has these characteristics: - Each application goes through a sequence of steps: validation, credit scoring, risk assessment, and final decision. - Some steps are CPU-intensive; others are I/O-bound and call external APIs. - The business requires that 95% of applications complete processing within 3 minutes. - Daily volume is predictable, but there are occasional end-of-quarter spikes. Constraints: - Regulatory requirements mandate that all processing and data storage remain in a specific region. - The risk team wants full traceability of each application’s path through the system. - Operations wants to minimize custom orchestration code and avoid building a complex workflow engine. - The finance team wants to avoid paying for idle capacity during off-peak hours. What architecture should you recommend to optimize performance, cost, and operational simplicity while meeting compliance and traceability requirements?

  1. A. Implement the pipeline using Cloud Functions triggered by Pub/Sub topics for each step. Use Pub/Sub attributes to track application state and write custom logging in each function to provide end-to-end traceability.
  2. B. Use Cloud Run services for each step and orchestrate the workflow using Cloud Tasks, with each step enqueuing the next. Store application state and audit logs in Cloud SQL to provide traceability.
  3. C. Use Cloud Run services for each step and orchestrate the workflow using Workflows. Configure Workflows to call each service in sequence, handle retries, and write step-level audit information to Cloud Logging and BigQuery. (Correct answer)
  4. D. Deploy all microservices to a regional GKE cluster and implement orchestration logic in a dedicated orchestration microservice that calls each step in sequence. Use Cloud Trace for end-to-end tracing and Cloud Logging for audit logs.

Correct answer: C

Explanation: Option C best meets the performance, cost, compliance, and operational simplicity requirements.

Why C is best:
- Cloud Run provides serverless, autoscaling compute for each step, so you pay only for usage and not idle capacity, aligning with the finance team’s requirement.
- Workflows is a managed orchestration service that reduces the need for custom orchestration code, directly addressing the operations team’s desire to avoid building a workflow engine.
- Workflows can orchestrate HTTP calls to Cloud Run services, manage retries, timeouts, and error handling, which helps meet the 3-minute processing SLO by controlling step behavior and backoffs.
- Regional configuration of Cloud Run, Workflows, and storage/logging ensures compliance with the requirement to keep processing and data in a specific region.
- Workflows can be instrumented to write step-level audit information to Cloud Logging and BigQuery, providing full traceability of each application’s path and satisfying the risk team.

Why not A:
- Cloud Functions with Pub/Sub can implement event-driven pipelines, but orchestration becomes implicit and distributed across topics and functions.
- Traceability is harder: relying on Pub/Sub attributes and custom logging makes it more complex to reconstruct a clear, ordered path for each application.
- Managing sequencing, retries, and timeouts across multiple topics and functions increases operational complexity compared to a dedicated workflow service.

Why not B:
- Cloud Run for each step is good, but Cloud Tasks is primarily for asynchronous task execution and rate limiting, not for modeling multi-step workflows with complex branching, retries, and end-to-end visibility.
- You would need to implement orchestration logic in each step (enqueueing the next task, handling failures), which increases custom code and operational complexity.
- While Cloud SQL can store state and audit logs, it adds overhead for write patterns that are better suited to logging and analytics tools like Cloud Logging and BigQuery.

Why not D:
- GKE introduces cluster management overhead (node pools, upgrades, capacity planning) that is unnecessary given the requirement to minimize custom orchestration and operational complexity.
- Implementing orchestration in a custom microservice recreates functionality that Workflows already provides as a managed service.
- While Cloud Trace and Cloud Logging can provide good observability, this approach increases the amount of custom code and infrastructure to manage, and it does not optimize for cost as well as serverless options during off-peak periods.

Therefore, option C provides a managed, regional, serverless workflow with strong traceability and minimal custom orchestration code.
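
To make the orchestration behavior described for option C more concrete, the following is a minimal local sketch of what a managed workflow engine does on your behalf: call the steps in order, retry transient failures with backoff, and record a step-level audit entry for every attempt. It is plain Python, not Workflows syntax. The names `run_pipeline` and `flaky`, the retry limits, and the audit record fields are assumptions made for illustration, and the handlers stand in for HTTP calls to the per-step Cloud Run services.

```python
import time
from typing import Callable, Dict, List

# Step names come from the question; everything else here is hypothetical.
STEPS = ["validation", "credit_scoring", "risk_assessment", "final_decision"]

Handler = Callable[[dict], dict]


def run_pipeline(
    application_id: str,
    handlers: Dict[str, Handler],
    max_attempts: int = 3,
    base_delay_s: float = 0.0,
) -> List[dict]:
    """Run each step in sequence with retries; return one audit record per attempt."""
    audit: List[dict] = []
    payload = {"application_id": application_id}
    for step in STEPS:
        succeeded = False
        for attempt in range(1, max_attempts + 1):
            record = {"application_id": application_id, "step": step, "attempt": attempt}
            try:
                payload = handlers[step](payload)
            except Exception as exc:
                audit.append({**record, "status": "error", "detail": str(exc)})
                if attempt < max_attempts:
                    # Exponential backoff between attempts (zero by default for the demo).
                    time.sleep(base_delay_s * 2 ** (attempt - 1))
            else:
                audit.append({**record, "status": "ok"})
                succeeded = True
                break
        if not succeeded:
            # Retries exhausted: stop the pipeline; the audit trail shows where it failed.
            break
    return audit


def flaky(fail_times: int) -> Handler:
    """Build a stand-in handler that fails its first `fail_times` calls, then succeeds."""
    calls = {"n": 0}

    def handler(payload: dict) -> dict:
        calls["n"] += 1
        if calls["n"] <= fail_times:
            raise RuntimeError("transient failure")
        return payload

    return handler
```

With a managed service, this loop, the retry policy, and the execution history are configuration rather than code you own, which is the operational argument the explanation makes against options A, B, and D.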

Sample Question 9 — Analyzing and optimizing technical and business processes

A media analytics company ingests clickstream data from multiple mobile apps and websites. They currently process data in batches overnight on-premises, generating daily engagement reports for customers. They want to move to Google Cloud and provide near real-time dashboards (latency under 2 minutes) while controlling costs. Current and target state: - Ingestion volume averages 50 MB/s, with occasional spikes to 200 MB/s. - Data must be retained for 2 years for historical analysis. - Customers expect near real-time dashboards but can tolerate slightly delayed heavy aggregations (e.g., cohort analysis). Constraints: - The company has a small data engineering team and wants to minimize pipeline maintenance. - Finance wants to avoid overprovisioning long-running clusters. - The architecture must support both streaming and historical queries without duplicating data pipelines. Which architecture should you recommend to best balance latency, cost, and operational simplicity?

  1. A. Ingest data into Pub/Sub, stream it into Dataflow for real-time transformations, and write the results to BigQuery for both streaming dashboards and historical analysis. Use BigQuery’s built-in storage for 2-year retention and partition tables by event date. (Correct answer)
  2. B. Ingest data into Pub/Sub, write it to Cloud Storage using Dataflow, and run scheduled Dataproc jobs every 5 minutes to load data into BigQuery for dashboards and historical analysis. Use Cloud Storage lifecycle policies for 2-year retention.
  3. C. Ingest data directly into BigQuery using the Storage Write API for streaming inserts. Use scheduled BigQuery queries to export older data to Cloud Storage for long-term retention and run dashboards on Cloud Storage via external tables.
  4. D. Ingest data into Cloud Storage using signed URLs from clients, then run Dataflow batch jobs every 2 minutes to load data into BigQuery. Use BigQuery for dashboards and Cloud Storage for 2-year retention.

Correct answer: A

Explanation: Option A provides a unified, low-maintenance architecture that supports both streaming and historical analytics while optimizing for cost and latency.

Why A is best:
- Pub/Sub handles variable ingestion rates and spikes, decoupling producers from consumers.
- Dataflow (streaming) provides managed, autoscaling stream processing with minimal operational overhead, aligning with the small data engineering team’s needs.
- Writing directly to BigQuery allows near real-time dashboards with sub-2-minute latency, meeting the business requirement.
- BigQuery can serve both streaming and historical queries from the same tables, avoiding duplicate pipelines and simplifying the architecture.
- Partitioning by event date and using BigQuery’s managed storage for 2-year retention is operationally simple and cost-effective for analytics workloads.

Why not B:
- Dataproc requires cluster management (even with autoscaling), which conflicts with the desire to avoid long-running cluster overprovisioning and minimize maintenance.
- Running Dataproc jobs every 5 minutes introduces more operational overhead and may struggle to consistently meet the under-2-minute latency requirement.
- The combination of Dataflow + Dataproc + BigQuery is more complex than necessary; Dataflow + BigQuery alone is sufficient.

Why not C:
- Direct ingestion into BigQuery via the Storage Write API is valid for streaming, but exporting older data to Cloud Storage and then querying via external tables for dashboards is suboptimal.
- External tables on Cloud Storage are slower and less feature-rich than native BigQuery storage, which can negatively impact dashboard performance.
- This design complicates the architecture by splitting data between BigQuery and Cloud Storage for active analytics, without clear cost or operational benefits.

Why not D:
- Ingesting via Cloud Storage with client-signed URLs adds complexity on the client side and introduces higher latency compared to Pub/Sub + streaming.
- Running batch Dataflow jobs every 2 minutes is operationally more complex than a single streaming pipeline and may not consistently meet the under-2-minute end-to-end latency, especially under spikes.
- This approach treats a streaming problem as micro-batch, which is less efficient and harder to scale smoothly.

Therefore, option A offers a streaming-first, managed, and unified analytics architecture that best balances latency, cost, and operational simplicity.
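
The retention and partitioning points are easier to judge with rough numbers. The sketch below works through the arithmetic for the ingestion rates given in the question. It assumes decimal units (1 TB = 1,000,000 MB), a 365-day year, raw uncompressed volume, and that the average rate holds around the clock; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
RETENTION_DAYS = 2 * 365         # 2-year retention from the question


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Raw volume ingested in one day at a constant rate, in decimal terabytes."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


def retained_volume_pb(rate_mb_per_s: float, days: int = RETENTION_DAYS) -> float:
    """Raw volume held at the end of the retention window, in decimal petabytes."""
    return daily_volume_tb(rate_mb_per_s) * days / 1_000


average_tb_per_day = daily_volume_tb(50)     # about 4.32 TB/day at the 50 MB/s average
peak_tb_per_day = daily_volume_tb(200)       # about 17.28 TB/day if a spike lasted all day
retained_pb = retained_volume_pb(50)         # about 3.15 PB over two years
daily_partitions = RETENTION_DAYS            # 730 partitions when partitioned by event date
```

At roughly 4 TB of raw data per day, a query that filters on the event-date partition column scans a single day rather than the multi-petabyte table, which is why the explanation treats date partitioning as central to cost control.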

Sample Question 10 — Analyzing and optimizing technical and business processes

A healthcare SaaS provider hosts a multi-tenant patient management platform. They are migrating from a single-tenant, VM-based architecture on-premises to Google Cloud. Each customer (clinic or hospital) has its own database schema today. The company wants to reduce operational overhead and improve scalability while maintaining strong tenant isolation and meeting healthcare compliance requirements. Requirements: - PHI (Protected Health Information) must be encrypted at rest and in transit. - Each tenant must be logically isolated so that no tenant can access another tenant’s data. - The platform must support hundreds of small tenants and a few very large tenants with higher performance needs. - The operations team wants to minimize the number of database instances to manage while still being able to tune performance for large tenants. Constraints: - The company wants to avoid a full application rewrite and prefers to keep the existing per-tenant schema model for now. - They need a clear path to onboard new tenants quickly with minimal manual operations. Which database architecture on Google Cloud best balances isolation, manageability, and performance tuning needs?

  1. A. Use a single multi-tenant Cloud SQL instance with one database per tenant and shared tables. Enforce tenant isolation using a tenant_id column and application-level access controls.
  2. B. Use a single Cloud SQL instance with separate databases per tenant, each containing its own schema. Use database-level permissions to isolate tenants and configure read replicas for large tenants.
  3. C. Use multiple Cloud SQL instances: one shared multi-tenant instance for small tenants (with separate databases per tenant) and dedicated Cloud SQL instances for very large tenants. Automate tenant provisioning with scripts or tooling. (Correct answer)
  4. D. Use Spanner with a single multi-tenant database and interleaved tables keyed by tenant_id. Use fine-grained IAM and row-level security to isolate tenants and scale horizontally as needed.

Correct answer: C

Explanation: Option C provides a pragmatic balance between isolation, manageability, and performance tuning, while aligning with the existing per-tenant schema model.

Why C is best:
- A shared Cloud SQL instance for many small tenants (with separate databases per tenant) reduces the number of instances to manage, meeting the operations team’s goal.
- Dedicated Cloud SQL instances for very large tenants allow independent performance tuning (e.g., machine type, storage, replicas) without impacting smaller tenants.
- Per-tenant databases preserve the current per-tenant schema model, avoiding a full application rewrite.
- Logical isolation is improved compared to a single shared schema: database-level separation plus application controls reduce the risk of cross-tenant data access.
- Tenant onboarding can be automated by scripting database creation on the shared instance or provisioning a new dedicated instance for large tenants, supporting rapid onboarding.
- Cloud SQL provides encryption at rest by default and supports TLS for in-transit encryption, satisfying PHI requirements when combined with proper configuration and IAM.

Why not A:
- A single instance with one database per tenant but shared tables and tenant_id is effectively a shared schema model; it increases the risk of cross-tenant data exposure due to application bugs.
- It does not leverage database-level isolation, which is desirable for PHI and multi-tenant SaaS.
- Performance tuning per tenant is difficult because all tenants share the same tables and resources.

Why not B:
- A single Cloud SQL instance with separate databases per tenant improves isolation compared to A, but it still has limitations:
  - All tenants share the same instance resources, making it hard to tune for very large tenants without overprovisioning for small ones.
  - A noisy large tenant can impact performance for all others.
- While this is manageable for a moderate number of tenants, the requirement includes both hundreds of small tenants and a few very large ones, making a single instance a scaling and performance risk.

Why not D:
- Spanner is powerful and supports horizontal scaling, but it represents a significant architectural shift from per-tenant schemas to a shared, globally distributed database model.
- It likely requires more application changes (e.g., schema redesign, query changes) than the company is willing to undertake now, conflicting with the desire to avoid a full rewrite.
- Spanner is also typically more expensive and complex to operate than Cloud SQL for smaller workloads, which may not be justified for hundreds of small tenants.

Therefore, option C offers a hybrid approach: shared infrastructure for small tenants to minimize overhead, and dedicated instances for large tenants to allow targeted performance tuning and stronger isolation.
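
The hybrid placement rule in option C can be sketched as a small routing function that onboarding automation might call. This is an illustration under stated assumptions, not a provisioning tool: the instance and database naming scheme, the `Placement` type, and the explicit `is_large` flag are all hypothetical, and how a tenant is classified as large is left to the business.

```python
from dataclasses import dataclass

# Hypothetical name for the shared Cloud SQL instance that hosts small tenants.
SHARED_INSTANCE = "shared-small-tenants"


@dataclass(frozen=True)
class Placement:
    instance: str    # Cloud SQL instance that will host the tenant
    database: str    # per-tenant database, preserving the per-tenant schema model
    dedicated: bool  # True when the tenant gets its own instance


def place_tenant(tenant_id: str, is_large: bool) -> Placement:
    """Decide where a tenant's database lives under the hybrid model."""
    database = f"tenant_{tenant_id}"
    if is_large:
        # Very large tenants get a dedicated instance that can be tuned independently.
        return Placement(instance=f"dedicated-{tenant_id}", database=database, dedicated=True)
    # Small tenants share one instance but keep a separate database each.
    return Placement(instance=SHARED_INSTANCE, database=database, dedicated=False)


def instance_count(placements: list) -> int:
    """Number of distinct instances the operations team would manage."""
    return len({p.instance for p in placements})
```

Under this rule, 300 small tenants and 3 large tenants would map to 4 instances (one shared plus three dedicated), which shows how the design keeps the instance count low while still isolating the tenants that need tuning.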

How to Study Google Cloud Architect Analyzing and optimizing technical and business processes

Combine these Google Cloud Architect Analyzing and optimizing technical and business processes practice questions with Google Cloud's official learning path and hands-on practice in the Google Cloud free tier. The PCA exam rewards applied knowledge, so always tie concepts back to real solutions you've designed and deployed.

