GCP has 200+ services
This exam expects knowledge of 40+ services
Exam tests your decision making abilities:
Which service do you choose in which situation?
This course is designed to help you make these choices
Our Goal : Help you start your cloud journey AND get certified
Challenging certification - Expects you to understand and REMEMBER a number of services
As time passes, humans forget things. How do you improve your chances of remembering things?
Active learning - think and take notes
Review the notes every once in a while
Challenge:
- Peak usage during holidays and weekends
- Less load during rest of the time
- Solution (before the Cloud):
- PEAK LOAD provisioning : Procure (Buy) infrastructure for peak load
- What would the infrastructure be doing during periods of low loads?
- Startup suddenly becomes popular
- How to handle the sudden increase in load?
- Solution (before the Cloud):
- Procure (Buy) infrastructure assuming they would be successful
- What if they are not successful?
High cost of procuring infrastructure
Needs ahead-of-time planning (Can you guess the future?)
Low infrastructure utilization (PEAK LOAD provisioning)
Dedicated infrastructure maintenance team (Can a startup afford it?)
How about provisioning (renting) resources when you want them and releasing them back when you do not need them?
On-demand resource provisioning. Also called Elasticity.
Trade "capital expense" for "variable expense"
Benefit from massive economies of scale
Stop guessing capacity
Stop spending money running and maintaining data centers
"Go global" in minutes
GCP is one of the Top 3 cloud service providers
Provides a number of services (200+)
Reliable, secure and highly-performant:
Infrastructure that powers 8 services with over 1 Billion Users: Gmail, Google Search, YouTube etc
One thing I love : "cleanest cloud"
Net carbon-neutral cloud (electricity used matched 100% with renewable energy)
The entire course is all about GCP. You will learn it as we go further.
Cloud applications make use of multiple GCP services
There is no single path to learn these services independently. HOWEVER, we've worked out a simple path!
Create GCP Account
Regions and Zones
Imagine that your application is deployed in a data center in London. What would be the challenges?
Challenge 1 : Slow access for users from other parts of the world (high latency)
Challenge 2 : What if the data center crashes?
Your application goes down (low availability)
Let's add in one more data center in London. What would be the challenges?
Challenge 1 : Slow access for users from other parts of the world
Challenge 2 (SOLVED) : What if one data center crashes?
Your application is still available from the other data center
Challenge 3 : What if entire region of London is unavailable?
Your application goes down
Let's add a new region: Mumbai. What would be the challenges?
Challenge 1 (PARTLY SOLVED) : Slow access for users from other parts of the world
You can solve this by adding deployments for your applications in other regions
Challenge 2 (SOLVED) : What if one data center crashes?
Your application is still live from the other data centers
Challenge 3 (SOLVED) : What if entire region of London is unavailable?
Your application is served from Mumbai
Imagine setting up data centers in different regions around the world
Would that be easy?
Solution
- Google provides 20+ regions around the world
- Expanding every year
- Region: Specific geographical location to host your resources
- Advantages: High Availability
- Low Latency
- Global Footprint
- Adhere to government regulations
How to achieve high availability in the same region (or geographic location)?
Enter Zones
Each Region has three or more zones
(Advantage) Increased availability and fault tolerance within same region
(Remember) Each Zone has one or more discrete clusters
Cluster : distinct physical infrastructure that is housed in a data center
(Remember) Zones in a region are connected through low-latency links
Compute
In corporate data centers, applications are deployed to physical servers
Where do you deploy applications in the cloud?
Rent virtual servers
Virtual Machines - Virtual servers in GCP
Google Compute Engine (GCE) - Provision & Manage Virtual Machines
Create and manage lifecycle of Virtual Machine (VM) instances
Load balancing and auto scaling for multiple VM instances
Attach storage (& network storage) to your VM instances
Manage network connectivity and configuration for your VM instances
Our Goal:
- Setup VM instances as HTTP (Web) Server
- Distribute load with Load Balancers
Let's create a few VM instances and play with them
Let's check out the lifecycle of VM instances
Let's use SSH to connect to VM instances
Commands:
- sudo su - execute commands as a root user
- apt update - Update package index - pull the latest changes from the APT repositories
- apt -y install apache2 - Install apache 2 web server
- sudo service apache2 start - Start apache 2 web server
- echo "Hello World" > /var/www/html/index.html - Write to index.html
- $(hostname) - Get host name
- $(hostname -I) - Get host internal IP address
IP Address types:
- Internal IP Address: Permanent internal IP address that does not change during the lifetime of an instance
- Ephemeral External IP Address: External IP address that changes when an instance is stopped
- Static IP Address: Permanent external IP address that can be attached to a VM
How do we reduce the number of steps in creating a VM instance and setting up an HTTP Server?
Let's explore a few options:
- Startup script
- Instance Template
- Custom Image
Bootstrapping: Install OS patches or software when a VM instance is launched.
In a VM, you can configure a startup script to bootstrap
DEMO - Using Startup script
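As a sketch, the SSH commands above can be combined into a startup script. This is an illustrative example; the script runs as root on first boot when attached to the instance (for example via the console's Automation field or `--metadata-from-file startup-script=startup.sh`):

```shell
#!/bin/bash
# Illustrative startup script: installs Apache and serves a page
# showing the VM's hostname and internal IP (runs as root at boot).
apt update
apt -y install apache2
service apache2 start
echo "Hello World from $(hostname) $(hostname -I)" > /var/www/html/index.html
```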
Why do you need to specify all the VM instance details (image, machine type etc) every time you launch an instance?
How about creating an Instance Template?
Define machine type, image, labels, startup script and other properties
Used to create VM instances and managed instance groups
Provides a convenient way to create similar instances
CANNOT be updated
To make a change, copy an existing template and modify it
(Optional) Image family can be specified (example - debian-9):
Latest non-deprecated version of the family is used
DEMO - Launch VM instances using Instance templates
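A minimal sketch of creating and using an instance template with gcloud (the names, zone and image family below are placeholder assumptions):

```shell
# Create a reusable template (machine type, image and startup script baked in)
gcloud compute instance-templates create my-web-template \
    --machine-type=e2-micro \
    --image-family=debian-12 --image-project=debian-cloud \
    --metadata-from-file=startup-script=startup.sh

# Launch a VM from the template without re-specifying the details
gcloud compute instances create my-web-server \
    --source-instance-template=my-web-template --zone=us-central1-a
```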
Installing OS patches and software at launch of VM instances
increases boot up time
How about creating a custom image with OS patches and software pre-installed?
Can be created from an instance, a persistent disk, a snapshot, another
image, or a file in Cloud Storage
Can be shared across projects
(Recommendation) Deprecate old images (& specify replacement image)
(Recommendation) Hardening an Image - Customize images to your corporate security standards
Prefer using Custom Image to Startup script
DEMO : Create a Custom Image and using it in an Instance Template
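A hedged sketch of the gcloud side of this demo (all resource names are placeholders):

```shell
# Create a custom image from the persistent disk of a configured VM
gcloud compute images create my-hardened-image \
    --source-disk=my-web-server --source-disk-zone=us-central1-a \
    --family=my-web-family

# Later, deprecate the old image and point users at a replacement
gcloud compute images deprecate my-old-image \
    --state=DEPRECATED \
    --replacement=projects/my-project/global/images/my-hardened-image
```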
Automatic discounts for running VM instances for significant portion of the billing month
Example: If you use N1, N2 machine types for more
than 25% of a month, you get a 20% to 50% discount on every incremental minute.
Discount increases with usage. No action required on your part!
Applicable for instances created by Google Kubernetes Engine and Compute Engine
RESTRICTION: Does NOT apply to certain machine types (example: E2 and A2)
RESTRICTION: Does NOT apply to VMs created by App Engine flexible and Dataflow
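To build intuition, here is a small Python sketch of how such tiered, automatic discounts accumulate. The tier values are an illustrative assumption (modeled on the commonly documented N1 schedule of 100%/80%/60%/40% of the base rate per quarter of the month); check the current pricing docs for exact numbers.

```python
# Illustrative sustained use discount model (tier values are assumptions,
# NOT official pricing): each 25% block of the month is billed at a
# progressively lower incremental rate.
TIERS = [1.0, 0.8, 0.6, 0.4]  # fraction of base rate per 25% block of the month

def effective_cost(base_monthly_cost: float, fraction_of_month_used: float) -> float:
    """Cost after sustained use discounts for a VM running this fraction of the month."""
    cost = 0.0
    for i, rate in enumerate(TIERS):
        block_start = i * 0.25
        # how much of this 25% block was actually used
        block = min(max(fraction_of_month_used - block_start, 0.0), 0.25)
        cost += base_monthly_cost * block * rate
    return cost

print(effective_cost(100.0, 1.0))   # full month -> 70.0
print(effective_cost(100.0, 0.25))  # quarter month -> 25.0 (no discount yet)
```

Under these assumed tiers, a VM running the full month pays 70% of the base price, i.e. an effective 30% discount, without any action on your part.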
For workloads with predictable resource needs
Commit for 1 year or 3 years
Up to 70% discount based on machine type and GPUs
Applicable for instances created by Google Kubernetes Engine and
Compute Engine
(Remember) You CANNOT cancel commitments
Reach out to Cloud Billing Support if you made a mistake while purchasing commitments
Short-lived cheaper (up to 80%) compute instances
Can be stopped by GCP any time (preempted) within 24 hours
Instances get a 30 second warning (to save anything they want to save)
Use Preemptible VMs if:
- Your applications are fault tolerant
- You are very cost sensitive
- Your workload is NOT immediate
- Example: Non immediate batch processing jobs
RESTRICTIONS:
- NOT always available
- NO SLA and CANNOT be migrated to regular VMs
- NO Automatic Restarts
- Free Tier credits not applicable
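As a sketch, a preemptible VM is requested with a single flag; a shutdown script is one way to use the 30 second warning (the names and files below are placeholders):

```shell
# Launch a cheaper, preemptible VM for fault-tolerant batch work.
# cleanup.sh runs (best effort) when GCP preempts the instance.
gcloud compute instances create my-batch-worker \
    --zone=us-central1-a \
    --preemptible \
    --metadata-from-file=shutdown-script=cleanup.sh
```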
Shared Tenancy (Default)
Single host machine can have instances from multiple customers
Sole-tenant Nodes: Virtualized instances on hardware dedicated to one customer
Use cases:
- Security and compliance requirements: You want your VMs to be physically separated from those in other projects
- High performance requirements: Group your VMs together
- Licensing requirements: Using per-core or per-processor "Bring your own licenses"
What do you do when predefined VM options are NOT appropriate for your workload?
Create a machine type customized to your needs (a Custom Machine Type)
Custom Machine Type: Adjust vCPUs, memory and GPUs
Choose between E2, N2, or N1 machine types
Supports a wide variety of Operating Systems: CentOS, CoreOS, Debian, Red Hat, Ubuntu, Windows etc
Billed per vCPUs, memory provisioned to each instance
Example Hourly Price: $0.033174 / vCPU + $0.004446 / GB
2 primary costs in running VMs using GCE:
- Infrastructure cost to run your VMs
- Licensing cost for your OS (ONLY for Premium Images)
Premium Image Examples: Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server (SLES), Ubuntu Pro, Windows Server, ..
Options For Licensing:
- You can use Pay-as-you-go model (PAYG) OR
- (WITHIN A LOT OF CONSTRAINTS) You can use your existing license/subscription (Bring your own subscription/license - BYOS/BYOL)
(RECOMMENDED) If you have existing license for a premium image, use it while your license is valid
After that you can shift to Pay-as-you-go model (PAYG)
Image
- What operating system and what software do you want on the VM instance?
- Reduce boot time and improve security by creating custom hardened Images.
- You can share an Image with other projects
- Machine Types
- Optimized combination of compute(CPU, GPU), memory, disk (storage) and networking for specific workloads.
- You can create your own Custom Machine Types when existing ones don't fit your needs
- Static IP Addresses: Get a constant IP address for VM instances
- Instance Templates: Pre-configured templates simplifying the creation of VM instances
- Sustained use discounts: Automatic discounts for running VM instances for significant portion of the billing month
- Committed use discounts: 1 year or 3 year reservations for workloads with predictable resource needs
- Preemptible VM: Short-lived cheaper (up to 80%) compute instances for non-time-critical fault-tolerant workloads
- How do you create a group of VM instances?
- Instance Group - Group of VM instances managed as a single entity
- Manage group of similar VMs having similar lifecycle as ONE UNIT
- Two Types of Instance Groups:
- Managed : Identical VMs created using a template:
- Features: Auto scaling, auto healing and managed releases
- Unmanaged : Different configuration for VMs in same group:
- Does NOT offer auto scaling, auto healing & other services
- NOT Recommended unless you need different kinds of VMs
- Location can be Zonal or Regional
- Regional gives you higher availability (RECOMMENDED)
Managed Instance Group - Identical VMs created using an instance template
Important Features:
- Maintain certain number of instances
- If an instance crashes, MIG launches another instance
- Detect application failures using health checks (Self Healing)
- Increase and decrease instances based on load (Auto Scaling)
- Add Load Balancer to distribute load
- Create instances in multiple zones (regional MIGs)
- Regional MIGs provide higher availability compared to zonal MIGs
- Release new application versions without downtime
- Rolling updates: Release new version step by step (gradually). Update a percentage of instances to the new version at a time.
- Canary Deployment: Test new version with a group of instances before releasing it across all instances.
Instance template is mandatory :
- Configure auto-scaling to automatically adjust number of instances based on load:
- Minimum number of instances
- Maximum number of instances
- Autoscaling metrics: CPU utilization target, Load Balancer utilization target, or any other metric from Stackdriver (Cloud Monitoring)
- Cool-down period: How long to wait before looking at auto scaling metrics again?
- Scale In Controls: Prevent a sudden drop in number of VM instances
- Example: Don't scale in by more than 10% or 3 instances in 5 minutes
- Autohealing: Configure a Health check with Initial delay (How long should you wait for your app to initialize before running a health check?)
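The MIG settings above map to gcloud roughly as follows (an illustrative sketch; names and thresholds are placeholders):

```shell
# Create a managed instance group of 2 identical VMs from a template
gcloud compute instance-groups managed create my-mig \
    --template=my-web-template --size=2 --zone=us-central1-a

# Autoscale between 2 and 10 instances targeting 70% CPU utilization,
# with a 120 second cool-down period between scaling decisions
gcloud compute instance-groups managed set-autoscaling my-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 --max-num-replicas=10 \
    --target-cpu-utilization=0.7 --cool-down-period=120
```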
Distribute traffic across VM instances in one or more regions
Managed service:
Google Cloud ensures that it is highly available
Auto scales to handle huge loads
Load Balancers can be public or private
Types:
External HTTP(S)
Internal HTTP(S)
SSL Proxy
TCP Proxy
External Network TCP/UDP
Internal TCP/UDP
Managed Services
Do you want to continue running applications in the cloud, the same way you run them in your data center?
OR are there OTHER approaches?
You should understand some terminology used with cloud services:
IaaS (Infrastructure as a Service)
PaaS (Platform as a Service)
FaaS (Function as a Service)
CaaS (Container as a Service)
Serverless
Let's get on a quick journey to understand these!
Use only infrastructure from cloud provider
Example: Using VM to deploy your applications or databases
You are responsible for:
- Application Code and Runtime
- Configuring load balancing
- Auto scaling
- OS upgrades and patches
- Availability
- etc.. (and a lot of things!)
Use a platform provided by cloud
Cloud provider is responsible for:
- OS (incl. upgrades and patches)
- Application Runtime
- Auto scaling, Availability & Load balancing etc..
You are responsible for:
- Configuration (of Application and Services)
- Application code (if needed)
Varieties:
- CAAS (Container as a Service): Containers instead of Apps
- FAAS (Function as a Service): Functions instead of Apps
- Databases - Relational & NoSQL (Amazon RDS, Google Cloud SQL, Azure SQL Database etc), Queues, AI, ML, Operations etc!
Enterprises are heading towards microservices architectures
Build small focused microservices
Flexibility to innovate and build applications in different programming languages (Go, Java, Python, JavaScript, etc)
BUT deployments become complex!
How can we have one way of deploying Go, Java, Python or JavaScript .. microservices?
Enter containers!
Create Docker images for each microservice
Docker image has all needs of a microservice:
Application Runtime (JDK or Python or NodeJS)
Application code and Dependencies
Runs the same way on any infrastructure:
Your local machine
Corporate data center
Cloud
Advantages
Docker containers are light weight
Compared to Virtual Machines, as they do not have a Guest OS
Docker provides isolation for containers
Docker is cloud neutral
Requirement : I want 10 instances of Microservice A container, 15 instances of Microservice B container and ....
Typical Features:
- Auto Scaling - Scale containers based on demand
- Service Discovery - Help microservices find one another
- Load Balancer - Distribute load among multiple instances of a microservice
- Self Healing - Do health checks and replace failing instances
- Zero Downtime Deployments - Release new versions without downtime
What do we think about when we develop an application?
Where to deploy? What kind of server? What OS?
How do we take care of scaling and availability of the application?
What if you don't need to worry about servers and focus on your code?
Enter Serverless
Remember: Serverless does NOT mean "No Servers"
Serverless for me:
You don't worry about infrastructure (ZERO visibility into infrastructure)
Flexible scaling and automated high availability
Most Important: Pay for use
Ideally ZERO REQUESTS => ZERO COST
You focus on code and the cloud managed service takes care of all that is needed to scale your code to serve millions of requests!
And you pay for requests and NOT servers!
Centrally hosted software (mostly on the cloud)
Offered on a subscription basis (pay-as-you-go)
Examples:
Email, calendaring & office tools (such as Outlook 365, Microsoft Office 365, Gmail, Google Docs)
Cloud provider is responsible for:
- OS (incl. upgrades and patches)
- Application Runtime
- Auto scaling, Availability & Load balancing etc..
- Application code and/or Application Configuration (How much memory? How many instances? ..)
Customer is responsible for:
- Configuring the software!
- And the content (example: docs, sheets etc)
Security in cloud is a Shared Responsibility:
Between GCP and the Customer
GCP provides features to make security easy:
- Encryption at rest by default
- IAM
- KMS etc
Customer responsibilities vary with the model:
- SaaS: Content + Access Policies + Usage
- PaaS: SaaS + Deployment + Web Application Security
- IaaS: PaaS + Operations + Network Security + Guest OS
Google Cloud is always responsible for Hardware, Network, Audit Logging etc.
App Engine: Build highly scalable applications on a fully managed serverless platform using open and familiar languages and tools
Cloud Functions: Build event driven applications using simple, single-purpose functions
Cloud Run: Develop and deploy highly scalable containerized applications.
Does NOT need a cluster!
Managed Compute Service in GCP
- Simplest way to deploy and scale your applications in GCP
- Provides end-to-end application management
Supports:
- Go, Java, .NET, Node.js, PHP, Python, Ruby using pre-configured runtimes
- Use custom run-time and write code in any language
- Connect to variety of Google Cloud storage products (Cloud SQL etc)
- No usage charges - Pay for resources provisioned
Features:
- Automatic load balancing & Auto scaling
- Managed platform updates & Application health monitoring
- Application versioning
- Traffic splitting
Compute Engine is IAAS
MORE Flexibility, MORE Responsibility:
- Choosing Image
- Installing Software
- Choosing Hardware
- Fine grained Access/Permissions (Certificates/Firewalls)
- Availability etc
App Engine is PaaS and Serverless
LOWER Flexibility, LESSER Responsibility
Standard: Applications run in language specific sandboxes
- V1: Java, Python, PHP, Go (OLD Versions)
- V2: Java, Python, PHP, Node.js, Ruby, Go (NEWER Versions)
- Complete isolation from OS/Disk
- Supports scale down to Zero instances
Flexible - Application instances run within Docker containers
- Makes use of Compute Engine virtual machines
- Support ANY runtime (with built-in support for Python, Java, Node.js, Go, Ruby, PHP, or .NET)
- CANNOT scale down to Zero instances
Managed Kubernetes service
- Minimize operations with auto-repair (repair failed nodes) and auto-upgrade (use latest version of K8S always) features
- Provides Pod and Cluster Autoscaling
- Enable Cloud Logging and Cloud Monitoring with simple configuration
- Uses Container-Optimized OS, a hardened OS built by Google
- Provides support for Persistent disks and Local SSD
Let's Have Some Fun: Let's get on a journey with Kubernetes:
Let's create a cluster, deploy a microservice and play with it in 13 steps!
- Create a Kubernetes cluster with the default node pool: gcloud container clusters create my-cluster --zone us-central1-a (or use cloud console)
- Login to Cloud Shell
- Connect to the Kubernetes cluster: gcloud container clusters get-credentials my-cluster --zone us-central1-a --project solid-course-258105
- Deploy Microservice to Kubernetes
Create deployment & service using kubectl commands :
- kubectl create deployment hello-world-rest-api --image=in28min/hello-world-rest-api:0.0.1.RELEASE
- kubectl expose deployment hello-world-rest-api --type=LoadBalancer --port=8080
Increase number of instances of your microservice:
- kubectl scale deployment hello-world-rest-api --replicas=2
Increase number of nodes in your Kubernetes cluster:
- gcloud container clusters resize my-cluster --node-pool my-node-pool --num-nodes 5
You are NOT happy about manually increasing number of instances and nodes!
Setup auto scaling for your microservice:
- kubectl autoscale deployment hello-world-rest-api --max=10 --cpu-percent=70
Also called horizontal pod autoscaling - HPA - kubectl get hpa
Setup auto scaling for your Kubernetes Cluster
- gcloud container clusters update cluster-name --enable-autoscaling --min-nodes=1 --max-nodes=10
Delete the Microservice
- Delete service - kubectl delete service hello-world-rest-api
- Delete deployment - kubectl delete deployment hello-world-rest-api
Delete the Cluster
- gcloud container clusters delete my-cluster --zone us-central1-a
Cloud Functions
Imagine you want to execute some code when an event happens?
- A file is uploaded in Cloud Storage
- An error log is written to Cloud Logging
- A message arrives to Cloud Pub/Sub
- Enter Cloud Functions
- Run code in response to events
- Write your business logic in Node.js, Python, Go, Java, .NET, and Ruby
- Don't worry about servers or scaling or availability (only worry about your code)
- Pay only for what you use
- Number of invocations
- Compute Time of the invocations
- Amount of memory and CPU provisioned
- Time Bound - Default 1 min and MAX 60 minutes (3600 seconds)
- Each execution runs in a separate instance
- No direct sharing between invocations
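A minimal sketch of such a function in Python (the function name, bucket and deploy command below are illustrative placeholders; for Cloud Storage triggers the event payload carries the object's bucket and name):

```python
# Illustrative background function: reacts to a file uploaded to Cloud Storage.
# In GCP you would deploy it with something like:
#   gcloud functions deploy handle_new_file --runtime=python311 --trigger-bucket=my-bucket
def handle_new_file(event: dict, context=None) -> str:
    bucket = event.get("bucket")
    name = event.get("name")
    message = f"Processing gs://{bucket}/{name}"
    print(message)  # appears in Cloud Logging when running in GCP
    return message

# Exercised locally with a fake event:
print(handle_new_file({"bucket": "my-bucket", "name": "report.csv"}))
# -> Processing gs://my-bucket/report.csv
```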
Cloud Run - "Container to Production in Seconds"
- Built on top of an open standard - Knative
- Fully managed serverless platform for containerized applications
- ZERO infrastructure management
- Pay-per-use (For used CPU, Memory, Requests and Networking)
- Fully integrated end-to-end developer experience:
- No limitations in languages, binaries and dependencies
- Easily portable because of container based architecture
- Cloud Code, Cloud Build, Cloud Monitoring & Cloud Logging Integrations
- Anthos - Run Kubernetes clusters anywhere
- Cloud, Multi Cloud and On-Premise
- Cloud Run for Anthos: Deploy your workloads to Anthos clusters running on-premises or on Google Cloud
- Leverage your existing Kubernetes investment to quickly run serverless workloads
How can you centrally manage multi-cloud and on-premise Kubernetes clusters ?
Anthos
Storage
What is the type of storage of your hard disk?
Block Storage
You've created a file share to share a set of files with your colleagues in an enterprise. What type of storage are you using?
File Storage
Use case: Harddisks attached to your computers
Typically, ONE Block Storage device can be connected to ONE virtual server
(EXCEPTIONS) You can attach read-only block devices to multiple virtual servers, and certain cloud providers are exploring multi-writer disks as well!
HOWEVER, you can connect multiple different block storage devices to one virtual server
Used as:
Direct-attached storage (DAS) - Similar to a hard disk
Storage Area Network (SAN) - High-speed network connecting a pool of storage devices
Used by Databases - Oracle and Microsoft SQL Server
Media workflows need huge shared storage for supporting processes like video editing
Enterprise users need a quick way to share files in a secure and organized way
These file shares are shared by several virtual servers
Block Storage:
Persistent Disks: Network Block Storage
Zonal: Data replicated in one zone
Regional: Data replicated in multiple zones
Local SSDs: Local Block Storage
File Storage:
Filestore:
- High performance file storage
Cloud Storage:
- Most popular, very flexible & inexpensive storage service
- Serverless: Autoscaling and infinite scale
Store large objects using a key-value approach:
- Treats entire object as a unit (Partial updates not allowed)
- Recommended when you operate on entire object most of the time
Access Control at Object level
Also called Object Storage
- Provides REST API to access and modify objects
- Also provides CLI (gsutil) & Client Libraries (C++, C#, Java, Node.js, PHP, Python & Ruby)
- Store all file types - text, binary, backup & archives:
- Media files and archives, Application packages and logs
- Backups of your databases or storage devices
- Staging data during on-premise to cloud database migration
- Objects are stored in buckets
- Bucket names are globally unique
- Bucket names are used as part of object URLs => Can contain ONLY lower case letters, numbers, hyphens, underscores and periods.
- 3-63 characters. Can't start with the goog prefix and should not contain google (even misspelled)
- Unlimited objects in a bucket
- Each bucket is associated with a project
- Each object is identified by a unique key
- Key is unique in a bucket
- Max object size is 5 TB
- BUT you can store unlimited number of such objects
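Buckets and objects are typically managed with the gsutil CLI mentioned later. A minimal illustrative session (the bucket name is a placeholder and must be globally unique):

```shell
# Make a bucket, then upload, list and delete objects
gsutil mb -l us-central1 gs://my-unique-bucket-name
gsutil cp backup.tar.gz gs://my-unique-bucket-name/
gsutil ls gs://my-unique-bucket-name
gsutil rm gs://my-unique-bucket-name/backup.tar.gz
```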
- Different kinds of data can be stored in Cloud Storage
- Media files and archives
- Application packages and logs
- Backups of your databases or storage devices
- Long term archives
- Huge variations in access patterns
- Can I pay a cheaper price for objects I access less frequently?
- Storage classes help to optimize your costs based on your access needs
- Designed for durability of 99.999999999%(11 9’s)
- Storage classes differ in availability and minimum storage duration
- High durability (99.999999999% annual durability)
- Low latency (first byte typically in tens of milliseconds)
- Unlimited storage
- Autoscaling (No configuration needed)
- NO minimum object size
- Same APIs across storage classes
- Committed SLA is 99.95% for multi region and 99.9% for single region for Standard, Nearline and Coldline storage classes
- No committed SLA for Archive storage
- Files are frequently accessed when they are created
- Generally usage reduces with time
- How do you save costs by moving files automatically between storage classes?
- Solution: Object Lifecycle Management
- Identify objects using conditions based on:
- Age, CreatedBefore, IsLive, MatchesStorageClass, NumberOfNewerVersions etc
- Set multiple conditions: all conditions must be satisfied for action to happen
- Two kinds of actions:
- SetStorageClass actions (change from one storage class to another)
- Deletion actions (delete objects)
- Allowed Transitions:
- (Standard or Multi-Regional or Regional) to (Nearline or Coldline or Archive)
- Nearline to (Coldline or Archive)
- Coldline to Archive
{
  "lifecycle": {
    "rule": [
      {
        "action": {"type": "Delete"},
        "condition": {
          "age": 30,
          "isLive": true
        }
      },
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "NEARLINE"
        },
        "condition": {
          "age": 365,
          "matchesStorageClass": ["STANDARD"]
        }
      }
    ]
  }
}
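Such rules are applied to a bucket with gsutil (an illustrative sketch; the bucket name is a placeholder and the rules are assumed saved as lifecycle.json):

```shell
gsutil lifecycle set lifecycle.json gs://my-unique-bucket-name  # apply rules
gsutil lifecycle get gs://my-unique-bucket-name                 # verify
```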
Most popular data destination is Google Cloud Storage
Options:
Online Transfer:
- Use gsutil or API to transfer data to Google Cloud Storage
- Good for one time transfers
Storage Transfer Service:
- Recommended for large-scale (petabytes) online data transfers from your private data centers, AWS, Azure, and Google Cloud
- You can set up a repeating schedule
- Supports incremental transfer (only transfer changed objects)
- Reliable and fault tolerant - continues from where it left off in case of errors
Storage Transfer Service vs gsutil:
gsutil is recommended only when you are transferring less than 1 TB from on-premises or another GCS bucket
Storage Transfer Service is recommended if either of the conditions is met:
- Transferring more than 1 TB from anywhere
- Transferring from another cloud
Transfer Appliance: Physical transfer using an appliance
Copy, ship and upload data to GCS
Recommended if your data size is
- greater than 20TB
- OR online transfer takes > 1 week
Process:
- Request an appliance
- Upload your data
- Ship the appliance back
- Google uploads the data
- Fast copy (up to 40Gbps)
- AES 256 encryption - Customer-managed encryption keys
- Order multiple devices (TA40, TA300) if needed
Database Fundamentals
There are several categories of databases:
Relational (OLTP and OLAP), Document, Key Value, Graph, In Memory among others
Choosing type of database for your use case is not easy. A few factors:
Do you want a fixed schema?
Do you want flexibility in defining and changing your schema? (schemaless)
What level of transaction properties do you need? (atomicity and consistency)
What kind of latency do you want? (seconds, milliseconds or microseconds)
How many transactions do you expect? (hundreds or thousands or millions of transactions per second)
How much data will be stored? (MBs or GBs or TBs or PBs) and a lot more...
This was the only option until a decade back!
Most popular (or unpopular) type of databases
Predefined schema with tables and relationships
Very strong transactional capabilities
Used for OLTP (Online Transaction Processing) and OLAP (Online Analytics Processing) use cases
OLTP: Applications where a large number of users make a large number of small transactions (small data reads, updates and deletes)
Use cases:
Most traditional applications, ERP, CRM, e-commerce, banking applications
Popular databases:
MySQL, Oracle, SQL Server etc
Recommended Google Managed Services:
Cloud SQL: Supports PostgreSQL, MySQL, and SQL Server for regional relational databases (up to a few TBs)
Cloud Spanner: Unlimited scale (multiple PBs) and 99.999% availability for global applications with horizontal scaling
Applications allowing users to analyze petabytes of data
Examples: Reporting applications, Data warehouses, Business intelligence applications, Analytics systems
Sample application : Decide insurance premiums analyzing data from last hundred years
Data is consolidated from multiple (transactional) databases
Recommended GCP Managed Service
BigQuery: Petabyte-scale distributed data warehouse
OLAP and OLTP use similar data structures
BUT very different approach in how data is stored
OLTP databases use row storage
Each table row is stored together
Efficient for processing small transactions
OLAP databases use columnar storage
Each table column is stored together
High compression - store petabytes of data efficiently
Distribute data - one table in multiple cluster nodes
Execute single query across multiple nodes - Complex queries can be executed efficiently
New approach (actually NOT so new!) to building your databases
NoSQL = not only SQL
Flexible schema
Structure data the way your application needs it
Let the schema evolve with time
Horizontally scale to petabytes of data with millions of TPS
NOT a 100% accurate generalization but a great starting point:
Typical NoSQL databases trade-off "Strong consistency and SQL features" to achieve "scalability and high-performance"
Google Managed Services:
- Cloud Firestore (Datastore)
- Cloud BigTable
Cloud Datastore - Managed serverless NoSQL document database
Provides ACID transactions, SQL-like queries, indexes
Designed for transactional mobile and web applications
Firestore (next version of Datastore) adds:
Strong consistency
Mobile and Web client libraries
Recommended for small to medium databases (0 to a few Terabytes)
Cloud BigTable - Managed, scalable NoSQL wide column database
NOT serverless (You need to create instances)
Recommended for data size > 10 Terabytes, up to several Petabytes
Recommended for large analytical and operational workloads:
NOT recommended for transactional workloads (Does NOT support multi-row transactions - supports ONLY single-row transactions)
Retrieving data from memory is much faster than retrieving data from disk
In-memory databases like Redis deliver microsecond latency by storing persistent data in memory
Recommended GCP Managed Service
Memorystore
Use cases : Caching, session management, gaming leader boards, geospatial applications
Databases/caches
- A startup with quickly evolving schema (table structure): Cloud Datastore/Firestore
- Non-relational DB with less storage (10 GB): Cloud Datastore
- Transactional global database with predefined schema needing to process millions of transactions per second: Cloud Spanner
- Transactional local database processing thousands of transactions per second: Cloud SQL
- Cache data (from database) for a web application: Memorystore
- Database for analytics processing of petabytes of data: BigQuery
- Database for storing huge volumes of stream data from IoT devices: BigTable
- Database for storing huge streams of time series data: BigTable
IAM
You have resources in the cloud (examples - a virtual server, a database etc)
You have identities (human and non-human) that need to access those resources and perform actions
For example: launch (stop, start or terminate) a virtual server
How do you identify users in the cloud?
How do you configure resources they can access?
How can you configure what actions to allow?
In GCP: Identity and Access Management (Cloud IAM) provides this service
Authentication (is it the right user?) and Authorization (do they have the right access?)
Identities can be
A GCP User (Google Account or Externally Authenticated User)
A Group of GCP Users
An Application running in GCP
An Application running in your data center
Unauthenticated users
Provides very granular control
Limit a single user:
- to perform a single action
- on a specific cloud resource
- from a specific IP address
- during a specific time window
I want to provide access to manage a specific cloud storage bucket to a colleague of mine:
Important Generic Concepts:
Member: My colleague
Resource: Specific cloud storage bucket
Action: Upload/Delete Objects
In Google Cloud IAM:
Roles: A set of permissions (to perform specific actions on specific resources)
Roles do NOT know about members. It is all about permissions!
How do you assign permissions to a member?
Policy: You assign (or bind) a role to a member
1: Choose a Role with right permissions (Ex: Storage Object Admin)
2: Create Policy binding member (your colleague) with role (permissions)
IAM in AWS is very different from GCP (Forget AWS IAM & Start FRESH!)
Example: Role in AWS is NOT the same as Role in GCP
Member : Who?
Roles : Permissions (What Actions? What Resources?)
Policy : Assign Permissions to Members
Map Roles (What?) , Members (Who?) and Conditions (Which Resources?, When?, From Where?)
Remember: Permissions are NOT directly assigned to Member
Permissions are represented by a Role
Member gets permissions through Role!
A Role can have multiple permissions
You can assign multiple roles to a Member
Roles are assigned to users through IAM Policy documents
Represented by a policy object
Policy object has list of bindings
A binding, binds a role to list of members
Member type is identified by prefix:
Example: user, serviceAccount, group or domain
{
  "bindings": [
    {
      "role": "roles/storage.objectAdmin",
      "members": [
        "user:you@in28minutes.com",
        "serviceAccount:myAppName@appspot.gserviceaccount.com",
        "group:administrators@in28minutes.com",
        "domain:google.com"
      ]
    },
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:you@in28minutes.com"
      ],
      "condition": {
        "title": "Limited time access",
        "description": "Only up to Feb 2022",
        "expression": "request.time < timestamp('2022-02-01T00:00:00.000Z')"
      }
    }
  ]
}
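A policy binding like the ones above is typically created with gcloud rather than by editing JSON directly (illustrative; the project name is a placeholder):

```shell
# Bind the Storage Object Admin role to a member on a project
gcloud projects add-iam-policy-binding my-project \
    --member=user:you@in28minutes.com \
    --role=roles/storage.objectAdmin

# Inspect the resulting policy (bindings of roles to members)
gcloud projects get-iam-policy my-project
```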
Scenario: An Application on a VM needs access to cloud storage
You DON'T want to use personal credentials to allow access
(RECOMMENDED) Use Service Accounts
Identified by an email address (Ex: id-compute@developer.gserviceaccount.com)
Does NOT have password
Has a private/public RSA key-pairs
Can't login via browsers or cookies
Service account types:
Default service account - Automatically created when some services are used
(NOT RECOMMENDED) Has Editor role by default
User Managed - User created
(RECOMMENDED) Provides fine grained access control
Google-managed service accounts - Created and managed by Google
Used by GCP to perform operations on user's behalf
In general, we DO NOT need to worry about them
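A sketch of creating and using a user-managed service account for the scenario above (all names are placeholders):

```shell
# Create a user-managed service account
gcloud iam service-accounts create my-app-sa --display-name="My App"

# Grant it only the access it needs (fine grained access control)
gcloud projects add-iam-policy-binding my-project \
    --member=serviceAccount:my-app-sa@my-project.iam.gserviceaccount.com \
    --role=roles/storage.objectViewer

# Attach it to a VM so the application uses its permissions, not yours
gcloud compute instances create my-vm --zone=us-central1-a \
    --service-account=my-app-sa@my-project.iam.gserviceaccount.com \
    --scopes=cloud-platform
```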