Pakistan's First Oracle Blog

Blog By Fahd Mirza Chughtai

SSIS in AWS RDS

Fri, 2020-05-22 03:38
Whenever migrating a SQL Server database from on-prem to AWS Cloud, my first preference is always to move it to AWS RDS, the managed database service. So whenever a client asks me to migrate an on-prem SQL Server database, my first question is:


Do you need to access the filesystem as part of this database's operations?

(Secretly wishing the answer would be NO), but more often than not, SSIS is the deal breaker in such database migrations, and the database ends up on an EC2 instance, which is still better than having it on-prem.

Managing a SQL Server on EC2 seems like a heavy chore when your other SQL Servers are humming smoothly on RDS and you know you don't have to nurse and babysit them. Well, the prayers have been answered, and the days of those EC2-based SQL Servers running SSIS are numbered.

AWS has announced SSIS support on RDS. For now, it's only compatible with SQL Server 2016 and 2017, which is a bit of a bummer, but still a welcome thing. SSIS is enabled through option groups in RDS, and you have to set up the S3 integration, which is fairly straightforward. You can find step-by-step instructions here.
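As a sketch, enabling the option via the AWS CLI looks roughly like this. The option group name, engine edition, and version here are illustrative; check the step-by-step instructions for the exact values for your instance:

```shell
# Create an option group for SQL Server Standard Edition 2017 (version 14.00)
aws rds create-option-group \
  --option-group-name ssis-se-2017 \
  --engine-name sqlserver-se \
  --major-engine-version 14.00 \
  --option-group-description "Option group with SSIS enabled"

# Add the SSIS option and apply it right away
aws rds add-option-to-option-group \
  --option-group-name ssis-se-2017 \
  --options OptionName=SSIS \
  --apply-immediately
```

The option group is then attached to the DB instance, and the S3 integration role lets SSIS read and write project files from your bucket.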

Looking forward to migrating my SSIS-stuck, EC2-based SQL Servers to RDS now.


Categories: DBA Blogs

Cloud Vanity: A Weekly Carnival of AWS, GCP and Azure - Edition 1

Thu, 2020-05-07 18:46
This is the first edition of this weekly collection about what is happening in the rapidly evolving cloud sphere. This will mainly focus on news, blogs, articles, tidbits, and views from AWS, Azure and GCP but will also include other Cloud providers from time to time. Enjoy Reading!!!




AWS:

Amazon Relational Database Service (RDS) for SQL Server now supports distributed transactions using Microsoft Distributed Transaction Coordinator (MSDTC). With MSDTC, you can run distributed transactions involving RDS for SQL Server DB instances.

Despite the Kubernetes and Serverless hypes, the vast majority of cloud workloads still happen on virtual machines. AWS offers the Amazon Elastic Compute Cloud (EC2) service, where you can launch virtual machines (AWS calls them instances).

James Beswick shows how you can import large amounts of data to DynamoDB using a serverless approach.

Amazon Lightsail provides an easy way to get started with AWS for many customers. The service balances ease of use, security, and flexibility. The Lightsail firewall now offers additional features to help customers secure their Lightsail instances.

This week AWS Security Hub launched a new security standard called AWS Foundational Security Best Practices.

GCP:

As organizations look to modernize their Windows Server applications to achieve improved scalability and smoother operations, migrating them into Windows containers has become a leading solution. And orchestrating these containers with Kubernetes has become the industry norm, just as it has with Linux.

“Keep calm and carry on.” While the words may resonate with the public, carrying on with business as usual these days is not an option for most enterprises—especially not application development and delivery teams.

During times of challenge and uncertainty, businesses across the world must think creatively and do more with less in order to maintain reliable and effective systems for customers in need.

COVID-19 is forcing us all to adapt to new realities. This is especially true for the healthcare industry. From large healthcare providers to pharmaceutical companies to small, privately run practices, nearly every customer in the healthcare industry is re-evaluating and shifting their strategies.

Protecting users and data is a big job for organizations, especially as attackers continue to attempt to access enterprise credentials and gain control of corporate machines. Google has been working hard to help protect corporate passwords with features like Password Checkup and a variety of other Chrome functionalities.

Azure:

Modern applications are increasingly built using containers, which are microservices packaged with their dependencies and configurations. For this reason, many companies are either containerizing their existing applications or creating new complex applications that are composed of multiple containers.

In the past few months, there has been a dramatic and rapid shift in the speed at which organizations of all sizes have enabled remote work amidst the global health crisis. Companies examining priorities and shifting resources with agility can help their employees stay connected from new locations and devices, allowing for business continuity essential to productivity.

Whether you're a new student, thriving startup, or the largest enterprise, you have financial constraints and you need to know what you're spending, where, and how to plan for the future. Nobody wants a surprise when it comes to the bill, and this is where Azure Cost Management + Billing comes in.

Azure Container Registry announces dedicated data endpoints, enabling tightly scoped client firewall rules to specific registries, minimizing data exfiltration concerns.

Azure Backup uses Recovery Services vault to hold customers' backup data which offers both local and geographic redundancy. To ensure high availability of backed up data, Azure Backup defaults storage settings to geo-redundancy.
Categories: DBA Blogs

Kubernetes Taints/Tolerations and Node Affinity for Dummies

Thu, 2020-04-09 21:26

In order to guarantee which pod goes to which node in a Kubernetes cluster, the concepts of taints/tolerations and node affinity are used. With taints/tolerations, we taint a node with a specific label and then add that label as a toleration in the pod manifest, so that any pod without the toleration won't be scheduled on the tainted node. To ensure that the tolerated pod only goes to the tainted node, we also add a node affinity within the pod manifest.


So in other words, taints/tolerations are used to repel undesired pods, whereas node affinity is used to guide the Kubernetes scheduler to use a node for a specific pod.

So why do we need both taints/tolerations and node affinity? To guarantee that a pod goes to its intended node. Taints/tolerations ensure that undesired pods stay away from a node, but they don't ensure that the desired pod will actually be placed on that node. To guarantee that, we use node affinity.

Following is a complete example with 4 deployments (red, blue, green, other) and 4 worker nodes (node01, node02, node03, node04).

We have labelled the nodes with their respective colors and also added a taint with the same key-value pair. Then we added a toleration in each deployment for the respective key-value pair. For example, for node01 the label is red, the taint is also red, and any pod which doesn't tolerate red won't be scheduled on this node. Then we added a node affinity which ensures that red pods will only be placed on the node with the label red. The same logic is used for the other deployments.


For Node red:

kubectl label node node01 color=red
kubectl taint node node01 color=red:NoSchedule

For Deployment red:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: red
spec:
  replicas: 1
  selector:
    matchLabels:
      color: red
  template:
    metadata:
      labels:
        color: red
    spec:
      containers:
      - name: nginx
        image: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - red
      tolerations:
      - key: "color"
        operator: "Equal"
        value: "red"
        effect: "NoSchedule"


For Node blue:

kubectl label node node02 color=blue
kubectl taint node node02 color=blue:NoSchedule

For Deployment blue:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: blue
spec:
  replicas: 1
  selector:
    matchLabels:
      color: blue
  template:
    metadata:
      labels:
        color: blue
    spec:
      containers:
      - name: nginx
        image: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - blue
      tolerations:
      - key: "color"
        operator: "Equal"
        value: "blue"
        effect: "NoSchedule"

For Node green:

kubectl label node node03 color=green
kubectl taint node node03 color=green:NoSchedule

For Deployment green:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: green
spec:
  replicas: 1
  selector:
    matchLabels:
      color: green
  template:
    metadata:
      labels:
        color: green
    spec:
      containers:
      - name: nginx
        image: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - green
      tolerations:
      - key: "color"
        operator: "Equal"
        value: "green"
        effect: "NoSchedule"


For Node Other:

kubectl label node node04 color=other

For Deployment other:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: other
spec:
  replicas: 2
  selector:
    matchLabels:
      color: other
  template:
    metadata:
      labels:
        color: other
    spec:
      containers:
      - name: nginx
        image: nginx
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: color
                operator: In
                values:
                - other
      tolerations:
      - key: "color"
        operator: "Equal"
        value: "other"
        effect: "NoSchedule"
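After applying the four deployments, placement can be verified with standard kubectl commands. A quick sketch, using the node and deployment names from the example above:

```shell
# Show which node each pod landed on; red pods should be on node01,
# blue on node02, green on node03, and "other" pods on node04
kubectl get pods -o wide

# Confirm the taint on a node
kubectl describe node node01 | grep -i taint

# Confirm the label on a node
kubectl get node node01 --show-labels
```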

Hope this helps!!!
Categories: DBA Blogs

BigQuery Materialized Views and Oracle Materialized Views

Wed, 2020-04-08 18:33
One of the common one-to-many replication setups in Oracle databases involves, at a high level, one master transaction database which holds the transactions, with an mview log created on the transaction table.


Then all the other reporting databases subscribe their respective materialized views (MViews) to this log table. These MViews remain in sync with the master log table through incremental or complete refreshes. As long as it runs fine, it runs fine, but when things break, it becomes ugly, and I mean ugly. The MViews at the reporting databases can lag behind the master log due to a network issue, or if the master database goes down. Doing a complete refresh is also a nightmare, and you have to do lots of purging and tinkering. The more subscribing MViews, the more hassle when things break.

BigQuery is Google's managed data warehousing service, which now offers materialized views. If you have managed Oracle MViews, it brings you to tears when you learn that BigQuery MViews offer the following:

Zero maintenance: A materialized view is recomputed in background once the base table has changed. All incremental data changes from the base tables are automatically added to the materialized views. No user inputs are required.

Always fresh: A materialized view is always consistent with the base table, including BigQuery streaming tables. If a base table is modified via update, merge, partition truncation, or partition expiration, BigQuery will invalidate the impacted portions of the materialized view and fully re-read the corresponding portion of the base table. For an unpartitioned materialized view, BigQuery will invalidate the entire materialized view and re-read the entire base table. For a partitioned materialized view, BigQuery will invalidate the affected partitions of the materialized view and re-read the entire corresponding partitions from the base table. Partitions that are append-only are not invalidated and are read in delta mode. In other words, there will never be a situation when querying a materialized view results in stale data.

Smart tuning: If a query or part of a query against the source table can instead be resolved by querying the materialized view, BigQuery will rewrite (reroute) the query to use the materialized view for better performance and/or efficiency.
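To make this concrete, creating a materialized view in BigQuery is a single DDL statement, here run through the bq CLI. The dataset and table names are hypothetical:

```shell
# Create a materialized view over a (hypothetical) sales table;
# BigQuery keeps it incrementally refreshed with no user input
bq query --use_legacy_sql=false '
CREATE MATERIALIZED VIEW mydataset.sales_by_product_mv AS
SELECT product, SUM(amount) AS total_amount
FROM mydataset.sales
GROUP BY product'
```

Queries that aggregate `mydataset.sales` by product can then be rewritten by BigQuery to read the materialized view instead.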

In my initial testing, things work like a charm and a refresh takes at most a couple of minutes. I will be posting some tests here very soon. But suffice it to say that delegating management of MView refresh to Google is reason enough to move to BigQuery.


Categories: DBA Blogs

How To: Reduce Cloud Bill in Half for Any Cloud Provider

Tue, 2020-03-31 17:09
“But you said, we would be only paying what we use then how come my Cloud bill is way more than my on-prem cost?”
Managements across the globe are now asking the above question of their Cloud Operations teams as recession looms, revenues drop, businesses grind almost to a standstill, budgets shrink, and the economic outlook becomes grimmer by the day.



The above statement about pay-as-you-go is still true of the cloud, but it doesn't happen just by migrating to the cloud. It has to be configured, monitored, managed, and audited. If you start a virtual machine in the cloud, use it from 9 to 5, and leave it running until the next morning, then you will be charged for 24 hours instead of just the 9-to-5 window where you were supposedly "using" it. Running in the cloud means using, and you pay for it. The same is also true for managed database services in the cloud.
Following are the top 3 things you can do right now to significantly reduce your cloud bill.
1. Control Resource Creation:
Resource creation privileges should be given to only a handful of people in your organization, and this team should create resources through a pipeline after proper approvals. So revoke resource creation rights from everyone else, and let one team do it through the pipeline with a single click.
2. Tag Resources
Tag your resources; no resource in the cloud should live without the following tags:
StartTime, StopTime, LifeSpan
These tags are self-explanatory: they tell when a resource should start, when it should stop, and how long it should remain in the cloud. You can either write your own serverless functions to enforce these timings, or use a third-party or cloud-native solution to implement it. (Drop me a line if you want further guidance on this.) This point is a game-changer for your bill, especially for VMs and managed databases.
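Enforcing the tags can be sketched with a small script run on a schedule. The StopTime tag scheme and its 24-hour UTC "HH:MM" format are my assumptions for illustration, not a standard:

```shell
#!/bin/sh
# Stop any running EC2 instance whose (hypothetical) StopTime tag has passed
now=$(date -u +%H:%M)
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" "Name=tag-key,Values=StopTime" \
  --query 'Reservations[].Instances[].[InstanceId,Tags[?Key==`StopTime`]|[0].Value]' \
  --output text |
while read -r instance_id stop_time; do
  # Lexical comparison works for zero-padded HH:MM strings
  if [ "$now" \> "$stop_time" ]; then
    aws ec2 stop-instances --instance-ids "$instance_id"
  fi
done
```

The same idea extends to StartTime (with `start-instances`) and LifeSpan (with `terminate-instances`).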
3. Remove the Clutter
No matter how little it costs, if it's not needed or you cannot find the owner, just delete it. Orphan persistent disk volumes, ownerless compute images, dangling object stores, and all stray services must be stopped and deleted.
If followed, these 3 points can cut your cloud bill in half, or even more, with any cloud provider. The cloud's honeymoon period is over, and with all this belt-tightening around, it's time to save.
Categories: DBA Blogs

Basic DynamoDB Data Modelling

Sat, 2020-03-28 01:58



The key differentiator between a NoSQL database like AWS DynamoDB and a relational database like Oracle is the way data is modelled. In a relational model, data is normalized and de-duplicated, and relationships are established between entities. In a NoSQL database like DynamoDB, data can be duplicated and entities can be fused together. This is done to facilitate scalability, which is the hallmark of today's hyper-scale cloud-based applications.


One of the most common relationships within data modelling is one-to-many. For example, we have 2 entities: Museum and Painting. A painting can only reside in one museum, whereas a museum can have multiple paintings. This is a one-to-many relationship. In the relational modelling world, we would have 2 tables, Museum and Painting, such as:


Museum

MuseumId   MuseumName
M1         Sofía
M2         NSW Gallery
M3         Louvre

Painting

PaintingId   PaintingName            MuseumId
P1           Guernica                M1
P2           The Great Masturbator   M1
P3           Sofala                  M2
P4           Mona Lisa               M3

In the above one-to-many relationship, we have joined Museum and Painting with a primary key and foreign key relationship. So, for example, if we want to check all paintings in museum M1, the query would be:

select p.PaintingName, m.MuseumName from Painting p, Museum m where p.MuseumId = m.MuseumId and m.MuseumId = 'M1';

Now, joins are expensive, and they become more expensive as the data grows. In huge cloud-scale databases this hampers scalability, and at times it becomes impractical to have these joins; hence NoSQL databases like DynamoDB.

So how do you model a one-to-many relationship in a DynamoDB table when we cannot have joins and aggregations?

We know what the entities are, and we have an access pattern, so now let’s design our table.


MuseumPaintingTable
(PK and SK form the composite primary key; MuseumName, Title, and Artist are attributes)

PK          SK            MuseumName    Title                   Artist
Museum#M1   Painting#P1   Sofía         Guernica                Picasso
Museum#M1   Painting#P2   Sofía         The Great Masturbator   Salvador Dali
Museum#M2   Painting#P3   NSW Gallery   Sofala                  Russel Drysdale
Museum#M3   Painting#P4   Louvre        Mona Lisa               Leonardo

If you are coming from a relational world, the above table may seem like blasphemy, even though, to keep it simple, I haven't yet used different attributes for each item. This is where you need to appreciate the flexibility of NoSQL databases to facilitate scalability.

In the above table, we have defined a composite primary key, consisting of a partition key and a sort key. A primary key in a DynamoDB table defines a unique item, where an item corresponds to a row. Remember that all access patterns are driven through the primary key, which means that the entry point for your queries must be the primary key. Every museum and painting has its own unique id, so we have combined them together to generate a unique item.

Note how we have defined a one-to-many relationship in the above table using the partition key and sort key within the primary key. For one Museum#M1, we have 2 paintings, Painting#P1 and Painting#P2, and both of these paintings have their own attributes, which can even differ in structure. We have fused the Painting and Museum entities together.
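To make this concrete, here is how one of the items above could be written with the AWS CLI. Table and attribute names follow the example; adjust them to your own setup:

```shell
# Insert one painting item under the Museum#M1 partition
aws dynamodb put-item \
  --table-name MuseumPaintingTable \
  --item '{
    "PK":         {"S": "Museum#M1"},
    "SK":         {"S": "Painting#P1"},
    "MuseumName": {"S": "Sofía"},
    "Title":      {"S": "Guernica"},
    "Artist":     {"S": "Picasso"}
  }'
```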

Now how would you write the previous relational query in DynamoDB, where we wanted to check all paintings in the Museum M1? It would be like this:

aws dynamodb query \
    --table-name MuseumPaintingTable \
    --key-condition-expression "PK = :MID" \
    --expression-attribute-values '{":MID":{"S":"Museum#M1"}}' \
    --projection-expression 'Title,MuseumName'

The above command calls the DynamoDB Query API through the AWS CLI.
Categories: DBA Blogs

Why on Earth Would You Manage Your Own Kubernetes Control Plane?

Sat, 2020-03-14 21:40
The word Kubernetes comes from the Greek word for 'helmsman'. A helmsman is the person who navigates the ship, ensuring a steady and desired course. Whether the seas are turbulent or not, if some component of the ship fails, or the load of containers on the ship needs to be re-balanced, it's the job of the helmsman, or Kubernetes, to make sure the ship maintains its desired state until it reaches its destination.





For managing software containers, then, Kubernetes seems like an apt name. These containers comprise the ready-to-run code, the code's run-time, and the code's dependencies. Containers are encapsulated within the abstraction of Pods; a Pod holds one or more tightly coupled containers. Pods are then abstracted into a ReplicaSet, which maintains copies of the pods for high availability. ReplicaSets are abstracted into a Deployment, a stateless Kubernetes object. Then there is the Service object, which exposes the application to the networks connected to the Kubernetes cluster. All of these objects run in the data plane section of Kubernetes.

In order to manage all the deployments, services, pods, and other objects, Kubernetes has master nodes in what is called the control plane. The control plane interfaces with the data plane through the API server. It also hosts controllers to ensure the desired state of nodes, pods, and other objects in the data plane; the scheduler, to schedule new pods on pertinent nodes; the etcd key-value persistent data store, which holds the configuration and state of your whole Kubernetes cluster; a cloud-provider-specific cloud manager; and so on.

Just step back a bit and think why we are doing this. All of this complexity just to provision software in a steady and reliable way? Isn't developing software complicated enough without resorting to such complexity to manage it? That is why it makes sense to delegate management of the Kubernetes control plane to a managed Kubernetes service provider like Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), Microsoft Azure Kubernetes Service (AKS), Linode Kubernetes Engine (LKE), and a few others.
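For contrast, standing up a cluster where the provider runs the whole control plane is a single command. A GKE sketch, with an illustrative cluster name and zone:

```shell
# Create a managed cluster; Google runs the API server, etcd,
# scheduler, and controllers for you
gcloud container clusters create demo-cluster \
  --zone us-central1-a \
  --num-nodes 3

# Fetch credentials so kubectl talks to the new cluster
gcloud container clusters get-credentials demo-cluster --zone us-central1-a
```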

Why not focus more and more on aligning software to user expectations and experience, rather than on managing the software needed to provision and maintain that software? Other than Kubernetes, modern software also has to reside in some git-oriented repository, follow an agile methodology, go through a pipeline, get packaged into a container image, and then be mounted on Kubernetes. Handing over management of the Kubernetes control plane, and some part of the data plane, to a managed service only makes sense. All the managed service providers mentioned above, and a few others, are Cloud Native Computing Foundation (CNCF) certified. They go through rigorous security and compliance checks all the time, they keep the Kubernetes versions updated and stable, and much more.

If you are running Kubernetes on-prem or in the cloud and still managing its control plane yourself, I will be very interested to learn the reason behind it. Please share in the comments. Thanks.
Categories: DBA Blogs

What is Nomos in Anthos?

Sat, 2020-03-07 19:41
In Kubernetes, everything is code. For code, the best place to reside is a git repository, which should act as the single source of truth. GCP Anthos enables us to extend the power of Kubernetes to multi- and hybrid clouds. With this power comes the added complexity of managing multiple Kubernetes clusters scattered across GKE, GKE on-prem, or other cloud providers.



Anthos Config Management solves this issue elegantly by declaring the desired state of clusters in YAML or JSON files called Configs. Anthos then ensures, top-down, that all clusters stay reconciled to the desired state. This is the same principle of configuration as code which is the soul of Kubernetes manifests.

In the context of Anthos and Google Kubernetes Engine (GKE), nomos is a CLI tool which checks config files in a git repository for syntax errors. The nomos tool is installed locally, and with it you can check config files for syntax before pushing them to your single-source-of-truth Git repository.

The usage of nomos is quite intuitive. For help, use:

nomos --help

To check the installation status of nomos:

nomos status

To initialize a new Anthos Config Management repo, create an empty directory, change to it, initialize a new Git repository, then run the nomos init command:

mkdir my-repo
cd my-repo
git init
nomos init

This creates the basic directory structure of your repo, including the system/, cluster/, and namespaces/ directories.

Before you commit a config to the repo, use the nomos vet command to check the syntax and validity of the configs in your repo:

nomos vet

If syntax errors are found, the nomos vet command exits with a non-zero status.

You can read more about nomos here.
Categories: DBA Blogs

Anthos is Icing on GKE Cake

Thu, 2020-03-05 21:31
If you have ever installed and managed all of Kubernetes by yourself, then you can appreciate how much of a relief it is to hand over the control plane and node components to services like GKE, AKS, and EKS, or any similar managed Kubernetes service.
So far, for me, Google Kubernetes Engine (GKE) is the winner hands down when it comes to managed Kubernetes services, due to its maturity, ease of use, and, more importantly, seamless integration with the rest of the Google Cloud Platform (GCP) services. AWS and Azure are also catching up fast, but still have some distance to go.

Google is one leap ahead with Kubernetes by offering the fragrant Anthos. Anthos enables you to run managed Kubernetes anywhere, including on-prem, GKE, AWS, Azure, or any other supported cloud provider. Google is very actively improving Anthos, as it clearly seems to be the next big thing in the realm of DevOps. It's heralding a new era in hybrid and multi-cloud environments, whereas cloud providers like AWS aren't much of a fan of the term multi-cloud.

While Anthos is great at blending Kubernetes into your on-prem and cloud environments and providing a consistent application management framework, it could also introduce bad practices big time. One such feature which could be badly exploited is transferring all on-prem garbage to the cloud. For instance, with Anthos Migrate, you can simply convert your VMs running on-prem to containers and then move them into Kubernetes.

This is against the very spirit of light-weight, immutable, micro-service containers. This lift and shift is powerful yet dangerous, as no Kubernetes can mask a heavy VM's monolithic burden. So it should be imperative to prune the VMs before migrating them as containers to GKE through Anthos.

Having said that Anthos is a thing of beauty and here to stay!!!!
Categories: DBA Blogs

Rename SQL Server Database in AWS RDS

Tue, 2020-03-03 15:06
In AWS RDS based SQL Server, you cannot rename a database from SQL Server Management Studio; you will get an error about permissions. You need to use the rdsadmin procedure to accomplish that.



If you get the error:

Msg 5030, Level 16, State 2, Line 1
The database could not be exclusively locked to perform the operation.

Then you need to set the database to single user first, rename it, and then revert back to multi user. Use the following steps to rename a SQL Server database in AWS RDS:

use master
ALTER DATABASE OldDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE    
exec rdsadmin.dbo.rds_modify_db_name 'OldDB','NewDB'
ALTER DATABASE NewDB SET MULTI_USER

Hope that helps.


Categories: DBA Blogs

RDS Resource in Terraform

Fri, 2020-01-17 21:19
Just marveling again at how easy and clean it is to create an RDS resource in Terraform.

# Create the RDS instance
resource "aws_db_instance" "databaseterra" {
  allocated_storage       = "${var.database_size}"
  engine                  = "${var.DbEngineType}"
  engine_version          = "${var.DbMajorEngineVersion}"
  instance_class          = "${var.DBInstanceClass}"
  identifier              = "testdbterra"
  username                = "${var.database_user}"
  password                = "${var.database_password}"
  db_subnet_group_name    = "${aws_db_subnet_group.rdssubnetgroupterra.id}"
  parameter_group_name    = "${aws_db_parameter_group.rdsparametergroupterra.id}"
  multi_az                = "${var.db_multiaz}"
  vpc_security_group_ids  = ["${aws_security_group.RdsSecurityGroupterra.id}"]
  publicly_accessible     = "false"
  backup_retention_period = "2"
  license_model           = "license-included"
  apply_immediately       = "true"
  tags = {
    Env = "Non-Prod"
  }
}
Just have your vars ready and that's it.
Categories: DBA Blogs

Use Conda to Generate Requirements.txt for Docker Containers

Fri, 2019-12-27 00:01
pip is the standard Python package manager: a requirements.txt generated in one environment can be installed by pip in a new environment. Conda, by contrast, replicates its own installation. pip produces a list of the packages that were installed on top of the standard library to make the package you wrote work.

Following are the steps to generate a requirements.txt file to be used inside a Dockerfile for docker containers:



Go to your project environment:

conda activate <env-name>

List the packages used in the environment:

conda list

Save all the info about the packages to a file in your project folder:

conda list -e > requirements.txt

Alternatively, export the full environment definition as YAML:

conda env export > environment.yml

If the container will install packages with pip rather than conda, generate the file with pip instead:

pip freeze > requirements.txt

Hope that helps.
Categories: DBA Blogs

Urdu in AWS

Fri, 2019-10-04 00:24
Urdu is arguably one of the most beautiful and poetic languages on the planet. AWS Translate now supports Urdu along with 31 other languages, which is awesome.



AWS Translate is growing by leaps and bounds and has matured quite a lot over the last few months. There are now hundreds of translation combinations, and it's now available in all regions.


Amazon Translate is a text translation service that uses advanced machine learning technologies to provide high-quality translation on demand. You can use Amazon Translate to translate unstructured text documents or to build applications that work in multiple languages.


Amazon Translate provides translation between a source language (the input language) and a target language (the output language). A source language-target language combination is known as a language pair.
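A quick sketch of using the en→ur language pair from the AWS CLI; the region and sample text are illustrative:

```shell
# Translate English text to Urdu; the response JSON contains TranslatedText
aws translate translate-text \
  --region us-east-1 \
  --source-language-code en \
  --target-language-code ur \
  --text "Knowledge is power"
```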


As with other AWS products, there are no contracts or minimum commitments for using Amazon Translate.



Categories: DBA Blogs

How to do a quick health check of AWS RDS database

Fri, 2019-10-04 00:16
Just because the database is on AWS RDS doesn't mean that it won't run slow or get stuck. So when your users complain about the slowness of your RDS database, do the following quick health check:
1- From the AWS console, in the RDS section, go to your database and then to the Logs and Events tab. From Logs, check the alert log for Oracle, the Error log for SQL Server, the postgres log for PostgreSQL, and the error log for MySQL. Check for any errors or warnings and proceed accordingly per that database engine.


2- If you don't see any errors or warnings, or if you want to check further, then first check which database instance type you are using. For example, for one of my test Oracle databases, it is db.r4.4xlarge.


Go to https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.DBInstanceClass.html and check specifications of this instance type.

For instance, for db.r4.4xlarge, it is :


Instance Class   vCPU   ECU   Memory (GiB)   VPC Only   EBS Optimized   Max. Bandwidth (Mbps)   Network Performance
db.r4.4xlarge    16     53    122            Yes        Yes             3,500                   Up to 10 Gbps


So this db.r4.4xlarge has a max bandwidth (throughput) of 437.5 MB/s (3,500 Mbps / 8 = 437.5 MB/s). The throughput limit is separate for read and write, which means you get 437.5 MB/s for reads and 437.5 MB/s for writes.
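The conversion above can be checked with a one-liner:

```shell
# Convert EBS bandwidth from megabits/s to megabytes/s (divide by 8)
awk -v mbps=3500 'BEGIN { printf "%.1f MB/s\n", mbps / 8 }'
# → 437.5 MB/s
```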


3- Now go to the Monitoring tab of this RDS instance in your console and check Read Throughput and Write Throughput to see if your instance is touching the above threshold, in this case 437.5 MB/s. If yes, then you know that IO is the issue, and you may need to either tune the SQLs responsible or increase the instance size.


4- Similarly, from the same Monitoring tab, check CPU usage, free memory, and free storage space to make sure no other threshold is being reached.
5- Also check the Disk Queue Depth: the number of IO requests waiting to be serviced. Time spent waiting in the queue is a component of latency and service time. Ideally a disk queue depth of 15 or less is good, but if you notice latency greater than 10 milliseconds accompanied by a high disk queue depth, that could cause performance issues.


6- Last but not least, reviewing and tuning your SQLs is the biggest optimization gain you can achieve, whether your database is in RDS or not.


Hope it helps.
Categories: DBA Blogs

Cloning of RDS Instance to Another Account

Wed, 2019-08-28 21:01
Frequently, we need to refresh our development RDS-based Oracle database from production, which is in another AWS account. So we take a snapshot from production, share it with the other account, and then restore the target from the snapshot.

I will post the full process later, but for now I'm just sharing an issue we encountered today. While trying to share a snapshot with another account, I got the following error:


Sharing snapshots encrypted with the default service key for RDS is currently not supported.


Now, this snapshot was encrypted with the default RDS key, and that is not supported. So in order to share it, we need a customer managed key, we copy the snapshot with that new key, and only then can we share the copy. You don't have to do anything special at the target, as the customer managed key goes along with the snapshot. You can create customer managed keys in the KMS console and assign them to the IAM user you are using.
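Here is a sketch of that flow with boto3. The snapshot names, key ARN, and account ID are placeholders, and the client is passed in so you can point it at your own session:

```python
def share_encrypted_snapshot(rds, source_snap, target_snap, cmk_arn, target_account):
    # 1) Re-encrypt by copying the snapshot with a customer managed key;
    #    sharing a snapshot encrypted with the default aws/rds key fails.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=source_snap,
        TargetDBSnapshotIdentifier=target_snap,
        KmsKeyId=cmk_arn,
    )
    rds.get_waiter('db_snapshot_available').wait(DBSnapshotIdentifier=target_snap)
    # 2) Share the re-encrypted copy with the target account.
    rds.modify_db_snapshot_attribute(
        DBSnapshotIdentifier=target_snap,
        AttributeName='restore',
        ValuesToAdd=[target_account],
    )
    return target_snap

# rds = boto3.client('rds')  # then, with placeholder identifiers:
# share_encrypted_snapshot(rds, 'prod-snap', 'prod-snap-cmk',
#                          'arn:aws:kms:ap-southeast-2:...:key/...', '111122223333')
```

One extra step to remember: the key's policy (in the source account) must grant the target account permission to use that customer managed key, or the shared snapshot cannot be copied or restored there.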


I hope it helps.
Categories: DBA Blogs

DevOps for Oracle DBA

Sun, 2019-08-25 00:21
DevOps is a natural evolution for Oracle database administrators, or sysadmins of any kind. The key to remaining relevant in the industry is to embrace DevOps, today and in the near future.

The good news is that if you are an Oracle DBA, you already have a solid foundation. You have worked with an enterprise, world-class database system and are well versed in high availability, disaster recovery, performance optimization, and troubleshooting. Having said that, there is still lots to learn, and unlearn, to become a DevOps engineer.


You would need to look outside the Oracle-and-Linux-shell core competency mantra. You would need to learn a proper programming language such as Python. You would need to learn a software engineering framework like the Agile methodology, and you would need to learn tools such as Git. Above all, you would need to unlearn the idea that you only manage the database. As a DevOps engineer in today's cloud era, you would be responsible for end-to-end delivery.


Without cloud skills, it's impossible to transition from Oracle DBA to a DevOps role. Regardless of the cloud provider, you must know networking, compute, storage, and infrastructure as code. You already know the database side of things, but now learn a decent amount about other databases too, as you would be expected to migrate and manage them in the cloud.


So any public cloud (AWS, Azure, or GCP), plus a programming language like Python, Go, or Node.js, plus agile concepts, infrastructure as code such as Terraform or CloudFormation, and a plethora of tools like code repositories and pipelines would be required to become a capable DevOps engineer.


Becoming obsolete by merely staying an Oracle DBA is not an option. So buckle up and start the DevOps journey today.
Categories: DBA Blogs

Understanding Nested Lists Dictionaries of JSON in Python and AWS CLI

Sat, 2019-04-20 03:01

After lots of hair pulling and bouts of frustration, I was able to grasp this nested list and dictionary thingie in the JSON output of AWS CLI commands such as describe-db-instances and others. If you run describe-db-instances for RDS or describe-instances for EC2, you get a huge pile of JSON mumbo-jumbo with all those curly and square brackets studded with colons and commas. The output is heavily nested.


For example, if you do :

aws rds describe-db-instances

you get all the information, but heavily nested within. Now if you only want to extract or iterate through, say, the VpcSecurityGroupId of all database instances, then you have to traverse all that nested information, which comprises a dictionary whose keys have arrays as values, and those arrays contain more dictionaries, and so on.

After the above rant, let me try to ease the pain a bit by explaining this. For clarity, I have taken just the following chunk out of the describe-db-instances output. Suppose the only thing you are interested in is the value of VpcSecurityGroupId in the following chunk:

mydb=rds.describe_db_instances(DBInstanceIdentifier=onedb)
mydb= {'DBInstances':
          [
            {'VpcSecurityGroups': [ {'VpcSecurityGroupId': 'sg-0ed48bab1d54e9554', 'Status': 'active'}]}
          ]
       }

The variable mydb is a dictionary with the key DBInstances. This key DBInstances has an array as its value. Now the first item of that array is another dictionary, and the first key of that dictionary is VpcSecurityGroups. The value of this key VpcSecurityGroups is another array, and this array's first item is again a dictionary. This last dictionary has a key VpcSecurityGroupId, and we want the value of this key.

If your head has stopped spinning, then read on and stop cursing me as I am going to demystify it now.

If you want to print that value, just use the following expression:

mydb['DBInstances'][0]['VpcSecurityGroups'][0]['VpcSecurityGroupId']

So the secret is that if it's a dictionary, then use the key name, and if it's an array, then use the index, and keep going. That's all there is to it. Full code to print this using Python and boto3 is as follows:

import boto3
import click

rds = boto3.client('rds', region_name='ap-southeast-2')

@click.group()
def cli():
    "Gets RDS data"
    pass

@cli.command('list-database')
@click.argument('onedb')
def list_database(onedb):
    "List info about one database"
    mydb=rds.describe_db_instances(DBInstanceIdentifier=onedb)
    #print(mydb)
    #Following line only prints value of VpcSecurityGroupId of RDS instance
    print(mydb['DBInstances'][0]['VpcSecurityGroups'][0]['VpcSecurityGroupId'])
    #Following line only prints value of OptionGroup of RDS instance
    print(mydb['DBInstances'][0]['OptionGroupMemberships'][0]['OptionGroupName'])
    #Following line only prints value of Parameter Group of RDS instance
    print(mydb['DBInstances'][0]['DBParameterGroups'][0]['DBParameterGroupName'])

if __name__ == '__main__':
    cli()
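The same key-or-index rule can be folded into a tiny generic helper, here called dig (a sketch of my own, using the sample structure from above), so the path reads left to right:

```python
def dig(obj, *path):
    # Walk a nested structure: a string step indexes a dict, an int step a list.
    for step in path:
        obj = obj[step]
    return obj

mydb = {'DBInstances': [
    {'VpcSecurityGroups': [
        {'VpcSecurityGroupId': 'sg-0ed48bab1d54e9554', 'Status': 'active'}]}]}

print(dig(mydb, 'DBInstances', 0, 'VpcSecurityGroups', 0, 'VpcSecurityGroupId'))
# sg-0ed48bab1d54e9554
```

It is exactly equivalent to the chained-subscript line above, just easier to reuse when you are pulling several values out of the same response.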


I hope that helps. If you know an easier way, please do us a favor and let us know in the comments. Thanks.

Categories: DBA Blogs

AWS CloudFormation Linter Installation and Quick Introduction

Thu, 2019-04-04 23:27
AWS CloudFormation is an evolving managed service which facilitates infrastructure as code in the cloud. What it means is that you can create AWS resources like EC2 instances, S3 buckets and many more just by writing code instead of using the GUI console.



For instance, if you want to create 100 EC2 instances for production, and later you have to create the same for development and testing, and then maybe later on you need to change the configuration of those EC2 instances, doing all that through the GUI would be a very tedious task. With CloudFormation, you just describe those EC2 instances once in a file called a template and then run it.

Now another cool thing here is that you don't have to write code for all those 100 EC2 instances in that template. You may just describe one EC2 instance and then use CloudFormation macros to provision 100 or even more of them (AWS resource limits apply).
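To make that fan-out concrete, here is a plain-Python sketch of the idea: describe one instance, stamp out many. This is not the CloudFormation macro mechanism itself, just the shape of what it automates, and the AMI ID and instance type below are placeholders:

```python
import json

def make_template(count, ami='ami-12345678', instance_type='t2.micro'):
    """Build a CFN template dict with `count` identical EC2 instances."""
    resources = {
        f'MyEc2Instance{i}': {
            'Type': 'AWS::EC2::Instance',
            'Properties': {'ImageId': ami, 'InstanceType': instance_type},
        }
        for i in range(count)
    }
    return {'AWSTemplateFormatVersion': '2010-09-09', 'Resources': resources}

template = make_template(100)
print(len(template['Resources']))  # 100 instances, one description
# json.dumps(template) then gives you a template body to hand to CloudFormation.
```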

Anyway, CloudFormation templates are written in either JSON or YAML, and they grow big fairly quickly, making it hard to keep track of syntactical and other errors. With aws cloudformation validate-template or with the GUI you can identify a malformed template file, but what if you have written a property of a resource wrongly? What if a resource specification is not as per the AWS resource specification document?

Enter AWS CloudFormation Linter.

Linting is an ancient concept from the good old C era, referring to a tool that checks source code for syntax errors and bugs. So CloudFormation Linter, or cfn-lint, is a nifty tool to check for syntactical errors and bugs in the template. I tried to catch logical errors with cfn-lint in a template, but it was unable to identify them. For example, in my template building a code pipeline, I omitted the cfn-init signal that lets CFN know the launch configuration is complete so that the deployment group of CodeDeploy can use it, and cfn-lint was unable to catch that. So that would be a great future feature for cfn-lint.

To install it, run pip install cfn-lint, and then check a template with cfn-lint ec2.yaml; syntax problems and invalid resource properties are reported with file and line numbers.

Chuck Meyer has another cool blog post about cfn-lint and git pre-commit validation.

Categories: DBA Blogs

Step by Step Troubleshooting and Validation of AWS CloudFormation Yaml template

Mon, 2019-04-01 02:00
CloudFormation (CFN) is the infrastructure-as-code service of AWS. You just tell CFN your desired state of resources and it creates them in order while resolving dependencies. You describe these resources in a file called a template, which can be written in YAML or JSON. YAML, being a bit more human-readable, is widely used now. YAML is great, but you have to be careful with its indentation.


Following is a step-by-step troubleshooting and validation of a simple CFN template using the AWS CLI from a Windows 10 command prompt.

Step 1: Create your template file. I am using ec2.yaml which just creates an EC2 instance and a security group with some parameters and outputs.


#####################################################################################################
#  Template : Use to deploy simple T2.MICRO EC2 Instance with a security group
#
#  Author   : Fahd Mirza
#
#  Created on :  23rd March, 2019
#
#####################################################################################################
---
AWSTemplateFormatVersion: "2010-09-09"
Description: Deploy EC2 Instance with a security group
######################################################################################################
Parameters:
######################################################################################################
  InstanceType:
    Type: String
    Default: t2.micro
    AllowedValues:
      - t2.micro
      - t2.2xlarge
  MyRSAKey:
    Type: String
    Description: Supply your RSA Key Name
Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-1853ac65
    us-west-1:
      AMI: ami-bf5540df
    eu-west-1:
      AMI: ami-3bfab942
    ap-southeast-1:
      AMI: ami-e2adf99e
    ap-southeast-2:
      AMI: ami-43874721   
######################################################################################################
Resources:
######################################################################################################
  MySecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow SSH to EC2 Bastion Host
      SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: '22'
        ToPort: '22'
        CidrIp: 0.0.0.0/0
      Tags:
      - Key: Name
        Value: MyBastionSG112
  MyEc2Instance:
    Type: AWS::EC2::Instance
    Metadata:
      AWS::CloudFormation::Init:
        config:
          files:
            /etc/cfn/cfn-hup.conf:
              content: !Sub |
                    [main]
                    stack=${AWS::StackId}
                    region=${AWS::Region}
                    interval=1
              mode: '000744'
              owner: root
              group: root
            /etc/cfn/hooks.d/cfn-auto-reloader.conf:
              content: !Sub |
                     [cfn-auto-reloader-hook]
                     triggers=post.update
                     path=Resources.MyEc2Instance.Metadata.AWS::CloudFormation::Init
                     action=/opt/aws/bin/cfn-init -v --stack ${AWS::StackId} --region ${AWS::Region} --resource MyEc2Instance
                     runas=root
            ~/hello1.txt:
               content: !Sub |
                     hello world, I am from files section of CFN Init metadata.         
          commands:
            RunA:
              command: echo "hello world, I am from command section of CFN Init metadata." > ~/hello.txt
              ignoreErrors: "true"
          services:
              sysvinit:
                  cfn-hup:
                    enabled: true
                    ensureRunning: true
                    files: [/etc/cfn/cfn-hup.conf, /etc/cfn/hooks.d/cfn-auto-reloader.conf]
    Properties:
      Tags:
      - Key: Name
        Value: !Ref "AWS::StackName"
      ImageId:
        Fn::FindInMap:
        - RegionMap
        - !Ref AWS::Region
        - AMI       
      InstanceType:
        Ref: InstanceType
      KeyName:
        Ref: MyRSAKey
      UserData:
        Fn::Base64: !Sub |
              #!/bin/bash -xe
              yum install -y aws-cfn-bootstrap
              chkconfig --add cfn-hup
              /opt/aws/bin/cfn-init -v --stack ${AWS::StackId} --resource MyEc2Instance --region ${AWS::Region}
              /opt/aws/bin/cfn-signal -e 0 --stack ${AWS::StackName} --resource MyEc2Instance --region ${AWS::Region}
    CreationPolicy:
      ResourceSignal:
        Timeout: "PT15M"
######################################################################################################
Outputs:
######################################################################################################
  MyEC2InstancePublicIP:
    Description: My EC2 Instance Public IP
    Value: !GetAtt MyEc2Instance.PublicIp
  MyEC2InstanceID:
    Description: My EC2 Instance ID
    Value: !Ref MyEc2Instance
  MyEC2SecGroup:
    Description: My EC2 Security Group
    Value: !Ref MySecurityGroup

Step 2: When I first ran validation on this template, it gave me errors like the following, caused by indentation (the Mappings section was being read as part of Parameters). I fixed the indentation using a simple editor; you can also use any online editor or Visual Studio Code.

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>dir
 Volume in drive C is Windows
 Volume Serial Number is 5285-4635

 Directory of C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates

23/03/2019  12:54 PM              .
23/03/2019  12:54 PM              ..
23/03/2019  12:49 PM             4,360 ec2.yaml
23/03/2019  12:54 PM             2,461 my.yaml
               2 File(s)          6,821 bytes
               2 Dir(s)  807,032,090,624 bytes free

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>aws cloudformation validate-template --template-body file://my.yaml

An error occurred (ValidationError) when calling the ValidateTemplate operation: [/Parameters/Mappings] 'null' values are not allowed in templates

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>aws cloudformation validate-template --template-body file://ec2.yaml

An error occurred (ValidationError) when calling the ValidateTemplate operation: Invalid template parameter property 'RegionMap'

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>aws cloudformation validate-template --template-body file://ec2.yaml

Step 3: Then, a resource in the Resources section referenced MyVPCId, which wasn't declared in the Parameters section. So I removed the reference.

An error occurred (ValidationError) when calling the ValidateTemplate operation: Template format error: Unresolved resource dependencies [MyVPCId] in the Resources block of the template

Step 4: Now it works like a charm.

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>aws cloudformation validate-template --template-body file://ec2.yaml
Deploy EC2 Instance with a security group
PARAMETERS              Supply your RSA Key Name        False   MyRSAKey
PARAMETERS      t2.micro                False   InstanceType

C:\Users\mirza\Desktop\Cloud\AWS\Cloud\CloudFormation\templates>
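As a quick local pre-check before calling validate-template, you can parse the template yourself. This sketch assumes PyYAML is installed; the loader shim is needed because PyYAML does not know CloudFormation's short-hand tags like !Sub and !Ref:

```python
import yaml

class CfnLoader(yaml.SafeLoader):
    pass

def _any_tag(loader, tag_suffix, node):
    # Treat unknown CFN short-hand tags (!Sub, !Ref, !GetAtt, ...) as plain values.
    if isinstance(node, yaml.ScalarNode):
        return loader.construct_scalar(node)
    if isinstance(node, yaml.SequenceNode):
        return loader.construct_sequence(node)
    return loader.construct_mapping(node)

yaml.add_multi_constructor('!', _any_tag, Loader=CfnLoader)

def check_template(text):
    try:
        doc = yaml.load(text, Loader=CfnLoader)
    except yaml.YAMLError as exc:
        return f'YAML error: {exc}'
    if not isinstance(doc, dict) or 'Resources' not in doc:
        return 'No Resources section found'
    return 'OK'

print(check_template('Resources:\n  Bucket:\n    Type: AWS::S3::Bucket'))  # OK
```

cfn-lint and validate-template go much further (resource property checks, unresolved Refs), but a local parse like this catches pure YAML indentation slips instantly and offline.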

Categories: DBA Blogs

When AWS SCT Unable to Connect to MySQL in RDS or EC2

Fri, 2019-03-22 05:27
AWS Schema Conversion Tool (SCT) is one of the must-have tools for a successful migration of databases to AWS RDS.

If you are trying to connect to MySQL hosted on an EC2 instance or RDS and are unable to make the connection despite setting the security groups correctly and making sure that the port, hostname, username and password are right, then first check the SCT log.




2019-03-22 19:40:16.866 [   1]     GENERAL INFO    global_settings:
    app_folder=AWS Schema Conversion Tool
    bucket-name=sct-cloud9-test
    console_log_folder=C:\Users\mirza\AWS Schema Conversion Tool\Log
    convert_without_statistic_reminder=false
    data_migration_settings=
    db2luw_driver_file=
    debug_mode=false
    ext_pack_version_alert=true
    facade_log_folder=C:\Users\mirza\AWS Schema Conversion Tool\Log
    file-system-thread-count=1
    forget-state=false
    greenplum_driver_file=
    install_dir=C:\Users\mirza
    log.max_file_size_mb=100
    log.max_files_count=20
    log_folder=C:\Users\mirza\AWS Schema Conversion Tool\Log
    log_folder_extractors=C:\Users\mirza\AWS Schema Conversion Tool\Extractor Log
    log_folder_extractors_ask_before_load=false
    log_folder_extractors_cassandra=C:\Users\mirza\AWS Schema Conversion Tool\Extractor Log\Cassandra
    maria_driver_file=
    message_level.CLONE_DATACENTER=INFO
    message_level.COMPARE_SCHEMA=INFO
    message_level.CONTROLLER=INFO
    message_level.GENERAL=INFO
    message_level.LOADER=INFO
    message_level.PARSER=INFO
    message_level.PRINTER=INFO
    message_level.RESOLVER=INFO
    message_level.TELEMETRY=INFO
    message_level.TRANSFORMER=INFO
    message_level.TYPEMAPPING=INFO
Caused by: com.amazon.sct.dbloader.DbLoaderHandledException: Connection wasn't established. Check connection properties.
at com.amazon.sct.dbloader.DbLoader.checkConnection(DbLoader.java:512)
at com.amazon.sct.dbloader.DbLoader.connect(DbLoader.java:349)
at com.amazon.sct.dbloader.DbLoaderContainer.checkAccessibility(DbLoaderContainer.java:25)
at com.amazon.sct.task.TestConnectionTask.call(TestConnectionTask.java:26)
at com.amazon.sct.task.TestConnectionTask.call(TestConnectionTask.java:12)
at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 1 more

In this case, there are no errors about a timeout, which would commonly point to security group configuration, nor anything about a credential issue. Here the issue is not with SCT or any AWS component; rather, it's with the user you are using to connect to MySQL.

Create the user in MySQL as follows and try again; it should work:

CREATE USER 'myuser'@'localhost' IDENTIFIED BY 'mypass';
CREATE USER 'myuser'@'%' IDENTIFIED BY 'mypass';

GRANT ALL ON *.* TO 'myuser'@'localhost';
GRANT ALL ON *.* TO 'myuser'@'%';

Then first test the connection using MySQL Workbench, and then try with SCT.

Hope it helps.

Categories: DBA Blogs
