
Docker – Introduction


Let’s start by looking back at how application hosting has evolved. Initially, we had one application running on a single server with its own dedicated memory, CPU and disk space. This model proved very costly for hosting high-load applications.


Next, we moved to Virtual Machines (VM), also called as Hypervisor Virtualization. We could now run multiple applications on each virtual machine, each one being a chunk of the actual physical server and having its own memory, CPU and disk space. However, the drawback was that each Virtual Machine needed its own Operating System and this incurred additional overhead of licensing costs, security patching, driver support, admin time etc.

Docker arrived with a solution to this problem: only one OS is installed on the server, and containers are created on top of it. Each container can run a separate application or service while sharing the host OS kernel, instead of each bundling its own operating system as virtual machines do.


  • The name “Docker” is said to come from “dock worker” (dock + worker → Docker).
  • It is written in Go (Golang).
  • It is open source, under the Apache License 2.0.

Let’s look at what the following mean:

Docker Container: a unit of software that packages the code and all its dependencies so that an application runs quickly and reliably irrespective of the computing environment.

Docker Container Image: a lightweight, standalone, executable package that contains everything needed to run an application: the code, runtime, system tools, libraries and settings. Container images become containers at runtime.

Docker Hub: a public Docker registry for storing and retrieving Docker images. It is provided by Docker, though there are other third-party registries as well.

Open Container Initiative (OCI): the body responsible for standardizing the container format and container runtime.

Why is Docker so popular? 

  • A Docker image runs the same way irrespective of the server or machine, making it very portable and consistent. It eliminates the overhead of setting up and debugging environments.
  • Rapid deployment: deployment time is reduced to a few seconds.
  • Containers start and stop much faster than virtual machines.
  • Applications are easy to scale, as containers can easily be added to or removed from an environment.
  • It helps in converting a monolithic application into microservices. [To read more, refer to the Microservices post below.]
  • Last but not least, it is open source.

Serverless

Serverless is a cloud-computing execution model in which the cloud provider dynamically manages resources. It is commonly categorized as FaaS, or Function as a Service.

The idea is to create an abstract piece of functionality which is available “on demand”. Cloud platforms bill only for the execution time of that functionality. The serverless model reduces operational cost and the complexity of provisioning and maintaining servers, and it handles scaling automatically.

  • Serverless abstracts away the underlying servers, exposing only the functionality.
  • The server-side logic runs in stateless compute containers that are triggered by events.
  • It is fully managed by a third party, relieving you of the responsibility of managing servers.
  • It provides increased reliability.
  • Greener computing: it reduces the need to build data centers across the globe.

Some well-known serverless solutions are Amazon’s AWS Lambda, IBM’s OpenWhisk, Google Cloud Functions and Microsoft’s Azure Functions.

Expedia, Coca-Cola, Thomson Reuters and EA are some examples of companies using serverless architectures.

Ground Rules

Some ground rules that qualify an architecture as serverless:

  • Granularity: one or more stateless functions, each having a single purpose and solving a specific problem.
  • Designed as an event-driven pipeline.
  • Zero administration: no provisioning, maintaining or scaling of servers. Using third-party services lets you focus on building value-adding customer features.
  • Scaling is automatically managed.
  • Cost saving: you execute code on demand and pay only for the time of execution. This helps a start-up remain self-sustaining regardless of the number of customers it currently has, and scaling is no longer a roadblock.
  • Rapid time to market: it shortens the time for an idea to evolve into a product by taking away the overhead of procuring servers.


What is not Serverless and Why?

  • PaaS, or Platform as a Service, offerings like Heroku, Salesforce or Google App Engine cannot bring an entire application up or down in response to an event, unlike FaaS.
  • Container platforms like Docker and container hosting systems like Mesos and Kubernetes need you to manage the size and shape of the cluster, whereas FaaS automatically handles resource provisioning, allocation and scaling.
  • Stored Procedures as a Service – they often must be written in a specific framework or language and are hard to unit test because of their database dependence.


AWS Lambda: 

AWS Lambda runs scripts and applications in Amazon’s cloud environment. Amazon charges only when the function is invoked, so you pay as you use regardless of whether your business has a single user or a million. AWS Lambda provides an integrated solution for compute as well as storage, using Amazon S3.

  • AWS Lambda provides many pre-configured templates to choose from instead of writing the Lambda function from scratch.
  • In Python, boto3 is the library used to create an Identity and Access Management (IAM) role for securely accessing AWS resources. An example is given below.
  • A function can be created inline or by uploading a .zip file.
  • Once created, the Lambda function can be triggered via an HTTP call, using the “Add API Endpoint” feature.

Here’s an example of a Lambda handler, and of registering it with AWS using boto3:

import boto3, json

lambda_client = boto3.client('lambda')

# The handler that runs inside Lambda (saved as main.py and zipped).
def lambda_handler(event, context):
    message = 'Hello {} {}!'.format(event['first_name'],
                                    event['last_name'])
    return {
        'message': message
    }

# Register the zipped handler with AWS.
# 'role' is an IAM role created beforehand (e.g. with boto3's iam client)
# and 'zipped_code' holds the bytes of the .zip file containing main.py.
lambda_client.create_function(
    FunctionName='exampleLambdaFunction',
    Runtime='python2.7',
    Role=role['Role']['Arn'],
    Handler='main.lambda_handler',
    Code=dict(ZipFile=zipped_code),
    Timeout=300
)

The above function can be invoked using the below snippet:

import boto3, json

lambda_client = boto3.client('lambda')

# The payload carries the fields the handler expects.
test_event = {'first_name': 'John', 'last_name': 'Doe'}

lambda_client.invoke(
    FunctionName='exampleLambdaFunction',
    InvocationType='Event',
    Payload=json.dumps(test_event),
)


Drawbacks

The major drawback is the dependency on the third-party vendor whose services are being used, which takes away control over system downtime, cost changes, etc. It is also very hard to port from one vendor to another, as this usually involves shifting the entire infrastructure. Privacy and security are also major concerns when the application is built to handle sensitive information.


Elastic Stack

Server logs contain some of the most valuable and untapped information. Logs are usually unstructured and make little sense at first glance, yet deriving insights from them can unveil various opportunities for improvement. The Elastic Stack, or ELK Stack, is one of the most widely used solutions for log analysis. It is open source and has a massive community pushing its boundaries by adapting it into various scalable systems. Companies including Microsoft, LinkedIn, Netflix, eBay, SoundCloud and Stack Overflow use the Elastic Stack.


ELK is an acronym for three open source projects – Elasticsearch, Logstash and Kibana – and the renamed Elastic Stack also includes Beats:

  • Elasticsearch: a search and analytics engine. It is an open source, distributed, RESTful, JSON-based search engine.
  • Logstash: a server-side data processing pipeline that can ingest data from multiple sources, transform it and send it to Elasticsearch.
  • Kibana: visualizes data with charts and graphs.
  • Beats: a family of lightweight, single-purpose data shippers. Beats have a small installation footprint and use limited system resources. They can either send data directly to Elasticsearch or send it via Logstash. Beats are written in Go!

Logstash, along with Beats, collects and parses log data from multiple sources. Elasticsearch indexes and stores this information, and Kibana visualizes it to provide insights.

Elasticsearch

Elasticsearch is worth discussing in detail. It is written in Java and is widely used for full-text search. This powerful search engine is designed to scale up to millions of search events per second. Wikipedia, Airbus, eBay and Shopify use Elasticsearch to power their search with near-real-time access. Its key features:

  • Scalability
  • High availability
  • Multi-tenancy
  • Developer friendliness
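
Since Elasticsearch exposes a REST/JSON API, querying it needs nothing more than an HTTP client. Here is a minimal sketch in Go, assuming a local single-node cluster on the default port 9200 and a hypothetical index named logs:

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"net/http"
)

func main() {
	// A full-text "match" query against the hypothetical "logs" index.
	query := []byte(`{"query": {"match": {"message": "error"}}}`)
	resp, err := http.Post("http://localhost:9200/logs/_search", "application/json", bytes.NewBuffer(query))
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	defer resp.Body.Close()
	// The response is JSON containing the matching documents, ranked by relevance.
	body, _ := ioutil.ReadAll(resp.Body)
	fmt.Println(string(body))
}

In a real deployment the host, index name and field would come from however your Beats/Logstash pipeline is configured.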

Logstash

Logstash supports data of many formats coming from various systems. It can ingest data from logs, web applications, data stores, network devices, AWS services and REST endpoints. It then parses and transforms the data, identifies named fields to build structure, and converts it into a common format.

  • Provides around 200 plugins to mix and match when building the data pipeline. You can also build your own plugin to ingest data from a custom application.
  • Pipelines can become complicated, and monitoring load, performance, latency, availability, etc. can be cumbersome. The centralized monitoring and pipeline viewer make this task easier and more understandable.
  • Structures, transforms and enriches data with filter plugins.
  • Can emit data to Elasticsearch or other destinations using output plugins such as TCP or UDP.
  • Logstash is horizontally scalable.
  • Security: incoming data from Beats can be encrypted, and Logstash also integrates with secured Elasticsearch clusters.

Kibana

Kibana provides interactive visualizations of Elasticsearch data, used to monitor behavior, understand the impact of data changes, and so on.

  • The Kibana core comes with histograms, line charts, pie charts, sunbursts and many other classics.
  • Plots geospatial data on maps.
  • Can perform advanced time-series analysis.
  • Graph exploration: analyzing relationships in the data with graphs.
  • Canvas lets you build customized presentations, add logos and elements, and tell a story.
  • Dashboards can easily be shared across the organization.

Problems ELK can solve:

  • In a distributed system with several nodes, searching through many log files for specific information using Unix commands is tedious. Logstash and Beats collect the logs from all the nodes, and Elasticsearch then provides fast access to them.
  • Shipping reports: Kibana provides fast ways to explore and visualize data, and it can schedule and email reports. The results of ad-hoc analysis or saved searches can be quickly exported to a CSV file, and alerting can be used to generate data dumps when certain conditions are met or on a regular interval.
  • Alerting: alerts can be set on data changes identified using the Elasticsearch query language, to proactively detect intrusion attempts, social media trends or peak hours in network traffic; alerting can also learn from its own history. It comes with built-in integrations for email, Slack, HipChat, etc.
  • Unsupervised learning: the machine learning features can detect different kinds of anomalies and unusual network activity, and help with quick root-cause identification.

The stack can also integrate with the Graph API to analyze relationships in data, and Canvas can be used to build presentations and organize reports. The Elastic Stack keeps extending its features and exploring new possibilities.



Build a RESTful API in Golang

Welcome again! After learning some Golang, my next experiment was to build APIs. So here’s a blog post for anyone interested in trying it out. The only prerequisites for building RESTful APIs are the basics of Golang along with some SQL. My blog-post series on Introduction to Golang may come in handy.

The code can be broadly divided into two files:

  • migrate.go: Connects to MySQL and creates the necessary tables.
  • base.go: Contains all the API handlers.

Step 1: Establishing Database Connection

We will now be working on the migrate.go file. Certain packages need to be imported before we can start.

import (
	"fmt"
	"database/sql"
	_ "github.com/go-sql-driver/mysql"
)

The package database/sql provides a generic interface around SQL databases. In other words, it equips Go to work with a database: creating a connection, executing queries and performing other SQL operations. The sql package works along with a database driver; we use Go’s MySQL driver in this case: go-sql-driver/mysql.

db, err := sql.Open("mysql", "root:password@(127.0.0.1:3306)/gotest")
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println("Connection established successfully")
}
defer db.Close()

sql.Open() opens a database handle for MySQL. This function accepts two parameters – the driver name (mysql in this case) and the connection details in the format username:password@(host:port)/database_name. Here ‘gotest‘ is the database created for this project. An appropriate error message is shown when this step fails. defer postpones db.Close() until the surrounding function returns; only then is the connection to the database closed.

err = db.Ping()
if err != nil {
	fmt.Println(err.Error())
}

sql.Open() does not return an error even when the database machine is not reachable at all, since it does not actually establish a connection. Hence we need to explicitly ping the database in order to verify that it is reachable.

Step 2: Creating Tables

stmt, err := db.Prepare("CREATE TABLE emp (id int NOT NULL AUTO_INCREMENT, first_name varchar(40), last_name varchar(40), PRIMARY KEY(id));")
if err != nil {
	fmt.Println(err.Error())
}
_, err = stmt.Exec()
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println("emp table migration successful")
}

Now let’s create the table ‘emp‘ using db.Prepare(). Just to recap, ‘db’ is the handle we obtained after connecting to the database. The SQL CREATE statement is provided as the parameter. db.Prepare() returns a prepared statement, which is then executed using its Exec() function. Both calls also return an error if they fail, and the error handling is coded accordingly in the last few lines.

Step 3: Building the APIs

We now have everything in place to start building the APIs. Create the file base.go. We will be importing some new libraries apart from those we have already come across. “Gin” is an HTTP web framework written in Go. The package “bytes” implements functions for manipulating byte slices. “net/http” provides HTTP client and server implementations.

import (
	"fmt"
	"database/sql"
	"bytes"
	"net/http"
	"encoding/json"
	"github.com/gin-gonic/gin"
	_ "github.com/go-sql-driver/mysql"
)

Next, create a database connection using sql.Open(), along with error handling.

func main() {
	db, err := sql.Open("mysql", "root:pwd@(127.0.0.1:3306)/gotest")
	if err != nil {
		fmt.Println(err.Error())
	} else {
		fmt.Println("Connection established successfully")
	}
	defer db.Close()
	// Checking for connection:
	err = db.Ping()
	if err != nil {
		fmt.Println(err.Error())
	}

Create a Go type called Emp, which will hold each row from the database and help in processing it. The type mirrors a typical row in the table, containing “id”, “first_name” and “last_name” as its fields.

type Emp struct {
	id         int
	first_name string
	last_name  string
}

Step 4: The GET request

A simple GET endpoint can be created with the help of Gin, which registers a handler for HTTP GET requests. router.GET() takes as input the URL pattern to be matched and the function that handles the request and returns the data.

router.GET("/emp/:id", func(c *gin.Context) {
var (
emp Emp
result gin.H
)
id := c.Param("id")
row := db.QueryRow("select id,first_name,last_name from emp where id = ?;",id)
err := row.Scan(&emp.id,&emp.first_name,&emp.last_name)
if err != nil {
//if no results found send Nill
result = gin.H{
"result": nil,
"count": 0,
}} else {
result = gin.H{
"result": gin.H{
"id": emp.id,
"first_name": emp.first_name,
"last_name": emp.last_name,
},
"count": 1,
}}
c.JSON(http.StatusOK, result)
})


Here the handler will match /emp/1 but will not match /emp. The function queries the database with the id received in the GET request and stores the id, first name and last name into an Emp value. The result is then constructed and returned as JSON, which serves as the response.
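
One piece not shown above is the router itself: the handlers are registered on a Gin router created inside main(), which then listens on a port (3000 in the screenshots below). A minimal sketch:

// Create the default Gin router (with logging and recovery middleware attached).
router := gin.Default()
// ... register the GET/POST/PUT/DELETE handlers on it, as shown in this post ...
// Start serving; the port number is an assumption based on the screenshots below.
router.Run(":3000")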

Similarly:

  • A GET request can be constructed to obtain all the employee data in the database.
  • A POST request can be used to insert a new record (a sketch of one is given after this list).
  • A PUT request can modify an existing record.
  • One or more records can similarly be deleted using a DELETE request.
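
Here is a minimal sketch of such a POST handler; the /emp route and the form field names are assumptions, and the full version is in the repository linked below:

router.POST("/emp", func(c *gin.Context) {
	// Read the new employee's details from the form data (field names assumed).
	firstName := c.PostForm("first_name")
	lastName := c.PostForm("last_name")
	stmt, err := db.Prepare("insert into emp (first_name, last_name) values (?, ?);")
	if err != nil {
		fmt.Println(err.Error())
	}
	_, err = stmt.Exec(firstName, lastName)
	if err != nil {
		c.JSON(http.StatusOK, gin.H{"message": err.Error()})
		return
	}
	c.JSON(http.StatusOK, gin.H{
		"message": fmt.Sprintf("%s %s inserted successfully", firstName, lastName),
	})
})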

Code for each of the above APIs can be found on my github.

Finally, let us see the above APIs in action.

Here’s an example of a PUT request, updating the first and last name of employee #2. The port in use is 3000. Note the message shown on successful execution.

[Screenshot: PUT request and its response]

Let’s now verify that the name of employee #2 was successfully modified, as claimed by the above output. Here’s the GET request:

[Screenshot: GET request and its response]

The entire code can be found on: Github/RESTful-API-with-Golang-and-MySql


Microservices


There has been a great deal of talk about microservices these days. So how exactly are they different from the traditional monolithic architectural design? How are they different from SOA, or Service-Oriented Architecture? Here is how I ventured into finding answers to these questions.

Microservices, in simple words, are a way of decentralizing an application into smaller, well-defined chunks, each performing an important task independently, together with a way for these chunks to talk to one another and solve the overall problem. A good example is the microservice architecture built by Netflix.

We can assertively say that microservices are quite similar to SOA; one can imagine microservices as a small portion of the much larger entity called SOA. Microservices can be deployed independently, whereas SOA services are typically deployed as a single monolith. SOA is mainly focused on the re-use of services, unlike microservices, which are all about decoupling.


There are quite a few perks that come with microservices:

  • Exploring the advantages of a variety of data stores.

A monolithic architecture is restricted to a single database or datastore, which prevents the application from leveraging the advantages of other datastores simultaneously. With microservices, however, each service can connect to its own datastore and can even have its own platform.

  • Enables partial deployments and agility.

Waiting time is generally a major overhead when deploying products to production, because the development time of each feature varies widely. Microservices preserve modularity and offer the benefit of deploying each feature separately, which suits Agile development environments.

  • Decentralized infrastructure and high availability.

A traditional architecture depends on a single database, which creates a single point of failure: a single corruption or bug is sufficient to destabilize the entire application. Decentralization ensures that an infrastructure failure remains confined to a single service without affecting the entire application, and downtime is also significantly reduced.

Challenges

  • Designing an architecture with completely independent services, along with language-agnostic APIs for communication between the services, can be complex.
  • Getting the organization to embrace the change is a challenge: provisioning infrastructure, seamless communication between the application and DevOps teams, and rapid application deployment can prove tough.
  • Testing a microservices application is much more cumbersome than testing a traditional application.
  • Replication of data across services (redundancy) can lead to low consistency.

References and further reading: