In the second installment of my three-part blog series, you will learn about the tools and services for DevOps. If you have not read my first blog, I strongly recommend reading it first and then coming back, as it describes some very important concepts and theory behind DevOps.
In modern-day development, it's all about picking the right tool for the right job to avoid accidental complexity. Similarly, while setting up a DevOps pipeline within your organization, it is important to research all the tools you can. Open source software doesn't cost a dime and is often a good replacement for its paid counterparts.
In my last post, I briefly described the DevOps pipeline and its various stages, and gave a primer on the tools that complement each phase of the pipeline. In this post I will uncover some of the buzzwords around modern-day tools and services like Docker, Kubernetes, and more.
Even before DevOps became a popularized term and a de facto industry standard, a few software development teams were already using some of its practices. Organizations had no option other than to set up their own infrastructure or lease it from data center companies at very high prices. As a result, the software and services coming out of these organizations were also sold at high prices to cover their infrastructure costs. But it all changed with the advent of cloud computing: an organization can now lease its technology infrastructure to smaller companies, charging an hourly rate only for the resources consumed. This became a game changer for startups and a very affordable option for many software companies.
As of 2018, there are three major cloud providers; interestingly, they are the same three behemoths that have been prevalent in the tech industry since its inception in the 90's. One of their most renowned contributions to this new, innovative world is opening the doors to such technology infrastructure for the general public on very affordable pricing models. Amazon gave us Amazon Web Services (AWS), the platform of choice for many startups. AWS infrastructure is programmable; by that I mean you can create and manage any resource that AWS provides using their SDKs, REST APIs, or even the aws-cli. This opens the way for a new industry term: Infrastructure as Code (IaC). IaC makes it much easier for developers to manage their resources on the cloud. Startups can now keep their costs to the bare minimum by running infrastructure resources sparingly. Once the task is accomplished, the provisioned resource can be terminated, which has brought huge cost savings to small companies and startups that cannot pay large costs upfront.
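As a small taste of this programmability, here is a hedged aws-cli sketch; the AMI ID, key pair name, and instance ID are hypothetical placeholders you would substitute with your own values.

    # launch a short-lived virtual machine, use it, then terminate it to stop paying
    aws ec2 run-instances --image-id ami-0123456789abcdef0 \
        --instance-type t2.micro --key-name my-key-pair --count 1

    # ...run the build or batch job on the instance...

    # terminate it once the task is done (instance ID comes from the previous output)
    aws ec2 terminate-instances --instance-ids i-0123456789abcdef0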
Google's GCP (Google Cloud Platform) gives very tight integration with the big G's services like Android development, BigQuery, Machine Learning, etc. GCP has all the resources a development team needs to run its services at affordable prices, including virtual machines, load balancers, firewalls, etc. Recently both Google and AWS have given a big push to serverless architectures. Both provide a wide range of cloud services and resources for data storage, networking, monitoring, and more. Other than a few specific use cases, it is up to your preference to use one over the other.
Then there is also Microsoft's Azure, if one likes to stay within Microsoft's ecosystem. It has everything a Windows developer needs: virtual machines running all kinds of Windows Server operating systems, and a powerful suite for building and deploying ASP and .NET applications. But as the majority of developers use Linux for their servers, Azure virtual machines now come with various Linux distributions too.
Now, you may ask what role DevOps plays with the cloud providers. Every cloud provider has its own way to interact with its services, and almost all of them offer some way to integrate their infrastructure capabilities into your own application to manage resources: starting new server configurations and switching between them, creating new databases on the fly and destroying them, assigning DNS, etc. As we discussed in the earlier post, there are many phases within a DevOps pipeline where these capabilities come in handy, for example, running a build or automating a deployment process.
A DevOps process must have tight integration with cloud providers for complete automation of repetitive tasks like server provisioning, environment setup, networking, firewalls, etc. It frees your developers from mundane tasks, so they can spend their time on better things, like solving real problems.
Various new kinds of software keep popping up which, when integrated with cloud computing, can save developers an enormous amount of time. We will discuss the most popular ones here.
Jenkins is an open source automation server written in Java. It helps automate the non-human parts of the software development process with continuous integration, and it facilitates the technical aspects of continuous delivery. Jenkins was originally developed as the Hudson project in the summer of 2004 at Sun Microsystems. Its functionality can be extended with plugins.
To start using Jenkins you can launch it using a Docker command like this:
    docker run --rm -u root -p 8080:8080 -v jenkins-data:/var/jenkins_home -v /var/run/docker.sock:/var/run/docker.sock -v "$HOME":/home jenkinsci/blueocean
The above command will create a new Docker container from the jenkinsci/blueocean image, accessible from your local machine. You can also run it on a server resolving to a domain URL.
After doing the initial setup, you will be able to access the Jenkins UI at localhost:8080.
Jenkins has many functionalities that can be accessed through its UI. You can create freestyle projects, which are a central feature of Jenkins, but as we discussed, in DevOps we work with the pipeline. These are the essential features of Jenkins for setting it up as a Continuous Integration (CI) system (a sketch of a Jenkinsfile follows the list).
Pipelines – These are crucial to the complete operation of CI/CD. A pipeline is an automated system that performs many complex tasks in different stages, such as build, test, and deploy, before your software is delivered to your users. Jenkins uses a special file sitting inside your project directory called a Jenkinsfile; it contains declarative-style code that defines the various agents and stages. Agents provide the workspace to perform the tasks, and stages define each task using steps.
Execution steps – The series of steps contained within a stage definition inside a Jenkinsfile.
The steps can be shell scripts (sh), Windows batch scripts (bat), or any other script/executable available inside the agent. Steps can also be imported directly from a file for more complex tasks.
Environment variables – These can be set as defaults for all stages or individually per stage for running the steps. As a best practice, do not hard-code credentials or other security keys here; use placeholder values and inject the real ones securely at runtime.
Tests – Jenkins can generate test reports and other artifacts using the post section. By default it consumes JUnit-style XML reports, but this can be replaced with any widely used test report format. The report will contain the essential details of the test-related tasks.
Notifications – Based on the outcome of the execution steps, the build can either succeed or fail. There are ways to send notifications via email, Slack, etc., to inform your developers about the status of the build.
Deployment – A deployment can be done at various stages for different environments, like staging and production. A human input step can also be added to validate the deployment before proceeding to the next stage.
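To tie these features together, here is a minimal declarative Jenkinsfile sketch; the build and deploy scripts and the email address are hypothetical placeholders for your project's own.

    // Jenkinsfile — a minimal declarative pipeline sketch
    pipeline {
        agent any                          // any available agent provides the workspace
        environment {
            APP_ENV = 'ci'                 // a default environment variable for all stages
        }
        stages {
            stage('Build') {
                steps {
                    sh './build.sh'        // hypothetical build script
                }
            }
            stage('Test') {
                steps {
                    sh './run-tests.sh'    // hypothetical test script
                }
                post {
                    always {
                        junit 'reports/**/*.xml'   // collect JUnit-style XML reports
                    }
                }
            }
            stage('Deploy') {
                steps {
                    input 'Deploy to production?'  // human input step before deploying
                    sh './deploy.sh production'    // hypothetical deploy script
                }
            }
        }
        post {
            failure {
                mail to: 'team@example.com',       // notify developers on a failed build
                     subject: "Build failed: ${env.JOB_NAME}",
                     body: "See ${env.BUILD_URL} for details."
            }
        }
    }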
Using the above features, a workflow can be defined that automates the non-human activities of software development for DevOps.
All the code authored by your developers is an asset of your startup, so you have to take measures to protect it from future losses, be it code theft, accidental deletion, or even loss due to storage malfunction. The cloud provides a good remedy for code loss: the code versioning system. In simple terms, it keeps track of each version of the code by storing its content, author, and timestamp. There are many version control options out there, but the most popular of all is Git.
SVN is another code repository system, but it is outdated in comparison to Git. Git lets you sync code with remote origins only when needed, so you can continue writing code even offline, which is the major limitation of SVN. Both have code branching mechanisms, and you can create as many branches as you like. While with SVN it is mandatory to have a remote linked and accessible while working, with Git the remote is optional, though you should still set up a remote origin for code recovery and code sharing (see the sketch below). Git has two major cloud platforms, which we will discuss next.
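As a quick sketch, linking a local Git repository to a remote origin takes only a few commands; the repository URL here is a hypothetical placeholder.

    git init                         # start tracking versions locally, works offline
    git add . && git commit -m "Initial commit"
    git remote add origin https://example.com/acme/app.git   # hypothetical remote URL
    git push -u origin master        # back the code up and share it with the team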
GitHub is the cloud option available at github.com. It is a code-sharing website with some social networking flavors: developers create a profile and can follow other developers or code repositories. With the ability to create unlimited free public repositories, GitHub is the most lucrative option for open source software teams, as witnessed by the sheer number of open source repositories currently hosted on it. In its recent push towards open source, Microsoft has acquired it. Although it offers private repositories on paid plans, GitHub is still a third-party website with full control over its policies, which have changed many times since its inception.
Born as a GitHub rival, GitLab is a complete DevOps solution package. It has features covering each of the DevOps stages, from planning to release and monitoring. But the feature that makes it more likable than GitHub is that it is open source software, with its core written in Ruby. Like any other application, you can self-host it on your private infrastructure, both in the cloud and on-premises. Along with using it as a Git remote repository, you can also use it as your CI/CD system by configuring jobs in the pipeline (see the sketch after the feature list below). It can also host private Docker registries, but more on that later.
GitLab bundles features that would otherwise require multiple applications, covering complete lifecycle management all the way to security:
Manage – You can do authentication and authorization via LDAP servers, OmniAuth for social logins, and SAML 2.0 service providers.
Plan – Track your issues using issue boards and do portfolio management.
Create – Manage your source code, perform code reviews for your developers, set up wikis, and use their web IDE.
Verify – Launch Continuous Integration(CI) pipelines with the ability to do Unit testing, Integration Testing, Code Quality checks, and performance testing.
Package – Registries for your Docker containers and Maven repositories.
Release – Continuous Delivery(CD) pipelines and review apps for creating staged environments.
Configure – Auto DevOps for an opinionated DevOps pipeline which runs automatically and Kubernetes configuration.
Monitor – To keep an eye on your system metrics, logging, and cluster monitoring.
Secure – Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) along with Dependency scanning and Container scanning.
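For the Verify and Release stages, jobs are configured in a .gitlab-ci.yml file at the root of the repository. Here is a minimal sketch; the make targets and deploy script are hypothetical placeholders.

    # .gitlab-ci.yml — a minimal CI/CD pipeline sketch
    stages:
      - build
      - test
      - deploy

    build-job:
      stage: build
      script:
        - make build              # hypothetical build command

    test-job:
      stage: test
      script:
        - make test               # hypothetical test command

    deploy-job:
      stage: deploy
      script:
        - ./deploy.sh staging     # hypothetical deploy script
      only:
        - master                  # deploy only from the master branch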
As you can see, GitLab has all the right checkboxes ticked to be a one-of-a-kind, complete DevOps solution package. It also has a well-funded organization behind it that sells GitLab Enterprise, which indicates the platform is being actively developed and maintained and is safe to use for your own startup.
Even before many developers were aware of container technologies, the Linux community had envisioned container platforms and developed Linux Containers (LXC). It is like the father of Docker containers: the first, most complete implementation of a Linux container manager. It was implemented in 2008 using cgroups and Linux namespaces, and it works on a single Linux kernel without requiring any patches. Docker also used LXC in its initial stages and later replaced that container manager with its own library, libcontainer.
When Docker emerged in 2013, containers exploded in popularity. It's no coincidence that the growth of Docker and of container use go hand in hand. But there's no doubt that Docker separated itself from the pack by offering an entire ecosystem for container management.
With the tagline "Build, Manage and Secure Your Apps Anywhere. Your Way.", Docker is the most widely used container platform in the world.
For developers, Docker provides a revolutionary approach to building and deploying applications with their dependencies wrapped up: code, runtime, system tools, and libraries. This also enables a collaborative development model through shared containers, and lets your development team practice polyglot development and build highly decoupled microservices for better scalability, high performance, and tight monitoring. It eliminates environment inconsistencies and the "works on my machine" problem by packaging the application, configs, and dependencies into an isolated container.
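As an illustration, here is a minimal Dockerfile sketch that packages a hypothetical Node.js service together with its dependencies; the base image, port, and entry point are assumptions you would adapt.

    # Dockerfile — package the app, its runtime, and its dependencies together
    FROM node:10-alpine        # base image providing the runtime

    WORKDIR /app
    COPY package.json .
    RUN npm install            # install dependencies into the image

    COPY . .                   # add the application code
    EXPOSE 3000
    CMD ["node", "server.js"]  # hypothetical entry point

Building and running it is then just docker build -t my-service . followed by docker run -p 3000:3000 my-service, and the container behaves the same on any machine with Docker installed.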
For IT operations, an enterprise-ready container platform is needed to provide an integrated software lifecycle, an operations management workflow, and security at scale, with the assurance of enterprise support and a certified technology ecosystem. The Docker container platform delivers freedom of choice, agile operations, and integrated security so you can confidently deploy, manage, and secure your applications in production. Docker additionally claims its container platform can reduce IT costs by 50% while accelerating time to market by 3x.
Docker is a great and very powerful containerization platform, but if your business is deeply invested in it, you must consider automating the management of large numbers of containers. During microservice development you may produce many containers, each dedicated to a particular service or group of services, and your IT operations team will find it difficult to manage such a huge list of containers, which can hurt your business by slowing down operations and development. This is where orchestration utilities come in handy, as they make it easy to manage a cluster, or group, of containers. The most notable ones are Docker Swarm and Kubernetes.
Docker Swarm was developed by the Docker team and is a natural extension if you are already comfortable working with Docker containers. Docker includes swarm mode for natively managing a cluster of Docker Engines, called a swarm. You use the Docker CLI to create a swarm, deploy application services to it, and manage its behavior. The main highlighted features (a short usage sketch follows the list):
- Cluster management integrated with Docker Engine
- Decentralized design
- Declarative service model
- Desired state reconciliation
- Multi-host networking
- Service discovery
- Load balancing
- Secure by default
- Rolling updates
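Since swarm mode is driven entirely through the Docker CLI, trying it out takes only a few commands; the service name and image below are illustrative.

    docker swarm init                     # turn this Docker Engine into a swarm manager
    docker service create --name web --replicas 3 -p 80:80 nginx   # declare a service
    docker service scale web=5            # reconcile to the new desired state
    docker service ls                     # inspect services across the cluster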
Mesos is another containerization platform that a lot of developers are not aware of. It can manage Docker containers along with its own default ('mesos') containers. The Docker containerizer allows tasks to be run inside Docker containers. Mesos has the unique capability of composing two different container platforms, currently 'mesos' and 'docker', together. The advocated use case is testing tasks with different types of resource isolation: since the 'mesos' containerizer has more isolation abilities, a framework can use the composing containerizer to test a task in the 'mesos' containerizer's controlled environment and, at the same time, verify that it works with 'docker' containers by just changing the container parameters for the task.
The Docker Containerizer translates Task/Executor launch and destroy calls into Docker CLI commands.
Currently, when launching a task, the Docker Containerizer will do the following:
- Fetch all the files specified in the CommandInfo into the sandbox.
- Pull the Docker image from the remote repository.
- Run the Docker image with the Docker executor, mapping the sandbox directory into the Docker container and setting the mapping in the MESOS_SANDBOX environment variable. The executor will also stream the container logs into stdout/stderr files in the sandbox.
- On container exit or containerizer destroy, stop and remove the Docker container.
The Docker Containerizer launches all containers with the mesos- prefix plus the agent id (e.g., mesos-agent1-abcdefghji); it also assumes all containers with the mesos- prefix are managed by the agent and that it is free to stop or kill them.
When launching the Docker image as an executor, the only difference is that the containerizer skips launching a command executor and just reaps the Docker container executor's pid.
Note that it currently defaults to host networking when running a Docker image, to more easily support running a Docker image as an executor.
The containerizer also supports optional force pulling of the image. This is disabled by default, so the Docker image will only be updated if it is not already available on the host. To enable force pulling an image, force_pull_image has to be set to true.
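How the Docker containerizer gets enabled depends on your deployment; as a hedged sketch, a Mesos agent is typically started with the --containerizers flag listing both containerizers (the master address and work directory below are placeholders):

    # try the Docker containerizer first, fall back to the Mesos containerizer
    mesos-agent --master=10.0.0.1:5050 \
                --containerizers=docker,mesos \
                --work_dir=/var/lib/mesos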
Kubernetes is another of the most popular open source container orchestration systems, made for automating the deployment process. Although it was originally created by Google, it is now stewarded by the CNCF (Cloud Native Computing Foundation). One of the most important features Kubernetes offers is a platform for the automation, deployment, scaling, and operation of application containers across clusters of hosts. It has also been compatible with Docker since its first release.
Like most clusters, a Kubernetes cluster has one cluster master, while the other worker machines are called nodes. The entire cluster orchestration system is run by the cluster master and the nodes, and this cluster is the very foundation of GKE (Google Kubernetes Engine). On top of it live the Kubernetes objects that represent a containerized application, which generally run hierarchically on top of each other.
Minikube is one of the special tools that make Kubernetes so approachable. Its main function is to run a single-node cluster inside a virtual machine, on a laptop for instance, for users who are trying out Kubernetes or developing with it on a daily basis. Minikube supports many Kubernetes features, like DNS, NodePorts, ConfigMaps and Secrets, container runtimes, and the Dashboard. Minikube is really easy to install and intuitive to use.
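Getting a local cluster running is, for example, just this (assuming Minikube and the kubectl client, discussed next, are installed):

    minikube start          # boot a single-node cluster inside a local VM
    kubectl get nodes       # the one minikube node should report Ready
    minikube dashboard      # open the Kubernetes Dashboard in a browser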
We have lots of interesting Kubernetes features to discuss, for which we need to learn about kubectl. It is a command-line interface for running commands against a Kubernetes cluster. We will look at a couple of examples to see how kubectl works. Installing kubectl is really easy, and a comprehensive guide is here.
This is the syntax structure of kubectl commands: kubectl [command] [TYPE] [NAME] [flags]
where command, TYPE, NAME, and flags are:
command: Specifies the operation that you want to perform on one or more resources, for example create, get, describe, delete.
TYPE: Specifies the resource type. Resource types are case-insensitive and you can specify the singular, plural, or abbreviated forms. For example, the commands kubectl get pod pod1, kubectl get pods pod1, and kubectl get po pod1 produce the same output.
NAME: Specifies the name of the resource. Names are case-sensitive. If the name is omitted, details for all resources are displayed, for example $ kubectl get pods.
We have something called the Deployment controller in Kubernetes. The purpose of the Deployment controller is to change the present state of a deployment object to its desired state. Deployments can be defined to create new ReplicaSets, to delete existing Deployments, or to adopt all of their current resources into new ones.
So, for deploying, you need to create a .yaml file mentioning all the specifications, like the number of replicas, the app name, and the containerPort; basically, we also need a metadata section naming the object. A sketch follows.
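Here is a minimal sketch of such a file; the name watch-me and the nginx image are hypothetical stand-ins for your own application.

    # deployment.yaml — a minimal Deployment sketch
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: watch-me
    spec:
      replicas: 3                  # desired number of pod replicas
      selector:
        matchLabels:
          app: watch-me
      template:
        metadata:
          labels:
            app: watch-me          # pods carry this label
        spec:
          containers:
          - name: watch-me
            image: nginx:1.15      # hypothetical application image
            ports:
            - containerPort: 80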
For creating the deployment, an example command can be kubectl apply -f deployment.yaml.
Now, to know the names of the deployments in the cluster, the desired number of replicas, and the current replicas, we have to type the command kubectl get deployments.
Also, if we run that command before and after changing the configuration in the .yaml file, we can see the difference in its output.
A pod is basically a group of one or more containers (Docker containers, for example), with accoutrements like shared storage and a shared network, as well as a specification for how the containers are to run. The contents of a pod are always co-located and co-scheduled, and their context is always shared. A pod models an application-specific "logical host": it contains one or more application containers that are much more tightly coupled than they would have been in a pre-container world, which leads to their execution on the same physical or virtual machine, and of course, on the same logical host.
A node, commonly referred to as a minion, may be a VM or a physical machine, depending on the cluster. Each node contains all the services required for running pods (discussed above) and is managed by the master components. The services on any minion include the container runtime, kubelet, and kube-proxy. A node basically carries the following information: Addresses, whose fields depend on the cloud provider; Conditions, defining the status of the running node; Capacity, describing the resources available on the node and the maximum number of pods that can be scheduled onto it; and finally Info, which is general information about the node.
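You can see all of this node information with kubectl; the node name here is a placeholder:

    kubectl get nodes                  # list the nodes and their status
    kubectl describe node my-node-1    # addresses, conditions, capacity, and info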
Services and Labels
Labels are key/value pairs that are attached to objects, such as pods. The main purpose of labels is to specify identifying attributes of objects that are meaningful and relevant to users, but which do not directly imply semantics to the core system. Another purpose of labels is to select and organize subsets of objects of any size. Labels can be attached to objects at creation time and, depending on need, can be added, updated, or changed at any time. Also, each object can carry a small set of key/value labels, and each key must be unique for a given object.
As probably makes sense by now, Kubernetes pods don't last forever; they are short-lived. That's where Services enter: a Kubernetes Service is a logical abstraction that defines a logical set of pods and a policy for how to access them. A Service targets a set of pods using a label selector.
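Continuing the sketch from the Deployment above, a Service selecting those pods by label could look like this (names remain hypothetical):

    # service.yaml — expose the watch-me pods behind one stable address
    apiVersion: v1
    kind: Service
    metadata:
      name: watch-me
    spec:
      selector:
        app: watch-me        # the label selector that picks the target pods
      ports:
      - port: 80             # port the Service listens on
        targetPort: 80       # port on the pods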
Scaling the master and the slave or worker nodes of a cluster is very intuitive and easy.
For example, consider that we have a working watch_me cluster, which has already been deployed. We can check the status of watch_me with a single command and, like I said, scaling the cluster is just as easy: one simple command adds a master, and a similar one creates more worker units. (The original commands are elided here; a hedged sketch follows.)
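Assuming the watch_me cluster was deployed with Juju (an assumption; the Canonical Kubernetes charms use exactly this "worker unit" vocabulary), the commands would be:

    juju status                           # see the status of the watch_me deployment
    juju add-unit kubernetes-master       # add one more master unit
    juju add-unit kubernetes-worker -n 2  # create two more worker units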
Users always want their application to be accessible at any point in time, while developers are expected to deploy updated versions of it several times a day. This is possible in Kubernetes. One of its most pleasing features is rolling updates with zero downtime, which work by incrementally replacing pod instances with updated ones. The new, updated pods are scheduled onto nodes with available resources.
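With the hypothetical watch-me Deployment from earlier, a rolling update is a single image change that Kubernetes rolls out pod by pod:

    kubectl set image deployment/watch-me watch-me=nginx:1.16  # new image version
    kubectl rollout status deployment/watch-me                 # watch the rollout
    kubectl rollout undo deployment/watch-me                   # roll back if needed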
Monitoring and Logging
Another of the most useful open source tools we'll discuss is Kibana, an open source analytics and visualization platform designed specifically to work with Elasticsearch. We can use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform specialized, professional data analysis and plot and visualize your data in a variety of charts, tables, and maps.
One of the most important abilities of Kibana is that it makes understanding large volumes of data very easy. Its lucid, intuitive browser-based interface enables us to quickly create dynamic dashboards, share them with others, and have them display changes to Elasticsearch queries in real time.
Kibana is also very intuitive to set up. You can install it and start exploring your Elasticsearch indices in minutes, with no code and no additional infrastructure required.
So, just to dip our toes in, we can start exploring Kibana in the following two ways:
Elasticsearch is an open source full-text search and real-time distributed analytics engine. It finds most use in Single Page Application (SPA) projects. Java programmers will want to know that Elasticsearch is developed in Java and is used by many big organizations around the world. It is licensed under the Apache License, version 2.0. In this brief section, we'll just touch the surface of Elasticsearch and its features.
Some of Elasticsearch's many useful features:
- It uses denormalization to improve the performance of complex searches.
- It is one of the most popular enterprise search engines, used, as mentioned above, by big names like Stack Overflow, GitHub, and Wikipedia.
- It is also used as a replacement for document stores like MongoDB and RavenDB.
- It is capable of scaling to petabytes of structured and unstructured data.
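Interacting with Elasticsearch happens over a simple REST API; as a small sketch against a local node (the index name and document are made up):

    # index a document, then search it back by a field
    curl -X PUT "localhost:9200/articles/_doc/1" \
         -H 'Content-Type: application/json' \
         -d '{"title": "DevOps tools", "author": "hypothetical"}'

    curl -X GET "localhost:9200/articles/_search?q=title:devops"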
With this, our discussion of the various key software and technologies comes to an end. We will continue in the next blog with DevOps security. Thank you.