Elastic Stack Guide Part – 1

Now that most of our servers are deployed in the cloud and run many applications, it is impractical to monitor and analyze logs by logging in to each server. A central logging and monitoring solution is a must today.

In this blog series, we will learn how to use the Elastic Stack, also known as ELK.

Overview:

Elastic Stack is a group of open source products from Elastic designed to help users take data from any type of source, in any format, and search, analyze, and visualize that data in real time. The product group was formerly known as the ELK Stack, in which the letters stood for the products in the group: Elasticsearch, Logstash and Kibana. A fourth product, Beats, was subsequently added to the stack, rendering the potential acronym unpronounceable. Elastic Stack can be deployed on premises or made available as Software as a Service.

Architecture:

For a small development environment, the classic architecture looks as follows:

There are many different types of Beats; you can read about them at https://www.elastic.co/beats/ . Each Beat has a different set of use cases.

In this blog we will learn about two Beats: Metricbeat and Filebeat.

Note: Logstash is an optional part of the architecture and is not needed in most cases. Read more about Logstash at https://www.elastic.co/logstash/

Using the Elastic Stack:

I am running these experiments on a CentOS 7 machine and using RPM packages to set up the Elastic Stack (Kibana is installed from a tarball below).

Elasticsearch Installation:

Commands to install Elasticsearch:

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.14.0-x86_64.rpm
sudo rpm -i elasticsearch-7.14.0-x86_64.rpm
sudo service elasticsearch start

How to check whether Elasticsearch is running:

[root@localhost elk]# curl http://127.0.0.1:9200
{
  "name" : "localhost.localdomain",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "MxKYDoJAQRG9D6krdFThsQ",
  "version" : {
    "number" : "7.14.0",
    "build_flavor" : "default",
    "build_type" : "rpm",
    "build_hash" : "dd5a0a2acaa2045ff9624f3729fc8a6f40835aa1",
    "build_date" : "2021-07-29T20:49:32.864135063Z",
    "build_snapshot" : false,
    "lucene_version" : "8.9.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

If you get output like the above, Elasticsearch is installed and running.

Note: To change the listen address and port, edit the network.host and http.port settings in /etc/elasticsearch/elasticsearch.yml
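The curl check can also be scripted. The following is a minimal sketch in Python, not part of the Elastic tooling: the function names summarize_es_info and fetch_es_info are made up for this example, and the sample text is a trimmed copy of the response shown earlier so the parser can be tried without a running node.

```python
import json
import urllib.request


def summarize_es_info(body: str) -> dict:
    """Pick the fields worth checking out of the Elasticsearch root response."""
    info = json.loads(body)
    version = info["version"]
    return {
        "node": info["name"],
        "cluster": info["cluster_name"],
        "version": version["number"],
        "lucene": version["lucene_version"],
    }


def fetch_es_info(url: str = "http://127.0.0.1:9200", timeout: float = 5.0) -> dict:
    """Call the root endpoint and summarize it.

    Raises urllib.error.URLError when nothing is listening on the address.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return summarize_es_info(response.read().decode("utf-8"))


# Trimmed copy of the curl output, used only to exercise the parser offline.
SAMPLE_RESPONSE = """
{
  "name": "localhost.localdomain",
  "cluster_name": "elasticsearch",
  "version": {"number": "7.14.0", "lucene_version": "8.9.0"},
  "tagline": "You Know, for Search"
}
"""

if __name__ == "__main__":
    print(summarize_es_info(SAMPLE_RESPONSE))
```

Calling fetch_es_info() on the machine where Elasticsearch runs should return the same four fields for the live node.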

Kibana:

Kibana is the front-end tool that talks to Elasticsearch and lets anyone monitor and analyze logs.

Commands to install Kibana:

curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.14.0-linux-x86_64.tar.gz
tar xzvf kibana-7.14.0-linux-x86_64.tar.gz
cd kibana-7.14.0-linux-x86_64/
./bin/kibana

Access Kibana at the following URL:

http://127.0.0.1:5601/app/home#/

Note: To change the listen address and port, edit the server.host and server.port settings in config/kibana.yml.

Beats

These are installed on every server from which we want to collect information. They act as agents that send data to Elasticsearch.

Enabling Metricbeat:

Every Beat supports different modules, and it is up to the user which modules to enable in each Beat. Metricbeat, for example, has many modules such as System, PostgreSQL, Nginx and so on. In this blog we will use the System module of Metricbeat.

Commands to install Metricbeat:

curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.14.0-x86_64.rpm
sudo rpm -vi metricbeat-7.14.0-x86_64.rpm

Enabling the System module of Metricbeat:

sudo metricbeat modules enable system
sudo metricbeat setup -e
sudo service metricbeat start

Here we are enabling only the System module of Metricbeat. There are many other modules for basic monitoring of applications such as PostgreSQL, Nginx, Tomcat, etc.

To list the modules available in Metricbeat, run:

metricbeat modules list

Yippee! Now we can monitor system data in Kibana as follows.

Open the "[Metricbeat System] Host overview ECS" dashboard under Dashboards in the Kibana UI. There you can filter by the host whose data you want to see.

System Module Metricbeat Uses: What analysis can be done with the System module of Metricbeat?

Traditionally, after logging in to Linux servers, we gather system information using many different commands and tools, which takes time, especially when there is a live issue in production.

The System module provides the following information:

  1. Size information for all partitions
  2. Read/write performance of the hard disk
  3. Inbound/outbound traffic analysis per Ethernet port
  4. Load average analysis of the system
  5. Top processes consuming high CPU and RAM

All of this information can now be seen in seconds for a particular host using the Kibana UI.
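The same data can also be queried directly from Elasticsearch. The sketch below builds a search body for the load average of one host. It is an illustration under assumptions: the function name load_average_query is made up, and the field names (host.name, event.dataset, system.load.1, system.load.5) are what the Metricbeat System module is expected to write, so verify them against your own index before relying on them.

```python
import json


def load_average_query(host: str, minutes: int = 15) -> dict:
    """Build a search body for the recent load average of one host."""
    return {
        "size": 0,  # only the aggregations are needed, not the raw documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"host.name": host}},
                    {"term": {"event.dataset": "system.load"}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        "aggs": {
            "max_load_1m": {"max": {"field": "system.load.1"}},
            "avg_load_5m": {"avg": {"field": "system.load.5"}},
        },
    }


if __name__ == "__main__":
    # The body would be sent as JSON to a search request on the Metricbeat indices.
    print(json.dumps(load_average_query("localhost.localdomain"), indent=2))
```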

Following are some screenshots  : 

Enabling Filebeat:

Whether you’re collecting from security devices, cloud, containers, hosts, or OT, Filebeat helps you keep the simple things simple by offering a lightweight way to forward and centralize logs and files.

Commands to install Filebeat:

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.14.0-x86_64.rpm
sudo rpm -ivh filebeat-7.14.0-x86_64.rpm

Note: To configure where Filebeat sends its data (Elasticsearch or Logstash), edit /etc/filebeat/filebeat.yml. As I currently have only one machine, I do not need to change anything because the default configuration works for me. You can check the following link: https://www.elastic.co/guide/en/beats/filebeat/7.14/configuring-howto-filebeat.html

Enabling the system logs module in Filebeat:

sudo filebeat modules enable system

(To set custom paths for the system logs, edit the file /etc/filebeat/modules.d/system.yml. Generally there is no need to change this configuration.)

sudo filebeat setup -e
sudo service filebeat start

Like Metricbeat, Filebeat also has a list of modules such as PostgreSQL and Nginx. It also supports the logging of popular frameworks such as Spring, and can collect the logs of these applications and provide ways to analyze them easily.

To list the modules available for Filebeat, use the following command:

[root@localhost elk]# filebeat modules list | less

System Module Filebeat Uses:

Now you can use the Kibana UI to analyze system logs such as messages.

Open the "[Filebeat System] Syslog dashboard ECS" dashboard in the Dashboard tab in Kibana.

Following are some screenshots:

Configuring Filebeat for custom log files:

We may have a situation where none of the modules or framework logging integrations in Filebeat work for our custom application log. In that case, you can configure an input manually with the path of the logs to read, and then analyze them in the Logs > Stream section of the Kibana UI.

Follow this link to configure your log path: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html

You can watch the logs at: http://127.0.0.1:5601/app/logs/stream

Here you can search the logs by hostname or file path, and you can also search the whole fetched message.

By default only the message column is shown. You can configure the columns in the Settings tab of the Logs section in Kibana.

Following are some screenshots:

By default each log line is a single column. If, for advanced debugging, we want to break a log line into columns, we need to use Logstash with the Grok filter.
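To get a feel for what "breaking a log line into columns" means, here is a small Python sketch that does it with a regular expression using named groups. This is not Grok itself and not how Logstash is configured; it only illustrates the idea. The log format and the field names (timestamp, level, component, message) are hypothetical.

```python
import re

# Hypothetical application log format:
#   2021-08-10 12:30:45,123 ERROR [OrderService] Payment failed for order 42
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<component>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def split_log_line(line: str) -> dict:
    """Return the line broken into named columns.

    Lines that do not match the pattern are kept whole in a single
    "message" column, which mirrors the default one-column view.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return {"message": line}
    return match.groupdict()


if __name__ == "__main__":
    sample = "2021-08-10 12:30:45,123 ERROR [OrderService] Payment failed for order 42"
    print(split_log_line(sample))
```

Once a line is split this way, each column can be searched and filtered on its own, which is the benefit the Grok filter brings to the Kibana log view.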

In the next blog we will see how to use Logstash to break custom logs into columns for better understanding.

Thank you all

How to start your journey into Microservices Part -3 – Authentication Strategies

In the last blog we learned that we could use DDD to break up the system into different microservices.

In this blog we will understand where and how we can implement authentication in microservices.

How to implement:

Let's understand what mechanisms are available to implement authentication and authorization:

  • Session Token (a long, opaque string) Based
    • HTTP Session – Cookie
      • not meant for a microservices environment, as there will generally be multiple nodes and tying a request to one particular node is not good
    • Redis or any other caching tool
      • store the session token in a cache after the login is done
      • for any request that needs to be checked, verify the token against the cache
  • JWT Tokens as Session Token (JSON Web Tokens are an open, industry standard RFC 7519 method for representing claims securely between two parties.)
    • No external caching required
    • Can be verified at the service level without any external call
    • Challenge: keeping the signing secret constantly rotated
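The cache-based session token flow can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary in place of Redis; the class name SessionStore, its methods and the 30-minute default lifetime are made up for this example.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory stand-in for a shared cache such as Redis."""

    def __init__(
        self,
        ttl_seconds: float = 1800.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def login(self, user_id: str) -> str:
        """Create a long random token and remember who it belongs to."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (user_id, self._clock() + self._ttl)
        return token

    def verify(self, token: str) -> Optional[str]:
        """Return the user id for a live token, or None if unknown or expired."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._sessions[token]
            return None
        return user_id

    def logout(self, token: str) -> None:
        """Invalidate the token immediately by removing it from the cache."""
        self._sessions.pop(token, None)
```

Because every node checks the same cache, a request can land on any node, and logout takes effect at once. The price is one cache lookup per request.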

Both approaches have their pros and cons. I generally prefer session token and cache based authentication for user-facing APIs, and JWT for service-to-service communication, although nowadays JWT tokens are also used extensively.

  • Challenges with using JWT on the browser side
    • There is no easy way to un-authenticate a JWT token, as it is self-verifying (one way is to make it time bound, in which case the token lives for ~5 min even after the user logs out), whereas with a session token you can simply remove the token from the cache
      • You can sort of simulate invalidation of a JWT, for a particular verifying party, by storing the JWT ID (jti claim) or equivalent, into a “revoked” list. For example, in a cache stored in API Gateway. Use a TTL that is longer than the JWT Expiry. (This will not work well if the JWT does not have an expiry!)
    • The other challenge is to manage the private keys very securely and rotate them constantly.
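To make the self-verifying nature of a JWT and the "revoked" list concrete, here is a minimal sketch that signs and verifies an HS256 token using only the Python standard library. It is for illustration, not a replacement for a maintained JWT library. The function names, the 5-minute default lifetime and the plain set used as the revoked list are assumptions of this example.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid
from typing import Optional, Set


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, subject: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Create a signed token with sub, iat, exp and a unique jti claim."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "iat": issued_at,
              "exp": issued_at + ttl_seconds, "jti": uuid.uuid4().hex}
    signing_input = (_b64url(json.dumps(header).encode()) + "."
                     + _b64url(json.dumps(claims).encode()))
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str, revoked_jtis: Set[str],
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the token is genuine, unexpired and not revoked."""
    try:
        header_part, claims_part, signature_part = token.split(".")
        # Always recompute with HMAC-SHA256; the header's "alg" is never trusted.
        expected = hmac.new(secret, f"{header_part}.{claims_part}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_part)):
            return None
        claims = json.loads(_b64url_decode(claims_part))
    except (ValueError, UnicodeError):
        return None  # malformed token
    current = time.time() if now is None else now
    if current >= claims.get("exp", 0) or claims.get("jti") in revoked_jtis:
        return None
    return claims
```

Verification needs only the secret and the token itself, which is why no cache call is required. The revoked set is the one piece of shared state that brings logout back, and entries can be dropped from it once the token's own expiry has passed.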

When to implement:

Let's understand the categories of calls where we need authentication and authorization:

  • External API calls – API calls made from outside the system deployment, either from the UI or from a third party.
  • Internal API calls – API calls made internally from one service to another, either on behalf of an external API call or from internal timer jobs.
    • People sometimes overlook this area and leave internal API calls free from any authentication. This is not the right approach.
    • To implement this we need ways to transparently pass authentication tokens from one service to another (try to do it at the framework level rather than having every developer take care of it).
    • With this in place, every service becomes the owner of its data.
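Passing the token transparently at the framework level can be sketched like this. It is a minimal illustration using Python's contextvars; the two hook names (handle_incoming, outbound_headers) are hypothetical and stand for whatever request and client hooks your framework offers.

```python
import contextvars
from typing import Callable, Dict, Optional

# Holds the caller's token for the duration of one request.
_current_token = contextvars.ContextVar("current_token", default=None)


def handle_incoming(headers: Dict[str, str], handler: Callable[[], object]) -> object:
    """Framework hook: remember the caller's token while the handler runs."""
    reset = _current_token.set(headers.get("Authorization"))
    try:
        return handler()
    finally:
        _current_token.reset(reset)


def outbound_headers(extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Framework hook: headers for a call to another service, token included."""
    headers = dict(extra or {})
    token = _current_token.get()
    if token is not None:
        headers["Authorization"] = token
    return headers
```

Business code inside the handler never touches the token. It only asks the HTTP client wrapper for outbound headers, so a developer cannot forget to forward it.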

Where to Implement:

  • API Gateway Level
    • Here we use an API gateway that does all the authentication, via session token, JWT or other mechanisms
    • Any request arriving at the gateway (an L7 gateway, e.g. nginx, traefik) is checked for authentication, and only then does the request go to the service
    • You do not have to implement it from scratch (even though almost every framework provides it out of the box)
    • The service doesn't need to worry about authentication (still, for service-to-service communication on internal API calls, the service has to pass the token through)
    • For any communication between services, a gateway between the services will also be required.
  • Service Level
    • At the service level there are various frameworks (Spring Security, Passport JS for Node) that provide the authentication and authorization parts so that one doesn't have to code them from scratch.
    • The service (or the framework on which the service is written) needs to understand the token so that it can pass it through for internal API calls.

Which level you implement it at depends heavily on how you are integrating things.

Horizontal Aspects

  • Auditing
    • very important – must be considered from the very start.
    • Many frameworks provide support for this, e.g. Spring Boot
  • External IDP – Identity Providers
    • If you are starting from scratch and want integrations with many third parties such as Google, Facebook and others for single sign-on, an external IDP is a very good choice.
    • Auth0, AWS Cognito and Okta are some of the external IDPs
    • Many things such as password expiration policies and two-factor authentication are available out of the box.
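As a rough idea of what auditing support looks like when it is built into the framework layer, here is a small Python sketch of a decorator that records who did what and when. The names audited, AUDIT_LOG and rename_product are hypothetical, and a real system would write to durable storage rather than a list.

```python
import functools
import time
from typing import Callable, Dict, List

# In-memory audit trail, standing in for a database table or log stream.
AUDIT_LOG: List[Dict[str, object]] = []


def audited(action: str) -> Callable:
    """Record the authenticated actor and a timestamp for each successful call."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(actor: str, *args, **kwargs):
            result = func(actor, *args, **kwargs)
            # Appended only after the call succeeds, so failures are not logged as done.
            AUDIT_LOG.append({"actor": actor, "action": action, "at": time.time()})
            return result
        return wrapper
    return decorator


@audited("product.rename")
def rename_product(actor: str, product_id: str, new_name: str) -> str:
    return f"{product_id} renamed to {new_name}"
```

The actor passed in here would come from the authentication step, which is why auditing and authentication are best designed together.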

By now you should have the gist of authentication in the microservices world. If you have any doubts, put them in the comments section.

How to start your journey into Microservices Part -2 (DDD)

In the last blog we learned that there are various areas of designing microservices.

In this blog we will work on decomposing a larger problem into smaller microservices and look at the subtle but very important aspects of why to break it up and how simple it can be.

For this we will be using DDD. We will not go into detail on what exactly DDD is; we will work on the strategic design part of DDD. Still, let's go over some of the principles of DDD.

One principle behind DDD is to bridge the gap between domain experts and developers by using the same language to create the same understanding. You must have seen cases where it becomes difficult for product managers or domain experts to iteratively add features because the PM and the developers speak different languages.

Another principle is to reduce complexity by applying object oriented design and design patterns to avoid reinventing the wheel.

But what is a Domain? A Domain is a “sphere of knowledge”, for instance the business the company runs. A Domain is also called a “problem space”, so the problem for which we have to design a solution.

Let's choose a business problem, such as building an e-commerce site (like Amazon), to which we will try to apply DDD. We will not be able to go into the complete depth of the business problem. Here are the very basic features required for it to work:

  1. User searches for a product
  2. User places an order for that product
  3. User pays online
  4. Delivery Management processes that order and starts the delivery
  5. Delivery updates the Order Status

Now we will try to design this in the traditional way. Let's create the schema:

  • Tables
    • Product
      • id
      • name
      • image url
      • price
      • count available
      • ….
    • Product Packaging Info
      • id
      • product id
      • size
      • isFragile
      • …..
    • Order
      • product id
      • user id
      • delivery address details
      • paid
    • Order Delivery Status
      • id
      • order id
      • delivery company name
      • delivery company id
      • delivery company code
    • Delivery Company
      • id
      • name
      • ….
    • User
      • id
      • name
      • address details
    • User Preferences
      • id
      • name
      • preferences

We can create a table structure like this in our system (there will be many more tables). Looking at these tables, you can see that not every table needs to be understood by every developer. For example, someone working on delivery management might not be interested in UserPreferences (used for searching), and someone working on search might not be interested in OrderDeliveryStatus.

From this you can see that we need to break the structure into smaller areas.

To design this in a way that adds more business context and gives us smaller structures to manage, let's apply DDD.

A Domain in DDD means a specific area of knowledge. In a larger sense, the e-commerce domain can be divided internally into various subdomains such as:

  1. Identity Management
    1. User
  2. Inventory Management
    1. Product
  3. Product Search
    1. Product
    2. UserPreferences
  4. Order
    1. Order
    2. Order Delivery Status
  5. Delivery Management
    1. Order
    2. Product Packaging Info
    3. Delivery Company
    4. Order Delivery Status
  6. Billing

The separated domains can easily be visualized. In DDD terms this is called a Context Map, and it is the starting point for any further modeling. Essentially, what we have done is break up the larger problem into smaller interconnected ones.
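The subdomain list can also be written down as data, which makes the overlaps easy to spot. The sketch below is only an illustration: the dictionary mirrors the list in this post (Billing has no entities listed, so it is left empty), and the helper name shared_entities is made up.

```python
from collections import defaultdict
from typing import Dict, List

# Subdomain -> entities it works with, copied from the list in this post.
CONTEXT_MAP: Dict[str, List[str]] = {
    "Identity Management": ["User"],
    "Inventory Management": ["Product"],
    "Product Search": ["Product", "UserPreferences"],
    "Order": ["Order", "Order Delivery Status"],
    "Delivery Management": [
        "Order",
        "Product Packaging Info",
        "Delivery Company",
        "Order Delivery Status",
    ],
    "Billing": [],  # no entities are listed for Billing in the text
}


def shared_entities(context_map: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the entities that appear in more than one subdomain."""
    owners = defaultdict(list)
    for subdomain, entities in context_map.items():
        for entity in entities:
            owners[entity].append(subdomain)
    return {entity: subs for entity, subs in owners.items() if len(subs) > 1}


if __name__ == "__main__":
    for entity, subdomains in shared_entities(CONTEXT_MAP).items():
        print(entity, "is used by", ", ".join(subdomains))
```

The shared entities are exactly the places where two services will need a contract between them.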

Now we need to align the subdomain, aka the problem space, with our solution design; we need to form a solution space. A solution space in DDD lingo is also called a Bounded Context, and it is best to align one problem space/Subdomain with one solution space/Bounded Context.

Here we can think of each subdomain as a different microservice. Microservices are not complete without their dependencies. Let's see them in terms of bounded contexts:

  • Product Search – dependent on – Inventory Management
  • Delivery Management – dependent on – Order
  • Product Search – dependent on – User
  • Billing – dependent on – Order
  • … and so on

You can see that there is a dependency between the Order and Billing services. They talk via a common shared object model, which each of them can represent separately rather than using a single complete model, which is cumbersome. For example, the order object in the Order service (which cares about status) is different from the order in the Billing service (which cares about amount). This is the benefit of breaking them into smaller, separated domains.
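The idea of one shared contract and two separate representations can be sketched in a few lines of Python. All class and field names here (OrderPlaced, FulfilmentOrder, BillableOrder and their fields) are hypothetical and chosen only to illustrate the point.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class OrderPlaced:
    """Shared contract between the Order and Billing services."""
    order_id: str
    user_id: str
    product_id: str
    amount: Decimal


@dataclass
class FulfilmentOrder:
    """Order as the Order service sees it: it cares about status."""
    order_id: str
    product_id: str
    status: str = "PLACED"


@dataclass
class BillableOrder:
    """Order as the Billing service sees it: it cares about amount."""
    order_id: str
    user_id: str
    amount: Decimal
    paid: bool = False


def to_fulfilment(event: OrderPlaced) -> FulfilmentOrder:
    """Order service keeps only what it needs from the contract."""
    return FulfilmentOrder(order_id=event.order_id, product_id=event.product_id)


def to_billable(event: OrderPlaced) -> BillableOrder:
    """Billing service keeps only what it needs from the contract."""
    return BillableOrder(order_id=event.order_id, user_id=event.user_id, amount=event.amount)
```

Each service can now change its own model freely, as long as the shared contract stays stable.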

To define such contracts one can also use ContextMapper.

There are certain takeaways from this:

  • Breaking the system into smaller pieces is very important, so that it becomes easy for people to work on different parts of the business
  • It is very simple when we are clear about the business subdomains.

After this I recommend that you go into more depth on DDD and look at one more example here.

Next we will look at authentication mechanisms.

How to start your journey into Microservices Part -1

Architecting an application using microservices can be very confusing for first timers. This article is relevant if

  • You are beginning to develop an application that can scale like Amazon, Facebook, Netflix, and Google.
  • You are doing this for the first time.
  • You have already done research and decided that microservices architecture is going to be your secret sauce.

Microservices architecture is believed to be the simplest way of scaling without limits. However, when you get started, a lot of considerations are going to confuse you. Questions arose as I spent time learning about it online or discussing it with a team:

  1. What exactly is a microservice?
    1. Some said it should not exceed 1,000 lines of code.
    2. Some say it should fit one bounded context (if you don’t know what a bounded context is, don’t bother with it right now; keep reading).
  2. Even before deciding on what the “micro”service will be, what exactly is a service?
  3. Microservices do not allow updating multiple entities at once; how will I maintain consistency between entities?
  4. Should I have a single database cluster for all my microservices?
  5. What is this eventual consistency thing everyone is talking about?
  6. How will I collate data which is composed of multiple entities residing in different services?
  7. What would happen if one service goes down? How would the dependent services behave?
  8. Should I make a sync invocation between microservices to always get consistent data?
  9. How will I manage version upgrades to a few or all microservices? Is it always possible to do it without downtime?
  10. And the last unavoidable question – how do I test the entire application as an integrated application?

Hmm… All of the above questions must be answered to be able to understand and deploy applications based on microservices.

Let's first list all the things we should cover to understand microservices:

  • Decomposition – Breaking the system into microservices and the contracts between them
  • Authentication – how authentication info is passed from service to service, and how different services validate the session
  • Service Discovery – hard coded, or external system based like Consul, Eureka, Kubernetes
  • Data Management – Where to store the data, whether to share the DB or not
  • Auditing – very important in business applications to have audit information about who updated or created something
  • Transactional Messaging – when you require very high consistency between a DB operation and the event it passes on to different services
  • Testing – What all to test , Single Service , Cross Service Concerns
  • Deployment Pattern – serverless , docker based
  • Developer Environment Setup – All Services running on developer machine , or single setup
  • Release Upgrades – How to do zero downtime release upgrades , blue green deployments
  • Debugging – Pass tracing id between services , track time taken between services , log aggregation
  • Monitoring – API time taken , System Health Check
  • UI Development – Single Page Applications or Micro Front Ends and client side composition
  • Security – for internal users

The first thing we should learn is how to decompose or build the services:

Domain Driven Design:

While researching the methodology to break up the application into these components and also define the contract between them, we found the Domain Driven Design philosophy to be the most convincing. At a high level, it guided us on

  • How to break a bigger business domain into smaller bounded contexts (services).
  • How to define contracts between them (services).

These smaller components are the microservices. At a finer level, domain driven design (aka DDD) provides tactical methods to help us with

  • How to write code in a single bounded context (service).
  • How to break up the entities (business objects within the service).
  • How to become eventually consistent (as our data is divided into multiple services, we cannot have all of them consistent at every moment).
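Eventual consistency is easier to grasp with a toy example. The sketch below uses an in-memory queue in place of a real message broker. The two service classes, the event tuple layout and the deliver_pending helper are all made up for illustration.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Event = Tuple[str, str, int]  # (event type, order id, amount)


class OrderService:
    """Owns order status and publishes an event for every new order."""

    def __init__(self, outbox: Deque[Event]) -> None:
        self.status: Dict[str, str] = {}
        self._outbox = outbox

    def place_order(self, order_id: str, amount: int) -> None:
        self.status[order_id] = "PLACED"
        self._outbox.append(("OrderPlaced", order_id, amount))


class BillingService:
    """Owns invoices and learns about orders only through events."""

    def __init__(self) -> None:
        self.invoices: Dict[str, int] = {}

    def handle(self, event: Event) -> None:
        kind, order_id, amount = event
        if kind == "OrderPlaced":
            self.invoices[order_id] = amount


def deliver_pending(queue: Deque[Event], billing: BillingService) -> int:
    """Stand-in for the broker: hand every queued event to Billing."""
    delivered = 0
    while queue:
        billing.handle(queue.popleft())
        delivered += 1
    return delivered
```

Between place_order and deliver_pending the two services disagree about the order: Order already knows it, Billing does not. Once the event is delivered they agree again, which is what "eventually consistent" means.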

After getting the answer to “how to break the application into microservices,” we needed a framework for writing code along these guidelines. We could have come up with a framework of our own, but we chose not to reinvent the wheel. Lagom, Axon and Eventuate are all Java-based frameworks that provide all the functionality we require to do microservices, such as

  1. Writing services using DDD.
  2. Testing services.
  3. Integration with Kafka, RabbitMQ …, the messaging frameworks for message passing between services.

This is Part 1 in a series of articles. I have explained how we got our direction on getting started with microservices. In the next article, we will discuss a sample application and break it up using DDD.

Recommended References

Thanks to the blogs by Vaughn Vernon, Udi Dahan, Chris Richardson, and Microsoft. A few specific references:

  1. YouTube for Event Sourcing, CQRS, Domain Driven Design, Lagom.
  2. https://msdn.microsoft.com/en-us/library/jj554200.aspx
  3. http://cqrs.nu/
  4. Domain-Driven Design Distilled and Implementing Domain-Driven Design, by Vaughn Vernon.
  5. http://microservices.io/ by Chris Richardson.
  6. Event Storming http://eventstorming.com/, by Alberto.