In last blog we learned about diff. thread states in java, in this blog we will understand how to use that knowledge to debug application issues using fasthreadanalyzer tool.
Lets take one type of issue we see with Java Applications:
Application Unresponsive Issue
When we say application in unresponsive<it can mean different things to diff. people>, here we mean that application is not responding to external api calls.
Lets take an example of a Spring Boot application not responding to the Http API Calls. There can be several reasons to it:
Total Http Thread in tomcat (or whatever container spring is using) are consumed
Causes :
It could be because of some high cpu consuming work is done in those threads and all those threads are stuck doing that work <now to connect with thread states – those threads consuming cpu would be in RUNNABLE state , so we should be looking for lots of RUNNABLE state thread in the jstack>
It could be because of those threads are waiting on some external IO <now to connect with thread states – those threads are logically stuck on some IO to complete, means those threads would be in WAITING/BLOCKED state, we should be looking for threads with such states>
How to debug via jstack:
Take multiple jstacks
Now in every jstack to understand the what diff. thread are doing in which states they are stuck we will use the tool https://fastthreadanalyzer.com/. Just upload your jstack in this tool.
They will show you a table like this <thread group (generally http threads are part of group names http-bio-443-exec) vs count (total count of threads in this group) vs thread states(count of each thread state for these threads)>:
Now to make sense of the information above we will first see whether our http threads are available or not
if thread count on tool and thread count configured in tomcat.conf (or any other container configuration) ir equal to max thread count , that means all http threads are doing something and new requests cannot be processed
Yes
Now if stuck we will see what they are doing by seeing their thread states if most of them on running
means something in you application is taking long time to complete or system is bombarded with many http calls
Now if you see these threads are stuck on waiting/timed waiting/blocked that means most probably these threads are doing some IO and waiting on it
No
Some other issue may be related to JVM
Now to dig further exactly where the threads are waiting or stuck , you can click on the corresponding thread group and system will show what those threads are doing group by similar stack and their thread state, eg:
Now you can see the threads grouped by State and Stack , using this information you can figure out which service in the application is actually consuming the threads.
There could be many other reasons like JVM stuck , machine unresponsive we are not going in detail for them.
With the fast thread tool you can debug many such issues, we will cover more diff. type of issues in future posts.
The purpose of this blog series is to learn how to debug your java application issues, for this firstly we will understand what different thread states are there.
Lets understand what all are the different states of java stack :
In the above diagram you can 6 states of Java Thread:
New : When a new thread is created, it is in the new state. The thread has not yet started to run when thread is in this state. When a thread lies in the new state, it’s code is yet to be run and hasn’t started to execute.
Runnable/Running : A thread that is ready to run is moved to runnable state. In this state, a thread might actually be running or it might be ready run at any instant of time. It is the responsibility of the thread scheduler to give the thread, time to run. A multi-threaded program allocates a fixed amount of time to each individual thread. Each and every thread runs for a short while and then pauses and relinquishes the CPU to another thread, so that other threads can get a chance to run. When this happens, all such threads that are ready to run, waiting for the CPU and the currently running thread lies in runnable state.
Timed Waiting : A thread lies in timed waiting state when it calls a method with a time out parameter. A thread lies in this state until the timeout is completed or until a notification is received. For example, when a thread calls sleep or a conditional wait, it is moved to a timed waiting state.
Waiting : A thread is in the waiting state when it waits for another thread on a condition. When this condition is fulfilled, the scheduler is notified and the waiting thread is moved to runnable state.
Blocked : A thread is in the blocked state when it tries to access a protected section of code that is currently locked by some other thread. When the protected section is unlocked, the schedule picks one of the thread which is blocked for that section and moves it to the runnable state. A thread in this state cannot continue its execution any further until it is moved to runnable state. Any thread in these states does not consume any CPU cycle.
Terminated : A thread terminates because of either of the following reasons:
Because it exists normally. This happens when the code of thread has entirely executed by the program.
Because there occurred some unusual erroneous event, like segmentation fault or an unhandled exception.
Sample Code for creating threads with diff. thread states:
public class ThreadStatesDemo {
public static class WaitingThread extends Thread {
@Override
public void run() {
Object o = new Object();
try {
synchronized (o) {
o.wait();
}
} catch (InterruptedException e) {
}
}
}
public static class SleepingThread extends Thread {
@Override
public void run() {
try {
Thread.sleep(100000);
} catch (InterruptedException e) {
}
}
}
public static class RunningThread extends Thread {
@Override
public void run() {
for (int i = 1; i > 0;) {
}
}
}
public static class TimedWaitingThread extends Thread {
@Override
public void run() {
Object o = new Object();
try {
synchronized (o) {
o.wait(100000);
}
} catch (InterruptedException e) {
}
}
}
public static Integer mutex = 0;
public static class BlockedThread extends Thread {
@Override
public void run() {
try {
Thread.sleep(1000);
synchronized (mutex) {
}
} catch (InterruptedException e) {
}
}
}
public static class BlockingThread extends Thread {
@Override
public void run() {
synchronized (mutex) {
for (int i = 1; i > 0;) {
}
}
}
}
public static void main(String[] args) {
Thread wTh = new WaitingThread();
wTh.setName("waiting");
wTh.start();
Thread sTh = new SleepingThread();
sTh.setName("sleeping");
sTh.start();
Thread rTh = new RunningThread();
rTh.setName("running");
rTh.start();
Thread twTh = new TimedWaitingThread();
twTh.setName("timed waiting");
twTh.start();
Thread bldTh = new BlockedThread();
bldTh.setName("blocked");
bldTh.start();
Thread blcTh = new BlockingThread();
blcTh.setName("blocking");
blcTh.start();
try {
Thread.sleep(1000000);
} catch (InterruptedException e) {
}
}
}
When you will run the application , and take a jstack via jstack command , you will get some output like: this:
#command to take jstack
jstack -l <pid>
2021-10-18 17:20:34
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.221-b11 mixed mode):
"blocking" #15 prio=5 os_prio=0 tid=0x00007f1ee411e800 nid=0xc99 runnable [0x00007f1eae09d000]
java.lang.Thread.State: RUNNABLE
at ThreadStatesDemo$BlockingThread.run(ThreadStatesDemo.java:69)
- locked <0x000000076e5c0bb0> (a java.lang.Integer)
Locked ownable synchronizers:
- None
"blocked" #14 prio=5 os_prio=0 tid=0x00007f1ee411c800 nid=0xc98 waiting for monitor entry [0x00007f1eae19e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at ThreadStatesDemo$BlockedThread.run(ThreadStatesDemo.java:56)
- waiting to lock <0x000000076e5c0bb0> (a java.lang.Integer)
Locked ownable synchronizers:
- None
"timed waiting" #13 prio=5 os_prio=0 tid=0x00007f1ee411b000 nid=0xc97 in Object.wait() [0x00007f1eae29f000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076e938550> (a java.lang.Object)
at ThreadStatesDemo$TimedWaitingThread.run(ThreadStatesDemo.java:41)
- locked <0x000000076e938550> (a java.lang.Object)
Locked ownable synchronizers:
- None
"running" #12 prio=5 os_prio=0 tid=0x00007f1ee4119000 nid=0xc96 runnable [0x00007f1eae3a0000]
java.lang.Thread.State: RUNNABLE
at ThreadStatesDemo$RunningThread.run(ThreadStatesDemo.java:29)
Locked ownable synchronizers:
- None
"sleeping" #11 prio=5 os_prio=0 tid=0x00007f1ee4117000 nid=0xc95 waiting on condition [0x00007f1eae4a1000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at ThreadStatesDemo$SleepingThread.run(ThreadStatesDemo.java:20)
Locked ownable synchronizers:
- None
"waiting" #10 prio=5 os_prio=0 tid=0x00007f1ee4115800 nid=0xc94 in Object.wait() [0x00007f1eae5a2000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000076e7fae38> (a java.lang.Object)
at java.lang.Object.wait(Object.java:502)
at ThreadStatesDemo$WaitingThread.run(ThreadStatesDemo.java:9)
- locked <0x000000076e7fae38> (a java.lang.Object)
Locked ownable synchronizers:
- None
Now in this stack trace you can see the Thread State via the line:
java.lang.Thread.State: <THREAD STATE>
eg:
java.lang.Thread.State: WAITING (on object monitor)
In the code to create various Thread States a thread a created for each state or multiple thread for same state<representing how a particular thread state is reached>.
Let’s see:
Thread State RUNNABLE – a thread with code wise just working in a single loop.
Thread State WAITING – a thread with name waiting and code wise called wait() on a Object.
Thread State BLOCKED – a thread with name blocked and code wise trying to get into synchronized block on a Object whose lock is already taken by thread with name blocking <blocking thread almost always take the lock first because of sleep in Blocked Thread>
Thread State TIMED_WAITING – a thread with name timed_waiting and code wise called wait() with time<100000> on a Object and a thread with name sleeping and code wise called sleep on Thread Object.
Now i think you got an understanding of what all java thread states are there and how thread can end up in such state, there are more ways also apart from them.
In Our previous blog we have seen mainly the filebeat and metric beat and explored the system module of that . In this blog we will see the usage of heart beat and how to monitor the services using heart beat .
Heartbeat should not be installed on each server you monitor , it should be installed on some separate servers from which you can monitor all url’s/services . For example we have one server deployed at x.x.x.x:8000 at some server in aws in north region , then we can install heart beat in our four server in each region(north,south,east,west) of aws and can monitor this server from all the servers to check weather services is UP from all india .
From these four server we can monitors all the services url’s .
For setting up the Heartbeat following is the link :
After that , On Dashboard tab you can see the MetricBeat monitoring ,
Also you can see the uptime app in kibana to check status and tls expiry time and history of all downtimes :
Followings are some screenshots:
Configuration in heartbeat.yml for setting the name of machine from where url is pinging in heartbeat.yml
processors:
- add_observer_metadata:
# Optional, but recommended geo settings for the location Heartbeat is running in
geo:
# Token describing this location
name: sahil-machine
# Lat, Lon "
#location: "37.926868, -78.024902"
Configuration in hearebeat.yml for setting to monitors urls :
heartbeat.config.monitors:
# Directory + glob pattern to search for configuration files
path: ${path.config}/monitors.d/*.yml
# If enabled, heartbeat will periodically check the config.monitors path for changes
reload.enabled: false
# How often to check for changes
reload.period: 5s
# Configure monitors inline
heartbeat.monitors:
- type: http
# Set enabled to true (or delete the following line) to enable this example monitor
enabled: false
# ID used to uniquely identify this monitor in elasticsearch even if the config changes
id: my-monitor
# Human readable display name for this service in Uptime UI and elsewhere
name: My Monitor
# List or urls to query
urls: ["http://localhost:9200"]
# Configure task schedule
schedule: '@every 10s'
# Total test connection and data exchange timeout
#timeout: 16s
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service.name: my-apm-service-name
- type: http
# Set enabled to true (or delete the following line) to enable this example monitor
enabled: true
# ID used to uniquely identify this monitor in elasticsearch even if the config changes
id: emerge-gurgaon
# Human readable display name for this service in Uptime UI and elsewhere
name: emerge-gurgaon
# List or urls to query
urls: ["https://app.ameyoemerge.in:8887/"]
# Configure task schedule
schedule: '@every 10s'
# Total test connection and data exchange timeout
#timeout: 16s
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service.name: my-apm-service-name
- type: http
# Set enabled to true (or delete the following line) to enable this example monitor
enabled: true
# ID used to uniquely identify this monitor in elasticsearch even if the config changes
id: emerge-banglore-app24
# Human readable display name for this service in Uptime UI and elsewhere
name: emerge-banglore-app24
# List or urls to query
urls: ["https://app24.ameyoemerge.in:8887/"]
# Configure task schedule
schedule: '@every 10s'
# Total test connection and data exchange timeout
#timeout: 16s
# Name of corresponding APM service, if Elastic APM is in use for the monitored service.
#service.name: my-apm-service-name
In the next blog we will explore Logstash with filebeat . Happy debugging . . .
As nowadays lots of our servers are deployed on Cloud and many applications are running on these servers , it is impossible to monitor and analyze logs by going to each servers . Central Logging and Monitoring solution is a must in present time .
In this Bog Series , we will learn about usage of Elastic Stack aka ELK .
Overview :
Elastic Stack is a group of open source products from Elastic designed to help users take data from any type of source and in any format and search, analyze, and visualize that data in real time. The product group was formerly known as ELK Stack, in which the letters in the name stood for the products in the group: ElasticSearch, Logstash and Kibana. A fourth product, Beats, was subsequently added to the stack, rendering the potential acronym unpronounceable. Elastic Stack can be deployed on premises or made available as Software as a Service
Architechture :
For a small-sized development environment, the classic architecture will look as follows :
Here there are many different types of beats you can read them from https://www.elastic.co/beats/ . Each beat have different set of usecases .
In this blog we will learn about two beats MetricBeat and FileBeat .
Note – LogStash is an options part in the architecture and should not be needed in most of the cases . Read more about Logstash at https://www.elastic.co/logstash/
Usage Elastic Stack :
I am running experiments on CentOS7 machine and using rpm to setup the elastic stack .
If you are getting output like above , it means elastic search is installed successfully .
Note : To change listen address and port you can change in the following file : /etc/elasticsearch/elasticsearch.yml
Kibana :
Kibana is the Front end tool which communicates to Elastic search where anyone can monitor and analyze logs .
Commands to install kibana :
curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.14.0-linux-x86_64.tar.gz
tar xzvf kibana-7.14.0-linux-x86_64.tar.gz
cd kibana-7.14.0-linux-x86_64/
./bin/kibana
Note : configure vim config/kibana.yml for port and ip addressed for listening settings .
Beats
These will be installed on all servers from where we want to collect information . they are like agents which will send data to Elastic Search .
Enabling Metric Beat :
Every Beats supports different modules , it is up to the use that which module one wnts to enable in each beats . if we talk about MetricBeat it has many modules like System,Postgres,Nginx and so on . In this Blog we will see usage of System Module of MetricBeat .
sudo metricbeat modules enable system
sudo metricbeat setup -e
sudo service metricbeat start
Here we are only enabling system module of metri beats , there are many odule for basic monitoring of aplications like postgresql , nginx , tomcat etc .
Fo list of modules available in metric beats : command is
metricbeat modules list
Yeipeee Now we can Monitor System Data in kibana as follows .
Open [Metricbeat System] Host overview ECS in Dashboards in kibana UI . There you can apply filter of host of which one wants to see data .
System Module MetricBeat Uses : What analysis can be Done by System module of MetricBeat :
Traditionally after accessing linux servers , we gather system information by using many different commands and tools which also takes time , specially when there is some running issue on production .
Following is the list of information :
Size information of all partitions
Read/Write Performance of Hardisk
InboundOutBound Traffic analysis per Ethernet Port
Load Avergae analysis of system
Top Proesses consuming High CPU and RAM
All these type of information now can be seen in seconds for some particular host using kibana UI .
Following are some screenshots :
Enabling FileBeat
Whether you’re collecting from security devices, cloud, containers, hosts, or OT, Filebeat helps you keep the simple things simple by offering a lightweight way to forward and centralize logs and files.
Note : For configuring filebeat that where to send data to elastic search or filebeat configue in /etc/filebeat/filebeat.yml , cureent as i have only one machine so no need to do an conf as defaut conf will work for me You can check the following lion : https://www.elastic.co/guide/en/beats/filebeat/7.14/configuring-howto-filebeat.html
enabling system logs module in filebeat :
filebeat modules enable system
(for system logs if we want to set custom paths : edit the file /etc/filebeat/modules.d/system.yml) -- Generally no need to change these config in all cases
filebeat setup -e
sudo service filebeat start
Like Metric Beat , FileBeats also have list of modules like postgres,nginx , and it also supports logging of popular framework like spring and can collect logs of these applications and provides ways to analyze them easily .
To check modules list available for filebeat use following command :
[root@localhost elk]# filebeat modules list | less
System Module Filebeat Uses :
Now you can use Kibana UI to analyze system logs like messages etc .
Open [Filebeat System] Syslog dashboard ECS in Dashboard Tab in Kibana .
Following are some screen shots which one can see :
Configure filebeat for custom log files :
Now we may have situation where none of Modules and integration with framework logging work in filebeat for our custom application log then in that case you can configure your input manually to configure path of logs to read and analayse them in logs and stream section in kibana UI
Here you can search in logs by hostname , filepath and can also search in whole message which is fetched .
By default only message column is shown . One can configure then in settings tabs of logs tabs in kibana .
Following are some screenshot :
By Default logs lines are only one column , if for advance debugging we want to break log tine into columns then we need to use Logstash with Grok Filter .
In next blog we will see the usage of LogStash to break custom logs into columns for better understanding .
In our Daily debugging we need to analyze logs files of various products . Reading those log files are not an easy task , it requires special debugging skills which can only be gained through experience or by god’s grace . Now while debugging we might need to extract some of data or we need to play with a log file which can not be done by just reading , there is need for commands .
There are many commands in linux which are used by debuggers like grep,awk,sed,wc,taskset,ps,sort,uniq,cut,xargs etc . . .
In this blog we will see usage of Practical grep commands examples useful in real world debugging in Linux . The examples which we will see in this blog are super basic but very useful in real life which a beginner should read to enhance the debugging skills .
Let’s Go to the Practical Part
Grep the lines which contains some particular word
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep 'sahil' file1.log
i am sahil
sahil is a software engineer
Grep number of lines matched for a particualar word in a file
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep -c 'sahil' file1.log
2
Another way :
[root@localhost playground]# grep 'sahil' file1.log | wc -l
2
Grep all the lines in which contains some word in a file with case insensitive
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep -i 'sahil' file1.log
i am sahil
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]#
Grep the lines in which either of two words are present in a file
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep 'sahil\|software' file1.log
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]#
Grep lines in which two words are present
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep 'sahil' file1.log | grep 'software'
sahil is a software engineer
[root@localhost playground]# ^C
[root@localhost playground]#
Eliminate lines which contains some word
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep -v 'sahil' file1.log
hello
i am software engineer
Sahil is a software engineer
Eliminate case insensitively
[root@localhost playground]# grep -iv 'sahil' file1.log
hello
i am software engineer
[root@localhost playground]#
Matching the lines that start with a string
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep '^sahil' file1.log
sahil is a software engineer
Matching the lines that end with a string
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep 'engineer$' file1.log
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]#
Getting n number of lines after each match
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]#
[root@localhost playground]# grep 'hello' file1.log
hello
[root@localhost playground]# grep -A 1 'hello' file1.log
hello
i am sahil
[root@localhost playground]# grep -A 2 'hello' file1.log
hello
i am sahil
i am software engineer
Geeting n number of lines before each match
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep 'i am sahil' file1.log
i am sahil
[root@localhost playground]# grep -B 1 'i am sahil' file1.log
hello
i am sahil
[root@localhost playground]# grep -B 2 'i am sahil' file1.log
hello
i am sahil
[root@localhost playground]#
Grep n lines after and m lines before every match
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# grep -A 2 -B 1 'i am sahil' file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
[root@localhost playground]#
Grep some word in more than one file in current directory
[root@localhost playground]# cat file1.log
hello
i am sahil
i am software engineer
Sahil is a software engineer
sahil is a software engineer
[root@localhost playground]# cat file2.log
hello
i am sahil
i am tech blogger
Sahil is a tech blogger
sahil is a tech blogger
[root@localhost playground]# grep 'sahil' file1.log file2.log
file1.log:i am sahil
file1.log:sahil is a software engineer
file2.log:i am sahil
file2.log:sahil is a tech blogger
Grep some word in all files in current directory
[root@localhost playground]# grep 'sahil' *
file1.log:i am sahil
file1.log:sahil is a software engineer
file2.log:i am sahil
file2.log:sahil is a tech blogger
[root@localhost playground]#
[root@localhost playground]# cat file3.log
time taken by api is 1211 ms
time taken by api is 2000 ms
time taken by api is 3000 ms
time taken by api is 4000 ms
time taken by api is 50000 ms
time taken by api is 123 ms
time taken by api is 213 ms
time taken by api is 456 ms
time taken by api is 1000 ms
Now suppose we want to grep all the lines in which time taken by any api is more than 1 second or more than 1000 ms , it means it should have minimum 4 digit number .
Now grep command for this will be as follows :
[root@localhost playground]# grep -P '[0-9]{4} ms' file3.log
time taken by api is 1211 ms
time taken by api is 2000 ms
time taken by api is 3000 ms
time taken by api is 4000 ms
time taken by api is 50000 ms
time taken by api is 1000 ms
If want to get 5 digit number
[root@localhost playground]# grep -P '[0-9]{5} ms' file3.log
time taken by api is 50000 ms
Recursively grep in a directory and sub directoies
[root@localhost playground]# grep -R 'sahil' .
./dir1/file.log:i am sahil
./dir1/file.log:sahil is a software engineer
./file1.log:i am sahil
./file1.log:sahil is a software engineer
./file2.log:i am sahil
./file2.log:sahil is a tech blogger
[root@localhost playground]#
All above are basic use cases of grep . One can mix all the command options of grep to achieve the complex use cases and also one can also mix different grep commands using pipe operator to achieve complex use cases .
In future blogs i will explain some complex use cases and example how to achieve that using linux commands which can ease logs debugging .