Bulk Updation/Insertion of Database Tables in Java using Hibernate – Optimized Way

Hibernate is the most popular orm framework used to interact with databases in java . In this article  we will see what are the various ways using which bulk selection and updation in any table can be done and what is the most effective way when using the hibernate framework in java . 

I  experimented with three ways which are as follows : 

  • Using Hibernate’s Query.list() method.
  • Using ScrollableResults with FORWARD_ONLY scroll mode.
  • Using ScrollableResults with FORWARD_ONLY scroll mode in a StatelessSession.

To decide which one gives best performance for our use case, following tests i performed using the above three ways listed.

  • Select and update 1000 rows.

Let’s see the Code and results by applying above three ways to the operation stated above one by one. 

Using Hibernate’s Query.list() method.

Code Executed : 

   List rows;
        Session session = getSession();
        Transaction transaction = session.beginTransaction();
        try {
            Query query = session.createQuery("FROM PersonEntity WHERE id > :maxId ORDER BY id").setParameter("maxId",
                    MAX_ID_VALUE);
            query.setMaxResults(1000);
            rows = query.list();
            int count = 0;
            for (Object row : rows) {
                PersonEntity personEntity = (PersonEntity) row;
                personEntity.setName(randomAlphaNumeric(30));
                session.saveOrUpdate(personEntity);
                //Always flush and clear the session after updating 50(jdbc_batch_size specified in hibernate.properties) rows
                if (++count % 50 == 0) {
                    session.flush();
                    session.clear();
                }
            }
        } finally {
            if (session != null && session.isOpen()) {
                transaction.commit();
                session.close();
            }
        }

Tests Results : 

  • Time taken:- 360s to 400s
  • Heap Pattern:- gradually increased from 13m to 51m(from jconsole). 

Using ScrollableResults with FORWARD_ONLY scroll mode.

With this we are expecting that it should consume less memory that the 1st approach . Let’s see the results 

Code Executed : 

Session session = getSession();
        Transaction transaction = session.beginTransaction();
        ScrollableResults scrollableResults = session
                .createQuery("FROM PersonEntity WHERE id > " + MAX_ID_VALUE + " ORDER BY id")
                .setMaxResults(1000).scroll(ScrollMode.FORWARD_ONLY);
        int count = 0;
        try {
            while (scrollableResults.next()) {
                PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
                personEntity.setName(randomAlphaNumeric(30));
                session.saveOrUpdate(personEntity);
                if (++count % 50 == 0) {
                    session.flush();
                    session.clear();
                }
            }
        } finally {
            if (session != null && session.isOpen()) {
                transaction.commit();
                session.close();
            }
        }

Tests Results : 

  • Time taken:- 185s to 200s
  • Heap Pattern:- gradually increased from 13mb to 41mb (measured same using jconsole)

Using ScrollableResults with FORWARD_ONLY scroll mode in a StatelessSession.

A stateless session does not implement a first-level cache nor interact with any second-level cache, nor does it implement transactional write-behind or automatic dirty checking, nor do operations cascade to associated instances. Collections are ignored by a stateless session. Operations performed via a stateless session bypass Hibernate’s event model and interceptors.   

These type of session is always recommended in case of bulk updation as we really do not need these overheads of hibernate features in these type of usecases . 

Code Executed : 

 StatelessSession session = getStatelessSession();
        Transaction transaction = session.beginTransaction();
        ScrollableResults scrollableResults = session
                .createQuery("FROM PersonEntity WHERE id > " + MAX_ID_VALUE + " ORDER BY id")
                .setMaxResults(TRANSACTION_BATCH_SIZE).scroll(ScrollMode.FORWARD_ONLY);
        try {
            while (scrollableResults.next()) {
                PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
                personEntity.setName(randomAlphaNumeric(20));
                session.update(personEntity);
            }
        } finally {
            if (session != null && session.isOpen()) {
                transaction.commit();
                session.close();
            }
        }

Tests Results : 

  • Time taken:- 185s to 200s
  • Heap Pattern:- gradually increased from 13mb to 39mb

I also performed the same tests with 2000 rows and the results obtained were as follows:-

Results:-

  • Using list():- time taken:- approx 750s, heap pattern:- gradually increased from 13mb to 74 mb
  • Using ScrollableResultSet:- time taken:- approx 380s, heap pattern:- gradually increased from 13mb to 46mb
  • Using Stateless:- time taken:- approx 380s, heap pattern:- gradually increased from 13mb to 43mb

Blocker Problem with all above approaches Tried

ScrollableResults and Stateless ScrollableResults give almost the same performance which is much better than Query.list(). But there is still one problem with all the above approaches. Locking, all the above approaches select and update the data in same transaction, this means for as long as the transaction is running, the rows on which updates have been performed will be locked and any other operations will have to wait for the transaction to finish.

Solution : 

There are two things which we should do here to solve above problem : 

  •  we need to select and update data in different transactions.
  • And updation of these types should be done in Batches

So again I performed the same tests as above but this time update was performed in a different transaction which was commited in batches of 50.

Note:- In case of Scrollable and Stateless we need a different session also, as we need the original session and transaction to scroll through the results.

Results using Batch Processing

  • Using list():- time taken:- approx 400s, heap pattern:- gradually increased from 13mb to 61 mb
  • Using ScrollableResultSet:- time taken:- approx 380s, heap pattern:- gradually increased from 13mb to 51mb
  • Using Stateless:- time taken:- approx 190s, heap pattern:- gradually increased from 13mb to 44mb

Observation:- This temporal performance of ScrollableResults dropped down to become almost equal to Query.list(), but performance of Stateless remained almost same.

Summary and Conclusion

As from all the above experimentation  , in cases where we need to do bulk selection and updation, the best approach in terms of memory consumption and time is as follows : 

  • Use ScrollableResults in a Stateless Session.
  • Perform selection and updation in different transactions in batches of 20 to 50 (Batch Processing) (Note -*-  Batch size  can depend on the case to case basis)

  Sample Code with the best approach

  StatelessSession session = getStatelessSession();
        Transaction transaction = session.beginTransaction();
        ScrollableResults scrollableResults = session
                .createQuery("FROM PersonEntity WHERE id > " + MAX_ID_VALUE + " ORDER BY id")
                .setMaxResults(TRANSACTION_BATCH_SIZE).scroll(ScrollMode.FORWARD_ONLY);
        int count = 0;
        try {
            StatelessSession updateSession = getStatelessSession();
            Transaction updateTransaction = updateSession.beginTransaction();
            while (scrollableResults.next()) {
                PersonEntity personEntity = (PersonEntity) scrollableResults.get(0);
                personEntity.setName(randomAlphaNumeric(5));
                updateSession.update(personEntity);
                if (++count % 50 == 0) {
                    updateTransaction.commit();
                    updateTransaction = updateSession.beginTransaction();
                }
            }
            updateSession.close();
        } finally {
            if (session != null && session.isOpen()) {
                transaction.commit();
                session.close();
            }
        }

With the   java frameworks like spring and others this code may be even more smaller , like one not needing to  take care of session closing etc . Above code is written in plain java using hibernate. 

Please  try with large data and comment us the results , Also if you have some other better approach to do this please comment . 

Thank You for reading the article

Tune Linux Kernel Parameters For PostgreSQL Optimization and better System Performance

Introduction

In my previous Article i explained  Tuning PostgreSQL Database Memory Configuration Parameters to Optimize Performance and as i said  Database performance does not only depend on Postgresql configurations but also on system parameters .Poorly configured OS kernel parameters can cause degradation in database server performance. Therefore, it is imperative that these parameters are configured according to the database server and its workload. In this article  i will be talking about centos/redhat  linux system specially . 

Story

I will start the article with small story where on one of our client huge amount of writes were there and customer have provided us 200 GB of RAM for that dedicated database server , So there were no problem of resources. 

Now what was happening that after sometime system loads get increased so much and on debugging we found no special query  around the time when load increases . Somewhere over internet we found if we clear the system cache  regularly then issue will be resolved . 

We then schedule a cron to clear system cache after some regular interval and issue got resolved . 

Now  the question is why issue was not coming after this ? ? ?

And the Answer is  that due to large cache size as we have so much of ram available   lots of data is collected in RAM (in GB’s) and when this whole data flushes out on to the disk   ,  system load becomes high at that time 

So from that we came to know that we also need to tune some system parameters also to optimize system and database(postgresql) performance . 

In above case we tuned vm.dirty_background_ratio and vm.dirty_ratio , these two system(os) parameters to resolve the issue . 

Kernel parameters Tuning

Now what values we set for these above two parameters described in story and what are all other  important Linux kernel parameters that can affect database server performance which we can tune are described as follows : 

vm.dirty_background_ratio / vm.dirty_background_bytes

The vm.dirty_background_ratio is the percentage of memory filled with dirty pages that need to be flushed to disk. Flushing is done in the background. The value of this parameter ranges from 0 to 100; however, a value lower than 5 may not be effective and some kernels do not internally support it. The default value is 10 on most Linux systems. You can gain performance for write-intensive operations with a lower ratio, which means that Linux flushes dirty pages in the background.

You need to set a value of vm.dirty_background_bytes depending on your disk speed.

There are no “good” values for these two parameters since both depend on the hardware. However, setting vm.dirty_background_ratio to 5 and vm.dirty_background_bytes to 25% of your disk speed improves performance by up to ~25% in most cases.

vm.dirty_ratio / dirty_bytes

This is the same as vm.dirty_background_ratio / dirty_background_bytes except that the flushing is done in the foreground, blocking the application. So vm.dirty_ratio should be higher than vm.dirty_background_ratio. This will ensure that background processes kick in before the foreground processes to avoid blocking the application, as much as possible. You can tune the difference between the two ratios depending on your disk IO

 vm.swappiness

vm.swappiness is another kernel parameter that can affect the performance of the database. This parameter is used to control the swappiness (swapping pages to and from swap memory into RAM) behavior on a Linux system. The value ranges from 0 to 100. It controls how much memory will be swapped or paged out. Zero means disable swap and 100 means aggressive swapping.

You may get good performance by setting lower values.

Setting a value of 0 in newer kernels may cause the OOM Killer (out of memory killer process in Linux) to kill the process. Therefore, you can be on the safe side and set the value to 1 if you want to minimize swapping. The default value on a Linux system is 60. A higher value causes the MMU (memory management unit) to utilize more swap space than RAM, whereas a lower value preserves more data/code in memory.

A smaller value is a good bet to improve performance in PostgreSQL.

vm.overcommit_memory / vm.overcommit_ratio

Applications acquire memory and free that memory when it is no longer needed. But in some cases, an application acquires too much memory and does not release it.  This can invoke the OOM killer. Here are the possible values for vm.overcommit_memory parameter with a description for each:

  1. Heuristic overcommit, Do it intelligently (default); based kernel heuristics
  2. Allow overcommit anyway
  3. Don’t over commit beyond the overcommit ratio.

Reference: https://www.kernel.org/doc/Documentation/vm/overcommit-accounting

vm.overcommit_ratio is the percentage of RAM that is available for overcommitment. A value of 50% on a system with 2 GB of RAM may commit up to 3 GB of RAM.

A value of 2 for vm.overcommit_memory yields better performance for PostgreSQL. This value maximizes RAM utilization by the server process without any significant risk of getting killed by the OOM killer process. An application will be able to overcommit, but only within the overcommit ratio, thus reducing the risk of having OOM killer kill the process. Hence a value to 2 gives better performance than the default 0 value. However, reliability can be improved by ensuring that memory beyond an allowable range is not overcommitted. It avoids the risk of the process being killed by OOM-killer.

On systems without swap, one may experience a problem when vm.overcommit_memory is 2.

https://www.postgresql.org/docs/current/static/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

Generally speaking almost all applications which uses more memory depends on this , For example  , In Redis setting this value 1 is best . 

Turn On Huge Pages

Linux, by default, uses 4K memory pages, BSD has Super Pages, whereas Windows has Large Pages. A page is a chunk of RAM that is allocated to a process. A process may own more than one page depending on its memory requirements. The more memory a process needs the more pages that are allocated to it. The OS maintains a table of page allocation to processes. The smaller the page size, the bigger the table, the more time required to look up a page in that page table. Therefore, huge pages make it possible to use a large amount of memory with reduced overheads; fewer page lookups, fewer page faults, faster read/write operations through larger buffers. This results in improved performance.

PostgreSQL has support for bigger pages on Linux only. By default, Linux uses 4K of memory pages, so in cases where there are too many memory operations, there is a need to set bigger pages. Performance gains have been observed by using huge pages with sizes 2 MB and up to 1 GB. The size of Huge Page can be set boot time. You can easily check the huge page settings and utilization on your Linux box using cat /proc/meminfo | grep -i huge command.

Get HugePage Info – On Linux (only)

Note: This is only for Linux, for other OS this operation is ignored
$ cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

In this example, although huge page size is set at 2,048 (2 MB), the total number of huge pages has a value of 0. which signifies that huge pages are disabled.

Script to quantify Huge Pages

This is a simple script which returns the number of Huge Pages required. Execute the script on your Linux box while your PostgreSQL is running. Ensure that $PGDATA environment variable is set to PostgreSQL’s data directory.

Get Number of Required HugePages

!/bin/bash
pid=head -1 $PGDATA/postmaster.pid
echo “Pid:            $pid”
peak=grep ^VmPeak /proc/$pid/status | awk '{ print $2 }'
echo “VmPeak:            $peak kB”
hps=grep ^Hugepagesize /proc/meminfo | awk '{ print $2 }'
echo “Hugepagesize:   $hps kB”
hp=$((peak/hps))
echo Set Huge Pages:     $hp

The output of the script looks like this:

Script Output

Pid:            12737
VmPeak:         180932 kB
Hugepagesize:   2048 kB
Set Huge Pages: 88

The recommended huge pages are 88, therefore you should set the value to 88.

Set HugePages Command :

sysctl -w vm.nr_hugepages= 88

Check the huge pages now, you will see no huge page is in use (HugePages_Free = HugePages_Total).

Again Get HugePage Info – On Linux (only)

$ cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
HugePages_Total:      88
HugePages_Free:       88
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

Now set the parameter huge_pages “on” in $PGDATA/postgresql.conf and restart the server.

And Again Get HugePage Info – On Linux (only)

$ cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
HugePages_Total:      88
HugePages_Free:       81
HugePages_Rsvd:       64
HugePages_Surp:        0
Hugepagesize:       2048 kB

Now you can see that a very few of the huge pages are used. Let’s now try to add some data into the database.

Some DB Operations to Utilise HugePages

postgres=# CREATE TABLE foo(a INTEGER);
CREATE TABLE
postgres=# INSERT INTO foo VALUES(generate_Series(1,10000000));
INSERT 0 10000000

Let’s see if we are now using more huge pages than before.

Once More Get HugePage Info – On Linux (only)

$ cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
HugePages_Total:      88
HugePages_Free:       18
HugePages_Rsvd:        1
HugePages_Surp:        0
Hugepagesize:       2048 kB

Now you can see that most of the huge pages are in use.

Note: The sample value for HugePages used here is very low, which is not a normal value for a big production machine. Please assess the required number of pages for your system and set those accordingly depending on your system’s workload and resources.

Now, Tuning Postgresql parameters and kernel parameters is not enough for good Postgresql performance there are many other things like

  • How you are making Query
  • Proper Indexing — For this you can follow indexing series on our blog
  • Proper partitioning and sharding accroding to business usecase
  • and many more .

Stay tuned to get more blogs on optimizing postgresql performance

Refrences : https://www.percona.com/blog/2018/08/29/tune-linux-kernel-parameters-for-postgresql-optimization/

Tuning PostgreSQL Database Memory Configuration Parameters to Optimize Performance

Introduction 

Out of the box, the default PostgreSQL configuration is not tuned for any particular workload. Default values are set to ensure that PostgreSQL runs everywhere, with the least resources it can consume and so that it doesn’t cause any vulnerabilities. It has default settings for all of the database parameters. It is primarily the responsibility of the database administrator or developer to tune PostgreSQL according to their system’s workload. In this blog, we will establish basic guidelines for setting PostgreSQL database parameters to improve database performance according to workload.

Bear in mind that while optimizing PostgreSQL server configuration improves performance, a database developer must also be diligent when writing queries for the application. If queries perform full table scans where an index could be used or perform heavy joins or expensive aggregate operations, then the system can still perform poorly even if the database parameters are tuned. It is important to pay attention to performance when writing database queries.

Nevertheless, database parameters are very important too, so let’s take a look at the eight that have the greatest potential to improve performance , But before that it is important to understand the memory architecture  basic of postgres  : 

Memory in PostgreSQL can be classified into two categories:

  1. Local Memory area: It is allocated by each backend process for its own use.
  2. Shared memory area: It is used by all processes of a PostgreSQL server.

So Now i will explain tunable parameters and what should be the value of these tunable parameters   : . 

shared_buffer

PostgreSQL uses its own buffer and also uses kernel buffered IO. That means data is stored in memory twice, first in PostgreSQL buffer and then kernel buffer. Unlike other databases, PostgreSQL does not provide direct IO. This is called double buffering. The PostgreSQL buffer is called shared_buffer which is the most effective tunable parameter for most operating systems. This parameter sets how much dedicated memory will be used by PostgreSQL for cache.

The default value of shared_buffer is set very low and you will not get much benefit from that. It’s low because certain machines and operating systems do not support higher values. But in most modern machines, you need to increase this value for optimal performance.

The recommended value is 25% of your total machine RAM. You should try some lower and higher values because in some cases we achieve good performance with a setting over 25%. The configuration really depends on your machine and the working data set. If your working set of data can easily fit into your RAM, then you might want to increase the shared_buffer value to contain your entire database, so that the whole working set of data can reside in cache. That said, you obviously do not want to reserve all RAM for PostgreSQL.

In production environments, it is observed that a large value for shared_buffer gives really good performance, though you should always benchmark to find the right balance.

Alternatively, while a larger shared_buffers value can increase performance in ‘read heavy’ use cases, having a large shared_buffer value can be detrimental for ‘write heavy’ use cases, as the entire contents of shared_buffers must be processed during writes.

Please note that the database server needs to be restarted after this change.

testdb=# SHOW shared_buffers;
shared_buffers
----------------
128MB
(1 row)

wal_buffers

Write-Ahead Logging (WAL) is a standard method for ensuring integrity of data. Much like in the shared_buffers setting, PostgreSQL writes WAL records into buffers and then these buffers are flushed to disk.

The default size of the buffer is set by the  wal_buffers setting- initially at 16MB. If the system being tuned has a large number of concurrent connections, then a higher value for  wal_buffers can provide better performance.

effective_cache_size

effective_cache_size has the reputation of being a confusing PostgreSQL settings, and as such, many times the setting is left to the default value.

The effective_cache_size value provides a ‘rough estimate’ of the number of how much memory is available for disk caching by the operating system and within the database itself, after taking into account what’s used by the OS itself and other applications.

This value is used only by the PostgreSQL query planner to figure out whether plans it’s considering would be expected to fit in RAM or not. As such, it’s a bit of a fuzzy number to define for general use cases.

A conservative value for  effective_cache_size  would be ½(50%) of the total memory available on the system. Most commonly, the value is set to 75% of the total system memory on a dedicated DB server, but can vary depending on the specific discrete needs on a particular server workload.

If the value for effective_cache_size  is too low, then the query planner may decide not to use some indexes, even if they would help greatly increase query speed.

So Conclusively , General recommendation for effective_cache_size is as follows.

  • Set the value to the amount of file system cache available. On UNIX/Linux like systems, add the free+cached numbers from free or top commands to get an estimate
  • If you don’t know, set the value to the 50%  or 75%  of total system memory specially if dedicated DB server

work_mem

This configuration is used for complex sorting. If you have to do complex sorting then increase the value of work_mem for good results. In-memory sorts are much faster than sorts spilling to disk. Setting a very high value can cause a memory bottleneck for your deployment environment because this parameter is per user sort operation. Therefore, if you have many users trying to execute sort operations, then the system will allocate work_mem * total sort operations  for all users. Setting this parameter globally can cause very high memory usage. So it is highly recommended to modify this at the session level.

12345678910testdb=# SET work_mem TO “2MB”;testdb=# EXPLAIN SELECT * FROM bar ORDER BY bar.b;                                    QUERY PLAN                                     ———————————————————————————–Gather Merge  (cost=509181.84..1706542.14 rows=10000116 width=24)   Workers Planned: 4   ->  Sort  (cost=508181.79..514431.86 rows=2500029 width=24)         Sort Key: b         ->  Parallel Seq Scan on bar  (cost=0.00..88695.29 rows=2500029 width=24)(5 rows)

The initial query’s sort node has an estimated cost of 514431.86. Cost is an arbitrary unit of computation. For the above query, we have a work_mem of only 2MB. For testing purposes, let’s increase this to 256MB and see if there is any impact on cost.

123456789testdb=# SET work_mem TO “256MB”;testdb=# EXPLAIN SELECT * FROM bar ORDER BY bar.b;                                    QUERY PLAN                                     ———————————————————————————–Gather Merge  (cost=355367.34..1552727.64 rows=10000116 width=24)   Workers Planned: 4   ->  Sort  (cost=354367.29..360617.36 rows=2500029 width=24)         Sort Key: b         ->  Parallel Seq Scan on bar  (cost=0.00..88695.29 rows=2500029 width=24)

The query cost is reduced to 360617.36 from 514431.86 — a 30% reduction.

So Conclusively , Setting the value Higher alway results beer sorting and hashing , But setting it in local queries is always recommended . Set value high in queries where you expect high sorting otherwise low global value is good . 

maintenance_work_mem

maintenance_work_mem is a memory setting used for maintenance tasks. The default value is 64MB. Setting a large value helps in tasks like VACUUM, RESTORE, CREATE INDEX, ADD FOREIGN KEY and ALTER TABLE.

It is necessary to remember that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high.

The default value of maintenance_work_mem = 64MB.

General recommendation to set maintenance_work_mem is as follows.

  • Set the value 10% of system memory, up to 1GB
  • Maybe you can set it even higher if you are having VACUUM problems

We can also temporarily increase this memory while creating indexes or at the time of dump restores or while performing full vacuums . 

synchronous_commit

This is used to enforce that commit will wait for WAL to be written on disk before returning a success status to the client. This is a trade-off between performance and reliability. If your application is designed such that performance is more important than the reliability, then turn off synchronous_commit. This means that there will be a time gap between the success status and a guaranteed write to disk. In the case of a server crash, data might be lost even though the client received a success message on commit. In this case, a transaction commits very quickly because it will not wait for a WAL file to be flushed, but reliability is compromised.

Temp_buffers

This parameter sets the maximum number of temporary buffers used by each database session. The session local buffers are used only for access to temporary tables. The setting of this parameter can be changed within individual sessions but only before the first use of temporary tables within the session.

PostgreSQL database utilizes this memory area for holding the temporary tables of each session, these will be cleared when the connection is closed.

The default value of temp_buffer = 8MB.

Conclusion

There are more parameters that can be tuned to gain better performance but those have less impact than the ones highlighted here. In the end, we must always keep in mind that not all parameters are relevant for all applications types. Some applications perform better by tuning a parameter and some don’t. Tuning PostgreSQL Database Parameters must be done for the specific needs of an application and the OS it runs on.  

Also ,  Performance tuning does not only depends on postgres configuration parameters ,  there are many system parameters also on which postgresql performance depends So, in my next blog i will explain some system parameters which can affect postgresql performance .

Understand the intuition for databases

This article is for beginners who are starting their journey in software development and database looks like a jargon to them .

In this we will cover the intuition for databases , the lingo and types all will be covered in future posts.

Let’s begin by understanding through an example. Lets define our problem:

  • We are a online site which allows users to register on our website and registered users can see other registered users and send messages to them.
  • Now lets assume we are building this in an era where there are no database software available. So what will we do.

First understand our business requirement:

  • When user come on our site and register with user id and password , we should store this somewhere on a persistent storage (hard disk , available after restart).
  • We should be able to retrieve this user info from our persistent storage to validate when somebody comes for login with user id and password to check whether its correct or not.

Naive Solution 1 :

We will write user and their password in a file called "users" in the system in a sequential manner , lets look at the file:

Simple right, now if somebody want to read the userid and password , we just read the entire file check for each row , break it by semicolon ';' and check whether user id given matches the first part of the line (after breaking through semicolon) and second part matches the password provided and one.

But hey there are certain problems in the above structure (i know you have already guessed):

  • It will be very slow if user count is large (as one need to check all rows till the point user matches)
  • if somebody put semicolon in their user name( ahhh easy we wont allow it , but still there is an extra check)

So lets discuss again :

What we will do we will create multiple files for each alphabet ie user_a for user name starting with a , user_b for user name starting b and so on. this way we will reduce the size by a significant amount.

Hey but still if for a there are thousands of users still it will be a problem. Hmmm still stuck.

Lets think again:

Hey we will create a Binary Search tree type structure in File where each node points to the row on the user file (a pointer to exact location of that user in that file so that we can read that file directly), i am just showing a very very basic version for understanding down below:

Now when we do this , this solves problem to some level.

But there are so many other use cases which you will come through:

  1. delete some user – now the index is also needs to be managed
  2. backup of these files if hard disk corrupt
  3. if now i wants to store the user last name as well in this
  4. ……

There are very very few which i have written , the main point here is that all these things will be require by all companies , so database software try to solve the data persistence and query problems , so companies can focus on their business.

I think by this you got the basic intuition about the databases. Done – now go read about various type of databases which are available and what all purpose they do.

A Database Session Leak Can Slow Down Your Database

In this article, I will explain some consequences of session leaks, which I faced when an issue came to me.

Session Leak is very common in developers’ lives. In this article, I will explain some consequences of session leaks, which I faced when an issue came to me and how I came to the root cause of the issue.

Issue and Analysis

IssueLoad is increasing on the server and Postgres queries are consuming CPU and taking time.

I had to resolve this issue and find the root cause of it.

Analysis: After all my debugging, I came to the following analysis:

There is a table that has 200 rows, but the number of live tuples showing there is more than that (around 60K). We are using Postgresql 9.3.

The following are the queries that I ran.

select count(*) from subscriber_offset_manager; 
count 
------- 
200 (1 row) 

SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup ; 
schemaname | relname | n_live_tup | n_dead_tup 
------------+---------------------------+------------+------------ 
public | subscriber_offset_manager | 61453 | 5 (1 row)

But as seen from pg_stat_activity and pg_locks, we are not able to track any open connection.

SELECT query, state,locktype,mode FROM pg_locks JOIN pg_stat_activity USING (pid) WHERE relation::regclass = 'subscriber_offset_manager'::regclass ; 
query | state | locktype | mode 
-------+-------+----------+------
(0 rows)

I also tried full vacuum on this table. Below were the results:

  • All the times no rows were removed
  • Many times all the live tuples become dead tuples.

Here is the output of the running vacuum command:

vacuum FULL VERBOSE ANALYZE subscriber_offset_manager; 
INFO: vacuuming "public.subscriber_offset_manager" 
INFO: "subscriber_offset_manager": found 0 removable, 67920 nonremovable row versions in714 pages 
DETAIL: 67720 dead row versions cannot be removed yet. CPU 0.01s/0.06u sec elapsed 0.13 sec. 

INFO: analyzing "public.subscriber_offset_manager" 
INFO: "subscriber_offset_manager": scanned 710 of 710 pages, containing 200 live rows and67720 dead rows; 200 rows in sample, 200 estimated total rows VACUUM 
after that i checked for live and dead tuples for that table as follows : 


SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup 

schemaname | relname | n_live_tup | n_dead_tup 
------------+---------------------------+------------+------------ 
public | subscriber_offset_manager | 200 | 67749

After 10 seconds:

SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup ;

schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------ 
public | subscriber_offset_manager | 68325 | 132

All the dead tuples moved to live tuples instead of cleaning up.

One more interesting observation: When I stop my Java app and then do a full vacuum, it works fine (number of rows and live tuples become equal). So there is something wrong if we select and update continuously from the Java app.

After all the research and analysis and help for stack overflow and after following many links, I found the following root cause.

Root Cause:

When there is one long-running transaction or a database session leak, then dead tuples are created after the start time of that transaction and will not be cleaned up by the vacuum for all the tables for that database. This is due to the PostgreSQL vacuum process that checks for a transaction ID less than the transaction ID of the oldest transaction for cleaning dead rows. The transaction ID is generated globally.

When I checked, I found a transaction that was opened for too long and when I killed it, the vacuum worked fine.

Please read the below-given links for detailed info and effects of not ending database transaction and some Postgres internals. These links helped me a lot in solving the issue.

1.Question asked by me on StackOverFlow: https://stackoverflow.com/questions/51844616/high-number-of-live-dead-tuples-in-postgresql-vacuum-not-working/51849491?noredirect=1#comment90728543_51849491

2. When Postgresql Vacuum does not Work: https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/

3. Consequences of not ending a database transaction: https://stackoverflow.com/questions/33815267/what-are-the-consequences-of-not-ending-a-database-transaction/33815610#33815610

4.Some Postgres internals and how the vacuum works internally in Postgreshttps://fritshoogland.wordpress.com/category/postgresql/