In this series we will understand postgresql indexes and also when to use them with sample cases.
Also we will be comparing certain indexes to understand which one to use when.
Let’s Begin:
B-Tree
The most common , highly used index on Postgresql or in any Sql DB.
What btree does is create a self balancing tree (with multiple child nodes not 2) in which the leaf nodes point to the exact row location.
Also the Btree all leaf nodes pointers point to adjacent node using doubly linked list. (this primarily help in getting the sorted data in very fast as we just reach to first element and then traverse further will see this in action).
Lets see a diagram to understand: here the leaf nodes are marked in white and they points to each other and contain row address and you can see they are in sorted order.

How to use?
First create a table with dummy data to understand, here i am creating 1 million rows:
create table student as select s as rollno, md5(random()::text) as name from generate_Series(1,1000000) s;
Use Case 1 : Query for equal
Now lets query this to understand Btree benefits:
explain analyze select * from student where rollno=9090;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Gather (cost=1000.00..15375.79 rows=5292 width=36) (actual time=10.074..98.045 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on student (cost=0.00..13846.59 rows=2205 width=36) (actual time=54.934..83.723 rows=0 loops=3)
Filter: (rollno = 9090)
Rows Removed by Filter: 333333
Planning time: 0.197 ms
Execution time: 98.084 ms
(8 rows)
Now create index:
create index roll_idx on student(rollno);
Lets try again:
explain analyze select * from student where rollno=9090;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Index Scan using roll_idx on student (cost=0.42..8.44 rows=1 width=37) (actual time=0.106..0.108 rows=1 loops=1)
Index Cond: (rollno = 9090)
Planning time: 0.426 ms
Execution time: 0.154 ms
Now you can see that total time taken after index is ~500 times faster than normal as in normal case it was doing a parallel scan(in older version of postgres it was also sequential scan).
Use Case 2: Query for range
explain analyze select * from student where rollno > 9090 and rollno < 10008;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Index Scan using roll_idx on student (cost=0.42..41.52 rows=905 width=37) (actual time=0.017..0.605 rows=917 loops=1)
Index Cond: ((rollno > 9090) AND (rollno < 10008))
Planning time: 0.215 ms
Execution time: 0.715 ms
(4 rows)
Now you could see this works well in case of range.
No no it is not checking for all the range value one by one , it is using the leaf property of pointing to each other in sorted order.
Use Case in case of string:
in case of string first create index and check :
create index name_idx on student(name);
explain analyze select * from student where name = 'b170e15823193100056e09725aec94c4';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Index Scan using name_idx on student (cost=0.42..8.44 rows=1 width=37) (actual time=0.115..0.116 rows=0 loops=1)
Index Cond: (name = 'b170e15823193100056e09725aec94c4'::text)
Planning time: 0.390 ms
Execution time: 0.151 ms
See this works perfectly well.
When to use?
- When you are querying a table frequently on this column by filtering
- If you want data sorted also in that order
- Date , Integer , String are good candidates for this
- DO NOT USE if you are searching ilike or like type of queries. There are diff. indexes for that(in next blogs)
- DO NOT JUST CREATE A INDEX ON EACH COLUMN . This has a huge overhead for postgres (will discuss in series)
Next – Multi Column B-Tree
please subscribe for more such indepth blogs.