PostgreSQL's Index Maintenance (B-tree reindex and drop unused indexes)
This article is not entirely complete yet, but it's still useful:
Reindex Maintenance Operation:
Important
Points:
1. All the numbers
and intervals in this document are experimental.
2. All the scripts
in this document must be changed mode to executable by using the following
command:
chmod +x
scriptname.sh
3. These scripts
are intended to run as root or a user which can switch user to ‘postgres’ user
passwordlessly.
4. The scripts must
run by /bin/sh like the following:
/bin/sh
scriptname.sh
• PostgreSQL provides the index methods
B-tree, hash, GiST, spgist, brin and GIN. Default is btree. Users can also
define their own index methods, but that is fairly complicated. When the WHERE
clause is present, a partial index is created. The default method is btree.
• Create index syntax:
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ]
[ [ IF NOT EXISTS ] name ] ON [ ONLY ] table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ] [ opclass [ (
opclass_parameter = value [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST
} ] [, ...] )
[ INCLUDE ( column_name [, ...] ) ]
[ WITH ( storage_parameter [= value] [, ... ] ) ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
• Alter index syntax:
ALTER INDEX [ IF EXISTS ] name
RENAME TO new_name
ALTER INDEX [ IF EXISTS ] name
SET TABLESPACE tablespace_name
ALTER INDEX name ATTACH
PARTITION index_name
ALTER INDEX name [ NO ]
DEPENDS ON EXTENSION extension_name
ALTER INDEX [ IF EXISTS ] name
SET ( storage_parameter [= value] [, ... ] )
ALTER INDEX [ IF EXISTS ] name
RESET ( storage_parameter [, ... ] )
ALTER INDEX [ IF EXISTS ] name
ALTER [ COLUMN ] column_number
SET STATISTICS integer
ALTER INDEX ALL IN TABLESPACE name
[ OWNED BY role_name [, ... ] ]
SET TABLESPACE new_tablespace [ NOWAIT ]
-------------------------------------------------------------------------------------------------------------
• Reindex syntax:
REINDEX [ ( option [, ...] ) ] { INDEX |
TABLE | SCHEMA | DATABASE | SYSTEM } [ CONCURRENTLY ] name
where option can be one of:
CONCURRENTLY [ boolean ]
TABLESPACE new_tablespace
VERBOSE [ boolean ]
alternatively
you can use reindexdb binary file for maintenance purposes:
reindexdb:
reindexdb is
a wrapper around the SQL command REINDEX(7). There is no effective difference
between reindexing databases via this utility and via
other methods for accessing the server.
File path:
PostgreSQL’s executables’ path:
/usr/pgsql-<Major
Version Number>/bin/
Command-line
arguments:
Example:
To reindex
the table t and the index ind in a database named drpezeshki:
reindexdb --table=t --index=ind
drpezeshki
• Normally you don't have to worry about
index fragmentation at all. However, if there has been a mass delete or update,
or the sustained change rate was so high that autovacuum couldn't keep up, you
may end up with a badly bloated index.
• Find boated/fragmented indexes:
o Query current
extensions with:
select * from pg_extension;
select * from pg_available_extensions;
• Extensions for index management
1) pg_repack_
Reorganize
tables in PostgreSQL databases with minimal locks
is a PostgreSQL extension which lets you
remove bloat from
tables and
indexes, and optionally restore the physical order of clustered
indexes.
Unlike CLUSTER_ and `VACUUM FULL`_ it works online, without
holding an
exclusive lock on the processed tables during processing.
pg_repack is
efficient to boot, with performance comparable to using
CLUSTER
directly.
2) pgstattuple:
/usr/pgsql-13/share/extension/pg_stattuple.control
dnf -y install postgresql-contrib
#connect to
the desired database and issue the subsequent command:
CREATE EXTENSION pgstattuple;
o issue the
following command:
SELECT * FROM pgstatindex('index name');
Sample
output:
query to get
the leaf fragmentation of all B-tree indexes:
----------------------------------------------------------------------------------------------------------------
SELECT i.indexrelid::regclass,
s.leaf_fragmentation
FROM pg_index AS i
JOIN pg_class AS t ON i.indexrelid = t.oid
JOIN pg_opclass AS opc ON i.indclass[0] = opc.oid
JOIN pg_am ON opc.opcmethod = pg_am.oid
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s
WHERE t.relkind = 'i'
AND pg_am.amname = 'btree';
-------------------------------------------------------------------------------------------------------------
Using these
extensions automation tasks can be planned and programmed to defragment indexes
periodically like SQL Server
Related
performance parameters:
maintenance_work_mem
max_parallel_maintenance_workers
effective_cache_size
Example:
• We have a table of more than 200,000
rows with the following definition:
Dataset
exists on the following link:
https://www.kaggle.com/speedoheck/inpatient-hospital-charges
---------------------------------------------------
create table inpatientcharges
(
drg_definition
varchar(200),
provider_id
integer,
provider_name
varchar(200),
provider_street_address
varchar(200),
provider_city
varchar(50),
provider_state
varchar(10),
provider_zip_code
integer,
hospital_referral_region_description
varchar(50),
total_discharges
integer,
average_covered_charges
varchar(20),
average_total_payments
varchar(20),
average_medicare_payments
varchar(20)
);
alter table inpatientcharges owner to
postgres;
---- Load the data into table
---- Adding an identity column
ALTER TABLE inpatientCharges
ADD COLUMN id int
-- int or bigint or smallint
GENERATED BY DEFAULT AS IDENTITY;
--------------------------------------------------------------------------------------------------
• create index on the identity column:
-- create unique index concurrently ic on
inpatientcharges (id asc nulls last) with (fillfactor=80); -- default is 90
-------------------------------------------------
If we want
to turn it to the primary key, and to recreate a primary key constraint,
without blocking updates while the index is rebuilt:
CREATE UNIQUE INDEX CONCURRENTLY ic ON
inpatientCharges (id asc nulls last) include (drg_definition, provider_id, provider_name,provider_street_address,provider_city, provider_state,provider_zip_code,hospital_referral_region_description,total_discharges,average_covered_charges,average_total_payments,average_medicare_payments)
with (fillfactor = 80);
ALTER TABLE inpatientCharges
-- DROP CONSTRAINT ic,
ADD CONSTRAINT ic PRIMARY KEY USING INDEX ic;
----------------------------------------------------------------------------------------------------------
You can
issue this command to show indexes that belong to table:
\d table_name
----------------------------------------------------------------------------------------------------------
use this
command to import dataset:
COPY
inpatientCharges(drg_definition,provider_id,provider_name,provider_street_address,provider_city,provider_state,provider_zip_code,hospital_referral_region_description,total_discharges,average_covered_charges,average_total_payments,average_medicare_payments)
FROM
'/var/lib/pgsql/data/inpatientCharges.csv'
DELIMITER ','
CSV HEADER;
-------------------------------------------------------------------------------------------------
SELECT
tablename,
indexname,
indexdef
FROM
pg_indexes
WHERE
schemaname = 'public'
ORDER BY
tablename,
indexname;
-------------------------------------------------------------------------------------------------
create a new
non-primary key index:
---------------------------------------------------------------------------------------------------
CREATE INDEX CONCURRENTLY icpid ON inpatientCharges
(provider_id asc nulls last) include (id, provider_name,provider_city, provider_state,provider_zip_code,total_discharges,average_covered_charges,average_total_payments,average_medicare_payments)
with (fillfactor = 80);
--------------------------------------------------------------------------------------------------
Issue the
following command given before to extract fragmentations by name and
fragmentation is bigger than 40 for example:
--------------------------------------------------------------------------------------------------
SELECT i.indexrelid::regclass,
s.leaf_fragmentation
FROM pg_index AS i
JOIN pg_class AS t ON i.indexrelid = t.oid
JOIN pg_opclass AS opc ON i.indclass[0] = opc.oid
JOIN pg_am ON opc.opcmethod = pg_am.oid
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s
WHERE t.relkind = 'i'
AND pg_am.amname = 'btree' and leaf_fragmentation > 40 and
leaf_fragmentation <> 'NaN' order by leaf_fragmentation desc;
------- in
one line for dynamic query:
-- user data
SELECT i.indexrelid::regclass FROM
pg_index AS i JOIN pg_class AS t ON i.indexrelid = t.oid JOIN pg_opclass AS opc
ON i.indclass[0] = opc.oid JOIN pg_am
ON opc.opcmethod = pg_am.oid CROSS JOIN
LATERAL pgstatindex(i.indexrelid) AS s WHERE t.relkind = 'i' AND pg_am.amname = 'btree' and
leaf_fragmentation > 0 and leaf_fragmentation <> 'NaN' and t.relname
not like 'pg_%' order by leaf_fragmentation desc;
-- system
catalogs
SELECT i.indexrelid::regclass FROM
pg_index AS i JOIN pg_class AS t ON i.indexrelid = t.oid JOIN pg_opclass AS opc
ON i.indclass[0] = opc.oid JOIN pg_am
ON opc.opcmethod = pg_am.oid CROSS JOIN
LATERAL pgstatindex(i.indexrelid) AS s WHERE t.relkind = 'i' AND pg_am.amname = 'btree' and leaf_fragmentation
> 0 and leaf_fragmentation <> 'NaN' and t.relname like 'pg_%' order by
leaf_fragmentation desc;
Fragmentation
of indexes in the above example:
SELECT i.indexrelid::regclass,
s.leaf_fragmentation FROM pg_index AS i JOIN pg_class AS t ON i.indexrelid =
t.oid JOIN pg_opclass AS opc ON i.indclass[0] = opc.oid JOIN pg_am ON opc.opcmethod = pg_am.oid CROSS JOIN LATERAL pgstatindex(i.indexrelid)
AS s WHERE t.relkind = 'i' AND pg_am.amname
= 'btree' and relname = 'ic' or relname='icpid';
select 'path.to.object'::regclass::oid
select 'ic'::regclass::oid
------------------------------------------------------------------------------------------------
Alternative
query to fetch fragmented users only indexes’ list:
I do not vow
that this query returns all fragmented indexes. It’s slower too! Someone
suggested this query to me so I include it here:
SELECT i.indexrelid::regclass,
s.leaf_fragmentation
FROM pg_index AS i
JOIN pg_class AS t ON i.indexrelid = t.oid
JOIN pg_opclass AS opc ON i.indclass[0] = opc.oid
JOIN pg_am ON opc.opcmethod = pg_am.oid
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s
WHERE t.relkind = 'i'
AND pg_am.amname = 'btree' and leaf_fragmentation > 30 and
leaf_fragmentation <> 'NaN' order by leaf_fragmentation desc;
Now you
can automate reindexing:
System
catalog indexes cannot be rebuilt concurrently, thus we have separated user
indexes and system catalogs reindexing
1) Script to
reindex system catalogs (Not concurrent, thus midnights and weekends are
suggested)
usage:
/bin/sh reindex_systemcatalogs.sh <PG
Version> <Fragmentation Threshold> <DBName>
###################################################################
set -x
######## variables: ##############
PGHOME=/usr/pgsql-$1/bin #Home for PG13: /usr/pgsql-13/bin
fragmentation_threshold=$2
dbname=$3
##################################
export PATH=$PATH:$PGHOME
psqlc="sudo -u postgres psql
-d$dbname"
$psqlc -c "SELECT
i.indexrelid::regclass FROM pg_index AS i JOIN pg_class AS t ON i.indexrelid =
t.oid JOIN pg_opclass AS opc ON i.indclass[0] = opc.oid JOIN pg_am ON opc.opcmethod = pg_am.oid CROSS JOIN LATERAL pgstatindex(i.indexrelid)
AS s WHERE t.relkind = 'i' AND
pg_am.amname = 'btree' and leaf_fragmentation >= $fragmentation_threshold
and leaf_fragmentation <> 'NaN' and t.relname like 'pg_%' order by
leaf_fragmentation desc;" | tail -n +3 | head -n -2 | xargs -I {} sudo -u
postgres psql -d$dbname -c "REINDEX (verbose) INDEX {}"
###################################################################
§ If you want to
include index, database, schema, and table’s name too, you should use the
following 2 scripts, both in the same directory:
·
The main script: The script you call:
usage:
/bin/sh reindex_sysindexes.sh <PG version>
<Target Leaf Fragmentation Threshold (0-100)>
Note:
You must
execute this script with /bin/sh
###################################################################
set -x
######## variables: ##############
LOG_DIRECTORY=/var/log/pgsql/maintenance/
LOG_FILE_NAME=reindex_job.log
PGHOME=/usr/pgsql-$1/bin #Home for PG13: /usr/pgsql-13/bin
fragmentation_threshold=$2
##################################
#exec > >(logger -i -p local1.info)
2>&1
LOG_PATH=$LOG_DIRECTORY$LOG_FILE_NAME
alias FULL_DATE='date "+%Y-%m-%d
%T.%N %z"'
mkdir -p /var/log/pgsql/maintenance/
touch $LOG_PATH
export PATH=$PATH:$PGHOME
START_TIMESTAMP=$(echo $(date
"+%s.%N"))
printf
"\------------------------------------------------------------------------------------------------\n$(FULL_DATE):\tStarting
reindex maintenance operation \n"
>> $LOG_PATH 2>&1
printf "$(FULL_DATE):\tAdding
pgstattuple extension to PostgreSQL's databases if not already added: \
(Attention! \"postgresql<pg
version>-contrib\" package must be installed in advance\n\n"
>> $LOG_PATH 2>&1
echo $(sudo -u postgres psql -t -c
"select datname from pg_database where datname not in
('postgres','template0','template1');"\
|
head -n -1 | xargs -I {} sudo -u postgres psql -d{} postgres -c "create
extension if not exists pgstattuple;") >> $LOG_PATH 2>&1
printf "\n$(FULL_DATE):\tReindexing
target system catalogs indexes:\n" >> $LOG_PATH 2>&1
printf "The index names will be in
format: \"<DBName>.[schema name if not public.]<Table
Name>.<Index Name>\"\n" >> $LOG_PATH 2>&1
echo $(sudo -u postgres psql -t -c
"select datname from pg_database where datname \
not in
('postgres','template0','template1');" | head -n -1 | xargs -I {} bash -c \
"sudo -u postgres psql -t -d{} postgres
-c \"SELECT concat('.',indrelid::regclass,'.',relname) \
FROM pg_index AS i JOIN pg_class AS t ON
i.indexrelid = t.oid JOIN pg_opclass AS opc \
ON
i.indclass[0] = opc.oid JOIN pg_am ON
opc.opcmethod = pg_am.oid \
CROSS JOIN LATERAL pgstatindex(i.indexrelid)
AS s \
WHERE t.relkind = 'i' AND pg_am.amname = 'btree' and
leaf_fragmentation >= $fragmentation_threshold \
and leaf_fragmentation <> 'NaN' and
t.relname like 'pg_%'\" | head -n -1 | tr -d ' ' | sed -e 's/^/{}/' | \
xargs -IA ./reindex.sh {} A $LOG_PATH")
END_TIMESTAMP=$(echo $(date
"+%s.%N"))
#EXECUTION_TIME=$(expr $(date
"+%s") - $START_TIMESTAMP)
EXECUTION_TIME=$(bc <<<
"$END_TIMESTAMP-$START_TIMESTAMP")
printf "\n\n$(FULL_DATE):\t Process
finished executing. The entire process took $EXECUTION_TIME seconds to
complete.\n\n\n\n" >> $LOG_PATH 2>&1
###################################################################
·
The subsidiary script in the same directory as the
primary script:
###################################################################
exec >> $3 2>&1
printf "\n"
echo $2
TEMP=$2
TEMP=$(echo "${TEMP##*.}")
sudo -u postgres psql -d$1
-c"REINDEX INDEX $TEMP"
##################################################################
The above
script also logs some information about output to the following file:
/var/log/pgsql/maintenance/reindex_job.log
--------------------
2) user data
reindexing: (once every midnight)
I.
For all databases (more convenient):
usage:
/bin/sh reindex_userindexes.sh <PG
version> <Target Leaf Fragmentation Threshold (0-100)>
Note:
You must
execute this script with /bin/sh
###################################################################
set -x
######## variables: ##############
LOG_PATH=/var/log/pgsql/maintenance/reindex_job.log
PGHOME=/usr/pgsql-$1/bin #Home for PG13: /usr/pgsql-13/bin
fragmentation_threshold=$2
##################################
#exec > >(logger -i -p local1.info)
2>&1
alias FULL_DATE='date "+%Y-%m-%d
%T.%N %z"'
mkdir -p /var/log/pgsql/maintenance/
touch
/var/log/pgsql/maintenance/reindex_job.log
export PATH=$PATH:$PGHOME
START_TIMESTAMP=$(echo $(date
"+%s.%N"))
printf
"\------------------------------------------------------------------------------------------------\n$(FULL_DATE):\tStarting
reindex maintenance operation \n"
>> $LOG_PATH 2>&1
printf "$(FULL_DATE):\tAdding
pgstattuple extension to PostgreSQL's databases if not already added: \
(Attention! \"postgresql<pg
version>-contrib\" package must be installed in advance\n\n"
>> $LOG_PATH 2>&1
echo $(sudo -u postgres psql -t -c
"select datname from pg_database where datname not in
('postgres','template0','template1');"\
|
head -n -1 | xargs -I {} sudo -u postgres psql -d{} postgres -c "create
extension if not exists pgstattuple;") >> $LOG_PATH 2>&1
printf "\n$(FULL_DATE):\tReindexing
target user indexes:\n" >> $LOG_PATH 2>&1
echo $(sudo -u postgres psql -t -c
"select datname from pg_database where datname \
not in
('postgres','template0','template1');" | head -n -1 | xargs -I {} bash -c "sudo -u postgres psql -t -d{}
postgres -c \"SELECT relname \
FROM pg_index AS i JOIN pg_class AS t ON
i.indexrelid = t.oid JOIN pg_opclass AS opc \
ON
i.indclass[0] = opc.oid JOIN pg_am ON
opc.opcmethod = pg_am.oid \
CROSS JOIN LATERAL pgstatindex(i.indexrelid)
AS s \
WHERE t.relkind = 'i' AND pg_am.amname = 'btree' and
leaf_fragmentation >= $fragmentation_threshold \
and leaf_fragmentation <> 'NaN' and
t.relname not like 'pg_%';\" | head -n -1 | \
xargs -I@ sudo -u postgres psql -d{}
-c\"REINDEX INDEX CONCURRENTLY @\"") >> $LOG_PATH
2>&1
END_TIMESTAMP=$(echo $(date
"+%s.%N"))
#EXECUTION_TIME=$(expr $(date
"+%s") - $START_TIMESTAMP)
EXECUTION_TIME=$(bc <<<
"$END_TIMESTAMP-$START_TIMESTAMP")
printf "\n$(FULL_DATE):\t Process
finished executing. The entire process took $EXECUTION_TIME seconds to
complete.\n\n\n" >> $LOG_PATH 2>&1
###################################################################
The above
script also logs some information about output to the following file:
/var/log/pgsql/maintenance/reindex_job.log
Sample log
output:
-------------------------------------------------------------------------------------------------------------
§ If you want to
include index, database,
schema, and table’s name too, you should use the following 2 scripts,
both in the same directory:
·
The main script: The script you call:
usage:
/bin/sh reindex_userindexes.sh
<PG version> <Target Leaf Fragmentation Threshold (0-100)>
Note:
You must
execute this script with /bin/sh
###################################################################
set -x
######## variables:
##############
LOG_DIRECTORY=/var/log/pgsql/maintenance/
LOG_FILE_NAME=reindex_job.log
PGHOME=/usr/pgsql-$1/bin #Home for PG13: /usr/pgsql-13/bin
fragmentation_threshold=$2
###############################
#exec > >(logger
-i -p local1.info) 2>&1
LOG_PATH=$LOG_DIRECTORY$LOG_FILE_NAME
alias FULL_DATE='date
"+%Y-%m-%d %T.%N %z"'
mkdir -p
/var/log/pgsql/maintenance/
touch $LOG_PATH
export PATH=$PATH:$PGHOME
START_TIMESTAMP=$(echo
$(date "+%s.%N"))
printf
"\------------------------------------------------------------------------------------------------\n$(FULL_DATE):\tStarting
reindex maintenance operation \n"
>> $LOG_PATH 2>&1
printf
"$(FULL_DATE):\tAdding pgstattuple extension to PostgreSQL's databases if
not already added: \
(Attention!
\"postgresql<pg version>-contrib\" package must be installed in
advance\n\n" >> $LOG_PATH 2>&1
echo $(sudo -u postgres
psql -t -c "select datname from pg_database where datname not in
('postgres','template0','template1');"\
| head -n -1 | xargs -I {} sudo -u postgres
psql -d{} postgres -c "create extension if not exists pgstattuple;")
>> $LOG_PATH 2>&1
printf
"\n$(FULL_DATE):\tReindexing target user indexes:\n" >>
$LOG_PATH 2>&1
printf "The index
names will be in format: \"<DBName>.[schema name if not
public.]<Table Name>.<Index Name>\"\n" >> $LOG_PATH
2>&1
echo $(sudo -u postgres
psql -t -c "select datname from pg_database where datname \
not in
('postgres','template0','template1');" | head -n -1 | xargs -I {} bash -c \
"sudo -u postgres psql -t -d{} postgres
-c \"SELECT concat('.',indrelid::regclass,'.',relname) \
FROM pg_index AS i JOIN pg_class AS t ON
i.indexrelid = t.oid JOIN pg_opclass AS opc \
ON i.indclass[0] = opc.oid JOIN pg_am ON opc.opcmethod = pg_am.oid \
CROSS JOIN LATERAL pgstatindex(i.indexrelid)
AS s \
WHERE t.relkind = 'i' AND pg_am.amname = 'btree' and
leaf_fragmentation >= $fragmentation_threshold \
and leaf_fragmentation <> 'NaN' and
t.relname not like 'pg_%'\" | head -n -1 | tr -d ' ' | sed -e 's/^/{}/' |
\
xargs -IA ./reindex.sh {} A $LOG_PATH")
END_TIMESTAMP=$(echo
$(date "+%s.%N"))
#EXECUTION_TIME=$(expr
$(date "+%s") - $START_TIMESTAMP)
EXECUTION_TIME=$(bc
<<< "$END_TIMESTAMP-$START_TIMESTAMP")
printf
"\n\n$(FULL_DATE):\t Process finished executing. The entire process took
$EXECUTION_TIME seconds to complete.\n\n\n\n" >> $LOG_PATH
2>&1
###################################################################
·
The subsidiary script in the same directory as the
primary script, name it reindex.sh and change its mode to executable by issuing
‘chmod +x reindex.sh’:
###################################################################
exec >> $3
2>&1
printf "\n"
echo $2
TEMP=$2
TEMP=$(echo
"${TEMP##*.}")
sudo -u postgres psql
-d$1 -c"REINDEX INDEX CONCURRENTLY $TEMP"
##################################################################
Sample log
output:
-------------------------------------------------------------------------------------------------------------
II.
For one specific database:
usage:
/bin/sh
reindex_userdata.sh <PG Version> <Fragmentation Threshold>
<DBName>
###################################################################
set -x
######## variables: ##############
PGHOME=/usr/pgsql-$1/bin #Home for PG13: /usr/pgsql-13/bin
fragmentation_threshold=$2
dbname=$3
##################################
export PATH=$PATH:$PGHOME
psqlc="sudo -u postgres psql
-d$dbname"
$psqlc -c "SELECT i.indexrelid::regclass
FROM pg_index AS i JOIN pg_class AS t ON i.indexrelid = t.oid JOIN pg_opclass
AS opc ON i.indclass[0] = opc.oid JOIN
pg_am ON opc.opcmethod = pg_am.oid
CROSS JOIN LATERAL pgstatindex(i.indexrelid) AS s WHERE t.relkind =
'i' AND pg_am.amname = 'btree' and
leaf_fragmentation > $fragmentation_threshold and leaf_fragmentation
<> 'NaN' and t.relname not like 'pg_%' order by leaf_fragmentation
desc;" | tail -n +3 | head -n -2 | xargs -I {} sudo -u postgres psql
-d$dbname -c "REINDEX (verbose) INDEX CONCURRENTLY {}"
###################################################################
Logging is
not provided for the above script.
------------------------------------------------------------------------------------------------
The alternative
query which someone suggested to me has not been turned into a script as it is
slower.
Find unused
indexes:
------------------------------------------------------------------------------------------------
SELECT s.schemaname,
s.relname AS tablename,
s.indexrelname AS indexname,
pg_relation_size(s.indexrelid) / 1048576.0 AS index_size
FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid
WHERE s.idx_scan = 0 -- has never been scanned
AND 0 <>ALL (i.indkey) --
no index column is an expression
AND NOT i.indisunique -- is not
a UNIQUE index
AND NOT EXISTS -- does
not enforce a constraint
(SELECT 1 FROM pg_catalog.pg_constraint c
WHERE c.conindid = s.indexrelid)
ORDER BY pg_relation_size(s.indexrelid)
DESC;
------- in one line for dynamic query:
SELECT concat(s.schemaname,'.',s.indexrelname)
FROM pg_catalog.pg_stat_user_indexes s JOIN pg_catalog.pg_index i ON
s.indexrelid = i.indexrelid WHERE s.idx_scan = 0 AND 0 <>ALL (i.indkey)
AND NOT i.indisunique AND NOT EXISTS (SELECT 1 FROM pg_catalog.pg_constraint c
WHERE c.conindid = s.indexrelid) ORDER BY pg_relation_size(s.indexrelid) DESC;
-------------------------------------------------------------------------------------------------------------
or:
-------------------------------------------------------------------------------------------------------------
SELECT
relid::regclass AS table,
indexrelid::regclass AS index,
pg_size_pretty(pg_relation_size(indexrelid::regclass)) AS index_size,
idx_tup_read,
idx_tup_fetch,
idx_scan
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0
AND indisunique IS FALSE
order by index_size desc
;
---------------
3) drop
unused indexes (once every 2 weeks):
You have to
periodically reset your index usage statistics by issuing ‘pg_stat_reset()’
command. So two scripts come in the way. The script (I) shall execute ‘pg_stat_reset()’
and script (II) would
drop the indexes, however, I combined these two operations into one script
below. According to the load of your database, you can schedule the execution
of the script, say every month:
usage:
/bin/sh drop_unused_idx.sh <DBName>
#####Script (I)&(II)
combined##########################################
#set -x
dbname=$1
psqlc="sudo -u postgres psql
-d$dbname"
$psqlc -c "SELECT
concat(s.schemaname,'.',s.indexrelname) FROM pg_catalog.pg_stat_user_indexes s
JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid WHERE s.idx_scan = 0
AND 0 <>ALL (i.indkey) AND NOT i.indisunique AND NOT EXISTS (SELECT 1
FROM pg_catalog.pg_constraint c WHERE c.conindid = s.indexrelid) ORDER BY
pg_relation_size(s.indexrelid) DESC;" | tail -n +3 | head -n -2 | xargs -I
{} $psqlc -c'drop index {}'
$psqlc -c “select pg_stat_reset()”
###################################################################
Some fragmentation workload:
create temp table tt as select * from
inpatientCharges where id % 3 = 0; delete from inpatientCharges where id % 3 =
0; insert into inpatientCharges select * from tt;
References:
1) https://www.postgresql.org/docs/14/sql-createindex.html
create index
2) https://www.postgresql.org/docs/14/sql-reindex.html
rebuild
index
3) https://www.postgresql.org/docs/14/spgist-builtin-opclasses.html
Built-in
operator classes
4) https://www.postgresql.org/docs/14/sql-createopclass.html
create
operator class
5) https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DEDUPLICATION
b-tree
deduplication
6) https://www.postgresql.org/docs/14/runtime-config-client.html#GUC-GIN-PENDING-LIST-LIMIT
gin_pending_list_limit
7) https://www.postgresql.org/docs/14/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM
maintenance_work_mem
8) https://www.postgresql.org/docs/14/runtime-config-resource.html#GUC-MAX-PARALLEL-MAINTENANCE-WORKERS
max_parallel_maintenance_workers
9) https://www.postgresql.org/docs/14/runtime-config-query.html#GUC-EFFECTIVE-CACHE-SIZE
effective_cache_size
10) https://www.postgresql.org/docs/14/runtime-config-client.html#GUC-GIN-PENDING-LIST-LIMIT
gin_pending_list_limit
11) https://www.postgresql.org/docs/current/indexes-opclass.html
indexes-opclass
12) https://www.postgresql.org/docs/13/pgstattuple.html
pgstattuple
13) https://reorg.github.io/pg_repack/1.3/
pg_repack
fork of pg_reorg_
14) https://github.com/pgxn/pgxnclient
PGXN Client:
A command line tool to interact with the PostgreSQL Extension Network
15) https://www.postgresql.org/docs/13/ecpg-dynamic.html
PostgreSQL
dynamic query
16) https://www.cybertec-postgresql.com/en/get-rid-of-your-unused-indexes/
Comments
Post a Comment