PostgreSQL の contrib モジュール “auto_explain” を使ってみた

こんにちは。サイオステクノロジー OSS サポート担当 Y です。

PostgreSQL では複数の contrib モジュールが提供されており、必要に応じて利用することができます。

今回は contrib モジュールの 1つである auto_explain を使ってみました。(※以下の内容は PostgreSQL 10.1 をソースコードからビルドした環境にて検証/調査しています。)

(https://www.postgresql.org/docs/10/static/contrib.html)

(https://www.postgresql.org/docs/10/static/auto-explain.html)

auto_explain を利用すると “実行時間が設定した時間を超えたクエリ” の実行計画をログに出力することができ、スロークエリの原因調査やパフォーマンスチューニング等に役立てることができます。

1 ■auto_explain のインストール
2 ■検証
3 ■最後に

■auto_explain のインストール

まずは auto_explain をビルドしてインストールします。

各 contrib モジュールのソースコードは、PostgreSQL の tar ボールを展開したディレクトリ配下の contrib/<モジュール名> というディレクトリに格納されているため、当該ディレクトリに移動し make 及び make install を実行します。(※make install はインストール先ディレクトリの権限等、必要に応じて root ユーザ権限で実行します。)

[postgres@centos74 ~]$ cd /usr/local/src/postgresql-10.1/contrib/auto_explain/
[postgres@centos74 auto_explain]$ 
[postgres@centos74 auto_explain]$ make
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I. -I../../src/include  -D_GNU_SOURCE   -c -o auto_explain.o auto_explain.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o auto_explain.so auto_explain.o  -L../../src/port -L../../src/common -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags  
[postgres@centos74 auto_explain]$ 
[postgres@centos74 auto_explain]$ su - root 
Password: 
Last login: Tue Jan 30 10:07:38 JST 2018 from 10.1.3.113 on pts/0
[root@centos74 ~]# 
[root@centos74 ~]# cd /usr/local/src/postgresql-10.1/contrib/auto_explain/
[root@centos74 auto_explain]# 
[root@centos74 auto_explain]# make install
/bin/mkdir -p '/usr/local/pgsql/lib'
/bin/install -c -m 755  auto_explain.so '/usr/local/pgsql/lib/auto_explain.so'
[root@centos74 auto_explain]# 
[root@centos74 ~]# exit
logout

インストールが完了すると、<PostgreSQL のインストールディレクトリ>/lib/ 配下に auto_explain.so というファイルが格納されているはずです。

[postgres@centos74 ~]$ ls -l /usr/local/pgsql/lib/auto_explain.so 
-rwxr-xr-x 1 root root 18912 Jan 30 10:18 /usr/local/pgsql/lib/auto_explain.so

■検証

auto_explain を使うためにはモジュールをロードする必要があるので、postgresql.conf の shared_preload_libraries に ‘auto_explain’ を設定します。

併せて、auto_explain の動作に関するいくつかの設定を実施しておきます。(auto_explain.* の設定はデフォルトの postgresql.conf には記載されていないため、必要に応じて追記します)

[postgresql.conf 抜粋]
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = 100
auto_explain.log_analyze = on
auto_explain.log_verbose = on

上記設定の場合、実行に 100ms 以上の時間を要したクエリの統計情報がログに出力されます。また、EXPLAIN ANALYZE VERBOSE を実行した際に出力される情報がログに出力されます。

※今回は DB 全体 (全セッション) で auto_explain を有効にするために shared_preload_libraries を利用していますが、LOAD/ALTER ROLE SET コマンドや session_preload_libraries 等を利用して特定のセッションでのみ auto_explain を有効にすることも可能です。

postgresql.conf の設定が完了したら PostgreSQL を起動します。(既に起動している場合は再起動します)

[postgres@centos74 data]$ pg_ctl restart -w -t 120
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-01-30 10:35:52.151 JST [2550] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2018-01-30 10:35:52.152 JST [2550] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2018-01-30 10:35:52.164 JST [2550] LOG:  redirecting log output to logging collector process
2018-01-30 10:35:52.164 JST [2550] HINT:  Future log output will appear in directory "log".
 done
server started

PostgreSQL 起動後、psql にて接続し設定内容を確認してみます。

postgres=# select name, setting from pg_settings where name like 'auto_explain%';
                name                | setting 
------------------------------------+---------
 auto_explain.log_analyze           | on
 auto_explain.log_buffers           | off
 auto_explain.log_format            | text
 auto_explain.log_min_duration      | 100
 auto_explain.log_nested_statements | off
 auto_explain.log_timing            | on
 auto_explain.log_triggers          | off
 auto_explain.log_verbose           | on
 auto_explain.sample_rate           | 1
(9 rows)

設定内容が反映されていることが確認できたので、さっそくテスト用のテーブルを作って動作を確認してみます。

postgres=# CREATE TABLE hoge (a int);
CREATE TABLE
postgres=# 
postgres=# CREATE TABLE fuga (a int);
CREATE TABLE
postgres=# 
postgres=# INSERT INTO hoge VALUES (1);                    
INSERT 0 1
postgres=# 
postgres=# INSERT INTO fuga VALUES (generate_series(1,1000000));
INSERT 0 1000000
postgres=#

ログを確認する前に、EXPLAIN ANALYZE コマンドにてそれぞれのテーブルに対して SELECT * を実行した際の実行時間を確認してみます。

postgres=# EXPLAIN ANALYZE VERBOSE SELECT * FROM hoge;
                                               QUERY PLAN                                                
---------------------------------------------------------------------------------------------------------
 Seq Scan on public.hoge  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.010..0.011 rows=1 loops=1)
   Output: a
 Planning time: 0.074 ms
 Execution time: 0.038 ms
(4 rows)

postgres=# 
postgres=# EXPLAIN ANALYZE VERBOSE SELECT * FROM fuga;
                                                      QUERY PLAN                                                      
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on public.fuga  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.022..95.994 rows=1000000 loops=1)
   Output: a
 Planning time: 0.045 ms
 Execution time: 144.505 ms
(4 rows)

“SELECT * FROM hoge” の実行時間は “0.038 ms”、”SELECT * FROM fuga” の実行時間は “144.505 ms” となっています。

今回は “auto_explain.log_min_duration = 100” という設定にしているので、実行に 100ms 以上の時間を要している “SELECT * FROM fuga” の実行計画のみがログに記録されるはずです。

では実際に上記 2つのクエリを実行し、ログを確認してみます。

postgres=# SELECT * FROM hoge;
 a 
---
 1
(1 row)

postgres=# 
postgres=# SELECT * FROM fuga;
    a    
---------
       1
       2
       3

(omission)

  999998
  999999
 1000000
 (1000000 rows)

postgres=#

[クエリ実行時のログ]
2018-01-30 10:48:50.426 JST [2755] LOG:  duration: 502.519 ms  plan:
        Query Text: SELECT * FROM fuga;
        Seq Scan on public.fuga  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.031..120.548 rows=1000000 loops=1)
          Output: a

上記のように実行に 100ms 以上要している “SELECT * FROM fuga” の実行計画のみが出力されていました。

このように実行に要した時間を条件としてクエリの実行計画をログに出力することが可能ですが、auto_explain.log_min_duration の値を “0” に設定することで、全てのクエリの実行計画をログに出力することも可能です。(ログの出力量が膨大になるため、あまり現実的な設定ではありませんが…)

postgres=# show auto_explain.log_min_duration;
 auto_explain.log_min_duration 
-------------------------------
 100ms
(1 row)

postgres=# 
postgres=# set auto_explain.log_min_duration to 0;
SET
postgres=# 
postgres=# show auto_explain.log_min_duration;
 auto_explain.log_min_duration 
-------------------------------
 0
(1 row)

postgres=# 
postgres=# SELECT * FROM hoge;
 a 
---
 1
(1 row)

postgres=# 
postgres=# SELECT * FROM fuga;
    a    
---------
       1
       2
       3

(omission)

  999998
  999999
 1000000
 (1000000 rows)

postgres=#

[クエリ実行時のログ (auto_explain.log_min_duration = 0)]
2018-01-30 10:52:01.724 JST [2755] LOG:  duration: 0.026 ms  plan:
        Query Text: SELECT * FROM hoge;
        Seq Scan on public.hoge  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.014..0.015 rows=1 loops=1)
          Output: a
2018-01-30 10:52:05.203 JST [2755] LOG:  duration: 502.909 ms  plan:
        Query Text: SELECT * FROM fuga;
        Seq Scan on public.fuga  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.017..115.529 rows=1000000 loops=1)
          Output: a

上記の通り、auto_explain.log_min_duration に “0” を設定すると、”SELECT * FROM hoge” (実行時間 : 0.026 ms のクエリ) の実行計画もログに出力されていることが確認できました。