Git log: commit 59d59ee50190a52a4a1e90e29d60a216880720ce
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Tue Feb 7 17:02:56 2023 +0100
When streaming into a JSON file, write to a partial file first. (#187)
Writing directly to the target JSON file creates a window where the file
exists and can be read with partial content, either by pgcopydb itself or
by external tooling. Using a temp filename then switching to the actual
expected file protects against partial reads and parse failures.
commit 43ea82ea913cde86b0ddf7464299cb9c5e1c382c
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Tue Feb 7 16:33:42 2023 +0100
When transforming into a SQL file, write to a partial file first. (#186)
Writing directly to the output file opens a window where the file exists and
can be read with partial content, either by pgcopydb itself or by external
tooling. Using a temp filename then switching to the actual expected file
protects against partial reads and parse failures.
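The write-then-rename pattern described in the two commits above can be sketched as follows. This is only an illustration in Python, not pgcopydb's C implementation: the `.partial` suffix, the function name, and the sample file name are assumptions made for the example.

```python
import json
import os
import tempfile


def write_json_atomically(path, payload):
    """Write payload to a partial file first, then rename it into place."""
    # Hypothetical suffix for the temp file; the real naming may differ.
    partial = path + ".partial"
    with open(partial, "w", encoding="utf-8") as f:
        json.dump(payload, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    # os.replace is atomic when both names live on the same filesystem,
    # so readers see either no file or the complete file.
    os.replace(partial, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "example.json")
        write_json_atomically(target, {"action": "B", "xid": 489})
        print(os.listdir(workdir))
```

A reader that only ever opens the final file name cannot observe a half-written document, which is the property both commits are after.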
commit 59831f97b03c28e6ed7c51faa189c284a97a3804
Author: Shubham Dhama <shubhamdhamaofficial@gmail.com>
Date: Tue Feb 7 20:30:34 2023 +0530
Fix same-table-copy query to use cached pgcopydb.table_size. (#184)
* Allow bash scripts as test cases.
* Fix typo.
commit 00cd3c27a374dbed5130c34a011202d9ff33d30a
Author: pranavidandu <dandupranavi@gmail.com>
Date: Mon Jan 30 22:26:54 2023 +0530
Add unlogged tables for data copy and fix the [exclude-index] SQL query (#183)
commit 6ec3a9f4e1c2be2b5d9c476973c070326b63b929
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Thu Jan 26 12:08:54 2023 +0100
Fix stream_read_context retry loop. (#181)
commit ceaed3e5a58557f3a787f9e20229b6cd76b2764a
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Thu Jan 26 11:59:40 2023 +0100
Desultory code formatting improvements. (#180)
commit ecf0120964a238ff5fc80e02f405043c4743970d
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Thu Jan 26 11:49:59 2023 +0100
Fix compile-time issue when using clang. (#179)
commit c3d90d088f3247f47a7a6fe47dbed89452f52812
Author: Shubham Dhama <shubhamdhamaofficial@gmail.com>
Date: Thu Jan 26 14:40:23 2023 +0530
Fix `dir` of stream cleanup and do cleanup for other commands. (#178)
Address the failure when `stream cleanup` is used after `clone follow`,
caused by `dir` being passed as `NULL` to `copydb_init_workdir`, which leads
`copydb_prepare_filepaths` to set `topdir` to an undesired location.
Includes cleanup of commands that do not require a working directory, by
removing `copydb_init_workdir`. Also sets `dir` to the appropriate value for
commands that do require a working directory.
commit 20b83792a7cec26faf74bb6463d3726bee59b22d
Author: Shubham Dhama <shubhamdhamaofficial@gmail.com>
Date: Wed Jan 25 15:22:48 2023 +0530
Fix huge memory allocations in copydb_prepare_table_specs. (#175)
commit 45441f8c294358785d9e6b75f4e508e515c709a1
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Jan 18 15:12:00 2023 +0100
Fix the SQL query that retrieves the column name for partitioning. (#176)
commit 25984447b70d74fee1caef3165b6d6edbe6924bf
Author: Shubham Dhama <shubhamdhamaofficial@gmail.com>
Date: Mon Jan 16 22:24:49 2023 +0530
Fix `lsn` for KEEPALIVE action. (#164)
* Fix lsn for KEEPALIVE actions.
* Fix failing unit tests by adding fake K messages.
commit c2760a02322fc3e73c44a622d6cdb6a3338c1e10
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Jan 11 17:37:56 2023 +0100
Blind fix attempt for a reported segfault. (#173)
Should fix #172.
commit 759a003430b6a58d6eb495f37bcd6c61419c4f37
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Jan 11 15:12:40 2023 +0100
Error out early when work dir does not exist and is expected to. (#171)
* Error out early when work dir does not exist and is expected to.
Auxiliary commands such as pgcopydb list progress expect the work directory
to have been created already. Check that early in the code.
* Fix copydb_init_workdir to specify if work dir should be created.
Then fix the callers and make most of them re-use a pre-existing directory
rather than create a default one and then use it.
commit 2ca73fe8f04fd6c916e31d7a3870182bd12c32e6
Author: Dan Hendry <dhendry@users.noreply.github.com>
Date: Tue Jan 10 08:49:21 2023 -0500
Trivial: Write index OIDs as unsigned integers correctly to the summary file & fix partitioning for tables with bigint primary keys with values greater than 2^32 (#168)
* Write OIDs as unsigned integers correctly to the summary file
* Update partitioning logic to support bigint primary keys
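Both fixes in this commit come down to integer width. A minimal sketch follows, assuming a naive equal-width range split; the function names are hypothetical and the actual partitioning logic in pgcopydb may differ. The first helper shows what a signed 32-bit format does to a large OID, the second splits a key range whose bounds exceed 2^32.

```python
import struct


def oid_as_signed_32(oid):
    """Reinterpret an unsigned 32-bit OID as signed, as a wrong format would."""
    return struct.unpack("<i", struct.pack("<I", oid))[0]


def partition_ranges(min_id, max_id, parts):
    """Split [min_id, max_id] into at most `parts` contiguous inclusive ranges."""
    total = max_id - min_id + 1
    size = -(-total // parts)  # ceiling division
    ranges = []
    lo = min_id
    while lo <= max_id:
        hi = min(lo + size - 1, max_id)
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


if __name__ == "__main__":
    # An OID above 2^31 turns negative when treated as a signed 32-bit value.
    print(oid_as_signed_32(3000000000))
    # Range bounds above 2^32 need 64-bit arithmetic in a C implementation;
    # Python integers are unbounded, so the arithmetic here is exact.
    print(partition_ranges(1, 2**33, 4))
```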
commit c38261702420d4523b32bc0c8d0db7db340bd1fa
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Tue Dec 13 13:59:29 2022 +0100
Fix transform process to handle endpos in between transactions. (#166)
* tests: Add test for scenario when endpos is set between BEGIN and COMMIT.
This test reproduces the issue mentioned in #154.
* Add new test to run-tests.yml
* Fix transform process to handle endpos in between transactions.
We might have a transaction that spans over multiple JSON files, and in that
case we always have a SWITCH WAL message at the end of the current file,
within the current transaction.
When we reach the end of file and we're in the middle of a transaction but
we don't have a SWITCH WAL message, it means that we have reached endpos. In
that case just ignore the currently opened transaction and replace its
contents by a synthetic KEEPALIVE message to advance the target system.
* Code cleanup.
* Review and fix to account for target case only...
Co-authored-by: Shubham Dhama <shubhamdhamaofficial@gmail.com>
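The end-of-file rule described in this commit can be sketched as below. The message representation (dicts with an "action" key and the names "BEGIN", "COMMIT", "SWITCH", "KEEPALIVE") is a hypothetical stand-in for the example, not pgcopydb's actual JSON or in-memory format.

```python
def close_file_transactions(messages, endpos_lsn):
    """Return the messages to replay from one file.

    When the file ends inside a transaction and the last message is not a
    SWITCH WAL message, endpos has been reached: the open transaction is
    dropped and replaced by a synthetic KEEPALIVE at endpos.
    """
    out = []
    current = None  # messages of the currently open transaction, if any
    for msg in messages:
        action = msg["action"]
        if action == "BEGIN":
            current = [msg]
        elif action == "COMMIT" and current is not None:
            current.append(msg)
            out.extend(current)
            current = None
        elif current is not None:
            current.append(msg)
        else:
            out.append(msg)
    if current is not None:
        if current[-1]["action"] == "SWITCH":
            # The transaction continues in the next file: keep its contents.
            out.extend(current)
        else:
            # endpos reached mid-transaction: ignore it and emit a KEEPALIVE
            # so the target can still advance its replay position.
            out.append({"action": "KEEPALIVE", "lsn": endpos_lsn})
    return out
```

For example, a file holding only a BEGIN and an INSERT yields a single KEEPALIVE, while the same file ending with a SWITCH message is passed through unchanged.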
commit 5cb32c6a7cb164e2493007648cc8ed84caf2698b
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Dec 7 11:52:30 2022 +0100
One of our syscalls (to mkdir) failed to include the OS error message. (#162)
commit d1b56d6706fcffd70827c9c287aa07386dfab85e
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Mon Dec 5 18:55:16 2022 +0100
Implement --skip-collations. (#160)
When the target database does not have the same collation provider as the
source database, or just uses different collation names, a mapping needs to
be done before using pgcopydb. This option is then useful for skipping the
pg_restore steps that re-install the source collations.
commit f257da51d617d892839a7ea5fd1136b2f3934615
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Thu Dec 1 15:17:38 2022 +0100
Implement support for logical decoding plugin test_decoding. (#156)
By default Postgres ships with the test_decoding contrib module. While this
contrib is meant only for testing the logical decoding APIs, it has been
found useful to just rely on it in many situations. One key advantage
compared to wal2json is that the data representation used in test_decoding
is exactly the output from Postgres data type functions, without edits, so
we know it will be accepted as-is by the target Postgres system.
This commit introduces the --plugin option, with support for either
"wal2json" or "test_decoding". The default for the --plugin option is now
"test_decoding".
commit 52d0eb72a855b394129b10d37271f0951e29aea8
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Nov 30 12:00:58 2022 +0100
Fix previous commit / PR. (#158)
There were a couple of bugs:
1. The XID in the metadata parts of the message is now a JSON string
rather than a JSON number, even though the XID in the wal2json message
still is a JSON number. We had to adjust to both cases, because we use
the same metadata parsing code in both situations.
2. Now that transactions may contain SWITCH WAL and KEEPALIVE messages, we
need to issue BEGIN lazily, rather than eagerly. That said, a
LogicalTransaction struct does not contain BEGIN and COMMIT messages
separately, it's part of the transaction metadata.
Adjust the code to actually work with that design.
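The first point, accepting the XID as either a JSON string or a JSON number, could look like the sketch below. The key name "xid" and the function name are assumptions for illustration; this is not the parsing code from the commit.

```python
import json


def parse_xid(message_text):
    """Extract the XID from a JSON message, as a number or a numeric string."""
    doc = json.loads(message_text)
    xid = doc.get("xid")
    # bool is a subclass of int in Python, so reject it explicitly.
    if xid is None or isinstance(xid, bool):
        raise ValueError("message has no usable xid")
    if isinstance(xid, int):
        return xid
    if isinstance(xid, str):
        return int(xid)  # raises ValueError on non-numeric strings
    raise ValueError("unexpected xid type: %s" % type(xid).__name__)


if __name__ == "__main__":
    print(parse_xid('{"action": "B", "xid": 489}'))
    print(parse_xid('{"action": "B", "xid": "489"}'))
```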
commit 667f86926e18b33e6cb0c64888a00b9918b65920
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Wed Nov 23 14:56:05 2022 +0100
Arrange to use Logical Replication protocol metadata. (#155)
Instead of relying on wal2json to include the metadata we need in our
implementation of Change Data Capture, use the Logical Decoding protocol
metadata directly.
This prepares for compatibility with other output_plugins in the future.
The XID is still parsed from the "include-xids" option of wal2json at the
moment, but it's used only as a hint and a check. The current plan is to
have compatibility with only output_plugins that can send the XIDs, such as
for example the test_decoding output_plugin.
commit 382d0f6c53c86d23fa30cd307d0d00c23cfb1a1a
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Tue Nov 22 14:50:52 2022 +0100
Refrain from using wal2json computed column "nextlsn". (#151)
* Refrain from using wal2json computed column "nextlsn".
This column isn't part of Postgres Logical Decoding API and relying on it
would prevent pgcopydb from being compatible with other output plugins in
the future.
Also, our internal tracking of the LSN position was pretty confused at
times; using a single LSN value makes it simpler to follow our logic.
To still be able to follow WAL filename changes we introduce the SWITCH WAL
statement in our SQL files too, not just the JSON files anymore. The replay
process knows how to parse those SWITCH WAL statements.
To still be able to match our endpos with the actual LSN position in the
WAL, we introduce a new KEEPALIVE statement in our SQL and JSON files too.
The replay process knows how to parse those KEEPALIVE statements and mark
the progress on the replication origin tracking on the target database.
* Fix how to output mixed internal transactions representation.
The addition of the KEEPALIVE and SWITCH WAL messages has been done in a way
that our logical transaction memory representation could hold such messages
before the BEGIN statement. Not all the code got the memo...
* PR clean-up/review.
Adjust some log levels and comments.
commit 8cc1882383db58397899ecbf922355bbd4de1e28
Author: Shubham Dhama <shubhamdhamaofficial@gmail.com>
Date: Mon Nov 21 19:58:04 2022 +0530
Fix migration failure of an empty database with --drop-if-exists. (#152)
It fixes a corner case when --drop-if-exists is used to migrate a database that
contains no tables. Before this change, `copydb_target_drop_tables` generated an
invalid SQL query `DROP TABLE IF EXISTSCASCADE`.
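The kind of guard this fix implies can be sketched as follows, assuming the statement is built from a list of already-quoted table names. The function name is hypothetical and this is not the code of `copydb_target_drop_tables`.

```python
def drop_tables_sql(qualified_names):
    """Build a DROP TABLE statement, or return None when there is nothing to drop.

    qualified_names is assumed to hold already-quoted, schema-qualified names.
    """
    if not qualified_names:
        # No tables: skip the statement instead of emitting invalid SQL.
        return None
    return "DROP TABLE IF EXISTS " + ", ".join(qualified_names) + " CASCADE"


if __name__ == "__main__":
    print(drop_tables_sql([]))
    print(drop_tables_sql(["public.a", "public.b"]))
```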
commit 301c8a3e3ac04adf21040175d75b5aeb7066fe0e
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Mon Nov 21 15:23:06 2022 +0100
Implement pgcopydb list tables --drop-cache. (#150)
This allows manually dropping the cache created with the --cache option.
commit ec7ae12b04076fb369c19c30e9fb32c71f0b2fd3
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Mon Nov 21 15:16:27 2022 +0100
Fix the Dockerfile for wal2json support with older Postgres versions. (#153)
The apt.postgresql.org debian repository has archived its support for the
debian stretch release, so we now need to use
apt-archive.postgresql.org.
We could switch to using a bare debian system as the base for our docker
image here, but we would then have to provide and maintain the entry points
scripting that the official docker image is providing.
commit ae74b73919370a1cc72904e146d1245d08951114
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Mon Nov 7 14:53:36 2022 +0100
Implement an option to cache pg_table_size() results. (#146)
* Implement an option to cache pg_table_size() results.
In some environments computing the pg_table_size() can be quite slow, so we
might benefit from managing a cache of the table sizes that we can re-use.
This cache is implemented as the pgcopydb.table_size table on the source
database, which is created with the command pgcopydb list tables --cache.
* Filter out pgcopydb schema when listing sequences, indexes.
* Avoid using pgcopydb list tables in tests.
Because we now create a schema (and leave it behind) when running the
pgcopydb list tables command, avoid using it in the tests, specifically when
targeting the target database.
Instead, use pgcopydb list extensions, which doesn't leave objects behind.
* Update tests/cdc JSON file with the new extra transaction.
* Fix tests/follow with new pgcopydb list table skipping pgcopydb schema.
We can't use `pgcopydb list table | grep sentinel` anymore, because we
excluded the pgcopydb schema from the list commands now that we have both
the sentinel table and the table_size tables in there.
commit 3bdf701d15ddfc2d974d2d60c6267ab72e10115a
Author: nakatlam <74578153+nakatlam@users.noreply.github.com>
Date: Mon Nov 7 05:29:10 2022 -0800
Add dir option to pgcopydb list progress command (#148)
commit 8bcba6ec60d3960b86943548eb5fcfb12443d9f4
Author: Dimitri Fontaine <dim@tapoueh.org>
Date: Thu Nov 3 15:32:03 2022 +0100
Fix debian B-D in our debian build Dockerfiles. (#145)