PostgreSQL high availability clustering solutions
Introduction
In my previous post A Brief History of Logical Replication in Postgres — and Looking Ahead at its Likely Future Evolution, I provided a retrospective journey of the logical replication feature in PostgreSQL, starting from Postgres 9.6, where some of the building blocks were laid down. The blog also provides an insight into how a big feature like logical replication evolves and matures in the PostgreSQL community.
This is the second blog of a two blog series. In this post, I will be talking about PostgreSQL cluster solutions that are based on logical replication and the pgEdge approach to creating a high availability cluster.
We have recently seen unprecedented growth in the user base for most enterprises; this in turn has led to exponential data growth. Scalability in a distributed PostgreSQL environment has become the most pressing need of a replication solution. In addition to scalability for better performance and low latency, enterprises need high availability. High availability means that there is near zero percent downtime for users in the event of hardware/software/network issues or maintenance windows.
This is where distributed PostgreSQL comes into play. Before we go into specific Postgres cluster solutions, it is important to understand the concept of database clustering and its benefits. A Postgres cluster involves setting up a group of servers (nodes) to work together to provide a higher level of availability, reliability, and scalability than can be achieved with a single database server. In simpler terms, database clustering refers to the practice of linking several servers or instances together to work as a single system. This configuration enhances the performance, availability, and scalability of database systems. This is crucial for applications requiring high availability and performance, as it allows for data to be replicated across multiple nodes and for queries to be distributed among them, enhancing both fault tolerance and load distribution.
Now let's switch our attention to the main topic of this post. In this post, we are going to discuss PostgreSQL clustering solutions that are based on logical replication. Our solution provides active-active multi-master capabilities - this means that all nodes in the cluster will have the same copy of the data, providing data redundancy. The nodes are configured with asynchronous multi-master replication, and application user traffic is distributed across the nodes to provide better performance and high availability.
pgEdge - Fully Distributed PostgreSQL
Applications these days have to be highly responsive and always available - even during maintenance windows. The user base for an application may be spread across a country or around the globe. Your application needs to be able to respond in real time, even during peak hours. The exponential growth in data seen in most businesses makes serving this data up to users in a short turnaround time is a challenging task. To achieve low latency and high availability, you need to deploy instances in data centers that are both close to your user and close to your business.
pgEdge has combined cutting edge technology, unique solutions, and deep PostgreSQL expertise to provide a solution. pgEdge is a fully distributed PostgreSQL database, optimized for the network edge, and deployable across multiple cloud regions or data centers. The solution is a true multi-master (active-active) distributed database system that allows read and write operations at any node on the network. It seems almost magical, providing:
reduced data latency
ultra high availability
targeted data residency
and most importantly, an improved customer experience.
The best part is you can get all of this, typically without any code changes. pgEdge allows both read and write operations to take place on any database node in a geographically distributed cluster. Each node runs standard PostgreSQL (version 14, 15 or 16), and a cluster can span multiple cloud regions or data centers. pgEdge nodes are loosely coupled, and are kept updated via asynchronous logical replication with conflict resolution.
pgEdge Solutions
Keeping the industry demand at the forefront, pgEdge offers fully-distributed multi-master PostgreSQL clustering solutions for both cloud (with pgEdge Cloud) and on-prem deployments (with pgEdge Platform).
pgEdge Cloud high availability clusters
pgEdge Cloud is fully-distributed PostgreSQL, deployable across multiple cloud regions or data centers. The pgEdge Cloud console harnesses the low latency, high availability, and data residency benefits of pEdge distributed PostgreSQL in a fully managed cloud service running in multiple regions across AWS, Azure, or Google Cloud. pgEdge Cloud offers a free trial version that lets you experience a global, serverless PostgreSQL database in less than 90 seconds with powerful benefits and capabilities. You can deploy a highly-available three-node active-active multi-master cluster that handles read/write operations with built in conflict resolution and:
Low latency - achieve high performance with low latency by deploying read/write nodes in regions close to the user.
Edge integration - Providing integration with cloud flare workers and other edge platforms.
Rapid deployment - One click provisioning for global clusters on a secure private network.
pgEdge Platform high availability clusters
pgEdge Platform is self-managed distributed PostgreSQL for developer evaluations or production use; use pgEdge Platform to self-host and self-manage pgEdge distributed PostgreSQL in your own data center or cloud account.
Database nodes running pgEdge Platform can participate in clusters that span data centers and any of the major cloud providers( AWS, Azure and Google Cloud). pgEdge Platform runs on a variety of common hardware and OS combinations, and enterprise class support plans are available.
Installing pgEdge Platform
In any directory owned by your non-root user, install pgEdge on all nodes of the cluster:
python3 -c "$(curl -
fsSL https://pgedge-download.s3.amazonaws.com/REPO/install.py)"
On each node of the cluster, move into the pgedge
directory and install pgedge, specifying a name for the database superuser, a password, and a database name. Note that the name cannot be the name of an OS user, pgedge, or any of the PostgreSQL reserved words. You can also use the --port option to install PostgreSQL on a port other than the default port (5432).
./pgedge setup -U admin -P mypassword1 -d demo
The command will download the required pgEdge components and verify the system prerequisites before installing the latest version of PostgreSQL 16 supported by pgEdge and configuring the server to support the pgEdge replication requirements. The server hosts a database (named demo
) with a database superuser (`admin`) that can log in to the database with the credentials specified (`mypassword1`).
The command will also install the spock and snowflake extensions. The spock extension provides multi-master replication with conflict resolution. The snowflake extension provides support for sequences for multi-node multi-master clusters; regular PostgreSQL sequences are single host only.
When executed, the command also creates a replication user with the same name as the OS user that invokes the command. This is the user that you will use in connection strings when you create nodes and subscriptions.
If you encounter a permissions error on EL9 running this command, you may need to update your SELINUX mode to permissive
or disabled
, reboot, and retry the operation.
Create Nodes
Next you will register each of the databases as a spock node. Using node names with a naming sequence like n1, n2, n3 (.etc) will automatically set the correct value for snowflake.node, enabling the use of snowflake sequences. The user named in the connection string is a replication user, and has to match the OS user that invoked the setup command; in this example that user is named rocky.
Node n1
(IP address 10.1.2.5):
./pgedge spock node-create n1 'host=10.1.2.5 user=rocky dbname=demo' demo
Node n2
(IP address 10.2.2.5):
./pgedge spock node-create n2 'host=10.2.2.5 user=rocky dbname=demo' demo
Create Subscriptions
Next we need to create the subscriptions between the nodes in your cluster to support bi-directional replication. The connection string for sub_n1n2 should specify the connection details for n2 in the create node command; the string specified for sub_n2n1 should specify the connection details for n1 in the create node command. Again, you'll include the identity of the replication user (rocky) in the connection string.
Node n1
(IP address 10.1.2.5):
./pgedge spock sub-create sub_n1n2 'host=10.2.2.5 user=rocky dbname=demo' demo
Node n2
(IP address 10.2.2.5):
./pgedge spock sub-create sub_n2n1 'host=10.1.2.5 user=rocky dbname=demo' demo
Our example is a simple two-node cluster; if you have a three-node cluster, the subscriptions should allow traffic between any node in each direction. This means that for a three-node cluster you would create:
sub_n1n2 between node 1 and node 2
sub_n1n3 between node 1 and node 3
sub_n2n1 between node 2 and node 1
sub_n2n3 between node 2 and node 3
sub_n3n1 between node 3 and node 1
sub_n3n2 between node 3 and node 2
As your cluster grows, the subscriptions required also grow.
Adding tables to the default Replication Set
The next step is to use spock commands to add tables to the default replication set and start replication. The default replication set is created when you install pgEdge; you have the option to create a custom replication set and add it to the subscription, but using the default replication set provided simplifies configuration for our example. You also have the option of using spock to add all the tables in a schema to the replication set. The power of logical replication that underpins the pgEdge multi-master platform allows you to configure extremely granular replication.
For this example, we'll use pgbench to add some tables. When you open pgbench or psql, specify your database name after the utility name.
On each node, source the PostgreSQL environment variables to add pgbench and psql to your OS PATH; this will make it easier to move between the nodes:
source pgedge/pg16/pg16.env
Then, use pgbench to set up a very simple four-table database. At the OS command line, (on each node of your replication set), create the pgbench tables in your database (demo) with the pgbench command. You must create the tables on each node in your replication cluster:
pgbench -i demo
Then, connect to each node with the psql client:
psql demo
Once connected, alter the numeric columns, setting LOG_OLD_VALUE
equal to true
. This will make these numeric fields conflict-free delta-apply columns, ensuring that the value replicated is the delta of the committed changes (the old value plus or minus any new value) to a given record:
ALTER TABLE pgbench_accounts ALTER COLUMN abalance SET (LOG_OLD_VALUE=true);
ALTER TABLE pgbench_branches ALTER COLUMN bbalance SET (LOG_OLD_VALUE=true);
ALTER TABLE pgbench_tellers ALTER COLUMN tbalance SET (LOG_OLD_VALUE=true);
Then, exit psql:
demo=# exit
On the OS command line for each node, use the pgedge spock repset-add-table
command to add the tables to the system-created replication set (named default
); the command is followed by your database name (db_name)
:
./pgedge spock repset-add-table default 'pgbench_*' demo
The fourth table, pgbench_history
, is excluded from the replication set because it does not have a primary key. The primary key is needed because the replication set is configured to replicate UPDATEs and/or DELETEs.
Adding a Custom Replication Set to a Subscription
Since we're using the default replication set (created by the pgEdge installer) we don't need to add the replication set to the subscription. If you are using a custom replication set, it needs to be added to the subscription. The following spock command adds a replication set to the subscription.
./pgedge spock sub-add-repset repset_name repset-name db_name
Please see the pgEdge documentation https://docs.pgedge.com/platform/installing_pgedge for detailed information on creating custom replication sets and adding or removing replication sets from a subscription.
Useful Replication Status Views
You can use spock functions and tables to check the replication status of your tables. The pgEdge documentation also provides a list of functions and tables available for checking replication status and debugging issues.
To check available subscriptions:
select * from spock.subscription;
sub_id | sub_name | sub_origin | sub_target | sub_origin_if | sub_target_if | sub_enabled | sub_slot_name | sub_replication_sets | sub_forward_origins | sub_apply_delay | sub_force_text_transfer
------------+----------+------------+------------+---------------+---------------+-------------+------------------------+-------------------------------------
--+---------------------+-----------------+-------------------------
3651191944 | sub_n2n1 | 673694252 | 560818415 | 1437190346 | 4043145508 | t | spk_testdb_n1_sub_n2n1 | {default,default_insert_only,ddl_sql } | | 00:00:00 | f
(1 row)
To check tables and their assigned replication set:
select * from spock.tables;
relid | nspname | relname | set_name
-------+---------+------------------+----------
32771 | public | pgbench_tellers | default
32768 | public | pgbench_history |
32777 | public | pgbench_branches | default
32774 | public | pgbench_accounts | default
(4 rows)
To check subscription status:
./pgedge spock sub-show-status sub_n2n1 testdb
[
{
"sub_show_status": [
"sub_n2n1",
"replicating",
"n1",
"host=localhost port=5432 user=pgedge dbname=testdb",
"spk_testdb_n1_sub_n2n1",
"{default,default_insert_only,ddl_sql}",
null
]
}
]
Conclusion: Postgres High Availability Clusters
It is pretty clear that nearly every enterprise needs scalability to support its business needs and growing data requirements. PostgreSQL has done well in scaling upwards but in most cases it is proven that one machine is not enough to entertain application performance and high availability needs.
PostgreSQL has several clustering offerings, both open source and proprietary, based on physical streaming replication and on logical replication. pgEdge has a unique and robust product, and has proved itself as a leader in PostgreSQL distributed multi-master replication. pgEdge Cloud offers a state-of-the-art and user-friendly cloud console that simplifies cluster management. The pgEdge Platform provides a true and robust multi-master distributed PostgreSQL solution. Conflict management and conflict avoidance capabilities are truly unique to pgEdge, and are instrumental in a multi-master logical replication environment.
The product plans for pgEdge platform for 2024 are even more exciting. We are working on game changing logical replication capabilities that are increasingly in demand by enterprise applications. The upcoming features in pgEdge platform will continue to simplify ease of use and minimize adjustments needed to adopt multi-master replication for real world database applications. These features will include support for replication of DDL commands as-well as working with large objects. Above all of this the pgEdge team is working on increasing replication throughput across nodes.
I will keep everyone posted on the above developments and will share information about our new features as they become available.
Stay tuned….
Conclusion: Postgres High Availability Clusters
It is pretty clear that nearly every enterprise needs scalability to support its business needs and growing data requirements. PostgreSQL has done well in scaling upwards but in most cases it is proven that one machine is not enough to entertain application performance and high availability needs.
PostgreSQL has several high availability clustering offerings, both open source and proprietary, based on physical streaming replication and on logical replication. pgEdge has a unique and robust product, and has proved itself as a leader in PostgreSQL distributed multi-master replication. pgEdge Cloud offers a state of the art and user-friendly cloud console that simplifies cluster management. The pgEdge Platform provides a true and robust multi-master distributed PostgreSQL solution. Conflict management and conflict avoidance capabilities are truly unique to pgEdge, and are instrumental in a multi-master logical replication environment.
The product plans for pgEdge platform for 2024 are even more exciting. We are working on game changing logical replication capabilities that are increasingly in demand by enterprise applications. The upcoming features in pgEdge platform will continue to simplify ease of use and minimize adjustments needed to adopt multi-master replication for real world database applications. These features will include support for replication of DDL commands as-well as working with large objects. Above all of this the pgEdge team is working on increasing replication throughput across nodes.
I will keep everyone posted on the above developments and will share information about our new features as they become available.
Stay tuned….