Best Practices for Upgrading the Operating System on Nodes in a Vertica Cluster

At some point, you may need to upgrade the operating system or Linux kernel on the nodes in your Vertica cluster. This document explains the steps you should take to accomplish the upgrade.

Note: This document does not discuss upgrading your Vertica version. It addresses only an operating system or kernel upgrade. Your database version will not change.

Before You Start

You should consider the following common constraints when preparing for a system upgrade:

  • Verify that the operating system you plan to upgrade to is supported by the version of Vertica you are running.
  • You may need to keep the cluster up and running for the duration of the upgrade.
  • The operating system upgrade may require that you reboot one or more nodes.

Before you upgrade your operating system or kernel, execute the following tasks:

  1. Back up your Vertica database and your operating system.
  2. Always upgrade the operating system to a Vertica-supported version.
  3. Test the upgrade steps on a staging system before you apply them in production.
  4. Follow the steps in Prepare Your Vertica Database for Maintenance. Although those steps focus on preparing for a database upgrade, they also apply to upgrading your operating system.

Options for Upgrading Your Operating System

You have four options for upgrading your operating system:

Use Copy Cluster with Two Clusters

Use copy cluster when you need to replicate an existing Vertica cluster. In this situation, the target cluster onto which you replicate the source cluster runs the upgraded operating system.

Using copy cluster is the fastest option, and you can keep your existing cluster up while the target is being copied to. However, switching user applications from the source cluster to the target cluster may require connection resets with minor down time.

The copy cluster approach requires that you set up an entirely new cluster with the new operating system. You must have hardware available to run two clusters for a short duration. If you are running on virtual infrastructure, copy cluster might be the best choice.

To perform a copy cluster operation, follow the steps in Copying Data Between Similar Vertica Clusters.

Upgrade All Nodes During Down Time

Upgrading all nodes during system down time is the traditional way to upgrade to a new operating system. However, it may not be convenient to shut down your entire cluster to perform the upgrade. The steps are:

  1. Back up the database and operating system.
  2. Shut down the Vertica cluster.
  3. Upgrade the operating system on all nodes.
  4. Reboot the nodes.
  5. Restart the Vertica cluster.

Upgrade the Cluster and Restore a Backup

As a minor variation on the previous two options, you can upgrade the cluster and then restore a database backup to the same cluster. Use this option when Vertica is in the same file system as the operating system. Follow these steps:

  1. Back up the database and operating system.
  2. Shut down the Vertica cluster.
  3. Upgrade the operating system on all nodes.
  4. Reboot the nodes.
  5. Restore the Vertica cluster from the backup, as described in Copy and Restore Data from a Vertica Cluster to a Backup.

This option is not as efficient as the previous two options, but it may be necessary if the operating system upgrade results in errors, unexpected corruption, or loss of nodes. In those situations, you must rely on a backup to restore the database.

Upgrade Nodes in Three Phases

Suppose you have a 7-node cluster (node01 through node07). In this situation, you can perform the operating system upgrade in three phases, where each phase takes down fewer than n/2 nodes, so that more than half of the cluster stays up. (n represents the number of nodes in the cluster.)

Important: In each phase, verify that no two nodes in the same phase are buddies of each other. (The SQL statement later in this section identifies buddy nodes in your cluster.) If two buddy nodes are down in the same phase, your database is no longer K-safe and shuts down.

You can only use this phased option if the K-safety of your database is greater than or equal to 1.

Here's how the phases would look for a 7-node cluster:

  • Phase 1: node01, node03, node05 (odd)
  • Phase 2: node02, node04, node06 (even)
  • Phase 3: node07
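The grouping above can be sketched as a small greedy planner. This is only an illustration, not a Vertica tool: the function name `plan_phases` is hypothetical, the buddy pairs assume that each node is buddied with its neighbors in a ring (verify the real pairs in your own cluster), and the size limit assumes that more than half of the nodes must stay up during each phase.

```python
from collections import defaultdict


def plan_phases(nodes, buddy_pairs):
    """Greedily group nodes into phases (hypothetical helper, not a Vertica API).

    Each phase holds fewer than n/2 nodes and never holds two buddies.
    """
    max_per_phase = (len(nodes) - 1) // 2  # largest integer strictly below n/2
    if max_per_phase < 1:
        raise ValueError("cluster is too small for a phased upgrade")

    # Map each node to the set of nodes it is buddied with.
    buddies = defaultdict(set)
    for a, b in buddy_pairs:
        buddies[a].add(b)
        buddies[b].add(a)

    phases = []
    for node in nodes:
        for phase in phases:
            # Use the first phase that has room and contains no buddy of this node.
            if len(phase) < max_per_phase and not buddies[node] & set(phase):
                phase.append(node)
                break
        else:
            phases.append([node])
    return phases


# Assumed example data: 7 nodes whose buddies are their ring neighbors.
nodes = [f"node{i:02d}" for i in range(1, 8)]
ring = [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]

for number, phase in enumerate(plan_phases(nodes, ring), start=1):
    print(f"Phase {number}: {', '.join(phase)}")
```

With the assumed ring of buddies, the sketch reproduces the three phases listed above. A greedy pass is not guaranteed to find the smallest number of phases for every buddy layout, so treat its output as a starting point to review.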

Important: Check the CRITICAL_NODES system table before you upgrade a node so that you don't cause the database to go down. If a critical node goes down, the database stops.

Suppose you have an 8-node cluster (node01 through node08). In this situation, you can also perform the operating system upgrade in three phases, where each phase takes down fewer than n/2 nodes:

  • Phase 1: node01, node03, node05 (odd, except the last one)
  • Phase 2: node02, node04, node06 (even, except the last one)
  • Phase 3: node07, node08

For each phase of the upgrade, follow these steps:

  1. Back up the database and operating system.
  2. Shut down the Vertica nodes that are participating in that phase.
  3. Upgrade the operating system on those nodes.
  4. Reboot the nodes in that phase.
  5. Verify that the newly started nodes successfully recovered before moving to the next phase.

This phased approach to upgrading the operating system is the most complicated, but it does not impose any down time. Expect slower performance while nodes are down and while they recover concurrently.

Use the following SQL statement to identify which nodes are buddies of each other. The results show that node02 and node03 are buddies, node03 and node04 are buddies, and so on.

=> SELECT dependency_id, MIN(node_name) node_x, MAX(node_name) node_y,
   COUNT(*) dep_count FROM vs_node_dependencies JOIN nodes ON (node_oid = node_id)
   GROUP BY 1 ORDER BY 1;
 dependency_id | node_x  | node_y  | dep_count 
---------------+---------+---------+----------- 
             0 | node02  | node03  |         2 
             1 | node03  | node04  |         2 
             2 | node04  | node05  |         2 
             3 | node05  | node06  |         2 
             4 | node01  | node07  |         2 
             5 | node06  | node07  |         2 
             6 | node01  | node08  |         2 
             7 | node02  | node08  |         2 
             8 | node01  | node08  |         8 
(9 rows) 
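Once you have the buddy pairs (the node_x and node_y columns), you can check a phase plan against them before you shut anything down. The following is a minimal sketch, not part of Vertica: `check_phases` is a hypothetical helper, the pairs are assumed data for a 7-node cluster whose buddies are ring neighbors (not the sample output above), and the size rule assumes that more than half of the nodes must stay up.

```python
def check_phases(phases, buddy_pairs):
    """Return a list of problems found in a phased-upgrade plan (empty if none).

    Hypothetical helper: flags a phase that takes down half or more of the
    nodes, and a phase that contains both nodes of a buddy pair.
    """
    total = sum(len(phase) for phase in phases)
    problems = []
    for number, phase in enumerate(phases, start=1):
        members = set(phase)
        if 2 * len(members) >= total:
            problems.append(f"phase {number} takes down half or more of the nodes")
        for a, b in buddy_pairs:
            if a in members and b in members:
                problems.append(f"phase {number} contains buddies {a} and {b}")
    return problems


# Assumed example data: 7 nodes whose buddies are their ring neighbors.
nodes = [f"node{i:02d}" for i in range(1, 8)]
ring = [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]

good_plan = [["node01", "node03", "node05"],
             ["node02", "node04", "node06"],
             ["node07"]]
bad_plan = [["node01", "node03", "node07"],
            ["node02", "node04", "node06"],
            ["node05"]]

print(check_phases(good_plan, ring))  # prints []
print(check_phases(bad_plan, ring))   # reports node07 and node01 in phase 1
```

The check complements, but does not replace, looking at the CRITICAL_NODES system table immediately before each phase, because the set of critical nodes changes as nodes go down and recover.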

After You Upgrade

After you upgrade the operating system, check whether any operating system settings need to be changed to meet the requirements of the Vertica installation. To perform this check, execute this command:

$ /opt/vertica/oss/python/bin/python -m vertica.local_verify
