Prepare Your Vertica Database for Maintenance

Database Shut Down for Maintenance

Before you prepare your database for maintenance, to shut down your Vertica database safely, follow this checklist:

  1. Check node dependency status
  2. Back up the database
  3. Shut down the database
  4. Verify the database shutdown

Sometimes, you may not be able to shut down your database. In such cases, you can take specific steps to safely shut down the database and avoid a forced shutdown that results in total loss of your data and causes restart problems.

Check Node Dependency Status

The first step before shutting down the database is checking node dependencies. Make sure all nodes in the cluster have buddy projections correctly set up on other nodes in the cluster.

Correct Node Dependency

Node dependencies are correct, if all the projections have their segments and buddy segments on the same pair of nodes.

 Correct Node Dependency

In this example, in a total 26 projections there are 13 projections, 13 buddies, and 27 replicated projections.

=> SELECT is_segmented, COUNT(*) FROM projections GROUP BY 1;
is_segmented | count 
------------------------------------------------------------
f            |  27
t            |  26

A clean node dependency lists (number of nodes + 1) lines. Each line shows the number of segmented projections. The last line specifies the number of replicated projections.

In the binary form, each digit represents a node. For example, in the binary form 00011, where node 1 = 1, node 2 = 1, node 3 = 0, node 4 = 0, and node 5 = 0. 1 indicates that segments exist on the node and 0 indicates that segments do not exist on the node.

=> SELECT GET_NODE_DEPENDENCIES();
GET_NODE_DEPENDENCIES
--------------------------------------------------------------
Deps:
00011 - cnt: 13
00110 - cnt: 13
01100 - cnt: 13
11000 - cnt: 13 
10001 - cnt: 13
11111 - cnt: 27

Incorrect Node Dependency

Node dependencies can be incorrect for several reasons, including:

  • Incomplete rebalance
  • Cluster expansion
  • Fault group configuration

When incorrect dependencies occur, the projections on a node have split dependencies. Thus, some projections depend on one node while others depend on another node.

For example, in a 5-node cluster that experienced an incomplete rebalance operation, a set of projections on node 1 have buddy segments on node 2, and a different set of projections on node 1 have the buddy segments on node 5. In this scenario, when node 1 goes down, node 2, and node 5 become critical. Both nodes have buddy data to replace the data on node 1 that is down.

Sometimes, a node is shut down with incorrect node dependencies and the disk failure occurs before bringing the cluster UP. In such cases, determining the node dependencies of a K-safe cluster is difficult and time consuming, extending database down time.

Incorrect Node Dependency

In the same example as above, in a total 26 projections there are 13 projections, 13 buddies, and 27 replicated projections.

When the node dependencies are incorrect, 3 projections have buddy segments on the same node, while 10 projections have buddy segments on the other node. In this case, if node 1 goes down it has segments on node 2, 3, 4, and 5. As a result, all nodes become critical.

=> SELECT GET_NODE_DEPENDENCIES();
GET_NODE_DEPENDENCIES
----------------------------------------------------------------
Deps:
00011 - cnt: 3
00110 - cnt: 3
01100 - cnt: 3
11000 - cnt: 3 
10001 - cnt: 3
00101 - cnt: 10
01010 - cnt: 10
10100 - cnt: 10
01001 - cnt: 10
10010 - cnt: 10
11111 - cnt: 27

How to Check Node Dependency

To check node dependency and resolve node dependency:

  1. Ensure nodes in the cluster have correctly set up buddy projections by checking node dependencies, using the following command:

    => SELECT RECOMPUTE_NODE_DEPENDENCIES();
    => SELECT GET_NODE_DEPENDENCIES();
  2. If the node dependencies are incorrect, before shutting down the database, resolve node dependency. Rebalance data in the cluster synchronously and organize node dependencies, using the following command:

    => SELECT REBALANCE_CLUSTER()

If, after rebalancing the cluster, the node dependencies are still incorrect, please contact Vertica Support.

Back Up the Database

After you verify that the node dependencies are correct, you can back up your database. Try the standard method, using the vbr.py script. If this approach fails, you can do a cold (offline) backup.

Back Up Using the vbr.py Script

To perform a standard backup, run the vbr.py script as described in Backing Up and Restoring the Database in the product documentation.

Perform a Cold or Offline Backup

If the backup does not complete, you can perform a cold backup or an offline backup by copying the catalog and data directories to another location when the database is down.

Important If the database has to be shut down for a cold backup, to get the paths to the catalog and the data directories that you want to copy, query the NODES system table. To view data storage locations, query the STORAGE_LOCATIONS system table.

The correct path names are catalog path without the Catalog subdirectory and data directory without the SAL subdirectory. You back up the database by copying the catalog and the data directories.

=> SELECT name, catalogpath, bdbpath FROM vs_NODES;
-[ RECORD 1 ]-----------------------------------------------------
name        | v_test2_node0001
catalogpath | /home/dbadmin/test2/v_test2_node0001_catalog/Catalog
bdbpath     | /home/dbadmin/test2/v_test2_node0001_data/SAL
  • Catalog path: /home/dbadmin/test2/v_test2_node0001_catalog/ - without the Catalog subdirectory
  • Data directory: /home/dbadmin/test2/v_test2_node0001_data/ - without the SAL subdirectory

Shut Down the Database

After backing up the database, it is ready for shutdown. Shutdown can take some time, depending on cluster activity and data that Vertica needs to move out before shutting down. After the Tuple Mover moveout completes, the database shutdown begins. To monitor the shutdown process, check the vertica.log file regularly. The vertica.log file is stored under the catalog directory.

Before starting the shutdown process, Vertica:

  • Checks for active sessions. If there are active sessions, shutdown does not execute and shows an alert about active sessions.
  • Executes a moveout operation.

Shut Down the Database Using the SELECT SHUTDOWN () Statement

To shut down the database using SELECT SHUTDOWN() statement:

  1. View the maximum number of sessions:
    => SHOW CURRENT "MaxClientSessions"
  2. Verify the sessions are closed:
    => SELECT * FROM SESSIONS;
  3. Query the SESSIONS system table to see what sessions are still running:
     => SELECT * FROM SESSIONS;
  4. Close all the sessions:
     => SELECT CLOSE_ALL_SESSIONS();
    or close a specific session:
     => SELECT CLOSE_SESSION('session_id');
  5. To prevent additional users from connecting to the database, use the MaxClientSessions configuration parameter to set the maximum number of sessions to 0.
    => ALTER DATABASE <database_name> SET MaxClientSessions = 0;

    When MaxClientSessions is 0, five dbadmin sessions are allowed. As a database administrator, you can use these sessions to perform administrative tasks as needed.

  6. Move the ancient history mark (AHM) to avoid replay delete.
    => SELECT MAKE_AHM_NOW();
  7. Run the Tuple Mover to move all projections from the WOS to the ROS:
    => SELECT DO_TM_TASK('moveout')

    If you omit the table name, the Tuple Mover moves everything out from WOS.

  8. To verify that Tuple Mover moves everything, query against RESOURCE_USAGE to check for bytes used in WOS.
    => SELECT node_name, SUM(memory_inuse_kb) FROM resource_pool_status WHERE pool_name = 'wosdata' GROUP BY 1 ORDER BY 1 ;
          node_name     | sum
    ---------------------+-----
    v_vmart_db_node0001 |   0
    v_vmart_db_node0002 |   0
    v_vmart_db_node0003 |   0
    v_vmart_db_node0004 |   0
    v_vmart_db_node0005 |   0
  9. Shut down the database:
    => SELECT SHUTDOWN();

You can shut down your Vertica database using Management Console and Administration Tools. For information, see Stopping a Database in the Vertica documentation.

If you are unable to shut down, do not force the database to shut down. Instead contact Vertica Support.

Verify the Database Shutdown

To verify that your database has shut down without any error messages before you perform maintenance:

  1. Check if the database has shut down properly. There are three ways to check:

    • At the Linux prompt, check the status of each node and verify that all the nodes are down. The database cannot shutdown until all the nodes are down.:
      $ admintools -t view_cluster
    • Verify that the log file on each node contains a Shutdown complete message.
      For example, to check the message using your own path to the catalog folder, enter the following command:
      for host in `grep -P "^v_" /opt/vertica/config/admintools.conf|awk '{print $3}'|awk -F, '{print $1}'`;do echo ----- $host -----; ssh $host "find /home/dbadmin/test2/ -name vertica.log | xargs grep '<INFO> Shutdown complete.'"; done
    • Verify that the Vertica process is not running in any of the nodes.
      for host in `grep -P "^v_" /opt/vertica/config/admintools.conf|awk '{print $3}'|awk -F, '{print $1}'`;do echo ---- $host ---- ; ssh $host "ps -ef | grep /opt/vertica/bin/vertica"; done
  2. If the database has not shut down, monitor the log file as follows:
    $ tail –f vertica.log
  3. If the shutdown is not progressing or is taking longer than expected, collect vstack information by executing the vstack command on the node that is not shutting down. This command saves the vstack information in the vstack_nodeXX.log file.
    $ /opt/vertica/bin/vstack > /tmp/vstack_nodexx.log

    If required, send this information to Vertica Support.

  4. Force shutdown by killing nodes that have the same dependency. When two nodes with the same dependencies are down, Vertica automatically completes shutdown of other nodes due to K-safe.

    Use the Administration Tools to select nodes you want to shut down:

    1. Click Advanced Menu, and select Stop Vertica on Host.
    2. If Stop Vertica on Node fails, on the same menu, select Killing a Vertica Process on Host.

After your database has shut down safely, you can perform maintenance.