Vertica Blog

Sizing Your Vertica Cluster for an Eon Mode Database

This blog post was authored by Shrirang Kamat.

Vertica in Eon Mode is a new architecture that separates compute and storage, allowing users to take advantage of cloud economics by rapidly scaling clusters up and down in response to variable workloads. Eon Mode decouples the cluster size from the data volume, letting you size the cluster for your compute needs instead. While a Vertica cluster can host either an Eon Mode database or an Enterprise Mode database, this document focuses on Eon Mode. Currently, Eon Mode works only on AWS. For more information, see CloudFormation Template (CFT) Overview in the Vertica documentation.

As a Vertica administrator setting up your production cluster running in Eon Mode, you have to make important decisions about picking the correct EC2 instances and cluster size to meet your needs. This document provides guidelines and best practices for selecting instance types and cluster sizes for a Vertica database running in Eon Mode.

This document assumes that you have a basic understanding of the Eon Mode architecture and refers to Eon Mode concepts such as communal storage, depots, and shards. Make sure you are familiar with these concepts. You can find details about the Eon Mode architecture in Eon Mode Architecture.

Sizing your Vertica cluster

When sizing a Vertica cluster running in Enterprise Mode for the cloud, you pick the number of nodes based on the data volume and the number of nodes that are required to process queries in parallel, while meeting the expected response time.

In contrast, a Vertica cluster running in Eon Mode stores all your data in virtually unlimited communal storage, so the overall data volume is not a consideration. However, your working data volume is. Each node has a depot that caches a subset of the data stored in communal storage, so you need to size your depot according to the expected workload and data set. For example, if your primary objective is to query and analyze the last 30 days of data, size your depot to hold the last 30 days of data. As in Enterprise Mode, you also need additional storage for temporary files.

The depot improves query performance by providing faster access to data that is local to the node. Segmented projections in Eon Mode are split into shards. The number of shards in a database is not tied to the number of nodes in the cluster; it is fixed at the value you specify when you create the database. The number of shards determines the maximum number of compute nodes that can execute a query in parallel.

When sizing a Vertica cluster for Eon Mode, you pick the number of nodes based on the required total depot size and the number of shards needed to meet the expected query response times for your workload and data set. The initial cluster configuration has the number of shards equal to the number of nodes. To take advantage of elastic throughput scaling, add more nodes to the cluster than the number of shards. If the number of nodes in your cluster is equal to or greater than twice the number of shards, you can define fault groups to create subclusters. Subclusters isolate workloads because the files cached in their depots can differ from those in the rest of the cluster. Multiple subclusters help segregate workloads and scale out throughput because a session that connects to a node in a subcluster uses only nodes in that subcluster to execute its queries.
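The node-and-shard arithmetic above can be sketched in a few lines. This is an illustrative helper, not a Vertica API; the function name and return shape are our own:

```python
def subcluster_plan(num_shards: int, num_nodes: int) -> dict:
    """Sketch of the node/shard arithmetic described above.

    A balanced subcluster has one node per shard, so a cluster of
    `num_nodes` nodes can hold `num_nodes // num_shards` full
    subclusters. Fault groups (subclusters) become practical once
    the cluster has at least twice as many nodes as shards.
    """
    if num_nodes < num_shards:
        raise ValueError("initial cluster should have at least one node per shard")
    return {
        "subclusters": num_nodes // num_shards,
        "can_define_subclusters": num_nodes >= 2 * num_shards,
    }

# The 3-shard, 9-node cluster described in this post yields three subclusters.
print(subcluster_plan(3, 9))  # {'subclusters': 3, 'can_define_subclusters': True}
```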

The following picture illustrates a 3-shard, 9-node cluster with three subclusters. Each subcluster has nodes that subscribe to all 3 shards:

Because the number of shards cannot change in a Vertica database running in Eon Mode, Vertica recommends that you run some initial experiments in UAT clusters to correctly estimate the number of shards and the total depot size.

Cluster sizing guidelines

In Enterprise Mode, sizing depends primarily on the total compressed data size (assuming 2:1 compression with high availability). To get the number of nodes, divide the total compressed data size by the storage capacity of each node. Depending on the expected concurrency and the amount of data the average query operates on, pick instance types with sufficient CPU and memory. Vertica recommends a minimum of 8 cores and 64GB of RAM, a minimum of 3 nodes for high availability, and no more than 10TB of compressed data on any node.
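The Enterprise Mode node-count calculation above reduces to simple division. The raw data volume below is an assumed figure for illustration; the 2:1 compression ratio, 10TB-per-node cap, and 3-node minimum come from the guidelines above:

```python
import math

raw_data_tb = 120            # assumed raw (uncompressed) data volume
compression_ratio = 2        # guideline: assume 2:1 compression
per_node_capacity_tb = 10    # guideline: max compressed data per node
min_nodes_for_ha = 3         # guideline: minimum nodes for high availability

compressed_tb = raw_data_tb / compression_ratio
nodes = max(min_nodes_for_ha, math.ceil(compressed_tb / per_node_capacity_tb))
print(nodes)  # 6 nodes for 60TB of compressed data
```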

In Eon Mode, communal storage is like a data lake that can store unlimited data. Sizing for Eon Mode depends on the following factors:

Working Data Size: The amount of data on which most of your queries will operate.

Depot Size: To get the fastest response times for frequently executed short queries, you want the most frequently read data from your working data set to be in the depot at all times. The performance of queries that run directly against communal storage depends on how much data they must read from it; Vertica has optimizations such as predicate pushdown to read only the required data blocks, whether from the depot or from communal storage. Our internal testing found that TPC-H queries ran about 2 times faster against data in the depot than against communal storage. Each Vertica node needs local storage for the depot, the catalog, and the temp space required for query execution. Vertica recommends a minimum local storage capacity of 600GB per node, of which 60% is reserved for the depot and the other 40% is shared between the catalog and temp space. The depot on each node must be large enough to hold the uncommitted data loaded into Vertica plus the size of the data being concurrently loaded, divided by the number of nodes in the cluster. The temp space must be large enough to hold the temporary files written during query processing plus two times the size of the data being concurrently loaded, divided by the number of nodes in the cluster.
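The storage-split and sizing rules above can be expressed as arithmetic. This is a minimal sketch: the function names are our own, the 600GB/60% figures come from the recommendation above, and the load sizes in the usage lines are assumed values. The reading of the temp-space rule (per-node query temp plus the shared load term) is our interpretation of the guideline:

```python
def per_node_storage_gb(local_disk_gb: float = 600, depot_fraction: float = 0.60):
    """Split local storage per the recommendation: 60% for the depot,
    40% shared between the catalog and temp space."""
    depot = local_disk_gb * depot_fraction
    catalog_and_temp = local_disk_gb - depot
    return depot, catalog_and_temp

def min_depot_gb(uncommitted_gb: float, concurrent_load_gb: float, num_nodes: int):
    """Depot must hold uncommitted data plus data being concurrently
    loaded, divided across the nodes in the cluster."""
    return (uncommitted_gb + concurrent_load_gb) / num_nodes

def min_temp_gb(query_temp_gb: float, concurrent_load_gb: float, num_nodes: int):
    """Temp space must hold each node's query temporary files plus two
    times the concurrently loaded data, divided across the nodes."""
    return query_temp_gb + (2 * concurrent_load_gb) / num_nodes

print(per_node_storage_gb())        # (360.0, 240.0) per node
print(min_depot_gb(100, 50, 5))     # 30.0 GB minimum depot per node
print(min_temp_gb(40, 50, 5))       # 60.0 GB minimum temp space per node
```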

Concurrency: The initial Vertica cluster has a number of nodes equal to the number of shards. You can pick the instance type based on the expected concurrency on the initial cluster. The following chart lists recommendations for choosing instance types and the number of nodes. In Eon Mode, you can achieve elastic throughput scaling by adding nodes to form more subclusters. This feature is the highlight of the Eon Mode architecture.

You may choose instance types that support instance storage or EBS volumes for your depot. We recommend either r4 or i3 instances for production clusters. To pick an instance type and the number of nodes for the initial cluster, you must know what your working data set is and the required size of the depot. You can create an initial cluster with more nodes to have a larger depot and/or to get better parallelism with processing complex queries. The initial cluster configuration will have the number of shards matching the number of nodes. We recommend that you do not create depots with more than 1TB per node.

The following are recommended instance types based on the working data size:

You may choose instance types that support ephemeral instance storage or EBS volumes for your depot depending on the cost factor and availability. It is not mandatory to have an EBS backed depot because in Eon Mode a copy of the data is safely stored in communal storage.

The following table has information you can use to make a decision on how to pick instances with ephemeral instance storage or EBS only storage. Check with AWS for the latest prices.

Let’s take a look at some use cases to figure out how to size an Eon Mode cluster.

Use Case 1: Save compute by provisioning close to need, rather than peak times

This example highlights the elastic throughput scaling feature of Eon Mode to scale a cluster from 5 to 25 nodes with 5 subclusters of 5 nodes each. In this use case, we want to support a highly concurrent, short-query workload on a medium-sized working data set. We will create an initial cluster of type medium with 5 nodes and 5 shards. We can scale out throughput on demand by adding one or more subclusters during certain days of the week or for specific date ranges when we expect a peak load. The cluster can then be shrunk back to its initial size by dropping nodes for normal workloads. With Vertica in Eon Mode, you save compute by provisioning close to need rather than for peak times.
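The scale-out arithmetic in this use case follows from each balanced subcluster having one node per shard. A small sketch (the function name is ours, for illustration):

```python
def total_nodes(num_shards: int, num_subclusters: int) -> int:
    """Each balanced subcluster subscribes to every shard, so it
    contributes one node per shard to the cluster."""
    return num_shards * num_subclusters

print(total_nodes(5, 1))  # 5  nodes: the initial 5-shard cluster
print(total_nodes(5, 5))  # 25 nodes: four extra subclusters added for peak load
```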

Use Case 2: Complex analytic workload requires more compute nodes

This example showcases the idea that complex analytic workloads on large working data sets benefit from an initial cluster of type large with a high shard count. We will create an initial cluster of type large with 20 nodes and 20 shards. As needed, you can add and remove nodes to improve throughput scaling.

Use Case 3: Workload isolation

This example showcases the idea of having separate subclusters to isolate ETL and report workloads. We will create an initial cluster of type large with 10 nodes and 10 shards used to service queries, and add another 10-node subcluster of type medium to support ETL workloads. You may need to configure an AWS network load balancer to separate the ETL workload from SELECT queries. Workload isolation can also be useful for isolating groups of users with varying Vertica skills.

Use Case 4: Shrink your cluster to save costs

To shrink the cluster size, drop nodes from the cluster and Vertica will automatically rebalance the shards among the remaining nodes. When you shrink the cluster to a size smaller than the initial cluster size, nodes may subscribe to more than two shards, with the following impacts:

• The catalog size will be larger because nodes are subscribing to more shards.

• The depot will be shared by more shard subscriptions, which may lead to eviction of files.

• Each node will process more data, which may have performance impact on queries.
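The increase in shard subscriptions after a shrink is simple ceiling division. A sketch (the function name is ours, and it counts only primary subscriptions, ignoring any secondary subscriptions kept for high availability):

```python
import math

def shards_per_node(num_shards: int, num_nodes: int) -> int:
    """Minimum primary shard subscriptions each remaining node must
    cover after shards are rebalanced across a smaller cluster."""
    return math.ceil(num_shards / num_nodes)

# Shrinking a 20-shard, 20-node cluster down to 5 nodes means each
# remaining node covers at least 4 shards, so its catalog grows and
# its depot is shared by more shard subscriptions.
print(shards_per_node(20, 20))  # 1
print(shards_per_node(20, 5))   # 4
```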

For more information, see Using Eon Mode in the Vertica documentation.