Vertica in Eon Mode: Tables, Parameters, Functions, and Useful Queries

Posted May 3, 2018 by Soniya Shah, Information Developer

High angle view of Beijing Guomao.
This blog post was authored by Rakesh Bankula. The purpose of this document is to list new system tables, configuration parameters and meta-functions that were added for Vertica Eon mode. This document also has sample queries you can run on the new system tables to get up to speed with Eon mode. This document assumes that you have a basic understanding of Eon mode architecture. You can find mode details about Eon mode architecture in Eon Mode Architecture in the Vertica documentation.

Shard Related System Tables

Shards are segments of the data in your communal storage location. The shards in your communal storage location are similar to a collection of segmented projections in an Enterprise mode database. Each node in your database subscribes to a subset of the shards in the communal storage location. The following shard related system tables were added in Eon mode:
Table Name Table Description
shards Lists the shards in the database.
node_subscriptions Lists the nodes and its corresponding shards subscriptions in the database.
session_subscriptions Lists the session subscriptions in the database. Shows the session’s view of the node subscriptions. Session subscriptions are used when running queries within the session.
catalog_subscription_changes Lists node subscription state changes.
node_subscription_change_phases Lists details about node subscription changes.

Depot Related System Tables

The depot provides the node with a local copy of data stored on the communal storage. Each node in an Eon database has its own depot. This intermediate layer of data storage provides a faster copy of the data that is local to the node. Data that is frequently used in your queries takes priority in your depot. If the data for a query is not in the depot, then Vertica might need to read data from communal storage. The following depot related system tables were added in Eon mode:
Table Name Table Description
depot_fetches Lists the files fetched to the depot.
depot_uploads Lists the files uploaded to communal storage from the depot.
depot_evictions Lists the files evicted from the depot.
vs_depot_lru Lists the details of all the depot files.
vs_depot_size Lists the current and maximum size in bytes of the depots on all nodes.
dc_file_reads Lists the files read from the depot and/or S3 for each query.
dc_communal_cleanup_records Lists the data files detected on communal storage that are not referenced in the catalog.

New Configuration Parameters

Configuration Parameter Description
UseDepotForReads Checks if the files are in the depot and reads from there. Disabling this parameter will force queries to use data from communal storage and have some performance impact. This flag is useful to test performance characteristics for data not in the depot.
UsePeerToPeerDataTransfer Sends data to all the subscribers for a shard at data load time.
EnableDepotWarmingFromPeers Warms up the depot when a node comes up. Disabling this parameter will have a performance impact for queries serviced by nodes with the cold depot.
DelayForDeletes The time interval to wait before actually deleting a file from shared storage. Default value is 2 hours.
ShardRecoveryMaxIterations Maximum number of iterations before the shard catalog installer is forced over concurrent loads. Default value is 10.
DropFailedToActivateSubscriptions Drops a subscription that failed to transition from PASSIVE TO ACTIVE.

New Metafunctions

Metafunction Description
clear_data_depot() Deletes all the data from the depot for the given table. Deletes all the data in the depot, if no table name is passed.
alter_location_size(‘path’, ‘node’,’size’) or alter_location_size(‘path’, ‘ ‘,’size’) Changes the size of the depot. Vertica will delete the least recently accessed files, if the new size is smaller. If the node is left empty, all nodes in the cluster will be changed.
finish_fetching_files() Blocks the session until all the files in the fetcher queue to be downloaded from communal storage are downloaded.
rebalance_shards() Used to re-balance shard subscriptions between nodes when new nodes are added or removed from the cluster.

Useful Queries

Query Description SQL Statement
Query to list files read from depot and communal storage. => select node_name,transaction_id,statement_id,storage_location,storageid from dc_file_reads where transaction_id=<transaction id> and statement_id=<statement id>;
Number of bytes read from depot and communal storage (s3). This is an approximation. => select s.node_name,storage_path,sum(used_bytes) from dc_file_reads f join storage_containers s on s.sal_storage_id=f.storageid and s.node_name=f.node_name where transaction_id=<transaction id> and statement_id=<statement id> group by 1,2;
List of files queued for deletion from communal storage. => select node_name,transaction_id,statement_id,storage_location,file_name,size, queued_for_delete from dc_communal_cleanup_records;
List of files in the depot by the descending order of accessed time. => select node_name,sal_storage_id,num_accesses,size,last_access_time from vs_depot_lru order by last_access_time desc;
Query to confirm that there is no data skew between shards. => select shard_name,sum(total_row_count) total_rows ,sum(deleted_row_count) del_row_count,sum(used_bytes) used_bytes from storage_containers group by 1;
Files fetched from communal storage along with the fetch time. => select datediff('ms',start_time,end_time) as duration_millisec, source_node, node_name, file_size_bytes, destination_file, source_file from depot_fetches where source_node = '';
Peer to peer files fetched along with the fetch time. => select datediff('ms',start_time,end_time) as duration_millisec, source_node, node_name, file_size_bytes, destination_file, source_file from depot_fetches where source_node != '';
Files uploaded to communal storage along with the upload time. => select datediff('ms',start_time,end_time) as duration_millisec,* from depot_uploads order by 1 desc;
Files evicted from the depot, which were accessed in the last 5 minutes before eviction. => select datediff('mi',last_access_time,time),* from depot_evictions where datediff('mi',last_access_time,time) > 5 order by 1;
Query that checks the percentage of space presently used in the storage location assigned to the depot. => select node_name,storage_path,storage_usage,disk_space_used_mb,disk_space_free_mb,disk_space_free_percent from disk_storage where storage_usage='DEPOT';
Finds the number of queries executed in each fault group in a given time window. select g.parent_name fault_group,count(*) Number_of_queries from dc_requests_issued i join fault_groups g on i.node_name=g.member_name where i.time between '' and '' group by 1 order by 1;
To pre-fetch files from communal storage to the depot for a query. explain local<query>;