Distributing Named Sequences
When you create a named sequence, the CACHE parameter determines the number of sequence values each node maintains during a session. The default cache value is 250K, so each node reserves 250,000 values per session for each sequence. The default cache size provides an efficient means for large insert or copy operations. Specifying a smaller number of cache values can impact performance of large loads, since Vertica must create a new set of cache values whenever more are required. Getting more cache for a new set of sequence values requires Vertica to perform a catalog lock. Such locks can adversely affect database performance, since some activities, such as data inserts, cannot occur until Vertica releases the lock.
By default, when the cache is initially empty, the initiator node requests and reserves cache for all nodes in a cluster. You can change this default so that each node requests its own cache. To change the default, set the parameter ClusterSequenceCacheMode to 0. For information on how Vertica requests and distributes cache among all nodes in a cluster, refer to How Vertica Allots Cache for Sequencing.
Effects of Distributed Sessions
Vertica distributes a session across all nodes. After you create a named sequence, the first time any cluster node executes a NEXTVAL()
statement within a query, the node requests its own cache of sequence values. The node then maintains that set of values for the current session. Other nodes executing a NEXTVAL()
statement create and maintain their own cache of sequence values.
During a session, nodes can increment sequence values from NEXTVAL()
statements at different rates. This behavior results in the sequences from a NEXTVAL statement on one node not being sequential with sequence values from another node. Each sequence is guaranteed to be unique, but can be out of order with a NEXTVAL statement executed on another node. Regardless of the number of calls to NEXTVAL and CURRVAL, Vertica increments a sequence only once per row. If multiple calls to NEXTVAL occur in the same row, the statement returns the same value.
If sequences are used in join statements, Vertica increments a sequence once for each composite row output by the join.
Calculating Named Sequences
Vertica calculates the current value of a sequence as follows:
- At the end of every statement, the state of all sequences used in the session is returned to the initiator node.
- The initiator node calculates the maximum CURRVAL of each sequence across all states on all nodes.
- This maximum value is used as CURRVAL in subsequent statements until another NEXTVAL is invoked.
Losing Sequence Values
Sequence values in cache can be lost in the following situations:
- If a statement fails after NEXTVAL is called (thereby consuming a sequence value from the cache), the value is lost.
- If a disconnect occurs (for example, dropped session), any remaining values in cache that have not been returned through NEXTVAL are lost.
- When the initiator node distributes a new block of cache to each node where one or more nodes has not used up its current cache allotment. For information on this scenario, refer to How Vertica Allots Cache for Sequencing.
To recover lost sequence values, you can run an ALTER SEQUENCE command to define a new sequence number generator, which resets the counter to the correct value in the next session.