Looking Under the Hood at Vertica Queries

When you submit a query to Vertica, you want it to execute as quickly and efficiently as possible. The query optimizer creates a plan that is designed to do just that. The directives in the query plan determine your query’s run-time performance and resource consumption, but the properties of your projections and the system parameters also impact the query’s performance.

But the query plan is just that: a plan. You really want to know what actually happened while the query executed. What projections did the query use? What system resources did the query acquire? Can you adjust any parameters to make the query run faster?

Those are a lot of questions. Read on to get some answers.

What projections did your query access?

Projections, of course, are the physical location where Vertica stores your data. For best performance, you should always run Database Designer to optimize the projections that your frequently executed queries need to access.

To find out which projections were used for a particular query, you could review the query plan, but there’s an easier way: query the QUERY_PROFILES and PROJECTION_USAGE system tables. QUERY_PROFILES describes the individual queries that were executed, and PROJECTION_USAGE records which projections were accessed. When joined, these two tables show you which projections a given query accessed.

Identifying the query

To see the projections used by a particular query, first, get the statement and transaction IDs from the QUERY_PROFILES system table. The following results list the executed queries in reverse chronological order, with the latest query listed first:

=> SELECT query, transaction_id, statement_id, query_start
   FROM query_profiles
   ORDER BY query_start DESC;
-[ RECORD 1 ]--+------------------------------------------------------------------------------
query          | SELECT avg(employee_age) FROM public.employees GROUP BY employee_location;
transaction_id | 45035996278252303
statement_id   | 1
query_start    | 2016-02-23 10:03:00.270725-05
-[ RECORD 2 ]--+------------------------------------------------------------------------------
query          | SELECT * FROM query_profiles;
transaction_id | 45035996278252113
statement_id   | 18
query_start    | 2016-02-23 10:01:59.563359-05
-[ RECORD 3 ]--+------------------------------------------------------------------------------
query          | GRANT EXECUTE ON FUNCTION public.GetData() TO PUBLIC;
transaction_id | 45035996278251057
statement_id   | 1
query_start    | 2016-02-23 05:10:15.164605-05

Now that we have the statement and transaction IDs for a couple of queries, we can use those IDs to find the projections associated with them.
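
On a busy system, QUERY_PROFILES can hold many rows, so it may help to narrow the listing before you pick out the IDs. The following is a minimal sketch that uses the same columns as the example above; the search pattern and the LIMIT value are illustrative assumptions, not values from the original example.

```sql
-- Narrow QUERY_PROFILES to recent statements whose text mentions a given table.
-- The ILIKE pattern and the LIMIT value are illustrative assumptions.
SELECT query, transaction_id, statement_id, query_start
FROM query_profiles
WHERE query ILIKE '%public.employees%'
ORDER BY query_start DESC
LIMIT 5;
```

Because the filter matches on query text, it also returns any other statement that mentions the same table name, including this lookup itself.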

Identifying a query’s associated projections

Let’s identify the projection used by the first query from the QUERY_PROFILES example in the previous section:

=> SELECT avg(employee_age) FROM public.employees GROUP BY employee_location;

To find the projection associated with this query, join the QUERY_PROFILES and PROJECTION_USAGE system tables on the statement_id and transaction_id columns, and then specify the statement id and transaction id in a WHERE clause.

The following join returns the text of the executed query and the name of the projection it accessed. You can see that the SELECT query used the employees_b0 projection.

=> SELECT query_profiles.transaction_id,
          query_profiles.statement_id,
          query_profiles.query,
          projection_usage.projection_name
   FROM query_profiles, projection_usage
   WHERE ( projection_usage.transaction_id = query_profiles.transaction_id
           AND projection_usage.statement_id = query_profiles.statement_id )
     AND ( projection_usage.transaction_id = 49539595901078494
           AND projection_usage.statement_id = 20 );
-[ RECORD 1 ]---+---------------------------------------------------------------------------
transaction_id  | 49539595901078494
statement_id    | 20
query           | SELECT avg(employee_age) FROM public.employees GROUP BY employee_location;
projection_name | employees_b0
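
The same lookup can also be written with explicit JOIN ... ON syntax, which keeps the join conditions separate from the filter on the IDs. This is a sketch of an equivalent form, not a different technique; the table aliases qp and pu are introduced here only for readability.

```sql
-- Equivalent lookup with an explicit inner join.
-- qp and pu are aliases chosen for this sketch.
SELECT qp.transaction_id,
       qp.statement_id,
       qp.query,
       pu.projection_name
FROM query_profiles AS qp
JOIN projection_usage AS pu
  ON  pu.transaction_id = qp.transaction_id
  AND pu.statement_id   = qp.statement_id
WHERE qp.transaction_id = 49539595901078494
  AND qp.statement_id   = 20;
```

If a query reads from more than one projection, this returns one row per projection.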

Did my query insert data?

The PROJECTION_USAGE system table can also tell you whether data was inserted into or retrieved from a projection by a particular query. The io_type column contains one of two possible values:

input: SELECT statements
output: INSERT and COPY statements

Let’s return to our example query:

=> SELECT avg(employee_age) FROM public.employees GROUP BY employee_location;

Not surprisingly, for this SELECT query, the io_type column is input because that query retrieved data from the employees_b0 projection. In the following results, you can see that other queries inserted or copied data into the specified projections:

=> SELECT transaction_id, statement_id, io_type, projection_name
   FROM projection_usage
   ORDER BY query_start_timestamp DESC;
-[ RECORD 1 ]---+------------------------------------------------------
transaction_id  | 49539595901078494
statement_id    | 20
io_type         | output
projection_name | employees_b0
-[ RECORD 2 ]---+------------------------------------------------------
transaction_id  | 45035996278259395
statement_id    | 3
io_type         | input
projection_name | summary_b0
-[ RECORD 3 ]---+------------------------------------------------------
transaction_id  | 45035996278256653
statement_id    | 1
io_type         | input
projection_name | sales_b0
-[ RECORD 4 ]---+------------------------------------------------------
transaction_id  | 45035996278256651
statement_id    | 1
io_type         | output
projection_name | locations_b0
...

The PROJECTION_USAGE system table has a lot of information about the projections your queries used. For complete information, see PROJECTION_USAGE in the Vertica documentation.
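
To get a summary rather than a row per statement, you could aggregate PROJECTION_USAGE by projection and I/O type. The following is a minimal sketch using only the columns shown in the example above; the column aliases usage_count and last_used are names chosen for this illustration.

```sql
-- Count how often each projection appears in PROJECTION_USAGE, split by io_type.
-- usage_count and last_used are aliases chosen for this sketch.
SELECT projection_name,
       io_type,
       COUNT(*)                   AS usage_count,
       MAX(query_start_timestamp) AS last_used
FROM projection_usage
GROUP BY projection_name, io_type
ORDER BY usage_count DESC;
```

A projection that rarely or never shows up in these results may be worth reviewing, although the counts only cover the history that the system table still retains.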

What system resources did my query use?

Now suppose you are evaluating your system and query performance. What system resources do your queries acquire? Which queries consume resources that might affect performance?

You can join QUERY_PROFILES with the RESOURCE_ACQUISITIONS system table to get resource usage information for a query. The following query retrieves:

Number of open file handles used
Number of threads used
Amount of memory (KB) acquired
Duration of the resource request (ms)

=> SELECT query_profiles.transaction_id,
          query_profiles.statement_id,
          query_profiles.query,
          resource_acquisitions.open_file_handle_count,
          resource_acquisitions.thread_count,
          resource_acquisitions.memory_inuse_kb,
          resource_acquisitions.duration_ms
   FROM query_profiles, resource_acquisitions
   WHERE ( resource_acquisitions.transaction_id = query_profiles.transaction_id
           AND resource_acquisitions.statement_id = query_profiles.statement_id )
     AND ( query_profiles.transaction_id = 49539595901078494
           AND query_profiles.statement_id = 1 );
-[ RECORD 1 ]----------+------------------------------------------------------------------------
transaction_id         | 49539595901078494
statement_id           | 1
query                  | SELECT avg(employee_age) FROM public.employees GROUP BY employee_location;
open_file_handle_count | 3
thread_count           | 6
memory_inuse_kb        | 70760
duration_ms            | 22

Can I adjust any parameters to make the query run faster?

You can use the information in the RESOURCE_ACQUISITIONS system table to decide whether to adjust your resource pool or system parameters for queries that consume more or fewer resources than you expect. For details, see the Vertica documentation on resource pool parameters and on the CREATE RESOURCE POOL and ALTER RESOURCE POOL statements, which you use to configure these parameters.
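
As a rough sketch of that workflow, you might first rank resource acquisitions by memory to see which statements and pools stand out, and then change a pool parameter. The pool name reporting_pool and the MEMORYSIZE and PLANNEDCONCURRENCY values below are hypothetical placeholders, not recommendations; choose settings based on your own workload.

```sql
-- Rank resource acquisitions by memory, along with the pool that granted them.
SELECT pool_name,
       transaction_id,
       statement_id,
       memory_inuse_kb,
       thread_count,
       duration_ms
FROM resource_acquisitions
ORDER BY memory_inuse_kb DESC
LIMIT 10;

-- Hypothetical adjustment: reporting_pool and both values are placeholders.
ALTER RESOURCE POOL reporting_pool MEMORYSIZE '2G' PLANNEDCONCURRENCY 4;
```

The transaction_id and statement_id values in the first result can be joined back to QUERY_PROFILES, as in the earlier examples, to see the text of the statements involved.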

For More Information

For complete information about the system tables used in these examples, see QUERY_PROFILES, PROJECTION_USAGE, and RESOURCE_ACQUISITIONS in the Vertica documentation.
