Vertica QuickStart for Pentaho BI
Note The Vertica QuickStart apps are being migrated to a new page. They will be available again soon.
What is a QuickStart Application?
The Vertica QuickStarts are sample applications that show how complementary technologies can work together to deliver outstanding benefits to end users. Each QuickStart uses Vertica Analytic Database with a different BI or ETL tool from a Vertica technology partner.
The QuickStarts are available for download free of charge.
About this Document
This document explains how to deploy and use the Vertica QuickStart for business intelligence with Pentaho. The document includes the setup information that you need to get up and running, and it provides an overview of the Pentaho dashboards and the Vertica data source.
Vertica QuickStart BI with Pentaho Overview
The Vertica QuickStart for Pentaho BI is a sample application implemented as a set of Pentaho dashboards powered by Vertica Analytic Database. The dashboards present sample retail data for analysis. The QuickStart shows how retail companies could use Vertica and Pentaho to quickly explore, visualize, and gain insight into their data stored in Vertica.
For a quick introduction to Vertica QuickStart BI with Pentaho, watch this short video:
The Vertica QuickStart for Pentaho BI requires a Vertica database server with a standard installation of the VMart example database, a Vertica client with JDBC, Pentaho Server, and a web browser for running the Pentaho dashboards.
The QuickStart was created using Pentaho 5.4 and the Vertica Analytic Database.
Install the Software
To install the software that is required for running the QuickStart, follow these steps:
Install the Vertica Database Server
Vertica database server runs on Linux platforms. If you do not have Vertica, you can download the RPMs or a virtual machine free of charge from the Vertica Community Portal at https://my.vertica.com/.
To download and install Vertica Community Edition:
- On myVertica, under Vertica Community Edition, click Signup Now to register for a Community Edition Account.
- Provide your information and click Signup.
- Follow the on-screen instructions to download and install Vertica Community Edition.
Install the VMart Example Database
The Vertica QuickStart for Pentaho assumes a default installation of the Vertica VMart example database. Follow the tutorial in the Vertica Getting Started Guide to install VMart.
In the tutorial:
- Follow the steps in Installing and Connecting to the VMart Example Database.
- Note the default VMart database location shown in VMart Database Location and Scripts.
The VMart example database includes three schemas: Public, Online Sales, and Store. The schemas are interrelated and share many dimensions. For details, see VMart Example Database Schema, Tables, and Scripts.
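After the installation, a row count on each fact table is one quick way to confirm that the three schemas were populated. The following sketch only builds the SQL text, which you could then run in vsql or another client. The fact table names are assumptions based on the VMart documentation, so check them against your own database first.

```python
# Fact tables per VMart schema. These names are assumptions taken from the
# VMart documentation; confirm them in your own database before relying on them.
VMART_FACT_TABLES = {
    "public": ["inventory_fact"],
    "store": ["store_sales_fact", "store_orders_fact"],
    "online_sales": ["online_sales_fact"],
}


def row_count_queries(tables_by_schema):
    """Return one SELECT COUNT(*) statement per schema-qualified table."""
    queries = []
    for schema, tables in sorted(tables_by_schema.items()):
        for table in tables:
            queries.append(f"SELECT COUNT(*) FROM {schema}.{table};")
    return queries


if __name__ == "__main__":
    for query in row_count_queries(VMART_FACT_TABLES):
        print(query)
```

A count of zero, or an error about a missing table, would suggest that the VMart load did not complete.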
Install Pentaho Server
Pentaho Server is available for Windows, Mac OS, and Linux. See the Pentaho Compatibility Matrix:
To install Pentaho Server 5.4:
- Navigate to the following URL:
- Click Download to download the version of Pentaho for your platform.
- Extract the contents of the download file. The files are extracted into the following directory:
To start Pentaho Server, execute the start_pentaho command. For example, on Windows:
To stop Pentaho Server, execute the stop_pentaho command. For example, on Windows:
Install the JDBC Client Driver
Before you can connect to Vertica using Pentaho you must download and install a Vertica client package. This package includes the Vertica JDBC driver that Pentaho uses to connect to Vertica.
To download and install the JDBC driver:
- Go to http://www.vertica.com/resources/vertica-client-drivers/ for the latest client drivers.
- Download the Vertica client package for your platform.
- Place the jar file in the following directory:
<Pentaho Home>\biserver-ce\tomcat\lib
For example, on Windows:
Note Vertica drivers are forward compatible, so you can connect to the Vertica server using previous versions of the client. For more information, see Client Driver and Server Version Compatibility in the Vertica documentation.
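If the connection later fails, one thing worth checking is whether the driver jar actually landed in the Tomcat lib directory. The sketch below looks for it under a Pentaho home directory that you pass in. The helper name find_vertica_jars and the file name pattern vertica-jdbc*.jar are assumptions for illustration; adjust the pattern to match the jar you downloaded.

```python
from pathlib import Path


def find_vertica_jars(pentaho_home):
    """List Vertica JDBC jar files in the Pentaho Tomcat lib directory."""
    lib_dir = Path(pentaho_home) / "biserver-ce" / "tomcat" / "lib"
    if not lib_dir.is_dir():
        return []
    # Assumed naming pattern: vertica-jdbc-<version>.jar
    return sorted(p.name for p in lib_dir.glob("vertica-jdbc*.jar"))


if __name__ == "__main__":
    import sys

    # Pass your Pentaho home directory as the first argument.
    home = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_vertica_jars(home)
    print(jars if jars else "No Vertica JDBC jar found")
```

An empty result means either that the directory does not exist under the given home or that no matching jar is present.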
Connect Pentaho Server with Vertica
The following steps show the process of configuring a connection to Vertica using Pentaho Server on Windows.
- From the Pentaho Server File menu, select New, then Data Source.
- Click New Data Source.
- In the Database Connection dialog box:
- For Database Type, select Vertica 5+.
- For Access, select Native (JDBC).
- Fill out the rest of the connection details.
- Click Test to test the connection.
- When the connection succeeds, click OK.
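The connection details entered in the dialog box correspond to the parts of a Vertica JDBC URL, which has the form jdbc:vertica://host:port/database. The following sketch assembles that URL so you can test the same values with another JDBC client. The function name and the host shown are placeholders; 5433 is the default Vertica port, and yours may differ.

```python
def vertica_jdbc_url(host, database, port=5433):
    """Build a Vertica JDBC URL of the form jdbc:vertica://host:port/database."""
    if not host or not database:
        raise ValueError("host and database are required")
    return f"jdbc:vertica://{host}:{port}/{database}"


if __name__ == "__main__":
    # "vertica-host.example.com" is a placeholder host name;
    # VMart is the example database used by the QuickStart.
    print(vertica_jdbc_url("vertica-host.example.com", "VMart"))
```

The user name and password are supplied separately in the dialog box and are deliberately left out of the URL here.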
Download the QuickStart Dashboards
To download the Pentaho QuickStart BI dashboards:
- Go to the Big Data Marketplace and log in with your Marketplace credentials.
- Select QuickStart Examples.
- Select Vertica QuickStart for Pentaho BI.
- Click Download.
- Save the zip file on your computer.
Deploy the QuickStart Dashboards
To deploy the dashboards:
- Create a folder for deploying the QuickStart dashboards.
- In Pentaho Server, select Browse Files from the File menu.
- Navigate to the deployment folder that you created.
- Under Folder Actions, click Upload.
- Navigate to the folder that contains the QuickStart zip file that you downloaded from the Big Data Marketplace.
- Select the zip file.
The following screenshot shows the Upload action with a deployment folder called VMART.
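Before uploading, you may want to confirm that the downloaded archive is intact. The sketch below uses Python's standard zipfile module to run an integrity check and list the entries. The function name list_archive is hypothetical, and the path you pass in is wherever you saved the download.

```python
import zipfile


def list_archive(zip_path):
    """Return the sorted entry names in a zip file, after an integrity check."""
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns None when every entry passes its CRC check.
        bad_entry = archive.testzip()
        if bad_entry is not None:
            raise ValueError(f"Corrupt entry in archive: {bad_entry}")
        return sorted(archive.namelist())


if __name__ == "__main__":
    import sys

    # Pass the path of the downloaded zip file as the first argument.
    if len(sys.argv) > 1:
        for name in list_archive(sys.argv[1]):
            print(name)
```

If the file is not a valid zip archive, zipfile.ZipFile raises zipfile.BadZipFile, which is a sign to download the file again.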
About the QuickStart Dashboards
The QuickStart dashboards present sample business and operational data that a large retail chain might track over time. The chain operates brick-and-mortar stores and an online marketplace. It sells a wide variety of products that it purchases from different vendors.
Note The data in your dashboards will not match the data in the screenshots in this document, because the VMart data generator generates data randomly.
Executive Dashboard
The Executive Dashboard presents a high-level view of the business data that is shown in greater detail in the other dashboards. The dashboard also allows you to take a closer look at the data for a specific state, in this case Texas. The dashboard shown here is displaying revenue from sales to resellers (companies) in 2005 and 2006. You can see at a glance that:
- Store sales were more volatile in 2005 than in 2006.
- The lowest revenue month for store sales was June 2005.
- There was a sharp drop in online sales in January and February of 2006.
By changing the Sale/Return filter from Sale to Return, you can see the negative revenue resulting from products returned during the same time period.
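Conceptually, the Sale/Return filter acts like a predicate on the transaction type before revenue is summed by month. The following sketch illustrates the idea with Python's built-in sqlite3 module and a few made-up rows, so it runs without a Vertica server. The table and column names and the 'purchase' and 'return' values are modeled on VMart but are assumptions, and this is not the query that the dashboards issue.

```python
import sqlite3

# Made-up sample rows: (sale month, transaction type, dollar amount).
# A real VMart fact table stores date keys rather than a month string.
SAMPLE_ROWS = [
    ("2005-06", "purchase", 120.0),
    ("2005-06", "return", 20.0),
    ("2005-07", "purchase", 300.0),
    ("2005-07", "purchase", 50.0),
]


def monthly_revenue(rows, transaction_type):
    """Sum dollar amounts per month for one transaction type."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE online_sales_fact ("
            "sale_month TEXT, transaction_type TEXT, sales_dollar_amount REAL)"
        )
        conn.executemany("INSERT INTO online_sales_fact VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT sale_month, SUM(sales_dollar_amount) "
            "FROM online_sales_fact WHERE transaction_type = ? "
            "GROUP BY sale_month ORDER BY sale_month",
            (transaction_type,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(monthly_revenue(SAMPLE_ROWS, "purchase"))
    print(monthly_revenue(SAMPLE_ROWS, "return"))
```

The sketch returns return amounts as positive sums; presenting them as negative revenue, as the dashboard does, would be a separate step.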
Online Sales Overview
The Online Sales Overview presents an overview of the online business. In this instance, the dashboard is displaying quarterly revenue from sales to resellers in the SouthWest region in 2005 and 2006.
The filters on the bottom of the page allow you to examine the data in detail. For example, by viewing product categories, you can see that food was by far the greatest source of revenue from online sales during this time period.
Store Sales Overview
The Store Sales Overview presents an overview of the traditional business conducted in the brick-and-mortar stores owned by this retail chain. In this instance, the dashboard is displaying quarterly revenue from store sales to resellers in 2005 and 2006.
The filters on the bottom of the page allow you to examine the data in detail. For example, by selecting Has Membership Card for Customer Attribute, you can see that a membership card had no effect on sales to resellers.
Customer Dashboard
The Customer Dashboard presents information about customers, both individuals and resellers. In this instance, the dashboard is showing the individual customers in the SouthWest region who returned items they purchased either online or in stores in 2005 and 2006.
For additional insight into your customer base, you can view customer characteristics such as age, gender, and income by selecting a different Customer Attribute.
Call Center Dashboard
The Call Center Dashboard presents an overview of the performance of the chain’s sales personnel, both in stores and in online call centers. In this instance, the dashboard is displaying data for 2005 and 2006. It’s clear that the performance of sales personnel in stores was far more uniform than the performance of call center personnel during this time period.
Product Dashboard
The Product Dashboard presents an overview of the products sold in stores and online. In this instance, the dashboard is displaying revenue from products sold in the SouthWest region in 2005 and 2006.
Vendor Performance Dashboard
The Vendor Performance Dashboard presents an overview of the performance of the vendors used by this retail chain. In this instance, the dashboard is displaying Average days to deliver during 2005 and 2006 for the vendor Market Wholesale. It’s clear that this vendor’s deliveries to California were seriously delayed during this time period.
Inventory Dashboard
The Inventory Dashboard presents an overview of the inventory held by this retail chain. In this instance, the dashboard is showing the inventory broken down by warehouse and product category for 2005 and 2006.
Find More Information
Pentaho product overview
VMart example database
Big Data Marketplace
The Vertica QuickStart for Pentaho is intended as an example of complementary technologies: Pentaho with Vertica Analytic Database. As such, it is freely available for demonstration and educational purposes to anyone wishing to explore these technologies. The QuickStart is not a product, and is not governed by any license or support agreement.