myVertica  

Vertica Blog

Skipping Records with Unspecified JSON Fields

Serge Bonte and Jim Knicely authored this post.

Vertica provides a built-in file parser named FJSONPARSER that parses and loads a JSON file. This file can contain either repeated JSON data objects (including nested maps) or an outer list of JSON elements.

For a flex table, the parser stores the JSON data in a single-value VMap.

For a hybrid or columnar table, the parser loads data directly in any table column with a column name that matches a key in the JSON source data.

Load the JSON Data into Vertica

Let’s look at an example JSON file:

dbadmin=> \! cat /home/dbadmin/some_data.json {"age": 5, "name": "Tim"} {"age": 7, "name": "Joe", "zip": 92022} {"age": 8} {"name" : "Jim"} First, you need to create a Vertica table to hold the contents of the JSON data:

dbadmin=> CREATE TABLE some_data1(age INT, name VARCHAR); CREATE TABLE Copy the data from the JSON file into the new table:

dbadmin=> COPY some_data1 FROM '/home/dbadmin/some_data.json' PARSER fjsonparser(); Rows Loaded ------------- 4 (1 row) dbadmin=> SELECT * FROM some_data1; age | name -----+------ 5 | Tim 8 | | Jim 7 | Joe (4 rows)

Skip the JSON Records if Field not Specified

To skip loading records when the value for a field is not specified, add a NOT NULL constraint to the corresponding table column. Vertica does not load that record into the table.

Again, create a Vertica table. This time specify the NOT NULL constraint to the age column:

dbadmin=> CREATE TABLE some_data2(age INT NOT NULL, name VARCHAR); CREATE TABLE Copy the JSON data into this table. Note that Vertica does not load the record that does not contain an age value the some_data2.json.

dbadmin=> COPY some_data2 FROM '/home/dbadmin/some_data.json' PARSER fjsonparser(); Rows Loaded ------------- 3 (1 row) dbadmin=> SELECT * FROM some_data2; age | name -----+------ 5 | Tim 8 | 7 | Joe (3 rows)

For More Information

In the Vertica documentation, see:

Load JSON Data

FJSONPARSER