Using Druid as a linked data source in Metatron Discovery

Created with Sketch.

Using Druid as a linked data source in Metatron Discovery

5
(1)

When creating a dashboard from a data source already loaded in Druid, it is often difficult to visualize with Metatron Discovery when complex queries such as subquery are required. In such a case, it can be solved by creating a Linked Datasource after creating an appropriate query using Druid SQL.

1. Data Set

Use the NYC Taxi zone data and motor collision data and queries used in the previous example to create a Discovery Map chart. See the article below for how to get the data(Refer URL : https://metatron.app/2020/08/24/kepler-gl-on-druid-spatial-query/).

The query used in the previous example is as follows. The query below is to get the result of two datasources through a spatial join. To get the same results with Metatron Discovery, we need to create a new data source from the query.

WITH NYC AS (
   SELECT 
		t1.Borough, t1.LocationID, t1.Zone, t1.service_zone, "NUMBER OF CYCLIST INJURED"
	FROM 
		"druid"."motor_vehicle_collisions_sample3" t2 JOIN "druid"."nyc_boundaries" t1
	ON 
		ST_Contains( t1.Zone, t2.LOCATION )
)
SELECT 
	NYC.LocationID, NYC.Borough, NYC.service_zone, t3.Zone as geom, count(*) as number, sum(NYC."NUMBER OF CYCLIST INJURED") as "cyclist injured"
FROM 
	NYC join "druid"."nyc_boundaries" t3 on (NYC.LocationID = t3.LocationID)
GROUP BY 
	NYC.LocationID, NYC.Borough, NYC.service_zone, t3.Zone

2. Creating Linked datasource

Let’s create a linked datasource using the above Druid SQL.
First, when creating a datasource, select the Linked datasource option and set the data source type to Druid. And then enter the Broker URL for Host and Port.

Next, use the above query in the Query tab in the “Select data” step.

After that, in the step of selecting the Datasource schema, select the geom type as Polygon. EPSG:4326 is selected for the coordinate system, and indexing is optional.

Linked datasource is a temporary datasource, and expiration time must be set. In this example, it is set to the default value.

Name the datasource and complete datasource creation. Since it is a linked datasource, the ingestion step is ignored.

Create a datasource using the following query in a similar way.

SELECT 
		t2.*, t1.LocationID
	FROM 
		"druid"."motor_vehicle_collisions_sample3" t2 JOIN "druid"."nyc_boundaries" t1
	ON 
		ST_Contains( t1.Zone, t2.LOCATION )

3. Interconnecting the created datasources

Now using this data we can create dashboards, and it is also possible to correlate with other datasources using Metatron Discovery’s association datasources function. Datasources are irrelevant to each other, but it is possible to connect datasources to each other by using the Association or Join function of Discovery. Use the LocationID field for data source connection.

4. Creating dashboard

Create a Map chart using NYC Taxizone and a Bar chart using Motor Collsion. Bar chart expresses the number of accidents by LocationID. If you select a LocationID with a lot of accidents in the dashboard created in this way, the screen that displays the area including the corresponding LocationID on the map is shown below.

5. More things to think about

Metatron distributed Druid supports CTAS in Druid SQL format. CTAS can be used in the following syntax format.

create [temporary] table "table name" as
select ...

In this way, it is also possible to create a dashboard after temporarily or permanently creating a Druid datasource.

How useful was this post?

Click on a star to rate it!

Average rating 5 / 5. Vote count: 1

No votes so far! Be the first to rate this post.

As you found this post useful...

Share this post on your social media!

We are sorry that this post was not useful for you!

Let us improve this post!

Tell us how we can improve this post?

 

Leave a Reply

Your email address will not be published. Required fields are marked *