Using Druid as a linked data source in Metatron Discovery
When creating a dashboard from a data source already loaded in Druid, it is often difficult to visualize with Metatron Discovery when complex queries such as subquery are required. In such a case, it can be solved by creating a Linked Datasource after creating an appropriate query using Druid SQL.
1. Data Set
Use the NYC Taxi zone data and motor collision data and queries used in the previous example to create a Discovery Map chart. See the article below for how to get the data(Refer URL : https://metatron.app/2020/08/24/kepler-gl-on-druid-spatial-query/).
The query used in the previous example is as follows. The query below is to get the result of two datasources through a spatial join. To get the same results with Metatron Discovery, we need to create a new data source from the query.
WITH NYC AS ( SELECT t1.Borough, t1.LocationID, t1.Zone, t1.service_zone, "NUMBER OF CYCLIST INJURED" FROM "druid"."motor_vehicle_collisions_sample3" t2 JOIN "druid"."nyc_boundaries" t1 ON ST_Contains( t1.Zone, t2.LOCATION ) ) SELECT NYC.LocationID, NYC.Borough, NYC.service_zone, t3.Zone as geom, count(*) as number, sum(NYC."NUMBER OF CYCLIST INJURED") as "cyclist injured" FROM NYC join "druid"."nyc_boundaries" t3 on (NYC.LocationID = t3.LocationID) GROUP BY NYC.LocationID, NYC.Borough, NYC.service_zone, t3.Zone
2. Creating Linked datasource
Let’s create a linked datasource using the above Druid SQL.
First, when creating a datasource, select the Linked datasource option and set the data source type to Druid. And then enter the Broker URL for Host and Port.
Next, use the above query in the Query tab in the “Select data” step.
After that, in the step of selecting the Datasource schema, select the geom type as Polygon. EPSG:4326 is selected for the coordinate system, and indexing is optional.
Linked datasource is a temporary datasource, and expiration time must be set. In this example, it is set to the default value.
Name the datasource and complete datasource creation. Since it is a linked datasource, the ingestion step is ignored.
Create a datasource using the following query in a similar way.
SELECT t2.*, t1.LocationID FROM "druid"."motor_vehicle_collisions_sample3" t2 JOIN "druid"."nyc_boundaries" t1 ON ST_Contains( t1.Zone, t2.LOCATION )
3. Interconnecting the created datasources
Now using this data we can create dashboards, and it is also possible to correlate with other datasources using Metatron Discovery’s association datasources function. Datasources are irrelevant to each other, but it is possible to connect datasources to each other by using the Association or Join function of Discovery. Use the LocationID field for data source connection.
4. Creating dashboard
Create a Map chart using NYC Taxizone and a Bar chart using Motor Collsion. Bar chart expresses the number of accidents by LocationID. If you select a LocationID with a lot of accidents in the dashboard created in this way, the screen that displays the area including the corresponding LocationID on the map is shown below.
5. More things to think about
Metatron distributed Druid supports CTAS in Druid SQL format. CTAS can be used in the following syntax format.
create [temporary] table "table name" as select ...
In this way, it is also possible to create a dashboard after temporarily or permanently creating a Druid datasource.
We are sorry that this post was not useful for you!
Let us improve this post!
Tell us how we can improve this post?