Joining Druid datasources
In a relational database, joining tables is a fundamental operation. In a NoSQL-style database such as Druid, joins are usually not required, but it is sometimes necessary to refer to values in another datasource (for example, to look up the reference values of coded fields). For this reason, Apache Druid provides two ways to join.
- Query time Lookup
- Join operation
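The two approaches listed above can be sketched as SQL text. This is only an illustration written against Apache Druid SQL syntax (the `LOOKUP(expr, 'lookup_name')` function and a regular `JOIN`); the datasources `sales` and `countries`, the lookup `country_names`, and all column names are hypothetical, and the syntax accepted by a particular Druid distribution may differ.

```python
# Hypothetical datasource, lookup, and column names, for illustration only.

# 1) Query-time lookup: translates one key into one value through a lookup
#    ("country_names") that must already be registered in the cluster.
LOOKUP_QUERY = """
SELECT LOOKUP(country_code, 'country_names') AS country_name,
       SUM(amount) AS total_amount
FROM sales
GROUP BY LOOKUP(country_code, 'country_names')
"""

# 2) Join operation: another datasource ("countries") acts as the reference
#    table, so nothing has to be registered before the query is written.
JOIN_QUERY = """
SELECT c.country_name, SUM(s.amount) AS total_amount
FROM sales s
JOIN countries c ON s.country_code = c.country_code
GROUP BY c.country_name
"""
```

Both statements answer the same question; the difference is where the code-to-name mapping lives.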
A query-time lookup works only with a single key-value map, and it requires considerable preparation in advance, which makes it hard to use in an ad hoc query environment. Join operations between datasources are therefore the main choice for ad hoc queries. The Metatron distribution of Druid extends the join functionality of Apache Druid with some additional features:
- Right/Full outer join support
- Support for join algorithms other than hash join, such as sort-merge join
For more details, please refer to the previous blog post “Running TPC-H using SQL in Druid”.
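To make the difference between the two join algorithms named above more concrete, here is a minimal textbook sketch of an inner hash join and an inner sort-merge join over in-memory rows. It is not the Druid or Metatron implementation; the function names and the sample rows are invented for illustration.

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Inner join: build a hash table on `right`, then probe it with `left`."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [{**l, **r} for l in left for r in table.get(l[key], [])]

def sort_merge_join(left, right, key):
    """Inner join: sort both inputs by key, then merge them in a single pass."""
    left = sorted(left, key=lambda row: row[key])
    right = sorted(right, key=lambda row: row[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left key has no partner yet, advance left
        elif lk > rk:
            j += 1          # right key has no partner yet, advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row carrying the key with every row in the run.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out

# Hypothetical sample rows: a fact table with coded values and a reference table.
sales = [
    {"code": "KR", "amount": 10},
    {"code": "US", "amount": 5},
    {"code": "KR", "amount": 7},
    {"code": "JP", "amount": 1},   # no matching reference row
]
countries = [
    {"code": "US", "name": "United States"},
    {"code": "KR", "name": "Korea"},
]

hash_rows = hash_join(sales, countries, "code")
merge_rows = sort_merge_join(sales, countries, "code")
```

Both functions return the same set of joined rows, only in a different order. The hash join needs one side to fit in a hash table, while the sort-merge join relies on sorted inputs instead. A right or full outer join would additionally emit the rows that found no partner, such as the `JP` row here.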
So let’s run a SQL query against Druid datasources in Discovery’s workbench. To use Druid SQL in the workbench, you first need to create a connection for Druid. Go to Management -> Data Storage -> Data Connection and create a new Data Connection.
Create the Data Connection by entering the Druid broker URL and port, as shown in the figure. Then finish the setup by sharing the Data Connection with the desired workspace. Now go to a workspace where the Data Connection is shared, create a workbench for Druid, and execute the SQL statement.
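If you want to confirm that the broker address answers SQL before entering it in the Data Connection form, a small script can help. The sketch below assumes the broker exposes the standard Apache Druid SQL HTTP endpoint (`POST /druid/v2/sql` with a JSON body containing a `query` field); whether your distribution serves the same path is an assumption to verify. The helper names `build_sql_request` and `run_sql`, and the address in the usage comment, are hypothetical.

```python
import json
import urllib.request

def build_sql_request(broker_url, sql):
    """Build (but do not send) a POST request for the Druid SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=broker_url.rstrip("/") + "/druid/v2/sql",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def run_sql(broker_url, sql, timeout=30):
    """Send the query to the broker and return the decoded JSON response."""
    request = build_sql_request(broker_url, sql)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Usage (requires a reachable broker; the host and port are placeholders):
# rows = run_sql("http://localhost:8082", "SELECT 1")
```

Building the request separately from sending it keeps the URL and payload easy to inspect without a running cluster.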