Skip to content

Optimize query performance with hint

Like most analytic databases, ClapDB uses columnar storage to speed up retrieval. However, columnar storage often underperforms for cardinality-oriented queries. Therefore, ClapDB utilizes custom cardinality-oriented indexes to accelerate analysis. The query optimizer in ClapDB automatically makes the right choices, selecting the appropriate storage medium for each query.

Hint engineering?

For a complex system like a database, its core is the query optimizer. The query optimizer formulates an optimal or near-optimal execution plan based on a declarative query. However, the query optimizer is not omniscient. Due to performance and latency reasons, it only has access to limited system information, and many choices are related to future events. If users can inform the optimizer through hints, the optimizer can make better choices.

BTW, the hint can be used for query profiling. By giving hints to the query optimizer, users can control the optimizer’s behavior.

How ClapDB do the right decision for the trade-off between columnar and index storage?

In the database’s world, a basic query includes a SELECT statement, a FROM clause, and a WHERE clause. Generally speaking, SELECT statements are very well-suited for retrieving values using columnar storage, while WHERE statements are better suited for filtering using indexes, but if a query contains both, the IO cost may be very high. The brain of a database is the optimizer, which tries to gather system conditions and historical information to find the most optimal execution plan. For ClapDB, an important function of the optimizer is to decide whether to use columnar storage or indexes.

Why ClapDB need hint?

Since system information is constantly changing and probabilistic estimates are based on randomness, the query optimizer cannot guarantee that it will select the storage type a user might prefer for a single query. Therefore, using hints can inform the optimizer of the user’s preferences, allowing it to tailor the execution plan accordingly.

How to give hints to ClapDB?

select x_int8 from T /* prefer=columnar */
select x_int8 from T /* prefer=index */

Hint future

To be honest, we do not like hints. We believe that the query optimizer should be able to make the right choices. However, the query optimizer is not omniscient, and many choices are related to future events. ClapDB will continue to support hints, and ClapDB Enterprise will provide finer hints to users for making optimizer do better choices for special cases.

Conclusion

  1. Hint is a way to inform the query optimizer to make better choices.
  2. Hint is a double-edged sword. It can make the optimizer do better choices, but it can also make the optimizer do worse choices. Up to the user.
  3. ClapDB will always support hints.