Getting Started

Launch a serverless data platform in 5 Minutes

This page will help you start using ClapDB.

ClapDB is a serverless, multi-model data analysis platform that runs on a cloud vendor's infrastructure.

clapctl

ClapDB uses infrastructure as code (IaC) to provision resources and deploy them to cloud infrastructure. On AWS, this is CloudFormation. We provide a tool, clapctl, that handles this for you.

Getting clapctl

Terminal window
curl -fsSL https://clapdb.com/install.sh | sh

How to use clapctl

Terminal window
clapctl --help

See the clapctl page for detailed usage instructions.

Configure your AWS credentials

Make sure your credentials have AdministratorAccess permissions.

After the credentials are configured, the ~/.aws directory contains two files:

  • ~/.aws
    • config
    • credentials

The config file looks like this:

[default]
region = ap-south-1

The ~/.aws/credentials file looks like this:

[default]
aws_access_key_id = <aws_access_key_id>
aws_secret_access_key = <aws_secret_access_key>

For detailed instructions, see the official AWS documentation.
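
If you prefer to create the two files from a script, the following is a minimal sketch rather than an official procedure. The function name write_aws_profile and the AWS_DIR and AWS_REGION_NAME variables are hypothetical helpers introduced here for illustration; they are not part of clapctl or the AWS tooling. The script overwrites any existing config and credentials files in the target directory.

```shell
#!/bin/sh
# Sketch: write minimal AWS config and credentials files.
# AWS_DIR and AWS_REGION_NAME are illustrative variables, not AWS settings.
# AWS_DIR defaults to ~/.aws; point it elsewhere to try the script safely.
AWS_DIR="${AWS_DIR:-$HOME/.aws}"
AWS_REGION_NAME="${AWS_REGION_NAME:-ap-south-1}"

# $1 = access key id, $2 = secret access key
# Warning: existing files in $AWS_DIR are overwritten.
write_aws_profile() {
  mkdir -p "$AWS_DIR"
  printf '[default]\nregion = %s\n' "$AWS_REGION_NAME" > "$AWS_DIR/config"
  printf '[default]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
    "$1" "$2" > "$AWS_DIR/credentials"
  # Keep the secret readable by the current user only.
  chmod 600 "$AWS_DIR/credentials"
}
```

Call it as write_aws_profile '<aws_access_key_id>' '<aws_secret_access_key>', substituting your own values.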

Increase your AWS Quota

By default, the Lambda concurrent executions quota for an AWS account is 10. For better performance and lower latency, you can raise it to 1000.
Request the increase in the AWS Management Console, or use clapctl:

Terminal window
clapctl quota --set 1000

Once AWS approves the request, ClapDB can run with up to 1000 concurrent executions.
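
To check whether the increase has taken effect, you can also query the limit with the AWS CLI. This is a sketch under assumptions: it requires the AWS CLI to be installed and configured, the function names are hypothetical, and the quota code L-B99A9384 is believed to identify the Lambda "Concurrent executions" quota. Confirm the code with aws service-quotas list-service-quotas --service-code lambda before relying on it.

```shell
#!/bin/sh
# Assumed Service Quotas code for Lambda "Concurrent executions"; verify before use.
LAMBDA_CONCURRENCY_QUOTA_CODE="L-B99A9384"

# Print the account-level concurrent executions limit for the configured region.
current_lambda_concurrency() {
  aws lambda get-account-settings \
    --query 'AccountLimit.ConcurrentExecutions' --output text
}

# Submit a quota increase request. $1 = desired value, for example 1000.
request_lambda_concurrency() {
  aws service-quotas request-service-quota-increase \
    --service-code lambda \
    --quota-code "$LAMBDA_CONCURRENCY_QUOTA_CODE" \
    --desired-value "$1"
}
```

Running current_lambda_concurrency before and after the request shows whether AWS has applied the new limit.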

Deploy your first ClapDB instance
Terminal window
clapctl deploy -n your_cluster_name

After deploying, register your account and obtain a trial license.

Quick Start with predefined dataset

ClapDB provides several official datasets to help you get started quickly. You can import them with clapctl.

Terminal window
clapctl dataset --list
# import the dataset you want to try into your cluster
clapctl dataset --import dataset_name -n your_cluster_name

Importing a dataset takes a few minutes. The import covers both the DDL and the data ingestion, and clapctl handles both for you. Once the import finishes, you can start querying the data. To use your own dataset, see the sections below.

HTTP Protocol / Data API

ClapDB is a serverless database that currently runs on AWS Lambda. Because of Lambda's limitations, ClapDB Free and Pro are accessed over the HTTP protocol, like other serverless databases (such as Amazon DynamoDB or Amazon Aurora). To connect with a PostgreSQL client, upgrade to ClapDB Enterprise.

Create a Table

Every ClapDB cluster has a default database, local, with a schema, public.

Terminal window
clapctl -n your_cluster_name sql -s "CREATE TABLE demo_table (
log_time timestamp,
client_ip ipv4,
request text,
status_code uint16,
object_size uint64
);"

DDL Query

Get Databases and Schemas

Terminal window
clapctl -n your_cluster_name sql -s "show databases;"
clapctl -n your_cluster_name sql -s "show schemas;"

Note that the result is returned in JSON format.

Get Tables

Terminal window
clapctl -n your_cluster_name sql -s "show tables;"

Show Schema

Terminal window
clapctl -n your_cluster_name sql -s "show create table demo_table;"

Drop Table

Terminal window
clapctl -n your_cluster_name sql -s "drop table demo_table;"

Alter Table

TODO

Ingesting Data

ClapDB Free supports importing data from S3, the Pro version additionally supports the HTTP protocol, and the Enterprise version can synchronize data using SQL statements. ClapDB Enterprise also provides data import tools for various languages and protocols.

As a serverless database, ClapDB prioritizes ingesting data that is stored in S3 or delivered through SQS.

To ingest your own data, you must handle both the data schema (DDL) and the data itself (DML).

See Ingesting Data for more detail.

Query

ClapDB supports the Data API by default. For the reasoning behind this choice, see the link below.

Why Data API is best choice for cloudnative service

A Data API request looks like this:

Terminal window
curl -X 'POST' -d 'select count(*) from hdfs_logs' -H 'Authorization: Basic cm9vdC5jbGFwZGI6cDAvakInaFk=' -H 'Cache-Control: no-cache' -H 'Content-Type: application/json' -H 'X-Pset-Value: null=NULL' 'https://qjh3nsq3yl.execute-api.ap-south-1.amazonaws.com/?database=local'

To see the correct request format for your own query, run it through clapctl with --verbose:

Terminal window
clapctl sql -n demo-for-rookie --verbose -s "select count(*) from hdfs_logs"

clapctl then prints an example cURL command for the query.
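
The cURL request above can be wrapped in a small helper so the Authorization header is not assembled by hand. This is a sketch, assuming the header uses standard HTTP Basic authentication (base64 of user:password), as the example request suggests. The function names basic_auth_header and clapdb_query are hypothetical and not part of clapctl. The other headers are copied from the example request.

```shell
#!/bin/sh
# Build an HTTP Basic Authorization header. $1 = user, $2 = password
basic_auth_header() {
  # base64 may wrap long output; strip newlines so the header stays on one line.
  token=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s' "$token"
}

# Send one SQL statement to a Data API endpoint.
# $1 = endpoint URL, $2 = user, $3 = password, $4 = SQL text
clapdb_query() {
  curl -sS -X POST \
    -H "$(basic_auth_header "$2" "$3")" \
    -H 'Cache-Control: no-cache' \
    -H 'Content-Type: application/json' \
    -H 'X-Pset-Value: null=NULL' \
    -d "$4" \
    "$1"
}
```

A call would look like clapdb_query '<endpoint_url>/?database=local' '<user>' '<password>' 'select count(*) from hdfs_logs', with the endpoint and credentials taken from the command that clapctl prints for your cluster.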

SQL over TCP protocol

If you upgrade to ClapDB Enterprise, the legacy SQL protocol is also enabled, so you can use any PostgreSQL client in your favorite language.

SQL Syntax Support

The Data API uses JSON format and carries both the authorization and the SQL query. This section introduces the SQL syntax that ClapDB supports.

ClapDB supports SQL, and its syntax is compatible with PostgreSQL.

For the differences between ClapDB and PostgreSQL, see the link below.

PostgreSQL compatibility table

Relational Query Support

Relational Query Support

Time-Series Query Support

Time-series Query Support

Full-Text Query Support

Full-Text Query Support

Semi-Structure Query Support

Semi-Structure Query Support

Hint engineering for ClapDB

Hint for Query Performance