Ingesting
Where the data comes from
ClapDB is a data warehouse, which means it is a central repository for all the data that your organization collects. This data can come from a variety of sources.
In ClapDB Free, you can only add data from object storage, and currently only from AWS S3.
Ingestion throughput and latency
Throughput
As a true cloud-native data warehouse, ClapDB is designed to handle write volumes of any scale (more than 100 PB per day).
Latency
ClapDB ingestion writes data to a message queue, from which it is batched and written to the storage layer. This process can be parallelized and distributed across any number of Lambdas, which keeps latency very low. However, because processing requires batching a certain amount of data, the query latency for ClapDB Free / Pro is typically on the order of minutes. If you have a higher write throughput, you can configure faster processing cycles. If you upgrade to ClapDB Enterprise, you can achieve sub-second query latency.
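The batching step described above is why data becomes queryable only after a delay. The following is a minimal sketch of a generic micro-batcher, not ClapDB's actual implementation: the class name `MicroBatcher`, its parameters, and the limits used are all assumptions chosen for illustration. It shows how a buffer that flushes on either a record count or an age limit trades latency against batch size.

```python
import time
from typing import Callable, List, Optional


class MicroBatcher:
    """Hypothetical illustration: buffer records, flush when a limit is hit."""

    def __init__(
        self,
        flush: Callable[[List[str]], None],
        max_records: int = 1000,
        max_age_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush              # called with each completed batch
        self._max_records = max_records  # size limit per batch
        self._max_age = max_age_seconds  # age limit of the oldest record
        self._clock = clock
        self._buffer: List[str] = []
        self._first_at: Optional[float] = None

    def add(self, record: str) -> None:
        now = self._clock()
        if self._first_at is None:
            self._first_at = now         # remember when this batch started
        self._buffer.append(record)
        # The age limit is only checked when a record arrives (no timer).
        too_big = len(self._buffer) >= self._max_records
        too_old = now - self._first_at >= self._max_age
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._flush(self._buffer)
            self._buffer = []            # start a fresh batch
        self._first_at = None


if __name__ == "__main__":
    batches: List[List[str]] = []
    batcher = MicroBatcher(batches.append, max_records=3)
    for i in range(7):
        batcher.add(f"row-{i}")
    batcher.flush()                      # flush the partial final batch
    print([len(b) for b in batches])
```

In this sketch, a lower `max_age_seconds` or `max_records` corresponds to a faster processing cycle: records reach storage sooner, at the cost of smaller batches.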
Data formats supported
- ndjson
- csv
- tsv
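To make the three formats concrete, here is a small sketch that renders the same two records as ndjson, csv, and tsv using only the Python standard library. The sample rows and helper names are made up for illustration, and whether ClapDB expects a header row in csv/tsv files is an assumption; this document does not say.

```python
import csv
import io
import json
from typing import Dict, List

# Made-up sample data for illustration only.
ROWS: List[Dict[str, object]] = [
    {"id": 1, "name": "alice", "note": "hello, world"},
    {"id": 2, "name": "bob", "note": "plain"},
]


def to_ndjson(rows: List[Dict[str, object]]) -> str:
    """One JSON object per line, each terminated by a newline."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def to_delimited(rows: List[Dict[str, object]], delimiter: str) -> str:
    """Header row plus one line per record; ',' gives csv, '\\t' gives tsv."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=list(rows[0]),
        delimiter=delimiter,
        lineterminator="\n",
    )
    writer.writeheader()  # assumption: a header row is wanted
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_ndjson(ROWS))
    print(to_delimited(ROWS, ","))   # csv: fields containing ',' get quoted
    print(to_delimited(ROWS, "\t"))  # tsv: the same field needs no quoting
```

Note that the value `hello, world` is quoted in the csv output because it contains the delimiter, while the tsv output leaves it unquoted.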
Use an insert select SQL statement to add data to ClapDB.
To insert select from S3, use the s3() table function. See s3 table function.
How to create a table
Before ingesting, make sure the destination table has been created.
ClapDB’s DDL data is based on JSON, and you can use the HTTP POST and GET methods to access it.
We suggest using clapctl to generate the correct HTTP POST data.
will generate the following output
See ClapDB DDL for more details.
HTTP
For now, the HTTP interface can post only one line per request to ClapDB.
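Because each HTTP request carries a single line, a client holding an ndjson payload has to split it and issue one POST per record. The sketch below only builds the requests and does not send them. The endpoint `https://clapdb.example.com/ingest`, the `Content-Type` header, and the function name are all hypothetical placeholders; the real URL, headers, and any authentication are not given in this document.

```python
import urllib.request
from typing import List

# Hypothetical endpoint: replace with the real ingest URL for your deployment.
INGEST_URL = "https://clapdb.example.com/ingest"


def build_line_requests(
    ndjson_text: str, url: str = INGEST_URL
) -> List[urllib.request.Request]:
    """Turn an ndjson payload into one POST request per non-empty line."""
    requests: List[urllib.request.Request] = []
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        requests.append(
            urllib.request.Request(
                url,
                data=line.encode("utf-8"),
                # Assumed header; the required content type is not documented here.
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return requests


if __name__ == "__main__":
    payload = '{"id": 1}\n{"id": 2}\n'
    for req in build_line_requests(payload):
        print(req.get_method(), req.full_url, req.data)
```

Each request could then be sent with `urllib.request.urlopen(req)`. For anything beyond small volumes, the insert select path from S3 is likely a better fit than posting one line at a time.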
SQL client
ClapDB provides a SQL client for accessing data, and you can also use it to ingest data into ClapDB.
Ingesting data through the SQL client is supported in ClapDB Enterprise.
just like below:
Auto ingest from other data sources
ClapDB Enterprise will support automatic import from other data sources:
- OLTP databases, like MySQL, PostgreSQL, Oracle, SQL Server, etc.
- Online database or table services, like Google Sheets, Airtable, etc.
- SaaS services, like Salesforce, HubSpot, Google Analytics, etc.
- Open source software, like Kafka, Redis, etc.