ClapDB

Data API: New era protocol for cloud native service

Leo

Most databases and data analysis tools expose SQL or a similar query interface over their own wire protocols, but many modern databases now prefer Data APIs over HTTP. Let’s look at the advantages and disadvantages of Data APIs.

Choosing Data APIs over HTTP versus traditional, legacy protocols over TCP (Transmission Control Protocol) involves several considerations. HTTP (Hypertext Transfer Protocol), which operates on top of TCP/IP, has become the foundation of data exchange on the Web due to its simplicity, flexibility, and widespread support. Here’s why Data APIs over HTTP are often favored over older TCP-based protocols:

Standardization and Compatibility

HTTP is universally supported and standardized, making it compatible with a wide range of devices, platforms, and programming languages. This universal support facilitates easier integration and interoperability between different systems and services.

Legacy TCP Protocols may require specific implementations or custom-built solutions for each new system or application, increasing complexity and limiting interoperability.

Stateless Protocol with Session Control

HTTP is inherently stateless, meaning each request from a client to server is independent. This simplifies the server design as it doesn’t need to maintain session state. However, HTTP allows for session control through cookies, headers, and other mechanisms when needed.

Legacy TCP Protocols often manage state at the protocol level, which can increase complexity and resource consumption on the server side, especially with long-lived connections.
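
To make the stateless idea concrete, here is a minimal sketch of a client in which every request carries its own credentials and query. No session exists, so consecutive requests can go to different servers. The replica URLs, the `/query` path, the JSON body shape, and the bearer token are all assumptions made for illustration, not the API of any particular product.

```python
import itertools
import json
import urllib.request

# Hypothetical replica addresses; with no session state, any of them
# can serve any request.
REPLICAS = ["https://node-a.example.com", "https://node-b.example.com"]
_next_replica = itertools.cycle(REPLICAS)


def build_query_request(sql: str, token: str) -> urllib.request.Request:
    """Build a self-contained request: credentials and query travel together."""
    base_url = next(_next_replica)  # nothing pins the client to one server
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",  # hypothetical endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Two independent requests; the first targets node-a, the second node-b.
first = build_query_request("SELECT 1", token="example-token")
second = build_query_request("SELECT 2", token="example-token")
```

The requests are only built here, not sent. Passing one to `urllib.request.urlopen` would send it, and if a replica went away the same request could simply be rebuilt for another one.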

Security

HTTP can be easily secured with HTTPS (HTTP Secure), which integrates TLS (Transport Layer Security) encryption, ensuring data confidentiality and integrity between client and server. HTTPS is widely adopted and supported.

Legacy TCP Protocols may require additional layers of security to be implemented, which can be more complex and less standardized than HTTPS.
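
As a rough illustration of how little code HTTPS needs on the client side, the sketch below uses only Python's standard library. The `fetch` helper and its refusal of plain `http://` URLs are assumptions of this example. Certificate and hostname checks come from the defaults of `ssl.create_default_context`.

```python
import ssl
import urllib.request


def make_tls_context() -> ssl.SSLContext:
    """Return a client context that verifies certificates and hostnames."""
    context = ssl.create_default_context()
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def fetch(url: str, context: ssl.SSLContext) -> bytes:
    """Fetch a URL over HTTPS only (hypothetical helper for this sketch)."""
    if not url.startswith("https://"):
        raise ValueError("refusing to send data over an unencrypted connection")
    with urllib.request.urlopen(url, context=context, timeout=10) as response:
        return response.read()


context = make_tls_context()
```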

Ease of Use and Development

HTTP-based APIs are generally easier to develop, debug, and maintain due to the availability of numerous tools, libraries, and frameworks that support HTTP. Modern development practices and microservices architectures often natively support HTTP APIs.

Legacy TCP Protocols may require more specialized knowledge to implement and debug, and tooling may be less available or more complex.

Data Format

Data APIs designed to work over HTTP/HTTPS naturally fit into the web ecosystem, making it easier to integrate with web applications, cloud services, and other APIs. NDJSON and CSV are text-based formats that are easy to transmit over HTTP, compress quickly with zstd or other compression algorithms, and can be parsed very quickly.

Legacy TCP Protocols have to design and implement their own custom binary format, and it is very hard for a single project to do better than the combined effort the whole industry has put into standard formats.
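
The following sketch shows an NDJSON round trip: rows are written one JSON object per line, compressed, then decompressed and parsed line by line. To keep the example dependency-free, `gzip` from the standard library stands in for zstd here. The sample rows and the helper names are made up for illustration.

```python
import gzip
import json


def encode_ndjson(rows: list) -> bytes:
    """Serialize rows as NDJSON: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows).encode("utf-8")


def decode_ndjson(payload: bytes) -> list:
    """Parse NDJSON line by line, skipping blank lines."""
    lines = payload.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


rows = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Tokyo"}]

# What would travel in the HTTP body (gzip used as a stand-in for zstd).
wire = gzip.compress(encode_ndjson(rows))

# The receiving side reverses the two steps.
decoded = decode_ndjson(gzip.decompress(wire))
```

Because each line is a complete JSON document, a receiver can start parsing before the whole response has arrived, which is one reason the format suits streaming over HTTP.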

Performance and Scalability

While HTTP/1.1 introduced some overhead compared to raw TCP connections (due to headers and the text-based format), the evolution to HTTP/2 and HTTP/3 has significantly improved efficiency, latency, and concurrency, reducing the gap in performance while retaining the benefits of HTTP.

Legacy TCP Protocols can be efficient for specific use cases, especially where minimal overhead is crucial, but they lack the modern enhancements found in HTTP/2 and HTTP/3, such as header compression and multiplexing.

Caching and Intermediaries

HTTP supports caching mechanisms both at the client and network levels (via proxies), which can significantly reduce server load and improve data retrieval times.
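
One of these caching mechanisms is the conditional request: the server labels a response with an `ETag`, and a client or proxy that already holds that version sends the tag back in `If-None-Match` and receives an empty `304 Not Modified`. Below is a simplified sketch of the server-side decision. It assumes a single strong tag compared by exact match, and the hash-based tag and the 60-second `max-age` are arbitrary choices for this example.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, dict, bytes]:
    """Return (status, headers, payload) for a cacheable result."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # The caller already has this exact version: send no payload.
        return 304, headers, b""
    return 200, headers, body


result = b'{"count": 42}\n'

# First request: full response, including the tag.
status_first, headers_first, payload_first = respond(result)

# Repeat request that presents the tag: empty 304 response.
status_repeat, _, payload_repeat = respond(result, headers_first["ETag"])
```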

Legacy TCP Protocols generally do not have standardized caching mechanisms, which could lead to increased bandwidth and server load.

In summary, while both HTTP and TCP have their place in network communications, the use of Data APIs over HTTP provides a more standardized, flexible, and secure approach that aligns well with modern web technologies and development practices. This makes it a preferred choice for many applications, especially those requiring broad compatibility and ease of development.

Summary

Cloud-native applications should inherently be distributed and highly available, but protocols based on TCP’s session mechanism naturally hinder applications from achieving distribution and high availability.

An HTTP-based Data API protocol is the best choice for modern cloud-native architectures because it is stateless.
