
Communication

Last updated 4 years ago

Hypertext transfer protocol (HTTP)

HTTP is a method for encoding and transporting data between a client and a server. It is a request/response protocol: clients issue requests and servers issue responses with relevant content and completion status info about the request. HTTP is self-contained, allowing requests and responses to flow through many intermediate routers and servers that perform load balancing, caching, encryption, and compression.

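A basic HTTP request consists of a verb (method) and a resource (endpoint). Common HTTP verbs include GET (reads a resource), POST (creates a resource or triggers a process that handles data), PUT (creates or replaces a resource), PATCH (partially updates a resource), and DELETE (deletes a resource).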

HTTP is an application layer protocol relying on lower-level protocols such as TCP and UDP.
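To make the request/response shape concrete, here is a minimal Python sketch that serializes an HTTP/1.1 request and splits a raw response into its parts. The function names `build_request` and `parse_response` and the host `example.com` are assumptions made for this illustration; a real client would normally use an HTTP library rather than hand-built messages.

```python
def build_request(method, path, host, body=""):
    """Serialize a minimal HTTP/1.1 request: request line, headers, blank line, body."""
    payload = body.encode("utf-8")
    lines = [
        f"{method} {path} HTTP/1.1",  # verb + resource + protocol version
        f"Host: {host}",
        f"Content-Length: {len(payload)}",
        "Connection: close",
    ]
    head = "\r\n".join(lines) + "\r\n\r\n"  # a blank line ends the header section
    return head.encode("ascii") + payload


def parse_response(raw):
    """Split a raw HTTP/1.1 response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.1 200 OK
    _version, code, *_reason = status_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body
```

Because each message carries its own verb, resource, headers, and status, an intermediary can inspect or cache it without knowing anything about earlier messages.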

Source(s) and further reading: HTTP

Transmission control protocol (TCP)

TCP is a connection-oriented protocol over an IP network. A connection is established and terminated using a handshake. All packets sent are guaranteed to reach the destination in the original order and without corruption.

TCP is useful for applications that require high reliability but are less time critical. Some examples include web servers, database traffic, SMTP, FTP, and SSH.

Use TCP over UDP when:

  • You need all of the data to arrive intact

  • You want the protocol to automatically make a best-estimate use of the available network throughput
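As a rough illustration of the connection-oriented behavior described above, the following Python sketch runs a one-shot echo server on the loopback interface and sends it a message over TCP. The function names, the loopback address, and the 4096-byte read size are choices made for this example only, and it assumes the message is small enough to fit in the socket buffers.

```python
import socket
import threading


def start_echo_server():
    """Start a one-shot TCP echo server on an OS-chosen loopback port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    address = server.getsockname()

    def serve_once():
        conn, _addr = server.accept()  # returns once the handshake has completed
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:  # empty read: the client closed its sending side
                    break
                conn.sendall(chunk)
        server.close()

    thread = threading.Thread(target=serve_once, daemon=True)
    thread.start()
    return address, thread


def echo_over_tcp(message):
    """Send message over TCP and return the bytes echoed back."""
    address, thread = start_echo_server()
    with socket.create_connection(address, timeout=5) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # signal end of stream to the server
        received = bytearray()
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            received.extend(chunk)
    thread.join(timeout=5)
    return bytes(received)
```

The application never numbers, acknowledges, or reorders anything itself: it writes bytes to the stream and reads the same bytes back in the same order.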

User datagram protocol (UDP)

UDP is connectionless. Datagrams (analogous to packets) are guaranteed only at the datagram level. Datagrams might reach their destination out of order or not at all. UDP does not support congestion control. Without the guarantees that TCP supports, UDP is generally more efficient.

UDP is less reliable but works well in real-time use cases such as VoIP, video chat, streaming, and real-time multiplayer games.

Use UDP over TCP when:

  • You need the lowest latency

  • Late data is worse than loss of data

  • You want to implement your own error correction
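For comparison, a minimal Python sketch of connectionless delivery over loopback is shown below. The function name `send_datagrams`, the one-second timeout, and the 2048-byte buffer are assumptions for this illustration. There is no handshake and no acknowledgement, so the receiver cannot tell whether a missing datagram is late or lost and simply stops waiting.

```python
import socket


def send_datagrams(messages):
    """Send each message as one UDP datagram over loopback and collect what arrives."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(1.0)  # UDP gives no delivery signal, so bound the wait
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    try:
        for message in messages:
            # No connection setup: sendto hands the datagram to the network and returns.
            sender.sendto(message, receiver.getsockname())
        for _ in messages:
            try:
                data, _addr = receiver.recvfrom(2048)
            except socket.timeout:
                break  # a datagram went missing; UDP will not retransmit it
            received.append(data)  # each read returns exactly one whole datagram
    finally:
        sender.close()
        receiver.close()
    return received
```

On loopback all datagrams will usually arrive, but the code deliberately does not rely on that. Any ordering, retry, or error-correction scheme would have to be built on top by the application.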

Source(s) and further reading: TCP and UDP

Remote procedure call (RPC)

RPC is a request-response protocol:

  • Client program - Calls the client stub procedure. The parameters are pushed onto the stack like a local procedure call.

  • Client stub procedure - Marshals (packs) procedure id and arguments into a request message.

  • Client communication module - OS sends the message from the client to the server.

  • Server communication module - OS passes the incoming packets to the server stub procedure.

  • Server stub procedure - Unmarshals the request message, calls the server procedure matching the procedure id, and passes the given arguments.

  • The server response repeats the steps above in reverse order.
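The steps above can be sketched in a few lines of Python. This is an illustrative toy rather than a real RPC framework: JSON stands in for the wire format, the `transport` function stands in for the client and server communication modules, and the procedure names `add` and `greet` are hypothetical.

```python
import json

# Server side: procedures registered under a procedure id (hypothetical names).
PROCEDURES = {
    "add": lambda a, b: a + b,
    "greet": lambda name: f"hello {name}",
}


def server_stub(request_message):
    """Unmarshal the request, call the matching procedure, marshal the reply."""
    request = json.loads(request_message)
    procedure = PROCEDURES.get(request["procedure"])
    if procedure is None:
        reply = {"error": f"unknown procedure {request['procedure']!r}"}
    else:
        reply = {"result": procedure(*request["args"])}
    return json.dumps(reply).encode("utf-8")


def transport(request_message):
    """Stand-in for the communication modules; a real one would cross the network."""
    return server_stub(request_message)


def client_stub(procedure_id, *args):
    """Marshal the procedure id and arguments, send them, unmarshal the reply."""
    request_message = json.dumps(
        {"procedure": procedure_id, "args": list(args)}
    ).encode("utf-8")
    reply = json.loads(transport(request_message))
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]
```

From the caller's point of view, `client_stub("add", 2, 3)` reads like a local call, even though the arguments were packed into a message and dispatched by id on the other side.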

Sample RPC calls:

GET /someoperation?data=anId

POST /anotheroperation
{
  "data": "anId",
  "anotherdata": "another value"
}

RPC is focused on exposing behaviors. RPCs are often used for performance reasons with internal communications, as you can hand-craft native calls to better fit your use cases.

Choose a native library (aka SDK) when:

  • You know your target platform.

  • You want to control how your "logic" is accessed.

  • You want to control how errors are surfaced by your library.

  • Performance and end user experience are your primary concerns.

HTTP APIs following REST tend to be used more often for public APIs.

Disadvantage(s): RPC

  • RPC clients become tightly coupled to the service implementation.

  • A new API must be defined for every new operation or use case.

  • It can be difficult to debug RPC.

  • You might not be able to leverage existing technologies out of the box. For example, it might require additional effort to ensure RPC calls are properly cached on caching servers such as Squid.

Representational state transfer (REST)

REST is an architectural style enforcing a client/server model where the client acts on a set of resources managed by the server. The server provides a representation of resources and actions that can either manipulate or get a new representation of resources. All communication must be stateless and cacheable.

There are four qualities of a RESTful interface:

  • Identify resources (URI in HTTP) - use the same URI regardless of any operation.

  • Change with representations (Verbs in HTTP) - use verbs, headers, and body.

  • Self-descriptive error message (status response in HTTP) - Use status codes, don't reinvent the wheel.

Sample REST calls:

GET /someresources/anId

PUT /someresources/anId
{"anotherdata": "another value"}
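A minimal Python sketch of a server handling calls like the ones above might look as follows. The in-memory `RESOURCES` store, the `handle` function, and the specific status codes chosen are assumptions for illustration; the point is that the URI identifies the resource, the verb selects the action, and the outcome is reported with a standard status code.

```python
# In-memory resource store keyed by id; the contents are made up for illustration.
RESOURCES = {"anId": {"data": "some value"}}


def handle(verb, path, body=None):
    """Dispatch on (verb, URI) and return (status code, representation)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "someresources":
        return 404, {"error": "no such resource"}
    resource_id = parts[1]
    if verb == "GET":
        if resource_id not in RESOURCES:
            return 404, {"error": "no such resource"}
        return 200, RESOURCES[resource_id]
    if verb == "PUT":
        created = resource_id not in RESOURCES
        RESOURCES[resource_id] = dict(body or {})  # PUT replaces the whole resource
        return (201 if created else 200), RESOURCES[resource_id]
    if verb == "DELETE":
        if RESOURCES.pop(resource_id, None) is None:
            return 404, {"error": "no such resource"}
        return 204, None
    return 405, {"error": "method not allowed"}
```

Repeating the same PUT leaves the resource in the same state, which is why PUT is considered idempotent. The same URI is used for every operation on the resource.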

Disadvantage(s): REST

  • With REST being focused on exposing data, it might not be a good fit if resources are not naturally organized or accessed in a simple hierarchy. For example, returning all updated records from the past hour matching a particular set of events is not easily expressed as a path. With REST, it is likely to be implemented with a combination of URI path, query parameters, and possibly the request body.

  • REST typically relies on a few verbs (GET, POST, PUT, DELETE, and PATCH), which sometimes don't fit your use case. For example, moving expired documents to the archive folder might not cleanly fit within these verbs.

  • Fetching complicated resources with nested hierarchies requires multiple round trips between the client and server to render single views, e.g. fetching the content of a blog entry and the comments on that entry. For mobile applications operating in variable network conditions, these multiple round trips are highly undesirable.

  • Over time, more fields might be added to an API response, and older clients will receive all new data fields, even those they do not need. This bloats the payload size and leads to higher latencies.

RPC and REST calls comparison

Source(s) and further reading: REST and RPC

TCP's ordering and integrity guarantees rely on:

  • Sequence numbers and checksum fields for each packet

  • Acknowledgement packets and automatic retransmission

If the sender does not receive a correct response, it will resend the packets. If there are multiple timeouts, the connection is dropped. TCP also implements flow control and congestion control. These guarantees cause delays and generally result in less efficient transmission than UDP.
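The role of sequence numbers and checksums can be shown with a small Python simulation. This is a simplified sketch, not how TCP is implemented: real TCP numbers bytes rather than segments and uses a 16-bit checksum, whereas this example numbers whole segments and uses CRC-32 from `zlib`. The names `make_segments` and `receive` are hypothetical.

```python
import zlib


def make_segments(data, size):
    """Split data into (sequence number, checksum, payload) segments."""
    return [
        (seq, zlib.crc32(data[start:start + size]), data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def receive(segments, expected_count):
    """Return (accepted payloads joined in sequence order, sequence numbers to resend)."""
    accepted = {}
    for seq, checksum, payload in segments:
        if zlib.crc32(payload) != checksum:
            continue  # corrupted: drop it and leave it unacknowledged
        accepted[seq] = payload  # a duplicate simply overwrites the earlier copy
    # Anything never accepted is what the sender must retransmit.
    missing = [seq for seq in range(expected_count) if seq not in accepted]
    ordered = b"".join(accepted[seq] for seq in sorted(accepted))
    return ordered, missing
```

Segments may arrive in any order or be damaged in transit. The receiver restores the order from the sequence numbers, discards anything whose checksum does not match, and the list of missing sequence numbers tells the sender what to resend.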

To ensure high throughput, web servers can keep a large number of TCP connections open, resulting in high memory usage. It can be expensive to have a large number of open connections between web server threads and, say, a memcached server. Connection pooling can help, in addition to switching to UDP where applicable.
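One way to picture connection pooling is the Python sketch below, which caps the number of open connections and reuses them instead of opening one per request. The `ConnectionPool` class and its `connect` factory argument are hypothetical names for this illustration; production pools also handle broken connections, health checks, and idle timeouts, which are omitted here.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Keep at most max_size connections open and hand them out for reuse."""

    def __init__(self, connect, max_size):
        self._connect = connect  # hypothetical factory that opens one connection
        self._idle = queue.LifoQueue(maxsize=max_size)
        for _ in range(max_size):
            self._idle.put(None)  # placeholder slot: opened lazily on first use

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while every slot is in use
        try:
            if conn is None:
                conn = self._connect()  # open a connection only when a slot needs one
            yield conn
        finally:
            self._idle.put(conn)  # return it for reuse instead of closing it
```

A LIFO queue is used so that the most recently returned connection is handed out first, which keeps the number of connections actually opened as low as the workload allows.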

UDP can broadcast, sending datagrams to all devices on the subnet. This is useful with DHCP because the client has not yet received an IP address, and TCP cannot establish a connection without one.

In an RPC, a client causes a procedure to execute in a different address space, usually on a remote server. The procedure is coded as if it were a local procedure call, abstracting away the details of how to communicate with the server from the client program. Remote calls are usually slower and less reliable than local calls, so it is helpful to distinguish RPC calls from local calls. Popular RPC frameworks include Protobuf, Thrift, and Avro.

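The fourth quality of a RESTful interface is HATEOAS (Hypermedia as the Engine of Application State, the HTML interface for HTTP): your web service should be fully accessible in a browser.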

REST is focused on exposing data. It minimizes the coupling between client and server and is often used for public HTTP APIs. REST uses a more generic and uniform method of exposing resources through URIs, representation through headers, and actions through verbs such as GET, POST, PUT, DELETE, and PATCH. Being stateless, REST is great for horizontal scaling and partitioning.

Links referenced on this page, grouped by section:

HTTP

  • What is HTTP?
  • Difference between HTTP and TCP
  • Difference between PUT and PATCH

TCP and UDP

  • checksum fields
  • Acknowledgement: https://en.wikipedia.org/wiki/Acknowledgement_(data_networks
  • Flow control: https://en.wikipedia.org/wiki/Flow_control_(data
  • congestion control
  • memcached
  • Connection pooling
  • DHCP
  • Networking for game programming
  • Key differences between TCP and UDP protocols
  • Difference between TCP and UDP
  • Transmission control protocol
  • User datagram protocol
  • Scaling memcache at Facebook

REST and RPC

  • Protobuf
  • Thrift
  • Avro
  • HATEOAS
  • headers
  • Do you really know why you prefer REST over RPC
  • When are RPC-ish approaches more appropriate than REST?
  • REST vs JSON-RPC
  • Debunking the myths of RPC and REST
  • What are the drawbacks of using REST
  • Crack the system design interview
  • Why REST for internal use and not RPC