Designing Slack
Designing Slack
Ref: https://www.youtube.com/watch?v=-UXWETzHG38
Interview Process:
10 mins: Clarify business (Use Cases)/common sense
10 mins: Constraints
10 mins: Architecture
10 mins: API Design
10 mins: DB choice
10 mins: Data model
Use Case
Feature
Domain / Group / Individual
Support direct msg / group msg
Support link preview
Support multimedia msg
Msg Notification
Highlight unread msg
MVP (discuss with the interviewer)
Direct Message (1 to 1 msg)
Channel: Group Chat
Multimedia msg
Link preview
Requirements
High available
High reliable
Consistent
High performance: real-time
Bonus
Group Membership
Msg Notification
Highlight unread msg
Non-goals
Authentication/Authorization
Mobile/Web Client Side
Constraints
One domain
(Ask Interviewer) DAU: 10M
Traffic Estimation
Peak direct messages:
每个人一天发 5 条;peak 是平时的 100 倍来计算
One message size: 100 bytes
Throughput:
可能 5 MB/s 这个结果有点太小了,不过理解计算过程就行了。
Storage Estimation
One day storage
One month storage
Group chat number
1000 members limit in a group
Scale
QPS
1000 QPS needs multiple partitions
Data
Architecture & Database
split into small microservice
Channel service
master service serve all requests from the user
Account service
manage user account info
Friends service (may not need in Slack)
manage user friendship info
Message Storage Service
store message
Message Storage Service
manage messaging search index and lookup
Group service
manage group membership info
Media service
process multi-media / link info
Notification Service
notify the users
API
Public APIs
GetChannels(user_id)
store last viewed channel
store visible channels: recent direct msg/membership of the group
show unread msg / last msg preview: denormalized data
SendMessage(sender_id, receiver_id, message_type, message, media_id)
message_type is one of [
direct message
,group message
]create_timestamp
,update_timestamp
,expired_time
,STATE
will be appended by channel service.STATE
could be one of [received
,sent
,viewed
,notification_sent
,deleted
], the multimedia message may need to be processed, therefore need some time between received and sent
updateMessage(message_id, message)
deleteMessage(message_id)
uploadFile(user_id, permission_group, file_descriptor, some_metadata)
permission_group: used to manage who can upload files
file_descriptor: a file descriptor (FD, less frequently fildes) is a unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket. More read here.
LookupEntity(user_id)
search locally for given entity
use
trie
to build a local search tree
joinGroup(user_id, group_id)
getDirectMessages(user_id, friend_id)
getGroupMessages(user_id, group_id)
CRUD for group info getGroupNotice
/ getGroupMembers
/ getGroupDetail
/ GetGroupSharedFiles
/ update
...
CRUD for user info getUserInfo
/ updateUserInfo
Internal API
Media Service
processMedia(media_id)
Replicate media
Post-processing: Generate preview / thumbnail
Distributed to CDN
Notification Service
notifyUser(user_id, sender_id, message_id, msg_preview)
Either use database or Kafka
Exactly once is difficult, requiring two-phase commits
Data Model
User Table (SQL)
Id (Shard Key) | name | status | timezone | team |
---|---|---|---|---|
uuid | yang | online | Los Angeles | safety |
status
should store here. If this field is modified often, we can pull it out to a separate table.
Friend Table (SQL)
Friend 关系存两遍,要不然找的时候很麻烦
这样 shard
user_id_1
就可以了
user_id_1 | user_id_2 | connected_date | last_view_date |
---|---|---|---|
yang_123 | bob_456 | 123455 | 56891874 |
user_id_1 | user_id_2 | connected_date | last_view_date |
---|---|---|---|
bob_456 | yang_123 | 123455 | 29642145 |
Group Membership Table (SQL)
group_id | user_id(shard key) | join_date | role | last_view_date |
---|---|---|---|---|
uuid | yang_123 | 1235678 | Admin | 1256564878 |
uuid | bob_456 | 12345679 | user | 1238168745 |
Channel Table (Redis)
Message Storage Table (DynamoDB)
PartitionKey | SortKey | msg_id | sender_id | message | creation_timestamp |
---|---|---|---|---|---|
containerId | timestamp-msg_id | uuid | bob_456 | "hello world!" | 568749321 |
Group Chat:
containerId
isgroup_id
.Direct Message:
containerId
is sorted{user_id_1, user_id_2}
Emoji Table (redis)
This table is for emoji added on a message.
Good to have(if have some time left): also list group tables, etc.
Bottleneck & scale
Infinite scale: sharding, periodically cleanup
Throttle needed for "bad" user
Design dive deep
Network Protocols
basic relationship:
so for a chat service, the choice of network protocols is important.
Polling
Cons:
expensive; consume precious resources to answer a question that offers empty response
Long Polling
holds the connection open until there are actually new messages available or timeout
Cons:
sender and receiver may not connect to the same chat server. HTTP based servers are usually stateless and the server receives the msg might not have a long-polling connection with the client who receives the msg.
inefficient if a user doesn't chat much
server doesn't have a good way to know if a client is disconnected
WebSocket (Recommended)
WebSocket is the most common solution for sending asynchronous updates from server to client.
It starts its life as a HTTP connection and could be “upgraded” via some well-defined handshake to a WebSocket connection.
Pros:
connection is bi-directional and persistent
simplifies the design and makes implementation on both client and server more straightforward
Stateless vs. Stateful
some services are stateless, like authentication service, service discovery, etc.
some services are stateful, like chat services. All users should talk on chat services
scalability
stateless 的 hosts 可以 easily scale up
对于 stateful 的怎么办呢?
chat server: send/receive msg
presence server: manage online/offline status
API server: user login service, signup, profile service. etc.
notification server: send/push notification
kv store: store chat history
DB selection
generic data,like profile, setting, friend list: relational DB
chat history data: key-value store
easy horizontal scaling
low latency to access data
indexes grow large, random access in relational DB is expensive
fb messenger: HBase; Discord: Cassandra
Data models
message table (1 to 1)
primary key: message_id
message_id | message_from | message_to | content | created_at |
---|---|---|---|---|
uuid | big_int | big_int | text | timestamp |
message table for group chat
primary key: (channel_id, message_id)
channel_id(partition key) | message_id | user_id | content | created_at |
---|---|---|---|---|
899876 | 809707908098 | 123 | "hello world" | 24353546 |
message id
must be unique and sortable
server ticket
twitter snowflake is a good approach
local sequence number generator: easier to implement; sufficient in 1-on-1 or group chat
Service Discovery
recommend best chat server for a client
use Zookeeper
User A tries to log in to the app.
The load balancer sends the login request to API servers.
After the backend authenticates the user, service discovery finds the best chat server for User A. In this example, server 2 is chosen and the server info is returned back to User A.
User A connects to chat server 2 through WebSocket
Message Workflow
1-on-1 chat
User A sends a chat message to Chat server 1.
Chat server 1 obtains a message ID from the ID generator.
Chat server 1 sends the message to the message sync queue.
The message is stored in a key-value store.
If User B is online, the message is forwarded to Chat server 2 where User B is connected.
If User B is offline, a push notification is sent from push notification (PN) servers.
Chat server 2 forwards the message to User B. There is a persistent WebSocket connection between User B and Chat server 2.
Message synchronization across multiple devices
use a local
cur_max_message_id
to track which is the latest msgthen read from db: find
message_id
which is larger thancur_max_message_id
Group chat
sending message
User A copied msg to each group member's message sync queue
each client (B, C) only needs to check its own inbox to get new msgs
when group is small, storing a copy in each client's inbox is not expensive. WeChat is using a similar approach and have a limits of group size of 500
receiving message
Online presence
Further reading: Linkedin Presence Platform
User login
after client A and chat server build WebSocket connetion, change A's online
status
andlast_active_at
timestamp
User logout
change online
status
to offline
User disconnection
heartbeat: every 5 sec, client send heartbeat to presence server;
after sending 3 heartbeats, if not reconnect with in 30 sec, label this client as offline
Online status fanout
if small group, use publish mode:
if large group, fetch online status only when a user enters a group or manually refresh the friends list
Last updated