# Design Top K System

## Design Top K System

> <https://www.youtube.com/watch?v=kx-XDoPjoHw>

***

## Requirement

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-9af467c74314badf3baa93f8097645ce451676c5%2F1.png?alt=media)

***

## Single Host - Hash table/PriorityQueue

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-d363931813266d92ea905d538bab063cdf9dcb50%2F2.png?alt=media)

### Top K Algorithm Implementation

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-94ecee0bd08a715ae2c52c4fa913a2835f19716b%2F3.png?alt=media)

> However, this single host solution is not scalable with high throughput

***

## Multiple Host - Hash table

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-08caca5dc377d0708e803819eaae3343322a69b5%2F4.png?alt=media)

* Pro:
  * process in memory
  * throughput is increased
* Con:
  * hash table uses huge memory in total

### Partition and Merge

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-2413e53a780c4bb2e077cd0193b32f6f046ae5b0%2F5.png?alt=media)

* Pro:
  * increased throughput
  * reduce memory usage
* Con:
  * this solution dropped other info. e.g. 现在我们计算的是 last 1 hour 的 top K，如果我们要计算 last 1 day 呢？是不是就没办法了。

### Count-min sketch

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-f989ea793f38f7ca6b3deee4adc2f4712b0593da%2F7.png?alt=media)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-1bb5da46e88869d33250af6537400da5a402bb67%2F8.png?alt=media)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-76e8478495ae72239dfd82e4006cfe06d839d154%2F9.png?alt=media)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-eb3d7953061f1e48e59a7700b4051e3a14c53fbe%2F10.png?alt=media)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-0d9c54326c21a78f984adc3777c0ee821ee86bc8%2F11.png?alt=media)

* 一个字母对应多个 hash function，每次字母来了都更新这些 function 的值
* 因为过程中可能会有其他字母的 collision，所以我们最后找的是同一个字母对应的 minimum of all the counters

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-1574eb018300e89df20fe2f33a8825d994e54026%2F6.png?alt=media)

* Pro:
  * fixed size memory
* Con:
  * sacrifice with accuracy

***

## High-level architecture

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-0f390a6e6379b34fdc446ee25195582e801c0846%2F12.png?alt=media)

### API Gateway

* Single entry for all clients
* aggregate data on the fly or via background process that process logs. Data is flushed based on either time or size.
* Serialize data in a compact binary format(e.g. Apache Avro)

### Fast Processor

* Create count-min sketch and aggregate data for a short period fo time(seconds)
* Because memory is no longer a problem, no need to partition the data.
* Data replication is nice to have, but not be strictly required.

### Storage

* SQL, NoSQL
* Build the final count-min sketch and store a list of top k elements for a period of time
* Data replication is required

### Data Partitioner

* Parse batches of events into individual events.
* Hash partitioning(e.g. video identifier + time window)
* deal with hot partition

***

## Detailed Design

### Fast path data flow (last 1 min top k)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-a0d0bebb00a0a1c5e73139235f428af0940f8c79%2F13.png?alt=media)

### Slow path data flow (last 1 hr top k)

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-afcb5ff80e596355972a0e85592dc2f9281d38e7%2F14.png?alt=media)

#### MapReduce jobs

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-e3657ed5fdee751fd596d4516ba2e3de74a7c822%2F15.png?alt=media)

***

### Data Retrieval

![](https://2407442552-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-Lpv9LvHzpublmUWisvz%2Fuploads%2Fgit-blob-b5194a0ca7024d0f595a343d10e25a08ba393199%2F16.png?alt=media)

***

## Other Questions

> You mentioned we aggregate data on the API Gateway side. What if this is not possible due to CPU or memory constraints on the API Gateway hosts?

* No matter how busy these hosts are, there will be some hosts offload these logs from Gateway hosts.
* We can run log storage and log parsing on some cluster and rest pipeline stays the same.

> Any alternatives to count-min sketch?

* Lossy Counting, Space Saving and Sticky Sampling.

> How big is the K?

* Fast path and slow path, we both need to merge lists to get the final resuts
* several thousands should be OK, but tens of thousands may cause performance degradation.

> What are the drawbacks of this architecture?

* This is Lambda Architecture. The drawback is the complexity of the system.
