Design Metrics Aggregation System
Last updated
Was this helpful?
Last updated
Was this helpful?
Two Sigma 的 , 讲得非常好!
这个 实际上是模仿 Two Sigma 做的。
对底层原理解释得不错。
对上面的进行了总结和讲解,讲得也不错。
add instrumentation for services that runs on hundreds of services in cloud
collect stats about servers where application runs
visualize & monitor the metrics
Reliable
Scalable
Low end-to-end latency
Flexible for different schema
we can collect query-specific information like what’s shown below
by storing the data in its raw form, we are able to do exploratory analysis using dynamic “if that, then what?” questions.
Time series database: InfluxDB, graphite
Elasticsearch
best ELK bundle: elasticsearch + logstash + kibana
easy searching and analysis, and it comes with plugins, like Kibana, that make data analysis and visualization easy.
Elasticsearch wasn’t built to handle 50,000 messages per second (we tested this through internal benchmarks).
Use a buffer, which is analogous to a water tank.
You can think of a Kafka topic as a partitioned queue where the partitions can be written to and read from simultaneously.
read data from Kafka and then serve to Elasticsearch require a lot fo application code, handling edge cases and errors
existing messaging solutions for connecting Kafka and Elasticsearch
config logstash
Use kubernetes to maintain multiple logstash instances
Two sigma uses marathon since they maintain it locally
Use Write-ahead-log (WAL) to store logs
compact old files using log-structured merge tree(LSMT).
不能只用 service name 作为 sharding key,否则会有 hot entity issue。
Because most of time we query by time range to list service instances’ API methods (sample picture below), the sharding key can be “time bucket” + “service name”, where the time bucket
is a timestamp rounded to some interval.
sample query:
如果依然有 hot entity issue,we can append sequence number such as 1,2,3 … to it.
read this .