Design Ads Logging System

Requirements

Functional requirements

  • Build a generic ads logging system

  • Advertiser: provides ads and wants users to convert

  • Publisher: displays ads

  • Logged data influences the ranking of the ads

  • Log publisher page information and end-user information

  • Calculate the counts for each ad by event type (click, display, etc.)

  • Ad events: display, click, conversion

Non-functional requirements

  • Scalability

  • Low latency

  • High availability


High-level Architecture


Detailed Design

Ads Publish Service

  • users can publish new ads

  • each new-ad request is sent to the Ads Service

  • ad metadata is stored in a SQL database

Ads Table

| adId (PK) | userID | description | created_time |
|-----------|--------|-------------|--------------|
| 123485    | 4541   | "new ad"    | 456487       |
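
As a minimal sketch of how the Ads table could be created and written to, the snippet below uses Python's built-in sqlite3 module as a stand-in for the SQL database. The column types and the helper names `create_ads_table`, `publish_ad`, and `get_ad` are assumptions for illustration; the notes only give the column names and one sample row.

```python
import sqlite3

# Hypothetical schema for the Ads table; column types are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ads (
    adId         INTEGER PRIMARY KEY,
    userID       INTEGER NOT NULL,
    description  TEXT    NOT NULL,
    created_time INTEGER NOT NULL
);
"""


def create_ads_table(conn):
    """Create the ads table if it does not exist yet."""
    conn.executescript(SCHEMA)


def publish_ad(conn, ad_id, user_id, description, created_time):
    """Store the metadata of a newly published ad."""
    conn.execute(
        "INSERT INTO ads (adId, userID, description, created_time) "
        "VALUES (?, ?, ?, ?)",
        (ad_id, user_id, description, created_time),
    )
    conn.commit()


def get_ad(conn, ad_id):
    """Return the ad row as a tuple, or None if the ad is unknown."""
    cursor = conn.execute(
        "SELECT adId, userID, description, created_time "
        "FROM ads WHERE adId = ?",
        (ad_id,),
    )
    return cursor.fetchone()


# Insert and read back the sample row from the table above.
conn = sqlite3.connect(":memory:")
create_ads_table(conn)
publish_ad(conn, 123485, 4541, "new ad", 456487)
row = get_ad(conn, 123485)
```

A production deployment would likely use a server-based SQL database instead of an in-memory SQLite file, but the table shape stays the same.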


Ads Service

  • ranks ads

  • handles other ad-related operations

Search/Ads Ranking

  • ad relevance check

  • impression count

  • CTR: Click-Through Rate

  • CVR: Conversion Rate
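
The ranking signals above can be derived from the logged counts. The sketch below assumes the common definitions CTR = clicks / impressions and CVR = conversions / clicks (some teams define CVR per impression instead). The `AdStats` type and the `rank_by_ctr` helper are hypothetical, and sorting by CTR alone is only a placeholder for a real ranking function.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    """Aggregated counts for one ad (hypothetical container)."""
    ad_id: int
    impressions: int
    clicks: int
    conversions: int


def ctr(stats: AdStats) -> float:
    """Click-through rate: clicks per impression (0.0 if never shown)."""
    return stats.clicks / stats.impressions if stats.impressions else 0.0


def cvr(stats: AdStats) -> float:
    """Conversion rate: conversions per click (0.0 if never clicked)."""
    return stats.conversions / stats.clicks if stats.clicks else 0.0


def rank_by_ctr(all_stats):
    """Order ads from highest to lowest CTR (placeholder ranking)."""
    return sorted(all_stats, key=ctr, reverse=True)


stats_a = AdStats(ad_id=1, impressions=1000, clicks=50, conversions=5)
stats_b = AdStats(ad_id=2, impressions=1000, clicks=80, conversions=4)
ranked_ids = [s.ad_id for s in rank_by_ctr([stats_a, stats_b])]
```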


Logging Service

  • we can use the lambda architecture to calculate the counts for different ads

  • the real-time processor is the speed layer: approximate counts over short windows (the latest 1 min, 5 min)

  • the batch processor is the batch layer: accurate counts over long windows (the most recent 1 day, 1 month, 1 year)
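
One way to combine the two layers at query time is sketched below: the batch view supplies the accurate count up to the last completed batch run, and the speed view adds the per-minute counts recorded since then. The function name `merged_count`, the shape of both views, and the `batch_cutoff` timestamp are assumptions for illustration, not part of the original design.

```python
def merged_count(batch_view, speed_view, ad_id, batch_cutoff):
    """Return the total count for an ad across both layers.

    batch_view:   {ad_id: count}, accurate for events before batch_cutoff
    speed_view:   {ad_id: {minute_start_timestamp: count}}
    batch_cutoff: timestamp up to which the batch layer has processed data
    """
    accurate = batch_view.get(ad_id, 0)
    # Only add real-time buckets the batch layer has not covered yet,
    # so that no event is counted twice.
    recent = sum(
        count
        for minute_start, count in speed_view.get(ad_id, {}).items()
        if minute_start >= batch_cutoff
    )
    return accurate + recent


batch_view = {1254: 100}
speed_view = {1254: {60: 3, 120: 4, 180: 5}}
total = merged_count(batch_view, speed_view, ad_id=1254, batch_cutoff=120)
```

Once the next batch run finishes, the cutoff moves forward and the older real-time buckets can be discarded.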

Message Request

{
    eventId: 1254, // identifies a unique ad
    advertiserId: 15464,
    userId: 676467,
    eventType: "click",
    timestamp: 45613854,
}
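
A consumer would typically parse and validate this message before counting it. The sketch below assumes the message arrives as a JSON string with quoted keys and that the only valid event types are the three ad events listed in the requirements. `AdEvent` and `parse_ad_event` are hypothetical names.

```python
import json
from dataclasses import dataclass, fields

# Event types taken from the functional requirements.
VALID_EVENT_TYPES = {"display", "click", "conversion"}


@dataclass(frozen=True)
class AdEvent:
    """Typed view of one logged ad event (field names match the message)."""
    eventId: int
    advertiserId: int
    userId: int
    eventType: str
    timestamp: int


def parse_ad_event(raw: str) -> AdEvent:
    """Parse a JSON message; raise KeyError or ValueError if it is malformed."""
    data = json.loads(raw)
    # A missing field raises KeyError here.
    event = AdEvent(**{f.name: data[f.name] for f in fields(AdEvent)})
    if event.eventType not in VALID_EVENT_TYPES:
        raise ValueError(f"unknown eventType: {event.eventType!r}")
    return event


raw_message = json.dumps({
    "eventId": 1254,
    "advertiserId": 15464,
    "userId": 676467,
    "eventType": "click",
    "timestamp": 45613854,
})
event = parse_ad_event(raw_message)
```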

Real-time Processor

  • it contains multiple consumers that poll messages from the queue

  • impressions are calculated by the stream processor; impression + 1 if it is the user

  • events are sharded by eventId or advertiserId

  • with sharding by eventId, a single consumer handles all events for the same eventId

  • the workload is write-heavy, so we can use Cassandra (append-only) or Elasticsearch to store the results

  • in the processor, we can also build an in-memory hash table

    • use an LSM (log-structured merge-tree) as the data structure
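
The sharding and in-memory counting steps above can be sketched as follows. This assumes timestamps are in seconds and events are plain dicts shaped like the message request. `shard_for` and `WindowedCounter` are hypothetical names, and the counter covers only the in-memory hash table, not the flush and compaction steps of a full LSM tree.

```python
import hashlib
from collections import defaultdict


def shard_for(event_id: int, num_shards: int) -> int:
    """Map an eventId to a shard so one consumer sees all of its events."""
    digest = hashlib.md5(str(event_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class WindowedCounter:
    """In-memory hash table of per-bucket counts (1-minute buckets by default)."""

    def __init__(self, bucket_seconds: int = 60):
        self.bucket_seconds = bucket_seconds
        # (eventId, eventType, bucket_start) -> count
        self.buckets = defaultdict(int)

    def add(self, event: dict) -> None:
        """Count one event in the bucket that contains its timestamp."""
        ts = event["timestamp"]
        bucket_start = ts - ts % self.bucket_seconds
        self.buckets[(event["eventId"], event["eventType"], bucket_start)] += 1

    def count(self, event_id, event_type, now, window_seconds) -> int:
        """Sum buckets starting in (now - window_seconds, now].

        The result is bucket-granular, so it is approximate at the window edge.
        """
        start = now - window_seconds
        return sum(
            c
            for (eid, etype, bucket_start), c in self.buckets.items()
            if eid == event_id and etype == event_type and start < bucket_start <= now
        )


counter = WindowedCounter()
for ts in (0, 30, 61, 130):
    counter.add({"eventId": 1254, "eventType": "click", "timestamp": ts})
last_minute = counter.count(1254, "click", now=130, window_seconds=60)
last_five_minutes = counter.count(1254, "click", now=130, window_seconds=300)
```

Hashing the key rather than using it directly spreads sequential ids evenly across shards. The trade-off is that changing `num_shards` remaps most keys, which consistent hashing would reduce.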

Batch Processor

  • logs all the data

  • runs a MapReduce job to produce accurate counts

  • mainly used for data correction and monitoring
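
The counting job can be illustrated with a single-process imitation of the map, shuffle, and reduce phases. This is only a sketch of the idea under the assumption that logged events are dicts shaped like the message request. A real job would run on a distributed framework, and `map_event`, `reduce_counts`, and `run_count_job` are hypothetical names.

```python
from collections import defaultdict


def map_event(event):
    """Map phase: emit ((eventId, eventType), 1) for each logged event."""
    yield (event["eventId"], event["eventType"]), 1


def reduce_counts(key, values):
    """Reduce phase: sum all the ones emitted for a key."""
    return key, sum(values)


def run_count_job(events):
    """Run map, group by key (shuffle), then reduce over the full log."""
    grouped = defaultdict(list)
    for event in events:
        for key, value in map_event(event):
            grouped[key].append(value)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


logged_events = [
    {"eventId": 1254, "eventType": "display"},
    {"eventId": 1254, "eventType": "display"},
    {"eventId": 1254, "eventType": "click"},
    {"eventId": 9001, "eventType": "display"},
]
accurate_counts = run_count_job(logged_events)
```

Comparing these counts with the real-time results for the same period is one way to use the batch layer for data correction and monitoring.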

Last updated 3 years ago
