Reducing Under-Delivery and Over-Delivery of Ads at Zomato

My first team at Zomato was the AdTech team. Zomato lets merchants run ad campaigns on Zomato's app.

So, first things first: merchants here are restaurant owners, and chains are groups of merchants under the same name (e.g., Domino's, McDonald's).
These merchants need to promote their restaurants across various verticals.

So they buy the number of clicks and impressions they want within a given period of time; this is called a campaign.

An impression is counted when a user merely sees an advertisement; a click is counted when the user actually clicks on it.

Example: Dominos Connaught Place runs a campaign of 30k clicks and 100k impressions, and x other restaurants run similar campaigns. An amount of money y is charged to the merchant based on the number of impressions and clicks, the time period, the location, and the time of year. If the campaign isn't fulfilled, a proportional amount of money is returned to the merchant; let's call it z.
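As a rough sketch of the refund z described above (the actual pricing logic is Zomato-internal; the simple pro-rata rule here is an assumption for illustration):

```python
def proportional_refund(charged: float, target: int, delivered: int) -> float:
    """Refund the fraction of the charge corresponding to undelivered units.

    Hypothetical helper: assumes a simple linear (pro-rata) refund rule,
    purely to illustrate the z described above.
    """
    if target <= 0:
        return 0.0
    delivered = min(delivered, target)  # over-delivery earns no extra charge
    undelivered_fraction = (target - delivered) / target
    return charged * undelivered_fraction

# A campaign bought 30k clicks for y = 60,000 (currency units) but delivered 27k:
z = proportional_refund(charged=60_000, target=30_000, delivered=27_000)
print(z)  # 6000.0
```

Note that z is exactly what the positioning algorithms try to drive to zero, as the next paragraph describes.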

Now these impressions and clicks need to be positioned and shown to users in such a way that all the cumulative targets are achieved while minimising z (the amount that needs to be returned to the merchant).

What was the problem we were facing?

The existing pipeline for reporting aggregated clicks and impressions per campaign per time window was a batch-processing one, using Apache Spark as the distributed processing system. The job ran every 15 minutes, but during peak load it became so slow that a single run took anywhere from 1.5 to 2 hours.

This means the data being fed to the visibility and positioning algorithms is up to two hours stale, and during those hours a campaign stays live even though its clicks/impressions quota may already be fulfilled. For example: McDonald's runs a campaign for 30k clicks. Say that by 7:00 p.m. this campaign has 27k clicks; since the quota is not yet fulfilled, it remains at a boosted position. The algorithm that decides the campaign's position receives no fresh data until 9:00 p.m. and so cannot take any action. In the meantime, the campaign accumulates 35k clicks, 5k over the asked limit.
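A toy simulation makes the cost of the lag concrete (the click rate and lag are made-up numbers, not Zomato's): the pacing logic below pauses a campaign only when the *reported* count reaches the quota, and the report trails reality.

```python
def simulate(quota: int, clicks_per_minute: int, report_lag_minutes: int) -> int:
    """Serve a campaign that is paused only when the *reported* click count
    reaches quota; reports lag behind reality by report_lag_minutes.
    Returns the number of clicks actually delivered."""
    actual = 0
    minute = 0
    while True:
        reported = max(0, minute - report_lag_minutes) * clicks_per_minute
        if reported >= quota:
            break  # the positioning algorithm finally sees the quota is met
        actual += clicks_per_minute
        minute += 1
    return actual

# 30k-click quota at 250 clicks/min: fresh data vs. a 2-hour reporting lag.
print(simulate(30_000, 250, 0))    # 30000 -> stops exactly at quota
print(simulate(30_000, 250, 120))  # 60000 -> 30k clicks over-delivered
```

Every minute of reporting lag converts directly into free, unbillable over-delivery.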

For every user action, an event with multiple user and action details is sent to a dedicated service.

On a daily basis, ~100 to ~150 million such events stream through our pipeline. These events are published to a particular Kafka topic to keep the pipeline asynchronous.

A worker then dumps this data into an Apache Hive table, on top of which Presto was used as a distributed query engine for the OLAP load.

The aggregation logic was written in Apache Spark; the algorithm for the computation goes like this:

There are three different notions of time in streaming programs: event time (when the event actually occurred at its source), ingestion time (when the event enters the pipeline), and processing time (the wall-clock time of the machine processing the event).

So, what we did was perform all the aggregations on event time, in contrast to the processing time that the Apache Spark job used.
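A minimal illustration of the difference (plain Python, not the actual streaming job): events are bucketed by the timestamp they carry (event time), not by when the aggregator happens to see them, so out-of-order or late-arriving events still land in the correct window.

```python
from collections import defaultdict

WINDOW_MS = 15 * 60 * 1000  # 15-minute tumbling windows

def aggregate_by_event_time(events):
    """events: iterable of (campaign_id, event_time_ms, kind) tuples,
    possibly out of order. Returns counts keyed by
    (campaign_id, window_start_ms, kind)."""
    counts = defaultdict(int)
    for campaign_id, event_time_ms, kind in events:
        # Assign the event to the window its *own* timestamp falls in.
        window_start = (event_time_ms // WINDOW_MS) * WINDOW_MS
        counts[(campaign_id, window_start, kind)] += 1
    return dict(counts)

events = [
    ("dominos_cp", 1_000_000, "click"),
    ("dominos_cp", 1_200_000, "impression"),
    # arrives last, but belongs to the earliest window:
    ("dominos_cp", 100_000, "click"),
]
print(aggregate_by_event_time(events))
```

With processing time, the late click above would have been counted in whatever window was open when it arrived, skewing the per-window totals the positioning algorithm relies on.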

How was the pipeline made entirely fault tolerant?

The fault-tolerance mechanism consistently recovers the state of the data-streaming application. It ensures that even in the presence of failures, the program's state will eventually reflect every record from the data stream exactly once. To achieve this, the mechanism continuously draws snapshots of the distributed streaming dataflow.

In case of a program failure (due to machine, network, or software failure), Flink stops the distributed streaming dataflow. The system then restarts the operators and resets them to the latest successful checkpoint, and the input streams are reset to the point of the state snapshot.

We used the RocksDB state backend for checkpointing; RocksDB is an embedded key-value store with a local instance in each task manager.
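The checkpoint-and-replay idea can be sketched in a few lines (an in-memory toy standing in for Flink's checkpoints and the RocksDB state backend): on failure, the operator state is reset to the last snapshot and the input is replayed from the snapshotted offset, so each record contributes to the state exactly once.

```python
def run_with_checkpoints(records, checkpoint_every, fail_at=None):
    """Count records with periodic (state, offset) snapshots.
    On failure, restore the snapshot and replay input from its offset."""
    state, offset = 0, 0
    snapshot = (0, 0)  # last successful checkpoint
    while offset < len(records):
        if fail_at is not None and offset == fail_at:
            state, offset = snapshot  # crash: restore last checkpoint
            fail_at = None            # recover once, then continue
            continue
        state += records[offset]
        offset += 1
        if offset % checkpoint_every == 0:
            snapshot = (state, offset)  # atomically snapshot state + position
    return state

data = [1] * 10
# With or without a mid-stream failure, each record is counted exactly once:
print(run_with_checkpoints(data, checkpoint_every=3))            # 10
print(run_with_checkpoints(data, checkpoint_every=3, fail_at=7)) # 10
```

The key invariant is that the state and the input offset are snapshotted together; restoring one without the other would yield duplicates or losses.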
