Log Events @ Twitter: Challenges of Handling Billions of Events per Minute

At Twitter, hundreds of thousands of microservices emit important events triggered by user interactions on the platform. The Data Platform team is responsible for aggregating these events by service type and generating consolidated datasets, which are made available at different storage destinations for data processing jobs and analytical queries. In this presentation we discuss the architecture behind event log pipelines that handle billions of events per minute, with data volumes of tens of petabytes per day. We discuss our challenges at scale and lay out our solution, which combines open-source and in-house software stacks, along with the resource utilization and optimizations required at that scale. Towards the end we also introduce our work to move from event log pipelines to event stream pipelines, and show a use case that leverages these event streams for real-time analytics.
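To make the aggregation step concrete, here is a minimal sketch of grouping log events by service type into consolidated, per-service datasets. It is not Twitter's actual implementation; the `LogEvent` case class, its fields, and `consolidateByService` are illustrative assumptions, and the real pipeline performs this over distributed storage at billions of events per minute.

```scala
// Illustrative sketch only: consolidate a batch of log events by service type.
// LogEvent and its fields are hypothetical stand-ins for the real event schema.
case class LogEvent(serviceType: String, timestampMs: Long, payload: String)

object EventAggregation {
  // Group a batch of events into one dataset per service type.
  def consolidateByService(events: Seq[LogEvent]): Map[String, Seq[LogEvent]] =
    events.groupBy(_.serviceType)

  def main(args: Array[String]): Unit = {
    val batch = Seq(
      LogEvent("ads", 1L, "impression"),
      LogEvent("timeline", 2L, "scroll"),
      LogEvent("ads", 3L, "click")
    )
    consolidateByService(batch).foreach { case (service, evts) =>
      println(s"$service -> ${evts.size} events")
    }
  }
}
```

In the streaming variant described at the end of the talk, the same grouping would be applied continuously over an event stream rather than over bounded batches.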
