Machine learning is at the heart of Pinterest and is powered by large-scale collection of ML training logs. To solve the problem of cost-efficient data ingestion and transportation at Pinterest, we developed MemQ, a PubSub system that leverages pluggable cloud-native storage such as S3 through a decoupled, packet-based storage design. MemQ scales to GB/s of traffic with 90% higher cost efficiency than Apache Kafka, enabling us to ingest all of the ML training data that powers offline training, near-real-time model quality validation, and ad hoc analysis.