Real-Time Data Processing for ML Feature Engineering

At Meta, we have developed multiple real-time data processing systems, such as Puma, Stylus, and Turbine (SIGMOD ’16 and ICDE ’20). As Meta has grown, the need for real-time data has grown well beyond traditional data analytics and reporting scenarios. Recently, ML data engineering has become an increasingly strong driving force. Real-time data is no longer only examined occasionally by humans; it also powers ML-based systems so that they always have the freshest knowledge and make higher-quality predictions. We will talk about the architecture of our latest-generation, consolidated real-time data processing platform and how we are evolving it for real-time ML feature engineering.
