
#Merlin project training how to#
NVIDIA Merlin is an open-source framework for building large-scale deep learning recommender systems. Vertex AI is Google Cloud's unified machine learning platform, which helps data scientists and machine learning engineers increase experimentation, deploy faster, and manage models with confidence. Using NVIDIA Merlin with Vertex AI enables developers to build, train, and deploy custom end-to-end recommender systems at scale within Vertex AI's unified and fully managed MLOps platform.

The content in this repository centers around five core scenarios:

- Setting up a Merlin experimentation and development environment in Vertex AI Workbench.
- Operationalizing large-scale data preprocessing pipelines with NVIDIA Merlin NVTabular and Vertex AI Pipelines.
- Training large-scale deep learning ranking models with NVIDIA Merlin HugeCTR and Vertex AI Training.
- Deploying models and serving predictions with NVIDIA Triton Inference Server and Vertex AI Prediction.
- Implementing end-to-end data preprocessing, training, and deployment pipelines with Vertex AI Pipelines.

For detailed information about NVIDIA Merlin components and Vertex AI services, refer to the NVIDIA Merlin and Vertex AI documentation.

The dataset used by all samples in this repo is the Criteo 1TB Click Logs dataset provided by the Criteo AI Lab.

Commercial recommenders are trained on huge datasets, often several hundred terabytes in size. At this scale, data preprocessing steps often take much more time than training the recommender machine learning models themselves. NVTabular, a core component of Merlin, is a feature engineering and preprocessing library designed to manipulate terabyte-scale recommender system datasets effectively and to significantly reduce data preparation time. In this repo we demonstrate how to operationalize NVTabular data preprocessing workflows using Vertex AI Pipelines and multi-GPU processing nodes. The figure below summarizes the high-level architecture of the solution demonstrated in this repo.
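To make the preprocessing step more concrete, here is a minimal CPU-only sketch in plain Python of the kind of feature engineering that is typically applied to Criteo click logs. It is not the NVTabular API and is not taken from this repo. It assumes the commonly documented Criteo record layout (a label, 13 integer features, and 26 categorical features, tab-separated, with empty fields for missing values). The names `parse_line`, `CategoryEncoder`, `transform_ints`, and `preprocess` are hypothetical and exist only for illustration.

```python
import math

# Assumed Criteo layout (not stated in this document): label, 13 integer
# features, 26 categorical features, tab-separated, empty string = missing.
NUM_INT = 13
NUM_CAT = 26


def parse_line(line):
    """Split one raw record into (label, integer features, categorical features)."""
    fields = line.rstrip("\n").split("\t")
    expected = 1 + NUM_INT + NUM_CAT
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    label = int(fields[0])
    ints = [int(v) if v else None for v in fields[1:1 + NUM_INT]]
    cats = fields[1 + NUM_INT:]
    return label, ints, cats


class CategoryEncoder:
    """Hypothetical helper: map categorical strings to contiguous integer ids.

    One vocabulary is kept per column. Id 0 is reserved for missing values.
    The vocabulary grows as new values are seen, which is a simplification
    of a separate fit-then-transform pass.
    """

    def __init__(self, num_columns):
        self.vocabs = [dict() for _ in range(num_columns)]

    def transform(self, cats):
        ids = []
        for vocab, value in zip(self.vocabs, cats):
            if not value:
                ids.append(0)
                continue
            if value not in vocab:
                vocab[value] = len(vocab) + 1
            ids.append(vocab[value])
        return ids


def transform_ints(ints):
    """Fill missing values with 0, clip negatives to 0, then apply log(1 + x)."""
    return [math.log1p(max(v or 0, 0)) for v in ints]


def preprocess(lines):
    """Turn raw text records into (label, dense features, category ids) rows."""
    encoder = CategoryEncoder(NUM_CAT)
    rows = []
    for line in lines:
        label, ints, cats = parse_line(line)
        rows.append((label, transform_ints(ints), encoder.transform(cats)))
    return rows
```

A row-by-row Python loop like this would be far too slow for a terabyte-scale dataset, which is the motivation for running equivalent transformations on multi-GPU nodes with a library such as NVTabular.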

#Merlin project training code#
This repository compiles prescriptive guidance and code samples for operationalizing the NVIDIA Merlin framework on Google Cloud Vertex AI.
