{"id":3898,"date":"2022-05-02T15:46:40","date_gmt":"2022-05-02T13:46:40","guid":{"rendered":"https:\/\/dev.littlebigcode.fr\/mlops-why-data-model-experiment-tracking-is-important\/"},"modified":"2022-07-04T23:35:36","modified_gmt":"2022-07-04T21:35:36","slug":"mlops-why-data-model-experiment-tracking-is-important","status":"publish","type":"post","link":"https:\/\/dev.littlebigcode.fr\/en\/mlops-why-data-model-experiment-tracking-is-important\/","title":{"rendered":"MLOps : Why data and model experiment tracking is important ? How tools like DVC and MLflow can solve this challenge ?"},"content":{"rendered":"

[et_pb_section fb_built=”1″ admin_label=”section” _builder_version=”4.16″ da_disable_devices=”off|off|off” global_colors_info=”{}” da_is_popup=”off” da_exit_intent=”off” da_has_close=”on” da_alt_close=”off” da_dark_close=”off” da_not_modal=”on” da_is_singular=”off” da_with_loader=”off” da_has_shadow=”on”][et_pb_row admin_label=”row” _builder_version=”4.16″ background_size=”initial” background_position=”top_left” background_repeat=”repeat” custom_padding=”3px||3px||true|false” global_colors_info=”{}”][et_pb_column type=”4_4″ _builder_version=”4.16″ custom_padding=”|||” global_colors_info=”{}” custom_padding__hover=”|||”][et_pb_text admin_label=”Text” _builder_version=”4.17.4″ text_font=”Average Sans||||||||” text_text_color=”#242B57″ link_font=”Average Sans||||||||” link_text_color=”#1CACE4″ ul_font=”Average Sans||||||||” ul_text_color=”#242B57″ ol_text_color=”#242B57″ quote_font=”Average Sans||||||||” quote_text_color=”#242B57″ header_text_color=”#1CACE4″ header_2_text_color=”#1CACE4″ header_3_text_color=”#1CACE4″ header_4_font=”Average Sans||||||||” header_4_text_color=”#1CACE4″ header_5_font=”Century Gothic Bold||||||||” header_5_text_color=”#1CACE4″ header_6_font=”Century Gothic Bold||||||||” header_6_text_color=”#1CACE4″ background_size=”initial” background_position=”top_left” background_repeat=”repeat” custom_padding=”8px|||||” inline_fonts=”Century Gothic Bold,Century Gothic,Average Sans” global_colors_info=”{}”]<\/p>\n

With growing interest in machine learning and a rising number of ML projects (self-driving cars, facial recognition, recommendation systems), software development has shifted from hard-coded rules to rules estimated from data, a.k.a. data-driven models (cf. figure 1). Building reliable and stable information systems on top of imperfect data-driven models raises a new set of challenges, such as model versioning, deployment, monitoring, explainability and reproducibility.<\/p>\n

By Samson ZHANG<\/a>, Data Scientist at LittleBigCode<\/p>\n

\"\"

Figure 1. Machine learning vs. traditional software development. Source: http:\/\/datalya.com<\/a><\/span><\/p><\/div>\n

A whole new set of software engineering best practices, called MLOps, has emerged to tackle the challenges that come with data-driven models. For a broader view of what MLOps is, I recommend taking a look at Jamila Rejeb\u2019s article: Why MLOps is so important to understand ?<\/a> The main purpose of MLOps is to make the entire ML project lifecycle automated and reproducible. In this article, we will mainly focus on data & model experiment tracking\/versioning. The main issues that data & model experiment tracking aims to solve are: <\/strong><\/p>\n