Looking into the difference when configuration files are used


Before I understood configuration files, my scripts were often long, repetitive, and inefficient. Whenever a variable changed, I spent most of my time updating it in different parts of the script, which was time-consuming. Then I noticed that others were using configuration files as part of their development, so I started exploring and implementing them as well. I realized that things became more efficient, flexible, and organized when using a configuration file.

So what are Configuration Files?

  • Configuration files allow us to configure parameters and initial settings.
  • Configuration files come in several formats, such as YAML, INI, JSON, and XML.
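
To make the formats a little more concrete, here is a minimal sketch that reads the same settings from an INI string and a JSON string using only the Python standard library. The section name `database` and the keys `host` and `port` are made up for illustration and are not taken from the project described in this article.

```python
import configparser
import json

# Hypothetical settings, written once as INI and once as JSON.
INI_TEXT = """
[database]
host = localhost
port = 5432
"""

JSON_TEXT = '{"database": {"host": "localhost", "port": 5432}}'

# INI: every value is read as a string, so numbers must be converted explicitly.
ini = configparser.ConfigParser()
ini.read_string(INI_TEXT)
ini_settings = {
    "host": ini["database"]["host"],
    "port": ini.getint("database", "port"),
}

# JSON: numbers keep their type after parsing.
json_settings = json.loads(JSON_TEXT)["database"]

# Both formats end up as the same Python dictionary.
print(ini_settings == json_settings)
```

Whichever format is chosen, the script ends up with a plain dictionary of settings, so the code that uses the settings does not need to know which format they were stored in.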

Configuration files are commonly used to store sensitive information such as database credentials, passwords, and server hostnames, as well as to manage parameters.

In this article, I will share the difference between using and not using a configuration file in a machine learning project. The configuration file format that we will be using is YAML. YAML originally stood for "Yet Another Markup Language" and now officially stands for "YAML Ain't Markup Language". It was selected because its usual block style needs no braces or brackets, which makes it popular for being easy to read and easy to write.
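
As a rough illustration of that readability, the sketch below parses a small YAML document with the third-party PyYAML package (assumed to be installed, for example with `pip install pyyaml`). The keys `database`, `host`, `port`, and `features` are invented for this example and do not come from the project's actual configuration files.

```python
import yaml  # third-party package: PyYAML

# Hypothetical settings in YAML block style: nesting is expressed with
# indentation, and list items with a leading dash.
YAML_TEXT = """
database:
  host: localhost
  port: 5432
features:
  - sales
  - quantity
"""

# safe_load parses plain data only and does not construct arbitrary objects.
config = yaml.safe_load(YAML_TEXT)

print(config["database"]["host"])
print(config["features"])
```

The parsed result is an ordinary nested dictionary, with `port` already converted to an integer and `features` to a list.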

The scenario in this use case is to perform scoring (prediction) with different pre-built models. Each model requires a different set of data to perform the prediction, but the source table is the same. The image below illustrates the process to be built:

Overview of Scoring Process:

  • There are 2 models, Model_A and Model_B, that have already been pre-built on different sets of data retrieved from the same features table.
  • Prepare a scoring script for Model_A that predicts based on data set A and pushes the forecast result into the database.
  • Prepare a scoring script for Model_B that predicts based on data set B and pushes the forecast result into the database.
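
The process above can be sketched as a single scoring function that is driven entirely by per-model settings. This is only a sketch under stated assumptions, not the scoring script from this project: the keys `segment`, `model_file`, and `output_table`, the table and column names, and the toy `load_model` stand-in are all hypothetical, and an in-memory SQLite database takes the place of the real one.

```python
import sqlite3

# Hypothetical per-model settings, as they might look after being loaded
# from one configuration file per model. All keys and values are assumptions.
CONFIGS = {
    "model_A": {"segment": "Consumer", "model_file": "model_A.pkl", "output_table": "forecast_a"},
    "model_B": {"segment": "Corporate", "model_file": "model_B.pkl", "output_table": "forecast_b"},
}

def load_model(model_file):
    """Stand-in for loading a pre-built model; returns a toy predictor."""
    factor = 1.1 if model_file == "model_A.pkl" else 0.9
    return lambda sales: round(sales * factor, 2)

def score(config, conn):
    """Select the rows for the configured segment, predict, and store the results."""
    model = load_model(config["model_file"])
    rows = conn.execute(
        "SELECT order_id, sales FROM features WHERE segment = ?",
        (config["segment"],),
    ).fetchall()
    table = config["output_table"]  # taken from our own config, not from user input
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (order_id TEXT, forecast REAL)")
    conn.executemany(
        f"INSERT INTO {table} VALUES (?, ?)",
        [(order_id, model(sales)) for order_id, sales in rows],
    )
    conn.commit()
    return len(rows)

# One shared features table, standing in for the real source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (order_id TEXT, segment TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [("o1", "Consumer", 100.0), ("o2", "Corporate", 200.0), ("o3", "Consumer", 50.0)],
)

# The same function serves both models; only the settings differ.
for name, config in CONFIGS.items():
    print(name, "scored", score(config, conn), "rows")
```

The point of the design is that adding a third model would mean adding one more set of settings rather than copying and editing the script.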

(For this use case, the model developed is based on a data set taken from Kaggle: Superstore Sales Dataset)

Now let’s look at how to implement configuration files, and compare this with the case where configuration files are not used.

Scoring Script with Config File:

First, let’s look at what a configuration file in YAML looks like. Below is an example of a configuration file (model_A.yml) that specifies the segment, model file name,…

Continue reading: https://towardsdatascience.com/as-a-novice-i-learned-that-using-configuration-files-in-python-makes-development-more-efficient-29c75b4eabd5?source=rss—-7f60cf5620c9—4

Source: towardsdatascience.com