Folder Structure for Machine Learning Projects

Simple steps to create an automated folder structure!

A well-organized Machine Learning project structure makes a project easier to understand and change. Reusing the same structure across multiple projects also avoids confusion. In this post, we will use the Cookiecutter package to create a Machine Learning project structure.

Step 1: Make sure that you have the latest Python and pip installed in your environment.
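
As a quick sanity check for this step, a minimal sketch like the one below could confirm that the interpreter is recent and that pip is importable. The minimum version (3.8) and the function name environment_ready are assumptions made for illustration; the article itself only asks for the latest versions.

```python
import importlib.util
import sys

# Hypothetical minimum version, chosen only for illustration.
MIN_PYTHON = (3, 8)


def environment_ready(min_python=MIN_PYTHON):
    """Return True if the interpreter is new enough and pip is importable."""
    python_ok = sys.version_info >= min_python
    pip_ok = importlib.util.find_spec("pip") is not None
    return python_ok and pip_ok


print("Python version:", sys.version.split()[0])
print("Environment ready:", environment_ready())
```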

Step 2: Install cookiecutter

pip install cookiecutter

Step 3: Create a sample GitHub repository.

Note: Don’t check any options under ‘Initialize this repository with:’ while creating a repository.

Step 4: Create a project structure

Go to the folder on your local system where you want to set up the project and run the following:

cookiecutter -c v1 https://github.com/drivendata/cookiecutter-data-science

It will prompt you for the following options:

project_name [project_name]: <project-name>
repo_name [my-test]: <project-name>
author_name [Your name (or your organization/company/team)]: <Your name>
description [A short description of the project.]: <Details about the project>
Select open_source_license:
1 - MIT
2 - BSD-3-Clause
3 - No license file
Choose from 1, 2, 3 [1]: 1
s3_bucket [[OPTIONAL] your-bucket-for-syncing-data (do not include 's3://')]:
aws_profile [default]:
Select python_interpreter:
1 - python3
2 - python
Choose from 1, 2 [1]: 1

Note: You can ignore the ‘s3_bucket’ and ‘aws_profile’ options.
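
If you would rather skip the interactive prompts, the same answers could be passed through cookiecutter's Python API. The sketch below is one possible way to do that, not the template authors' recommended workflow. The helper names build_context and create_project are hypothetical, and the license and interpreter values simply mirror the choices shown above. The s3_bucket and aws_profile keys are left out so they keep the template defaults.

```python
# Template URL and tag are the ones used in the command above.
TEMPLATE_URL = "https://github.com/drivendata/cookiecutter-data-science"
TEMPLATE_TAG = "v1"


def build_context(project_name, author_name, description):
    """Map the prompts above to the answers passed to cookiecutter."""
    return {
        "project_name": project_name,
        "repo_name": project_name,
        "author_name": author_name,
        "description": description,
        "open_source_license": "MIT",
        "python_interpreter": "python3",
    }


def create_project(context, output_dir="."):
    """Generate the project without interactive prompts."""
    # Imported lazily so build_context works even if cookiecutter is missing.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        TEMPLATE_URL,
        checkout=TEMPLATE_TAG,
        no_input=True,
        extra_context=context,
        output_dir=output_dir,
    )
```

Calling create_project(build_context("my-test", "Your name", "A short description of the project.")) would then generate the project in the current folder. It needs cookiecutter installed and network access to fetch the template.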

Step 5: Add the project to the Git repository

cd <project folder>

echo "# ML-Project-Structure" >> README.md
git init
git add .
git commit -m "first commit"
git branch -M main
git remote add origin https://github.com/syedjafer/ML-Project-Structure.git
git push -u origin main

The final structure is shown in the following screenshot:

[Screenshot: final project structure]
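
To inspect the generated layout on your own machine, a small script along these lines could print the folder tree. It is only a sketch: format_tree is a hypothetical helper, and it skips the .git folder so that the output stays readable.

```python
from pathlib import Path


def format_tree(root, prefix=""):
    """Return the folder tree under root as a list of indented lines."""
    lines = []
    # Sort by name for a stable order and skip Git's internal folder.
    entries = sorted(
        (p for p in Path(root).iterdir() if p.name != ".git"),
        key=lambda p: p.name,
    )
    for entry in entries:
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{entry.name}{suffix}")
        if entry.is_dir():
            # Indent the children one level deeper than their parent.
            lines.extend(format_tree(entry, prefix + "    "))
    return lines
```

Running print("\n".join(format_tree("."))) from inside the project folder lists every folder and file that cookiecutter generated.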

Note: The data folder won’t appear on GitHub. It stays in your local folder and is not pushed because it is on the ignore list (the .gitignore file). If you want to check it in as well, comment out the data entry in the .gitignore file and add the data folder to the repository.
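
To find the rule that needs commenting out, a sketch like the following could list the active .gitignore rules that name the data folder. The helper data_ignore_lines is hypothetical, and it only matches rules that consist of the folder name with optional slashes. The rule in your generated .gitignore may be written differently.

```python
def data_ignore_lines(gitignore_text, folder="data"):
    """Return (line_number, rule) pairs for active rules that name the folder."""
    matches = []
    for number, line in enumerate(gitignore_text.splitlines(), start=1):
        rule = line.strip()
        # Skip blanks and comments; a commented-out rule ignores nothing.
        if not rule or rule.startswith("#"):
            continue
        # Treat "data", "/data", "data/" and "/data/" as the same folder rule.
        if rule.strip("/") == folder:
            matches.append((number, rule))
    return matches
```

For example, data_ignore_lines(open(".gitignore").read()) returns the line numbers to comment out, or an empty list if no such rule is active.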

Some of the folders might not be relevant to your project. Feel free to delete them and adjust the structure to your needs.
