The Ultimate Deep Learning Project Structure: A Software Engineer’s Guide into the Land of AI
Greetings, tech lovers! I’m Alee, a coder with a passion for making computers dance to my tune 🎸🎚️💻 With two decades of computer experience and a decade of coding, I’ve had the chance to delve into a variety of frameworks and programming languages, including C, C++, Java, Swift, Kotlin, JavaScript, and TypeScript. But nothing can match the power of Python when it comes to Artificial Intelligence and Machine Learning projects 🧠💥
Mind blown 🤯 by the power of AI art and language models! ChatGPT 🤖 assisted in creating this post #FutureIsNow
Last year, I embarked on a new journey — a career change from software engineering into the exciting world of Artificial Intelligence and Machine Learning 🤖💻 I did my internship at the University of Amsterdam, where I carried out software engineering research on AI 🧐 and applied software engineering practices to AI/ML projects 🔧 And let me tell you, this journey has led to the creation of “The Ultimate Deep Learning Project Structure” 💻🏗️ and a top-notch deep learning framework 💻💡🧉 if I do say so myself 💁😎
Engineering solutions one line at a time 🔧 #TrustMeImAnEngineer
Coding is the language of the digital species and just like speaking, it can be learned 💻 Expand your skillset and show computers who’s boss 💪 Don’t worry about grammar, computers never judge 🤖 Start with Python, functional programming, and AI YouTube videos and courses 📹 Happy coding! 🚀
Functional programming? 🛠️ It’s like oiling the gears of your code. Everything runs smoother, faster, and with fewer hiccups. Trust me, taking a functional programming course is time well spent.💡
What is coding? 🤔 It’s like giving your computer a to-do list! 📝 Think of it like telling your mom or dad how to make you a delicious pizza 🍕 with detailed steps. 📈 The computer will follow each step precisely 🔍 to make sure the task is done right. 💯 With coding, you can bring your imagination to life 🚀 and make the computer do some amazing things, like play video games 🎮, turn text into art 🎨, and even solve complex problems 🤖. But before you dive into the world of advanced AI and cutting-edge models, be sure to master the basics of coding first! 💻 It’s the recipe for machine learning success! 🎉🔥
Coding mastery is like wielding a wizard’s magic wand! 🧙♂️💻 With the right spells (keyboard strokes) and incantations (code snippets), you can cast a powerful spell on your code and watch it come to life 🔮👨💻 So, sharpen your coding skills, become a ninja wizard, and unleash the magic within! 🌟💻
Unleash the Jedi in you with coding! 🔍 & 💻 Harness the power of life-long learning, efficient keyword searching, and the quest for simple solutions. Train to become a coding ninja 🥷 and master even the trickiest problems with the power of the terminal by your side 🕵️♂️💻 Let your inner laziness and patience guide you to coding greatness and may the code force be with you! 🙌👨💻💻 #LifetimeLearning 🔍📚💡🌟📈🔍
Say no to the dark side of closed windows and embrace the power of the open-source force! 💻👨💻🐧 Search, learn, fix, upgrade, and repeat — the life of a coding Jedi. 🔍 & 💻 Join the ranks of the linux ninjas 🥷🏼 and leave the troubles of proprietary software behind 🕵️♂️💻 The terminal is your ally, so embrace it, master the command line, and may the source be with you! 🌟⚡️ #OpenSource #LinuxLife #LifetimeLearning
In 2023, I’ve got my problem-solving game on point 💪 My strategy is simple: I start with CoPilot in VS Code, follow up with ChatGPT, GitHub, CodeGPT, DuckDuckGo, Google search. It’s like having a team of experts at my fingertips 🤖🔍🧉💻
This is what I did back in 2014 when I started coding
Meet CoPilot, your trusty coding companion 🤖 This innovative tool has been revolutionizing the AI world since its launch, and I’ve been lucky enough to use it since day one. With the ability to generate neural networks, CoPilot is truly a coding powerhouse 💪💻 Watch in awe as it effortlessly outputs the date in Python 🔥 #CodingEfficiency #ProductivityBoost 🚀
May the code be with you, young Padawan 🚀 Time to harness the power of the Force and conquer that bug with AI! 💻💪
The secret to success in coding? It’s simple: just keep learning, trying, experimenting with hobby projects, and embracing failures along the way 💻💪 Don’t be afraid to make mistakes, they’re just opportunities to learn and grow 💡
AI coding may test your skills, but remember young Padawan: try, fail and learn you must! ChatGPT, CodeGPT, CoPilot, GitHub, Documentation, DuckDuckGo, and Google, your allies they will be 🤖 #AIJedi
Get ready to take your Python skills to new heights! 🚀 With the Python Modularity principle, you can say goodbye to cluttered code and hello to clean, organized, and efficient programming 🔥 A Python module 🐉 is like a superhero sidekick🦸️🐉 that you can call upon whenever you need its powers 💪 It’s a collection of Python code that you can import 🚪 into your project and use to perform specific tasks 🤖 By breaking your code into modules, you can make your code more organized 📂, reusable 💻, and easier to maintain 🔧 Plus, you’ll be able to mix and match modules like a DJ 🎧 to create custom solutions for your specific needs. So embrace the power of Python Modularity 🔧 and become a coding ninja 🥷🏽💥
Don’t let spaghetti code bring you down, go modular and watch your code shine 🍝💡🌟 #CleanCodeMatters
Now, what is a Python Module 🐉🦸♂️ in action? A folder with an `__init__.py` file and multiple `.py` files. Think of it like a treasure box 💰 filled with precious coding gems 💎 The `__init__.py` file acts as the key 🔑 to the treasure box, exposing specific functions or classes to be exported 💻
A ResNet module’s `__init__.py` can, for example, be:
from .resnet import ResNet, BasicBlock, Bottleneck
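With that `__init__.py` in place (say, in a folder named `resnet`), other code can grab the exported gems straight from the package 🔑 A minimal sketch (the torchvision-style `ResNet` constructor is an assumption for illustration):
# Import the classes exposed by the module's __init__.py
from resnet import ResNet, BasicBlock

# Build a ResNet-18-style network (assuming a torchvision-style constructor)
model = ResNet(BasicBlock, [2, 2, 2, 2], num_classes=10)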
Ready to level up your AI/ML game? Let’s use Modularity to create a super organized and efficient AI project structure! 💻💪🔥🦸♂️ The Fantastic Four modules to guide your path: Models 🤖, Trainers 🚀, Experiments 🧪, and Data 💾, the all-star lineup for AI success! 🧉🏆
/
|-- models/
| |-- __init__.py
|-- experiments/
| |-- __init__.py
|-- trainers/
| |-- __init__.py
|-- data/
| |-- __init__.py
Now, let’s dive into a deep learning project! 🦸♂️ To follow this guide you have two options: run it on Colab, or run it locally with Python or Maté🧉 an open-source AI sidekick 💻🤖 With Maté🧉 you can harness the power of Modularity 🧩 and open source ☯
pip install yerbamate
cd path/to/folder
mate init deepnet
Now, let’s witness the empty project created with the Fantastic Four! 🤩🦸♂️
├── deepnet
│ ├── data
│ │ └── __init__.py
│ ├── experiments
│ │ └── __init__.py
│ ├── models
│ │ └── __init__.py
│ ├── trainers
│ │ └── __init__.py
│ └── __init__.py
└── mate.json
Ready to reach for the AI stars 🚀? First, find the best open-source Models on paperswithcode and GitHub 🔍 Second, gather your Data like a boss 😎 download it, process it, augment it, and shape your tensors until they fit and flow ⛵ Third, write your training procedure and define your Experiment 🦸🏽♂ Fourth, experiment like crazy with different models, hyperparameters, and data until you reach the AI sky 🚀🧠
When you’re so deep into deep learning, you forget to come up for air 🏊♂️🤓 #DeepLearningDive #NeverStopLearning
Now, let’s embark on an epic transfer learning adventure 🚀💻🧉 🧠 💪 For your dependencies, you have three options: install with `pip`, `conda`, or manually, like a Jedi Knight 🤖
# "Big Transfer (BiT): General Visual Representation Learning"
# Forked from https://github.com/google-research/big_transfer
# Refactored repo: https://github.com/oalee/big_transfer
# You can use complete URLs or the short version.
# Here we use the short version of the following:
# https://github.com/oalee/big_transfer/tree/master/big_transfer/experiments/bit
# Installs the experiment, code and python dependencies with pip
mate install oalee/big_transfer/experiments/bit -y pip
# Installs python dependencies with conda
# Overwrites code dependencies if they already exist
mate install oalee/big_transfer/experiments/bit -yo conda
# Only installs the code without any pip/conda dependencies
mate install oalee/big_transfer/experiments/bit -n
Brewing up some code with Maté 🧉 #ModularMagic 🧩 #CodeCaffeine 💻🤖
The installation should be a breeze 🍃 just like brewing Maté 🧉 Brew the code and Python dependencies directly into your machine using either `pip` 📦 or `conda`. If everything goes smoothly, you won’t see any sneaky errors or bugs creeping up 🐛 But if you do encounter any, don’t fret! 🙌 Use Colab, or make sure PyTorch is installed first. Now, let’s take a closer look at the project structure and dive into some code 🔍👀
.
├── mate.json
└── deepnet
├── data
│ ├── bit
│ │ ├── fewshot.py
│ │ ├── __init__.py
│ │ ├── minibatch_fewshot.py
│ │ ├── requirements.txt
│ │ └── transforms.py
│ └── __init__.py
├── experiments
│ ├── bit
│ │ ├── aug.py
│ │ ├── dependencies.json
│ │ ├── __init__.py
│ │ ├── learn.py
│ │ └── requirements.txt
│ └── __init__.py
├── __init__.py
├── models
│ ├── bit_torch
│ │ ├── downloader
│ │ │ ├── downloader.py
│ │ │ ├── __init__.py
│ │ │ ├── requirements.txt
│ │ │ └── utils.py
│ │ ├── __init__.py
│ │ ├── models.py
│ │ └── requirements.txt
│ └── __init__.py
└── trainers
├── bit_torch
│ ├── __init__.py
│ ├── lbtoolbox.py
│ ├── logger.py
│ ├── lr_schduler.py
│ ├── requirements.txt
│ └── trainer.py
└── __init__.py
On my journey into software engineering for AI 🤖, I’ve looked at JSON, TOML, and many other formats people use. But I’ve found that Python 🐍 is the best format for defining experiments 💡 An experiment definition is a regular Python file without any loops 💻 Defining hyperparameters in Python has the advantage of flexibility 💪 as you can easily swap modules and classes 🔧 Let the experimentation begin! 🚀
Time to meet the fourth hero of the Fantastic Four modules! The 🔬Experiment combines 💾 Data, 🏋️♂️ Trainers, and 🤖 Models to make machines learn and soar to new heights! Let’s open `experiments/bit/learn.py` 🔍 This is where the magic happens! 🔮
from ...trainers.bit_torch.trainer import test, train
from ...models.bit_torch.models import load_trained_model, get_model_list
from ...data.bit import get_transforms, mini_batch_fewshot
import torchvision as tv, yerbamate, os
from torch.utils.tensorboard import SummaryWriter
# BigTransfer Medium ResNet50 Width 1
model_name = "BiT-M-R50x1"
# Choose a model from get_model_list() that fits into your memory
# Try "BiT-S-R50x1" if this one doesn't work for you
# set up environment variables
env = yerbamate.Environment()
# set up data and augmentation
train_transform, val_transform = get_transforms(img_size=[32, 32])
data_set = tv.datasets.CIFAR10(
env["datadir"], train=True, download=True, transform=train_transform
)
val_set = tv.datasets.CIFAR10(env["datadir"], train=False, transform=val_transform)
train_set, val_set, train_loader, val_loader = mini_batch_fewshot(
train_set=data_set,
valid_set=val_set,
examples_per_class=None, # Fewshot disabled
batch=128,
batch_split=2,
workers=os.cpu_count(), # Defaults to CPU count
)
# load pretrained model
imagenet_weight_path = os.path.join(env["weights_path"], f"{model_name}.npz")
model = load_trained_model(
weight_path=imagenet_weight_path, model_name=model_name, num_classes=10
)
# set up logger
logger = SummaryWriter(log_dir=env["results"], comment=env.name)
if env.train:
train(
model=model,
train_loader=train_loader,
valid_loader=val_loader,
train_set_size=len(train_set),
save=True,
save_path=os.path.join(env["results"], f"trained_{model_name}.pt"),
batch_split=2,
base_lr=0.003,
eval_every=100,
log_path=os.path.join(env["results"], "log.txt"),
tensorboardlogger=logger,
)
if env.test:
test(
model=model,
val_loader=val_loader,
save_path=os.path.join(env["results"], f"trained_{model_name}.pt"),
log_path=os.path.join(env["results"], "log.txt"),
tensorboardlogger=logger,
)
Defining experiments in Python 💻 has the advantage of flexibility 🧡 With a few tweaks, you can experiment with different models, hyperparameters, learning rates, batch sizes, augmentations, model sizes, and even few-shot learning! 🧠🤓 Unleash your inner scientist and start your AI experiments today! 🔬
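For instance, a few one-line tweaks to `experiments/bit/learn.py` are enough to launch a whole new experiment (the values below are illustrative, not tuned recommendations):
# Swap in a bigger backbone (any name from get_model_list())
model_name = "BiT-M-R101x1"

# Enable few-shot learning: in the mini_batch_fewshot(...) call, set
#   examples_per_class=10   # was None (few-shot disabled)

# Try a different learning rate: in the train(...) call, set
#   base_lr=0.001           # was 0.003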
A powerful tool, Python is, for young Padawan’s AI experimentation 🐍 The AI revolution begun has, with its flexibility to test and explore 🚀 Mighty its power, to create models adjust hyperparameters 💫 A spell it is, use wisely you must, the Force of the code within it, lies 🔥🧙♀️ Go forth and discover, unleash the potential of AI, young Padawan, you shall 🤖 May the code be with you! 💻 Magic you shall witness, as your network trains and learns 🔮🧠
Before training 🏃♂️ you need to set up your local environment variables 🌍 for the result and data locations. Maté 🧉🤝 offers an Environment API 💻 that’s compatible with pure Python 🐍 The Maté Environment API checks `env.json` 🔍 first, and then the operating system’s variables. Your deep learning project should have a designated results folder 📂 for saving your model’s trained brain 🧠, and other variables will depend on your specific task 🎯 Don’t forget to double-check your local paths and set the correct environment variables 🌍 either through the shell or in the `env.json` file:
{
"results": "./results",
"datadir": "./data",
"logdir": "./results/logs",
"weights_path": "./weights"
}
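Under the hood, the lookup idea is simple. Here’s a minimal pure-Python sketch of the same behavior (not Maté’s actual implementation): check `env.json` first, then fall back to the operating system’s variables 🐍
import json, os

def get_env_var(key, env_file="env.json"):
    # Check env.json first, then the OS environment
    if os.path.exists(env_file):
        with open(env_file) as f:
            values = json.load(f)
        if key in values:
            return values[key]
    return os.environ[key]  # raises KeyError if the variable is set nowhere

results_dir = get_env_var("results")  # e.g. "./results"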
When you finally set the environment variables correctly and your training session takes off like a rocket 🚀 you just gotta say: it’s all about the ‘path’ to success! 🔥
Let’s train 💪 To run an experiment, you can either use Python directly 💻 or use Maté 🧉🤝 🔮
# train the model
mate train bit learn
# Alternatively, you can run it directly with python
python -m deepnet.experiments.bit.learn train
Now, it should be training 🦸♂️ 🔥 Get ready to witness the magic of deep learning at work. May the force be with you! ⭐🕵️♂️🤖💻
Learning never sleeps, especially when you’re a machine! 🤖💻📚📈🧠
And now…Results 🔥
May all your plots look like this, and may your machine learning model never get stuck in a local minimum and always reach the global optimum! 🤖📈 Transfer learning reaches 93% accuracy after 1 minute and 97% after 30 minutes 🛸 much faster than training from scratch 🤸🏻♀ See your results with `tensorboard --logdir ./results/logs`
Maté🧉 is a module and experiment manager that works seamlessly with any Python framework, be it TensorFlow/Keras, PyTorch/Lightning, Jax/Flax, or any other Python framework or experiment you have in mind 💻
Saying goodbye to spaghetti code, hello to Maté-licious modularity 🍝🧉💻 #NoMoreSpaghettiCode #MakeCodeNotWar #ModularityMatters
With Maté🧉 your machine learning journey is enhanced by the Fantastic Four 🦸♂️ and the power of Modularity 🧩; share your models/code effortlessly and manage dependencies with a snap 🤗🧉 💻🤖 You can directly install independent code modules from any GitHub URL 🪐 into your project:
# installs a fine-tuning resnet experiment
mate install https://github.com/oalee/deep-vision/tree/main/deepnet/experiments/resnet
# Short install version of this repo: https://github.com/oalee/deep-vision
# installs a customizable pytorch resnet model implementation
mate install oalee/deep-vision/deepnet/models/resnet
# installs cifar10 data loader for pytorch lightning
mate install oalee/deep-vision/deepnet/data/cifar10
# installs an augmentation module separated from torch image models
mate install oalee/deep-vision/deepnet/data/torch_aug
# installs over 30 Vision Transformer (ViT) implementations into models
mate install oalee/deep-vision/deepnet/models/torch_vit
# Or install vit_pytorch from lucidrains as a non-independent module
mate install https://github.com/lucidrains/vit-pytorch/tree/main/vit_pytorch
# installs a pytorch lightning classifier module
mate install oalee/deep-vision/deepnet/trainers/pl_classification
# installs pytorch lightning gan training module from lightweight-gan repo
mate install oalee/lightweight-gan/lgan/trainers/lgan
Maté makes it possible to directly install code from GitHub 💻 Even if the project doesn’t follow the ultimate deep learning project structure and modularity, you can still easily install the entire root module directly into your project with just one command 💻✨🔥💥 But don’t forget, installing the root module doesn’t include a side of Python dependencies 🍔 A simple `pip install` command can fix that up in a snap! 💻💥
# Installs the source code of torch image models
mate install https://github.com/rwightman/pytorch-image-models/tree/main/timm
# This doesn't install dependencies. If you encounter bugs, you can try:
# pip install timm
# Alternatively, install from a forked version that includes dependencies
mate install https://github.com/oalee/deep-vision/tree/main/timm -yo pip
This amazing library includes over 💯 torch image models 🤖, data loading tools 💾, augmentations 🧪, and training implementations 🚀 for all your vision needs 🔥 Get ready to ignite your projects with cutting-edge technology 🔥💥 Now you can experiment with changing the architecture, use the models and augmentations in your experiments 🧠🔍, and take your AI/ML journey to new heights with a few imports! 🚀🌌
from timm.models import create_model, DenseNet, ResNet, EfficientNet, MobileNetV3
from timm.data.auto_augment import augment_and_mix_transform, auto_augment_transform
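For example, `create_model` builds any of the included architectures by name, pretrained weights and all 🤖 A minimal sketch (the model name and AutoAugment policy string are just illustrative choices):
import timm
from timm.data.auto_augment import auto_augment_transform

# Build a pretrained ResNet-50 with a fresh 10-class head
model = timm.create_model("resnet50", pretrained=True, num_classes=10)

# Build an AutoAugment transform from a policy string
transform = auto_augment_transform("original", hparams={"img_mean": (124, 116, 104)})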
Let’s take a closer look at the project structure and the installed modules! Root modules such as this one are installed alongside your own root module, as a neighbor 🔍💻✨
.
├── deepnet
│ ├── data
│ ├── experiments
│ ├── models
│ └── trainers
├── mate.json
└── timm
├── data
│ ├── auto_augment.py
│ ├── ...
├── layers
│ ├── activations_jit.py
│ ├── ...
│ └── weight_init.py
├── loss
│ ├── asymmetric_loss.py
│ ├── ...
│ └── jsd.py
├── models
│ ├── beit.py
│ ├── _builder.py
│ ├── byoanet.py
│ ├── byobnet.py
│ ├── cait.py
│ ├── coat.py
│ ├── convit.py
│ ├── convmixer.py
│ ├── convnext.py
│ ├── crossvit.py
│ ├── cspnet.py
│ ├── davit.py
│ ├── deit.py
│ ├── densenet.py
│ ├── dla.py
│ ├── dpn.py
│ ├── edgenext.py
│ ├── efficientformer.py
│ ├── efficientformer_v2.py
│ ├── ...
│ ├── hub.py
│ ├── inception_resnet_v2.py
│ ├── inception_v3.py
│ ├── inception_v4.py
│ ├── resnest.py
│ ├── resnet.py
│ ├── resnetv2.py
│ ├── rexnet.py
│ ├── selecsls.py
│ ├── senet.py
│ ├── sequencer.py
│ ├── sknet.py
│ ├── swin_transformer.py
│ ├── swin_transformer_v2_cr.py
│ ├── swin_transformer_v2.py
│ ├── tnt.py
│ ├── tresnet.py
│ ├── twins.py
│ ├── vgg.py
│ ├── visformer.py
│ ├── vision_transformer_hybrid.py
│ ├── vision_transformer.py
│ ├── vision_transformer_relpos.py
│ ├── volo.py
│ ├── vovnet.py
│ ├── xception_aligned.py
│ ├── xception.py
│ └── xcit.py
├── optim
│ ├── adabelief.py
│ ├── lamb.py
│ ├── ...
├── scheduler
│ ├── cosine_lr.py
│ ├── scheduler.py
│ ├── ...
├── utils
│ ├── metrics.py
│ ├── ...
└── version.py
23 directories, 268 files
Now is your chance to start new adventures with all the necessary modules 🤓 The secret to AI/ML success is the Fantastic Four 💪🦸🏽: gather Quality Data 💾, harness the power of SoTA Models 🤖, optimize your Training 🚀, and embrace Continuous Experimentation, Learning, and Updating 🧪 to reach new heights 🔭⚗️🔬🖥️🦸🏻♀
All AI/ML model success depends on data; it’s the fuel for the model 🔥 Without good data, augmentations, and labeling, you won’t get far, and your model will be like a car with no gas 🚗 So invest time in collecting, pre-processing, and labeling your data, or your model’s predictions may be just a shot in the dark 🎯
For ultimate ease and convenience, consider using the Ultimate Deep Learning Project Structure. With this structure in place, your projects will not only be easily sharable but also highly organized. To share your code with others, simply use the `mate export` command, followed by a `git push` 🤗🧉
mate export
# This will generate requirements.txt and dependencies.json files for your modules
Generated requirements.txt for deepnet/data/cifar10
Generated requirements.txt for deepnet/data/cifar100
Generated requirements.txt for deepnet/data/keras
Generated requirements.txt for deepnet/data/preprocessing
Generated dependencies.json for deepnet/experiments/resnet
Generated requirements.txt for deepnet/experiments/resnet
Generated requirements.txt for deepnet/experiments/keras_finetune
Generated dependencies.json for deepnet/experiments/resnet_keras
Generated requirements.txt for deepnet/experiments/resnet_keras
Generated requirements.txt for deepnet/models/vit
Generated requirements.txt for deepnet/models/resnet
Generated requirements.txt for deepnet/models/test
Generated requirements.txt for deepnet/models/keras
Generated requirements.txt for deepnet/trainers/classification