pip install padl
Spend less time racking your brains about how to structure your deep learning projects, workflows and code, and more time focusing on the science.
Using and reproducing deep learning models in practice is about much more than writing individual PyTorch layers; it also means specifying how individual PyTorch layers intercommunicate, how inputs are prepared as tensors, which pieces of auxiliary data are necessary to prepare those inputs, and how tensor outputs are post-processed and converted to make them useful for the target application. PADL's "Pipeline" specifies all of these additional things.
word_predict = (
    clean
    >> tokenize
    >> to_tensor
    >> batch
    >> (dropout >> transformer) + right_shift
    >> cross_entropy_loss
)
PADL allows developers to create composable and reusable blocks with a simple decorator: "transform". Decorated code is ported to a "Transform" instance, a powerful abstraction encompassing the wide range of computations required for the preprocessing, forward pass and postprocessing steps. A "Transform" can also carry auxiliary data such as PyTorch layer weights, lookup tables and much more.
import torch
from padl import transform

@transform
def add(x, y):                     # this is a Transform
    return x + y

transform(lambda x: x + 1000)      # this is a Transform

@transform
class MLP(torch.nn.Module):
    def __init__(self, n_in, hidden, n_out):
        ...

    def forward(self, x):
        ...

mml = MLP(10, 10, 10)              # this is a Transform
"Transform" instances may be conveniently linked to become "Pipelines" using a small set of functional operators: >>, +, / and ~. This allows developers to manage complex branching and interdependent models with ease. Pre-processing, forward pass and post-processing are concisely delineated using batch and unbatch.
my_classifier_transform = (
    load_image                   # preprocessing ...
    >> transforms.ToTensor()     # ... stage
    >> batch
    >> models.resnet18()         # forward pass
    >> unbatch                   # postprocessing ...
    >> classify                  # ... stage
)
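To make the four operators concrete, here is a minimal plain-Python sketch of the behaviour described above. It does not import PADL and is not PADL's implementation: the `Step` class and the `double`, `add_one` and `total` helpers are hypothetical names invented for this illustration, and the toy skips details such as batching and argument unpacking.

```python
# Toy stand-in for illustration only: this is NOT PADL's implementation.
class Step:
    """Wraps a callable and overloads the four pipeline operators."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args):
        return self.fn(*args)

    def __rshift__(self, other):
        # a >> b: compose, feed the output of a into b
        return Step(lambda *args: other(self(*args)))

    def __add__(self, other):
        # a + b: rollout, apply both steps to the same input
        return Step(lambda *args: (self(*args), other(*args)))

    def __truediv__(self, other):
        # a / b: parallel, apply a to the first item of a pair, b to the second
        return Step(lambda pair: (self(pair[0]), other(pair[1])))

    def __invert__(self):
        # ~a: map, apply a to every item of a sequence
        return Step(lambda items: tuple(self(item) for item in items))


double = Step(lambda x: 2 * x)
add_one = Step(lambda x: x + 1)
total = Step(lambda pair: pair[0] + pair[1])

composed = double >> add_one      # add_one(double(x))
rolled_out = double + add_one     # (double(x), add_one(x))
parallel = double / add_one       # (double(a), add_one(b))
mapped = ~double                  # double applied to each item

print(composed(3))                # 7
print(rolled_out(3))              # (6, 4)
print(parallel((3, 10)))          # (6, 11)
print(mapped((1, 2, 3)))          # (2, 4, 6)

# Python precedence: / binds tighter than +, which binds tighter than >>.
pipeline = double + add_one >> total   # parsed as (double + add_one) >> total
print(pipeline(3))                # 10
```

Because these are ordinary Python operators, the usual precedence rules apply to any pipeline written with them: `/` is evaluated before `+`, and `+` before `>>`. That is why a composed branch such as `(dropout >> transformer)` needs its own parentheses when it is combined with `+`.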
Data scientists love quick and dirty notebooks; production engineers love clean and orderly code. PADL's built-in serializer lets you have both: it tracks down the minimal code and data necessary to instantiate the full "Pipeline" and compiles these into a compact module and a set of data artifacts, ready for shipping to your production environment.
class MyModel(torch.nn.Module):
    ...

class MyDataSet(torch.utils.data.Dataset):
    def __init__(self, raw_data):
        self.data = raw_data

    def __len__(self):
        # required by DataLoader's default sampler
        return len(self.data)

    def __getitem__(self, item):
        x, target = self.data[item]
        x = preprocessing_step_1(x)
        x = preprocessing_step_2(x)
        return x, target

data_set = MyDataSet(raw_data_lines)
data_loader = torch.utils.data.DataLoader(data_set, batch_size=10)
model = MyModel()

for batch, target in data_loader:
    output = model(batch)
    loss = loss_function(output, target)
    ...
from padl import transform, batch, identity

@transform
class MyModel(torch.nn.Module):
    ...

model_transform = (
    preprocessing_step_1
    >> preprocessing_step_2
    >> batch
    >> MyModel()
)

# apply the model to the input and pass the target through unchanged
train_transform = (
    model_transform / identity
    >> loss_function
)

for loss in train_transform.train_apply(raw_data_lines, batch_size=100):
    ...
class PreProcessor:
    ...  # lots of code with lots of subroutines etc.

class PostProcessor:
    ...  # lots of code with lots of subroutines etc.

class Model(torch.nn.Module):
    ...  # layers as usual

torch.save(model.state_dict(), 'mydirectory/layer.pt')

with open('mydirectory/preprocess.pkl', 'wb') as f:
    dill.dump(preprocessor, f)   # lots of errors / uncertainty

with open('mydirectory/postprocess.pkl', 'wb') as f:
    dill.dump(postprocessor, f)  # lots of errors / uncertainty
from padl import save

save(mytransform, 'mydirectory.padl')
class Model(torch.nn.Module):
    def __init__(self, la, lb, lc, ld):
        super().__init__()
        self.la = la
        self.lb = lb
        self.lc = lc
        self.ld = ld

    def forward(self, x, y):
        lhs = self.la(x)
        rhs_1 = self.lb(y)
        rhs_2 = self.lc(y)
        return self.ld(lhs, (rhs_1, rhs_2))

model = Model(layer_a, layer_b, layer_c, layer_d)
model = (
    layer_a / (layer_b + layer_c)
    >> layer_d
)