How To Determine the ‘filter’ Parameter in the Keras Conv2D Function
Actually, there is no single good answer to your question. Most architectures are carefully designed and fine-tuned over many experiments. I can share some of the rules of thumb one should apply when designing one's own architecture: Avoid a dimension collapse in the first layer. Let's assume that your … Read more
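To make the "dimension collapse" point concrete, here is a quick back-of-the-envelope shape check in plain Python (conv_output_size is an illustrative helper written for this sketch, not a Keras API):

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output spatial dimension along one axis of a conv layer
    (the standard formula: (size - kernel + 2*padding) // stride + 1)."""
    return (size - kernel + 2 * padding) // stride + 1

# A 28x28 input hit with a large kernel and stride in the FIRST layer
# collapses almost immediately, throwing away spatial information:
print(conv_output_size(28, kernel=11, stride=4))  # -> 5

# A gentler first layer preserves resolution for later layers:
print(conv_output_size(28, kernel=3, stride=1))   # -> 26
```

A common pattern is then to grow the filter count (e.g. 32, 64, 128) as the spatial dimensions shrink deeper in the network.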
Is it good learning rate for Adam method?
The learning rate looks a bit high. The curve decreases too fast for my taste and flattens out very soon. I would try 0.0005 or 0.0001 as a base learning rate if I wanted to squeeze out additional performance. You can quit after several epochs anyway if you see that it does not work. The question … Read more
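For intuition on how the base learning rate changes convergence, here is a minimal Adam update in plain Python on a toy quadratic (standard beta1/beta2/epsilon defaults assumed; adam_minimize is an illustrative function, not a Keras or PyTorch API):

```python
import math

def adam_minimize(lr, steps=200, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam on f(x) = x**2 starting from x = 5 and return the final x."""
    x, m, v = 5.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = 2 * x                         # gradient of x**2
        m = beta1 * m + (1 - beta1) * g   # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)      # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# A larger base learning rate reaches the minimum much faster; a smaller
# one moves slowly but smoothly, which can matter near a good optimum:
print(abs(adam_minimize(lr=0.1)), abs(adam_minimize(lr=0.001)))
```

The same trade-off is what the loss curve in the question reflects: a high base rate drops fast and flattens early, while a lower one descends more gradually.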
Soft attention vs. hard attention
What is exactly attention? To be able to understand this question, we need to dive a little into certain problems which attention seeks to solve. I think one of the seminal papers on hard attention is Recurrent Models of Visual Attention and I would encourage the reader to go through that paper, even if it … Read more
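A rough NumPy sketch of the soft/hard distinction (illustrative only; in a real model the scores come from a learned network, and hard attention is typically trained with REINFORCE-style estimators because the sampling step is non-differentiable):

```python
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(size=(4, 8))   # 4 candidate locations, 8-dim features each
scores = rng.normal(size=4)        # unnormalized attention scores

# Soft attention: a differentiable weighted average over ALL locations.
weights = np.exp(scores) / np.exp(scores).sum()  # softmax
soft_out = weights @ values

# Hard attention: commit to ONE location (here, sampled from the weights).
idx = rng.choice(4, p=weights)
hard_out = values[idx]

print(soft_out.shape, hard_out.shape)  # both (8,)
```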
What is the difference between register_parameter and register_buffer in PyTorch?
The PyTorch doc for the register_buffer() method reads: "This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm's running_mean is not a parameter, but is part of the persistent state." As you already observed, model parameters are learned and updated using SGD during the training process. However, … Read more
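A minimal sketch of the difference (Scaler is a made-up module for illustration; the two registration calls are the real PyTorch API):

```python
import torch
import torch.nn as nn

class Scaler(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered as a parameter: returned by model.parameters(),
        # so the optimizer updates it during training.
        self.register_parameter("scale", nn.Parameter(torch.ones(1)))
        # Registered as a buffer: saved in state_dict() but NOT returned
        # by model.parameters(), so the optimizer never touches it
        # (the same mechanism BatchNorm uses for running_mean).
        self.register_buffer("running_mean", torch.zeros(1))

    def forward(self, x):
        return self.scale * (x - self.running_mean)

m = Scaler()
print([name for name, _ in m.named_parameters()])  # ['scale']
print(list(m.state_dict().keys()))                 # ['scale', 'running_mean']
```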
How does binary cross entropy loss work on autoencoders?
In the context of autoencoders the input and output of the model are the same. So, if the input values are in the range [0,1], then it is acceptable to use sigmoid as the activation function of the last layer. Otherwise, you need to use an appropriate activation function for the last layer (e.g. linear), which … Read more
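A minimal NumPy sketch of the loss itself, assuming inputs scaled to [0, 1] (binary_cross_entropy here is hand-rolled for illustration, not the Keras implementation):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean BCE over all elements; clipping avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# For an autoencoder, the "target" is the input itself, scaled to [0, 1]:
x = np.array([0.0, 0.5, 1.0])
reconstruction = np.array([0.1, 0.5, 0.9])
print(binary_cross_entropy(x, reconstruction))
```

Note that BCE remains a valid reconstruction loss even for non-binary targets in [0, 1]: it is minimized exactly when the reconstruction equals the input.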
Why does sklearn Imputer need to fit?
The Imputer fills missing values with some statistic (e.g. mean, median, …) of the data. To avoid data leakage during cross-validation, it computes the statistic on the train data during the fit, stores it, and applies it to the test data during the transform. from sklearn.preprocessing import Imputer obj = Imputer(strategy='mean') obj.fit([[1, 2, 3], [2, … Read more
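Note that in current scikit-learn versions, Imputer has been replaced by SimpleImputer in sklearn.impute; the fit/transform split works the same way. A minimal sketch:

```python
import numpy as np
from sklearn.impute import SimpleImputer

imp = SimpleImputer(strategy="mean")
X_train = [[1.0, 2.0], [3.0, 4.0], [np.nan, 6.0]]
imp.fit(X_train)               # learns per-column means on TRAIN only: [2.0, 4.0]

X_test = [[np.nan, 5.0]]
print(imp.transform(X_test))   # fills with the TRAIN mean -> [[2.0, 5.0]]
```

Filling the test set with statistics learned on the training set is exactly the leakage-avoidance the answer describes.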
Precision/recall for multiclass-multilabel classification
For multi-label classification you have two ways to go. First, consider the following: n is the number of examples, x_i is the i-th example, Y_i is the ground truth label assignment of the i-th example, and h(x_i) is the set of predicted labels for the i-th example. Example-based: the metrics are computed in a per-datapoint manner. For each predicted label, only its … Read more
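A small NumPy sketch of the example-based variant under these definitions (the function name is illustrative; rows are examples, columns are labels, entries are 0/1 indicators):

```python
import numpy as np

def example_based_precision_recall(Y_true, Y_pred):
    """Example-based precision and recall for multi-label classification:
    computed per datapoint, then averaged over all examples."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    tp = (Y_true & Y_pred).sum(axis=1)          # correct labels per example
    precision = np.mean(tp / np.maximum(Y_pred.sum(axis=1), 1))
    recall = np.mean(tp / np.maximum(Y_true.sum(axis=1), 1))
    return precision, recall

Y_true = [[1, 0, 1], [0, 1, 0]]   # ground truth label sets Y_i
Y_pred = [[1, 1, 0], [0, 1, 0]]   # predicted label sets h(x_i)
print(example_based_precision_recall(Y_true, Y_pred))  # (0.75, 0.75)
```

The label-based alternative would instead aggregate true/false positives per label column across all examples before averaging.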
What’s the difference between LSTM() and LSTMCell()?
LSTM is a recurrent layer; LSTMCell is an object (which happens to be a layer too) used by the LSTM layer that contains the calculation logic for one step. A recurrent layer contains a cell object. The cell contains the core code for the calculations of each step, while the recurrent layer commands the cell … Read more
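The layer/cell split can be sketched framework-free (toy classes with made-up names; a real LSTMCell computes gates and carries a hidden/cell state pair, but the division of labor is the same):

```python
class Cell:
    """Holds the per-step computation (stands in for LSTMCell)."""
    def step(self, x_t, state):
        new_state = state + x_t       # toy recurrence, not real LSTM math
        return new_state, new_state   # (output at this step, next state)

class RecurrentLayer:
    """Loops the cell over the time dimension (stands in for the LSTM layer)."""
    def __init__(self, cell):
        self.cell = cell

    def __call__(self, sequence, initial_state=0):
        state, outputs = initial_state, []
        for x_t in sequence:
            out, state = self.cell.step(x_t, state)
            outputs.append(out)
        return outputs

layer = RecurrentLayer(Cell())
print(layer([1, 2, 3]))  # running sums: [1, 3, 6]
```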