machine-learning – Page 3

What is the difference between register_parameter and register_buffer in PyTorch?

December 30, 2023 by Tarik

The PyTorch documentation for the register_buffer() method reads: “This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the persistent state.” As you already observed, model parameters are learned and updated using SGD during the training process. However, … Read more
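To make the distinction concrete, here is a minimal sketch. The module name RunningMeanScale and its attributes are made up for illustration and are not taken from the original answer. It registers one parameter and one buffer, then checks where each one shows up.

```python
import torch
from torch import nn


class RunningMeanScale(nn.Module):
    """Hypothetical module: a learned scale plus a tracked running mean."""

    def __init__(self, num_features):
        super().__init__()
        # Learned by the optimizer: appears in module.parameters().
        self.register_parameter("weight", nn.Parameter(torch.ones(num_features)))
        # Persistent state, not learned: appears in module.buffers().
        self.register_buffer("running_mean", torch.zeros(num_features))

    def forward(self, x):
        if self.training:
            # Update the buffer by hand; no gradient flows through it.
            with torch.no_grad():
                self.running_mean.mul_(0.9).add_(0.1 * x.mean(dim=0))
        return (x - self.running_mean) * self.weight


module = RunningMeanScale(3)
param_names = [name for name, _ in module.named_parameters()]
buffer_names = [name for name, _ in module.named_buffers()]
state_keys = sorted(module.state_dict().keys())

print(param_names)   # only the parameter
print(buffer_names)  # only the buffer
print(state_keys)    # both are saved in the state dict
```

Both tensors are saved by state_dict() and moved by module.to(device), but only the parameter is handed to the optimizer through module.parameters().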

Unable to open Tensorboard in browser

December 29, 2023 by Tarik

Simple Python implementation of collaborative topic modeling?

December 28, 2023 by Tarik

Pytorch RuntimeError: CUDA out of memory with a huge amount of free memory

December 28, 2023 by Tarik

I wasted several hours before discovering that two steps were necessary: reducing the batch size and shrinking the input image size (its width, in my case).
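As a rough sketch of those two steps, assuming a plain in-memory dataset (the tensor sizes and variable names below are illustrative, not from the original post): a smaller batch_size on the DataLoader and a downscale with torch.nn.functional.interpolate both cut the number of values sent through the model per step.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Illustrative stand-in data: 16 RGB images of 64x64 pixels.
images = torch.rand(16, 3, 64, 64)
labels = torch.zeros(16, dtype=torch.long)

# Step 1: a smaller batch size means fewer images on the device at once.
loader = DataLoader(TensorDataset(images, labels), batch_size=4)
batch, _ = next(iter(loader))

# Step 2: halve height and width, which quarters the pixels per image.
small_batch = F.interpolate(batch, size=(32, 32), mode="bilinear", align_corners=False)

print(batch.shape, batch.numel())
print(small_batch.shape, small_batch.numel())
```

In a real pipeline the resize would more likely live in the dataset's transform, so the full-size images never reach the GPU at all.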

Does TensorFlow have cross validation implemented?

December 28, 2023 by Tarik

As already discussed, TensorFlow doesn’t provide its own way to cross-validate a model. The recommended way is to use scikit-learn’s KFold. It’s a bit tedious, but doable. Here’s a complete example of cross-validating an MNIST model with TensorFlow and KFold:

    from sklearn.model_selection import KFold
    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data

    # Parameters
    learning_rate = 0.01
    … Read more
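The KFold loop itself does not depend on the framework. The sketch below shows only that loop, on assumed synthetic data and with a tiny nearest-centroid classifier in NumPy standing in for the TensorFlow model. It is not the MNIST example from the post; fit_centroids and predict are hypothetical helpers.

```python
import numpy as np
from sklearn.model_selection import KFold

# Synthetic two-class data (an assumption, used instead of MNIST).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(50, 2)),
    rng.normal(2.0, 1.0, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)


def fit_centroids(X_train, y_train):
    """Stand-in for model training: one mean vector per class."""
    return np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])


def predict(centroids, X_test):
    """Assign each sample to the class of the nearest centroid."""
    distances = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


kfold = KFold(n_splits=5, shuffle=True, random_state=0)
fold_accuracies = []
for train_idx, test_idx in kfold.split(X):
    # Train from scratch on each training split, score on the held-out fold.
    centroids = fit_centroids(X[train_idx], y[train_idx])
    predictions = predict(centroids, X[test_idx])
    fold_accuracies.append(float((predictions == y[test_idx]).mean()))

mean_accuracy = float(np.mean(fold_accuracies))
print(fold_accuracies, mean_accuracy)
```

With a real network, the key point is the same: build and train a fresh model inside the loop so no fold sees weights fitted on its own test data.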

General approach to developing an image classification algorithm for Dilbert cartoons

December 28, 2023 by Tarik

So I think you are on the right track with respect to your step 1 (apply some algorithm to the image, which converts it into a set of features). This project is more challenging than most ML problems because here you will actually have to create your training data set from the raw data (the individual frames … Read more

Data Standardization vs Normalization vs Robust Scaler

December 27, 2023 by Tarik

“Am I right to say that Standardization also gets affected negatively by the extreme values?” Indeed you are; the scikit-learn docs themselves clearly warn about such a case: “However, when data contains outliers, StandardScaler can often be misled. In such cases, it is better to use a scaler that is robust against outliers.” … Read more
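A small sketch of the effect, on made-up numbers: one extreme value inflates the mean and standard deviation used by StandardScaler, which squeezes the ordinary points together, while RobustScaler (median and interquartile range) keeps them spread out.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Four ordinary values and one outlier (illustrative data).
x = np.array([[1.0], [2.0], [3.0], [4.0], [1000.0]])

standard = StandardScaler().fit_transform(x)
robust = RobustScaler().fit_transform(x)

# Range covered by the four ordinary points after each scaling.
standard_spread = float(standard[:4].max() - standard[:4].min())
robust_spread = float(robust[:4].max() - robust[:4].min())

print(standard.ravel())
print(robust.ravel())
print(standard_spread, robust_spread)
```

After standardization the four ordinary points are nearly indistinguishable, whereas the robust scaling maps them to -1, -0.5, 0 and 0.5 and leaves the outlier far away.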

Information Gain calculation with Scikit-learn

December 27, 2023 by Tarik

You can use scikit-learn’s mutual_info_classif. Here is an example:

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_selection import mutual_info_classif
    from sklearn.feature_extraction.text import CountVectorizer

    categories = ['talk.religion.misc', 'comp.graphics', 'sci.space']
    newsgroups_train = fetch_20newsgroups(subset="train", categories=categories)
    X, Y = newsgroups_train.data, newsgroups_train.target

    cv = CountVectorizer(max_df=0.95, min_df=2, max_features=10000, stop_words="english")
    X_vec = cv.fit_transform(X)

    # Map each vocabulary term to its mutual information with the class label
    res = dict(zip(cv.get_feature_names_out(), mutual_info_classif(X_vec, Y, discrete_features=True)))
    print(res)

This will output … Read more

how to get rid of pandas converting large numbers in excel sheet to exponential?

December 27, 2023 by Tarik

The way scientific notation is applied is controlled via pandas’ display options:

    import pandas as pd

    pd.set_option('display.float_format', '{:.2f}'.format)
    df = pd.DataFrame({'Traded Value': [67867869890077.96, 78973434444543.44],
                       'Deals': [789797, 789878]})
    print(df)

           Traded Value   Deals
    0 67867869890077.96  789797
    1 78973434444543.44  789878

If this is simply for presentational purposes, you may convert your data to strings while formatting them on a column-by-column basis:

    df = pd.DataFrame({'Traded Value': [67867869890077.96, 78973434444543.44], … Read more
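Both options can be sketched side by side. The excerpt is cut off before the column-by-column code, so the second half below is only one plausible way to do it (Series.map with a format string), not necessarily what the original answer used.

```python
import pandas as pd

df = pd.DataFrame({
    "Traded Value": [67867869890077.96, 78973434444543.44],
    "Deals": [789797, 789878],
})

# Option 1: change how floats are displayed; the stored values stay floats.
pd.set_option("display.float_format", "{:.2f}".format)
fixed_repr = df.to_string()
pd.reset_option("display.float_format")  # restore the default afterwards

# Option 2 (assumed approach): format one column into strings.
as_text = df.copy()
as_text["Traded Value"] = as_text["Traded Value"].map("{:.2f}".format)

print(fixed_repr)
print(as_text.dtypes)
```

Option 1 only affects printing, so arithmetic on the column still works. Option 2 produces strings, which is fine for export or display but not for further calculations.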

Extract target from Tensorflow PrefetchDataset

December 27, 2023 by Tarik

You can convert it to a list with list(ds) and then recompile it as a normal Dataset with tf.data.Dataset.from_tensor_slices(list(ds)). From there your nightmare begins again but at least it’s a nightmare that other people have had before. Note that for more complex datasets (e.g. nested dictionaries) you will need more preprocessing after calling list(ds), but … Read more
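As an alternative to rebuilding the dataset, the targets can be collected by iterating over it directly. The sketch below assumes the dataset yields (features, target) tuples in batches and is not shuffled between passes; the arrays are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Illustrative data: six samples with two features and an integer target.
features = np.arange(12, dtype=np.float32).reshape(6, 2)
targets = np.array([0, 1, 0, 1, 1, 0], dtype=np.int64)

# Batching and prefetching yields a prefetched dataset like the one in the question.
ds = tf.data.Dataset.from_tensor_slices((features, targets)).batch(4).prefetch(1)

# Each element is a (features_batch, target_batch) tuple; keep only the targets.
target_batches = [y.numpy() for _, y in ds]
all_targets = np.concatenate(target_batches)

print(all_targets)
```

If the pipeline shuffles on every iteration, collect features and targets in the same pass so they stay aligned.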