Changing column names of a data frame

Use the colnames() function:

    R> X <- data.frame(bad=1:3, worse=rnorm(3))
    R> X
      bad     worse
    1   1 -2.440467
    2   2  1.320113
    3   3 -0.306639
    R> colnames(X) <- c("good", "better")
    R> X
      good    better
    1    1 -2.440467
    2    2  1.320113
    3    3 -0.306639

You can also subset:

    R> colnames(X)[2] <- "superduper"

How to sum a variable by group

Using aggregate:

    aggregate(x$Frequency, by=list(Category=x$Category), FUN=sum)
      Category  x
    1    First 30
    2   Second  5
    3    Third 34

In the example above, multiple dimensions can be specified in the list. Multiple aggregated metrics of the same data type can be incorporated via cbind:

    aggregate(cbind(x$Frequency, x$Metric2, x$Metric3) …

(embedding @thelatemail's comment) aggregate has a formula interface too:

    aggregate(Frequency …

Pandas read_csv: low_memory and dtype options

The deprecated low_memory option

The low_memory option is not properly deprecated, but it should be, since it does not actually do anything differently [source].

The reason you get this low_memory warning is that guessing dtypes for each column is very memory demanding. Pandas tries to determine what dtype to set by analyzing the data in each …
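As a minimal sketch of the dtype option named in the title: passing explicit dtypes to read_csv means pandas does not have to guess them. The CSV text and the column names user_id and score below are hypothetical, made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real call would pass a file path instead.
csv_text = "user_id,score\n1,0.5\n2,1.5\n3,2.0\n"

# Declare each column's dtype up front so pandas does not infer it.
df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"user_id": "int32", "score": "float64"},
)

# Collect the resulting dtypes as plain strings for inspection.
dtypes = {name: str(dt) for name, dt in df.dtypes.items()}
print(dtypes)
```

Whether this silences the warning for a given file depends on the data; the sketch only shows how the dtype mapping is passed.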

Convert Python dict into a dataframe

The error here arises because the DataFrame constructor is being called with scalar values (where it expects values to be a list/dict/… i.e. have multiple columns):

    pd.DataFrame(d)
    ValueError: If using all scalar values, you must pass an index

You could take the items from the dictionary (i.e. the key-value pairs):

    In [11]: pd.DataFrame(d.items())  # or list(d.items())

…
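A small sketch of both behaviours described above, assuming a dictionary of scalar values. The dictionary contents and the column names key and value are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical dictionary mapping keys to scalar values.
d = {"2012-06-08": 388, "2012-06-09": 388, "2012-06-10": 390}

# Passing the dict directly fails because every value is a scalar.
try:
    pd.DataFrame(d)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Building the frame from the key-value pairs gives one row per item.
df = pd.DataFrame(list(d.items()), columns=["key", "value"])
print(error_message)
print(df)
```

Wrapping d.items() in list() is a conservative choice here, since it hands the constructor a plain list of tuples.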
