load csv into 2D matrix with numpy for plotting

Pure numpy:

    numpy.loadtxt("test.csv", delimiter=",", skiprows=1)

Check out the loadtxt documentation. You can also use Python's csv module:

    import csv
    import numpy

    # On Python 3, csv.reader needs a text-mode file, not "rb"
    reader = csv.reader(open("test.csv", newline=""), delimiter=",")
    x = list(reader)
    result = numpy.array(x).astype("float")

You will have to convert it to your favorite numeric type. I guess you can write the whole thing in one line: …
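To make the two loading routes above easier to compare, here is a minimal sketch that runs both on the same input. It assumes a small in-memory CSV (the hypothetical `CSV_TEXT`, standing in for test.csv) with one header row. The helper names are illustrative and not part of the original answer. Unlike the csv-module snippet above, this version skips the header explicitly so that both routes see the same rows.

```python
import csv
import io

import numpy as np

# Hypothetical CSV content standing in for test.csv: a header row, then numbers.
CSV_TEXT = "x,y,z\n1,2,3\n4,5,6\n"


def load_with_loadtxt(text):
    """Parse CSV text with numpy.loadtxt, skipping the header row."""
    return np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1)


def load_with_csv_module(text):
    """Parse CSV text with the csv module, then convert the strings to floats."""
    reader = csv.reader(io.StringIO(text), delimiter=",")
    next(reader)  # skip the header row, mirroring skiprows=1
    return np.array(list(reader)).astype(float)


matrix_a = load_with_loadtxt(CSV_TEXT)
matrix_b = load_with_csv_module(CSV_TEXT)

print(matrix_a.shape)
print(np.array_equal(matrix_a, matrix_b))
```

Both routes should give the same 2D float array. The csv-module route reads every field as a string first, which is why the explicit `astype(float)` conversion is needed.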

Pandas long to wide reshape, by two variables

Here's another, more fleshed-out solution, taken from Chris Albon's site.

Create a "long" dataframe:

    import pandas as pd

    raw_data = {'patient': [1, 1, 1, 2, 2],
                'obs': [1, 2, 3, 1, 2],
                'treatment': [0, 1, 0, 1, 0],
                'score': [6252, 24243, 2345, 2342, 23525]}
    df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])

Make the data "wide":

    df.pivot(index='patient', columns='obs', …
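The excerpt cuts off before the remaining arguments to `pivot`, so the following is only a sketch of how the long-to-wide step might look. It assumes `score` is the column being spread (the `values='score'` argument is an assumption, not taken from the source) and reuses the sample data shown above.

```python
import pandas as pd

# Sample "long" data from the excerpt above: one row per patient/observation pair.
raw_data = {
    "patient": [1, 1, 1, 2, 2],
    "obs": [1, 2, 3, 1, 2],
    "treatment": [0, 1, 0, 1, 0],
    "score": [6252, 24243, 2345, 2342, 23525],
}
df = pd.DataFrame(raw_data, columns=["patient", "obs", "treatment", "score"])

# Assumption: 'score' is the value column to spread across the 'obs' columns.
wide = df.pivot(index="patient", columns="obs", values="score")

print(wide)
```

Each patient becomes one row and each `obs` value becomes a column. Patient 2 has no third observation, so that cell is filled with NaN and the scores are stored as floats.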

Gather multiple sets of columns

This approach seems pretty natural to me:

    df %>%
      gather(key, value, -id, -time) %>%
      extract(key, c("question", "loop_number"), "(Q.\\..)\\.(.)") %>%
      spread(question, value)

First gather() all question columns, use extract() to separate the key into question and loop_number, then spread() question back into the columns.

    #>   id       time loop_number        Q3.2        Q3.3
    #> 1  1 2009-01-01           1 0.142259203 -0.35842736
    #> …

Reshape three column data frame to matrix (“long” to “wide” format) [duplicate]

There are many ways to do this. This answer starts with what is quickly becoming the standard method, but also includes older methods and various other methods from answers to similar questions scattered around this site.

    tmp <- data.frame(x = gl(2, 3, labels = letters[24:25]),
                      y = gl(3, 1, 6, labels = letters[1:3]),
                      z = c(1, 2, 3, 3, 3, 2))

Using the tidyverse: The cool new way to do this is …

Split delimited strings in a column and insert as new rows [duplicate]

As of Dec 2014, this can be done using the unnest function from Hadley Wickham's tidyr package (see release notes http://blog.rstudio.org/2014/12/08/tidyr-0-2-0/)

    > library(tidyr)
    > library(dplyr)
    > mydf
      V1    V2
    2  1 a,b,c
    3  2   a,c
    4  3   b,d
    5  4   e,f
    6  .     .

    > mydf %>%
        mutate(V2 = strsplit(as.character(V2), ",")) %>%
        unnest(V2)
      V1 V2
    …

Reshaping data.frame from wide to long format

Three alternative solutions:

1) With data.table:

data.table provides its own melt function, an extended and improved implementation of melt from the reshape2 package. It also has more parameters than the reshape2 version. For example, you can specify the name of the variable column:

    library(data.table)
    long <- melt(setDT(wide), id.vars = c("Code", "Country"), …