Selecting a row of pandas series/dataframe by integer index

Echoing @HYRY, see the new docs in 0.11: http://pandas.pydata.org/pandas-docs/stable/indexing.html
Here we have new operators: .iloc to explicitly support only integer indexing, and .loc to explicitly support only label indexing. For example, imagine this scenario:
In [1]: df = pd.DataFrame(np.random.randn(5,2),index=range(0,10,2),columns=list('AB'))
In [2]: df
Out[2]:
          A         B
0  1.068932 -0.794307
2 -0.470056  1.192211
4 -0.284561  0.756029
6  1.037563 …
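To make the distinction between the two operators concrete, here is a minimal sketch. It assumes a frame with the same index labels as the excerpt (0, 2, 4, 6, 8) but fills it with fixed integers rather than random values, so the results are reproducible; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same index labels and columns as the excerpt; the values are fixed
# integers (an assumption) so the results below are deterministic.
df = pd.DataFrame(
    np.arange(10).reshape(5, 2),
    index=range(0, 10, 2),
    columns=list("AB"),
)

# .iloc is purely positional: this is the third row, whatever its label.
by_position = df.iloc[2]

# .loc is purely label-based: this is the row whose index label is 2,
# which happens to be the second row.
by_label = df.loc[2]

print(by_position.tolist())  # [4, 5]
print(by_label.tolist())     # [2, 3]
```

Because the index labels are themselves integers, `df.iloc[2]` and `df.loc[2]` return different rows, which is exactly the ambiguity the two operators are meant to remove.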

Quickly reading very large tables as dataframes

An update, several years later
This answer is old, and R has moved on. Tweaking read.table to run a bit faster has precious little benefit. Your options are:
Using vroom from the tidyverse package vroom for importing data from csv/tab-delimited files directly into an R tibble. See Hector’s answer.
Using fread in data.table for importing …

pandas create new column based on values from other columns / apply a function of multiple columns, row-wise

OK, two steps to this. First is to write a function that does the translation you want. I’ve put an example together based on your pseudo-code:
def label_race(row):
    if row['eri_hispanic'] == 1:
        return 'Hispanic'
    if row['eri_afr_amer'] + row['eri_asian'] + row['eri_hawaiian'] + row['eri_nat_amer'] + row['eri_white'] > 1:
        return 'Two Or More'
    …
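A row-wise function like this is typically applied with `DataFrame.apply(..., axis=1)`. The sketch below is an illustration, not the answer's full code: the sample data is invented, the function name `label_race_sketch` is hypothetical, and the final `'Other'` fallback is a placeholder standing in for the branches that the excerpt cuts off.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the excerpt.
df = pd.DataFrame({
    "eri_hispanic": [1, 0, 0],
    "eri_afr_amer": [0, 1, 0],
    "eri_asian":    [0, 1, 0],
    "eri_hawaiian": [0, 0, 0],
    "eri_nat_amer": [0, 0, 0],
    "eri_white":    [0, 0, 1],
})

def label_race_sketch(row):
    """Return a label for one row, using only the two branches visible in the excerpt."""
    if row["eri_hispanic"] == 1:
        return "Hispanic"
    if (row["eri_afr_amer"] + row["eri_asian"] + row["eri_hawaiian"]
            + row["eri_nat_amer"] + row["eri_white"]) > 1:
        return "Two Or More"
    # Placeholder fallback (an assumption): the remaining branches are elided in the excerpt.
    return "Other"

# axis=1 passes each row to the function as a Series.
df["race_label"] = df.apply(label_race_sketch, axis=1)
print(df["race_label"].tolist())  # ['Hispanic', 'Two Or More', 'Other']
```

Passing `axis=1` is what makes the function receive one row at a time, so `row['eri_hispanic']` and the other lookups resolve to scalar values.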

Create an empty data.frame

Just initialize it with empty vectors:
df <- data.frame(Date=as.Date(character()), File=character(), User=character(), stringsAsFactors=FALSE)
Here’s another example with different column types:
df <- data.frame(Doubles=double(), Ints=integer(), Factors=factor(), Logicals=logical(), Characters=character(), stringsAsFactors=FALSE)
> str(df)
'data.frame': 0 obs. of 5 variables:
 $ Doubles : num
 $ Ints : int
 $ Factors : Factor w/ 0 levels:
 $ Logicals …
