r - Delete rows based on values in the columns and a threshold value


I have a table, the start of which is shown below:

                    sm_h1455  sm_h1456  sm_h1457  sm_h1461  sm_h1462  sm_h1463
ensg00000001617.7          0         0         0         0         0         0
ensg00000001626.9          0         0         0         0         0         0
ensg00000002587.5         10         0         6         2         0         2
ensg00000002726.15         8        14         0         2        16         2
ensg00000002745.8          6         2         2         0         0         4

I want to delete the rows in which >= 80% of the columns have the value 0. There are 6 columns here, so if 5 or more of the columns in a row are 0, the row needs to be deleted.
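As a quick check of that arithmetic, the smallest zero count that reaches the 80% threshold can be computed directly. This is only a sketch; the names n_cols, threshold, and min_zeros are made up for illustration and do not appear in the original post.

```r
n_cols <- 6        # number of columns in the sample table
threshold <- 0.8   # delete a row when at least this fraction of it is zero

# Smallest number of zeros whose proportion reaches the threshold.
min_zeros <- ceiling(threshold * n_cols)
min_zeros

# Proportion of zeros for 4 and for 5 zeros out of 6 columns.
c(4, 5) / n_cols
```

Four zeros out of six stay below 0.8, while five zeros exceed it, which matches the "5 or more" rule stated above.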

I have this code:

data = data[!rowSums(data == 0), ]

But this code deletes a row as long as it has any 0 in it, without taking the 80% threshold into account.
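The behaviour described above can be reproduced on the five sample rows from the question. The sketch below rebuilds them by hand (the real data set presumably has many more rows); zero_counts, keep_any, and filtered_any are illustrative names only.

```r
# The five sample rows from the question, typed in by hand.
data <- data.frame(
  sm_h1455 = c(0, 0, 10, 8, 6),
  sm_h1456 = c(0, 0, 0, 14, 2),
  sm_h1457 = c(0, 0, 6, 0, 2),
  sm_h1461 = c(0, 0, 2, 2, 0),
  sm_h1462 = c(0, 0, 0, 16, 0),
  sm_h1463 = c(0, 0, 2, 2, 4),
  row.names = c("ensg00000001617.7", "ensg00000001626.9",
                "ensg00000002587.5", "ensg00000002726.15",
                "ensg00000002745.8")
)

zero_counts <- rowSums(data == 0)  # number of zeros in each row
keep_any <- !zero_counts           # TRUE only where the count is exactly 0

# Every sample row contains at least one zero, so nothing survives.
filtered_any <- data[keep_any, ]
nrow(filtered_any)
```

Negating a count with ! turns every non-zero count into FALSE, so a single zero is enough to drop a row.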

I think @Hong Ooi's answer is incorrect in this case. This should give the result you have asked for:

data <- data[rowSums(data == 0) / ncol(data) < 0.8, ]

data == 0 returns a logical matrix of the same shape as data, filled with TRUE where the value at that location is equal to zero and FALSE otherwise. Numerically, R treats TRUE as having a value of 1 and FALSE as having a value of zero.
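A small illustration of that coercion, using a made-up two-row data frame called toy (not the asker's data):

```r
# A tiny made-up data frame, only for illustration.
toy <- data.frame(a = c(0, 3), b = c(0, 0))

is_zero <- toy == 0   # logical matrix, TRUE where the value is zero
is_zero

# In arithmetic, TRUE counts as 1 and FALSE as 0.
sum(c(TRUE, FALSE, TRUE))
as.numeric(is_zero)
```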

rowSums adds up the numerical equivalents of the TRUE and FALSE values for each row of the result of data == 0. So rowSums(data == 0) gives the number of elements in each row of data that are zero.

ncol(data) is the number of columns in the original data object.

rowSums(data == 0) / ncol(data) is therefore the proportion of elements equal to 0 in each row.

Finally, we can discard the rows where this proportion is not less than 80% by filtering (using the [] notation).
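Putting the steps together on the five sample rows from the question gives the following sketch. The data frame is rebuilt by hand here, and zero_prop and filtered are illustrative names rather than anything from the original post.

```r
# The five sample rows from the question, typed in by hand.
data <- data.frame(
  sm_h1455 = c(0, 0, 10, 8, 6),
  sm_h1456 = c(0, 0, 0, 14, 2),
  sm_h1457 = c(0, 0, 6, 0, 2),
  sm_h1461 = c(0, 0, 2, 2, 0),
  sm_h1462 = c(0, 0, 0, 16, 0),
  sm_h1463 = c(0, 0, 2, 2, 4),
  row.names = c("ensg00000001617.7", "ensg00000001626.9",
                "ensg00000002587.5", "ensg00000002726.15",
                "ensg00000002745.8")
)

# Proportion of zeros in each row: zero count divided by column count.
zero_prop <- rowSums(data == 0) / ncol(data)
zero_prop

# Keep only the rows whose proportion of zeros is below 80%.
filtered <- data[zero_prop < 0.8, ]
filtered
```

On this sample the two all-zero rows are dropped, and the three rows with one or two zeros are kept.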

Update: @Hong Ooi's edit means that answer is correct now.

