R - Delete rows based on the values in the columns and a threshold value
I have a table; the start of it is shown below:

                        sm_h1455  sm_h1456  sm_h1457  sm_h1461  sm_h1462  sm_h1463
    ensg00000001617.7          0         0         0         0         0         0
    ensg00000001626.9          0         0         0         0         0         0
    ensg00000002587.5         10         0         6         2         0         2
    ensg00000002726.15         8        14         0         2        16         2
    ensg00000002745.8          6         2         2         0         0         4

I want to delete the rows in which >= 80% of the columns have the value 0. There are 6 columns here, so if 5 or more of the columns in a row are 0, that row needs to be deleted.
I have this code:

    data = data[!rowSums(data == 0), ]

but this code deletes a row as long as it contains any 0, without taking the 80% threshold into account.
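To see why that attempt removes too much, here is a minimal sketch that rebuilds the five sample rows by hand (the `data.frame()` construction and the name `attempt` are assumptions for illustration, not part of the original post). Negating a count turns any nonzero count into `FALSE`, so only rows with no zeros at all would survive.

```r
# Sample rows from the question, rebuilt by hand for illustration.
data <- data.frame(
  sm_h1455 = c(0, 0, 10, 8, 6),
  sm_h1456 = c(0, 0, 0, 14, 2),
  sm_h1457 = c(0, 0, 6, 0, 2),
  sm_h1461 = c(0, 0, 2, 2, 0),
  sm_h1462 = c(0, 0, 0, 16, 0),
  sm_h1463 = c(0, 0, 2, 2, 4),
  row.names = c("ensg00000001617.7", "ensg00000001626.9",
                "ensg00000002587.5", "ensg00000002726.15",
                "ensg00000002745.8")
)

# Number of zeros per row: 6, 6, 2, 1, 2.
zero_count <- rowSums(data == 0)

# `!` coerces the counts to logical: 0 becomes TRUE, anything else FALSE.
# Every sample row has at least one zero, so no row is kept.
attempt <- data[!zero_count, ]
print(nrow(attempt))
```

On these five rows the attempt keeps nothing, because each row contains at least one zero.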
I think @Hong Ooi's answer is incorrect in this case. This should give the result you have asked for:

    data <- data[rowSums(data==0)/ncol(data) < 0.8, ]

data==0 returns a logical matrix with the same dimensions as data, filled with TRUE where the value at that location is equal to zero and FALSE otherwise. Numerically, R treats TRUE as having a value of 1 and FALSE as having a value of zero.
rowSums adds up the numerical equivalents of the TRUE and FALSE values in each row of the result returned by data==0. So rowSums(data==0) gives the number of elements in each row of data that are zero.
ncol(data) is the number of columns in the original data object.
rowSums(data==0)/ncol(data) is therefore the proportion of elements equal to 0 in each row.
Finally, we can discard the rows where that proportion is not less than 80% by filtering (using the [] notation).
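The steps above can be traced on the sample rows from the question. This is only a sketch: the data frame is rebuilt by hand, and the intermediate names `is_zero`, `zero_count`, `zero_prop` and `filtered` are hypothetical names introduced for illustration.

```r
# Sample rows from the question, rebuilt by hand for illustration.
data <- data.frame(
  sm_h1455 = c(0, 0, 10, 8, 6),
  sm_h1456 = c(0, 0, 0, 14, 2),
  sm_h1457 = c(0, 0, 6, 0, 2),
  sm_h1461 = c(0, 0, 2, 2, 0),
  sm_h1462 = c(0, 0, 0, 16, 0),
  sm_h1463 = c(0, 0, 2, 2, 4),
  row.names = c("ensg00000001617.7", "ensg00000001626.9",
                "ensg00000002587.5", "ensg00000002726.15",
                "ensg00000002745.8")
)

is_zero    <- data == 0                # TRUE where a cell equals zero
zero_count <- rowSums(is_zero)         # zeros per row: 6, 6, 2, 1, 2
zero_prop  <- zero_count / ncol(data)  # proportion of zeros per row

# Keep only the rows where fewer than 80% of the columns are zero.
filtered <- data[zero_prop < 0.8, ]
print(filtered)
```

With 6 columns, a row with 5 zeros has a proportion of about 0.83 and is dropped, while a row with 4 zeros (about 0.67) is kept. Here the two all-zero rows are removed and the remaining three rows are retained.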
Update: @Hong Ooi's edit means that answer is correct now.