python - Extract Quarterly Data from Multi Quarter Periods
Public companies in the US make quarterly filings (10-Q) and yearly filings (10-K). In most cases they file three 10-Qs per year and one 10-K.
In most cases, the quarterly filings (10-Qs) contain quarterly data. For example, "revenue for the 3 months ending March 31, 2005."
The yearly filings often only have year-end sums. For example: "revenue for the twelve months ending December 31, 2005."
In order to get the value for Q4 of 2005, I need to take the yearly data and subtract the values for each of the quarters (Q1-Q3).
In some cases, each of the quarterly figures is expressed as year to date. For example, the first quarterly filing is "revenue for the 3 months ending March 31, 2005." The second is "revenue for the 6 months ending June 30, 2005." The third is "revenue for the 9 months ending September 30, 2005." The yearly filing is as above, "revenue for the twelve months ending December 31, 2005." This is a generalization of the issue above: the goal is still to extract quarterly data, which can be accomplished by repeatedly subtracting the previous period's data.
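The repeated subtraction described above can be sketched with `DataFrame.diff` along the columns. This is only a minimal illustration: the column labels (`3m`, `6m`, `9m`, `12m`) and the numbers are made up, and it assumes the periods are already ordered from shortest to longest within one fiscal year.

```python
import pandas as pd

# Hypothetical year-to-date (cumulative) figures; labels and values are invented.
ytd = pd.DataFrame(
    {
        "3m": [3, 2],
        "6m": [7, 5],
        "9m": [12, 9],
        "12m": [18, 17],
    },
    index=["revenue", "expense"],
)

# Difference along the columns: each period minus the period before it.
quarterly = ytd.diff(axis=1)

# The first column has no predecessor (diff leaves NaN there), and the
# 3-month figure is already a standalone quarter, so copy it over.
quarterly.iloc[:, 0] = ytd.iloc[:, 0]

quarterly.columns = ["q1", "q2", "q3", "q4"]
quarterly = quarterly.astype(int)
print(quarterly)
```

Because the subtraction runs over whole columns, every field (revenue, expense, and so on) is handled in one step rather than field by field.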
My question is: what is the best way in pandas to accomplish this quarterly data extraction?
There are a large number of fields (revenue, profit, expenses, etc.) per period.
A related question that I asked about how to express this period data in pandas: Creating Period for Multi Quarter Timespan in Pandas
Here is some example data for the first problem (three 10-Qs and one 10-K that only has year-end data):
10q:
- http://www.sec.gov/archives/edgar/data/1174922/000119312512225309/d326512d10q.htm#tx326512_4
- http://www.sec.gov/archives/edgar/data/1174922/000119312512347659/d360762d10q.htm#tx360762_3
- http://www.sec.gov/archives/edgar/data/1174922/000119312512463380/d411552d10q.htm#tx411552_3
10k:
Calcbench refers to this problem: http://www.calcbench.com/home/userguide: "Q4 calculation: Some companies do not report Q4 data, rather opting to report full year data instead. We’ll automatically calculate this for you. Data in blue is calculated."
There are multiple years of data, and for each year I want to calculate the missing fourth quarter:
         2012q2  2012q3  2012y  2013q1  2013q2  2013q3  2013y
revenue       1       1      1       1       1       1      1
expense      10      10     10      10      10      10     10
You can define a function that subtracts the quarterly totals from the annual number, and then apply that function to each row, storing the result in a new column.
    In [2]: df
    Out[2]:
             annual  q1  q2  q3
    revenue      18   3   4   5
    expense      17   2   3   4

    In [3]: def calc_q4(row):
       ...:     return row['annual'] - row['q1'] - row['q2'] - row['q3']

    In [4]: df['q4'] = df.apply(calc_q4, axis = 1)

    In [5]: df
    Out[5]:
             annual  q1  q2  q3  q4
    revenue      18   3   4   5   6
    expense      17   2   3   4   8
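For the multi-year layout in the question, one possible extension is to loop over the years and do the same subtraction column-wise, without `apply`. The sketch below assumes columns are named `<year>q<n>` for quarters and `<year>y` for the annual total, as in the sample table. The helper name `add_q4` and the numbers are hypothetical. A year that lacks any of Q1-Q3 (as 2012 appears to in the sample, which has no `2012q1`) is skipped, since subtracting an incomplete set of quarters would give a wrong Q4.

```python
import pandas as pd

# Hypothetical wide frame: one column per period, invented values.
df = pd.DataFrame(
    {
        "2012q1": [3, 2], "2012q2": [4, 3], "2012q3": [5, 4], "2012y": [18, 17],
        "2013q1": [4, 3], "2013q2": [5, 4], "2013q3": [6, 5], "2013y": [22, 21],
    },
    index=["revenue", "expense"],
)


def add_q4(frame):
    """Return a copy of frame with a '<year>q4' column for each complete year."""
    out = frame.copy()
    # Every column ending in 'y' is treated as an annual total for that year.
    years = sorted({col[:4] for col in frame.columns if col.endswith("y")})
    for year in years:
        quarters = [f"{year}q{n}" for n in (1, 2, 3)]
        # Only compute Q4 when all three earlier quarters are present.
        if all(q in frame.columns for q in quarters):
            out[f"{year}q4"] = frame[f"{year}y"] - frame[quarters].sum(axis=1)
    return out


result = add_q4(df)
print(result[["2012q4", "2013q4"]])
```

The new columns are appended at the end. If chronological order matters, the columns can be reordered afterwards.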