Disclosures/Text

It is easy to access the contents of SEC filings from Calcbench.

Calcbench parses the sections of 10-K and 10-Q filings, such as Management’s Discussion and Analysis and Risk Factors. See the list of available sections at https://www.calcbench.com/disclosure_list.

Footnotes and other text

Search footnotes and other sections of 10-K filings; see https://www.calcbench.com/footnote.

Parameters
  • company_identifiers (list(str)) – list of tickers or CIK codes

  • year (int) – Year to get data for

  • period (int) – Period of data to get: 0 for annual data; 1, 2, 3, or 4 for quarterly data.

  • period_type (str) – “quarterly” or “annual”. Only applicable when no other period data is supplied. Use “annual” to search only end-of-year documents.

  • disclosure_names (list(str)) – The sections to retrieve; see the full list at https://www.calcbench.com/disclosure_list. You cannot request XBRL and non-XBRL sections in the same request. E.g. ["Management's Discussion And Analysis", "Risk Factors"]

  • all_history (bool) – Search all time periods

  • updated_from (datetime.date) – Include filings from this date onward.

  • sub_divide (bool) – Return the document split into sections based on headers.

  • all_documents (bool) – Return all of the documents for a single company/period.

  • entire_universe (bool) – Search all companies

  • progress_bar (tqdm.tqdm) – Pass a tqdm progress bar to keep an eye on things.

Returns

A generator of DocumentSearchResults

Return type

generator(DocumentSearchResults)
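
Because the return value is a generator, it can be iterated only once. Wrap the call in list() to keep the results, or use itertools.islice to look at the first few. The sketch below illustrates this with a stand-in generator so that it runs without Calcbench credentials; fake_document_search and the dictionaries it yields are hypothetical and are not part of calcbench.

```python
from itertools import islice


def fake_document_search(company_identifiers):
    """Hypothetical stand-in for calcbench.document_search: one result per company."""
    for ticker in company_identifiers:
        yield {"ticker": ticker, "disclosure": "Risk Factors"}


results = fake_document_search(["msft", "goog", "aapl"])

# Take the first two results without consuming the rest.
first_two = list(islice(results, 2))

# The generator remembers its position, so only the unread result is left.
remaining = list(results)

# Once exhausted, a generator yields nothing on a second pass.
second_pass = list(results)

print([r["ticker"] for r in first_two])
print([r["ticker"] for r in remaining])
print(second_pass)
```

If the results are needed more than once, materialize them with list() up front, as the usage example for document_search does.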

Usage:

>>> import calcbench
>>> import tqdm
>>> sp500 = calcbench.tickers(index='SP500')
>>> with tqdm.tqdm() as progress_bar:
...     risk_factors = list(calcbench.document_search(company_identifiers=sp500, disclosure_names=['Risk Factors'], all_history=True, progress_bar=progress_bar))

calcbench.document_dataframe(company_identifiers=[], disclosure_names=[], all_history=False, year=None, period=None, progress_bar=None, period_type=None, identifier_key='ticker')

Disclosures/Footnotes in a DataFrame

Parameters
  • company_identifiers (list(str)) – list of tickers or CIK codes

  • disclosure_names (list(str)) – The sections to retrieve; see the full list at https://www.calcbench.com/disclosure_list. You cannot request XBRL and non-XBRL sections in the same request. E.g. ["Management's Discussion And Analysis", "Risk Factors"]

  • all_history (bool) – Search all time periods

  • year (int) – The year to search

  • period (int) – Period of data to get: 0 for annual data; 1, 2, 3, or 4 for quarterly data.

  • period_type (str) – “quarterly” or “annual”. Only applicable when no other period data is supplied. Use “annual” to search only end-of-year documents.

  • progress_bar (tqdm.tqdm) – Pass a tqdm progress bar to keep an eye on things.

  • identifier_key (str) – “ticker” or “CIK”; how to index the returned DataFrame.

Returns

A DataFrame indexed by document name -> company identifier.

Return type

pandas.DataFrame
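
The period arguments interact: period takes a code from 0 to 4, and period_type only applies when no other period data is supplied. The sketch below spells out that reading of the parameter descriptions. describe_period is a hypothetical helper, not part of calcbench, and the rule that explicit year/period data overrides period_type is an assumption drawn from the wording above rather than from the library's source.

```python
from typing import Optional

# Period codes as described in the parameter list: 0 is annual, 1-4 are quarters.
PERIOD_LABELS = {0: "annual", 1: "Q1", 2: "Q2", 3: "Q3", 4: "Q4"}
PERIOD_TYPES = ("annual", "quarterly")


def describe_period(
    year: Optional[int] = None,
    period: Optional[int] = None,
    period_type: Optional[str] = None,
) -> str:
    """Hypothetical helper (not part of calcbench): say which period filter applies."""
    if period is not None and period not in PERIOD_LABELS:
        raise ValueError(f"period must be 0-4, got {period!r}")
    if period_type is not None and period_type not in PERIOD_TYPES:
        raise ValueError(f"period_type must be one of {PERIOD_TYPES}, got {period_type!r}")
    if year is not None or period is not None:
        # Assumption: explicit year/period data wins and period_type is ignored.
        label = PERIOD_LABELS.get(period, "unspecified")
        return f"year={year}, period={label}"
    if period_type is not None:
        return f"period_type={period_type}"
    return "no period filter"


print(describe_period(year=2020, period=0))
print(describe_period(year=2020, period=3, period_type="annual"))
print(describe_period(period_type="annual"))
```

Checking arguments this way before a long request can catch an invalid period code early instead of after the search has started.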

Usage:

>>> import calcbench
>>> data = calcbench.document_dataframe(company_identifiers=["msft", "goog"], all_history=True, disclosure_names=["Management's Discussion And Analysis", "Risk Factors"])
>>> data = data.fillna(False)
>>> # Count words per document; on pandas 2.1 and later, DataFrame.map replaces the deprecated applymap.
>>> word_counts = data.applymap(lambda document: document and len(document.get_contents_text().split()))

class calcbench.DocumentSearchResults