Numeric Data

Calcbench extracts all of the GAAP numbers in section 8 (face statements and footnotes) of the 10-K/Qs.

Standardized

Calcbench standardizes more than 1,000 metrics to handle differences in filers' tagging. The list of standardized points is @ https://www.calcbench.com/home/standardizedmetrics

calcbench.standardized_data(company_identifiers=[], metrics=[], start_year=None, start_period=None, end_year=None, end_period=None, entire_universe=False, point_in_time=False, year=None, period=None, all_history=False, period_type=None, trace_hyperlinks=False, use_fiscal_period=False, company_identifier_scheme=CompanyIdentifierScheme.Ticker, accession_id=None)

Standardized Data.

Metrics are standardized by economic concept and time period.

The data behind the multi-company page, https://www.calcbench.com/multi.

Example: https://github.com/calcbench/notebooks/blob/master/python_client_api_demo.ipynb

Parameters
  • company_identifiers (Sequence[Union[str, int]]) – Tickers/CIK codes, e.g. ['msft', 'goog', 'aapl', '0000066740']

  • metrics (Sequence[str]) – Standardized metrics. Full list @ https://www.calcbench.com/home/standardizedmetrics, e.g. ['revenue', 'accountsreceivable']

  • start_year (Optional[int]) – first year of data

  • start_period (Union[Period, Literal[0, 1, 2, 3, 4], None]) – first quarter to get, for annual data pass 0, for quarters pass 1, 2, 3, 4

  • end_year (Optional[int]) – last year of data

  • end_period (Union[Period, Literal[0, 1, 2, 3, 4], None]) – last quarter to get, for annual data pass 0, for quarters pass 1, 2, 3, 4

  • entire_universe (bool) – Get data for all companies. This can take a while; talk to Calcbench before you do this in production.

  • accession_id (Optional[int]) – Calcbench Accession ID

  • year (Optional[int]) – Get data for a single year; defaults to annual data.

  • period_type (Optional[PeriodType]) – Either “annual” or “quarterly”.

  • trace_hyperlinks (bool) – Values are URLs to the source documents

Return type

DataFrame

Returns

DataFrame with the periods as the index and columns indexed by metric and ticker.

Usage:

>>> d = calcbench.standardized_data(company_identifiers=['msft', 'goog'],
...                                 metrics=['revenue', 'assets'],
...                                 all_history=True,
...                                 period_type='annual')

>>> # Make it look like Compustat data
>>> d.stack(level=1)
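
The start_year/start_period and end_year/end_period parameters together bound a range of periods. The sketch below shows one way to assemble and sanity-check those arguments before making a call. The helper period_range_kwargs and the VALID_PERIODS constant are hypothetical names introduced here for illustration; they are not part of the Calcbench client. Only the keyword names come from the signature above.

```python
from typing import Any, Dict, Sequence, Tuple

# 0 means annual and 1-4 mean quarters, per the start_period/end_period docs.
VALID_PERIODS = (0, 1, 2, 3, 4)


def period_range_kwargs(
    company_identifiers: Sequence[str],
    metrics: Sequence[str],
    start: Tuple[int, int],
    end: Tuple[int, int],
) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for a bounded period range.

    start and end are (year, period) tuples.
    """
    for _year, period in (start, end):
        if period not in VALID_PERIODS:
            raise ValueError(f"period must be one of {VALID_PERIODS}, got {period}")
    # Tuples compare element by element, so this orders by year and then period.
    if start > end:
        raise ValueError("start must not be later than end")
    return {
        "company_identifiers": list(company_identifiers),
        "metrics": list(metrics),
        "start_year": start[0],
        "start_period": start[1],
        "end_year": end[0],
        "end_period": end[1],
    }


kwargs = period_range_kwargs(["msft", "goog"], ["revenue"], (2019, 1), (2020, 4))
# With the client installed and credentials configured (not run here):
# data = calcbench.standardized_data(**kwargs)
```

Validating locally keeps an out-of-range period from reaching the server; whether the server itself rejects such values is not stated in this documentation.
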

calcbench.standardized_raw(company_identifiers=[], metrics=[], start_year=None, start_period=None, end_year=None, end_period=None, entire_universe=False, accession_id=None, point_in_time=False, include_trace=False, update_date=None, all_history=False, year=None, period=None, period_type=None, include_preliminary=False, use_fiscal_period=False, all_face=False, all_footnotes=False, include_xbrl=False)

Standardized data.

Get normalized data from Calcbench. Each point is normalized by economic concept and time period.

Parameters
  • company_identifiers (Sequence[Union[str, int]]) – a sequence of tickers (or CIK codes), e.g. ['msft', 'goog', 'aapl']

  • metrics (Sequence[str]) – a sequence of metrics, see the full list @ https://www.calcbench.com/home/standardizedmetrics, e.g. ['revenue', 'accountsreceivable']

  • start_year (Optional[int]) – first year of data

  • start_period (Union[Period, Literal[0, 1, 2, 3, 4], None]) – first quarter to get, for annual data pass 0, for quarters pass 1, 2, 3, 4

  • end_year (Optional[int]) – last year of data

  • end_period (Union[Period, Literal[0, 1, 2, 3, 4], None]) – last quarter to get, for annual data pass 0, for quarters pass 1, 2, 3, 4

  • entire_universe (bool) – Get data for all companies. This can take a while; talk to Calcbench before you do this in production.

  • accession_id (Optional[int]) – Calcbench Accession ID

  • include_trace (bool) – Include the facts used to calculate the normalized value.

  • year (Optional[int]) – Get data for a single year; defaults to annual data.

  • period_type (Optional[PeriodType]) – Either “annual” or “quarterly”

  • include_preliminary (bool) – Include data from non-XBRL 8-Ks and press releases.

Return type

Sequence[StandardizedPoint]

Returns

A list of dictionaries with keys ['ticker', 'calendar_year', 'calendar_period', 'metric', 'value'].
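
Because each returned point is a flat dictionary, a common first step is to regroup the points so that all metrics for one company and period sit together. The following is a minimal sketch of that regrouping using only the keys listed above. The function name pivot_points and the sample values are made up for illustration and are not real Calcbench output.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Tuple

PeriodKey = Tuple[str, int, int]  # (ticker, calendar_year, calendar_period)


def pivot_points(points: Iterable[Dict[str, Any]]) -> Dict[PeriodKey, Dict[str, Any]]:
    """Group flat points into {(ticker, year, period): {metric: value}}."""
    table: Dict[PeriodKey, Dict[str, Any]] = defaultdict(dict)
    for point in points:
        key = (point["ticker"], point["calendar_year"], point["calendar_period"])
        table[key][point["metric"]] = point["value"]
    return dict(table)


# Made-up sample shaped like the documented return value.
sample_points = [
    {"ticker": "AAA", "calendar_year": 2020, "calendar_period": 0, "metric": "revenue", "value": 100.0},
    {"ticker": "AAA", "calendar_year": 2020, "calendar_period": 0, "metric": "assets", "value": 250.0},
    {"ticker": "BBB", "calendar_year": 2020, "calendar_period": 0, "metric": "revenue", "value": 40.0},
]

pivoted = pivot_points(sample_points)
```

In real use, sample_points would be replaced by the sequence returned from calcbench.standardized_raw.
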

Point-In-Time

Our standardized data with timestamps. Useful for backtesting quantitative strategies.

calcbench.point_in_time(company_identifiers=[], all_footnotes=False, metrics=[], all_history=False, entire_universe=False, start_year=None, start_period=None, end_year=None, end_period=None, period_type=None, use_fiscal_period=False, include_preliminary=False, all_face=False, include_xbrl=True, accession_id=None, include_trace=False)

Point-in-Time Data

Standardized data with the timestamp at which Calcbench published it.

Parameters
  • update_date – The date on which the data was received. This does not work prior to ~2016; use all_history to get historical data, then use update_date to get updates.

  • accession_id (Optional[int]) – Unique identifier for the filing for which to receive data. Pass this to receive data for one filing. Same as filing_id in filings objects.

  • all_face (bool) – Retrieve all of the points from the face financials: income statement, balance sheet, and statement of cash flows.

  • all_footnotes (bool) – Retrieve all of the points from the footnotes to the financials.

  • include_preliminary (bool) – Include facts from non-XBRL earnings press-releases and 8-Ks.

  • include_xbrl (bool) – Include facts from XBRL 10-K/Qs.

  • include_trace (bool) – Include a URL that points to the source document.

Return type

DataFrame

Returns

DataFrame of facts

Columns:

value

The value of the fact

revision_number

0 indicates an original, unrevised value for this fact. 1, 2, 3… indicate subsequent revisions to the fact value. https://knowledge.calcbench.com/hc/en-us/search?utf8=%E2%9C%93&query=revisions&commit=Search

period_start

First day of the fiscal period for this fact

period_end

Last day of the fiscal period for this fact

date_reported

Timestamp when Calcbench published this fact

metric

The metric name, see the definitions @ https://www.calcbench.com/home/standardizedmetrics

calendar_year

The calendar year for this fact. https://knowledge.calcbench.com/hc/en-us/articles/223267767-What-are-Calendar-Years-and-Periods-What-is-TTM-

calendar_period

The calendar period for this fact

fiscal_year

Company reported fiscal year for this fact

fiscal_period

Company reported fiscal period for this fact

ticker

Ticker of reporting company

CIK

SEC assigned Central Index Key for reporting company

calcbench_entity_id

Internal Calcbench identifier for reporting company

filing_type

The document type this fact came from: 10-K, 10-Q, S-1, etc.

Usage:

>>> calcbench.point_in_time(company_identifiers=["msft", "goog"],
...                         all_history=True,
...                         all_face=True,
...                         all_footnotes=True)
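
For backtesting, the usual question is which value was known on a given date. A rough sketch of that selection, using the date_reported and revision_number columns described above, is shown here. It works on plain dictionaries rather than the returned DataFrame so that it runs without the client; the function name latest_known and the sample rows are invented for illustration. It assumes that a higher revision_number is always the more recent revision, which follows from the column description but is not otherwise verified.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, Tuple

FactKey = Tuple[str, str, int, int]  # (ticker, metric, fiscal_year, fiscal_period)


def latest_known(rows: Iterable[Dict[str, Any]], as_of: datetime) -> Dict[FactKey, Dict[str, Any]]:
    """Return, per fact, the highest revision published on or before as_of."""
    best: Dict[FactKey, Dict[str, Any]] = {}
    for row in rows:
        if row["date_reported"] > as_of:
            continue  # not yet published on the as-of date
        key = (row["ticker"], row["metric"], row["fiscal_year"], row["fiscal_period"])
        current = best.get(key)
        if current is None or row["revision_number"] > current["revision_number"]:
            best[key] = row
    return best


# Made-up rows for illustration only; not real Calcbench output.
sample_rows = [
    {"ticker": "AAA", "metric": "revenue", "fiscal_year": 2019, "fiscal_period": 0,
     "revision_number": 0, "date_reported": datetime(2020, 2, 1), "value": 100.0},
    {"ticker": "AAA", "metric": "revenue", "fiscal_year": 2019, "fiscal_period": 0,
     "revision_number": 1, "date_reported": datetime(2020, 8, 1), "value": 95.0},
]

before_revision = latest_known(sample_rows, datetime(2020, 3, 1))
after_revision = latest_known(sample_rows, datetime(2020, 9, 1))
```

With a real result, one possible way to obtain such rows is the pandas DataFrame method to_dict("records"), assuming date_reported comes back as a column as listed above.
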

Dimensional

Segments: geographic and operating, and other dimensionalized tabular data.

calcbench.dimensional_raw(company_identifiers=[], metrics=[], start_year=None, start_period=None, end_year=None, end_period=None, period_type=PeriodType.Annual)

Segments and Breakouts

The data behind the breakouts/segment page, https://www.calcbench.com/breakout.

Parameters
  • company_identifiers (sequence) – Tickers/CIK codes, e.g. ['msft', 'goog', 'aapl', '0000066740']

  • metrics (sequence) – list of dimension tuple strings. Get the list @ https://www.calcbench.com/api/availableBreakouts and pass in the "databaseName".

  • start_year (int) – first year of data to get

  • start_period (int) – first period of data to get. 0 for annual data, 1, 2, 3, 4 for quarterly data.

  • end_year (int) – last year of data to get

  • end_period (int) – last period of data to get. 0 for annual data, 1, 2, 3, 4 for quarterly data.

  • period_type (str) – 'quarterly' or 'annual'; applies only when no other period arguments are supplied.

Returns

A list of points. The points correspond to the lines @ https://www.calcbench.com/breakout. For each requested metric there will be the formatted value and the unformatted value, denoted by an _effvalue suffix. The label is the dimension label associated with the values.

Return type

sequence

Usage:

>>> calcbench.dimensional_raw(company_identifiers=['fdx'], metrics=['OperatingSegmentRevenue'], start_year=2018)
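
The description of the returned points suggests that numeric work should use the unformatted value carried under the _effvalue suffix. The sketch below collects those values per dimension label. The exact key names are assumptions inferred from the description above ("label" for the dimension label and the metric name plus "_effvalue" for the unformatted value); check them against an actual response before relying on them. The function name values_by_label and the sample points are made up.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def values_by_label(points: Iterable[Dict[str, Any]], metric: str) -> Dict[str, List[float]]:
    """Collect the unformatted values of one metric under each dimension label."""
    value_key = f"{metric}_effvalue"  # assumed key name, see the note above
    grouped: Dict[str, List[float]] = defaultdict(list)
    for point in points:
        value = point.get(value_key)
        if value is not None:
            grouped[point["label"]].append(value)
    return dict(grouped)


# Made-up points for illustration only; not real Calcbench output.
sample_points = [
    {"label": "Segment A", "OperatingSegmentRevenue": "1,000", "OperatingSegmentRevenue_effvalue": 1000.0},
    {"label": "Segment B", "OperatingSegmentRevenue": "250", "OperatingSegmentRevenue_effvalue": 250.0},
    {"label": "Segment A", "OperatingSegmentRevenue": "1,100", "OperatingSegmentRevenue_effvalue": 1100.0},
]

segments = values_by_label(sample_points, "OperatingSegmentRevenue")
```
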

class calcbench.api_client.Period(value)

An enumeration.

Annual = 0
Q1 = 1
Q2 = 2
Q3 = 3
Q4 = 4

class calcbench.api_client.PeriodType(value)

An enumeration.

Annual = 'annual'
Combined = 'combined'
Quarterly = 'quarterly'
TrailingTwelveMonths = 'TTM'

class calcbench.api_client.CompanyIdentifierScheme(value)

An enumeration.

CentralIndexKey = 'CIK'
Ticker = 'ticker'