Files
2025-12-09 12:13:01 +01:00

12 KiB

Statement Class Documentation

Overview

The Statement class represents a single financial statement extracted from XBRL data. It provides methods for viewing, manipulating, and analyzing financial statement data including income statements, balance sheets, cash flow statements, and disclosure notes.

A Statement object contains:

  • Line items with values across multiple periods
  • Hierarchy showing the structure and relationships
  • Metadata including concept names and labels
  • Period information for time-series analysis

Getting a Statement

From XBRL

# Get XBRL data first
xbrl = filing.xbrl()

# Access specific statements
income = xbrl.statements.income_statement()
balance = xbrl.statements.balance_sheet()
cashflow = xbrl.statements.cash_flow_statement()
equity = xbrl.statements.statement_of_equity()

# By name
cover_page = xbrl.statements['CoverPage']

# By index
first_statement = xbrl.statements[0]

Viewing Statements

Rich Display

# Print statement to see formatted table
print(income)

# Shows:
# - Statement title
# - Line items with hierarchical structure
# - Values for multiple periods
# - Proper number formatting

Text Representation

# Get plain text version
text = str(income)

# Or explicitly
text_output = income.text()

Converting to DataFrame

Basic Conversion

# Convert statement to pandas DataFrame
df = income.to_dataframe()

# DataFrame structure:
# - Index: Line item labels or concepts
# - Columns: Period dates
# - Values: Financial amounts

With Period Filter

# Filter to specific periods
df = income.to_dataframe(period_filter='2024')

# Only includes periods matching the filter

Accessing Specific Data

# Convert to DataFrame for easy analysis
df = income.to_dataframe()

# Access specific line items
revenue = df.loc['Revenue']
net_income = df.loc['Net Income']

# Access specific periods
current_period = df.iloc[:, 0]  # First column (most recent)
prior_period = df.iloc[:, 1]    # Second column

# Specific cell
current_revenue = df.loc['Revenue', df.columns[0]]

Statement Properties

Available Periods

# Get list of periods in the statement
periods = statement.periods

# Each period is a date string (YYYY-MM-DD)
for period in periods:
    print(f"Data available for: {period}")

Statement Name and Type

# Get statement information
name = statement.name           # Statement display name
concept = statement.concept     # XBRL concept identifier

Raw Data Access

# Get underlying statement data structure
raw_data = statement.get_raw_data()

# Returns list of dictionaries with:
# - concept: XBRL concept name
# - label: Display label
# - values: Dict of period -> value
# - level: Hierarchy depth
# - all_names: All concept variations

Rendering and Display

Custom Rendering

# Render with specific options
rendered = statement.render()

# Rendered statement has rich formatting
print(rendered)

Text Export

# Get markdown-formatted text
markdown_text = statement.text()

# Suitable for:
# - AI/LLM consumption
# - Documentation
# - Text-based analysis

Working with Statement Data

Calculate Growth Rates

# Convert to DataFrame
df = income.to_dataframe()

# Calculate period-over-period growth
if len(df.columns) >= 2:
    current = df.iloc[:, 0]
    prior = df.iloc[:, 1]

    # Growth rate
    growth = ((current - prior) / prior * 100).round(2)

    # Create comparison DataFrame
    comparison = pd.DataFrame({
        'Current': current,
        'Prior': prior,
        'Growth %': growth
    })

    print(comparison)

Extract Specific Metrics

# Get income statement metrics
df = income.to_dataframe()

# Extract key metrics from most recent period
current = df.iloc[:, 0]

metrics = {
    'Revenue': current.get('Revenue', 0),
    'Operating Income': current.get('Operating Income', 0),
    'Net Income': current.get('Net Income', 0),
}

# Calculate derived metrics
if metrics['Revenue'] > 0:
    metrics['Operating Margin'] = (
        metrics['Operating Income'] / metrics['Revenue'] * 100
    )
    metrics['Net Margin'] = (
        metrics['Net Income'] / metrics['Revenue'] * 100
    )

Filter Line Items

# Convert to DataFrame
df = balance.to_dataframe()

# Filter for specific items
asset_items = df[df.index.str.contains('Asset', case=False)]
liability_items = df[df.index.str.contains('Liabilit', case=False)]

# Get subtotals
if 'Current Assets' in df.index:
    current_assets = df.loc['Current Assets']

Time Series Analysis

# Get multiple periods
df = income.to_dataframe()

# Plot revenue trend
if 'Revenue' in df.index:
    revenue_series = df.loc['Revenue']

    # Convert to numeric and plot
    import matplotlib.pyplot as plt
    revenue_series.plot(kind='line', title='Revenue Trend')
    plt.show()

Common Workflows

Compare Current vs Prior Period

# Get income statement
income = xbrl.statements.income_statement()
df = income.to_dataframe()

# Ensure we have at least 2 periods
if len(df.columns) >= 2:
    # Create comparison
    comparison = pd.DataFrame({
        'Current': df.iloc[:, 0],
        'Prior': df.iloc[:, 1],
        'Change': df.iloc[:, 0] - df.iloc[:, 1],
        'Change %': ((df.iloc[:, 0] - df.iloc[:, 1]) / df.iloc[:, 1] * 100).round(2)
    })

    # Show key metrics
    key_items = ['Revenue', 'Operating Income', 'Net Income']
    for item in key_items:
        if item in comparison.index:
            print(f"\n{item}:")
            print(comparison.loc[item])

Extract All Periods to CSV

# Get statement
statement = xbrl.statements.income_statement()

# Convert and save
df = statement.to_dataframe()
df.to_csv('income_statement.csv')

print(f"Exported {len(df)} line items across {len(df.columns)} periods")

Build Financial Ratios

# Get both income statement and balance sheet
income = xbrl.statements.income_statement()
balance = xbrl.statements.balance_sheet()

# Convert to DataFrames
income_df = income.to_dataframe()
balance_df = balance.to_dataframe()

# Extract values (most recent period)
revenue = income_df.loc['Revenue', income_df.columns[0]]
net_income = income_df.loc['Net Income', income_df.columns[0]]
total_assets = balance_df.loc['Assets', balance_df.columns[0]]
total_equity = balance_df.loc['Equity', balance_df.columns[0]]

# Calculate ratios
ratios = {
    'Net Profit Margin': (net_income / revenue * 100).round(2),
    'ROA': (net_income / total_assets * 100).round(2),
    'ROE': (net_income / total_equity * 100).round(2),
    'Asset Turnover': (revenue / total_assets).round(2),
}

print("Financial Ratios:")
for ratio, value in ratios.items():
    print(f"  {ratio}: {value}")

Search for Specific Items

# Get statement as DataFrame
df = income.to_dataframe()

# Search for items containing keywords
research_costs = df[df.index.str.contains('Research', case=False)]
tax_items = df[df.index.str.contains('Tax', case=False)]

# Or get raw data with concept names
raw = income.get_raw_data()
research_concepts = [
    item for item in raw
    if 'research' in item['label'].lower()
]

Aggregate Subcategories

# Get statement
df = balance.to_dataframe()

# Define categories (adjust based on actual labels)
current_asset_categories = [
    'Cash and Cash Equivalents',
    'Accounts Receivable',
    'Inventory',
    'Other Current Assets'
]

# Sum categories
current_assets_sum = sum([
    df.loc[cat, df.columns[0]]
    for cat in current_asset_categories
    if cat in df.index
])

# Verify against reported total
if 'Current Assets' in df.index:
    reported_total = df.loc['Current Assets', df.columns[0]]
    print(f"Calculated: {current_assets_sum}")
    print(f"Reported: {reported_total}")
    print(f"Difference: {current_assets_sum - reported_total}")

Integration with Analysis Tools

With Pandas

# Statement integrates seamlessly with pandas
df = statement.to_dataframe()

# Use all pandas functionality
summary = df.describe()
correlations = df.T.corr()
rolling_avg = df.T.rolling(window=4).mean()

With NumPy

import numpy as np

# Convert to numpy array for numerical operations
df = statement.to_dataframe()
values = df.values

# Numerical analysis
mean_values = np.mean(values, axis=1)
std_values = np.std(values, axis=1)
growth_rates = np.diff(values, axis=1) / values[:, :-1]

Export for Visualization

# Prepare data for plotting
df = income.to_dataframe()

# Select key items
plot_items = ['Revenue', 'Operating Income', 'Net Income']
plot_data = df.loc[plot_items].T

# Plot with matplotlib
import matplotlib.pyplot as plt
plot_data.plot(kind='bar', figsize=(12, 6))
plt.title('Income Statement Trends')
plt.xlabel('Period')
plt.ylabel('Amount (USD)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

Error Handling

Missing Line Items

# Check if item exists before accessing
df = statement.to_dataframe()

if 'Revenue' in df.index:
    revenue = df.loc['Revenue']
else:
    print("Revenue not found in statement")
    # Try alternative names
    for alt in ['Revenues', 'Total Revenue', 'Net Revenue']:
        if alt in df.index:
            revenue = df.loc[alt]
            break

Handling Different Formats

# Companies may use different labels
def find_item(df, possible_names):
    """Find item by trying multiple possible names."""
    for name in possible_names:
        if name in df.index:
            return df.loc[name]
    return None

# Usage
revenue_names = ['Revenue', 'Revenues', 'Total Revenue', 'Net Sales']
revenue = find_item(df, revenue_names)

if revenue is not None:
    print(f"Found revenue: {revenue}")
else:
    print("Revenue not found under common names")

Incomplete Period Data

# Check data availability
df = statement.to_dataframe()

# Check for null values
missing_data = df.isnull().sum()
if missing_data.any():
    print("Periods with missing data:")
    print(missing_data[missing_data > 0])

# Fill missing with 0 or forward fill
df_filled = df.fillna(0)  # Replace NaN with 0
# or
df_filled = df.fillna(method='ffill')  # Forward fill

Best Practices

  1. Always convert to DataFrame for analysis:

    df = statement.to_dataframe()  # Easier to work with
    
  2. Check item names before accessing:

    if 'Revenue' in df.index:
        revenue = df.loc['Revenue']
    
  3. Handle multiple naming conventions:

    # Try variations
    for name in ['Revenue', 'Revenues', 'Total Revenue']:
        if name in df.index:
            revenue = df.loc[name]
            break
    
  4. Validate calculated values:

    # Check against reported totals
    calculated = sum(components)
    reported = df.loc['Total']
    assert abs(calculated - reported) < 0.01, "Mismatch!"
    
  5. Use period filters appropriately:

    # Filter to specific years
    df_2024 = statement.to_dataframe(period_filter='2024')
    

Performance Tips

Caching DataFrames

# Cache the DataFrame if using repeatedly
df_cache = statement.to_dataframe()

# Reuse cached version
revenue = df_cache.loc['Revenue']
net_income = df_cache.loc['Net Income']
# ... more operations

Selective Period Loading

# If you only need recent data
current_only = xbrl.current_period.income_statement()
df = current_only.to_dataframe()  # Smaller, faster

Troubleshooting

"KeyError: Line item not found"

Cause: Item label doesn't match exactly

Solution:

# List all available items
print(df.index.tolist())

# Or search for pattern
matching = df[df.index.str.contains('keyword', case=False)]

"Empty DataFrame"

Cause: Statement has no data or wrong period filter

Solution:

# Check raw data
raw = statement.get_raw_data()
print(f"Statement has {len(raw)} items")

# Check periods
print(f"Available periods: {statement.periods}")

"Index error when accessing columns"

Cause: Fewer periods than expected

Solution:

# Check column count first
if len(df.columns) >= 2:
    current = df.iloc[:, 0]
    prior = df.iloc[:, 1]
else:
    print("Insufficient periods for comparison")

This guide covers the essential patterns for working with Statement objects in edgartools. For information on accessing statements from XBRL, see the XBRL documentation.