# Statement Class Documentation ## Overview The `Statement` class represents a single financial statement extracted from XBRL data. It provides methods for viewing, manipulating, and analyzing financial statement data including income statements, balance sheets, cash flow statements, and disclosure notes. A Statement object contains: - **Line items** with values across multiple periods - **Hierarchy** showing the structure and relationships - **Metadata** including concept names and labels - **Period information** for time-series analysis ## Getting a Statement ### From XBRL ```python # Get XBRL data first xbrl = filing.xbrl() # Access specific statements income = xbrl.statements.income_statement() balance = xbrl.statements.balance_sheet() cashflow = xbrl.statements.cash_flow_statement() equity = xbrl.statements.statement_of_equity() # By name cover_page = xbrl.statements['CoverPage'] # By index first_statement = xbrl.statements[0] ``` ## Viewing Statements ### Rich Display ```python # Print statement to see formatted table print(income) # Shows: # - Statement title # - Line items with hierarchical structure # - Values for multiple periods # - Proper number formatting ``` ### Text Representation ```python # Get plain text version text = str(income) # Or explicitly text_output = income.text() ``` ## Converting to DataFrame ### Basic Conversion ```python # Convert statement to pandas DataFrame df = income.to_dataframe() # DataFrame structure: # - Index: Line item labels or concepts # - Columns: Period dates # - Values: Financial amounts ``` ### With Period Filter ```python # Filter to specific periods df = income.to_dataframe(period_filter='2024') # Only includes periods matching the filter ``` ### Accessing Specific Data ```python # Convert to DataFrame for easy analysis df = income.to_dataframe() # Access specific line items revenue = df.loc['Revenue'] net_income = df.loc['Net Income'] # Access specific periods current_period = df.iloc[:, 0] # First column (most recent) prior_period = df.iloc[:, 1] # Second column # Specific cell current_revenue = df.loc['Revenue', df.columns[0]] ``` ## Statement Properties ### Available Periods ```python # Get list of periods in the statement periods = statement.periods # Each period is a date string (YYYY-MM-DD) for period in periods: print(f"Data available for: {period}") ``` ### Statement Name and Type ```python # Get statement information name = statement.name # Statement display name concept = statement.concept # XBRL concept identifier ``` ### Raw Data Access ```python # Get underlying statement data structure raw_data = statement.get_raw_data() # Returns list of dictionaries with: # - concept: XBRL concept name # - label: Display label # - values: Dict of period -> value # - level: Hierarchy depth # - all_names: All concept variations ``` ## Rendering and Display ### Custom Rendering ```python # Render with specific options rendered = statement.render() # Rendered statement has rich formatting print(rendered) ``` ### Text Export ```python # Get markdown-formatted text markdown_text = statement.text() # Suitable for: # - AI/LLM consumption # - Documentation # - Text-based analysis ``` ## Working with Statement Data ### Calculate Growth Rates ```python # Convert to DataFrame df = income.to_dataframe() # Calculate period-over-period growth if len(df.columns) >= 2: current = df.iloc[:, 0] prior = df.iloc[:, 1] # Growth rate growth = ((current - prior) / prior * 100).round(2) # Create comparison DataFrame comparison = pd.DataFrame({ 'Current': current, 'Prior': prior, 'Growth %': growth }) print(comparison) ``` ### Extract Specific Metrics ```python # Get income statement metrics df = income.to_dataframe() # Extract key metrics from most recent period current = df.iloc[:, 0] metrics = { 'Revenue': current.get('Revenue', 0), 'Operating Income': current.get('Operating Income', 0), 'Net Income': current.get('Net Income', 0), } # Calculate derived metrics if metrics['Revenue'] > 0: metrics['Operating Margin'] = ( metrics['Operating Income'] / metrics['Revenue'] * 100 ) metrics['Net Margin'] = ( metrics['Net Income'] / metrics['Revenue'] * 100 ) ``` ### Filter Line Items ```python # Convert to DataFrame df = balance.to_dataframe() # Filter for specific items asset_items = df[df.index.str.contains('Asset', case=False)] liability_items = df[df.index.str.contains('Liabilit', case=False)] # Get subtotals if 'Current Assets' in df.index: current_assets = df.loc['Current Assets'] ``` ### Time Series Analysis ```python # Get multiple periods df = income.to_dataframe() # Plot revenue trend if 'Revenue' in df.index: revenue_series = df.loc['Revenue'] # Convert to numeric and plot import matplotlib.pyplot as plt revenue_series.plot(kind='line', title='Revenue Trend') plt.show() ``` ## Common Workflows ### Compare Current vs Prior Period ```python # Get income statement income = xbrl.statements.income_statement() df = income.to_dataframe() # Ensure we have at least 2 periods if len(df.columns) >= 2: # Create comparison comparison = pd.DataFrame({ 'Current': df.iloc[:, 0], 'Prior': df.iloc[:, 1], 'Change': df.iloc[:, 0] - df.iloc[:, 1], 'Change %': ((df.iloc[:, 0] - df.iloc[:, 1]) / df.iloc[:, 1] * 100).round(2) }) # Show key metrics key_items = ['Revenue', 'Operating Income', 'Net Income'] for item in key_items: if item in comparison.index: print(f"\n{item}:") print(comparison.loc[item]) ``` ### Extract All Periods to CSV ```python # Get statement statement = xbrl.statements.income_statement() # Convert and save df = statement.to_dataframe() df.to_csv('income_statement.csv') print(f"Exported {len(df)} line items across {len(df.columns)} periods") ``` ### Build Financial Ratios ```python # Get both income statement and balance sheet income = xbrl.statements.income_statement() balance = xbrl.statements.balance_sheet() # Convert to DataFrames income_df = income.to_dataframe() balance_df = balance.to_dataframe() # Extract values (most recent period) revenue = income_df.loc['Revenue', income_df.columns[0]] net_income = income_df.loc['Net Income', income_df.columns[0]] total_assets = balance_df.loc['Assets', balance_df.columns[0]] total_equity = balance_df.loc['Equity', balance_df.columns[0]] # Calculate ratios ratios = { 'Net Profit Margin': (net_income / revenue * 100).round(2), 'ROA': (net_income / total_assets * 100).round(2), 'ROE': (net_income / total_equity * 100).round(2), 'Asset Turnover': (revenue / total_assets).round(2), } print("Financial Ratios:") for ratio, value in ratios.items(): print(f" {ratio}: {value}") ``` ### Search for Specific Items ```python # Get statement as DataFrame df = income.to_dataframe() # Search for items containing keywords research_costs = df[df.index.str.contains('Research', case=False)] tax_items = df[df.index.str.contains('Tax', case=False)] # Or get raw data with concept names raw = income.get_raw_data() research_concepts = [ item for item in raw if 'research' in item['label'].lower() ] ``` ### Aggregate Subcategories ```python # Get statement df = balance.to_dataframe() # Define categories (adjust based on actual labels) current_asset_categories = [ 'Cash and Cash Equivalents', 'Accounts Receivable', 'Inventory', 'Other Current Assets' ] # Sum categories current_assets_sum = sum([ df.loc[cat, df.columns[0]] for cat in current_asset_categories if cat in df.index ]) # Verify against reported total if 'Current Assets' in df.index: reported_total = df.loc['Current Assets', df.columns[0]] print(f"Calculated: {current_assets_sum}") print(f"Reported: {reported_total}") print(f"Difference: {current_assets_sum - reported_total}") ``` ## Integration with Analysis Tools ### With Pandas ```python # Statement integrates seamlessly with pandas df = statement.to_dataframe() # Use all pandas functionality summary = df.describe() correlations = df.T.corr() rolling_avg = df.T.rolling(window=4).mean() ``` ### With NumPy ```python import numpy as np # Convert to numpy array for numerical operations df = statement.to_dataframe() values = df.values # Numerical analysis mean_values = np.mean(values, axis=1) std_values = np.std(values, axis=1) growth_rates = np.diff(values, axis=1) / values[:, :-1] ``` ### Export for Visualization ```python # Prepare data for plotting df = income.to_dataframe() # Select key items plot_items = ['Revenue', 'Operating Income', 'Net Income'] plot_data = df.loc[plot_items].T # Plot with matplotlib import matplotlib.pyplot as plt plot_data.plot(kind='bar', figsize=(12, 6)) plt.title('Income Statement Trends') plt.xlabel('Period') plt.ylabel('Amount (USD)') plt.xticks(rotation=45) plt.tight_layout() plt.show() ``` ## Error Handling ### Missing Line Items ```python # Check if item exists before accessing df = statement.to_dataframe() if 'Revenue' in df.index: revenue = df.loc['Revenue'] else: print("Revenue not found in statement") # Try alternative names for alt in ['Revenues', 'Total Revenue', 'Net Revenue']: if alt in df.index: revenue = df.loc[alt] break ``` ### Handling Different Formats ```python # Companies may use different labels def find_item(df, possible_names): """Find item by trying multiple possible names.""" for name in possible_names: if name in df.index: return df.loc[name] return None # Usage revenue_names = ['Revenue', 'Revenues', 'Total Revenue', 'Net Sales'] revenue = find_item(df, revenue_names) if revenue is not None: print(f"Found revenue: {revenue}") else: print("Revenue not found under common names") ``` ### Incomplete Period Data ```python # Check data availability df = statement.to_dataframe() # Check for null values missing_data = df.isnull().sum() if missing_data.any(): print("Periods with missing data:") print(missing_data[missing_data > 0]) # Fill missing with 0 or forward fill df_filled = df.fillna(0) # Replace NaN with 0 # or df_filled = df.fillna(method='ffill') # Forward fill ``` ## Best Practices 1. **Always convert to DataFrame for analysis**: ```python df = statement.to_dataframe() # Easier to work with ``` 2. **Check item names before accessing**: ```python if 'Revenue' in df.index: revenue = df.loc['Revenue'] ``` 3. **Handle multiple naming conventions**: ```python # Try variations for name in ['Revenue', 'Revenues', 'Total Revenue']: if name in df.index: revenue = df.loc[name] break ``` 4. **Validate calculated values**: ```python # Check against reported totals calculated = sum(components) reported = df.loc['Total'] assert abs(calculated - reported) < 0.01, "Mismatch!" ``` 5. **Use period filters appropriately**: ```python # Filter to specific years df_2024 = statement.to_dataframe(period_filter='2024') ``` ## Performance Tips ### Caching DataFrames ```python # Cache the DataFrame if using repeatedly df_cache = statement.to_dataframe() # Reuse cached version revenue = df_cache.loc['Revenue'] net_income = df_cache.loc['Net Income'] # ... more operations ``` ### Selective Period Loading ```python # If you only need recent data current_only = xbrl.current_period.income_statement() df = current_only.to_dataframe() # Smaller, faster ``` ## Troubleshooting ### "KeyError: Line item not found" **Cause**: Item label doesn't match exactly **Solution**: ```python # List all available items print(df.index.tolist()) # Or search for pattern matching = df[df.index.str.contains('keyword', case=False)] ``` ### "Empty DataFrame" **Cause**: Statement has no data or wrong period filter **Solution**: ```python # Check raw data raw = statement.get_raw_data() print(f"Statement has {len(raw)} items") # Check periods print(f"Available periods: {statement.periods}") ``` ### "Index error when accessing columns" **Cause**: Fewer periods than expected **Solution**: ```python # Check column count first if len(df.columns) >= 2: current = df.iloc[:, 0] prior = df.iloc[:, 1] else: print("Insufficient periods for comparison") ``` This guide covers the essential patterns for working with Statement objects in edgartools. For information on accessing statements from XBRL, see the XBRL documentation.