Initial commit
venv/lib/python3.10/site-packages/edgar/xbrl/docs/Statement.md
# Statement Class Documentation

## Overview

The `Statement` class represents a single financial statement extracted from XBRL data. It provides methods for viewing, manipulating, and analyzing financial statement data including income statements, balance sheets, cash flow statements, and disclosure notes.

A Statement object contains:
- **Line items** with values across multiple periods
- **Hierarchy** showing the structure and relationships
- **Metadata** including concept names and labels
- **Period information** for time-series analysis

## Getting a Statement

### From XBRL

```python
# Get XBRL data first
xbrl = filing.xbrl()

# Access specific statements
income = xbrl.statements.income_statement()
balance = xbrl.statements.balance_sheet()
cashflow = xbrl.statements.cash_flow_statement()
equity = xbrl.statements.statement_of_equity()

# By name
cover_page = xbrl.statements['CoverPage']

# By index
first_statement = xbrl.statements[0]
```

## Viewing Statements

### Rich Display

```python
# Print statement to see formatted table
print(income)

# Shows:
# - Statement title
# - Line items with hierarchical structure
# - Values for multiple periods
# - Proper number formatting
```

### Text Representation

```python
# Get plain text version
text = str(income)

# Or explicitly
text_output = income.text()
```

## Converting to DataFrame

### Basic Conversion

```python
# Convert statement to pandas DataFrame
df = income.to_dataframe()

# DataFrame structure:
# - Index: Line item labels or concepts
# - Columns: Period dates
# - Values: Financial amounts
```

### With Period Filter

```python
# Filter to specific periods
df = income.to_dataframe(period_filter='2024')

# Only includes periods matching the filter
```

### Accessing Specific Data

```python
# Convert to DataFrame for easy analysis
df = income.to_dataframe()

# Access specific line items
revenue = df.loc['Revenue']
net_income = df.loc['Net Income']

# Access specific periods
current_period = df.iloc[:, 0]  # First column (most recent)
prior_period = df.iloc[:, 1]    # Second column

# Specific cell
current_revenue = df.loc['Revenue', df.columns[0]]
```

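The positional lookups in "Accessing Specific Data" rely on the first column being the most recent period. Where that ordering is uncertain, the period labels can be sorted explicitly. A minimal sketch, assuming the labels are `YYYY-MM-DD` strings; the sample labels and the `newest_first` helper are made up for illustration and are not part of the library:

```python
from datetime import date

# Hypothetical period labels, deliberately out of order
period_labels = ["2022-12-31", "2024-12-31", "2023-12-31"]

def newest_first(labels):
    """Return period labels sorted from most recent to oldest."""
    # Parsing to date objects makes the ordering explicit and rejects malformed labels
    return sorted(labels, key=date.fromisoformat, reverse=True)

ordered = newest_first(period_labels)
current_label, prior_label = ordered[0], ordered[1]
print(current_label, prior_label)
```

With a DataFrame, `df[newest_first(list(df.columns))]` would reorder the columns the same way, after which `df.iloc[:, 0]` is the latest period by construction.
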
## Statement Properties

### Available Periods

```python
# Get list of periods in the statement
periods = statement.periods

# Each period is a date string (YYYY-MM-DD)
for period in periods:
    print(f"Data available for: {period}")
```

### Statement Name and Type

```python
# Get statement information
name = statement.name        # Statement display name
concept = statement.concept  # XBRL concept identifier
```

### Raw Data Access

```python
# Get underlying statement data structure
raw_data = statement.get_raw_data()

# Returns list of dictionaries with:
# - concept: XBRL concept name
# - label: Display label
# - values: Dict of period -> value
# - level: Hierarchy depth
# - all_names: All concept variations
```

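The raw structure described above is a plain list of dictionaries, so it can be walked without pandas. The sketch below prints an indented outline for one period. The sample records, including the concept names and values, are hypothetical and only mimic the documented keys (`concept`, `label`, `values`, `level`); `format_outline` is an illustrative helper, not a library function:

```python
# Hypothetical sample shaped like the documented raw data (values are made up)
raw_data = [
    {"concept": "us-gaap_Revenues", "label": "Revenue", "level": 0,
     "values": {"2024-12-31": 1000.0, "2023-12-31": 900.0}},
    {"concept": "us-gaap_CostOfRevenue", "label": "Cost of Revenue", "level": 1,
     "values": {"2024-12-31": 600.0}},
]

def format_outline(items, period):
    """Return one indented text line per item for the given period."""
    lines = []
    for item in items:
        indent = "  " * item.get("level", 0)
        value = item.get("values", {}).get(period)  # None when the period is absent
        shown = "n/a" if value is None else f"{value:,.0f}"
        lines.append(f"{indent}{item['label']}: {shown}")
    return lines

outline = format_outline(raw_data, "2023-12-31")
print("\n".join(outline))
```

Using `.get()` for both `level` and `values` keeps the loop tolerant of items that lack a value for the requested period.
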
## Rendering and Display

### Custom Rendering

```python
# Render with specific options
rendered = statement.render()

# Rendered statement has rich formatting
print(rendered)
```

### Text Export

```python
# Get markdown-formatted text
markdown_text = statement.text()

# Suitable for:
# - AI/LLM consumption
# - Documentation
# - Text-based analysis
```

## Working with Statement Data

### Calculate Growth Rates

```python
import pandas as pd

# Convert to DataFrame
df = income.to_dataframe()

# Calculate period-over-period growth
if len(df.columns) >= 2:
    current = df.iloc[:, 0]
    prior = df.iloc[:, 1]

    # Growth rate
    growth = ((current - prior) / prior * 100).round(2)

    # Create comparison DataFrame
    comparison = pd.DataFrame({
        'Current': current,
        'Prior': prior,
        'Growth %': growth
    })

    print(comparison)
```

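The vectorised growth expression in "Calculate Growth Rates" yields `inf` or `NaN` when the prior value is zero or missing, and it flips sign when the prior value is negative. A scalar helper makes those cases explicit. This is a sketch under stated assumptions: `pct_change` is a hypothetical helper, and dividing by the magnitude of the prior value is a design choice that departs from the formula in this guide:

```python
def pct_change(current, prior):
    """Percent change from prior to current, or None when it is undefined."""
    if current is None or prior is None or prior == 0:
        return None
    # Divide by the magnitude of prior so a negative base keeps a sensible sign
    return round((current - prior) / abs(prior) * 100, 2)

print(pct_change(110.0, 100.0))   # ordinary case
print(pct_change(50.0, 0.0))      # undefined: prior is zero
print(pct_change(-50.0, -100.0))  # loss shrinking from -100 to -50
```

Returning `None` rather than `0` keeps "no meaningful growth rate" distinguishable from "no change".
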
### Extract Specific Metrics

```python
# Get income statement metrics
df = income.to_dataframe()

# Extract key metrics from most recent period
current = df.iloc[:, 0]

metrics = {
    'Revenue': current.get('Revenue', 0),
    'Operating Income': current.get('Operating Income', 0),
    'Net Income': current.get('Net Income', 0),
}

# Calculate derived metrics
if metrics['Revenue'] > 0:
    metrics['Operating Margin'] = (
        metrics['Operating Income'] / metrics['Revenue'] * 100
    )
    metrics['Net Margin'] = (
        metrics['Net Income'] / metrics['Revenue'] * 100
    )
```

### Filter Line Items

```python
# Convert to DataFrame
df = balance.to_dataframe()

# Filter for specific items
asset_items = df[df.index.str.contains('Asset', case=False)]
liability_items = df[df.index.str.contains('Liabilit', case=False)]

# Get subtotals
if 'Current Assets' in df.index:
    current_assets = df.loc['Current Assets']
```

### Time Series Analysis

```python
import matplotlib.pyplot as plt
import pandas as pd

# Get multiple periods
df = income.to_dataframe()

# Plot revenue trend
if 'Revenue' in df.index:
    # Convert to numeric; sort so periods run oldest to newest
    revenue_series = pd.to_numeric(df.loc['Revenue'], errors='coerce').sort_index()

    revenue_series.plot(kind='line', title='Revenue Trend')
    plt.show()
```

## Common Workflows

### Compare Current vs Prior Period

```python
import pandas as pd

# Get income statement
income = xbrl.statements.income_statement()
df = income.to_dataframe()

# Ensure we have at least 2 periods
if len(df.columns) >= 2:
    # Create comparison
    comparison = pd.DataFrame({
        'Current': df.iloc[:, 0],
        'Prior': df.iloc[:, 1],
        'Change': df.iloc[:, 0] - df.iloc[:, 1],
        'Change %': ((df.iloc[:, 0] - df.iloc[:, 1]) / df.iloc[:, 1] * 100).round(2)
    })

    # Show key metrics
    key_items = ['Revenue', 'Operating Income', 'Net Income']
    for item in key_items:
        if item in comparison.index:
            print(f"\n{item}:")
            print(comparison.loc[item])
```

### Extract All Periods to CSV

```python
# Get statement
statement = xbrl.statements.income_statement()

# Convert and save
df = statement.to_dataframe()
df.to_csv('income_statement.csv')

print(f"Exported {len(df)} line items across {len(df.columns)} periods")
```

### Build Financial Ratios

```python
# Get both income statement and balance sheet
income = xbrl.statements.income_statement()
balance = xbrl.statements.balance_sheet()

# Convert to DataFrames
income_df = income.to_dataframe()
balance_df = balance.to_dataframe()

# Extract values (most recent period)
revenue = income_df.loc['Revenue', income_df.columns[0]]
net_income = income_df.loc['Net Income', income_df.columns[0]]
total_assets = balance_df.loc['Assets', balance_df.columns[0]]
total_equity = balance_df.loc['Equity', balance_df.columns[0]]

# Calculate ratios
# The built-in round() works for both NumPy scalars and plain Python numbers
ratios = {
    'Net Profit Margin': round(net_income / revenue * 100, 2),
    'ROA': round(net_income / total_assets * 100, 2),
    'ROE': round(net_income / total_equity * 100, 2),
    'Asset Turnover': round(revenue / total_assets, 2),
}

print("Financial Ratios:")
for ratio, value in ratios.items():
    print(f"  {ratio}: {value}")
```

### Search for Specific Items

```python
# Get statement as DataFrame
df = income.to_dataframe()

# Search for items containing keywords
research_costs = df[df.index.str.contains('Research', case=False)]
tax_items = df[df.index.str.contains('Tax', case=False)]

# Or get raw data with concept names
raw = income.get_raw_data()
research_concepts = [
    item for item in raw
    if 'research' in item['label'].lower()
]
```

### Aggregate Subcategories

```python
# Get statement
df = balance.to_dataframe()

# Define categories (adjust based on actual labels)
current_asset_categories = [
    'Cash and Cash Equivalents',
    'Accounts Receivable',
    'Inventory',
    'Other Current Assets'
]

# Sum categories
current_assets_sum = sum([
    df.loc[cat, df.columns[0]]
    for cat in current_asset_categories
    if cat in df.index
])

# Verify against reported total
if 'Current Assets' in df.index:
    reported_total = df.loc['Current Assets', df.columns[0]]
    print(f"Calculated: {current_assets_sum}")
    print(f"Reported: {reported_total}")
    print(f"Difference: {current_assets_sum - reported_total}")
```

## Integration with Analysis Tools

### With Pandas

```python
# Statement integrates seamlessly with pandas
df = statement.to_dataframe()

# Use all pandas functionality
summary = df.describe()
correlations = df.T.corr()
rolling_avg = df.T.rolling(window=4).mean()
```

### With NumPy

```python
import numpy as np

# Convert to numpy array for numerical operations
df = statement.to_dataframe()
# Reverse the columns so periods run oldest to newest
# (assumes the first column is the most recent period)
values = df.values[:, ::-1].astype(float)

# Numerical analysis
mean_values = np.mean(values, axis=1)
std_values = np.std(values, axis=1)
growth_rates = np.diff(values, axis=1) / values[:, :-1]
```

### Export for Visualization

```python
import matplotlib.pyplot as plt

# Prepare data for plotting
df = income.to_dataframe()

# Select key items
plot_items = ['Revenue', 'Operating Income', 'Net Income']
plot_data = df.loc[plot_items].T

# Plot with matplotlib
plot_data.plot(kind='bar', figsize=(12, 6))
plt.title('Income Statement Trends')
plt.xlabel('Period')
plt.ylabel('Amount (USD)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
```

## Error Handling

### Missing Line Items

```python
# Check if item exists before accessing
df = statement.to_dataframe()

if 'Revenue' in df.index:
    revenue = df.loc['Revenue']
else:
    print("Revenue not found in statement")
    revenue = None  # Stays None if no alternative name matches
    # Try alternative names
    for alt in ['Revenues', 'Total Revenue', 'Net Revenue']:
        if alt in df.index:
            revenue = df.loc[alt]
            break
```

### Handling Different Formats

```python
# Companies may use different labels
def find_item(df, possible_names):
    """Find item by trying multiple possible names."""
    for name in possible_names:
        if name in df.index:
            return df.loc[name]
    return None

# Usage
revenue_names = ['Revenue', 'Revenues', 'Total Revenue', 'Net Sales']
revenue = find_item(df, revenue_names)

if revenue is not None:
    print(f"Found revenue: {revenue}")
else:
    print("Revenue not found under common names")
```

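Exact-name lookups such as `find_item` miss labels that differ only in case or punctuation ("Net sales" versus "Net Sales"). One option is to normalize both sides before comparing. The sketch below works on a plain list of labels; `normalize` and `find_label` are hypothetical helpers and the sample labels are made up:

```python
import re

def normalize(label):
    """Lowercase a label and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()

def find_label(available, candidates):
    """Return the first available label matching any candidate after normalization."""
    by_norm = {}
    for label in available:
        by_norm.setdefault(normalize(label), label)  # keep the first occurrence
    for candidate in candidates:
        hit = by_norm.get(normalize(candidate))
        if hit is not None:
            return hit
    return None

# Hypothetical labels as they might appear in df.index
labels = ["Net sales", "Cost of sales", "Net income (loss)"]
match = find_label(labels, ["Revenue", "Net Sales"])
print(match)
```

The returned value is the original label, so it can be passed straight to `df.loc[...]`.
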
### Incomplete Period Data

```python
# Check data availability
df = statement.to_dataframe()

# Check for null values
missing_data = df.isnull().sum()
if missing_data.any():
    print("Periods with missing data:")
    print(missing_data[missing_data > 0])

# Fill missing with 0 or forward fill
df_zero_filled = df.fillna(0)  # Replace NaN with 0
# or
# Forward fill across periods within each line item
# (fillna(method='ffill') is deprecated in recent pandas)
df_forward_filled = df.ffill(axis=1)
```

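Filling gaps with zero changes any statistic computed afterwards, because a missing period is then counted as a real zero. The small illustration below uses made-up numbers for a single line item to show the difference between skipping a gap and zero-filling it:

```python
from statistics import mean

# Hypothetical values for one line item across four periods; None marks a gap
values = [120.0, None, 100.0, 80.0]

reported = [v for v in values if v is not None]
mean_skipping_gaps = mean(reported)                # uses only reported periods
mean_zero_filled = mean(v or 0.0 for v in values)  # treats the gap as a real zero

print(mean_skipping_gaps, mean_zero_filled)
```

Which treatment is appropriate depends on whether a gap means "not reported" or "nothing to report"; the data alone usually cannot tell the two apart.
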
## Best Practices

1. **Always convert to DataFrame for analysis**:
   ```python
   df = statement.to_dataframe()  # Easier to work with
   ```

2. **Check item names before accessing**:
   ```python
   if 'Revenue' in df.index:
       revenue = df.loc['Revenue']
   ```

3. **Handle multiple naming conventions**:
   ```python
   # Try variations
   for name in ['Revenue', 'Revenues', 'Total Revenue']:
       if name in df.index:
           revenue = df.loc[name]
           break
   ```

4. **Validate calculated values**:
   ```python
   # Check against reported totals
   calculated = sum(components)  # components: the individual line-item values
   reported = df.loc['Total', df.columns[0]]  # Single period, so the result is a scalar
   assert abs(calculated - reported) < 0.01, "Mismatch!"
   ```

5. **Use period filters appropriately**:
   ```python
   # Filter to specific years
   df_2024 = statement.to_dataframe(period_filter='2024')
   ```

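For the validation step in the list above, a fixed `0.01` threshold can be too strict when reported figures are rounded to thousands or millions. A tolerance-based comparison is one alternative. In this sketch `reconcile` is a hypothetical helper, the figures are made up, and the default tolerances are arbitrary assumptions to adjust per filing:

```python
import math

def reconcile(components, reported_total, rel_tol=1e-6, abs_tol=0.5):
    """Compare a sum of components with a reported total.

    Returns (ok, difference). abs_tol allows for rounding in reported figures.
    """
    calculated = math.fsum(components)
    ok = math.isclose(calculated, reported_total, rel_tol=rel_tol, abs_tol=abs_tol)
    return ok, calculated - reported_total

# Hypothetical figures (in millions)
ok, diff = reconcile([250.0, 120.4, 80.3], 450.7)
print(ok, diff)
```

Returning the difference alongside the boolean lets the caller log how far off a failing reconciliation was instead of stopping on an assertion.
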
## Performance Tips

### Caching DataFrames

```python
# Cache the DataFrame if using repeatedly
df_cache = statement.to_dataframe()

# Reuse cached version
revenue = df_cache.loc['Revenue']
net_income = df_cache.loc['Net Income']
# ... more operations
```

### Selective Period Loading

```python
# If you only need recent data
current_only = xbrl.current_period.income_statement()
df = current_only.to_dataframe()  # Smaller, faster
```

## Troubleshooting

### "KeyError: Line item not found"

**Cause**: Item label doesn't match exactly

**Solution**:
```python
# List all available items
print(df.index.tolist())

# Or search for pattern
matching = df[df.index.str.contains('keyword', case=False)]
```

### "Empty DataFrame"

**Cause**: Statement has no data or wrong period filter

**Solution**:
```python
# Check raw data
raw = statement.get_raw_data()
print(f"Statement has {len(raw)} items")

# Check periods
print(f"Available periods: {statement.periods}")
```

### "Index error when accessing columns"

**Cause**: Fewer periods than expected

**Solution**:
```python
# Check column count first
if len(df.columns) >= 2:
    current = df.iloc[:, 0]
    prior = df.iloc[:, 1]
else:
    print("Insufficient periods for comparison")
```

This guide covers the essential patterns for working with Statement objects in edgartools. For information on accessing statements from XBRL, see the XBRL documentation.

venv/lib/python3.10/site-packages/edgar/xbrl/docs/XBRL.md
# XBRL Class Documentation

## Overview

The `XBRL` class is the primary interface for working with XBRL (eXtensible Business Reporting Language) financial data from SEC filings. It provides structured access to financial statements, facts, and related data extracted from filings like 10-K, 10-Q, and 8-K reports.

XBRL documents contain:
- **Financial statements** (Income Statement, Balance Sheet, Cash Flow, etc.)
- **Facts** - Individual data points with values, periods, and dimensions
- **Contexts** - Time periods and dimensional information
- **Presentation** - How facts are organized into statements

## Getting XBRL Data

### From a Filing

```python
# Get XBRL from any filing with financial data
filing = company.get_filings(form="10-K").latest()
xbrl = filing.xbrl()
```

### Quick Check

```python
# Print XBRL to see what's available
print(xbrl)
# Shows: company info, available statements, periods, and usage examples
```

## Accessing Financial Statements

### Core Statement Methods

The XBRL class provides convenient methods for accessing standard financial statements:

```python
# Access core financial statements
income = xbrl.statements.income_statement()
balance = xbrl.statements.balance_sheet()
cashflow = xbrl.statements.cash_flow_statement()
equity = xbrl.statements.statement_of_equity()
comprehensive = xbrl.statements.comprehensive_income()
```

### Access by Name

You can access any statement by its exact name as it appears in the filing:

```python
# List all available statements
print(xbrl.statements)

# Access specific statement by name
cover_page = xbrl.statements['CoverPage']
disclosure = xbrl.statements['CONDENSED CONSOLIDATED BALANCE SHEETS Unaudited']
```

### Access by Index

Statements can also be accessed by their index position:

```python
# Get statement by index (0-based)
first_statement = xbrl.statements[0]
sixth_statement = xbrl.statements[6]
```

## Working with Periods

### Current Period Only

To work with just the most recent period's data:

```python
# Get current period XBRL view
current = xbrl.current_period

# Access statements for current period
current_income = current.income_statement()
current_balance = current.balance_sheet()
```

### Multi-Period Statements

By default, statements include multiple periods for comparison:

```python
# Get income statement with comparative periods
income = xbrl.statements.income_statement()
# Typically includes current year/quarter and prior periods

# Convert to DataFrame to see all periods
df = income.to_dataframe()
print(df.columns)  # Shows all available periods
```

### Available Periods

```python
# See what periods are available
for period in xbrl.reporting_periods:
    print(f"Period: {period['label']}, Key: {period['key']}")
```

## Querying Facts

The `.facts` property provides a powerful query interface for finding specific data points:

### Basic Fact Queries

```python
# Get all revenue facts
revenue_facts = xbrl.facts.query().by_concept('Revenue').to_dataframe()

# Get net income facts
net_income = xbrl.facts.query().by_concept('NetIncome').to_dataframe()

# Search by label instead of concept name
revenue = xbrl.facts.query().by_label('Revenue').to_dataframe()
```

### Filter by Period

```python
# Get facts for a specific period
period_key = "duration_2024-01-01_2024-12-31"
facts_2024 = xbrl.facts.query().by_period_key(period_key).to_dataframe()

# Filter by fiscal year
facts_fy2024 = xbrl.facts.query().by_fiscal_year(2024).to_dataframe()

# Filter by fiscal period
q1_facts = xbrl.facts.query().by_fiscal_period("Q1").to_dataframe()
```

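The period key in "Filter by Period" packs a period type and its dates into one string. A small parser can turn it back into dates for comparisons. This sketch assumes the `duration_<start>_<end>` layout shown in this guide; the `instant_<date>` layout is an assumption that does not appear here, and `parse_period_key` is a hypothetical helper:

```python
from datetime import date

def parse_period_key(key):
    """Split a period key into (period_type, start, end) with date objects."""
    period_type, *parts = key.split("_")
    dates = [date.fromisoformat(p) for p in parts]
    if period_type == "duration" and len(dates) == 2:
        return period_type, dates[0], dates[1]
    if period_type == "instant" and len(dates) == 1:  # assumed layout
        return period_type, None, dates[0]
    raise ValueError(f"Unrecognized period key: {key!r}")

print(parse_period_key("duration_2024-01-01_2024-12-31"))
print(parse_period_key("instant_2024-12-31"))
```

Keys copied from `xbrl.reporting_periods` should be checked against these assumed layouts before relying on the parser.
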
### Filter by Statement Type

```python
# Get all income statement facts
income_facts = xbrl.facts.query().by_statement_type("IncomeStatement").to_dataframe()

# Get all balance sheet facts
balance_facts = xbrl.facts.query().by_statement_type("BalanceSheet").to_dataframe()
```

### Chaining Filters

```python
# Combine multiple filters
revenue_2024 = (xbrl.facts.query()
    .by_concept('Revenue')
    .by_fiscal_year(2024)
    .by_period_type('duration')
    .to_dataframe())
```

### Pattern Matching

```python
# Find all concepts matching a pattern (case-insensitive)
asset_facts = xbrl.facts.query().by_concept('Asset', exact=False).to_dataframe()

# Search labels with pattern
liability_facts = xbrl.facts.query().by_label('liabilities', exact=False).to_dataframe()
```

## Converting to DataFrames

### Statement to DataFrame

```python
# Convert any statement to pandas DataFrame
income = xbrl.statements.income_statement()
df = income.to_dataframe()

# DataFrame has:
# - One row per line item
# - One column per period
# - Index is the concept/label
```

### Facts to DataFrame

```python
# Convert query results to a DataFrame
df = xbrl.facts.query().by_concept('Revenue').to_dataframe()

# DataFrame columns:
# - concept: XBRL concept name
# - label: Human-readable label
# - value: Fact value
# - period: Period identifier
# - start: Period start date (for duration)
# - end: Period end date
# - unit: Unit of measure (e.g., USD)
# - dimensions: Dimensional breakdowns (if any)
```

## Advanced Patterns

### Finding Specific Disclosures

```python
# Get statements organized by category
categories = xbrl.statements.get_statements_by_category()

# View all disclosures
disclosures = categories['disclosure']
for disc in disclosures:
    print(f"{disc['index']}: {disc['title']}")

# View all notes
notes = categories['note']
for note in notes:
    print(f"{note['index']}: {note['title']}")

# Get core financial statements
core_statements = categories['statement']

# Or list all statements to find specific ones
all_statements = xbrl.get_all_statements()
for stmt in all_statements:
    print(f"{stmt['type']}: {stmt['title']}")

# Access by exact name or index
risk_factors = xbrl.statements['RiskFactorsDisclosure']
# Or by index from the category list
first_disclosure = xbrl.statements[disclosures[0]['index']]
```

### Cross-Period Analysis

```python
# Get multi-period income statement
income = xbrl.statements.income_statement()
df = income.to_dataframe()

# Calculate year-over-year growth
if len(df.columns) >= 2:
    current = df.iloc[:, 0]
    prior = df.iloc[:, 1]
    growth = ((current - prior) / prior * 100).round(2)
    print(f"Revenue growth: {growth.loc['Revenue']}%")
```

### Working with Dimensions

```python
# Query facts with specific dimensional breakdowns
segment_revenue = (xbrl.facts.query()
    .by_concept('Revenue')
    .by_dimension('Segment', 'ProductSegment')
    .to_dataframe())

# Group by dimensions
segment_totals = segment_revenue.groupby('dimensions')['value'].sum()
```

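If the `dimensions` column holds dictionaries, as the column reference in this guide describes, grouping on it directly would likely fail because dictionaries are not hashable. One workaround is to derive a hashable key first. The sketch below shows the idea on plain records; the member names and values are made up and `dimension_key` is a hypothetical helper:

```python
from collections import defaultdict

# Hypothetical fact records; member names and values are made up
facts = [
    {"value": 60.0, "dimensions": {"Segment": "ProductSegment"}},
    {"value": 40.0, "dimensions": {"Segment": "ServiceSegment"}},
    {"value": 15.0, "dimensions": {"Segment": "ProductSegment"}},
]

def dimension_key(dimensions):
    """Turn a dimensions dict into a hashable, order-independent key."""
    return tuple(sorted((dimensions or {}).items()))

totals = defaultdict(float)
for fact in facts:
    totals[dimension_key(fact["dimensions"])] += fact["value"]

for key, total in totals.items():
    print(dict(key), total)
```

With a DataFrame, the same key could be built with `segment_revenue['dimensions'].apply(dimension_key)` and the resulting Series passed to `groupby`.
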
### Custom Fact Filtering

```python
# Filter on the fact value with a predicate function
large_amounts = xbrl.facts.query().by_value(lambda v: abs(v) > 1000000).to_dataframe()

# Filter on any fact field with a custom predicate
recent_facts = xbrl.facts.query().by_custom(
    lambda fact: fact['end'] >= '2024-01-01'
).to_dataframe()
```

## Common Workflows

### Extract Revenue from Income Statement

```python
# Method 1: Via statement
income = xbrl.statements.income_statement()
df = income.to_dataframe()
revenue = df.loc['Revenue']

# Method 2: Via facts query
revenue_facts = xbrl.facts.query().by_concept('Revenues').to_dataframe()
latest_revenue = revenue_facts.iloc[0]['value']
```

### Compare Current vs Prior Year

```python
import pandas as pd

# Get current period data
current = xbrl.current_period
current_income = current.income_statement()
current_df = current_income.to_dataframe()

# Get full multi-period data
full_income = xbrl.statements.income_statement()
full_df = full_income.to_dataframe()

# Compare
if len(full_df.columns) >= 2:
    comparison = pd.DataFrame({
        'Current': full_df.iloc[:, 0],
        'Prior': full_df.iloc[:, 1],
        'Change': full_df.iloc[:, 0] - full_df.iloc[:, 1]
    })
    print(comparison)
```

### Extract Specific Disclosure Data

```python
# Find debt-related disclosures
all_statements = xbrl.get_all_statements()
debt_statements = [s for s in all_statements if 'debt' in s['title'].lower()]

# Access first debt disclosure
if debt_statements:
    debt_disclosure = xbrl.statements[debt_statements[0]['type']]
    debt_df = debt_disclosure.to_dataframe()
```

### Export All Core Statements

```python
# Export all core financial statements to CSV
statements_to_export = {
    'income_statement': xbrl.statements.income_statement(),
    'balance_sheet': xbrl.statements.balance_sheet(),
    'cash_flow': xbrl.statements.cash_flow_statement(),
}

for name, stmt in statements_to_export.items():
    if stmt:
        df = stmt.to_dataframe()
        df.to_csv(f"{name}.csv")
```

### Build Custom Financial Summary

```python
import pandas as pd

# Extract key metrics from multiple statements
metrics = {}

# Revenue and profit from income statement
income = xbrl.statements.income_statement()
income_df = income.to_dataframe()
metrics['Revenue'] = income_df.loc['Revenue', income_df.columns[0]]
metrics['Net Income'] = income_df.loc['Net Income', income_df.columns[0]]

# Assets from balance sheet
balance = xbrl.statements.balance_sheet()
balance_df = balance.to_dataframe()
metrics['Total Assets'] = balance_df.loc['Assets', balance_df.columns[0]]

# Cash flow from operations
cashflow = xbrl.statements.cash_flow_statement()
cashflow_df = cashflow.to_dataframe()
metrics['Operating Cash Flow'] = cashflow_df.loc['Operating Activities', cashflow_df.columns[0]]

# Create summary DataFrame
summary = pd.DataFrame([metrics])
print(summary)
```

## Entity Information

### Access Filing Metadata

```python
# Get entity and filing information
entity_info = xbrl.entity_info

print(f"Company: {entity_info.get('entity_name')}")
print(f"Ticker: {entity_info.get('trading_symbol')}")
print(f"CIK: {entity_info.get('entity_identifier')}")
print(f"Form: {entity_info.get('document_type')}")
print(f"Fiscal Year: {entity_info.get('document_fiscal_year_focus')}")
print(f"Fiscal Period: {entity_info.get('document_fiscal_period_focus')}")
```

## Error Handling

### Missing Statements

```python
from edgar.xbrl.xbrl import StatementNotFound

try:
    equity = xbrl.statements.statement_of_equity()
except StatementNotFound:
    print("Statement of equity not available in this filing")
    equity = None
```

### Empty Query Results

```python
# Query returns empty DataFrame if no matches
results = xbrl.facts.query().by_concept('NonexistentConcept').to_dataframe()

if results.empty:
    print("No facts found matching query")
```

### Handling Multiple Formats

```python
# Some companies use different concept names
revenue_concepts = ['Revenue', 'Revenues', 'SalesRevenue', 'RevenueFromContractWithCustomer']

for concept in revenue_concepts:
    revenue = xbrl.facts.query().by_concept(concept).to_dataframe()
    if not revenue.empty:
        print(f"Found revenue under concept: {concept}")
        break
```

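The fallback loop in "Handling Multiple Formats" can be factored into a reusable helper that also reports which name matched. In this sketch a dictionary stands in for the facts query so the example runs on its own; `first_match` and `fake_facts` are hypothetical names, not library features:

```python
def first_match(candidates, lookup):
    """Return (name, result) for the first candidate whose lookup is non-empty."""
    for name in candidates:
        result = lookup(name)
        if result:  # empty lists/dicts and None count as "not found"
            return name, result
    return None, None

# Hypothetical stand-in for a facts query: concept name -> list of values
fake_facts = {"Revenues": [1000.0, 900.0]}
name, values = first_match(
    ["Revenue", "Revenues", "SalesRevenue"],
    lambda concept: fake_facts.get(concept, []),
)
print(name, values)
```

For DataFrame results, the `lookup` callable should return `None` when `.empty` is true, since a DataFrame cannot be used directly in a truth test.
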
## Performance Considerations

### Caching

```python
# Facts are cached after first access
facts = xbrl.facts   # First call - loads data
facts2 = xbrl.facts  # Subsequent calls use cache
```

### Limiting Results

```python
# Use limit() to reduce memory usage for large result sets
sample_facts = xbrl.facts.query().limit(100).to_dataframe()
```

### Efficient Filtering

```python
# Apply specific filters early in the query chain
# Good: specific filters first
revenue = (xbrl.facts.query()
    .by_statement_type("IncomeStatement")  # Narrow down first
    .by_concept("Revenue")                 # Then more specific
    .to_dataframe())

# Less efficient: broad query then filter
all_facts = xbrl.facts.query().to_dataframe()
revenue = all_facts[all_facts['concept'] == 'Revenue']
```

## Data Structure Reference

### Key Properties

| Property | Type | Description |
|----------|------|-------------|
| `statements` | Statements | Access to financial statements |
| `facts` | FactsView | Query interface for facts |
| `entity_info` | dict | Company and filing metadata |
| `reporting_periods` | list | Available time periods |
| `contexts` | dict | XBRL contexts (periods + dimensions) |
| `units` | dict | Units of measure |
| `current_period` | CurrentPeriodView | Current period only |

### Fact DataFrame Columns

When you convert facts to a DataFrame using `.to_dataframe()`, you get:

- `concept`: XBRL element name (e.g., 'Revenues', 'Assets')
- `label`: Human-readable label
- `value`: Fact value (numeric or text)
- `period`: Period identifier
- `start`: Period start date (for duration periods)
- `end`: Period end date
- `unit`: Unit of measure (e.g., 'USD', 'shares')
- `dimensions`: Dictionary of dimensional breakdowns
- `decimals`: Precision indicator

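The column list above can be read as the shape of a single fact record. The dataclass below mirrors it for illustration only: `FactRow` is not a class from the library, the field types are assumptions inferred from the descriptions, and the sample values are made up:

```python
from dataclasses import dataclass, field
from typing import Optional, Union

@dataclass
class FactRow:
    """Illustrative mirror of one row of the facts DataFrame (not a library class)."""
    concept: str
    label: str
    value: Union[float, str]
    period: str
    end: str
    start: Optional[str] = None  # only set for duration periods
    unit: Optional[str] = None
    dimensions: dict = field(default_factory=dict)
    decimals: Optional[int] = None

    @property
    def is_duration(self) -> bool:
        """True when the fact covers a span of time rather than a point in time."""
        return self.start is not None

# Hypothetical example row; all values are made up
row = FactRow(
    concept="Revenues", label="Revenue", value=1000.0,
    period="duration_2024-01-01_2024-12-31",
    start="2024-01-01", end="2024-12-31", unit="USD", decimals=-6,
)
print(row.is_duration, row.dimensions)
```

Treating a missing `start` as the marker for point-in-time facts follows the "for duration periods" note in the list; whether the real DataFrame uses `None`, `NaN`, or an empty string there is not stated in this guide.
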
## Integration with Other Classes

### With Filing

```python
# XBRL comes from filing
filing = company.get_filings(form="10-K").latest()
xbrl = filing.xbrl()

# Access back to filing if needed
# (Store reference if you need it)
```

### With Company

```python
import pandas as pd

# Get multiple filings and compare XBRL data
filings = company.get_filings(form="10-Q", count=4)

revenue_trend = []
for filing in filings:
    xbrl = filing.xbrl()
    revenue = xbrl.facts.query().by_concept('Revenue').to_dataframe()
    if not revenue.empty:
        revenue_trend.append({
            'filing_date': filing.filing_date,
            'revenue': revenue.iloc[0]['value']
        })

trend_df = pd.DataFrame(revenue_trend)
```

## Best Practices

1. **Check statement availability** before accessing:
   ```python
   print(xbrl)  # See what's available
   ```

2. **Use query chaining** for complex filters:
   ```python
   results = (xbrl.facts.query()
       .by_statement_type("IncomeStatement")
       .by_fiscal_year(2024)
       .by_period_type("duration")
       .to_dataframe())
   ```

3. **Handle missing data gracefully**:
   ```python
   from edgar.xbrl.xbrl import StatementNotFound

   try:
       stmt = xbrl.statements.statement_of_equity()
   except StatementNotFound:
       stmt = None
   ```

4. **Convert to DataFrame for analysis**:
   ```python
   df = statement.to_dataframe()  # Easier to work with
   ```

5. **Use current_period for latest data**:
   ```python
   current = xbrl.current_period
   latest_income = current.income_statement()
   ```

## Troubleshooting

### "Statement not found"

**Cause**: Statement doesn't exist in this filing or uses non-standard name

**Solution**:
```python
# List all available statements
print(xbrl.statements)

# Or check available types
all_statements = xbrl.get_all_statements()
statement_types = [s['type'] for s in all_statements]
```

### "No facts found"

**Cause**: Concept name doesn't match or no data for period

**Solution**:
```python
# Try pattern matching
results = xbrl.facts.query().by_concept('Revenue', exact=False).to_dataframe()

# Or search by label
results = xbrl.facts.query().by_label('revenue').to_dataframe()
```

### "Empty DataFrame"

**Cause**: Period filter too restrictive or no data available

**Solution**:
```python
# Check available periods
print(xbrl.reporting_periods)

# Query without period filter
all_revenue = xbrl.facts.query().by_concept('Revenue').to_dataframe()
```

This comprehensive guide covers the essential patterns for working with XBRL data in edgartools. For more examples, see the Filing and Statement documentation.