Loading utilities for data file operations.
The loading_utils.py module provides functions to load data from CSV, TXT and Excel files.
csv_reader: Load data from a CSV file.
Parameters:
file_path: Path to the CSV file
Returns:
- DataFrame with the CSV data, treating 'no' as NaN values
Raises:
FileNotFoundError: If the file does not exist
DataLoadError: If the file cannot be read
Example:
from loaders.loading_utils import csv_reader
# Load CSV file
data = csv_reader('input/data.csv')
print(f"Loaded {len(data)} rows, {len(data.columns)} columns")
Notes:
- The function treats the string 'no' as a NaN value
- Uses UTF-8 encoding
- Automatically infers data types
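As a rough illustration of the notes above, the following sketch reproduces the described behaviour with plain pandas on an in-memory CSV. It is not the module's actual implementation: passing `na_values=["no"]` and `encoding="utf-8"` to `pandas.read_csv` is an assumption about how `csv_reader` might be configured.

```python
import io

import pandas as pd

# In-memory stand-in for a UTF-8 encoded CSV file; 'no' marks a missing value.
raw = "x,y,ux,uy\n1.0,2.0,0.1,0.2\n2.0,no,0.1,0.2\n3.0,6.0,0.1,0.2\n".encode("utf-8")

# Assumed configuration: 'no' is added to pandas' default NaN markers.
data = pd.read_csv(io.BytesIO(raw), encoding="utf-8", na_values=["no"])

missing_y = int(data["y"].isna().sum())  # count of missing entries in column y
y_dtype = str(data["y"].dtype)           # type inferred for column y
print(f"Missing values in y: {missing_y}, inferred dtype: {y_dtype}")
```

Because 'no' is parsed as NaN rather than as text, the column keeps a numeric type instead of falling back to strings.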
excel_reader: Load data from an Excel file (.xlsx).
Parameters:
file_path: Path to the Excel file
Returns:
- DataFrame with the Excel data
Raises:
FileNotFoundError: If the file does not exist
DataLoadError: If the file cannot be read
Example:
from loaders.loading_utils import excel_reader
# Load Excel file
data = excel_reader('input/data.xlsx')
print(f"Loaded {len(data)} rows, {len(data.columns)} columns")
Notes:
- Supports the .xlsx format
- Reads the first sheet by default
- Handles missing values automatically
from loaders.loading_utils import csv_reader
from utils.exceptions import FileNotFoundError

# A missing file raises FileNotFoundError
try:
    data = csv_reader('nonexistent.csv')
except FileNotFoundError as e:
    print(f"File not found: {e}")

from loaders.loading_utils import excel_reader
from utils.exceptions import DataLoadError

# An unreadable (for example corrupted) file raises DataLoadError
try:
    data = excel_reader('corrupted.xlsx')
except DataLoadError as e:
    print(f"Failed to load: {e}")

from loaders.loading_utils import csv_reader
from utils.exceptions import DataLoadError

# An empty file also raises DataLoadError
try:
    data = csv_reader('empty.csv')
except DataLoadError as e:
    print(f"File is empty: {e}")

- Encoding: UTF-8
- Delimiter: Comma (,)
- Header: First row should contain column names
- Missing values: Use 'no' or leave empty
Example CSV:
x,y,ux,uy
1.0,2.0,0.1,0.2
2.0,4.0,0.1,0.2
3.0,6.0,0.1,0.2

- Sheets: Reads the first sheet by default
- Header: First row should contain column names
- Missing values: Empty cells or 'no'
Example Excel:
| x | y | ux | uy |
|---|---|---|---|
| 1.0 | 2.0 | 0.1 | 0.2 |
| 2.0 | 4.0 | 0.1 | 0.2 |
| 3.0 | 6.0 | 0.1 | 0.2 |
from loaders.loading_utils import csv_reader, excel_reader
# Load CSV
csv_data = csv_reader('input/experiment.csv')
# Load Excel
excel_data = excel_reader('input/experiment.xlsx')

The loading utilities are typically used through the higher-level data_loader module:
from loaders.data_loader import load_data
# Load data using file path from native file picker or other source
data = load_data('input/experiment.csv', 'csv')

- Error Handling: Always wrap file operations in try-except blocks

      from loaders.loading_utils import csv_reader
      from utils.exceptions import DataLoadError, FileNotFoundError

      try:
          data = csv_reader('data.csv')
      except FileNotFoundError:
          print("File not found")
      except DataLoadError as e:
          print(f"Loading error: {e}")

- Path Validation: Use absolute paths when possible

      from pathlib import Path

      # resolve() turns the relative path into an absolute one
      file_path = (Path('input') / 'data.csv').resolve()
      data = csv_reader(str(file_path))

- Data Validation: Validate loaded data before use

      from utils.validators import validate_dataframe

      data = csv_reader('data.csv')
      validate_dataframe(data, min_rows=2)
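The source does not show what `validate_dataframe` checks. The sketch below is one possible shape for such a validator, written only to illustrate the idea: the function name `validate_dataframe_sketch`, the `required_columns` parameter, and the choice of `ValueError` are all assumptions rather than the project's API.

```python
import pandas as pd


def validate_dataframe_sketch(data, min_rows=1, required_columns=()):
    """Raise ValueError if the frame is too short or lacks columns (hypothetical checks)."""
    if not isinstance(data, pd.DataFrame):
        raise ValueError("Expected a pandas DataFrame")
    if len(data) < min_rows:
        raise ValueError(f"Expected at least {min_rows} rows, got {len(data)}")
    missing = [col for col in required_columns if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    return data


# Usage: a two-row frame passes a min_rows=2 check.
frame = pd.DataFrame({"x": [1.0, 2.0], "y": [2.0, 4.0]})
validate_dataframe_sketch(frame, min_rows=2, required_columns=("x", "y"))
```

Failing early with a descriptive message keeps malformed input from surfacing later as an obscure error in downstream analysis code.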
- CSV: Uses pandas.read_csv() with UTF-8 encoding
- Excel: Uses pandas.read_excel() with automatic format detection
- Error Handling: Catches pandas exceptions and converts them to custom exceptions
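The conversion of pandas exceptions into custom exceptions could look roughly like the sketch below. This is an illustration under stated assumptions, not the module's code: `DataLoadError` is redefined locally as a stand-in for `utils.exceptions.DataLoadError`, the built-in `FileNotFoundError` replaces the project's own class, and the set of caught pandas errors is a guess.

```python
from pathlib import Path

import pandas as pd


class DataLoadError(Exception):
    """Hypothetical stand-in for utils.exceptions.DataLoadError."""


def csv_reader_sketch(file_path):
    """Read a CSV file, mapping pandas errors to DataLoadError (illustrative only)."""
    path = Path(file_path)
    if not path.is_file():
        # Built-in FileNotFoundError is used here; the project defines its own class.
        raise FileNotFoundError(f"No such file: {path}")
    try:
        return pd.read_csv(path, encoding="utf-8", na_values=["no"])
    except (pd.errors.EmptyDataError, pd.errors.ParserError, UnicodeDecodeError) as exc:
        # Convert low-level pandas or decoding failures into one domain-level error.
        raise DataLoadError(f"Could not read {path}: {exc}") from exc
```

Chaining with `from exc` preserves the original pandas traceback, so callers can catch a single exception type without losing diagnostic detail.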
For more information about data loading, see Data Loader.
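The `load_data(path, file_type)` call shown in the integration example suggests a wrapper that picks a reader based on a file-type string. One plausible way to structure such a dispatcher is sketched below; the `READERS` mapping, the `"xlsx"` key, the function name `load_data_sketch`, and the direct pandas calls standing in for `csv_reader` and `excel_reader` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical mapping from a file-type string to a reader function.
# Plain pandas calls stand in for the module's csv_reader and excel_reader.
READERS = {
    "csv": lambda path: pd.read_csv(path, encoding="utf-8", na_values=["no"]),
    "xlsx": lambda path: pd.read_excel(path, na_values=["no"]),
}


def load_data_sketch(file_path, file_type):
    """Dispatch to the reader registered for file_type (illustrative only)."""
    try:
        reader = READERS[file_type.lower()]
    except KeyError:
        raise ValueError(f"Unsupported file type: {file_type!r}") from None
    return reader(file_path)
```

A lookup table keeps the dispatcher small, and supporting another format only requires registering one more reader.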