Forecast App - Manual

Marco Zanotti

2025-09-30

Introduction

The Forecast App is a comprehensive tool designed to guide users through the entire forecasting workflow, from raw data to scenario analysis. It is tailored for both beginners and experienced practitioners, allowing the exploration, modeling, and explanation of time series data without requiring advanced knowledge of forecasting techniques.

The app is organized into four main sections (Data, Analyze, Features, and Forecast), which collectively enable users to:

Upload and explore time series data
Detect and handle missing values and anomalies
Transform and preprocess data for improved forecasting performance
Create and select relevant features, both internal and external
Fit and optimize a wide range of forecasting models, including classical, machine learning, deep learning, and ensemble approaches
Evaluate, compare, and explain the models’ predictions
Generate probabilistic forecasts and business-oriented scenarios

All actions performed in the app are automatically saved, meaning that any transformation, cleaning, or feature engineering applied to the series is carried forward to the subsequent steps. This ensures a consistent and reproducible workflow from raw data to final forecasts.

The app also provides great flexibility in analysis: interactive visualizations, configurable model parameters, and multiple options for evaluation and explanation allow users to tailor the process to their specific needs. Common controls, such as the gear button ⚙️, enable the selection of specific parameters. The play button ▶️ is used to execute actions like generating forecasts or running tests, while a reset button 🔄 is available to clear all settings and start fresh. Moreover, for most tables displayed throughout the dashboard, an Export button is available, allowing users to download the table contents in CSV or Excel format for further use.

Follow this user guide to leverage the full potential of the Forecast App to obtain accurate forecasts, understand model behavior, and explore different scenarios for informed decision-making.

If you are interested in who I am and what I do, visit my website.

The source code of the Forecast App is available on GitHub under the MIT license.

If you find bugs or have suggestions for improvements, please open an issue on GitHub.

Feel free to share the Forecast App on LinkedIn if you find it useful and remember to tag me!

Section 1: Data

The Data section is the starting point of the Forecast App. Here you can choose which time series to analyze and prepare for forecasting.

Predefined datasets

The app includes a few predefined univariate time series datasets that you can load instantly. These datasets are useful for exploring the app’s functionalities and testing the workflow without needing to upload your own data.

Import your own data

You can also import your own univariate time series dataset in CSV format, using a comma as the field separator and a period as the decimal mark (sep = “,” and dec = “.”).
The file must contain the following columns:

date: the timestamp of the observation (must be in a valid date format, e.g., YYYY-MM-DD)
id: an identifier for the time series (can be set to a default value if only one series is provided)
value: the observed numerical value
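
As a rough illustration of the expected layout, the Python sketch below parses a small CSV text and checks it against the three required columns. The helper name validate_rows and the sample values are hypothetical and are not part of the app; treating an empty value cell as a missing observation is also an assumption.

```python
import csv
import io
from datetime import date

REQUIRED_COLUMNS = ["date", "id", "value"]

def validate_rows(csv_text):
    """Parse CSV text and check it against the layout described above."""
    reader = csv.DictReader(io.StringIO(csv_text))  # default delimiter is ","
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "date": date.fromisoformat(row["date"]),  # expects YYYY-MM-DD
            "id": row["id"],
            # "." is the decimal mark; an empty cell is kept as None (assumption)
            "value": float(row["value"]) if row["value"] != "" else None,
        })
    return rows

sample = "date,id,value\n2024-01-01,series_1,10.5\n2024-02-01,series_1,11.0\n"
rows = validate_rows(sample)
print(len(rows), rows[0]["date"], rows[0]["value"])
```

Running a check of this kind before uploading can catch a misspelled column name or a non-ISO date early.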

The following frequencies are supported:

year (1 observation per year)
semester (2 observations per year)
quarter (4 observations per year)
month (12 observations per year)
week (52 observations per year)
bus-day (252 observations per year, business days)
day (365 observations per year, calendar days)
hour (365 × 24 observations per year, calendar hours)
half-hour (365 × 48 observations per year, calendar half-hours)

Options available

Once a dataset is loaded, you have to set some important global parameters that will be used throughout the app:

Frequency: set the frequency of the time series (e.g., daily, weekly, monthly).
Forecast Horizon: specify how many future periods to forecast.
Assessment Period: define the historical period to be used for model assessment.
Impute missing values: apply automatic interpolation based on the declared frequency to handle missing observations without the need for manual imputation.
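
To give an idea of what interpolation does, here is a minimal Python sketch of linear gap filling. It assumes the series has already been placed on a regular grid for the declared frequency, with None marking missing observations. The manual does not state which interpolation method the app uses, so linear interpolation is an assumption, and interpolate_missing is a hypothetical name.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known points.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled

print(interpolate_missing([1.0, None, None, 4.0, None]))
```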

Explore the data

You can briefly explore the dataset:

Table: view the raw data in tabular format
Summary statistics: obtain basic descriptive statistics of the time series (mean, median, min, max, missing values, etc.)

This initial exploration helps you quickly assess whether the data has been uploaded correctly and is ready for further analysis.

⚠️ Note: Make sure your dataset is properly formatted, as incorrect column names or unsupported frequency values will prevent the app from processing the data.

⚠️ Note: The uploaded data is fully editable.
Users can directly modify values within the data table (e.g., correcting errors, adjusting entries).
Any edits are automatically saved and will affect all subsequent steps of the workflow.

Section 2: Analyze

The Analyze section provides a set of tools to better understand and prepare your time series before building forecasting models.

From the main navigation bar, you can click on Analyze and then select one of the available options from the dropdown menu. Each option corresponds to a specific step in the analysis workflow.

In this section you can:

Visualize the time series with interactive plots to better understand trends, seasonality, and irregular patterns.
Perform Hypothesis Testing to check for important statistical properties such as stationarity or autocorrelation.
Detect and correct Anomalies in the data (e.g., outliers or unusual spikes).
Apply Transformations to stabilize variance, remove seasonality, or make the series more suitable for modeling.

These tools are designed to help you diagnose potential issues in the data and apply the necessary preprocessing steps before moving on to feature creation and forecasting.

Visualize

The Visualize subsection allows you to explore the main characteristics of your time series through a set of interactive plots.

Panels displayed

Four plots are displayed:

Time series plot: shows the raw time series (with optional smoothing and imputation).
Autocorrelation plot (ACF): displays correlations between the series and its lagged values.
Decomposition plot: breaks down the series into trend, seasonal, and residual components.
Seasonality plot: visualizes recurring seasonal patterns within the chosen frequency.

This visualization step helps you gain an initial understanding of the structure, seasonality, and quality of your time series before applying more advanced analysis or forecasting techniques.

Hypothesis Testing

The Hypothesis Testing subsection allows you to check key statistical properties of your time series. This step is important to understand whether the data satisfies common assumptions required by forecasting models.

Tests performed

The app groups the statistical tests into three categories:

Normality tests:
- Jarque-Bera test
- Shapiro-Wilk test
Autocorrelation tests:
- Box-Pierce test
- Ljung-Box test
Stationarity tests:
- Augmented Dickey-Fuller (ADF) test
- Phillips-Perron (PP) test
- Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
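
For readers who want to see what the two autocorrelation tests measure, the following Python sketch computes the Box-Pierce and Ljung-Box statistics from the sample autocorrelations. It uses the textbook formulas and is not taken from the app's code; the function names are illustrative.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def box_pierce(x, max_lag):
    """Q = n * sum of squared autocorrelations up to max_lag."""
    n = len(x)
    return n * sum(autocorrelation(x, k) ** 2 for k in range(1, max_lag + 1))

def ljung_box(x, max_lag):
    """Q = n (n + 2) * sum of r_k^2 / (n - k), a small-sample refinement."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(box_pierce(series, 3), ljung_box(series, 3))
```

Under the null hypothesis of no autocorrelation, both statistics are compared against a chi-square distribution with max_lag degrees of freedom, so large values point to autocorrelation in the series.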

Options available

Before running the tests, you can choose whether to apply a transformation to the series using a dropdown menu. The available transformations are:

Multiply by -1
Add 1
Log
Box-Cox
Log-Interval
Min-Max scaling
Standardization
Differencing
Seasonal Differencing

This feature allows you to test hypotheses not only on the original series but also on transformed versions, which can be useful to verify whether a transformation improves normality, reduces autocorrelation, or induces stationarity.
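
The Python sketch below shows the standard definitions of several of these transformations. It is an illustration only: the Box-Cox parameter lam is passed explicitly here, whereas the app may choose it automatically, and Log-Interval is omitted because its exact definition is not given in this manual.

```python
import math

def log_transform(x):
    """Natural log; requires strictly positive values."""
    return [math.log(v) for v in x]

def box_cox(x, lam):
    """Box-Cox transform for strictly positive values; lam = 0 reduces to the log."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1) / lam for v in x]

def min_max(x):
    """Rescale to the interval [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]

def standardize(x):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (len(x) - 1))
    return [(v - mean) / sd for v in x]

def difference(x, lag=1):
    """lag=1 is ordinary differencing; lag=seasonal period is seasonal differencing."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

print(difference([1, 3, 6, 10]))        # ordinary differencing
print(difference([1, 2, 3, 4, 5, 6], lag=4))  # seasonal differencing, period 4
```

Note that differencing shortens the series by lag observations, which is worth keeping in mind when comparing test results before and after the transformation.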

Panels displayed

The subsection is organized into four panels:

Test results table: displays the outcome of all hypothesis tests (test statistic, p-value, and decision).
Distribution plot: shows the distribution of the time series values compared with a normal curve.
Time series plot: visualizes the raw (or transformed) series to support interpretation of the tests.
Autocorrelation plot (ACF): highlights the presence of autocorrelation in the series.

This subsection helps you assess whether the series is normally distributed, autocorrelated, or stationary — information that is crucial when deciding which forecasting models and preprocessing steps are most appropriate.

Anomaly Detection

The Anomaly Detection subsection allows you to identify and optionally clean unusual observations in your time series. Detecting anomalies is important because extreme values can distort model estimation and forecasts.

Options available

You can adjust several parameters when detecting anomalies:

Anomaly method: choose the detection method (currently only STL is available).
Significance level: set the statistical threshold for identifying anomalies.
Maximum % of anomalies: limit the proportion of points flagged as anomalies.
Cleaning level: define the quantile threshold used for the cleaning step.

Panels displayed

Two plots and a table are generated to present the results:

Time series with anomalies: shows the original series with anomalies highlighted.
Cleaned time series: displays the series after anomalies have been removed.
Anomaly data: lists the detected anomalies with their timestamps, values, and other characteristics.

This subsection ensures that subsequent analyses and forecasts are performed on a series that reflects the typical behavior of the data, minimizing the impact of extreme or erroneous observations.
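
The general idea of remainder-based detection can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the app's method: a rolling median replaces the STL decomposition, and points are flagged when their remainder falls outside the quartiles by more than a multiple of the interquartile range. The names flag_anomalies, window, and iqr_multiplier are hypothetical.

```python
from statistics import median, quantiles

def flag_anomalies(values, window=3, iqr_multiplier=3.0):
    """Return indices whose remainder lies far outside the interquartile range.

    Remainder = value minus a centred rolling median (a crude substitute for
    the trend and seasonal components that STL would remove).
    """
    half = window // 2
    remainder = []
    for t in range(len(values)):
        chunk = values[max(0, t - half): t + half + 1]
        remainder.append(values[t] - median(chunk))
    q1, _, q3 = quantiles(remainder, n=4)
    spread = q3 - q1
    lower = q1 - iqr_multiplier * spread
    upper = q3 + iqr_multiplier * spread
    return [t for t, r in enumerate(remainder) if r < lower or r > upper]

series = [10, 11, 10, 12, 50, 11, 10, 12, 11, 10, 12, 11]
print(flag_anomalies(series))
```

A stricter threshold (a larger multiplier in this sketch) flags fewer points, which is the same trade-off that the significance level and maximum percentage of anomalies control in the app.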

Transform

The Transform subsection allows you to apply transformations to your time series to improve its statistical properties and suitability for forecasting models. Transformations can stabilize variance, reduce skewness, or remove trends and seasonality.

Options available

You can adjust the following parameters:

Transformation dropdown: choose a transformation to apply to the series. Available transformations include:
- Multiply by -1
- Add 1
- Log
- Box-Cox
- Log-Interval
- Min-Max scaling
- Standardization
- Differencing
- Seasonal Differencing
Time period: select the start and end of the series to transform.

Panels displayed

Four plots are generated to help you assess the effects of transformations:

Time series: shows the transformed series.
Autocorrelation (ACF) plot: displays autocorrelations after transformation.
Decomposition plot: visualizes trend, seasonality, and remainder components.
Seasonality plot: highlights the seasonal patterns in the transformed series.

Using this subsection, you can preprocess your series for modeling, ensuring that it meets the assumptions required by different forecasting methods.

Section 3: Features

The Features section allows you to create and manage features that can improve the performance of your forecasting models. Features provide additional information about the series and can capture patterns that help models generate more accurate forecasts.

You can navigate to the Features section in the main bar and select one of the following options from the dropdown menu:

Internal Features: create features derived from the time series itself.
External Features: incorporate external variables that may influence the series.
Feature Selection: identify the most relevant features to use in your models.

Internal Features

The Internal Features subsection allows you to create features derived directly from your time series. These features can capture trends, seasonality, and other patterns that improve the accuracy of forecasting models.

Options available

Add calendar features: include day, week, month, quarter, etc., as features.
Add holiday features: include indicators for holidays.
Fourier Periods: specify the periods for Fourier terms (values separated by commas).
Fourier Orders: select the Fourier order (from 1 to 10).
Spline Degrees: specify the degrees for spline features (values separated by commas).
Lag Orders: specify lag orders to include past values as features (values separated by commas).
Lag Rolling Periods: specify periods for rolling statistics (values separated by commas).
Interactions: create interaction features using expressions (expressions separated by commas).
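
As an illustration of what Fourier periods and orders mean, the Python sketch below builds the usual sine and cosine pairs for one period. The column naming scheme and the function name fourier_features are hypothetical; the app may label and scale its Fourier terms differently.

```python
import math

def fourier_features(n_obs, period, order):
    """Build sin/cos pairs for harmonics 1..order of the given period."""
    features = {}
    for k in range(1, order + 1):
        angles = [2 * math.pi * k * t / period for t in range(n_obs)]
        features[f"sin_{period}_{k}"] = [math.sin(a) for a in angles]
        features[f"cos_{period}_{k}"] = [math.cos(a) for a in angles]
    return features

# Monthly data with yearly seasonality: period 12, first two harmonics
feats = fourier_features(n_obs=24, period=12, order=2)
print(sorted(feats))
```

Higher orders add faster-oscillating terms, allowing a model to fit more complex seasonal shapes at the cost of more features.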

Panels displayed

Table: view the features data in tabular format.
Summary statistics: obtain basic descriptive statistics of the features (mean, median, min, max, missing values, etc.).
Time series plot with features: displays the original series together with all added features for visual inspection.
Correlation: shows the correlation between the time series and internal features.

⚠️ Note: Lag orders must be greater than or equal to the forecast horizon in order to be used in forecasting.

⚠️ Note: Lag Rolling Periods must be greater than or equal to two and are computed only on the maximum lag order.
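
The reason for the lag constraint is that a lag-k feature for a future date needs the value observed k steps earlier, which is only known if k is at least the forecast horizon. The Python sketch below illustrates this; the function names are hypothetical, and the rolling mean is only one possible rolling statistic.

```python
def lag_feature(values, lag, horizon):
    """Lagged copy of the series, extended `horizon` steps into the future.

    Entries are None wherever the source value is not available.
    """
    n = len(values)
    return [values[t - lag] if 0 <= t - lag < n else None
            for t in range(n + horizon)]

def rolling_mean(feature, window):
    """Trailing mean over `window` points; None until a full window is known."""
    out = []
    for t in range(len(feature)):
        chunk = feature[max(0, t - window + 1): t + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            out.append(None)
        else:
            out.append(sum(chunk) / window)
    return out

history = [10, 12, 11, 13, 14, 15]
horizon = 2
usable = lag_feature(history, lag=2, horizon=horizon)    # lag >= horizon
unusable = lag_feature(history, lag=1, horizon=horizon)  # lag < horizon
print(usable[-horizon:])    # future rows are fully known
print(unusable[-horizon:])  # last future row has no source value
print(rolling_mean(usable, window=2)[-horizon:])
```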

External Features

The External Features subsection allows you to include features from external datasets that may influence your time series. These features help improve forecast accuracy by incorporating additional information.

Options available

Upload external dataset: the dataset must follow the same structure as in Section 1, with columns date, id, and two or more feature columns.

Panels displayed

Table: view the features data in tabular format.
Summary statistics: obtain basic descriptive statistics of the features (mean, median, min, max, missing values, etc.).
Time series plot with features: displays the original series together with the selected external features.
Cross-correlation plot: visualizes the correlation between the time series and external features to assess potential predictive power.

⚠️ Note: External features must be available in the future for the whole forecast horizon to be used in forecasting.

⚠️ Note: At least two external features must be provided; otherwise, Feature Selection cannot be performed.

Feature Selection

The Feature Selection subsection allows you to identify the most relevant features to use in your forecasting models. Different statistical and machine learning methods help evaluate the predictive power of internal and external features.

Options available

Absolute Correlation Threshold: set the minimum normalized correlation value for a feature to be retained.
PPS Threshold (Predictive Power Score): set the minimum PPS value for a feature to be retained.
LASSO Threshold: set the minimum normalized score required for a feature to be retained in LASSO analysis.
Random Forest Threshold: set the minimum normalized importance score required for a feature to be retained in Random Forest analysis.
Linear regression formula: specify a formula to fit a linear regression model (e.g., value ~ .).
Save feature set: choose whether to save the selected features for later use.

Panels displayed

Correlation analysis: shows feature correlations with the target series.
PPS analysis: shows the Predictive Power Score for each feature.
LASSO analysis: shows feature importance using LASSO regression.
Random Forest analysis: shows feature importance using a Random Forest model.
Selected feature set: displays features retained by each selection method.
Linear regression time series plot: shows the original series with fitted values from the linear regression model.
Final dataset with selected features: shows the data with only the features included in the final feature set.

⚠️ Note: All feature selection scores (Correlation, PPS, LASSO, Random Forest) are normalized in the range 0–1.
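
To make the threshold logic concrete, here is a small Python sketch for the correlation method only. It assumes that normalization means dividing each absolute correlation by the largest one, which is a guess: the manual does not specify how scores are normalized. The function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def select_by_correlation(target, features, threshold):
    """Keep features whose normalized absolute correlation meets the threshold."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    top = max(scores.values())
    if top == 0:
        return [], scores
    # ASSUMPTION: normalize by the largest score so that values lie in 0-1
    normalized = {name: s / top for name, s in scores.items()}
    kept = [name for name, s in normalized.items() if s >= threshold]
    return kept, normalized

target = [1, 2, 3, 4, 5]
features = {"trend": [2, 4, 6, 8, 10], "noise": [1, -1, 1, -1, 1]}
kept, scores = select_by_correlation(target, features, threshold=0.5)
print(kept, scores)
```

The same pattern (score, normalize, compare with a threshold) applies to the PPS, LASSO, and Random Forest methods, each with its own scoring function.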

Section 4: Forecast

The Forecast section allows you to test, optimize, compare, explain, and combine forecasting models, as well as perform scenario analysis.

You can navigate to the Forecast section in the main bar and select one of the following options from the dropdown menu:

Test & Evaluate: assess the performance of different forecasting models.
Explain: understand the drivers of the models’ predictions.
Optimize: tune hyperparameters to improve model accuracy.
Compare: compare the performance of multiple models.
Combine: create ensembles of models to improve forecasts.
Scenario Analysis: simulate different scenarios and assess their impact on forecasts.

Test & Evaluate

The Test & Evaluate subsection allows you to fit a forecasting model and evaluate its performance over the specified assessment period.

Options available

Select forecast algorithm: choose the model to use for forecasting.
Algorithm parameters: based on the selected algorithm, the corresponding parameters appear in the sidebar. By default, the automatic version of the algorithm is used (when available), but you can toggle off the automatic option and manually set parameter values.
Back-transform results: choose whether to back-transform the forecasts if a transformation was applied to the time series.
Forecast button: click to fit the model and generate forecasts for the specified horizon.

Panels displayed

Time series plot with forecasts on the test set: shows the model’s predictions over the test period compared with the actual values.
Time series plot with out-of-sample forecasts: displays the forecasts for the future horizon beyond the available data.
Table with evaluation metrics: summarizes performance metrics such as RMSE, MAE, and MAPE for the selected model.
Details of the algorithm used: provides information on the model type, selected parameters, and settings.
Residuals time series plot: shows the residuals (errors) of the model over the test period.
Autocorrelation plot of residuals: displays the autocorrelation function of residuals to check for remaining patterns or dependencies.
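
For reference, the three metrics named above have the following standard definitions, sketched here in Python. The app's metrics table may contain additional measures or use slightly different conventions (for example, in how MAPE handles zeros).

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error, in the units of the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```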

Available Algorithms

The forecasting algorithms are grouped into five categories:

Time Series (TS): classical time series methods.
- Naive
- Seasonal Naive
- Rolling Average
- ETS
- Theta
- SARIMA
- TBATS
- STLM
- Prophet
Machine Learning (ML): regression and tree-based models.
- Linear Regression
- Elastic Net
- MARS
- KNN
- SVM
- Random Forest
- Boosted Trees (XGBoost and LightGBM)
- Cubist
Deep Learning (DL): neural network-based models.
- Feed-Forward
Mixed (MIX): hybrid models combining TS and ML methods.
- Feed-Forward AR
- ARIMA-Boost
- Prophet-Boost
AutoML (AML): automated machine learning frameworks.
- H2O AutoML (including Deep Learning algorithms)

Explain

The Explain subsection allows the user to understand how the selected forecasting algorithm produces its predictions using eXplainable AI techniques. The process is divided into two steps.

Options available

Step 1: Select Algorithm
- Select forecast algorithm: choose the algorithm to explain.
- Back-transform: choose whether to back-transform forecasts if a transformation was applied to the time series.
- Explain button: click to produce the default XAI analysis.
Step 2: Local Explanation
- Feature: select the specific feature to explain.
- Date: choose the date for which a local explanation is desired.

Panels displayed

List of features: shows all features used by the selected algorithm.
Feature importance: displays the global importance of each feature for the model.
Break-down analysis: illustrates the contribution of each feature to the forecast.
Variable response analysis: shows how changes in a feature affect the forecast.
Local stability analysis: evaluates the stability of feature contributions for the selected date and feature.

⚠️ Note: XAI is only available for certain algorithms, including Linear Regression, Elastic Net, MARS, KNN, SVM, Random Forest, Boosted Trees, Cubist, Feed-Forward, Feed-Forward AR, ARIMA-Boost, Prophet-Boost, and H2O AutoML.

Optimize

The Optimize subsection allows you to perform hyperparameter tuning for a selected forecasting model.

Options available

Select forecast algorithm: choose the model for which you want to optimize hyperparameters.
Select hyperparameters to optimize: based on the selected algorithm, the corresponding hyperparameters are displayed for optimization.
Back-transform results: choose whether to back-transform forecasts if a transformation was applied to the time series.
Optimize button: click to perform hyperparameter optimization and identify the best hyperparameter combination for the selected model.
Use current optimized algorithm?: if an optimized version of the algorithm has already been created, this option allows the user to use it in the next forecasting steps instead of running a non-optimized version of the algorithm.

Panels displayed

Time series plot with forecasts on the test set: shows the model’s predictions over the test period compared with the actual values.
Time series plot with out-of-sample forecasts: displays the forecasts for the future horizon beyond the available data.
Table with evaluation metrics: summarizes performance metrics such as RMSE, MAE, and MAPE for the selected model.
Details of the algorithm used: provides information on the model type, selected parameters, and settings.
Residuals time series plot: shows the residuals (errors) of the model over the test period.
Autocorrelation plot of residuals: displays the autocorrelation function of residuals to check for remaining patterns or dependencies.

⚙️ General Options: Using the gear button, the user can specify the parameters of the tuning process, including the Validation Type (Time Series CV or K-Fold CV), the number of Folds, the Validation Metric, whether to Use Bayesian Optimization for optimization, and the Grid Size for grid search.
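
The difference between Time Series CV and K-Fold CV is that the former always validates on observations that come after the training data. The Python sketch below shows one common way to build such splits (an expanding training window); it is an assumption about the general technique, not a description of how the app constructs its folds, and assess_size is a hypothetical parameter.

```python
def time_series_cv_splits(n_obs, n_folds, assess_size):
    """Return (train_indices, test_indices) pairs, earliest fold first.

    Each test block of `assess_size` points directly follows its training
    block, so no future observation leaks into training.
    """
    splits = []
    for fold in range(n_folds):
        test_end = n_obs - fold * assess_size
        test_start = test_end - assess_size
        if test_start <= 0:
            raise ValueError("not enough observations for the requested folds")
        train = list(range(0, test_start))
        test = list(range(test_start, test_end))
        splits.append((train, test))
    return splits[::-1]

for train, test in time_series_cv_splits(n_obs=10, n_folds=3, assess_size=2):
    print(len(train), test)
```

K-Fold CV, by contrast, assigns observations to folds without regard to time order, which can be acceptable for feature-based models but ignores the temporal structure of the data.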

⚠️ Note: Hyperparameter optimization is only available for certain algorithms, including Elastic Net, MARS, KNN, SVM, Random Forest, Boosted Trees, Cubist, Feed-Forward, Feed-Forward AR, ARIMA-Boost, and Prophet-Boost.

Compare

The Compare subsection allows you to compare the forecasts of multiple algorithms.

Options available

Select forecast algorithm: choose the models to compare.
Back-transform: choose whether to back-transform forecasts if a transformation was applied to the time series.
Forecast button: click to fit the models and compare forecasts for the specified horizon.

Panels displayed

Time series plot with forecasts on the test set: shows the predictions of all selected algorithms over the test period.
Time series plot with out-of-sample forecasts: displays the forecasts of all algorithms for the future horizon beyond the available data.
Table with evaluation metrics: compares performance metrics such as RMSE, MAE, and MAPE for all selected algorithms.
Details of the algorithms used: provides information on the model type, parameters, and settings for each algorithm.

Combine

The Combine subsection allows the user to create ensembles of forecasting algorithms using simple combination methods or stacking approaches.

Options available

Select forecast algorithm: select one or more algorithms to include in the ensemble.
Back-transform: choose whether to back-transform forecasts if a transformation was applied to the time series.
Select ensemble method: choose between simple mean of all selected forecasts, weighted mean based on the ranking of algorithms’ performance, or median of forecasts.
Select stacking method: choose the meta-learner for stacking (linear regression, elastic net, or boosted tree).
Combine button: click to produce the combined forecasts through simple ensembles or stacking.
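
The three simple ensemble methods can be sketched in Python as follows. The rank-based weighting scheme shown here (the best of n models receives weight n, the worst receives 1, then weights are rescaled to sum to one) is an assumption made for illustration; the manual only says that weights are based on the ranking of the algorithms' performance.

```python
from statistics import mean, median

def rank_weights(errors):
    """Map model errors to weights: lower error gives a larger weight."""
    order = sorted(errors, key=errors.get)  # best (lowest error) first
    n = len(order)
    raw = {name: n - i for i, name in enumerate(order)}  # assumed scheme
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

def combine(forecasts, method, errors=None):
    """Combine per-model forecasts (dict of equal-length lists) step by step."""
    names = list(forecasts)
    horizon = len(forecasts[names[0]])
    if method == "mean":
        return [mean(forecasts[m][t] for m in names) for t in range(horizon)]
    if method == "median":
        return [median(forecasts[m][t] for m in names) for t in range(horizon)]
    if method == "weighted":
        w = rank_weights(errors)
        return [sum(w[m] * forecasts[m][t] for m in names) for t in range(horizon)]
    raise ValueError(f"unknown method: {method}")

forecasts = {"a": [10.0, 20.0], "b": [20.0, 40.0], "c": [60.0, 60.0]}
errors = {"a": 1.0, "b": 2.0, "c": 3.0}  # e.g. RMSE on the test set
print(combine(forecasts, "mean"))
print(combine(forecasts, "median"))
print(combine(forecasts, "weighted", errors))
```

Stacking differs from these methods in that a meta-learner is trained to map the individual forecasts to the target, rather than applying fixed combination rules.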

Panels displayed

Time series plot with forecasts on the test set: shows the ensemble or stacked forecasts over the test period.
Time series plot with out-of-sample forecasts: displays the ensemble or stacked forecasts for the future horizon.
Table with evaluation metrics: compares performance of the ensemble or stacked forecasts.
Details of the algorithm(s) used: provides information on the models included in the ensemble or stacking model.

Scenario Analysis

The Scenario Analysis subsection allows the user to generate alternative future scenarios based on one selected forecasting model. This step is always performed after obtaining forecasts, either from a single model or from combinations (ensembles or stacking).

Options available

Select forecast algorithm: choose the algorithm to use for scenario generation (can also be a combination of models).
Back-transform: choose whether to back-transform forecasts if a transformation was applied.
Forecast button: click to generate the baseline out-of-sample forecasts.

Once the baseline forecast is produced, the user can define three scenario parameters to generate alternative scenarios:
- Confidence level: defines the prediction intervals for upper and lower scenarios.
- Aggregation function: select how results should be aggregated over the forecast horizon.
- Business adjustment (%): apply a manual adjustment to the mean forecast (negative values decrease it).

Panels displayed

Out-of-sample forecast with prediction intervals: shows the baseline forecast with upper and lower confidence bounds.
Scenario forecast: displays mean forecast, worst-case, and best-case scenarios over the forecast horizon.
Table of evaluation metrics: performance metrics calculated on the test set.
Table of aggregate results: aggregated forecast values over the horizon, with comparisons to the previous period.
Table of scenario values: lists the values for worst-case, mean forecast, and best-case scenarios.

⚠️ Note: If combinations of models were used, only the best performing combination (in ensemble or stacking) is used for scenario analysis.

⚠️ Note: The worst and best case scenarios are not affected by the manual adjustments.
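
The interplay of the three scenario parameters can be sketched in Python as follows. This is a simplified illustration under strong assumptions: prediction intervals are taken to be symmetric normal intervals with a constant residual standard deviation, whereas the app may compute them differently. The function name scenarios and the residual_sd argument are hypothetical.

```python
from statistics import NormalDist

def scenarios(mean_forecast, residual_sd, confidence=0.95,
              adjustment_pct=0.0, aggregate=sum):
    """Aggregate worst-case, mean, and best-case paths over the horizon."""
    # Two-sided normal quantile for the chosen confidence level (assumption)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    worst = [m - z * residual_sd for m in mean_forecast]
    best = [m + z * residual_sd for m in mean_forecast]
    # The business adjustment is applied to the mean forecast only
    adjusted = [m * (1 + adjustment_pct / 100) for m in mean_forecast]
    return {
        "worst": aggregate(worst),
        "mean": aggregate(adjusted),
        "best": aggregate(best),
    }

print(scenarios([100.0, 100.0], residual_sd=10.0, adjustment_pct=-10.0))
```

In this sketch, changing adjustment_pct moves only the mean value, while the worst and best cases depend solely on the confidence level, in line with the note above.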

Typical User Workflow

This section describes the typical process of using the application from start to finish. While users may explore the app in different ways, the following workflow ensures a structured and reproducible analysis.

1. Import the Data

Load the dataset
- Choose between a predefined dataset (via the dataset selector) or upload a custom file.
- Once loaded, the available date range (min_date and max_date) is automatically detected.
Set basic parameters
- Frequency of the time series.
- Forecast Horizon to specify how far ahead forecasts will be generated.
- Assessment Period to define the evaluation window.
Data editing
- The dataset table is editable: users can directly modify values before proceeding.
Export
- Tables can be exported to CSV or Excel for external use.

2. Analyze the Data

a. Visualization

Inspect raw and cleaned time series with interactive plots.
Visualize transformations and decompositions.

b. Hypothesis Testing

Apply a range of statistical tests on the data.

c. Anomaly Detection

Select the Anomaly Method (currently only STL).
Set the Cleaning Level (quantile for filtering).
Review anomalies in a dedicated table.

d. Transformation

Adjust the Date Period using the date range selector.
Apply selected transformations to the dataset.

3. Create Features

a. Internal Features

Add calendar features, holiday features, Fourier terms, splines, lags, rolling periods, and interactions.
Inputs are flexible and can be updated interactively.

b. External Features

Upload external regressors to enrich the dataset.

c. Feature Selection

Compute correlation, predictive power score (PPS), LASSO, and Random Forest importance.
All scores are normalized between 0 and 1.
Adjust thresholds to refine selection.
Save the selected feature set for forecasting.
Inspect the final dataset containing only selected features in a dedicated panel.

4. Test some Models

Run preliminary models with default configurations.
Evaluate point accuracy and probabilistic accuracy on the assessment period.
Use metrics such as RMSE, MAE, or coverage probabilities.
Identify baseline performance before optimization.

5. Explain

Analyze feature importance across different algorithms.
Understand the drivers of the forecast through variable importance plots, coefficients, or SHAP-style explanations.
Facilitate interpretability for both technical and non-technical stakeholders.

6. Optimize

Choose a Forecast Algorithm from machine learning, deep learning, or mixed methods.
Configure validation strategy:
- Time Series CV or K-Fold CV.
- Number of folds, metric, and whether to use Bayesian optimization.
Adjust grid size for search.
Set hyperparameters to optimize (depending on the algorithm).
Launch optimization with the Optimize! button.
Once completed, the optimized model can be reused:
- Enable the option Use current optimized algorithm? to carry it forward into the Forecast section.

7. Compare

Evaluate multiple algorithms side by side.
Compare models on accuracy, computation time, and robustness.
Results are presented both visually and in tabular format.

8. Combine

Build ensemble models by combining forecasts from different algorithms.
Evaluate whether ensembling improves predictive performance over single models.
Trade off accuracy gains with computational cost.

9. … and finally Forecast!

Translate forecasts into business scenarios.
Adjust forecasts based on external or managerial inputs.
Generate alternative what-if scenarios (e.g., demand shocks, promotions, policy changes).
Facilitate communication of results with stakeholders through scenario-driven visualizations.