The dark side of the database - part 2

Maximilian Selmair
Nov 28, 2021
5 min read

Updated: Nov 29, 2021

This is a special guest blog post by Maximillian Selmair and Patrick Wöhe.

Just as we have seen in a previous post, the internal Anylogic Database has a dark side...

In this post, we want to make you aware of some other notable behaviors of the database when working with Excel as the external data source and point out a way to make your data import more reliable.

Often when creating models for industrial customers all the important input data for the simulation model is provided as Excel files. These can be shift schedules, production plans or customer orders. Truly, Excel is not the best choice for handling data but most companies are using this simple-to-use software for preparing and gathering their data. Therefore, customers of simulation projects are mostly demanding Excel as the tool of choice to manage their input data for simulation models.

Thus, it makes sense for simulation modelers to be proficient in getting Excel-based data into their models without hassle and detours. But how do you manage this?

You have probably already guessed it: The internal Anylogic database does the trick. Just establish a connection to the Excel file from Anylogic and all the data is being extracted and loaded into the database and thus your model. Sounds too easy and too good to be true? Absolutely!

In most cases, the standard connections work sufficiently, especially when you are familiar with using AnyLogic. But if you are working as a consultant clients' demands often differ... Most clients want to use the model without licensing AnyLogic, so your deliverable is an exported standalone version or a cloud version. In both cases, the input data can only be adjusted through the simulation frontend or the excel files that are being used. Many times, the second option is favored, since most clients are already familiar with Excel. Therefore, it is important to structure the data import as robust as possible.

We setup an example Excel sheet to replicate and solve some of the common issues. Below, you can see the Excel file on the left side and the imported data in AnyLogic on the right side.

We can adjust the entries in every column and usually AnyLogic will update those entries at simulation startup, when auto-updating is enabled. But here lies the first pitfall: if there is a major change in the structure or the data type of the input file, the AnyLogic DB will not refresh the database anymore and even worse, this goes without any error message or notification to the user and thus maybe in misleading simulation results.

We have evaluated a few cases that can cause this problem to occur:

Adding another column including the header to the table: Usually, it is not a problem when some side calculations are performed in additional columns in the excel file, but if you add a new header to those columns, refreshing the data from this file will not work anymore.
Changing the data type of a column: The table will also not refresh if we, for example, add a decimal number to a predefined integer column. Therefore, it is important to define any column containing only integer values to a double if you might have a decimal number in the future in this column.
Using VLOOKUP: Per se, using VLOOKUP is not a problem until you have any “#N/A”-Value. #N/A-Values also represent a different data type that will cause your database to stop updating. This can be avoided, for example, by wrapping an IFERROR-Statement around the VLOOKUP that returns the right data type.

As you can see in the Screenshots below, even the untouched columns will not update anymore – no error message or notification is generated.

So what now?

This remains true until the additional headers are removed and the integer column contains only integer values again. Basically, until the original format is restored. After that, the table works and updates as expected again.

Bottom line is, you need to be careful when relying solely on the auto-update feature when working with external data sources. Failures in updating may go unnoticed and endanger the quality of your model. Because of that we recommend a second way to work with excel data inputs, that gives the user the possibility the receive feedback, when a database update fails.

If you want to use the internal update function and be aware of possible errors, the following code can be used. Actually, it doesn’t solve the issues themselves but at least you will receive a respective error message when the table update fails.

The code below must be used either in “Before simulation run” for a prior-experiment-update or “On startup” for an agent-related update. If you are utilizing any data in your initialization routines, it is important to perform the update before you are initializing the main class. Otherwise, the initialization will be based on outdated data.

try {
       // this will call the additional class code
       importTableDataOnStartup( getDatabaseConnection() );
} catch ( Exception e ) {
       // this will generate an error message to let the user know that there are some issues
       e.printStackTrace();
}

To use the importTableDataOnStartup function above, "additional class code" must be added to the respective class (or agent):

public void importTableDataOnStartup(java.sql.Connection internalDatabaseConnection) throws Exception {
    try (DatabaseDescriptorRegistry r = new DatabaseDescriptorRegistry()) {
             java.sql.Connection cachedSourceConnection;
             cachedSourceConnection = r.getConnection(DatabaseDescriptorFactory.createFileDescriptor( Utilities.findExistingFile("excelfile.xlsx"), null, ""));
             UtilitiesDatabase.copyDatabaseTable(cachedSourceConnection, internalDatabaseConnection, "\"EXCEL-TABLE-NAME\"", "INTERNAL AL DB TABLE NAME");
       }
}

To ensure, that your model will update and therefore work properly even if your customer is mishandling the import file, you can use the Database block located in the AnyLogic Connectivity Palette. With this, you can have sql-like SELECT FROM statements on any Excel file. But: there are disadvantages which are quite important:

You can add WHERE-Statements, but these statements are not considered without any notification. This means that SELECT material FROM mat_db WHERE density > 5 will retrieve you every material in the list and not only material that has a density higher than 5.
Any formulas in your data are not solved to a result. You will instead receive a string containing the formula itself

Therefore, this workaround might only be used in some rare use-cases where

the chance is high that the column positions in the excel file are changing their position frequently,
columns are not containing any formulas
and if you have no need of filtering the data before retrieving it (obviously, the data can be filtered after the dataset was retrieved from the file, but this may cause performance issues for large amount of data. Therefore, we cannot recommend this method in general)

Out last introduced method utilizes the Excel file block of the AnyLogic Connectivity Palette. With this, you can access any filled cell of an Excel file by using a combination of row and column numbers. To use this for database-like calls of data, you must develop relatively complex access functions. If you do this in a proper way, you can truly create a powerful flexible interface to your own demands.

We, the guest authors, used this method for quite complex customer projects to ensure that the life cycle of the model will last long without any additional support requests from the customer. We explicitly recommend this method to only advanced users and complex customer requests. The best practice is to use the update method of the internal database with our extended code the get notified if AnyLogic has issues during the update.

This is a special guest blog post by Maximillian Selmair and Patrick Wöhe

What next?

If you liked this post, you are welcome to read more posts by following the links above to similar posts. Why not subscribe to our blog or follow us on any of the social media accounts for future updates. The links are in the Menu bar at the top or the footer at the bottom.

If you want to contact us for some advice, maybe a potential partnership or project or just to say "Hi!", feel free to get in touch here and we will get back to you soon!

The dark side of the database - part 2

What next?

Recent Posts

댓글