MicroStrategy ONE

Data loading

The data loading process gathers and prepares data for analysis and reporting in the Enterprise Manager project. During the data load process, Enterprise Manager gathers metadata from projects, transfers metadata and statistics information from staging tables into the Enterprise Manager repository, and converts raw data into a form that can be analyzed and used for reporting.

A data load occurs according to the schedule you define when you create the data load. For instructions on setting the schedule, see Creating a data load. You can also run a data load at any time if your repository is missing data. For steps on running a data load immediately, see Running a data load now.

You can turn on and off single data loads or all data loads at once. For steps, see Data loading.

The data load process has four steps.

  1. Enterprise Manager gathers metadata from projects. Enterprise Manager provides Intelligence Server with a time window for the data load. Intelligence Server then transfers relevant information about the project sources specified in Enterprise Manager to lookup tables in the statistics staging tables. Relevant information includes such data as report names, user/group names, and object relationships. Examples include user/group relationships and which schedules are mapped to which reports.

  2. While Intelligence Server is transferring lookup table information, Enterprise Manager moves statistics data from the statistics staging tables into the statistics tables in the Enterprise Manager repository.

  3. After Enterprise Manager has finished transferring statistics data, it starts transferring the staging lookup tables that Intelligence Server has completed. Enterprise Manager moves the information in the staging lookup tables into lookup tables in the Enterprise Manager repository.

    Metadata information for all projects in a project source is transferred into the Enterprise Manager lookup tables, regardless of whether those projects are configured to log statistics.

  1. The final step in the data load process involves processing the data in the Enterprise Manager repository tables. These statistics tables contain raw data logged by Intelligence Server. For performance reasons, most fields are fixed-length fields and contain data that cannot be directly interpreted by an administrator. An example of such a field is an object GUID. This raw data must be processed further to support administrative analysis and reporting requirements. SQL scripts transform the statistics data into a form that can be useful for administrative reporting. The transformation ensures that reporting on MicroStrategy metadata content is feasible. This transformed data is stored in fact tables in the Enterprise Manager repository.

Some of Enterprise Manager's fact tables are views of certain statistics tables. This substantially speeds up the data load process.

To ensure that the statistics data is complete, at the beginning of the data load process a timestamp is created in the EM_IS_LAST_UPDATE table, according to the current date and time in the Enterprise Manager repository. This timestamp indicates the end of the data migration window. The beginning of the data migration window is determined by the previous data load's timestamp entry in the EM_IS_LAST_UPDATE table. Therefore, the data load transfers any statistics logged between the start of the last data load and the start of the current one. When the data load process is complete, Enterprise Manager updates the EM_IS_LAST_UPDATE table to indicate that the process is finished.

If the data load process is successful, Enterprise Manager deletes all data from the staging lookup and staging statistics tables.

If the data load process is interrupted before it finishes, this last update is not time stamped. In this case, the next time a data load runs, it starts with data from the time the last successful data load was finished.

For log file information you can use to troubleshoot the Enterprise Manager data loading process, see Troubleshooting the data loading process.

Prerequisites

  • Users must have the following privileges assigned to successfully run a data load:

    • Web Reporter

    • Web user

    • WEBUSER

    • Command Manager

    • Use Command Manager

    • USECOMMANDMANAGER

    • Administration

    • Monitor Cluster

    • USECLUSTERMONITOR

  • Projects with Environments

  • Schedule for recurring data loads.

  • The UPDATEWAREHOUSE task is required for executing a data load.

Best practices for Enterprise Manager data loading

  • Performing the CLOSESESSIONS action is not required, but MicroStrategy recommends that you run this task with every data load to ensure sessions are terminated for each project being loaded.

  • Set up the scheduled data loads according to the answers to these questions:

    • How long does the data load take?
    • How current does the data need to be?
    • If you need near-real-time data, and the data load does not take longer than a few minutes, you may want to run the data load as often as once per hour. However, if the data load process takes a long time, you should run it when Intelligence Server use is low, such as overnight.

  • Some data load maintenance tasks can increase data load times. However, the following tasks should be run at least weekly.

    • REPOPULATETABLES - This task synchronizes the relationship (relate) tables in the Enterprise Manager repository with the metadata.

    • UPDATEOBJECTDELETIONS - This task ensures that objects that are deleted in the project metadata are marked as having been deleted in Enterprise Manager.

    • For more information about data load maintenance options, see Selecting Enterprise Manager maintenance tasks to perform in the data load.

  • When you change the name of a project in Developer, you should refresh the list of available projects in MicroStrategy Enterprise Manager before the next data load.

  • Synchronize the time of the Intelligence Server machine with the Enterprise Manager repository if possible. When Intelligence Server writes statistics into the database, it uses the repository database management system's timestamp. This is written as em_record_ts (in the fact tables) and as recordtime (in the statistics tables). Enterprise Manager uses recordtime to determine which statistics to move over according to the time window for a data load process. The time window is determined according to the Enterprise Manager repository database management system's time.

  • Also, if the repository database time is different from the Intelligence Server machine time, certain reports in Enterprise Manager may have missing data. For example, if statistics appear for "Deleted report" in Enterprise Manager reports, it may be because statistics are being logged for reports that, according to the repository's timestamp, should not exist.

Modifying a data load

You can change the schedule and tasks performed for an existing data load. For steps to create a data load, see Creating a data load.

Making changes with ALTER DATA LOAD commands

To make changes to an existing data load use the ALTER DATA LOAD command in Command Manager and make the necessary changes to your data load parameters. You can use ALTER DATA LOAD to do any of the following:

  • Change the name of a data load.

  • Enable or disable a data load.

  • Add/Remove projects and environments to monitor.

  • Make changes to the data load schedule.

  • Modify actions and tasks performed during the data load.

The following example shows the ALTER DATA LOAD command to change the name of a data load.

ALTER DATA LOAD "<name>" [NEW NAME "<new_name>"] [FOR ENVIRONMENT "<server_name>" AND PROJECT "<project_name>" IN ENTERPRISE MANAGER "<em_machine>" IN PORT <port>;

For more information on the commands available see the Syntax Examples section of the Command Manager Help documentation.

While making changes to a data load with ALTER DATA LOAD is supported, it is recommended that users make changes to the original CREATE DATA LOAD command. Once the changes are in place, execute the DELETE DATA LOADstatement followed by the CREATE DATA LOAD command containing the desired changes.

To disable or enable all data loads

You can disable and enable all Enterprise Manager data loads by stopping and starting the Enterprise Manager data loading service.

  1. On the machine that hosts Enterprise Manager, open the MicroStrategy Service Manager.

    • In Windows: Double-click the Service Manager icon in the system tray. If the Service Manager icon is not present, from the Windows Start menu, point to All Programs, then MicroStrategy Tools, and then select Service Manager.

    • In UNIX: You must be in an XWindows environment to run Service Manager in UNIX. From the /bin directory in the MicroStrategy directory, type ./mstrsvcmgr and press ENTER.

  1. From the Service drop-down list, select MicroStrategy Enterprise Manager Data Loader.

  2. Choose from these options:

    • To stop the data loader, slick Stop. All data loads are disabled. This is available if the service is running.

    • To start the data loader service, click, Start. This is available if the service is stopped.

    • To stop and immediately start the data loader service, click Restart. This is available if the service is running.

Running a data load now

You can run a data load immediately rather than waiting for its next scheduled time by running the following in Command Manager:

EXECUTE DATA LOAD "<data_load_name>" IN ENTERPRISE MANAGER "<em_machine>" IN PORT <port>;

Deleting a data load

You can delete a data load and all its settings by running the following in Command Manager:

DELETE DATA LOAD "<data_load_name>" FROM ENTERPRISE MANAGER "<em_machine>" IN PORT <port>;