MicroStrategy ONE

Refine Data Quality Before Importing

Use data wrangling to improve the quality and usability of your data before importing it. You can remove white space, filter, delete duplicate rows or cells, find and replace data, concatenate columns, etc. In addition, you can use data wrangling as a data discovery tool and to bypass some traditional extraction, transformation, and loading (ETL) methods.

On the Data Wrangling Dialog, you work with sample data (that is, a subset of your actual data) to build a script of functions that clean, transform, and prepare your data.

If you republish or refresh your data, the data wrangling steps apply to the new data.

  1. Create a blank dashboard or open an existing one.
  2. Choose Add Data > New Data to import data into a new dataset.

    or

    In the Datasets panel, click More next to the dataset name and choose Edit Dataset to add data to the dataset. The Preview Dialog opens. Click Add a new table.

    The Data Sources dialog opens.

  1. Select the data you want to import, according to your data source type.
  2. Once you have selected the data you want to import, click Prepare Data to open the Preview dialog.
  1. In the top pane, click the drop-down arrow for the table you want to modify and choose Wrangle. The Data Wrangling dialog opens, displaying a sample of your dataset.
  2. Select the column to modify from the Select Column drop-down list.

    or

    Select the column to modify in the preview.

  3. Select the function to apply from the list of suggestions or the Select Function drop-down list. See Functions for Wrangling Data for more information.

    If a function needs no further input, it is automatically applied to your data.

    If the function requires parameters, define the parameters and click Apply.

    You can change the date data type for a column. In the Select Column drop-down list, choose the column you want to change. In the Select Function drop-down list, under the Transform Cell selections, choose Change Data Type. Select Date from the next drop-down list, enter the date format in the following field, and click Apply. See How to Define Date and Time Data Types When Wrangling Data for more information.

  4. You can filter column data by using the Selector options from the Select Function drop-down list. These options display the following filters at the bottom of the screen that you can adjust to filter a column.

    Text Selector: Hover over text and select whether to Include or Exclude it from your import.

    Numeric Selector: Displays a graph with a range of numeric values. Move the start and end points of the range to filter the values. In addition, use the checkboxes for Numeric, Non-Numeric, Blank, and Error to filter the numeric values further.

    Timeline Selector: Displays a graph with a range of dates. Move the start and end points of the range to filter the dates. In addition, use the checkboxes for Time, Non-Time, Blank, and Error to filter the dates further.

    Text Search: Enter the text by which you want to filter the data. Select Case Sensitive to make your search case-sensitive. Select Regular Expressionto use Regular Expression (Regex) language.

    Using the Selector options does not delete the data. Instead, the data is hidden from the preview.

  5. You can apply functions to the filtered data by selecting the drop-down arrow associated with each filter.

    Reset: Undo a specific filter.

    Reset ALL Selector: Undo all filters.

    Delete: Deletes a specific filter

    Delete ALL Selector: Deletes all filters.

  6. To group similar column data into a cluster and replace the values, select the column you want to modify and choose Cluster and Edit from the Other section of the Select Function drop-down list. The Cluster and Edit dialog opens with the column data grouped into clusters of related data. See How to Cluster Data for more information.
  7. Double-click an individual cell to edit the data directly in the cell.
  8. Enter your changes in the pop-up dialog.
  9. Click Apply to make changes to selected cell.
  10. Click Apply to All to make changes to all cells that match the data in the selected cell.
  11. Click Undo in the History Script area to undo the previous function.
  12. Click Redo to redo the previous function.
  13. To undo multiple functions, click the preceding function you want to keep. All functions that follow the selected one are undone. Once a function is undone, it is grayed out.
  14. To redo multiple functions, click the latest grayed out function you want to redo. The selected function and the ones that precede it are re-applied.
  15. To undo all actions in the script, click Reset . Your data is returned to its original state. The history script is not deleted until you select another function, which starts a new script.
  16. Click Save . The Extract History Script dialog opens, displaying the full script.
  17. Click Save to save the script and its functions for reuse later with a different dataset.
  18. Click Close to return the Data Wrangling dialog without saving the script.
  19. Click Import . The Import History Script dialog opens.
  20. Click Browse to import a saved script.

    or

    Paste a script into the text box.

  21. Click Apply. The script runs against your data sample and returns to the Preview dialog.
  22. Click Finish to import your refined data.

Related Topics

Preview Data and Specify Import Options

Import Data