MicroStrategy ONE
Apply Aggregation and Filtering to Hadoop Data Imports
When importing data from a Hadoop Distributed File System (HDFS) you can apply different aggregation functions, as well as filters, to the data during import. This allows you to customize the amount of data being brought into memory.
Aggregation
The Aggregation option is available in the Data Preview dialog.
To apply an aggregation function to your data:
- Click Aggregation to open the Aggregation Dialog Box.
- Right-click on a field and select the desired function from the menu.
-
Click Execute SQL to preview your data with the aggregation applied.
The aggregation/function executes against top 100,000 rows of entire dataset, by default. Therefore, the precise of Sum, Max, Min, Average, and Count functions are impacted during preview (other functions are not affected). Aggregation results are re-calculated against the entire dataset while publishing the cube. The
hgos.aggregation.preview.rows
property value in/conf/hgos-spark.properties
can be adjusted to specific number of rows, or set to a value of -1 for an unlimited number of rows. -
Click OK to save the new schema definitions with the aggregation/function applied.
If you attempt to wrangle data after aggregation or functions have been applied, the system discards your changes.
Filtering
Hadoop Gateway also supports filtering of data imports. Filtering allows you to import limited data which matches a certain condition instead of entire dataset.
The Filter option is available in the context menu and at the top of the Aggregation Dialog box.
Supported Functions
Category | Function | Notes |
---|---|---|
Basic | The result may incorrect if value exceeds numeric type's minimum or maximum. for example, integer value rang is from -2147483647 to 2147483647, the result will be unexpected if the result exceeds the limitation. | |
Sum | ||
Average | ||
Minimum | ||
Maximum | ||
Count | ||
Greatest | ||
Least | ||
Datetime | If function is applied on datetime value, the time part will be truncated. for example, apply AddDays on 2017/5/12 18:23:12, the value will be truncated to 2017/5/12 00:00:00 | |
AddDays | ||
CurrentDate | ||
CurrentDateTime | ||
CurrentTime | ||
Date | ||
DayOfMonth | ||
DayOfWeek | ||
DayOfYear | ||
DaysBetween | ||
Hour | ||
MilliSecond | ||
Minute | ||
Month | ||
MonthEndDate | ||
MonthBetween | ||
MonthStartDate | ||
Quarter | ||
Second | ||
Week | ||
Year | ||
YearEndDate | ||
YearStartDate | ||
Math | ||
Abs | ||
Acos | When the input value is infinity, the return value will be "-1.#IND" to indicate invalid value, because acos is not convergent. | |
Asin | When the input value is infinity, the return value will be "-1.#IND" to indicate invalid value, because asin is not convergent. | |
Asinh | ||
Atan2 | ||
Atanh | ||
Ceiling | ||
Combine | ||
Cos | When the input value is infinity, the return value will be "-1.#IND" to indicate invalid value, because cos is not convergent. | |
Cosh | ||
Degrees | ||
Exp | ||
Factorial | If #0 > 20, then null will be returned. If #0 <= 20 corresponding integer value will be returned. | |
Floor | ||
Int | ||
Ln | ||
Log | ||
Log10 | ||
Mod | ||
Multiply | ||
Power | ||
Quotient | ||
Radians | ||
Randbetween | ||
Round | ||
Round2 | ||
Sin | When the input value is infinity, the return value will be "-1.#IND" to indicate invalid value, because sin is not convergent. | |
Sinh | ||
Tan | When the input value is infinity, the return value will be "-1.#IND" to indicate invalid value, because tan is not convergent. | |
Tanh | ||
Trunc | ||
Null/Zero | ||
NullTo/Zero | ||
ZeroToNull | ||
String | ||
Concat | ||
ConcatBank | ||
InitCap | ||
LeftStr | ||
Length | ||
Lower | ||
LTrim | ||
Position | ||
RightStr | ||
RTrim | ||
SubStr | ||
Trim | ||
Upper |
Supported Filters
Category | Filter |
---|---|
Basic | In List |
Not In List | |
Exactly | |
Not Exactly | |
Comparison | Greater than |
Less Than | |
Greater than or equal to | |
Less than or equal to | |
Between | |
Not between | |
Pattern | Like |
Not like | |
Contains | |
Does not contain | |
Begin with | |
Does not begin with | |
End with | |
Does not end with |