Strategy ONE

Import Data from Hadoop Gateway

You can import the following file types from a Hadoop Distributed File System (HDFS): .avro, .csv, .json, .orc, .parquet, .txt.

If you choose to import files that don't have an extension, the Confirm File Type dialog prompts you to identify the file type.
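To check ahead of time which files are likely to trigger the Confirm File Type dialog, a small script can sort HDFS paths by extension. The following is a minimal sketch, not part of the product: the function name classify_hdfs_path and the three result labels are hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import PurePosixPath

# File types listed above as importable from HDFS.
SUPPORTED_EXTENSIONS = {".avro", ".csv", ".json", ".orc", ".parquet", ".txt"}


def classify_hdfs_path(path: str) -> str:
    """Classify an HDFS path by its extension (hypothetical helper).

    Returns:
        "supported"          - extension is in the list above
        "needs_confirmation" - no extension, so the file type must be
                               identified in the Confirm File Type dialog
        "not_listed"         - extension is not in the list above
    """
    # Assumption: extensions are compared case-insensitively.
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == "":
        return "needs_confirmation"
    if suffix in SUPPORTED_EXTENSIONS:
        return "supported"
    return "not_listed"
```

For example, a path such as /data/part-00000 has no extension and would be classified as needs_confirmation, while /data/sales.csv would be classified as supported.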

  1. Create a document or open an existing one.
  2. Choose Data > Add Dataset.
  3. Click Add External Data.
  4. Hover over Hadoop and click Browse Hadoop Files.
  5. Enter your connection credentials in the Data Source dialog and click Save.
  6. Drag your files from the left pane to the right pane.
  7. Click Finish.
  8. Click Connect Live or Import as an In-Memory Dataset.

By default, the connectivity timeout between Hadoop Gateway and Intelligence Server is 20 minutes. To increase this limit, create a file named QueryDSServerTimeout.ini and place it in your Intelligence Server directory. The file must contain a single entry: the timeout limit as a numeric value, in minutes. A value of -1 removes the timeout limit.
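The timeout file can also be created with a short script. The following is a minimal sketch: the function name write_timeout_file is hypothetical, the Intelligence Server directory must be supplied by you because its location depends on your installation, and rejecting zero or negative values other than -1 is an assumption of this sketch rather than documented behavior.

```python
from pathlib import Path


def write_timeout_file(server_dir: str, minutes: int) -> Path:
    """Write QueryDSServerTimeout.ini into the given directory (hypothetical helper).

    server_dir: path to your Intelligence Server directory (installation-specific).
    minutes:    timeout limit in minutes, or -1 for unlimited.
    """
    # Assumption: only -1 (unlimited) or a positive number of minutes is meaningful.
    if minutes != -1 and minutes <= 0:
        raise ValueError("minutes must be a positive integer or -1 for unlimited")

    target = Path(server_dir) / "QueryDSServerTimeout.ini"
    # The file holds a single entry: the numeric value and nothing else.
    target.write_text(str(minutes), encoding="ascii")
    return target
```

Calling write_timeout_file with your Intelligence Server directory and a value of 45 would produce a file whose only content is 45. Passing -1 would write -1, which sets the timeout to unlimited.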