Performing Advanced Operations with Databases

Steps to Perform Advanced Operations with Databases in Pentaho

Populating the Jigsaw database:

  1. Download the js_data.sql script file from Packt’s website.
  2. Launch MySQL Query Browser.
  3. From the File menu, select Open Script….
  4. Locate the downloaded file and open it.
  5. At the beginning of the script file you will see this line:

USE js;

If you created a database with a different name, replace js with that name.

  6. Click on the Execute button.
  7. At the bottom of the screen, you’ll see a progress message.
  8. When the script execution ends, verify that the database has been populated by executing some SELECT statements such as:

SELECT * FROM cities;

Every table should contain records.
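
Checking every table by hand gets tedious, so the verification can also be scripted. The following is a minimal sketch, not part of the original tutorial: it uses an in-memory SQLite database as a stand-in for the MySQL js database, and the helper name empty_tables and the column names are assumptions made for illustration.

```python
import sqlite3


def empty_tables(conn, tables):
    """Return the names of the tables that contain no rows."""
    empty = []
    for table in tables:
        # Table names cannot be bound as parameters, so pass only trusted names.
        count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        if count == 0:
            empty.append(table)
    return empty


# Stand-in database (assumption): SQLite in memory instead of MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city_id INTEGER, city_name TEXT)")
conn.execute("CREATE TABLE products (pro_code TEXT, pro_name TEXT)")
conn.execute("INSERT INTO cities VALUES (1, 'Example City')")

result = empty_tables(conn, ["cities", "products"])
print(result)
```

Against the real database, the same function should work with any DB-API connection to MySQL, and an empty result list would mean that the script populated every table.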
Using a Database lookup step to create a list of products to buy:

  1. Create a new transformation.
  2. From the Input category of steps, drag a Get data from XML step to the canvas.
  3. Use it to read the orders.xml file. In the Content tab, fill the Loop XPath option with the /orders/order string. In the Fields tab get the fields.
  4. Do a preview to inspect the rows read from the file.

[Screenshot: preview of the rows read from orders.xml]

  5. Add a Sort rows step and use it to sort the data by man_code and prod_code.
  6. Add a Group by step and double-click it.
  7. Use the upper grid for grouping by man_code and prod_code.
  8. Use the lower grid for adding a field with the number of orders in each group. As Name, write quantity; as Subject, ordernumber; and as Type, select Number of Values (N).
  9. Expand the Lookup category of steps, drag a Database lookup step to the canvas, and create a hop from the Group by step toward this step.
  10. Double-click the Database lookup step.
  11. As Connection, select js. In Lookup table, browse the database and select products, or just type its name.
  12. Fill the grids. Judging from the later steps, the key grid presumably matches the table fields man_code and pro_code against the stream fields man_code and prod_code, and the Values to return from the lookup table grid returns pro_name and pro_stock.

[Screenshot: Database lookup grids]

  13. Click on OK.
  14. Add a Filter rows step to pass only the rows where pro_stock < quantity, that is, the products with less stock than the ordered quantity.
  15. Add a Text file output step to send the manufacturer code, the product code, the product name, and the ordered quantity to a file named products_to_buy.txt.
  16. Run the transformation.
  17. The file should have the following content:
man_code;prod_code;pro_name;quantity
EDU;ED13_93;Times Square;1
RAV;RVZ50031;Disney World Map;2
RAV;RVZ50106;Star Wars Clone Wars;1
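
To make the data flow of this transformation easier to follow, the steps above can be sketched in plain Python. This is only an illustration of the logic, not how Pentaho implements the steps. The sample XML layout, the PRODUCTS dictionary with its stock values, the product ED13_01, and the function name products_to_buy are all assumptions; the real orders.xml file and products table may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data; the real orders.xml layout may differ.
ORDERS_XML = """
<orders>
  <order><ordernumber>1</ordernumber><man_code>RAV</man_code><prod_code>RVZ50031</prod_code></order>
  <order><ordernumber>2</ordernumber><man_code>RAV</man_code><prod_code>RVZ50031</prod_code></order>
  <order><ordernumber>3</ordernumber><man_code>EDU</man_code><prod_code>ED13_93</prod_code></order>
  <order><ordernumber>4</ordernumber><man_code>EDU</man_code><prod_code>ED13_01</prod_code></order>
</orders>
"""

# Stand-in for the products table (assumed stock values):
# (man_code, pro_code) -> (pro_name, pro_stock)
PRODUCTS = {
    ("RAV", "RVZ50031"): ("Disney World Map", 1),
    ("EDU", "ED13_93"): ("Times Square", 0),
    ("EDU", "ED13_01"): ("Hypothetical Puzzle", 5),
}


def products_to_buy(xml_text, products):
    root = ET.fromstring(xml_text.strip())
    # Get data from XML: one row per /orders/order node.
    rows = [(o.findtext("man_code"), o.findtext("prod_code"))
            for o in root.findall("order")]
    # Sort rows + Group by: number of orders per (man_code, prod_code).
    quantities = Counter(rows)
    lines = ["man_code;prod_code;pro_name;quantity"]
    for (man_code, prod_code), quantity in sorted(quantities.items()):
        # Database lookup: fetch name and stock for the key.
        # A miss is simply skipped here to keep the sketch short.
        match = products.get((man_code, prod_code))
        if match is None:
            continue
        pro_name, pro_stock = match
        # Filter rows: keep only products with less stock than ordered.
        if pro_stock < quantity:
            lines.append(f"{man_code};{prod_code};{pro_name};{quantity}")
    return lines


lines = products_to_buy(ORDERS_XML, PRODUCTS)
print("\n".join(lines))
```

With this sample data, ED13_01 is dropped because its assumed stock of 5 covers the single order, while the other two products end up in the list.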

Using a Database join step to create a list of suggested products to buy:

  1. Open the transformation from the previous tutorial and save it under a new name.
  2. Delete the Text file output step.
  3. Double-click the Group by step and add an aggregated field named customers that holds the list of customers separated by commas. Under Subject, select idcus, and as Type, select Concatenate strings separated by ,.
  4. Double-click the Database lookup step. In the Values to return from the lookup table grid, add pro_theme as a value of type String.
  5. Add a Select values step. Use it to select the fields customers, quantity, pro_theme, and pro_name, renaming quantity as quantity_param and pro_theme as theme_param.
  6. From the Lookup category, drag a Database join step to the canvas. Create a hop from the Select values step to this step.
  7. Double-click the Database join step and select js as Connection.
  8. In the SQL frame, type the following statement:
SELECT man_code
, pro_code
, pro_name
FROM products
WHERE pro_theme like ?
AND pro_stock>=?
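  9. In the Number of rows to return textbox, type 4.
  10. Fill the parameters grid. Because the statement has two question marks, the grid presumably lists theme_param first and quantity_param second, in the same order as the placeholders.

[Screenshot: Database join parameters grid]

  11. Click on OK.

[Screenshot: the finished transformation]

  12. With the last step selected, do a Preview. You should see the suggested products for each incoming row.

[Screenshot: preview of the Database join step]

  13. Check the Step Metrics tab, which shows the rows read and written by each step.

[Screenshot: Step Metrics]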
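
The idea behind the Database join step, running the parameterized statement once per incoming row and attaching the results to that row, can be sketched as follows. This is a simplified illustration under stated assumptions, not Pentaho's implementation: SQLite in memory replaces the MySQL js database, the sample products are invented, and the function name database_join and the output key suggested_pro_name are naming choices of this sketch.

```python
import sqlite3

SQL = """
SELECT man_code, pro_code, pro_name
FROM products
WHERE pro_theme LIKE ?
  AND pro_stock >= ?
"""


def database_join(conn, rows, max_rows=4):
    """Run SQL once per input row and join up to max_rows results to it."""
    joined = []
    for row in rows:
        # Parameters are bound in the order of the question marks.
        cur = conn.execute(SQL, (row["theme_param"], row["quantity_param"]))
        for man_code, pro_code, pro_name in cur.fetchmany(max_rows):
            joined.append({**row,
                           "man_code": man_code,
                           "pro_code": pro_code,
                           "suggested_pro_name": pro_name})
    return joined


# Stand-in database with invented sample products (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (man_code TEXT, pro_code TEXT, "
             "pro_name TEXT, pro_theme TEXT, pro_stock INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", [
    ("AAA", "P1", "Map A", "Maps", 3),
    ("AAA", "P2", "Map B", "Maps", 0),
    ("BBB", "P3", "Castle", "Buildings", 9),
])

rows = [{"customers": "C1, C2", "quantity_param": 2,
         "theme_param": "Maps", "pro_name": "Disney World Map"}]
suggestions = database_join(conn, rows)
print(suggestions)
```

Here only Map A qualifies: it shares the theme and has enough stock for the ordered quantity, whereas Map B is out of stock and Castle belongs to a different theme.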

Testing the transformation that keeps a history of product changes:

In the previous tutorial you loaded a dimension with products by using a Dimension lookup/update step. You ran the transformation once, which inserted one record for each product plus a special record with n/a values for the descriptive fields. Now apply some changes in the operational database and run the transformation again to see how the Dimension lookup/update step keeps history.

  1. In MySQL Query Browser, open the script update_jumbo_products.sql and run it.
  2. Switch to Spoon.
  3. If the transformation created in the last tutorial is not open, open it again.
  4. Run the transformation.
  5. Explore the js_dw database again. Press Open SQL for [lk_puzzles] and type the following statement:
SELECT *
FROM lk_puzzles
WHERE id_js_man = 'JUM'
ORDER BY id_js_prod, version
  6. The result should list the JUM products ordered by product and version.

[Screenshot: query result for lk_puzzles]
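
The history-keeping behaviour described above, where a changed product gets a new version row instead of being overwritten, can be illustrated with a small sketch. It is a deliberately simplified model and not the actual logic of the Dimension lookup/update step: the columns id_js_man, id_js_prod, and version come from the query above, while the name attribute, the date columns, the OPEN_END value, the product code JUMBO1, and the function apply_change are assumptions made for illustration.

```python
from datetime import date

OPEN_END = date(2199, 12, 31)  # placeholder end date for the current version (assumption)


def apply_change(dimension, id_js_man, id_js_prod, name, today):
    """Add a new version row when the tracked attribute changes."""
    versions = [r for r in dimension
                if r["id_js_man"] == id_js_man and r["id_js_prod"] == id_js_prod]
    current = max(versions, key=lambda r: r["version"], default=None)
    if current is not None and current["name"] == name:
        return  # nothing changed, keep the current version
    if current is not None:
        current["date_to"] = today  # close the previous version
    dimension.append({
        "id": len(dimension) + 1,  # surrogate key
        "version": 1 if current is None else current["version"] + 1,
        "date_from": today,
        "date_to": OPEN_END,
        "id_js_man": id_js_man,
        "id_js_prod": id_js_prod,
        "name": name,
    })


dimension = []
apply_change(dimension, "JUM", "JUMBO1", "Old name", date(2009, 1, 1))
apply_change(dimension, "JUM", "JUMBO1", "Old name", date(2009, 2, 1))  # no change
apply_change(dimension, "JUM", "JUMBO1", "New name", date(2009, 3, 1))

history = [(r["version"], r["name"], r["date_to"]) for r in dimension]
print(history)
```

After the three calls the dimension holds two rows for the product: version 1 closed on the date of the change, and version 2 still open.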

About the Author

Data Analyst & Machine Learning Associate

As a Data Analyst & Machine Learning Associate, Nishtha uses a combination of her analytical skills and machine learning knowledge to interpret complicated datasets. She is a passionate storyteller who transforms crucial findings into gripping tales that further influence data-driven decision-making in the business frontier.