Friday, November 5, 2010

Automatic report generation now possible in Pentaho Data Integration

Last Friday, Pentaho Data Integration (PDI) developer Matt Casters posted a preview of a new tool that lets Business Intelligence designers run Pentaho Report Designer (PRD) reports as a step in a PDI transformation. This is extremely useful, because the obvious step after ETL is often to generate reports.

Let's say a retail chain wants to send every shop manager a daily report detailing that shop's performance and trends.
Imagine you have a data warehouse that contains all sales records, and a PRD report template. Here is how to create a system that will automatically generate and send the reports every day:
  1. Install the bleeding-edge PDI 4.1.0 RC1
  2. Add Matt's plugin as explained here
  3. Open PDI "Spoon" and create a new Transformation
  4. First, create a "Table Input" step to get each shop's code, name and email address.
  5. Second, create a minimal JavaScript step to compute the output's filename and set the PRPT file's name.
  6. Third, use the new "Pentaho Reporting step", and configure it to use your freshly computed PRPT and output filenames, as well as the shop codes.
  7. Finally, create a "Send mail" step and set it to use each shop's email address, with the generated report as an attachment.
  8. Save the transformation and configure cron to launch it every night via the pan.sh command-line tool.
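
As a rough illustration of step 5, here is a minimal sketch of the kind of logic the JavaScript step could contain. It is written as a plain function so it can be tried outside PDI; inside the step you would work directly with the incoming fields instead. The field names (shop_code, prpt_filename, output_filename) and both paths are assumptions made up for this example, not names required by Matt's plugin.

```javascript
// Hypothetical helper: computes the two filenames the reporting step needs.
// shopCode comes from the "Table Input" step; runDate is the run date.
function reportFields(shopCode, runDate) {
  // Zero-pad month and day so the date stamp is always YYYY-MM-DD.
  function pad(n) {
    return (n < 10 ? "0" : "") + n;
  }
  var stamp = runDate.getUTCFullYear() + "-" +
    pad(runDate.getUTCMonth() + 1) + "-" +
    pad(runDate.getUTCDate());

  // Replace anything that is not filename-safe in the shop code.
  var safeCode = String(shopCode).replace(/[^A-Za-z0-9_-]/g, "_");

  return {
    // Assumed location of the PRD template (same template for every shop).
    prpt_filename: "/opt/reports/shop_performance.prpt",
    // Assumed output folder; one PDF per shop per day.
    output_filename: "/tmp/reports/shop_" + safeCode + "_" + stamp + ".pdf"
  };
}

// Example run for a made-up shop code on the date of this post.
var fields = reportFields("PAR 01", new Date(Date.UTC(2010, 10, 5)));
console.log(fields.prpt_filename);
console.log(fields.output_filename);
```

Putting the shop code and the date in the output filename keeps each night's reports from overwriting each other, and makes it easy to see which attachment went to which shop.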

2 comments:

  1. Does this even work when we use charts? I saw a question about this in the Pentaho forums.
    Btw great work!!!!

  2. Hello,

    I was following your post on the Pentaho forum about the JNDI part. You mentioned pdi/simple-jndi/default.properties.

    Could you please tell me how you finally got JNDI to work?
    Thanks !
    Suman
