Executing Workflows Using Groovy

Overview

This guide describes how to execute an iceDQ workflow from the Command Line Interface with a Groovy script, given a host URL, request headers, and a Workflow ID. The script triggers the workflow/rule, polls for the execution status, and retrieves the result, which it saves in an output JSON file.

Steps

1. Pre-Requisites

Note

For scripts, refer to the Sample Code Snippets section below.

Before initiating the workflow automation, ensure the following pre-requisites are met:

  • Groovy Environment: Ensure that Groovy is installed on the system where the automation will run.
  • iceDQ Configuration: Confirm that iceDQ is set up and configured correctly.
  • Place the runWorkflow.groovy file in a folder of your choice.
  • Create a lib folder inside the folder that contains runWorkflow.groovy. The lib folder must hold the two function scripts: executeWkfl.groovy and getExecutionResult.groovy.

  • You need to have the config.json file in the same folder as the runWorkflow.groovy file.
  • Optionally, create a folder for the execution logs and pass its path as the --logsfolder argument during execution.
  • Optionally, create a folder for the execution output and pass its path as the --outpath argument during execution.
  • Optionally, choose a name for the output JSON file and pass it as the --outname argument during execution.
  • If you omit --logsfolder or --outpath, the sample script creates folders named logs-default and output-default in the directory from which it is run (typically the folder containing runWorkflow.groovy). If you omit --outname, the output file is named output.json.
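
The file layout described above can be checked before a run. The following is a minimal sketch and not part of the provided scripts: the helper name findMissingFiles is hypothetical, and the file names are taken from the list above.

```groovy
// Hypothetical helper (not part of the iceDQ scripts): returns the required
// files that are missing under baseDir.
def findMissingFiles = { File baseDir ->
    def required = [
        'runWorkflow.groovy',
        'config.json',
        'lib/executeWkfl.groovy',
        'lib/getExecutionResult.groovy'
    ]
    required.findAll { !new File(baseDir, it).isFile() }
}

// Check the current working directory and report the outcome
def missing = findMissingFiles(new File('.'))
if (missing) {
    println "Missing required files: ${missing.join(', ')}"
} else {
    println 'All required files are in place.'
}
```

Run it from the folder that is meant to contain runWorkflow.groovy; an empty result means the layout matches the pre-requisites.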

2. Pre-Setup

Prior to executing the workflows, perform the following pre-setup tasks:

  • Get Config Details: The configuration file config.json contains a JSON object. Enter the Accept, Content-Type, Workspace-Id, and Authorization (bearer token) values in its headerPayload object.
  • Workflow ID: Ensure that you have the ID of the workflow you want to execute.
  • Base URL: Ensure that you have the base URL of your iceDQ instance.
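
A quick check of config.json can catch unfilled header values before any request is sent. The sketch below is illustrative only: the helper name findInvalidHeaders is hypothetical, and treating any value containing '<' as an unfilled placeholder is an assumption based on the sample config.json in this guide.

```groovy
import groovy.json.JsonSlurper

// Hypothetical helper: lists required header keys that are absent or still
// hold a placeholder such as "<enter Workspace Id here>".
def findInvalidHeaders = { String configJson ->
    def headers = new JsonSlurper().parseText(configJson).headerPayload ?: [:]
    def required = ['Accept', 'Content-Type', 'Workspace-Id', 'Authorization']
    required.findAll { key ->
        def value = headers[key]
        // Assumption: a '<' in the value means the placeholder was not replaced
        !value || value.toString().contains('<')
    }
}

// The sample config.json from this guide, with its placeholders left in
def sample = '''
{
  "headerPayload": {
    "Accept": "application/json",
    "Content-Type": "application/json",
    "Workspace-Id": "<enter Workspace Id here>",
    "Authorization": "Bearer <Enter bearer token here>"
  }
}
'''
println findInvalidHeaders(sample)   // prints [Workspace-Id, Authorization]
```

To check a real file, pass new File('config.json').text instead of the inline sample.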

3. Script Functionality Overview

The script works as follows:

  1. Script Structure:

    • Initialize values, parse options, read configuration, and set up dates.
    • Execute the workflow, log details, and handle output.
  2. Function Structure (executeWkfl):

    • Define executeWkfl function for triggering workflow runs.
    • Handle HTTP POST requests and return responses.
  3. Function Structure (getExecutionResult):

    • Define getExecutionResultWithPolling function for polling results.
    • Accept parameters and return when execution is completed.
  4. Error Handling:

    • Manage unexpected input in the script and function parameters.
    • Use try-catch blocks for logging exceptions and handle HTTP response errors.
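
The polling behaviour outlined above can be sketched in isolation from the HTTP calls. This is a minimal illustration, not the provided implementation: pollUntilDone is a hypothetical helper, the status fetcher is a stand-in closure, and the 'success' status value is an assumption (the guide only confirms that 'running' means the execution has not finished).

```groovy
// Hypothetical helper: calls fetchStatus until it reports something other than
// 'running', or until maxRetries attempts have been made.
def pollUntilDone = { Closure fetchStatus, int maxRetries, long intervalMillis ->
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
        def status = fetchStatus()
        if (status != 'running') {
            return status
        }
        // Wait between attempts, but not after the last one
        if (attempt < maxRetries) {
            Thread.sleep(intervalMillis)
        }
    }
    throw new IllegalStateException("Still running after ${maxRetries} attempts")
}

// Stand-in for the HTTP call: reports 'running' twice, then 'success'
def responses = ['running', 'running', 'success']
def calls = 0
def finalStatus = pollUntilDone({ responses[calls++] }, 10, 10L)
println "Final status: ${finalStatus} after ${calls} calls"
```

Passing the status fetcher as a closure keeps the retry logic testable without a live iceDQ instance.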

4. Command Line Arguments

The following command-line arguments are relevant to executing workflows:

Flag           Note
--baseurl      The host URL.
--id           The ID of the rule/workflow to be executed.
--async        Asynchronous status, either True or False. The sample runWorkflow.groovy in this guide does not read this flag.
--outname      Output file name for saving the execution result.
--outpath      Output folder path.
--logsfolder   Log folder path.

To execute an iceDQ workflow through the Command Line Interface, run the Groovy script in the following format. Arguments in square brackets are optional:

groovy runWorkflow.groovy --baseurl <base_url> --id <workflow_id> [--logsfolder <logs_folder>] [--outname <filename>] [--outpath <folderpath>]
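
Because --baseurl and --id appear without brackets in the format above, a script can reject a call that omits them before contacting the server. The following is a hedged sketch of such a check: parseArgs is a hypothetical helper, and the URL and ID values are placeholders rather than real iceDQ values.

```groovy
// Hypothetical helper: collects "--flag value" pairs and reports which of the
// required flags (--baseurl, --id) are missing.
def parseArgs = { List<String> argv ->
    def known = ['baseurl', 'id', 'logsfolder', 'outname', 'outpath']
    def options = [:]
    for (int i = 0; i < argv.size() - 1; i++) {
        def name = argv[i].startsWith('--') ? argv[i].substring(2) : null
        if (name in known) {
            // Consume the value so it is not read as a flag on the next pass
            options[name] = argv[++i]
        }
    }
    def missing = ['baseurl', 'id'].findAll { !options[it] }
    [options: options, missing: missing]
}

// Placeholder values for illustration only
def parsed = parseArgs(['--baseurl', 'https://icedq.example.com/api', '--id', '12345'])
println parsed.options
println parsed.missing
```

In a script, the same helper could be called as parseArgs(args as List), exiting with the usage message when missing is not empty.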

The following is the output you will receive if the execution is successful:

yyyy-mm-dd hh-mm-ss InstanceId: <#######>
yyyy-mm-dd hh-mm-ss Workflow running...
yyyy-mm-dd hh-mm-ss Workflow still running. Waiting for 5 seconds...
....
....
yyyy-mm-dd hh-mm-ss Execution Status is <execution result>. Exiting...
yyyy-mm-dd hh-mm-ss Execution Result: {<response json>}

This is the error you may get if the authorization fails:

yyyy-mm-dd hh-mm-ss An error occurred: API request for result failed with response code: 401

5. Post-Execution Tasks

After executing the workflows, perform the following post-execution tasks:

  1. Review Logs: Analyze the script logs to identify any issues or anomalies during the execution.
  2. Result Verification: You may manually verify the test results in iceDQ (optional).
  3. Documentation: Archive the output JSON file to keep a record of how data quality improves or deteriorates over time.
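
Log review can be partly automated. The sketch below assumes the log line layout shown in the sample output (a 19-character "yyyy-MM-dd HH-mm-ss" timestamp, a space, then the message); findErrors is a hypothetical helper, and the sample lines are invented for illustration.

```groovy
import java.text.SimpleDateFormat

// Hypothetical helper: returns a [timestamp, message] entry for each logged error.
def findErrors = { List<String> lines ->
    def format = new SimpleDateFormat('yyyy-MM-dd HH-mm-ss')
    lines.findAll { it.contains('An error occurred:') }.collect { line ->
        // The timestamp occupies the first 19 characters, followed by one space
        [timestamp: format.parse(line.substring(0, 19)), message: line.substring(20)]
    }
}

// Invented sample lines in the format the script writes
def sampleLog = [
    '2024-01-15 10-30-00 InstanceId: 1001',
    '2024-01-15 10-30-00 Workflow running...',
    '2024-01-15 10-30-05 An error occurred: API request for result failed with response code: 401'
]
findErrors(sampleLog).each { println it.message }
```

For a real log, pass the lines of the file instead, for example new File(logFilePath).readLines(), where logFilePath is the path of the log file you want to inspect.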

Sample Code Snippets

runWorkflow.groovy

import java.net.HttpURLConnection
import java.net.URL
import java.io.OutputStream
import java.nio.charset.StandardCharsets
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import java.text.SimpleDateFormat
import java.util.Date

// Print usage and exit when no arguments are given or help is requested
if (args.length == 0 || args.contains('--help') || args.contains('-h')) {
    println 'Usage: groovy runWorkflow.groovy --baseurl <base_url> --id <workflow_id> [--logsfolder <logs_folder>] [--outname <filename>] [--outpath <folderpath>]'
    System.exit(0)
}

// Parse "--flag value" pairs into a map; a flag with no following value is ignored
def knownFlags = ['--baseurl', '--id', '--logsfolder', '--outname', '--outpath']
def options = [:]
args.eachWithIndex { arg, index ->
    if (arg in knownFlags && index + 1 < args.length) {
        // Strip the leading "--" so that '--baseurl' is stored as options.baseurl
        options[arg.substring(2)] = args[index + 1]
    }
}

// Read configuration from 'config.json'
def config = new JsonSlurper().parseText(new File('config.json').text)

// Extract command-line options or use defaults
// BASE_URL falls back to an optional hostUrl key in config.json (absent from the sample config.json)
BASE_URL = options.baseurl ?: config.hostUrl
WKFL_ID = options.id ?: ''
LOGS_FOLDER = options.logsfolder ?: 'logs-default' // Use provided logs folder or default 'logs-default'
OUT_NAME = options.outname ?: 'output.json' // Use provided output file name or default 'output.json'
OUT_PATH = options.outpath ?: 'output-default' // Use provided output path or default 'output-default'

// Load the closure that triggers the workflow (path is relative to the working directory)
def executeWkfl = new GroovyShell().evaluate(new File('lib/executeWkfl.groovy'))

// Load the closure that polls for the execution result
def getExecutionResult = new GroovyShell().evaluate(new File('lib/getExecutionResult.groovy'))

// Define the log file path
def logsFolder = new File(LOGS_FOLDER)
logsFolder.mkdirs() // Ensure the logs folder exists

// Define the output folder path
def outputFolder = new File(OUT_PATH)
outputFolder.mkdirs() // Ensure the output folder exists

def dateFormat = new SimpleDateFormat("yyyy-MM-dd HH-mm-ss")
def logfiledateFormat = new SimpleDateFormat("yyyy-MM-dd")
def logFileName = "${logfiledateFormat.format(new Date())}_runWkfl.log"
def logFilePath = new File(logsFolder, logFileName).toString()

// Define a closure that writes each message to the log file and the console
def logClosure = { String output ->
    def timestamp = dateFormat.format(new Date())
    new File(logFilePath).withWriterAppend { writer ->
        writer << "$timestamp $output\n"
        System.out << "$timestamp $output\n" // Also print to the console
    }
}

try {
    // Run the workflow and read the instance ID from the trigger response
    def executionResponse = executeWkfl(BASE_URL, config.headerPayload, WKFL_ID)
    def instanceId = new JsonSlurper().parseText(executionResponse).instanceId

    logClosure("InstanceId: $instanceId")
    logClosure("Workflow running...")

    // Poll for execution status
    def maxRetries = 10
    def retryIntervalMillis = 5000
    def finished = false

    for (int i = 0; i < maxRetries; i++) {
        def resultResponse = getExecutionResult(BASE_URL, config.headerPayload, WKFL_ID, instanceId)
        def status = new JsonSlurper().parseText(resultResponse).status

        if (status != 'running') {
            logClosure("Execution Status is $status. Exiting...")
            logClosure("Execution Result: $resultResponse")

            // Save the formatted execution result to the output file
            def formattedResult = JsonOutput.prettyPrint(resultResponse)
            def outputFile = new File("${outputFolder}/${OUT_NAME}")

            if (outputFile.exists()) {
                // Append to the existing file
                outputFile << "\n${formattedResult}"
            } else {
                // Create a new file and write the content
                outputFile.withWriter { writer ->
                    writer << formattedResult
                }
            }

            finished = true
            break
        }

        // Log the wait only when the workflow really is still running
        logClosure("Workflow still running. Waiting for ${retryIntervalMillis / 1000} seconds...")
        Thread.sleep(retryIntervalMillis)
    }

    if (!finished) {
        logClosure("Maximum retries reached. Workflow still running.")
    }

} catch (Exception e) {
    logClosure("An error occurred: ${e.message}")
    // e.printStackTrace() returns nothing, so to log the stack trace write it to a string first:
    // def sw = new StringWriter(); e.printStackTrace(new PrintWriter(sw)); logClosure(sw.toString())
}

executeWkfl.groovy

// Function to execute rule
import java.nio.charset.StandardCharsets

def executeRule = { hostUrl, headers, objectId ->
    def url = "${hostUrl}/workflowruns:trigger"
    def payload = """
    {
        "objectId": "${objectId}"
    }
    """
    def connection = (HttpURLConnection) new URL(url).openConnection()

    connection.setRequestMethod("POST")
    headers.each { key, value ->
        connection.setRequestProperty(key, value)
    }

    // Send the payload as UTF-8 bytes. HttpURLConnection buffers the body and sets
    // Content-Length itself, so the header is not set manually.
    connection.setDoOutput(true)
    byte[] input = payload.getBytes(StandardCharsets.UTF_8)
    connection.getOutputStream().withStream { os ->
        os.write(input, 0, input.length)
    }

    def responseCode = connection.getResponseCode()

    if (responseCode == HttpURLConnection.HTTP_OK) {
        return connection.getInputStream().text
    } else {
        throw new Exception("API request for result failed with response code: ${responseCode}")
    }
}

// The closure is the script's last expression, so evaluate() returns it to the caller
executeRule

getExecutionResult.groovy

// Function to get execution result with polling
import groovy.json.JsonSlurper

def getExecutionResultWithPolling = { hostUrl, headers, objectId, instanceId ->
    def resultUrl = "${hostUrl}/workflowruns/${instanceId}/result"
    def maxRetries = 10 // Maximum number of polling attempts
    def retryIntervalMillis = 5000 // Wait between attempts, in milliseconds

    for (int i = 0; i < maxRetries; i++) {
        // Declared outside the try block so that the finally block can see it
        def resultConnection = (HttpURLConnection) new URL(resultUrl).openConnection()
        try {
            resultConnection.setRequestMethod("GET")

            headers.each { key, value ->
                resultConnection.setRequestProperty(key, value)
            }

            def resultResponseCode = resultConnection.getResponseCode()

            if (resultResponseCode != HttpURLConnection.HTTP_OK) {
                def errorStream = resultConnection.getErrorStream()
                def errorMessage = errorStream ? errorStream.text : "No error message available"
                throw new Exception("API request for result failed with response code: ${resultResponseCode}. Error: ${errorMessage}")
            }

            def resultResponse = resultConnection.getInputStream().text

            // Return as soon as the parsed status is no longer "running"
            if (new JsonSlurper().parseText(resultResponse).status != 'running') {
                return resultResponse
            }
        } finally {
            // Close this attempt's connection
            resultConnection.disconnect()
        }

        Thread.sleep(retryIntervalMillis) // Wait before retrying
    }

    throw new Exception("Maximum retries reached. Job still running.")
}

// The closure is the script's last expression, so evaluate() returns it to the caller
getExecutionResultWithPolling
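
Both lib scripts end with a bare closure name because GroovyShell.evaluate returns the value of the last expression in the script it runs. That is how runWorkflow.groovy obtains a callable function from each file. A minimal sketch of the mechanism, using an inline script and a hypothetical greet closure in place of the real lib files:

```groovy
// Evaluate a small script whose last expression is a closure.
// The script text and the names fn and greet are illustrative only.
def greet = new GroovyShell().evaluate('''
def fn = { name -> "Hello, " + name }
fn
''')

// The returned value is the closure itself and can be called directly
println greet('iceDQ')   // prints Hello, iceDQ
```

If the final line naming the closure were removed, the last statement would be the def declaration, and the caller would no longer be guaranteed a callable value.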

config.json

{
    "headerPayload": {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Workspace-Id": "<enter Workspace Id here>",
        "Authorization": "Bearer <Enter bearer token here>"
    }
}