NiFi and HTTP Post Configuration

Apache NiFi does a great job when it comes to data flow automation. It excels in many regards: ease of use, built-in processors of all kinds, straightforward debugging and data provenance, to name but a few. Despite a fairly moderate learning curve, there are certainly new concepts to grasp. It took me a while to understand how to go about the basics, such as configuring HTTP POST requests. NiFi is rather different from tools like Postman, and one has to appreciate the importance of flow files. In this article, I present a number of complete examples to help you get started.

The examples are based on two different scenarios. First of all, I look at how to build a simple one-off request with a static body. In practice, however, there is often a need to build iterative workflows, where request parameters change in every iteration. For instance, you might be on a mission to ingest large volumes of data via a web service. Some APIs take highly structured and elaborate (JSON) objects as input queries and therefore expose POST rather than GET endpoints. In each iteration of such an API call, one would want to move to the next page of results until all data has been read.

Enough talk, let’s get down to it.

Setting the scene

For simplicity, I stick to JSONPlaceholder, an online API mock. As for NiFi, I use the latest stable release, which is 1.0.0 at the time of writing. I am going to submit a new post to a hypothetical blog as follows.

POST /posts HTTP/1.1
{
"title": "foo",
"body": "bar",
"userId": 1
}

And here is a successful response.

HTTP 201
{
 "title": "foo",
 "body": "bar",
 "userId": 1,
 "id": 101
}
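
For readers who would like to see the same call outside NiFi, here is a minimal Python sketch using only the standard library. The base URL and the Content-Type header are my assumptions (the listing above only shows the path and the body), and the actual send is left commented out so the snippet runs without network access.

```python
import json
import urllib.request

# Assumed base URL of JSONPlaceholder; the listing above only shows /posts.
BASE_URL = "https://jsonplaceholder.typicode.com"


def build_post_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /posts request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + "/posts",
        data=body,
        method="POST",
        # Assumed header; it tells the server the body is JSON.
        headers={"Content-Type": "application/json; charset=UTF-8"},
    )


request = build_post_request({"title": "foo", "body": "bar", "userId": 1})

# Uncomment to actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.loads(response.read()))
```

The point to carry over to NiFi is that the URL, the method and the body are three separate pieces, and only the first two are set on the HTTP processor itself.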

Scenario 1: A simple one-off request with a static body

To start with, let me tackle the choice of the right processor. Both InvokeHTTP and PostHTTP would suit.

InvokeHTTP Configuration
PostHTTP Configuration

As you can see, almost all of the configuration is left at its default settings. In fact, I only specified a URL and, in the case of the InvokeHTTP processor, an HTTP method.

Please note that there is no option for defining the request body. The reason is that the body is constructed from the content of the inbound flow file. Normally, an UpdateAttribute would do, but since I need to pass a JSON object, I apply an additional transformation via AttributesToJSON.

Request parameters
The parameters are turned into JSON
Parameters to JSON flow
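
To make the attributes-versus-content distinction more tangible, here is a toy Python model of the two steps. The `FlowFile` class and both functions are hypothetical stand-ins of my own, not NiFi APIs; they only mimic what UpdateAttribute and AttributesToJSON do conceptually. Flow file attributes are strings, so the sketch keeps `userId` as `"1"`.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi flow file: string attributes plus content."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""


def update_attribute(flow_file: FlowFile, **new_attributes: str) -> FlowFile:
    """Mimic UpdateAttribute: set attributes, leave the content untouched."""
    flow_file.attributes.update(new_attributes)
    return flow_file


def attributes_to_json(flow_file: FlowFile, attribute_names: list) -> FlowFile:
    """Mimic AttributesToJSON with the flow file content as destination."""
    selected = {name: flow_file.attributes[name] for name in attribute_names}
    flow_file.content = json.dumps(selected).encode("utf-8")
    return flow_file


flow_file = update_attribute(FlowFile(), title="foo", body="bar", userId="1")
flow_file = attributes_to_json(flow_file, ["title", "body", "userId"])
print(flow_file.content.decode("utf-8"))
```

After the first step the content is still empty; only the second step produces something an HTTP processor could send as a body.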

Note that the JSON-ised output goes into the flow file content. This is essential to a successful request invocation. The only remaining part is to append the HTTP POST processor. Here is what the whole flow looks like.

HTTP POST with JSON body

Scenario 2: A repeated request with a dynamically built body

Now, there is often a need to run API calls in a loop until some terminal condition is met. In my example, let’s say that, for whatever reason, I want to create 5 consecutive posts. As in any kind of loop, I need a counter and a termination expression.

The counter gets initialised upfront and incremented at the end of each iteration.

Counter initialisation
The counter gets incremented

The termination can be expressed by RouteOnAttribute.

Termination expression

Finally, in terms of building the JSON parameters, let me make the user ID value dependent on the counter.

Request parameters with dynamic content

Here is the original flow revamped into a conditional loop of POST requests.

Flow with a conditional loop
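
The control logic of this loop can be sketched in plain Python as follows. This is an illustration of the idea, not a description of the processor settings: the starting value of 1 and the helper names are my assumptions. In NiFi's expression language, the increment and the check could be written along the lines of `${counter:plus(1)}` and `${counter:le(5)}`, although the exact expressions used in the flow may differ.

```python
import json

MAX_POSTS = 5  # the example creates 5 consecutive posts


def build_body(counter: int) -> str:
    """Rebuild the JSON body from scratch, with userId taken from the counter."""
    return json.dumps({"title": "foo", "body": "bar", "userId": counter})


def run_loop(send) -> int:
    """Call send(body) once per iteration and return the number of calls."""
    counter = 1                    # counter initialisation (assumed start value)
    while counter <= MAX_POSTS:    # termination check, RouteOnAttribute's role
        send(build_body(counter))  # the HTTP POST processor's role
        counter += 1               # increment at the end of the iteration
    return counter - 1


# Collect the bodies instead of sending them, so the sketch runs offline.
sent_bodies = []
calls = run_loop(sent_bodies.append)
```

Passing `send` as a parameter keeps the loop independent of how the request is actually made, much like the loop in the flow does not care which HTTP processor sits inside it.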

There is a subtle caveat, though. If you look closely, you will realise that the request body is rebuilt from scratch in every iteration. That feels unnecessary.

To fix the issue, I define the request parameters upfront, i.e. along with the counter initialisation. Note the change in the user ID value. It no longer takes an expression. There is a hardcoded placeholder instead.

All parameters are initialised prior to the loop

In each iteration, the predefined set of parameters gets turned into JSON to create the flow file content fed into the HTTP POST. Obviously, prior to making the API call, the placeholder has to be replaced with the actual value based on the counter. That’s taken care of by ReplaceText.

Placeholder replacement
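
As a rough Python analogue of the placeholder approach, the sketch below defines the parameters once and only substitutes the placeholder inside the loop. The placeholder token `USER_ID_PLACEHOLDER` is hypothetical, since the text does not name the one actually used, and plain string replacement is my stand-in for ReplaceText.

```python
import json

# Hypothetical placeholder token; the real flow may use a different one.
PLACEHOLDER = "USER_ID_PLACEHOLDER"

# Defined once, before the loop, together with the counter initialisation.
PARAMETERS = {"title": "foo", "body": "bar", "userId": PLACEHOLDER}


def build_body(counter: int) -> str:
    """Turn the fixed parameters into JSON, then fill in the placeholder."""
    content = json.dumps(PARAMETERS)                   # AttributesToJSON's role
    return content.replace(PLACEHOLDER, str(counter))  # ReplaceText's role


bodies = [build_body(counter) for counter in range(1, 6)]
print(bodies[0])
```

Because the placeholder sits inside a quoted JSON string, the substituted user ID ends up as a string (`"1"`) rather than a number in this sketch.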

Once again, here is the final state of the flow.

Optimised flow for repeated API calls

I don’t want to overstate the significance of using placeholders over expressions. It really depends on what your needs are. NiFi is flexible enough to let you experiment and identify bottlenecks in your data flow.

That brings me to the end of this brief introduction to creating simple flows with NiFi. In this post, we looked at how to build an HTTP POST request with a JSON body and how to make iterative calls with a variable configuration.

NiFi templates for all of the discussed examples are available on GitHub – NiFi by Example.

Thanks for reading and stay tuned for my next post about NiFi, where I will look at how to configure an SSL service.
