This article is about Apache Nifi
Intro#
I’ve been using Apache Nifi for some time now. But has a Cloudera user it would be nice to have it managed centralized in Cloudera Manager.
I found this repo which had only one commit for doing exactly this, so i decided to give it a try, to see if it worked.
Requirements#
- Virtual machine with Cloudera quickstart pre-installed
- I had to patch the available Nifi version with NIFI-3466
- And parcel repo also with Issue1
So i’ll be using my own forks that have this changes already, but will be following the author (prateek) instructions.
Instructions#
Install Requirements#
- cloudera/cm_ext
cd /tmp
git clone https://github.com/cloudera/cm_ext
cd cm_ext/validator
mvn install
- Create parcel & CSD:
cd /tmp
git clone git@github.com:rramos/nifi.git
cd nifi
mvn clean install
cd /tmp
git clone git@github.com:rramos/nifi-parcel.git
cd nifi-parcel
POINT_VERSION=5 VALIDATOR_DIR=/tmp/cm_ext ./build-parcel.sh /tmp/nifi/nifi-assembly/target/nifi-*-SNAPSHOT-bin.tar.gz
VALIDATOR_DIR=/tmp/cm_ext ./build-csd.sh
Test the Parcel#
Cloudera quickstart images comes with java version 1.7. But i need version 1.8 for Nifi otherwise i’ll get minor major version mismatch issue. So i had to make some changes.
First start your cloudera quickstart VM and copy the folders:
build-csd
build-parcel
there in my case i had IP 192.168.122.132
(change for your case).
scp -r build-parcel cloudera@192.168.122.132:/tmp
scp -r build-csd cloudera@192.168.122.132:/tmp
- SSH in the machine and install java8
yum install java-1.8.0-openjdk
- Run the following command in
build-parcel
folder
cd build-parcel
python -m SimpleHTTPServer 14641
Now open your browser at http://192.168.122.132:7180
and access Cloudera Manager (cloudera:cloudera).
Navigate to -> Parcels -> Edit Settings.
Add http://192.168.122.132:14641
Download and install the Nifi Parcel.
- Copy NIFI home
cp -r /tmp/build-parcel/NIFI-0.0.5.nifi.p0.5 /opt/cloudera/parcels/NIFI/
- Correct java path in file
/opt/cloudera/parcels/NIFI/bin/nifi-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0
- Copy the csd jars
cp /tmp/build-csd/NIFI-1.2.0.jar /opt/cloudera/csd/
mkdir /opt/cloudera/csd/NIFI-1.2.0
cp /tmp/build-csd/NIFI-1.2.0.jar /opt/cloudera/csd/NIFI-1.2.0/
cd /opt/cloudera/csd/NIFI-1.2.0/
jar xvf NIFI-1.2.0.jar
rm -f NIFI-1.2.0.jar
- Correct ownership
chown -R cloudera-scm:cloudera-scm /opt/cloudera
Apply Configuration changes to Cloudera Manager and restart the service
Move CSD to Cloudera Manager’s CSD Repo
sudo service cloudera-scm-server restart
- After login again in CM wait for zookeeper to be running normal, and add a new service (Nifi)
After terminating the wizard you should have access to Nifi Interface
Conclusion#
I must say that prateek repo with only that commit worked pretty well and the instructions where also clear. There where some minor adjustments because of the java version, but we can start/stop the service via Cloudera Manager.
There are some pending items referred in his github page:
- Currently
NiFi
runs under theroot
user - Expose config options under Cloudera Manager
- Conf folder from parcels is used, this needs to be migrated to ConfigWriter
- Expose metrics from NiFi
I haven’t tested the configuration in cluster mode as i has using the quickstart VM.
The configuration options in CM would be very good improvement. Auto configuring zookeeper and the other nifi options. I’ll try to contribute to prateek excellent work if i manage to get some time.
Cheers, RR