X-Git-Url: http://git.vanrenterghem.biz/www2.vanrenterghem.biz.git/blobdiff_plain/9d5cad4353459e7fa815d6a8a677037b58e69668..3c9920148567c38da1fd09f8665264c436d85a8a:/source/posts/using_Apache_Nifi_Kafka_big_data_tools.org

diff --git a/source/posts/using_Apache_Nifi_Kafka_big_data_tools.org b/source/posts/using_Apache_Nifi_Kafka_big_data_tools.org
index 297c75f..c53f9f1 100644
--- a/source/posts/using_Apache_Nifi_Kafka_big_data_tools.org
+++ b/source/posts/using_Apache_Nifi_Kafka_big_data_tools.org
@@ -1,11 +1,10 @@
-#+date: 2018-09-11 21:09:06 +0800
+#+date: <2018-09-11 21:09:06 +0800>
 #+filetags: Apache Nifi Kafka bigdata streaming
 #+title: Using Apache Nifi and Kafka - big data tools
 
 Working in analytics these days, the concept of big data has been firmly
 established. Smart engineers have been developing cool technology to
-work with it for a while now. The [[Apache Software
-Foundation|https://apache.org]] has emerged as a hub for many of these -
+work with it for a while now. The [[https://apache.org][Apache Software Foundation]] has emerged as a hub for many of these -
 Ambari, Hadoop, Hive, Kafka, Nifi, Pig, Zookeeper - the list goes on.
 
 While I'm mostly interested in improving business outcomes applying
@@ -14,7 +13,7 @@ that easier.
 
 Over the past few weeks, I have been exploring some tools, installing
 them on my laptop or a server and giving them a spin. Thanks to
-[[Confluent, the founders of Kafka|https://www.confluent.io]] it is
+[[https://www.confluent.io][Confluent, the founders of Kafka]] it is
 super easy to try out Kafka, Zookeeper, KSQL and their REST API. They
 all come in a pre-compiled tarball which just works on Arch Linux.
 (After trying to compile some of these, this is no luxury - these apps
@@ -26,11 +25,13 @@ started is:
 #+END_SRC
 
 I also spun up an instance of
-[[nifi|https://nifi.apache.org/download.html]], which I used to monitor
+[[https://nifi.apache.org/download.html][nifi]], which I used to monitor
 a (json-ised) apache2 webserver log.
 Every new line added to that log goes as a message to Kafka.
 
-[[Apache Nifi configuration|/pics/ApacheNifi.png]]
+#+CAPTION: Apache Nifi configuration
+#+ATTR_HTML: :class img-fluid :alt Apache Nifi configuration
+[[file:../assets/ApacheNifi.png]]
 
 A processor monitoring a file (tailing) copies every new line over to
 another processor publishing it to a Kafka topic. The Tailfile monitor