AWS Kinesis Agent configuration and setup for Data Streaming to the Cloud

Abstract

Amazon Kinesis Agent is a tool that collects and processes data and sends it in real time to AWS services such as Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose; it can also publish its own metrics to Amazon CloudWatch. In this guide, we’ll walk through setting up and configuring the agent on Ubuntu.

Target Infrastructure

infra.png

Install kinesis agent

Depending on your operating system, there are two installation paths: a short, automated one and a longer one that involves building the agent manually. If your distribution’s repositories provide a pre-packaged version of the Kinesis Agent, you can install it with the package manager. Otherwise, you’ll need to build it from source. On a yum-based system, installation is straightforward (the output below was captured on Amazon Linux 2):

[cloudshell-user@ip-xx-xx-xx-xx ~]$ sudo yum install aws-kinesis-agent

Installed:
  aws-kinesis-agent.noarch 0:2.0.8-1.amzn2

Dependency Installed:
  alsa-lib.x86_64 0:1.1.4.1-2.amzn2                              atk.x86_64 0:2.22.0-3.amzn2.0.2                                         avahi-libs.x86_64 0:0.6.31-20.amzn2.0.2
  cairo.x86_64 0:1.15.12-4.amzn2                                 copy-jdk-configs.noarch 0:3.3-10.amzn2                                  cups-libs.x86_64 1:1.6.3-51.amzn2.0.3
  dejavu-fonts-common.noarch 0:2.33-6.amzn2                      dejavu-sans-fonts.noarch 0:2.33-6.amzn2                                 fontconfig.x86_64 0:2.13.0-4.3.amzn2
  fontpackages-filesystem.noarch 0:1.44-8.amzn2                  freetype.x86_64 0:2.8-14.amzn2.1.1                                      fribidi.x86_64 0:1.0.2-1.amzn2.1.2
  gdk-pixbuf2.x86_64 0:2.36.12-3.amzn2                           giflib.x86_64 0:4.1.6-9.amzn2.0.2                                       graphite2.x86_64 0:1.3.10-1.amzn2.0.2
  gtk-update-icon-cache.x86_64 0:3.22.30-3.amzn2                 gtk2.x86_64 0:2.24.31-1.amzn2.0.2                                       harfbuzz.x86_64 0:1.7.5-2.amzn2
  hicolor-icon-theme.noarch 0:0.12-7.amzn2                       hwdata.x86_64 0:0.252-9.3.amzn2                                         jasper-libs.x86_64 0:1.900.1-33.amzn2.0.1
  java-1.8.0-openjdk.x86_64 1:1.8.0.382.b05-1.amzn2.0.1          java-1.8.0-openjdk-headless.x86_64 1:1.8.0.382.b05-1.amzn2.0.1          javapackages-tools.noarch 0:3.4.1-11.amzn2
  jbigkit-libs.x86_64 0:2.0-11.amzn2.0.2                         libICE.x86_64 0:1.0.9-9.amzn2.0.2                                       libSM.x86_64 0:1.2.2-2.amzn2.0.2
  libX11.x86_64 0:1.6.7-3.amzn2.0.3                              libX11-common.noarch 0:1.6.7-3.amzn2.0.3                                libXau.x86_64 0:1.0.8-2.1.amzn2.0.2
  libXcomposite.x86_64 0:0.4.4-4.1.amzn2.0.2                     libXcursor.x86_64 0:1.1.15-1.amzn2                                      libXdamage.x86_64 0:1.1.4-4.1.amzn2.0.2
  libXext.x86_64 0:1.3.3-3.amzn2.0.2                             libXfixes.x86_64 0:5.0.3-1.amzn2.0.2                                    libXft.x86_64 0:2.3.2-2.amzn2.0.2
  libXi.x86_64 0:1.7.9-1.amzn2.0.2                               libXinerama.x86_64 0:1.1.3-2.1.amzn2.0.2                                libXrandr.x86_64 0:1.5.1-2.amzn2.0.3
  libXrender.x86_64 0:0.9.10-1.amzn2.0.2                         libXtst.x86_64 0:1.2.3-1.amzn2.0.2                                      libXxf86vm.x86_64 0:1.1.4-1.amzn2.0.2
  libdrm.x86_64 0:2.4.97-2.amzn2                                 libfontenc.x86_64 0:1.1.3-3.amzn2.0.2                                   libglvnd.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1
  libglvnd-egl.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1           libglvnd-glx.x86_64 1:1.0.1-0.1.git5baa1e5.amzn2.0.1                    libjpeg-turbo.x86_64 0:2.0.90-2.amzn2.0.6
  libpciaccess.x86_64 0:0.14-1.amzn2                             libpng.x86_64 2:1.5.13-8.amzn2.0.5                                      libthai.x86_64 0:0.1.14-9.amzn2.0.2
  libtiff.x86_64 0:4.0.3-35.amzn2.0.14                           libwayland-client.x86_64 0:1.17.0-1.amzn2.0.1                           libwayland-server.x86_64 0:1.17.0-1.amzn2.0.1
  libxcb.x86_64 0:1.12-1.amzn2.0.2                               libxshmfence.x86_64 0:1.2-1.amzn2.0.2                                   libxslt.x86_64 0:1.1.28-6.amzn2
  lksctp-tools.x86_64 0:1.0.17-2.amzn2.0.2                       log4j-cve-2021-44228-hotpatch.noarch 0:1.3-7.amzn2                      mesa-libEGL.x86_64 0:18.3.4-5.amzn2.0.1
  mesa-libGL.x86_64 0:18.3.4-5.amzn2.0.1                         mesa-libgbm.x86_64 0:18.3.4-5.amzn2.0.1                                 mesa-libglapi.x86_64 0:18.3.4-5.amzn2.0.1
  pango.x86_64 0:1.42.4-4.amzn2                                  pcsc-lite-libs.x86_64 0:1.8.8-7.amzn2                                   pixman.x86_64 0:0.34.0-1.amzn2.0.2
  pkgconfig.x86_64 1:0.27.1-4.amzn2.0.2                          python-javapackages.noarch 0:3.4.1-11.amzn2                             python-lxml.x86_64 0:3.2.1-4.amzn2.0.4
  ttmkfdir.x86_64 0:3.0.9-42.amzn2.0.2                           tzdata-java.noarch 0:2023c-1.amzn2.0.1                                  xorg-x11-font-utils.x86_64 1:7.5-21.amzn2
  xorg-x11-fonts-Type1.noarch 0:7.5-9.amzn2

Complete!

If you’re using Ubuntu, we’ll need to take an additional step: building the aws-kinesis-agent manually and then installing it. Please follow these instructions:

sudo apt install git
git clone https://github.com/awslabs/amazon-kinesis-agent.git
cd amazon-kinesis-agent
sudo ./setup --install

BUILD SUCCESSFUL
Total time: 44 seconds
Configuration file installed at: /etc/aws-kinesis/agent.json
Configuration details:
{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "",

  "flows": [
    {
      "filePattern": "/tmp/app.log*",
      "kinesisStream": "yourkinesisstream",
      "partitionKeyOption": "RANDOM"
    },
    {
      "filePattern": "/tmp/app.log*",
      "deliveryStream": "yourdeliverystream"
    }
  ]
}
Amazon Kinesis Agent is installed successfully.
Your installation has completed!

The aws-kinesis-agent log files are written to /var/log/aws-kinesis-agent.

Control the aws-kinesis-agent service:

sudo service aws-kinesis-agent [start|stop|restart|status]

To start the agent automatically at system startup (on distributions that provide chkconfig, such as Amazon Linux), run:

sudo chkconfig aws-kinesis-agent on

Prepare Logs for ingestion

Create the directory that the agent will monitor:

sudo mkdir /var/log/kinesis

The agent’s configuration lives in /etc/aws-kinesis/:

$ cd /etc/aws-kinesis/
$ ls -l
total 5
drwxr-xr-x 2 root root 1024 Sep  6  2022 agent.d
-rw-r--r-- 1 root root  338 Sep  6  2022 agent.json
-rw-r--r-- 1 root root 2160 Sep  6  2022 log4j.xml

Open agent.json for editing:

sudo nano agent.json

The same agent can serve multiple sources and sinks, but in this setup we use it with Kinesis Firehose only, not with a Kinesis Data Stream. The agent can also emit Amazon CloudWatch metrics for monitoring and troubleshooting the streaming process; here we turn them off by setting cloudwatch.emitMetrics to false. Update the configuration to the following:

{
  "cloudwatch.emitMetrics": false,
  "kinesis.endpoint": "",
  "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
  "awsAccessKeyId": "ADD_KEY_HERE",
  "awsSecretAccessKey": "ADD_SECRET_HERE",
  "flows": [
    {
      "filePattern": "/var/log/kinesis/*.log*",
      "deliveryStream": "terraform-kinesis-firehose-logs-s3-stream"
    }
  ]
}
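
Before restarting the agent, it can help to sanity-check the edited file. The sketch below is not part of the agent: `validate_agent_config` is a hypothetical helper, and the rules it enforces (each flow has a `filePattern` and exactly one of `kinesisStream` or `deliveryStream`) are inferred from the sample configurations in this post rather than taken from an official schema.

```python
import json
from typing import List


def validate_agent_config(text: str) -> List[str]:
    """Return a list of problems found in an agent.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    flows = config.get("flows") if isinstance(config, dict) else None
    if not isinstance(flows, list) or not flows:
        return ["'flows' must be a non-empty list"]

    problems = []
    for index, flow in enumerate(flows):
        if not isinstance(flow, dict):
            problems.append(f"flow {index}: must be an object")
            continue
        if not flow.get("filePattern"):
            problems.append(f"flow {index}: missing 'filePattern'")
        # Assumption: a flow targets either a data stream or a delivery stream.
        destinations = [k for k in ("kinesisStream", "deliveryStream") if k in flow]
        if len(destinations) != 1:
            problems.append(
                f"flow {index}: expected exactly one of 'kinesisStream' or 'deliveryStream'"
            )
    return problems


# The configuration from this post, with placeholder credentials.
SAMPLE = """
{
  "cloudwatch.emitMetrics": false,
  "kinesis.endpoint": "",
  "firehose.endpoint": "firehose.eu-west-1.amazonaws.com",
  "awsAccessKeyId": "ADD_KEY_HERE",
  "awsSecretAccessKey": "ADD_SECRET_HERE",
  "flows": [
    {
      "filePattern": "/var/log/kinesis/*.log*",
      "deliveryStream": "terraform-kinesis-firehose-logs-s3-stream"
    }
  ]
}
"""

print(validate_agent_config(SAMPLE))
```

On the agent host you would pass the contents of /etc/aws-kinesis/agent.json instead of `SAMPLE`; an empty list means none of these basic checks failed, not that the agent is guaranteed to accept the file.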

Update the following parameters in the configuration:

| Param | Description |
| --- | --- |
| firehose.endpoint | Firehose endpoint for the region where the delivery stream is provisioned, for example firehose.eu-west-1.amazonaws.com for eu-west-1 |
| filePattern | Location of the logs you want to monitor and push to the cloud |
| deliveryStream | Name of the Firehose delivery stream as shown in the AWS console, here terraform-kinesis-firehose-logs-s3-stream |
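
These three parameters can also be filled in programmatically. In the sketch below, `build_firehose_config` is a hypothetical helper, and the endpoint pattern is an assumption generalized from the example value `firehose.eu-west-1.amazonaws.com` used in this post; check the endpoint for your own region before relying on it.

```python
import json


def build_firehose_config(region: str, file_pattern: str, delivery_stream: str) -> dict:
    """Build a Firehose-only agent configuration like the one shown in this post."""
    return {
        "cloudwatch.emitMetrics": False,
        "kinesis.endpoint": "",
        # Assumed pattern, based on the eu-west-1 example in this post.
        "firehose.endpoint": f"firehose.{region}.amazonaws.com",
        "flows": [
            {"filePattern": file_pattern, "deliveryStream": delivery_stream},
        ],
    }


config = build_firehose_config(
    region="eu-west-1",
    file_pattern="/var/log/kinesis/*.log*",
    delivery_stream="terraform-kinesis-firehose-logs-s3-stream",
)
print(json.dumps(config, indent=2))
```

The printed JSON can be written to /etc/aws-kinesis/agent.json; credentials are deliberately left out of this helper.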

If you are running aws-kinesis-agent on an EC2 instance, attach an IAM role to the instance instead of putting credentials in the configuration file. If you are running it in an on-premises data center, set awsAccessKeyId and awsSecretAccessKey in the agent’s JSON configuration.
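
The two credential cases can be sketched as a small helper. Everything here is illustrative: `apply_credentials` is a hypothetical function, and reading the keys from the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables is an assumption made for this example, not something the agent requires.

```python
import os
from typing import Mapping


def apply_credentials(config: dict, env: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of config with static keys set only when both are available."""
    result = dict(config)
    key = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    if key and secret:
        # On-premises case: the keys go into the agent configuration.
        result["awsAccessKeyId"] = key
        result["awsSecretAccessKey"] = secret
    else:
        # EC2 case: rely on the attached IAM role and keep no keys in the file.
        result.pop("awsAccessKeyId", None)
        result.pop("awsSecretAccessKey", None)
    return result


on_ec2 = apply_credentials({"flows": []}, env={})
print(sorted(on_ec2))
```

Keeping the keys out of the file whenever an IAM role is available also means there is nothing sensitive to protect in /etc/aws-kinesis/agent.json.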

After a restart (sudo service aws-kinesis-agent restart), the agent monitors the log folder and, as new records are appended, ingests them into the cloud.
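
To see ingestion happen, you need something appending lines to a file that matches the configured pattern. The following sketch writes a few test records; `append_test_records` and the record format are made up for illustration, and it assumes the agent treats each newline-terminated line as one record.

```python
import tempfile
import time
from pathlib import Path


def append_test_records(log_dir: Path, count: int, name: str = "app.log") -> Path:
    """Append `count` newline-terminated test lines to log_dir/name."""
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / name
    with log_file.open("a", encoding="utf-8") as handle:
        for i in range(count):
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            handle.write(f"{stamp} test record {i}\n")
    return log_file


# Demo against a temporary directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    path = append_test_records(Path(tmp), 3)
    print(path.name, len(path.read_text(encoding="utf-8").splitlines()))
```

On the agent host you would pass `Path("/var/log/kinesis")` instead, with permissions that allow writing there; the default file name `app.log` matches the `*.log*` pattern from the configuration.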

Checking the results of aws-kinesis-agent

With the AWS Kinesis Agent installed and configured, you’re ready to stream data into your AWS environment. Once new records are appended to the logs, the agent starts processing them, and you can follow the activity in the agent’s own log:

tail -f /var/log/aws-kinesis-agent/aws-kinesis-agent.log
2023-10-18 16:40:38.902+0000  (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13200218ms
2023-10-18 16:41:08.889+0000  (FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log].MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.tailing.FileTailer [INFO] FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log]: Tailer Progress: Tailer has parsed 30000 records (2519250 bytes), transformed 0 records, skipped 0 records, and has successfully sent 30000 records to destination.
2023-10-18 16:41:08.902+0000  (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13230218ms
2023-10-18 16:41:38.889+0000  (FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log].MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.tailing.FileTailer [INFO] FileTailer[fh:terraform-kinesis-firehose-logs-s3-stream:/var/log/data/*.log]: Tailer Progress: Tailer has parsed 30000 records (2519250 bytes), transformed 0 records, skipped 0 records, and has successfully sent 30000 records to destination.
2023-10-18 16:41:38.902+0000  (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 30000 records parsed (2519250 bytes), and 30000 records sent successfully to destinations. Uptime: 13260218ms
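
The "Agent: Progress" lines above are regular enough to parse if you want to track the counters in a script. This is a minimal sketch whose regular expression is written against the log lines shown in this post; `AgentProgress` and `parse_progress` are hypothetical names, and other agent versions may word the message differently.

```python
import re
from typing import NamedTuple, Optional


class AgentProgress(NamedTuple):
    parsed: int
    parsed_bytes: int
    sent: int
    uptime_ms: int


# Matches only the agent-level summary, not the per-file "Tailer Progress" lines.
_PROGRESS = re.compile(
    r"Agent: Progress: (\d+) records parsed \((\d+) bytes\), "
    r"and (\d+) records sent successfully to destinations\. Uptime: (\d+)ms"
)


def parse_progress(line: str) -> Optional[AgentProgress]:
    """Extract the counters from an 'Agent: Progress' line, or return None."""
    match = _PROGRESS.search(line)
    if match is None:
        return None
    return AgentProgress(*(int(group) for group in match.groups()))


LINE = (
    "2023-10-18 16:40:38.902+0000  (Agent.MetricsEmitter RUNNING) "
    "com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: "
    "30000 records parsed (2519250 bytes), and 30000 records sent "
    "successfully to destinations. Uptime: 13200218ms"
)

progress = parse_progress(LINE)
print(progress)
```

A growing gap between `parsed` and `sent` would be a hint that records are being read but not delivered, which is a reasonable first thing to check when data does not show up downstream.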

Conclusion

The Kinesis Agent is a Java-based application that collects data and sends it to Kinesis Data Streams or Kinesis Data Firehose. It monitors the files you specify and delivers new data to your stream as it appears. Built-in handling of file rotation, checkpointing, and automatic retries on failure makes delivery reliable.

Because the agent runs on the JVM, it’s important to allocate sufficient heap memory. Before running the agent, make sure the machine has enough free resources and review the configuration accordingly.

In the next post, “Connecting Kinesis Firehose DataStream with aws-kinesis-agent to ingest Log Data into AWS S3”, we will configure the cloud side of Kinesis Firehose to receive data from the agent and persist it to S3.

This post is licensed under CC BY 4.0 by the author.