
Caution

You're viewing documentation for a previous version. Switch to the latest stable version.

Integrate Scylla with KairosDB

About KairosDB

KairosDB is a fast, distributed, scalable time-series database. It began as a rewrite of the original OpenTSDB project, but evolved into a different system in which data management, data processing, and visualization are fully separated. When KairosDB introduced native CQL support in version 1.2.0, we created a performance test for KairosDB and Scylla. Through this process, we discovered how easily the two platforms can be integrated. The results are presented here in an example that you can adapt to suit your needs. More information on KairosDB can be found on the KairosDB website.

Benefits of integrating KairosDB with Scylla

A highly available time-series solution requires an efficient, tailored frontend framework and a backend database with a fast ingestion rate. KairosDB provides a simple and reliable way to ingest and retrieve sensor data or metrics, while Scylla provides a reliable, performant, and highly available backend that scales out to store large quantities of time-series data.

Use case for integration

The diagram below shows a typical integration scenario in which several sensors (in this case, GPU temperature sensors) send data to one or more KairosDB nodes. The KairosDB nodes use a Scylla cluster as the backend datastore. A web-based UI is available for interacting with KairosDB.

Figure: Scylla and KairosDB solution

Legend

  1. Scylla cluster

  2. KairosDB nodes

  3. GPU sensors

  4. WebUI for KairosDB

Integration example

Recommendations

The following are recommended when implementing this integration example:

  • It is recommended to deploy KairosDB separately from Scylla, to prevent the databases from competing for resources.

  • Make sure to have sufficient disk space, as KairosDB accumulates data files queued on disk.

  • KairosDB requires Java (and JAVA_HOME setting) as per the procedure here.

Resource list

Although your requirements may be different, this example uses the following resources:

  • Scylla cluster: 3 x i3.8XL instances

  • KairosDB node: m5.2XL instance(s)

  • Loaders (python script emulating the sensors): m5.2XL instance(s)

  • Disk space: 200 GB for the KairosDB nodes

Note that in this case, 200 GB was sufficient, but the disk space you need depends on the size of the workload that the applications send to KairosDB and on the speed at which KairosDB can handle the load and write it to the Scylla backend datastore.
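
As a rough illustration of how disk usage relates to workload, the sketch below estimates how much data accumulates in the on-disk queue while points arrive faster than KairosDB writes them to Scylla. It is a simplified model, not a measurement: the function name, the rates, and the 100 bytes per queued point are all hypothetical values chosen for the example.

```python
def queue_backlog_gb(ingest_rate, drain_rate, seconds, bytes_per_point):
    """Estimate on-disk queue growth (in GB) when ingest outpaces writes to Scylla.

    All inputs are assumptions supplied by the caller; rates are in points/second.
    """
    surplus = max(0, ingest_rate - drain_rate)  # points per second left in the queue
    return surplus * seconds * bytes_per_point / 1e9


# Hypothetical workload: 1.5M points/s arriving, 1.2M points/s written to Scylla,
# sustained for one hour, at roughly 100 bytes per queued point.
print(queue_backlog_gb(1_500_000, 1_200_000, 3600, 100))
```

If the drain rate keeps up with the ingest rate, the estimate is zero, so real sizing should be based on rates measured under your own workload.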

Integration instructions

The commands shown in this procedure may require root user or sudo.

Before you begin

Verify that you have installed Scylla on a different instance/server and that you know the Scylla server IP address.

Procedure

  1. Download KairosDB. This example downloads version 1.2.0.

sudo curl -O --location https://github.com/kairosdb/kairosdb/releases/download/v1.2.0/kairosdb-1.2.0-1.tar.gz
  2. Extract KairosDB.

sudo tar xvzf kairosdb-1.2.0-1.tar.gz
  3. Configure KairosDB to connect to the Scylla server. Using an editor, open the kairosdb/conf/kairosdb.properties file and make the following edits:

    • Comment out the H2 module

      #kairosdb.service.datastore=org.kairosdb.datastore.h2.H2Module
      
    • Uncomment the Cassandra module

      kairosdb.service.datastore=org.kairosdb.datastore.cassandra.CassandraModule
      
    • In the #Cassandra properties section, set the IP addresses of the Scylla nodes

      kairosdb.datastore.cassandra.cql_host_list=[IP1],[IP2]...
      
    • Set the replication factor (for production purposes use a Scylla cluster with a minimum of RF=3)

      kairosdb.datastore.cassandra.replication={'class': 'NetworkTopologyStrategy','replication_factor' : 3}
      
    • Set the read and write consistency level (for production purposes use write=ONE, read=QUORUM)

      kairosdb.datastore.cassandra.read_consistency_level=QUORUM
      # ONE is sufficient for a time-series workload
      kairosdb.datastore.cassandra.write_consistency_level=ONE
      
    • If your Scylla / Cassandra cluster is deployed across multiple data centers, change the local datacenter parameter to match the data center you are using.

      kairosdb.datastore.cassandra.local_datacenter=[your_local_DC_name]
      
    • Set connections per host to match the number of shards that Scylla utilizes. Check the number of shards by running the following command on your Scylla nodes:

      > cat /etc/scylla.d/cpuset.conf
      CPUSET="--cpuset 1-15,17-31"
      

      In this case, Scylla is using 30 CPU threads (out of 32), as one physical core (CPU threads 0 and 16) is dedicated to interrupt handling. Set the following Kairos connections:

      kairosdb.datastore.cassandra.connections_per_host.local.core=30
      kairosdb.datastore.cassandra.connections_per_host.local.max=30
      kairosdb.datastore.cassandra.connections_per_host.remote.core=30
      kairosdb.datastore.cassandra.connections_per_host.remote.max=30
      
    • Set max requests per connection to a smaller value than the default (default = 128). The client only moves to a new connection after it saturates the first, so a smaller value causes it to move to a new connection sooner:

      kairosdb.datastore.cassandra.max_requests_per_connection.local=8
      kairosdb.datastore.cassandra.max_requests_per_connection.remote=8
      
    • Set the Kairos batch size (default = 200) and the minimum batch size (default = 100). Testing found that it is necessary to use a smaller value than the default setting, because a single Scylla shard handling batches can spike to 100% CPU under a heavy load from Kairos, which leads to write timeouts and poor latency. In this example, we found the best performance with a value of 50. When we deployed three Kairos nodes, we divided the load so that each node was set to 15.

      kairosdb.queue_processor.batch_size=50
      kairosdb.queue_processor.min_batch_size=50
      
    • Set the ingest executor thread count (default = 10). In our example, we found 20 to yield the best results.

      kairosdb.ingest_executor.thread_count=20
      
    • Optional: enable TTL for data points by setting the Time to Live value. Once the TTL expires, the data is deleted automatically. If it is not set, the data is not deleted. TTLs are added to columns as they are inserted, so setting the TTL affects only new data, not existing data. Additional TTL parameters are available to use at your discretion (see their explanation in the properties file):

      # Time to live in seconds for data points
      #kairosdb.datastore.cassandra.datapoint_ttl=31536000
      
      kairosdb.datastore.cassandra.align_datapoint_ttl_with_timestamp=false
      
      kairosdb.datastore.cassandra.force_default_datapoint_ttl=false
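
The shard count used for the connections-per-host settings above can be derived from the cpuset line rather than counted by hand. The following is a minimal sketch, assuming the line has the same form as the cpuset.conf output shown earlier; the helper names count_cpus and connection_properties are hypothetical and are not part of KairosDB or Scylla.

```python
def count_cpus(cpuset_line):
    """Count the CPUs listed in a line such as CPUSET="--cpuset 1-15,17-31"."""
    # Keep only the list that follows the --cpuset flag and drop the closing quote.
    spec = cpuset_line.split("--cpuset", 1)[1].strip().strip('"').strip()
    total = 0
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1  # ranges are inclusive
        else:
            total += 1  # a single CPU id
    return total


def connection_properties(shards):
    """Render the four connections_per_host properties for a given shard count."""
    prefix = "kairosdb.datastore.cassandra.connections_per_host"
    return [
        f"{prefix}.{scope}.{kind}={shards}"
        for scope in ("local", "remote")
        for kind in ("core", "max")
    ]


shards = count_cpus('CPUSET="--cpuset 1-15,17-31"')
for line in connection_properties(shards):
    print(line)
```

For the cpuset shown in this example, the count is 30, which matches the values set above.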
      
  4. Use multiple Kairos instances (optional). You might need more than a single KairosDB instance to push more data into Scylla, as there are some limits in the Cassandra client that prevent a single KairosDB instance from pushing faster. To deploy multiple Kairos nodes, shard the clients / sensors, and assign several ingesting clients per Kairos node. Note that in this case, the data is not divided, but each Kairos node is assigned to several clients.

  5. Start the KairosDB process. Change to the bin directory and start KairosDB using one of the following commands:

  • To start KairosDB and run it in the foreground:

    > sudo ./kairosdb.sh run
    
  • To run KairosDB as a background process:

    > sudo ./kairosdb.sh start
    
  • To stop KairosDB when running as a background process:

    > sudo ./kairosdb.sh stop
    
  6. To verify that the KairosDB schema was created properly in your Scylla cluster, connect to one of the Scylla cluster nodes and open the CQL shell:

> cqlsh [node IP]
  7. Check that the keyspace and tables were created (default keyspace = kairosdb):

cqlsh> DESC TABLES
Keyspace kairosdb
----------------
row_keys       data_points    string_index
row_key_index  service_index  row_key_time_index
  8. Check that the ‘kairosdb’ schema exists and verify the keyspace replication factor:

cqlsh> DESC KEYSPACE kairosdb
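
Once the schema is in place, a loader can emulate the sensors by posting data points to KairosDB. The sketch below is not the loader script used in this example. It assumes KairosDB's REST endpoint POST /api/v1/datapoints on the default HTTP port 8080, and the host names, metric name, and tag name are hypothetical. It also shows one way to shard the sensors so that each one is assigned to a single Kairos node, as described in the multiple-instances step.

```python
import json
import time
import urllib.request


def assign_sensors(sensor_ids, kairos_nodes):
    """Assign each sensor to exactly one KairosDB node, round-robin."""
    assignment = {node: [] for node in kairos_nodes}
    for i, sensor in enumerate(sensor_ids):
        assignment[kairos_nodes[i % len(kairos_nodes)]].append(sensor)
    return assignment


def build_payload(sensor_id, temperature, timestamp_ms):
    """Build the JSON body for one data point."""
    return [{
        "name": "gpu.temperature",                   # hypothetical metric name
        "datapoints": [[timestamp_ms, temperature]],  # [epoch milliseconds, value]
        "tags": {"sensor": sensor_id},                # at least one tag is required
    }]


def push(node, payload, port=8080):
    """POST the payload to one KairosDB node; not called in the demo below."""
    request = urllib.request.Request(
        f"http://{node}:{port}/api/v1/datapoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status  # a 204 status is expected on success


if __name__ == "__main__":
    nodes = ["kairos-1", "kairos-2", "kairos-3"]  # hypothetical host names
    sensors = [f"gpu-{n}" for n in range(6)]
    for node, owned in assign_sensors(sensors, nodes).items():
        for sensor in owned:
            payload = build_payload(sensor, 71.5, int(time.time() * 1000))
            # Replace print with push(node, payload) to send the point.
            print(node, json.dumps(payload))
```

The demo only prints the payloads, so it can be run without a KairosDB node.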

Ansible playbook

An Ansible playbook for deploying KairosDB is available on GitHub. It requires that you install Ansible v2.3 or higher and that a Scylla cluster is up and running.

Setup Ansible playbook

Procedure

  1. Set the following variables in the kairosdb_deploy.yml file:

    • Scylla node(s) IP address(es)

    • Number of shards per node that Scylla utilizes (cat /etc/scylla.d/cpuset.conf)

    • KairosDB batch size: when using a single KairosDB instance with Scylla running on i3.8XL instances, set the value to ‘50’. When using multiple KairosDB nodes, or when Scylla runs on smaller instances, use a lower value. If you are using multiple KairosDB nodes, divide the batch size evenly among the nodes.

  2. Run the playbook:

    • Run locally: add ‘localhost ansible_connection=local’ to the /etc/ansible/hosts file

    • Run on remote nodes: add an entry for each node’s IP in the /etc/ansible/hosts file

    • If you want to enable host key checking, change ANSIBLE_HOST_KEY_CHECKING=False to True in the ansible-playbook kairosdb_deploy.yml file.
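
The batch size division described in step 1 is simple arithmetic. The sketch below assumes integer division of the single-node value (50 in this example) by the number of KairosDB nodes; the function name is hypothetical and is not a variable in the playbook.

```python
def per_node_batch_size(single_node_batch_size, kairos_nodes):
    """Divide the single-node batch size evenly among the KairosDB nodes."""
    if kairos_nodes < 1:
        raise ValueError("at least one KairosDB node is required")
    # Integer division rounds down; never go below a batch size of 1.
    return max(1, single_node_batch_size // kairos_nodes)


for nodes in (1, 2, 3):
    size = per_node_batch_size(50, nodes)
    print(f"{nodes} node(s): kairosdb.queue_processor.batch_size={size}")
```

For three nodes this yields 16, while the integration example above used 15, so treat the result as an upper bound and tune downward based on testing.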

© 2025, ScyllaDB. All rights reserved. | Terms of Service | Privacy Policy | ScyllaDB, and ScyllaDB Cloud, are registered trademarks of ScyllaDB, Inc.
Last updated on 09 Apr 2025.