The easiest way to run a Scylla cluster on EC2 is to use the Scylla AMI, which has been Ubuntu-based since ScyllaDB Enterprise 2021.1.0 and ScyllaDB Open Source 4.5 (prior versions are CentOS-based). To use a different OS or your own AMI (Amazon Machine Image), or to set up a multi-DC Scylla cluster, you need to configure the Scylla cluster on your own. This page guides you through this process.
A Scylla cluster on EC2 can be deployed as a single-DC cluster or a multi-DC cluster. The table below describes how to configure parameters in the scylla.yaml file for each node in your cluster for both cluster types.
For more information on Scylla AMI and the configuration of parameters in scylla.yaml from the EC2 user data, see Scylla Machine Image.
The best practice is to use each EC2 region as a Scylla DC. In such a case, nodes communicate using Internal (Private) IPs inside the region and using External (Public) IPs between regions (Data Centers).
For further information, see AWS instance addressing.
| Parameter | Single DC | Multi DC |
|---|---|---|
| seeds | Internal IP address | External IP address |
| listen_address | Internal IP address | Internal IP address |
| rpc_address | Internal IP address | Internal IP address |
| broadcast_address | Internal IP address | External IP address |
| broadcast_rpc_address | Internal IP address | External IP address |
| endpoint_snitch | Ec2Snitch | Ec2MultiRegionSnitch |
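The table can be read as a simple rule: addresses that other regions or clients must reach switch to the external IP in a multi-DC cluster, while the addresses a node binds to stay internal. The following is a minimal Python sketch of that mapping. The function name `ec2_address_settings` and the IP addresses are hypothetical placeholders, and the printed lines are a summary rather than a literal scylla.yaml (for instance, seeds is nested under seed_provider in the real file).

```python
def ec2_address_settings(multi_dc, internal_ip, external_ip,
                         seed_internal_ip, seed_external_ip):
    """Return the scylla.yaml values for one node, following the table above.

    Hypothetical helper for illustration; not part of Scylla.
    """
    # In a multi-DC cluster, addresses advertised to other nodes and to
    # clients use the external (public) IP; otherwise the internal IP.
    advertised_ip = external_ip if multi_dc else internal_ip
    return {
        "seeds": seed_external_ip if multi_dc else seed_internal_ip,
        # Addresses the node binds to are internal in both cluster types.
        "listen_address": internal_ip,
        "rpc_address": internal_ip,
        "broadcast_address": advertised_ip,
        "broadcast_rpc_address": advertised_ip,
        "endpoint_snitch": "Ec2MultiRegionSnitch" if multi_dc else "Ec2Snitch",
    }


# Placeholder addresses for a node in a multi-DC cluster.
settings = ec2_address_settings(
    multi_dc=True,
    internal_ip="10.0.1.15",
    external_ip="203.0.113.15",
    seed_internal_ip="10.0.1.10",
    seed_external_ip="203.0.113.10",
)
for name, value in settings.items():
    print(f"{name}: {value}")
```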
EC2 instance with SSH access.
Ensure that all the relevant ports are open in your EC2 Security Group.
Select a unique name as cluster_name for the cluster (identical for all the nodes in the cluster).
Perform the following steps for each node in the new cluster:
Install Scylla on the node. See Getting Started for installation instructions and follow the procedure up to the scylla.yaml configuration phase.
If the Scylla service is already running (for example, if you are using Scylla AMI), stop it before moving to the next step by using these instructions.
Edit the parameters listed below in the scylla.yaml file located in /etc/scylla/. See the EC2 Configuration Table above on how to configure your cluster.
cluster_name - Set the selected cluster_name.
seeds - IP address of the first node in the cluster. See Scylla Seed Nodes for details.
listen_address - IP address that Scylla uses to connect to other Scylla nodes in the cluster.
endpoint_snitch - Set the selected snitch.
rpc_address - Address for client connection (Thrift, CQL).
broadcast_address - The IP address a node tells other nodes in the cluster to contact it by.
broadcast_rpc_address - Default: unset. The RPC address to broadcast to drivers and other Scylla nodes. It cannot be set to 0.0.0.0. If left blank, it will be set to the value of rpc_address. If rpc_address is set to 0.0.0.0, broadcast_rpc_address must be explicitly configured.
consistent_cluster_management - true by default; can be set to false if you do not want to use Raft for consistent schema management in this cluster (Raft will be mandatory in later versions). Check the Raft in ScyllaDB document to learn more.
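The interaction between rpc_address and broadcast_rpc_address is easy to get wrong, so it may help to see the rule spelled out. The following Python sketch only restates the rule described above. It is not Scylla's own validation code, and the function name `effective_broadcast_rpc_address` is a hypothetical one chosen for illustration.

```python
def effective_broadcast_rpc_address(rpc_address, broadcast_rpc_address=None):
    """Return the RPC address a node would advertise, per the rules above.

    Hypothetical helper for illustration; not part of Scylla.
    """
    # broadcast_rpc_address itself may never be the wildcard address.
    if broadcast_rpc_address == "0.0.0.0":
        raise ValueError("broadcast_rpc_address cannot be set to 0.0.0.0")
    # An explicit value always wins.
    if broadcast_rpc_address:
        return broadcast_rpc_address
    # With rpc_address set to the wildcard there is nothing to fall back to.
    if rpc_address == "0.0.0.0":
        raise ValueError(
            "rpc_address is 0.0.0.0, so broadcast_rpc_address "
            "must be explicitly configured"
        )
    # Left blank: fall back to the value of rpc_address.
    return rpc_address


# Placeholder addresses.
print(effective_broadcast_rpc_address("10.0.1.15"))
print(effective_broadcast_rpc_address("0.0.0.0", "203.0.113.15"))
```

With the placeholder values above, the first call falls back to rpc_address and the second returns the explicitly configured address.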
After you have installed and configured Scylla and edited the scylla.yaml file on all the nodes, start the node specified with the seeds parameter. Then start the rest of the nodes in your cluster, one at a time, using sudo systemctl start scylla-server.
Verify that the node has been added to the cluster using nodetool status.
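When nodes are started one at a time, it can be useful to check the nodetool status output programmatically before starting the next node. The sketch below is one possible approach, assuming the usual output convention in which each node line starts with a two-letter code (U or D for up or down, then N, L, J, or M for normal, leaving, joining, or moving) followed by the node address. The function names and the sample text are hypothetical; the sample is hand-written and abridged, not captured from a real cluster.

```python
import re

# A node line starts with a two-letter status code followed by the address.
STATUS_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def node_states(nodetool_output):
    """Map each node address to its two-letter status code."""
    states = {}
    for line in nodetool_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            state, address = match.groups()
            states[address] = state
    return states


def all_up_normal(nodetool_output, expected_addresses):
    """Return True only if every expected node is reported as UN."""
    states = node_states(nodetool_output)
    return all(states.get(address) == "UN" for address in expected_addresses)


# Hand-written, abridged sample; real output has more columns.
SAMPLE = """\
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load
UN  10.0.1.10  112.82 KB
UJ  10.0.1.15  88.10 KB
"""

print(node_states(SAMPLE))
print(all_up_normal(SAMPLE, ["10.0.1.10", "10.0.1.15"]))
```

In this sample the second node is still joining (UJ), so the check reports that the cluster is not yet ready for the next node to be started.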
Ec2Snitch and Ec2MultiRegionSnitch give each DC and rack default names. The region name is defined as the datacenter name, and availability zones are defined as racks within a datacenter. The rack names cannot be changed.
For a node in the us-east-1 region, us-east is the datacenter name and 1 is the rack.
To change the name of the datacenter, open the cassandra-rackdc.properties file located in /etc/scylla/ and edit the DC name.
The dc_suffix defines a suffix added to the datacenter name. For example:
For region us-east and suffix dc_suffix=_1_scylla, it will be us-east_1_scylla.
For region us-west and suffix dc_suffix=_1_scylla, it will be us-west_1_scylla.
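The naming rule above amounts to appending the dc_suffix value to the datacenter name that the snitch derives from the region. The Python sketch below mirrors that rule on the text of a cassandra-rackdc.properties file. The function names are hypothetical, and the parsing is a simplified key=value reader written for illustration, not a description of how Scylla reads the file.

```python
def read_dc_suffix(properties_text):
    """Return the dc_suffix value from properties text, or '' if absent."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() == "dc_suffix":
            return value.strip()
    return ""


def datacenter_name(snitch_dc, properties_text):
    """Append the configured suffix to the snitch-derived datacenter name."""
    return snitch_dc + read_dc_suffix(properties_text)


print(datacenter_name("us-east", "dc_suffix=_1_scylla"))  # us-east_1_scylla
print(datacenter_name("us-west", "dc_suffix=_1_scylla"))  # us-west_1_scylla
```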