Benchmark init commit
dcvan24 committed Oct 22, 2018
1 parent 03a4a53 commit a51eb67
Showing 13 changed files with 501 additions and 0 deletions.
1 change: 1 addition & 0 deletions benchmark/.gitignore
@@ -0,0 +1 @@
.idea/
83 changes: 83 additions & 0 deletions benchmark/README.md
@@ -0,0 +1,83 @@
Shared Filesystem Benchmark
===========================
Shared filesystems are commonly used by distributed containers to share data and state.

The benchmark aims at testing various shared filesystem solutions for (geo-)distributed containers
and finding the best one in terms of *read/write throughput*, *responsiveness* and *scalability*.


### Usage pattern

The usage pattern of the shared filesystem by distributed containers is characterized as follows:

- Must be POSIX-compliant
- Frequent small random reads/writes, usually more reads than writes
- Occasional bulk data transfers, typically ranging from ~10 GB to ~1000 GB
- Concurrent reads/writes by multiple clients. The concurrency ranges from 2-3 to ~100.

### Performance definition and metrics

Based on the scenario and usage pattern, we define the performance metrics for evaluating a
solution. In general, we evaluate the following aspects of every solution.

#### Throughput

We measure throughput to evaluate the performance of a filesystem in *bulk data transfers*.
Specifically, we measure both *read* and *write* throughput. We define data transfers of GBs of data
as bulk data transfers.
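
As a rough illustration of how such a throughput figure (MB/s) can be obtained, the sketch below writes and then reads back one large file sequentially and divides the data volume by the elapsed time. It is only a sketch and not a replacement for fio or iozone: the function name `measure_throughput` and its parameters are hypothetical, the demo runs against a temporary directory instead of a real shared mount, and the read figure may be inflated by the page cache.

```python
import os
import tempfile
import time


def measure_throughput(directory, size_mb=16, block_kb=1024):
    """Return (write_mb_per_s, read_mb_per_s) for one sequential file.

    `directory` is assumed to be on the filesystem under test.
    """
    block_bytes = block_kb * 1024
    block = os.urandom(block_bytes)
    n_blocks = max(1, (size_mb * 1024) // block_kb)
    total_mb = n_blocks * block_bytes / (1024 * 1024)
    path = os.path.join(directory, "throughput.bin")

    # Sequential write; fsync so the time includes reaching the storage backend.
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    write_s = max(time.perf_counter() - start, 1e-9)

    # Sequential read of the same file (may be served from the page cache).
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_bytes):
            pass
    read_s = max(time.perf_counter() - start, 1e-9)

    os.remove(path)
    return total_mb / write_s, total_mb / read_s


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount point here.
    with tempfile.TemporaryDirectory() as mount:
        write_mbps, read_mbps = measure_throughput(mount, size_mb=8)
        print(f"write: {write_mbps:.1f} MB/s, read: {read_mbps:.1f} MB/s")
```

For a real measurement the file should be much larger than the client's memory, so that caching does not dominate the result.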

#### Responsiveness

Responsiveness is defined as the average time to complete an operation on a filesystem. We
measure responsiveness to evaluate the performance of a filesystem in *small random read/write
operations*, which typically make up the majority of operations on a filesystem. The operations
include file *creation*, *read*, *write* and *deletion*.

Responsiveness also reflects the overhead of data access in a shared filesystem: with a large
number of operations, the delay incurred by each operation can accrue and become a significant
portion of the end-to-end runtime of the application performing the operations.
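
To make the definition concrete, the following sketch times the four operations (creation, write, read, deletion) on many small files and reports the mean latency of each in milliseconds. It is a simplified stand-in for a tool such as smallfile: the names `measure_latency`, `n_files` and `file_size` are hypothetical, and whole-file I/O on sequentially named files is used instead of truly random access.

```python
import os
import statistics
import tempfile
import time


def _timed_ms(operation, *args):
    """Run operation(*args) and return its wall-clock duration in ms."""
    start = time.perf_counter()
    operation(*args)
    return (time.perf_counter() - start) * 1000.0


def _create(path):
    # Mode "x" fails if the file already exists, so this times a real creation.
    open(path, "xb").close()


def _write(path, payload):
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())


def _read(path):
    with open(path, "rb") as f:
        f.read()


def measure_latency(directory, n_files=200, file_size=4096):
    """Return the mean latency (ms) of create/write/read/delete operations."""
    payload = os.urandom(file_size)
    samples = {"create": [], "write": [], "read": [], "delete": []}
    for i in range(n_files):
        path = os.path.join(directory, f"small_{i}.dat")
        samples["create"].append(_timed_ms(_create, path))
        samples["write"].append(_timed_ms(_write, path, payload))
        samples["read"].append(_timed_ms(_read, path))
        samples["delete"].append(_timed_ms(os.remove, path))
    return {op: statistics.mean(values) for op, values in samples.items()}


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount point here.
    with tempfile.TemporaryDirectory() as mount:
        for op, mean_ms in measure_latency(mount, n_files=100).items():
            print(f"{op:>6}: {mean_ms:.3f} ms")
```

Multiplying a mean latency by the number of operations an application performs gives a first estimate of how much of its runtime is spent on filesystem overhead.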

#### Scalability

Scalability is reflected by 1) *the ability to grow/shrink the storage capacity* and 2)
*the ability to serve multiple distributed clients performing concurrent reads/writes*.

We evaluate 1) for every filesystem qualitatively by checking whether it allows dynamic
growing/shrinking of storage capacity and how simple it is to scale up and down.

We evaluate 2) by measuring the throughput and responsiveness of the target filesystem as the number
of clients increases.
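
The client sweep for 2) can be sketched as follows: run the same small-file workload with an increasing number of concurrent clients and record aggregate operations per second and mean latency at each step. This is only an approximation under stated assumptions: threads in one process stand in for distributed clients, and `client_workload`, `scaling_sweep` and the client counts are hypothetical names and values chosen for illustration.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def client_workload(directory, client_id, n_ops=50, file_size=4096):
    """One simulated client: write, read and delete n_ops small files.

    Returns the per-operation latencies in seconds.
    """
    payload = os.urandom(file_size)
    latencies = []
    for i in range(n_ops):
        path = os.path.join(directory, f"client{client_id}_{i}.dat")
        start = time.perf_counter()
        with open(path, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            f.read()
        os.remove(path)
        latencies.append(time.perf_counter() - start)
    return latencies


def scaling_sweep(directory, client_counts=(1, 2, 4, 8)):
    """Run the workload at each client count and summarize the results."""
    results = {}
    for n_clients in client_counts:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n_clients) as pool:
            per_client = list(
                pool.map(lambda cid: client_workload(directory, cid),
                         range(n_clients))
            )
        elapsed = max(time.perf_counter() - start, 1e-9)
        latencies = [lat for client in per_client for lat in client]
        results[n_clients] = {
            "ops_per_s": len(latencies) / elapsed,
            "mean_latency_ms": 1000.0 * sum(latencies) / len(latencies),
        }
    return results


if __name__ == "__main__":
    # A temporary directory stands in for the shared mount point here.
    with tempfile.TemporaryDirectory() as mount:
        for n, stats in scaling_sweep(mount).items():
            print(f"{n:>3} clients: {stats['ops_per_s']:.0f} ops/s, "
                  f"{stats['mean_latency_ms']:.3f} ms mean latency")
```

A filesystem scales well in this sense if the aggregate rate keeps growing and the mean latency stays roughly flat as clients are added.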

#### Performance metrics

With performance defined as above, we mainly use the following metrics to quantify the
performance of a filesystem:

- Read/write throughput (MB/s)
- Operation latency (ms)
- Maximum number of concurrent clients (without crashing the filesystem or causing significant
performance drop)
- Steps needed for growing/shrinking storage capacity

### [Benchmark Tools](tools/)

- Small-file I/O
  - [smallfile](https://github.com/distributed-system-analysis/smallfile)
- Large-file I/O
  - [fio](http://freshmeat.sourceforge.net/projects/fio)
  - [iozone](http://www.iozone.org/)

### Shared filesystem solutions

The filesystem solutions to be evaluated are listed below:

- NFSv4 (baseline)
- GlusterFS
- CephFS
- Ceph RBD + NFSv4
- GlusterFS + NFS-Ganesha

In addition, some solutions provide multiple configurations (not for performance tuning) for
different use cases, which are likely to impact performance, *e.g.*, the
[GlusterFS volume type](https://docs.gluster.org/en/v3/Administrator%20Guide/Setting%20Up%20Volumes/)
and the distribution of backend Ceph OSDs.
30 changes: 30 additions & 0 deletions benchmark/ansible/group_vars/all.yml
@@ -0,0 +1,30 @@
ansible_ssh_user: centos
ansible_ssh_public_key_file: ~/.ssh/id_rsa.pub
ansible_ssh_private_key_file: ~/.ssh/id_rsa

project: helium-dc
image: ubuntu-1804
machine_type: n1-standard-2
disks:
  - name: boot
    type: pd-ssd
    size: 50
    delete_on_termination: true
    mode: READ_WRITE

max_retries: 5
supernet: 10.52.0.0/16
zone:
  - name: us-east1-b
    start: 1
    end: 1
#  - name: us-east4
#    count: 2
#  - name: us-central1
#    count: 3
#  - name: us-west1
#    count: 4
#  - name: northamerica-northeast1
#    count: 5


@@ -0,0 +1,22 @@
- name: Create attached disk(s) for host(s) at {{ _zone.name }}
  gce_pd:
    name: benchmark-{{ _zone.name }}-{{ _seq }}-{{ item.1.name }}
    instance_name: benchmark-{{ _zone.name }}-{{ _seq }}
    zone: "{{ _zone.name }}"
    disk_type: "{{ item.1.type }}"
    size_gb: "{{ item.1.size }}"
    mode: "{{ item.1.mode }}"
    delete_on_termination: "{{ item.1.delete_on_termination }}"
  with_indexed_items: "{{ disks[1:] }}"
  register: _create_attached_disks
  async: 600
  poll: 0

- name: Wait for attached disk creation to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _attached_disks
  until: _attached_disks.finished
  with_items: "{{ _create_attached_disks.results }}"
  delay: 6
  retries: 10
64 changes: 64 additions & 0 deletions benchmark/ansible/roles/create_hosts/tasks/create_hosts.yml
@@ -0,0 +1,64 @@
- name: Create boot disk(s) for host(s) at {{ _zone.name }}
  gce_pd:
    name: benchmark-{{ _zone.name }}-{{ item }}-{{ disks.0.name }}
    zone: "{{ _zone.name }}"
    disk_type: "{{ disks.0.type }}"
    size_gb: "{{ disks.0.size }}"
    image: "{{ image }}"
    mode: "{{ disks.0.mode }}"
  with_sequence: start={{ _zone.start }} end={{ _zone.end }}
  register: _create_boot_disk
  async: 600
  poll: 0

- name: Wait for boot disk creation to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _boot_disk
  until: _boot_disk.finished
  with_items: "{{ _create_boot_disk.results }}"
  delay: 6
  retries: 10

- name: Create {{ _zone.end - _zone.start + 1 }} host(s) at {{ _zone.name }}
  gce:
    name: benchmark-{{ _zone.name }}-{{ item }}
    zone: "{{ _zone.name }}"
    machine_type: "{{ machine_type }}"
    network: "{{ project }}-vpc-{{ _zone.name.split('-')[:2] | join('-') }}"
    subnetwork: "{{ project }}-subnet-{{ _zone.name.split('-')[:2] | join('-') }}"
    state: present
    metadata:
      block-project-ssh-keys: true
      ssh-keys: "{{ ansible_ssh_user }}:{{ lookup('file', ansible_ssh_public_key_file) }}"
    disk_auto_delete: "{{ disks.0.delete_on_termination }}"
    disks:
      - name: benchmark-{{ _zone.name }}-{{ item }}-{{ disks.0.name }}
        mode: READ_WRITE
  register: _created_instances
  with_sequence: start={{ _zone.start }} end={{ _zone.end }}
  async: 600
  poll: 0

- name: Wait for host creation to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: hosts
  until: hosts.finished
  with_items: "{{ _created_instances.results }}"
  delay: 6
  retries: 10

- include_tasks: create_attached_disks.yml
  with_sequence: start={{ _zone.start }} end={{ _zone.end }}
  loop_control:
    loop_var: _seq

- name: Assign hosts to the benchmark group
  add_host:
    hostname: "{{ item.1.public_ip }}"
    groupname:
      - benchmark
  with_subelements:
    - "{{ hosts.results }}"
    - instance_data
35 changes: 35 additions & 0 deletions benchmark/ansible/roles/create_hosts/tasks/main.yml
@@ -0,0 +1,35 @@
- include_tasks: create_hosts.yml
  with_items: "{{ zone }}"
  loop_control:
    loop_var: _zone

- name: Wait for host creation to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: hosts
  until: hosts.finished
  with_items: "{{ _created_instances.results }}"
  delay: 6
  retries: 10

- name: Check SSH
  wait_for:
    host: "{{ item }}"
    port: 22
    delay: 5
    timeout: 300
  when: >
    'benchmark' in groups
  with_items: "{{ groups.benchmark }}"
  register: _check_ssh
  async: 300
  poll: 0

- name: Wait for SSH checking results
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _job
  with_items: "{{ _check_ssh.results }}"
  until: _job.finished
  delay: 5
  retries: 10
31 changes: 31 additions & 0 deletions benchmark/ansible/roles/install_common_packages/tasks/main.yml
@@ -0,0 +1,31 @@
- name: Install Docker dependencies
  apt:
    name:
      - apt-transport-https
      - ca-certificates
      - curl
      - software-properties-common
    update_cache: true
    autoremove: true

- name: Add Docker CE official GPG key
  apt_key:
    url: https://download.docker.com/linux/ubuntu/gpg
    state: present

- name: Add Docker CE repository
  apt_repository:
    repo: deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable
    state: present

- name: Install Docker CE
  apt:
    name: docker-ce
    update_cache: true

- name: Add sudo user to "docker" group
  user:
    name: "{{ ansible_ssh_user }}"
    groups: docker
    append: true
  become: true
15 changes: 15 additions & 0 deletions benchmark/ansible/setup.yml
@@ -0,0 +1,15 @@
- hosts: localhost
  roles:
    - create_hosts

- hosts: benchmark
  gather_facts: false
  become: true
  tasks:
    - name: Install Python
      raw: apt-get update && apt-get install -y python

- hosts: benchmark
  become: true
  roles:
    - install_common_packages