Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: docker compose for dagger quickstart #206

Merged
merged 31 commits into from
Oct 31, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
8cf979a
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 19, 2022
1ca951e
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 20, 2022
7b4265d
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 20, 2022
1c4f4d0
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 20, 2022
8302d01
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
605ddb0
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
aff2087
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
e7d0730
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
50aa59b
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
df3df43
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
57c0d65
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
23d5660
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
dd566aa
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
73163be
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
f0f6241
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
cf223e8
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
2fd5d8e
docs: docker-compose setup for dagger quickstart
sumitaich1998 Oct 21, 2022
280143b
fix: fixed review comments
sumitaich1998 Oct 27, 2022
4526bd3
fix: fixed review comments
sumitaich1998 Oct 27, 2022
33d7f26
fix: fixed review comments
sumitaich1998 Oct 27, 2022
bc4cb38
fix: fixed review comments
sumitaich1998 Oct 27, 2022
8f5aded
fix: fixed review comments
sumitaich1998 Oct 28, 2022
89db527
fix: fixed review comments
sumitaich1998 Oct 28, 2022
b4e18c7
fix: fixed review comments
sumitaich1998 Oct 28, 2022
6fec5a1
fix: fixed review comments
sumitaich1998 Oct 28, 2022
d2ba792
fix: fixed review comments
sumitaich1998 Oct 28, 2022
ff08e77
fix: fixed review comments
sumitaich1998 Oct 28, 2022
7aa5d0f
fix: fixed review comments
sumitaich1998 Oct 28, 2022
2b45a2e
fix: fixed review comments
sumitaich1998 Oct 28, 2022
1899fa6
fix: fixed review comments
sumitaich1998 Oct 28, 2022
423ddcd
fix: fixed review comments
sumitaich1998 Oct 31, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ Explore the following resources to get started with Dagger:

## Running locally

Please follow this [Dagger Quickstart Guide](https://odpf.github.io/dagger/docs/guides/quickstart) for setting up a local running Dagger consuming from Kafka.
Please follow this [Dagger Quickstart Guide](https://odpf.github.io/dagger/docs/guides/quickstart) for setting up a local running Dagger consuming from Kafka or to set up a Docker Compose for Dagger.

**Note:** Sample configuration for running a basic dagger can be found [here](https://odpf.github.io/dagger/docs/guides/create_dagger#common-configurations). For detailed configurations, refer [here](https://odpf.github.io/dagger/docs/reference/configuration).

Expand Down
29 changes: 28 additions & 1 deletion docs/docs/guides/quickstart.md
Original file line number Diff line number Diff line change
Expand Up @@ -64,4 +64,31 @@ After some initialization logs, you should see the output of the SQL query getti

2. **I see an exception `java.lang.RuntimeException: Unable to retrieve any partitions with KafkaTopicsDescriptor: Topic Regex Pattern`**

This can happen if the topic configured under `STREAMS` -> `SOURCE_KAFKA_TOPIC_NAMES` in `local.properties` is new and you have not pushed any messages to it yet. Ensure that you have pushed atleast one message to the topic before you start dagger.
This can happen if the topic configured under `STREAMS` -> `SOURCE_KAFKA_TOPIC_NAMES` in `local.properties` is new and you have not pushed any messages to it yet. Ensure that you have pushed atleast one message to the topic before you start dagger.

## Docker Compose Setup

### Prerequisites

1. **You must have docker installed**

Following are the steps for setting up dagger in docker compose -
1. Clone Dagger repository into your local

```shell
git clone https://github.com/odpf/dagger.git
```
2. cd into the docker-compose directory:
```shell
cd dagger/quickstart/docker-compose
```
3. fire this command to spin up the docker compose:
```shell
docker compose up
```
This will spin up docker containers for the kafka, zookeeper, stencil, kafka-producer and the dagger.
4. fire this command to gracefully close the docker compose:
```shell
docker compose down
```

126 changes: 126 additions & 0 deletions quickstart/docker-compose/compose.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
services:

zookeeper:
image: confluentinc/cp-zookeeper:6.1.1
ports:
- "2187:2187"
expose:
- '21871'
environment:
ZOOKEEPER_CLIENT_PORT: 2187

# reachable on 9094 from the host and on 29094 from inside docker compose
kafka:
image: confluentinc/cp-kafka:6.1.1
depends_on:
- zookeeper
healthcheck:
test: [ "CMD-SHELL", " kafka-topics --version" ]
interval: 10s
timeout: 5s
retries: 5000
ports:
- '9094:9094'
expose:
- '29094'
environment:
KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2187'
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29094,PLAINTEXT_HOST://localhost:9094
KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: '1'
KAFKA_MIN_INSYNC_REPLICAS: '1'

init-kafka:
image: confluentinc/cp-kafka:6.1.1
depends_on:
kafka:
condition: service_healthy
entrypoint: [ '/bin/sh', '-c' ]
command: |
"
# blocks until kafka is reachable

echo -e 'Creating kafka topics'
kafka-topics --bootstrap-server kafka:29094 --create --if-not-exists --topic dagger-test-topic-v1 --replication-factor 1 --partitions 1

echo -e 'Successfully created the following topics:'
kafka-topics --bootstrap-server kafka:29094 --list
"

stencil:
image: amd64/ubuntu:latest
volumes:
- ${PWD}/resources:/resources
ports:
- '2917:2917'
expose:
- '2917'
entrypoint: [ '/bin/sh', '-c' ]
command: |
"
yes Y | apt-get update
yes Y | apt install software-properties-common
yes Y | add-apt-repository universe
yes Y | apt-get update
yes Y | apt-get install protobuf-compiler=3.12.4-1ubuntu7
yes Y | apt-get install python3
cd resources
protoc --descriptor_set_out=file.desc --include_imports *.proto
yes Y | apt-get install python3-pip
python3 -m pip install simple_http_server
python3 -m http.server 2917
"


kafkaproducer:
image: amd64/ubuntu:latest
depends_on:
- init-kafka
volumes:
- ${PWD}/resources:/resources
entrypoint: [ '/bin/sh', '-c' ]
command: |
"
cd resources
yes Y | apt-get update
yes Y | apt install software-properties-common
yes Y | add-apt-repository universe
yes Y | apt-get update
yes Y | apt-get install protobuf-compiler=3.12.4-1ubuntu7
yes Y | apt-get install kafkacat=1.6.0-1
yes Y | apt-get install pwgen
echo -e 'Sending message to Kafka topic'
chmod 777 kafkafeeder.sh
while :
do
./kafkafeeder.sh
kafkacat -P -b kafka:29094 -D "\n" -T -t dagger-test-topic-v1 message.bin
sleep 1
rm message.bin
done
"

dagger:
image: amd64/ubuntu:latest
depends_on:
- init-kafka
- stencil
volumes:
- ${PWD}/resources:/resources
entrypoint: [ '/bin/sh', '-c' ]
command: |
"
yes Y | apt-get update
yes Y | apt install software-properties-common
yes Y | add-apt-repository universe
yes Y | apt-get update
yes Y | apt install git
yes Y | apt install openjdk-8-jdk
yes Y | apt install gradle
/var/lib/dpkg/info/ca-certificates-java.postinst configure
git clone https://github.com/odpf/dagger
cp /resources/local.properties /dagger/dagger-core/env/
cd dagger
./gradlew runFlink
"
99 changes: 99 additions & 0 deletions quickstart/docker-compose/resources/TestLogMessage.proto
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
syntax = "proto3";

package io.odpf.dagger.consumer;

option java_multiple_files = true;
option java_package = "io.odpf.dagger.consumer";
option java_outer_classname = "TestLogMessageProto";

import "google/protobuf/struct.proto";
import "google/protobuf/timestamp.proto";

message TestBookingLogMessage {
TestServiceType.Enum service_type = 1;
string order_number = 2;
string order_url = 3;
google.protobuf.Timestamp event_timestamp = 5;
string customer_id = 6;
string customer_url = 7;
string driver_id = 8;
string driver_url = 9;

string activity_source = 11;
string service_area_id = 12;

float amount_paid_by_cash = 16;

TestLocation driver_pickup_location = 26;
TestLocation driver_dropoff_location = 27;

string customer_email = 28;
string customer_name = 29;
string customer_phone = 30;

string driver_email = 31;
string driver_name = 32;
string driver_phone = 33;

int32 cancel_reason_id = 36;
string cancel_reason_description = 37;

google.protobuf.Timestamp booking_creation_time = 41;

float total_customer_discount = 40;
float gopay_customer_discount = 42;
float voucher_customer_discount = 43;

google.protobuf.Timestamp pickup_time = 44;
float driver_paid_in_cash = 45;
float driver_paid_in_credit = 46;

int64 customer_total_fare_without_surge = 52;
bool customer_dynamic_surge_enabled = 55;

int64 driver_total_fare_without_surge = 56;
bool driver_dynamic_surge_enabled = 59;
repeated string meta_array = 60;

google.protobuf.Struct profile_data = 61;
google.protobuf.Struct event_properties = 62;
google.protobuf.Struct key_values = 63;

double cash_amount = 64;

repeated int32 int_array_field = 65;

map<string, string> metadata = 66;

double customer_price = 70;

repeated bool boolean_array_field = 71;

repeated double double_array_field = 72;

repeated float float_array_field = 73;

repeated int64 long_array_field = 74;
}

message TestServiceType {
enum Enum {
UNKNOWN = 0;
FLIGHT = 1;
BUS = 2;
TRAIN = 3;
}
}

message TestLocation {
string name = 1;
string address = 2;
double latitude = 3;
double longitude = 4;
string type = 5;
string note = 6;
string place_id = 7;
float accuracy_meter = 8;
string gate_id = 9;
}

8 changes: 8 additions & 0 deletions quickstart/docker-compose/resources/kafkafeeder.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
#!/bin/bash
timestamp_now=$(date +%s)
random_3char_suffix=$(pwgen 3 1)
random_enum_index=$(($RANDOM %3))
declare -a myArray=("FLIGHT" "BUS" "TRAIN")
cat sample_message.txt | \
sed "s/replace_timestamp_here/$timestamp_now/g; s/replace_service_type_here/${myArray[$random_enum_index]}/g; s/replace_customer_suffix_here/$random_3char_suffix/g" | \
protoc --proto_path=./ --encode=io.odpf.dagger.consumer.TestBookingLogMessage ./TestLogMessage.proto > message.bin
33 changes: 33 additions & 0 deletions quickstart/docker-compose/resources/local.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# == Query ==
FLINK_SQL_QUERY=SELECT count(1) as booking_count, TUMBLE_END(rowtime, INTERVAL '30' SECOND) AS window_timestamp from `data_stream_0` GROUP BY TUMBLE (rowtime, INTERVAL '30' SECOND)
FLINK_WATERMARK_INTERVAL_MS=10000
FLINK_WATERMARK_DELAY_MS=1000
# == Input Stream ==
STREAMS=[{"SOURCE_KAFKA_TOPIC_NAMES":"dagger-test-topic-v1","INPUT_SCHEMA_TABLE":"data_stream_0","INPUT_SCHEMA_PROTO_CLASS":"io.odpf.dagger.consumer.TestBookingLogMessage","INPUT_SCHEMA_EVENT_TIMESTAMP_FIELD_INDEX":"5","SOURCE_KAFKA_CONSUMER_CONFIG_BOOTSTRAP_SERVERS":"kafka:29094","SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_COMMIT_ENABLE":"false","SOURCE_KAFKA_CONSUMER_CONFIG_AUTO_OFFSET_RESET":"latest","SOURCE_KAFKA_CONSUMER_CONFIG_GROUP_ID":"dagger-test-topic-cgroup-v1","SOURCE_KAFKA_NAME":"local-kafka-stream","SOURCE_DETAILS":[{"SOURCE_TYPE":"UNBOUNDED","SOURCE_NAME":"KAFKA_CONSUMER"}]}]

# == Preprocessor ==
PROCESSOR_PREPROCESSOR_ENABLE=false
PROCESSOR_PREPROCESSOR_CONFIG={}

# == Postprocessor ==
PROCESSOR_POSTPROCESSOR_ENABLE=false
PROCESSOR_POSTPROCESSOR_CONFIG={}

# == Sink ==
SINK_TYPE=log

# == Stencil ==
SCHEMA_REGISTRY_STENCIL_ENABLE=true
SCHEMA_REGISTRY_STENCIL_URLS=http://stencil:2917/file.desc

# == Telemetry ==
METRIC_TELEMETRY_SHUTDOWN_PERIOD_MS=10000
METRIC_TELEMETRY_ENABLE=true

# == Others ==
FUNCTION_FACTORY_CLASSES=io.odpf.dagger.functions.udfs.factories.FunctionFactory
FLINK_ROWTIME_ATTRIBUTE_NAME=rowtime

# == Python Udf ==
PYTHON_UDF_ENABLE=false
PYTHON_UDF_CONFIG={"PYTHON_FILES":"/path/to/files.zip", "PYTHON_REQUIREMENTS": "requirements.txt", "PYTHON_FN_EXECUTION_BUNDLE_SIZE": "1000"}
14 changes: 14 additions & 0 deletions quickstart/docker-compose/resources/sample_message.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
service_type: replace_service_type_here
order_number: "ynnhjv45"
event_timestamp {
seconds: replace_timestamp_here
}
customer_id: "customer-replace_customer_suffix_here"
driver_pickup_location {
latitude: 19.075983
longitude: 72.877655
}
driver_dropoff_location {
latitude: 19.237188
longitude: 72.844139
}