This repository was archived by the owner on Dec 15, 2025. It is now read-only.

Commit 450990f ("doc update"), 1 parent: a29ff2d

1 file changed: 10 additions, 13 deletions


README.md

Lines changed: 10 additions & 13 deletions
@@ -184,7 +184,7 @@ Note:
 Sometimes you may need to clean up the data inside ZooKeeper. First stop the server, then run `rm -rf /path/to/zookeeper/datadir` to clean the data directory. The directory is defined in your config file.
-2. Kafka setup
+3. Kafka setup
 When configuring Kafka and the topic count, we need to ensure the disk won't become a bottleneck. We suggest starting several brokers on each Kafka node and assigning several disks to each broker. Different brokers on the same node may share disks but must keep their own directories on each disk. Our topic partition count is 16 per Kafka node; that is, if the Kafka cluster contains only 1 node, we create topics with 16 partitions, and for an environment with 3 Kafka nodes we create topics with 48 partitions.
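The ZooKeeper cleanup step above can be sketched as a short shell session. The paths here are throwaway stand-ins created just for the demo; a real run would stop the server with `zkServer.sh stop` and remove the actual dataDir named in your config file:

```shell
# Stand-in for the ZooKeeper dataDir from your config file; a real run
# would use that path instead of a temp directory.
ZK_DATADIR=$(mktemp -d)
mkdir -p "$ZK_DATADIR/version-2"   # ZooKeeper keeps snapshots/logs here

# 1. Stop the server first (shown only as a comment, since no
#    ZooKeeper install is assumed in this sketch):
#    "$ZK_HOME/bin/zkServer.sh" stop

# 2. Then wipe the data directory:
rm -rf "$ZK_DATADIR"

[ -d "$ZK_DATADIR" ] || echo "data dir cleaned"
```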

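The partition-count rule above (16 partitions per Kafka node) is simple arithmetic; a sketch, with the topic-creation command shown only as a hedged comment because the exact CLI flags depend on your Kafka version:

```shell
# 16 partitions per Kafka node, per the rule above.
nodes=3
partitions=$((16 * nodes))
echo "$partitions"   # 48 for a 3-node cluster

# On a real cluster you would then create the topic, e.g. with the
# classic Kafka CLI (the ZooKeeper address and topic name are
# placeholders):
#   bin/kafka-topics.sh --create --zookeeper zk_host:2181 \
#     --replication-factor 1 --partitions "$partitions" --topic mytopic
```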
@@ -198,7 +198,7 @@ Note:
 As with ZooKeeper, you may need to clean old data located on the disks of the Kafka brokers. Just `rm -rf <all_data_path>` on all your Kafka nodes and directories.
-3. Spark setup
+4. Spark setup
 All Spark Streaming related parameters can be defined in `conf/99-user_defined_properties.conf`.
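The per-node Kafka cleanup can be scripted. A sketch using throwaway local directories as stand-ins for each broker's data path; on a real cluster you would run the same `rm` on every Kafka node, e.g. over ssh:

```shell
base=$(mktemp -d)                            # stands in for one Kafka node
mkdir -p "$base/broker-0" "$base/broker-1"   # one data dir per broker

for d in "$base"/broker-*; do
  rm -rf "$d"                                # clean that broker's data dir
done

ls -A "$base"                                # prints nothing: all cleaned
```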

@@ -212,7 +212,7 @@ Note:
 Spark Streaming can be deployed in YARN mode or standalone mode. For YARN mode, just set `hibench.spark.master` to `yarn-client`. For standalone mode, set it to `spark://spark_master_ip:port` and run `sbin/start-master.sh` in your Spark home.
-4. Storm setup
+5. Storm setup
 The conf file is `conf/storm.yaml`. Basically we configure the following params:
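Switching between the two modes is a one-line change in `conf/99-user_defined_properties.conf`. A self-contained sketch that edits a throwaway copy of the file; the standalone master URL and port are placeholders:

```shell
conf=$(mktemp)                                # throwaway copy of the conf file
echo "hibench.spark.master yarn-client" > "$conf"

# Switch from YARN mode to standalone mode (placeholder master URL):
sed -i.bak \
  's|^hibench.spark.master.*|hibench.spark.master spark://spark_master_ip:7077|' \
  "$conf"
grep hibench.spark.master "$conf"
```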

@@ -227,7 +227,7 @@ Note:
 Run `bin/storm nimbus` to start nimbus and `bin/storm ui` to set up the Storm UI
 Run `bin/storm supervisor` to start the Storm supervisors
-5. HiBench setup
+6. HiBench setup
 Same as [step.2 in previous section](#hibenchconf).
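The three start commands above can be launched from one terminal by backgrounding them. Sketched here as a dry run: a shell function echoes each command instead of invoking a real Storm install, so the snippet runs anywhere; drop the function on an actual cluster:

```shell
# Dry-run stand-in for bin/storm; remove this function on a real cluster.
storm() { echo "would run: storm $*"; }

storm nimbus        # on the master node
storm ui            # the Storm web UI
storm supervisor    # on each worker node
```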

@@ -254,18 +254,15 @@ Note:
 Note: for Spark Streaming in receiver mode (Spark version >= 1.4), the first run will always fail. You'll need to wait a few more minutes, run `prepare/zkUtils.sh` to ensure the topic has been created, and then re-run the workload. For Spark version == 1.3, it'll be OK.
-5. View the report:
+7. View the report:
-Go to `<HiBench_Root>/report` to check the final report:
-- `report/hibench.report`: Overall report about all workloads.
-- `report/<workload>/<language APIs>/bench.log`: Raw logs on the client side.
-- `report/<workload>/<language APIs>/monitor.html`: System utilization monitor results.
-- `report/<workload>/<language APIs>/conf/<workload>.conf`: Generated environment variable configurations for this workload.
-- `report/<workload>/<language APIs>/conf/sparkbench/<workload>/sparkbench.conf`: Generated configuration for this workload, used for mapping to environment variables.
-- `report/<workload>/<language APIs>/conf/sparkbench/<workload>/spark.conf`: Generated configuration for Spark.
+Same as [step.4 in previous section](#viewreport)
-Note: the throughput and latency of each batch are printed to the terminal endlessly while a streaming workload runs. For Spark Streaming, pressing ctrl+c will stop the workloads. For Storm & Trident, you'll need to execute `storm/bin/stop.sh` to stop the workloads. For Samza, you'll have to kill all applications in YARN manually, or restart YARN.
+However, the streaming benchmark is very different from the non-streaming workloads. Streaming workloads collect throughput and latency endlessly, print them directly to the terminal, and log them to `report/<workload>/<language APIs>/bench.log`.
+8. Stop the streaming workloads:
+For Spark Streaming, pressing `ctrl+c` will stop the workloads. For Storm & Trident, you'll need to execute `storm/bin/stop.sh` to stop the workloads. For Samza, currently you'll have to kill all applications in YARN manually, or restart the YARN cluster directly.
 ---
 ### Advanced Configurations ###
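The manual Samza cleanup (killing every application in YARN) can be scripted with the standard `yarn application -list` / `yarn application -kill` commands. In this sketch `yarn_list` fakes the list output so the pipeline is runnable anywhere, and `echo` guards the kill as a dry run; replace both on a real cluster:

```shell
yarn_list() {   # stand-in for: yarn application -list
  printf 'application_1700000000000_0001\napplication_1700000000000_0002\n'
}

yarn_list | while read -r app; do
  echo "yarn application -kill $app"   # drop the echo to actually kill
done
```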
