83 commits
4e80414
add ceph module
day0n Sep 27, 2023
caa0bb7
add normalization components
leishu-521 Sep 27, 2023
b379f30
Update README.md
yg000 Sep 28, 2023
a679dc3
Merge branch 'master' of github.com:cas-bigdatalab/piflow
tianyao-0315 Sep 28, 2023
c448b1d
add normalization components.
leishu-521 Sep 28, 2023
dc6bd8e
Merge branch 'cas-bigdatalab:master' into master
leishu-521 Sep 28, 2023
ddaf851
Merge pull request #78 from leishu-521/master
tianyao-0315 Sep 29, 2023
aa4ba1f
Merge branch 'master' into master
tianyao-0315 Sep 29, 2023
391a5fc
Merge pull request #77 from day0n/master
tianyao-0315 Sep 29, 2023
c572e61
add ceph module
day0n Sep 30, 2023
80d1fa2
Merge pull request #79 from day0n/master
tianyao-0315 Oct 12, 2023
9b6bd12
Delete piflow-bundle/src/main/resources/icon/jdbc/tbase.png
yg000 Oct 12, 2023
b899ecf
Add files via upload
yg000 Oct 12, 2023
92ecf1e
Add files via upload
yg000 Oct 12, 2023
02b08dc
Delete piflow-bundle/src/main/resources/icon/jdbc/Tbase.png
yg000 Oct 12, 2023
b5ac53e
add ceph module
day0n Oct 16, 2023
48f642b
add ceph module
day0n Oct 18, 2023
12fe52d
Merge pull request #80 from day0n/master
tianyao-0315 Oct 18, 2023
5fab4f6
Fix the issue of not displaying resources
Nov 1, 2023
c1f5935
add cla
tianyao-0315 Nov 12, 2023
722d0e8
Update 原则.md
tianyao-0315 Nov 12, 2023
a9bee85
Update 原则.md
tianyao-0315 Nov 12, 2023
e39a6d6
Update 原则.md
tianyao-0315 Nov 12, 2023
745554f
ADD CLA pdf
tianyao-0315 Nov 13, 2023
dcab338
Delete Governance/πFlow_Open_Source_Individual_CLA.pdf
tianyao-0315 Nov 13, 2023
61f80f3
add cla pdf
tianyao-0315 Nov 13, 2023
3a7f106
Delete Governance/πFlow_Open_Source_Individual_CLA.docx
tianyao-0315 Nov 13, 2023
f18031a
add cla
tianyao-0315 Nov 13, 2023
3466e5f
Update README.md
yg000 Nov 13, 2023
2b8860d
Update README.md
yg000 Nov 13, 2023
31a3086
Update README.md
yg000 Nov 13, 2023
05fe208
# add license and copyright
tianyao-0315 Nov 13, 2023
6eb2360
add stop ExcelWriteMultipleSheets
yg000 Nov 27, 2023
6f77dd7
Merge remote-tracking branch 'origin/master'
yg000 Nov 27, 2023
5097885
ADD dameng stop
yg000 Dec 17, 2023
5c60068
ADD dameng png
yg000 Dec 17, 2023
20af3f7
#fix: update flow state to failed
tianyao-0315 Mar 19, 2024
c20bba6
#feature:h2 database url can be customized
tianyao-0315 Mar 20, 2024
fdc7002
#feature:h2 database url can be customized
tianyao-0315 Mar 20, 2024
047c46c
#init unstructured data parser:.pdf/ .html/ .image/ ./docx/ .pptx
tianyao-0315 Mar 28, 2024
9212191
#init unstructured data parser:.pdf/ .html/ .image/ ./docx/ .pptx
tianyao-0315 Apr 7, 2024
976df6d
#init unstructured data parser:.pdf/ .html/ .image/ ./docx/ .pptx
tianyao-0315 Apr 17, 2024
711e57f
#init unstructured data parser:.pdf/ .html/ .image/ ./docx/ .pptx
tianyao-0315 Apr 18, 2024
72595bc
#fix: getIcon
tianyao-0315 May 20, 2024
dac80bd
#embed
tianyao-0315 Jul 23, 2024
0066fbd
#embed
tianyao-0315 Jul 23, 2024
fb6b14b
#embed
tianyao-0315 Jul 24, 2024
642835b
fix: typo in README.md
Sep 18, 2024
91154e9
Merge pull request #84 from JaylanLiu/master
judy0131 Sep 18, 2024
10f0846
#embed location
tianyao-0315 Sep 20, 2024
3fadfca
Merge remote-tracking branch 'origin/master'
tianyao-0315 Sep 20, 2024
b0e9c44
#embed location
tianyao-0315 Sep 20, 2024
92cd35b
qdrant vectorization Python component
leishu-521 Sep 24, 2024
f1afdfe
pinecone component
rorozi Sep 24, 2024
88c7043
qdrant vector database Python component
leishu-521 Sep 27, 2024
1848b35
Merge pull request #86 from rorozi/pinecone-branch
tianyao-0315 Sep 27, 2024
58e9bd4
Merge pull request #85 from leishu-521/master
tianyao-0315 Sep 27, 2024
7ffb2dc
#embed
tianyao-0315 Sep 27, 2024
d176364
pinecone component: modify connect
rorozi Sep 27, 2024
1777a62
Improve the qdrant component and its usage documentation
leishu-521 Sep 27, 2024
5757bf4
Fix images not displaying in the documentation
leishu-521 Sep 27, 2024
b150298
Merge remote-tracking branch 'upstream/master' into pinecone-branch
rorozi Sep 27, 2024
170adce
Fix images not displaying in the documentation
rorozi Sep 27, 2024
d0aa0c2
Merge pull request #87 from rorozi/pinecone-branch
tianyao-0315 Sep 29, 2024
ff847d9
Merge pull request #88 from leishu-521/master
tianyao-0315 Sep 29, 2024
67bb075
#embed
tianyao-0315 Sep 30, 2024
095f0a8
#embed
tianyao-0315 Sep 30, 2024
e024251
#h2.path
tianyao-0315 Oct 8, 2024
2f87c0d
#embed.zip update
tianyao-0315 Oct 10, 2024
8c34221
#readme
tianyao-0315 Nov 21, 2024
5810081
Update README.md
tianyao-0315 Nov 21, 2024
560138f
Update README.md
tianyao-0315 Nov 21, 2024
1bdbb00
Update README.md
tianyao-0315 Nov 21, 2024
dbd1666
Update README.md
tianyao-0315 Nov 25, 2024
5dd23bd
Update README.md
tianyao-0315 Nov 25, 2024
535f743
Update README.md
tianyao-0315 Nov 25, 2024
4e45f86
Update README.md
tianyao-0315 Nov 25, 2024
1a7d128
Update README.md
tianyao-0315 Nov 25, 2024
9b95da1
Update README.md
tianyao-0315 Nov 25, 2024
6c26604
Update README.md
tianyao-0315 Nov 25, 2024
777361c
#readme
tianyao-0315 Nov 25, 2024
f985185
run example
NatsusakiYomi Mar 14, 2025
53f088b
wrong config
NatsusakiYomi Apr 22, 2025
Binary file not shown.
Binary file added Governance/πFlow_Open_Source_Individual_CLA.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion Governance/原则.md
@@ -29,4 +29,4 @@ The PifFow community follows the [Community Code of Conduct](https://github.com/cas-bigdatalab/piflow/

### CLA

All contributors must sign the PifFow CLA; for details, see [here](https://github.com/cas-bigdatalab/piflow/blob/master/Governance/image-20211118094103884.png).
All contributors must sign the PifFow CLA; for details, see [here](https://github.com/cas-bigdatalab/piflow/blob/master/Governance/%CF%80Flow_Open_Source_Individual_CLA.docx).
29 changes: 19 additions & 10 deletions README.md
@@ -39,11 +39,12 @@
![](https://github.com/cas-bigdatalab/piflow/blob/master/doc/architecture.png)
## Requirements
* JDK 1.8
* Scala-2.11.8
* Scala-2.12.18
* Apache Maven 3.1.0 or newer
* Spark-2.1.0、 Spark-2.2.0、 Spark-2.3.0
* Hadoop-2.6.0
* Apache Livy-0.7.1
* Spark-3.4.0
* Hadoop-3.3.0

Compatible with X86 architecture and ARM architecture, Support CentOS and Kirin system deployment

## Getting Started

@@ -319,12 +320,20 @@
![](https://github.com/cas-bigdatalab/piflow/blob/master/doc/piflow-stophublist.png)

## Contact Us
- Name:Teacher Wu
- Mobile Phone:18910263390
- WeChat:18910263390
- Email: [email protected]
- QQ Group:1003489545
![](https://github.com/cas-bigdatalab/piflow/blob/master/doc/PiFlowUserGroup_QQ.jpeg)
- Name:Yang Gang, Tian Yao
- Mobile Phone:13253365393, 18501260806
- WeChat:13253365393, 18501260806
- Email: [email protected], [email protected]
- Private vulnerability contact information:[email protected]
- Wechat User Group
<center>
<img src="https://github.com/cas-bigdatalab/piflow/blob/master/doc/wechat_user.png" width="100"/>
</center>

- Wechat Official Account
<center>
<img src="https://github.com/cas-bigdatalab/piflow/blob/master/doc/tencent.jpg" width="100"/>
</center>



55 changes: 0 additions & 55 deletions conda-pack打包虚拟环境.md

This file was deleted.

8 changes: 4 additions & 4 deletions config.properties
@@ -1,13 +1,13 @@
spark.master=yarn
spark.deploy.mode=cluster

server.ip=172.18.32.1
#hdfs default file system
fs.defaultFS=hdfs://10.0.82.108:9000
fs.defaultFS=hdfs://172.18.39.41:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.82.108
yarn.resourcemanager.hostname=172.18.39.41

#if you want to use hive, set hive metastore uris
hive.metastore.uris=thrift://10.0.82.108:9083
#hive.metastore.uris=thrift://10.0.82.108:9083

#show data in log, set 0 if you do not show the logs
data.show=10
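
A minimal sketch, not part of this change set, of how the keys in the hunk above could be read and sanity-checked before deployment. The parser, the `SAMPLE` text, and the rule that the HDFS host should equal the YARN ResourceManager host are illustrative assumptions. The sample values follow the second line of each changed pair above, which appears to be the new value.

```python
from urllib.parse import urlparse

# Hypothetical sample assembled from the lines visible in the hunk above.
SAMPLE = """\
spark.master=yarn
spark.deploy.mode=cluster
server.ip=172.18.32.1
#hdfs default file system
fs.defaultFS=hdfs://172.18.39.41:9000
#yarn resourcemanager hostname
yarn.resourcemanager.hostname=172.18.39.41
#hive.metastore.uris=thrift://10.0.82.108:9083
data.show=10
"""


def parse_properties(text):
    """Parse simple key=value lines; blank lines and '#' comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


props = parse_properties(SAMPLE)
hdfs_host = urlparse(props["fs.defaultFS"]).hostname

# Assumed rule for this sketch: NameNode and ResourceManager share one host.
same_host = hdfs_host == props["yarn.resourcemanager.hostname"]
# A commented-out key is treated as unset.
hive_enabled = "hive.metastore.uris" in props
print(same_host, hive_enabled)
```

With the sample above, the commented-out `hive.metastore.uris` line is skipped, so `hive_enabled` is false while the two host values agree.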
Binary file added doc/tencent.jpg
Binary file added doc/wechat_user.png
4 changes: 2 additions & 2 deletions piflow-bin/config.properties
@@ -2,10 +2,10 @@ spark.master=yarn
spark.deploy.mode=cluster

#hdfs default file system
fs.defaultFS=hdfs://10.0.85.83:9000
fs.defaultFS=hdfs://172.18.39.41:9000

#yarn resourcemanager hostname
yarn.resourcemanager.hostname=10.0.85.83
yarn.resourcemanager.hostname=172.18.39.41

#if you want to use hive, set hive metastore uris
hive.metastore.uris=thrift://10.0.85.83:9083
107 changes: 4 additions & 103 deletions piflow-bin/example/flow.json
@@ -7,45 +7,15 @@
"paths": [
{
"inport": "",
"from": "XmlParser",
"to": "SelectField",
"outport": ""
},
{
"inport": "",
"from": "Fork",
"from": "CsvParser",
"to": "CsvSave",
"outport": "out1"
},
{
"inport": "data2",
"from": "SelectField",
"to": "Merge",
"outport": ""
},
{
"inport": "",
"from": "Merge",
"to": "Fork",
"outport": ""
},
{
"inport": "data1",
"from": "CsvParser",
"to": "Merge",
"to": "CsvSave",
"outport": ""
},
{
"inport": "",
"from": "Fork",
"to": "JsonSave",
"outport": "out3"
},
{
"inport": "",
"from": "Fork",
"to": "PutHiveMode",
"outport": "out2"
}
],
"executorCores": "1",
@@ -56,7 +26,7 @@
"bundle": "cn.piflow.bundle.csv.CsvSave",
"uuid": "8a80d63f720cdd2301723a4e67a52467",
"properties": {
"csvSavePath": "hdfs://master:9000/xjzhu/phdthesis_result.csv",
"csvSavePath": "hdfs://172.18.32.1:9000/user/Yomi/test1.csv",
"partition": "",
"header": "false",
"saveMode": "append",
@@ -66,87 +36,18 @@

}
},
{
"name": "PutHiveMode",
"bundle": "cn.piflow.bundle.hive.PutHiveMode",
"uuid": "8a80d63f720cdd2301723a4e67a22461",
"properties": {
"database": "sparktest",
"saveMode": "append",
"table": "dblp_phdthesis"
},
"customizedProperties": {

}
},
{
"name": "CsvParser",
"bundle": "cn.piflow.bundle.csv.CsvParser",
"uuid": "8a80d63f720cdd2301723a4e67a82470",
"properties": {
"schema": "title,author,pages",
"csvPath": "hdfs://master:9000/xjzhu/phdthesis.csv",
"csvPath": "hdfs://172.18.32.1:9000/user/Yomi/test.csv",
"delimiter": ",",
"header": "false"
},
"customizedProperties": {

}
},
{
"name": "JsonSave",
"bundle": "cn.piflow.bundle.json.JsonSave",
"uuid": "8a80d63f720cdd2301723a4e67a1245f",
"properties": {
"jsonSavePath": "hdfs://10.0.86.191:9000/xjzhu/phdthesis.json"
},
"customizedProperties": {

}
},
{
"name": "XmlParser",
"bundle": "cn.piflow.bundle.xml.XmlParser",
"uuid": "8a80d63f720cdd2301723a4e67a7246d",
"properties": {
"rowTag": "phdthesis",
"xmlpath": "hdfs://master:9000/xjzhu/dblp.mini.xml"
},
"customizedProperties": {

}
},
{
"name": "SelectField",
"bundle": "cn.piflow.bundle.common.SelectField",
"uuid": "8a80d63f720cdd2301723a4e67aa2477",
"properties": {
"columnNames": "title,author,pages"
},
"customizedProperties": {

}
},
{
"name": "Merge",
"bundle": "cn.piflow.bundle.common.Merge",
"uuid": "8a80d63f720cdd2301723a4e67a92475",
"properties": {
"inports": "data1,data2"
},
"customizedProperties": {

}
},
{
"name": "Fork",
"bundle": "cn.piflow.bundle.common.Fork",
"uuid": "8a80d63f720cdd2301723a4e67a42465",
"properties": {
"outports": "out1,out3,out2"
},
"customizedProperties": {

}
}
]
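
A minimal sketch, not part of this change set, of a consistency check for an example flow like the one above: every `from` and `to` in `paths` should name a stop that is still defined. The helper `dangling_endpoints` is hypothetical. The reduced flow (a single `CsvParser` to `CsvSave` path) is an assumed reading of the "4 additions & 103 deletions" summary, since the hunks above carry no +/- markers.

```python
def dangling_endpoints(stops, paths):
    """Return (from, to) pairs whose endpoints are not defined stop names."""
    names = {stop["name"] for stop in stops}
    return [
        (path["from"], path["to"])
        for path in paths
        if path["from"] not in names or path["to"] not in names
    ]


# Assumed post-change example: two stops joined by one path.
stops = [
    {"name": "CsvParser", "bundle": "cn.piflow.bundle.csv.CsvParser"},
    {"name": "CsvSave", "bundle": "cn.piflow.bundle.csv.CsvSave"},
]
paths = [{"inport": "", "from": "CsvParser", "to": "CsvSave", "outport": ""}]

# A path left over from the larger flow would be reported as dangling.
stale_paths = paths + [
    {"inport": "", "from": "Fork", "to": "JsonSave", "outport": "out3"}
]

print(dangling_endpoints(stops, paths))
print(dangling_endpoints(stops, stale_paths))
```

A check like this catches a path that still references a removed stop such as `Fork` or `Merge` before the flow is submitted.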