28 changes: 15 additions & 13 deletions content/community/sig/digital-humanities/index.md
@@ -11,7 +11,7 @@ Whether you are analysing texts, images, audio data or other media, **Galaxy is

# How can you use Galaxy for Digital Humanities analysis?

-Galaxy is a flexible setup, so it can leverage analysis in Digital Humanities (DH) and Social Sciences in various ways. Due to the platform's flexibility, you can create an analysis that fits your research needs and supports you with tools for easier research data management on the go. Just go to the Galaxy instance of your choice, for example, the [European Galaxy Server](https://usegalaxy.eu/), create an account and get started.
+Galaxy is a flexible setup, so it can leverage analysis in Digital Humanities (DH) and Social Sciences in various ways. Due to the platform's flexibility, you can create an analysis that fits your research needs and supports you with tools for easier research data management on the go. Simply visit the Galaxy instance of your choice, such as the [European Galaxy Server](https://usegalaxy.eu/), create an account, and get started.

Here are some examples:

@@ -40,9 +40,9 @@ You can combine various Galaxy tools into [workflows](https://training.galaxypro

## Tools

-This is a non-exhaustive overview of Galaxy tools applicable to Humanities and Social sciences use-cases. You can find the tools by going to a Galaxy instance, for example, the [European Galaxy Server](https://usegalaxy.eu/) and clicking on "Tools" on the left-side bar. There, you can search the tools by name or browse them by category.
+This is a non-exhaustive overview of Galaxy tools applicable to Humanities and social sciences use cases. You can find the tools by going to a Galaxy instance, for example, the [European Galaxy Server](https://usegalaxy.eu/) and clicking on "Tools" in the left-side bar. There, you can search the tools by name or browse them by category.

-### 📤 **Uploading/accessing data**
+### 📥 **Uploading/accessing data**
You can upload your files in various ways in the upload section on the left sidebar of the Galaxy website.

- If you have larger amounts of data, check out this tutorial to [Bring Your Own Data](https://training.galaxyproject.org/training-material/faqs/galaxy/manage_your_repositories.html).
@@ -51,17 +51,18 @@ You can upload your files in various ways in the upload section on the left side
### 🛠️ **General tools**
- Text cleaning tools such as text manipulation, select lines, add lines, search in text files, Line/Word/Character count, etc.
- Text file format converters (dos2unix): Convert text files with DOS or Mac line breaks to Unix line breaks and other text cleaning options.
+- Concatenate multiple datasets tail-to-head while specifying how: Join multiple single files into one bigger file based on your needs.

### 📖 **Text tools**
- diff: Comparing two text documents
- Editor: make manual changes that will not be tracked. Find the editor on the left-hand side panel under *Visualisations*.
-- Regular Expressions: Various tools in Galaxy to help you select specific patterns from your text as explained [here](https://training.galaxyproject.org/training-material/faqs/galaxy/analysis_regular_expressions.html).
+- Regular Expressions: Various tools in Galaxy to help you select specific patterns from your text, as explained [here](https://training.galaxyproject.org/training-material/faqs/galaxy/analysis_regular_expressions.html).
- GPT: Large language model - this is only accessible with the OpenAI credentials. Find more information on how to set this up [here](https://galaxyproject.org/news/2024-09-02-chat-gpt/).

### **Tools for working with tabular data**
- **Text Manipulation for tabular data**: Replace Text, Replace text in a specific column, Replace parts of text, text reformatting, Text transformation, Sort a row, reverse a file, Compute an expression, Regex replace, Subtract Whole Dataset, Merge Columns, Unique occurences, Add column, Merge Columns together, Convert delimiters to TAB, Change Case, Trim characters, Secure Hash
- **Filter and Sort for tabular data**: Unique, Unique lines, Sort, Select random lines, Select first lines (head), Select last lines (tail), Remove beginning, Cut columns, Search in textfiles, XPath, Column arrange, Query tabular, Filter data on any column, Sort data, Select lines, Remove columns, Sort Column Order
-- **Join, Substract and Group for tabular data**: Join two files, Split file, Reverse, Transpose, Datamash, Subtract, Join two Datasets, Group data by a column
+- **Join, Subtract and Group for tabular data**: Join two files, Split file, Reverse, Transpose, Datamash, Subtract, Join two Datasets, Group data by a column
- **Further**: Tabular data prediction using TabPFN, Cleanlab Issue Handler to detect and optionally clean data issues, SaQC for quality control

### 🔊**Audio tools**
@@ -71,7 +72,7 @@ You can upload your files in various ways in the upload section on the left side
- FFmpeg Format Converter: Convert between video and audio formats

### 🎞️ **Image and video tools**
-- Tesseract: Optical Character Recognition (OCR)
+- Tesseract: Optical Character Recognition (OCR). You can see an example workflow using Tesseract to transcribe a German newspaper in [this video](https://usegalaxy.eu/static/share/OCR.mp4)
- [flux](https://galaxyproject.org/news/2024-11-26-flux/): text to image conversion
- Convert image format (including extracting images from a PDF)
- PDFimages: Extract images from a PDF file
@@ -92,7 +93,7 @@ You can upload your files in various ways in the upload section on the left side
- FFmpeg Format Converter: Convert between video and audio formats

### 👩‍💻 **Interactive tools**
-- OpenRefine: data cleaning and enrichment
+- OpenRefine: data cleaning and enrichment - for example, with the Gemeinsame Normdatei (translated as Integrated Authority File) or GND as explained [here](https://blog.lobid.org/2018/08/27/openrefine.html).
- R-Studio
- GIS
- Jupyter Notebook
@@ -108,7 +109,7 @@ You can upload your files in various ways in the upload section on the left side
### 🤖 **AI and Machine Learning tools**
- [flux](https://galaxyproject.org/news/2024-11-26-flux/): text to image conversion
- [Whisper](https://galaxyproject.org/news/2024-04-25-whisper-tool/): Audio to text conversion
-- LLM Hub: Run various AI models hosted by the Freiburg Compute Center (RZ) without any API keys
+- LLM Hub: Run various AI models hosted by the Freiburg Compute Center (RZ) without any API keys. Interested? Read [here](https://galaxyproject.org/news/2025-10-10-llm-hub/) how to set this up.
- GPT: Large language model - this is only accessible with the OpenAI credentials. Find more information on how to set this up [here](https://galaxyproject.org/news/2024-09-02-chat-gpt/).
- TabPFN: Tabular data prediction
- [Cleanlab](https://galaxyproject.org/news/2025-06-13-cleanlab/): Identifies label issues in Machine Learning datasets
@@ -117,18 +118,19 @@ You can upload your files in various ways in the upload section on the left side
### **Domain-specific tools**
- [ONTO-ToolKit](https://doi.org/10.1186/1471-2105-11-S12-S8) working with OBO files for ontologies

+### 📤 **Export your data**
+Galaxy enables you to share your data in various ways, if you prefer. Once you have set your credentials, the tool [Export datasets
+to repositories](https://usegalaxy.eu/root?tool_id=export_remote) allows you to directly export your analysis results to the repositories of your choice, such as Zenodo. Check out our [example video](https://usegalaxy.eu/static/share/OCR.mp4) to see what this could look like.
+
### 🏗️ **Work in progress**
- E-Scriptorium: Handwritten text-recognition (HTR)
- Stanford Core NLP: Natural language processing tool

-If tools are missing or information is not up-to-date in the list, please help us! We are available via this [Matrix Channel](https://matrix.to/#/#galaxyproject-digital-humanities:matrix.org)
+If tools are missing or the information in the list is not up-to-date, please help us. We are available via this [Matrix Channel](https://matrix.to/#/#galaxyproject-digital-humanities:matrix.org)

## 🎓 Workflows and training materials

-You can find an overview of the existing tutorials on the dedicated DH-section of the [Galaxy Training Network](https://training.galaxyproject.org/training-material/topics/digital-humanities/). Those are open educational resources (OER) that you can create from your workflows or reuse in teaching and training. Documentation for providing further tutorials can be accessed [here](https://training.galaxyproject.org/training-material/topics/contributing/). If you have never worked with Galaxy before, we suggest you start with our [Introduction to Digital Humanities in Galaxy](https://training.galaxyproject.org/training-material/topics/digital-humanities/tutorials/introduction_to_dh/tutorial.html).
-
-In addition to those mentioned above, we are currently working on:
-- Using OpenRefine in Galaxy to analyse data
+You can find an overview of the existing tutorials on the dedicated DH-section of the [Galaxy Training Network](https://training.galaxyproject.org/training-material/topics/digital-humanities/). Those are open educational resources (OER) that you can create from your workflows or reuse in teaching and training. Documentation for providing further tutorials can be accessed [here](https://training.galaxyproject.org/training-material/topics/contributing/). If you have never worked with Galaxy before, we recommend starting with our [Introduction to Digital Humanities in Galaxy](https://training.galaxyproject.org/training-material/topics/digital-humanities/tutorials/introduction_to_dh/tutorial.html).

## 🏛️ Upcoming events
