23 commits
4b7e7fd  Added .gitignore for sanity. (icedwater, Sep 12, 2024)
998a550  Updated raw pose processing for local machine. (icedwater, Sep 12, 2024)
c5bf04f  Updated README.md. (icedwater, Sep 18, 2024)
2429976  Added .mp4 to .gitignore. (icedwater, Sep 18, 2024)
29abc73  Cleaned output/metadata of motion_rep notebook. (icedwater, Nov 6, 2024)
33fccc2  Removed unused last cell. (icedwater, Nov 6, 2024)
aa16a3b  Added abs position version of motion_rep code (icedwater, Nov 6, 2024)
7bc5f6e  Cleaned output for all base notebooks (icedwater, Nov 6, 2024)
c876c25  Updated mean and variance calculation notebook. (icedwater, Nov 6, 2024)
0ed3831  Added base working version of build_vector.py. (icedwater, Nov 20, 2024)
0cca915  Updated the import structure (icedwater, Nov 20, 2024)
2a9cead  Updated build_vector script. (icedwater, Nov 21, 2024)
f260b57  Fixed custom_paramUtil. (icedwater, Nov 21, 2024)
a9a6eb1  Added updated and documented annotate_texts.py. (icedwater, Nov 21, 2024)
73e69be  Ignored content from unzipping downloads where instructed. (icedwater, Nov 21, 2024)
d1efe48  Added documented version of mean/variance calculations. (icedwater, Nov 21, 2024)
7faf1d5  Updated docstrings and rearranged imports. (icedwater, Nov 21, 2024)
d44dcff  Added changes needed for custom rig. (icedwater, Nov 21, 2024)
ab10ede  Upgraded annotate_texts. (icedwater, Nov 21, 2024)
5600415  Added names to arguments in cal_mean_variance for clarity. (icedwater, Dec 19, 2024)
d5b04b9  Added processed texts to .gitignore. (icedwater, Dec 19, 2024)
2591e48  Added precalculate script for convenience. (icedwater, Mar 7, 2025)
74c9ba2  Added requirements.txt and relevant line in README. (icedwater, Mar 7, 2025)
29 changes: 29 additions & 0 deletions .gitignore
@@ -0,0 +1,29 @@
# vim swaps
.*.sw?

# python binaries
*.py[oc]

# numpy arrays
*.np[yz]

# HumanML3D texts, when unzipped
HumanML3D/texts/*.txt

# Custom texts, once processed
Custom/texts

# amass or body model data, when unzipped
amass_data/
body_models/

# zip files
*.zip
*.bz2
*.tar
*.gz
*.tar.gz
*.tgz

# animations
*.mp4
15 changes: 9 additions & 6 deletions README.md
@@ -26,15 +26,16 @@ We double the size of HumanML3D dataset by mirroring all motions and properly re
[KIT Motion-Language Dataset](https://motion-annotation.humanoids.kit.edu/dataset/) (KIT-ML) is also a related dataset that contains 3,911 motions and 6,278 descriptions. We processed the KIT-ML dataset following the same procedures as the HumanML3D dataset, and provide access in this repository. However, if you would like to use the KIT-ML dataset, please remember to cite the original paper.
</details>

If this dataset is usefule in your projects, we will apprecite your star on this codebase. 😆😆
## Checkout Our Works on HumanML3D
If this dataset is useful in your projects, we will appreciate your star on this codebase. 😆😆

## Checkout Our Work on HumanML3D
:ok_woman: [T2M](https://ericguo5513.github.io/text-to-motion) - The first work on HumanML3D that learns to generate 3D motion from textual descriptions, with *temporal VAE*.
:running: [TM2T](https://ericguo5513.github.io/TM2T) - Learns the mutual mapping between texts and motions through the discrete motion token.
:dancer: [TM2D](https://garfield-kh.github.io/TM2D/) - Generates dance motions with text instruction.
:honeybee: [MoMask](https://ericguo5513.github.io/momask/) - New-level text2motion generation using residual VQ and generative masked modeling.

## How to Obtain the Data
For KIT-ML dataset, you could directly download [[Here]](https://drive.google.com/drive/folders/1D3bf2G2o4Hv-Ale26YW18r1Wrh7oIAwK?usp=sharing). Due to the distribution policy of AMASS dataset, we are not allowed to distribute the data directly. We provide a series of script that could reproduce our HumanML3D dataset from AMASS dataset.
For KIT-ML dataset, you could directly download [[Here]](https://drive.google.com/drive/folders/1D3bf2G2o4Hv-Ale26YW18r1Wrh7oIAwK?usp=sharing). Due to the distribution policy of AMASS dataset, we are not allowed to distribute the data directly. We provide a series of scripts that could reproduce our HumanML3D dataset from AMASS dataset.

You need to clone this repository and set up the virtual environment.

@@ -49,6 +50,8 @@ conda env create -f environment.yaml
conda activate torch_render
```

Alternatively, install `requirements.txt` into the virtual environment using the workflow of your choice.

In the case of installation failure, you could alternatively install the following:
```sh
- Python==3.7.10
@@ -102,7 +105,7 @@ After all, the data under folder "./HumanML3D" is what you finally need.
```
HumanML3D data follows the SMPL skeleton structure with 22 joints. KIT-ML has 21 skeletal joints. Refer to paraUtils for detailed kinematic chains.
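As a minimal sketch of the data layout implied above (the array here is a zero-filled stand-in, not real data), a recovered joint file holds per-frame positions for the 22-joint skeleton:

```python
import numpy as np

# Illustrative stand-in for np.load("HumanML3D/new_joints/000000.npy"):
# joint positions are stored as (n_frames, n_joints, xyz).
joints = np.zeros((40, 22, 3))

n_frames, n_joints, dims = joints.shape
assert (n_joints, dims) == (22, 3)  # HumanML3D uses a 22-joint SMPL skeleton
```

A KIT-ML array would instead have 21 along the joint axis.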

The file named in "MXXXXXX.\*" (e.g., 'M000000.npy') is mirrored from file with correspinding name "XXXXXX.\*" (e.g., '000000.npy'). Text files and motion files follow the same naming protocols, meaning texts in "./texts/XXXXXX.txt"(e.g., '000000.txt') exactly describe the human motions in "./new_joints(or new_joint_vecs)/XXXXXX.npy" (e.g., '000000.npy')
The file named in "MXXXXXX.\*" (e.g., 'M000000.npy') is mirrored from file with corresponding name "XXXXXX.\*" (e.g., '000000.npy'). Text files and motion files follow the same naming protocols, meaning texts in "./texts/XXXXXX.txt"(e.g., '000000.txt') exactly describe the human motions in "./new_joints(or new_joint_vecs)/XXXXXX.npy" (e.g., '000000.npy')
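As a quick sketch of this naming protocol (the helper and paths below are illustrative, not part of the repo):

```python
from os.path import join as pjoin

def paired_paths(stem, root="HumanML3D"):
    """Return the (text, motion) file pair for a stem such as '000000' or 'M000000'."""
    return (pjoin(root, "texts", stem + ".txt"),
            pjoin(root, "new_joint_vecs", stem + ".npy"))

# A motion and its mirrored counterpart share the same stem, prefixed with 'M'.
text_path, motion_path = paired_paths("000000")
mirror_text, mirror_motion = paired_paths("M000000")
```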

Each text file looks like the following:
```sh
@@ -111,11 +114,11 @@ the standing person kicks with their left foot before going back to their origin
a man kicks with something or someone with his left leg.#a/DET man/NOUN kick/VERB with/ADP something/PRON or/CCONJ someone/PRON with/ADP his/DET left/ADJ leg/NOUN#0.0#0.0
he is flying kick with his left leg#he/PRON is/AUX fly/VERB kick/NOUN with/ADP his/DET left/ADJ leg/NOUN#0.0#0.0
```
with each line a distint textual annotation, composed of four parts: *original description (lower case)*, *processed sentence*, *start time(s)*, *end time(s)*, that are seperated by *#*.
with each line a distinct textual annotation, composed of four parts: *original description (lower case)*, *processed sentence*, *start time(s)*, *end time(s)*, that are separated by *#*.

Since some motions are too complicated to be described, we allow the annotators to describe a sub-part of a given motion if required. In these cases, *start time(s)* and *end time(s)* denote the motion segment that is annotated. Nonetheless, we observe these only occupy a small proportion of HumanML3D. *start time(s)* and *end time(s)* are set to 0 by default, which means the text is captioning the entire sequence of the corresponding motion.
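A line in this format can be split back into its four fields with a plain `#` split, as in this sketch (the parser below is illustrative, not part of the repo):

```python
def parse_annotation(line):
    """Split one annotation line into (caption, tagged, start, end).
    Captions contain no '#', so a plain split yields exactly four fields."""
    caption, tagged, start, end = line.strip().split("#")
    return caption, tagged, float(start), float(end)

line = ("he is flying kick with his left leg"
        "#he/PRON is/AUX fly/VERB kick/NOUN with/ADP his/DET left/ADJ leg/NOUN"
        "#0.0#0.0")
caption, tagged, start, end = parse_annotation(line)
# start == end == 0.0 means the caption describes the whole motion clip.
```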

If you are not able to install ffmpeg, you could animate videos in '.gif' instead of '.mp4'. However, generating GIFs usually takes longer time and memory occupation.
If you are not able to install ffmpeg, you could animate videos in '.gif' instead of '.mp4'. However, generating GIFs usually takes longer time and uses more memory.

## Citation

35 changes: 10 additions & 25 deletions animation.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -14,7 +14,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -112,7 +112,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -122,7 +122,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -132,7 +132,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -157,17 +157,9 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 10/10 [00:14<00:00, 1.43s/it]\n"
]
}
],
"outputs": [],
"source": [
"for npy_file in tqdm(npy_files):\n",
" data = np.load(pjoin(src_dir, npy_file))\n",
@@ -177,20 +169,13 @@
"# You may set the title on your own.\n",
" plot_3d_motion(save_path, kinematic_chain, data, title=\"None\", fps=20, radius=4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:torch_render]",
"display_name": "hml3d",
"language": "python",
"name": "conda-env-torch_render-py"
"name": "python3"
},
"language_info": {
"codemirror_mode": {
@@ -202,7 +187,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.10"
"version": "3.9.19"
}
},
"nbformat": 4,
114 changes: 114 additions & 0 deletions annotate_texts.py
@@ -0,0 +1,114 @@
"""
Given a text file with raw descriptions of actions, tag each description with
parts-of-speech tags, then write them in the training format to a new file.
"""

import spacy
from tqdm import tqdm

nlp = spacy.load('en_core_web_sm')

def process_text(sentence: str) -> tuple[list[str], list[str]]:
    """
    Return lists of words and their parts of speech (POS) tags
    for a given sentence.

    :param sentence: string to be tagged
    :return word_list: list of tokens found in the sentence
    :return pos_list: list of part-of-speech tags by token
    """
    sentence = sentence.replace('-', '')
    doc = nlp(sentence)
    word_list = []
    pos_list = []
    for token in doc:
        word = token.text
        if not word.isalpha():
            continue
        if (token.pos_ in ("NOUN", "VERB")) and (word != 'left'):
            word_list.append(token.lemma_)
        else:
            word_list.append(word)
        pos_list.append(token.pos_)
    return (word_list, pos_list)


def read_text_from_file(input_file: str) -> list[str]:
    """
    Read the text from a file of action descriptions for parsing.

    :param input_file: string path to the input file
    :return result: list of strings read from the input file
    """
    with open(input_file, 'r', encoding="utf-8") as infile:
        raw_lines = infile.readlines()
    result = [line.strip() for line in raw_lines]

    return result


def prepare_combined_line(sentence: str, start_time: float=0.0, end_time: float=0.0) -> str:
    """
    Given each sentence, parse it and attach tags to each token.
    Then include the description start and end time to be edited if needed.
    By default, these are 0.0 if we are describing the full sequence.

    :param sentence: string containing an input sentence
    :param start_time: float representing start time of the description
    :param end_time: float representing end time of the description
    :return combined_line: string containing sentence#tagged_sentence#start_time#end_time
    """
    (words, tags) = process_text(sentence)
    tagged_sentence = ' '.join([f"{n[0]}/{n[1]}" for n in zip(words, tags)])
    combined_line = f"{sentence}#{tagged_sentence}#{start_time}#{end_time}\n"

    return combined_line


def write_output_file(output_list: list[str], output_file: str):
    """
    Write a specified output list to a specified file.

    :param output_list: list of strings to write
    :param output_file: string path to output file location
    """
    with open(output_file, 'w', encoding="utf-8") as outfile:
        outfile.writelines(output_list)


def tag_one_file(input_file: str, output_file: str):
    """
    Do the tagging for one input file and write one output file.

    :param input_file: string path to file with untagged descriptions
    :param output_file: string path to file for storing results
    """
    output = []
    strings = read_text_from_file(input_file=input_file)

    for input_line in strings:
        output_line = prepare_combined_line(input_line)
        output.append(output_line)

    write_output_file(output_list=output, output_file=output_file)


def main():
    """
    Tag every raw description file under the data directory and write
    the results to the save directory.
    """
    from os import listdir
    from os.path import join as pjoin

    data_dir = "Custom/texts/raw"
    save_dir = "Custom/texts"

    for text_file in tqdm(listdir(data_dir)):
        print(f"Processing {text_file}...")
        tag_one_file(input_file=pjoin(data_dir, text_file), output_file=pjoin(save_dir, text_file))


if __name__ == "__main__":
    main()