en/lessons/extracting-keywords.md (+2 -2)
@@ -58,7 +58,7 @@ The lesson touches on Regular Expressions, so some readers may find it handy to
The first step of this process is to take a look at the data that we will be using in the lesson. As mentioned, the data includes biographical details of approximately 6,692 graduates who began study at the University of Oxford in the early seventeenth century.
{% include figure.html filename="extracting-keywords-1.png" caption="Screenshot of the first forty entries in the dataset" %}
@@ -378,7 +378,7 @@ Before you re-run your Python code, you'll have to update your `texts.txt` file
I'd challenge you to make a few refinements to your gazetteer before moving ahead, just to make sure you have the hang of it.
-
Once you are happy with that, you can snag my [completed list of English and Welsh counties, shortforms, and various other cities (London, Bristol etc) and places (Jersey, Ireland, etc)](/assets/extracting-keywords-final-gazetteer.txt). My completed list contains 157 entries, and should get you all of the entries that can be extracted from the texts in this collection.
+
Once you are happy with that, you can snag my [completed list of English and Welsh counties, shortforms, and various other cities (London, Bristol etc) and places (Jersey, Ireland, etc)](/assets/extracting-keywords/extracting-keywords-final-gazetteer.txt). My completed list contains 157 entries, and should get you all of the entries that can be extracted from the texts in this collection.
At this point you could stop, as you've achieved what you set out to do. This lesson taught you how to use a short Python program to search a fairly large number of texts for a set of keywords defined by you.
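The approach summarised in the line above, searching each text for any keyword from a gazetteer, can be sketched in a few lines of Python. This is a minimal sketch and not the lesson's own program: the function name `find_keywords`, the four-entry gazetteer, and the sample biographical entries are all hypothetical, and the word-boundary regular expression is one assumed way of doing the matching.

```python
import re

def find_keywords(text, gazetteer):
    """Return the gazetteer entries that occur in text as whole words."""
    matches = []
    for keyword in gazetteer:
        # \b anchors the match at word boundaries, so 'Kent' does not match 'Kentish'
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches.append(keyword)
    return matches

# Hypothetical stand-ins for the gazetteer file and the texts file
gazetteer = ["Kent", "London", "Bristol", "Jersey"]
texts = [
    "Smith, John, of London, gent.",
    "Jones, Thomas, of co. Kent, pleb.",
    "Brown, William, of Kentish Town.",
]

results = [find_keywords(text, gazetteer) for text in texts]
for text, found in zip(texts, results):
    print(text, "->", found)
```

Escaping each keyword with `re.escape` keeps entries that contain punctuation (a shortform ending in a full stop, for example) from being read as regular-expression syntax.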
en/lessons/generating-an-ordered-data-set-from-an-OCR-text-file.md (+1 -1)
@@ -221,7 +221,7 @@ def rom2ar(rom):
return result
```
-
(run <[this little script](/assets/Roman_to_Arabic.txt)> to see in detail how `rome2ar` works. Elegant programming like this can offer insight; like poetry.)
+
(run <[this little script](/assets/generating-an-ordered-data-set-from-an-OCR-text-file/Roman_to_Arabic.txt)> to see in detail how `rome2ar` works. Elegant programming like this can offer insight; like poetry.)
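The body of `rom2ar` lies outside this hunk, and the linked script is not reproduced here, so the following is not the lesson's implementation. It is a minimal sketch of one common way to convert Roman numerals to Arabic numbers, under the assumption that the input is a well-formed numeral; the names `ROMAN_VALUES` and `roman_to_arabic` are hypothetical.

```python
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_to_arabic(numeral):
    """Convert a well-formed Roman numeral (any case) to an integer."""
    total = 0
    previous = 0
    # Walk right to left: a symbol smaller than the largest one seen so far
    # to its right is subtracted (the 'i' in 'iv'); otherwise it is added.
    for symbol in reversed(numeral.lower()):
        value = ROMAN_VALUES[symbol]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total

for sample in ["iii", "xiv", "mcmxciv"]:
    print(sample, roman_to_arabic(sample))
```

The sketch does not validate its input; a character outside the seven symbols raises a `KeyError`, which may or may not be the behaviour wanted for noisy OCR text.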
## Some other things we'll need:
At the top of your Python module, you're going to want to import some python modules that are a part of the standard library. (see Fred Gibbs's tutorial [*Installing Python Modules with pip*](/lessons/installing-python-modules-pip)).
en/lessons/json-and-jq.md (+5 -5)
@@ -132,7 +132,7 @@ These set various jq [command-line options, or _flags_](https://stedolan.github.
jq operates by way of _filters_: a series of text commands that you can string together, and which dictate how jq should transform the JSON you give it.
-
To learn the basic jq filters, we'll work with a sample response from the Rijksmuseum API: [rkm.json](/assets/jq_rkm.json)
+
To learn the basic jq filters, we'll work with a sample response from the Rijksmuseum API: [rkm.json](/assets/json-and-jq/jq_rkm.json)
Select all the text at that link, copy it, and paste it into the "JSON" box at [jq play] on the left hand side.
@@ -425,7 +425,7 @@ One of the easiest ways to search and download Twitter data is using the excelle
For this lesson, we will use a small sample of 50 public tweets.
Clear the "Filter", "JSON" and "Result" boxes on [jq play], and ensure all the checkboxes are unchecked.
-
[Then copy this sample Twitter data](/assets/jq_twitter.json) into [jq play].
+
[Then copy this sample Twitter data](/assets/json-and-jq/jq_twitter.json) into [jq play].
### One-to-many relationships: Tweet hashtags
@@ -895,7 +895,7 @@ You should get the following table:
"whiteprivilege",1
```
-
[There are multiple ways to solve this with jq. See my answer here.](/assets/filter_retweets.txt)
+
[There are multiple ways to solve this with jq. See my answer here.](/assets/json-and-jq/filter_retweets.txt)
#### Count total retweets per user
@@ -909,7 +909,7 @@ Hints:
As a way to verify your results, user `356854246` should have a total retweet count of `51` based on this dataset.
-
[See my answer.](/assets/count_retweets.txt)
+
[See my answer.](/assets/json-and-jq/count_retweets.txt)
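The lesson solves this exercise with jq, and the linked answer file is not reproduced in this diff. As an illustration of the aggregation itself (group by user, sum the retweet counts), here is a minimal Python sketch. The three sample records are invented, the function name `total_retweets_per_user` is hypothetical, and the field names `user.id` and `retweet_count` are assumed to match the Twitter JSON used in the lesson.

```python
import json
from collections import defaultdict

# Invented sample; the lesson's dataset contains 50 tweets
sample = """
[
  {"user": {"id": 1}, "retweet_count": 3},
  {"user": {"id": 2}, "retweet_count": 0},
  {"user": {"id": 1}, "retweet_count": 4}
]
"""

def total_retweets_per_user(tweets):
    """Sum retweet_count over all tweets, keyed by the posting user's id."""
    totals = defaultdict(int)
    for tweet in tweets:
        totals[tweet["user"]["id"]] += tweet["retweet_count"]
    return dict(totals)

totals = total_retweets_per_user(json.loads(sample))
print(totals)
```

Run against the lesson's data, a function like this could be used to check the stated total for user `356854246`, provided the field-name assumption holds.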
## Using jq on the command line
@@ -959,7 +959,7 @@ This can be useful when downloading JSON with a utility like `wget` for retrievi
(See [Automated Downloading with Wget](/lessons/automated-downloading-with-wget) to learn the basics of this other command line program.)
en/lessons/sentiment-analysis-syuzhet.md (+1 -1)
@@ -245,7 +245,7 @@ library(tm)
## Load and Prepare the Text
-
Next, download a machine readable copy of the novel: [*Miau*](/assets/sentiment-analysis-syuzhet/galdos_miau.txt) and make sure to save it as a .txt file. When you open the file you will see that the novel is in [plain text](https://perma.cc/Z5WH-V9SW) format, which is essential for this particular analysis using R.
+
Next, download a machine readable copy of the novel: [*Miau*](/assets/analisis-de-sentimientos-r/galdos_miau.txt) and make sure to save it as a .txt file. When you open the file you will see that the novel is in [plain text](https://perma.cc/Z5WH-V9SW) format, which is essential for this particular analysis using R.
With the text at hand, you first need to load it into R as one long string so that you can work with it programmatically. Make sure to replace `FILEPATH` with the location of the novel on your own computer (don't just type 'FILEPATH'). This loading process is slightly different on Mac/Linux and Windows machines:
+[Excerpt from the Topic model of John Adams' Diary](/assets/sonification/sonification-diary.csv)
+
+[Excerpt from the Topic model of the Jesuit Relations](/assets/sonification/sonification-jesuittopics.csv)
# Some Background on Sonification
@@ -122,18 +122,18 @@ _There is no 'right' way to represent your data as sound_, at least not yet: but
But what about time? Historical data often has a punctuation point, a distinct 'time when' something occured. Thus, the amount of time between two data points has to be taken into account. This is where our next tool becomes quite useful, for when our data points have a relationship to one another in temporal space. We begin to move from sonfication (data points) to music (relationships between points).
### Practice
-
The [sample dataset](/assets/sonification-roman-data.csv) provided contains counts of Roman coins in its first column and counts of other Roman materials from the same locations, as contained in the Portable Antiquities Scheme database from the British Museum. A sonification of this data might reveal or highlight aspects of the economic situation along Watling street, a major route through Roman Britain. The data points are organized geographically from North West to South East; thus as the sound plays out, we are hearing movement over space. Each note represents another stop along the way.
+
The [sample dataset](/assets/sonification/sonification-roman-data.csv) provided contains counts of Roman coins in its first column and counts of other Roman materials from the same locations, as contained in the Portable Antiquities Scheme database from the British Museum. A sonification of this data might reveal or highlight aspects of the economic situation along Watling street, a major route through Roman Britain. The data points are organized geographically from North West to South East; thus as the sound plays out, we are hearing movement over space. Each note represents another stop along the way.
-
1. Open the[sonification-roman-data.csv](/assets/sonification-roman-data.csv) in a spreadsheet. Copy the first column into a text editor. Delete the line endings so that the data is all in a single row.
+
1. Open the[sonification-roman-data.csv](/assets/sonification/sonification-roman-data.csv) in a spreadsheet. Copy the first column into a text editor. Delete the line endings so that the data is all in a single row.
2. Add the following column information like so:
```
# Of Voices, Text Area Name, Text Area Data
1,morphBox,
,areaPitch1,
```
-
...so that your data follows immediately after that last comma (as like [this](/assets/sonification-romancoin-data-music.csv)). Save the file with a useful name like `coinsounds1.csv`.
+
...so that your data follows immediately after that last comma (as like [this](/assets/sonification/sonification-romancoin-data-music.csv)). Save the file with a useful name like `coinsounds1.csv`.
-
3. Go to the [Musicalgorithms](http://musicalgorithms.org/3.0/index.html) site (version 3), and hit the load button. In the pop-up, click the blue 'load' button and select the file saved in step 2. The site will load your materials and display a green check mark if it loaded successfully. If it did not, make sure that your values are separated by spaces, and that they follow immediately the last comma in the code block in step 2. You may also try loading up the [demo file for this tutorial](/assets/sonification-romancoin-data-music.csv) instead.{% include figure.html filename="sonification-musicalgorithms-upload-4.png" caption="Click 'load' on the main screen to get this dialogue box. Then 'load csv'. Select your file; it will appear in the box. Then click the bottom load button." %}
+
3. Go to the [Musicalgorithms](http://musicalgorithms.org/3.0/index.html) site (version 3), and hit the load button. In the pop-up, click the blue 'load' button and select the file saved in step 2. The site will load your materials and display a green check mark if it loaded successfully. If it did not, make sure that your values are separated by spaces, and that they follow immediately the last comma in the code block in step 2. You may also try loading up the [demo file for this tutorial](/assets/sonification/sonification-romancoin-data-music.csv) instead.{% include figure.html filename="sonification-musicalgorithms-upload-4.png" caption="Click 'load' on the main screen to get this dialogue box. Then 'load csv'. Select your file; it will appear in the box. Then click the bottom load button." %}
4. Click on 'Pitch Input'. You'll see the values of your data. For now, **do not select** any further options on this page (thus using the site's default values).
5. Click on 'Duration Input'. **Do not select any options here for now**. The options here will map various transformations against your data that will alter the duration for each note. Do not worry about these options for now; move on.
6. Click on 'Pitch Mapping'. This is the most crucial choice, as it will transform (that is, scale) your raw data to a mapping against the keys of the keyboard. Leave the `mapping` set to 'division'. (The other options are modulo or logarithmic). The option `Range` 1 to 88 uses the full 88 keys of the keyboard; thus your lowest value would accord to the deepest note on the piano and your highest value with the highest note. You might wish instead to constrain your music around middle C, so enter 25 to 60 as your range. The output should change to: `31,34,34,34,25,28,30,60,28,25,26,26,25,25,60,25,25,38,33,26,25,25,25` These are no longer your counts; they are notes on the keyboard.{% include figure.html filename="sonification-musicalgorithms-settings-for-pitch-mapping-5.png" caption="Click into the 'range' box and set it to 25. The values underneath will change automatically. Click into the 'to' box and set it to 60. Click back into the other box; the values will update." %}
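Step 6 describes scaling raw counts onto a range of piano keys. The exact formula behind the site's 'division' option is not given in the text, so the sketch below is only one plausible reading: a linear rescale in which the smallest value lands on the lowest key of the range and the largest on the highest. The function name `scale_to_range` and the sample counts are hypothetical, and the output is not expected to reproduce the site's numbers.

```python
def scale_to_range(values, low, high):
    """Linearly rescale values so min(values) maps to low and max(values) to high."""
    smallest, largest = min(values), max(values)
    if smallest == largest:
        # All inputs equal: no spread to scale, so place every note at the bottom key
        return [low for _ in values]
    span = largest - smallest
    # Multiply before dividing to keep the intermediate arithmetic exact for integers
    return [low + round((v - smallest) * (high - low) / span) for v in values]

# Invented counts, mapped onto keys 25 to 60 as in the step above
counts = [0, 5, 10, 35]
notes = scale_to_range(counts, 25, 60)
print(notes)
```

Under this reading, narrowing the range (25 to 60 rather than 1 to 88) compresses the same data into fewer keys, which is why the choice of range changes how the result sounds.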
@@ -244,7 +244,7 @@ Can you make your computer play this song? (This [chart](https://web.archive.org
### Getting your own data in
-
[This file](/assets/sonification-diary.csv) is a selection from the topic model fitted to John Adams' Diaries for[The Macroscope](http://themacroscope.org). Only the strongest signals have been preserved by rounding the values in the columns to two decimal places (remembering that .25 for instance would indicate that that topic is contributing to a quarter of that diary entry's composition). To get this data into your python script, it has to be formatted in a particular away. The tricky bit is getting the date field right.
+
[This file](/assets/sonification/sonification-diary.csv) is a selection from the topic model fitted to John Adams' Diaries for[The Macroscope](http://themacroscope.org). Only the strongest signals have been preserved by rounding the values in the columns to two decimal places (remembering that .25 for instance would indicate that that topic is contributing to a quarter of that diary entry's composition). To get this data into your python script, it has to be formatted in a particular away. The tricky bit is getting the date field right.
_For the purposes of this tutorial, we are going to leave the names of variables and so on unchanged from the sample script. The sample script was developed with earthquake data in mind; so where it says 'magnitude' we can think of it as equating to '% topic composition.'_
@@ -375,7 +375,7 @@ Why would you want to do this? As has progressively become clear in tutorial, wh
Here, I offer simply a code snippet that will allow you to import your data, where your data is simply a list of values saved as csv. I am indebted to George Washington University librarian Laura Wrubel who posted to [gist.github.com](https://gist.github.com/lwrubel) her experiments in sonifying her library's circulation transactions.
-
In this [sample file](/assets/sonification-jesuittopics.csv)(a topic model generated from the [Jesuit Relations](http://puffin.creighton.edu/jesuit/relations/)), there are two topics. The first row contains the headers: topic1, topic2.
+
In this [sample file](/assets/sonification/sonification-jesuittopics.csv)(a topic model generated from the [Jesuit Relations](http://puffin.creighton.edu/jesuit/relations/)), there are two topics. The first row contains the headers: topic1, topic2.
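The lesson's own import snippet falls outside this hunk. As a rough illustration of reading a two-topic CSV of this shape into per-column lists of numbers, here is a minimal sketch using Python's standard `csv` module. Only the header names `topic1` and `topic2` come from the text; the values are invented, and the function name `read_topic_columns` is hypothetical.

```python
import csv
import io

# Invented values; only the header names topic1 and topic2 come from the text
sample_csv = "topic1,topic2\n0.25,0.10\n0.05,0.40\n"

def read_topic_columns(handle):
    """Read a CSV with a header row into a dict of column name -> list of floats."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns

columns = read_topic_columns(io.StringIO(sample_csv))
print(columns)
```

To read a downloaded file instead of the in-memory sample, the same function can be passed a handle from `open(path, newline="")`.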
es/lecciones/administracion-de-datos-en-r.md (+2 -1)
@@ -78,7 +78,8 @@ Copia el siguiente código en R Studio. Para ejecutarlo tienes que marcar las l
```
## Un ejemplo de dplyr en acción
-
Veamos un ejemplo de cómo dyplr nos puede ayudar a los historiadores. Vamos a cargar los datos del censo decenal de 1790 a 2010 de Estados Unidos. Descarga los datos haciendo [click aquí](/assets/data-wrangling-and-management-in-r/ejemplo_introductorio_estados.csv)[^2] y ponlos en la carpeta que vas a utilizar para trabajar en los ejemplos de este tutorial.
+
+
Veamos un ejemplo de cómo dyplr nos puede ayudar a los historiadores. Vamos a cargar los datos del censo decenal de 1790 a 2010 de Estados Unidos. Descarga los datos haciendo [click aquí](/assets/administracion-de-datos-en-r/ejemplo_introductorio_estados.csv)[^2] y ponlos en la carpeta que vas a utilizar para trabajar en los ejemplos de este tutorial.
Como los datos están en un archivo CSV, vamos a usar el comando de lectura ```read_csv()``` en el paquete [readr](https://cran.r-project.org/web/packages/readr/vignettes/readr.html) de "tidyverse".