From c5ede726281059c9ee96cf2687d644f74dab9166 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Tue, 17 Jan 2023 15:44:50 -0500 Subject: [PATCH 01/15] Create data_mining_techniques.md --- .../data_mining_techniques.md | 442 ++++++++++++++++++ 1 file changed, 442 insertions(+) create mode 100644 data_mining_techniques/data_mining_techniques.md diff --git a/data_mining_techniques/data_mining_techniques.md b/data_mining_techniques/data_mining_techniques.md new file mode 100644 index 000000000..0a764e97f --- /dev/null +++ b/data_mining_techniques/data_mining_techniques.md @@ -0,0 +1,442 @@ + + +# Title + +(Note that the title is the only level-1 header in the document) + +To see how to use this template, you'll need to look at this file in its [raw format](https://raw.githubusercontent.com/arcus/education_r25/main/working_documentation/template_modules.md?token=ACEVZUTXZ6BTRFIIBXPN4SDBD3FR6). +To see what it looks like rendered via LiaScript, [click here](https://liascript.github.io/course/?https://raw.githubusercontent.com/arcus/education_r25/main/working_documentation/template_modules.md?token=ACEVZUTXZ6BTRFIIBXPN4SDBD3FR6#1) or go to [https://liascript.github.io/](https://liascript.github.io/) and paste the link to the **raw** file into the box on that page and click "load course". + +
+ +## Overview +@comment + +**Is this module right for me?** @long_description + +**Estimated time to completion:** @estimated_time + +**Pre-requisites** + +List any skills and knowledge needed to do this module here. When available, include links to resources, especially other modules we've made (to show learners where this falls within our catalog). + +* one skill we have [another module for, linked here](https://education.arcus.chop.edu) +* some familiarity with [a topic](https://education.arcus.chop.edu) +* understanding of [one thing](https://education.arcus.chop.edu) and [another](https://education.arcus.chop.edu) + +If relevant, you can include recommendations for somewhere else to start if the learner doesn't have these prereqs. For example: If you are brand new to R or python (or want a refresher) consider starting with [Intro to R](link) or [Intro to python](link) first and then coming back here. + +**Learning Objectives** + +@learning_objectives + +For help articulating learning objectives, see [this guide to learning objectives, including lots of example verbs](https://cft.vanderbilt.edu/guides-sub-pages/blooms-taxonomy/). + +
+ +## Lesson Preparation + +If your module includes code learners may want to run, then give links to a pangeo binder here so they can start it up now. Also provide a link to the raw code so learners can download the code itself and run it on their own machines or copy it into a cloud server. + +This module makes use of [pangeo binder](https://binder.pangeo.io/) for interactive code examples in R and python. You don't need to install anything or set up an account, but you need a modern web browser like Chrome and a moderately good wifi connection. If you have R and/or python already installed on your computer and you prefer to work through code examples there, you can download the code for this module to run offline. + +If you intend to do the hands-on activities in this module with pangeo binder, we have a bit of preparation for you to do now. Because it can take a few minutes for the environment to be created, we suggest you click the link below to start up the activity. We recommend using right-click to open it in a new tab or window, and then returning here to continue learning while the environment finishes loading. Here is the link: + +[![Link to start Binder environment](https://binder.pangeo.io/badge_logo.svg)](https://binder.pangeo.io/v2/gh/arcus/education_r_environment/main?urlpath=rstudio) **Click the "launch binder" button!** + +You don't have to do anything except come back here once the link opens in a new tab or window. + +## Module Content + +Note that liascript will create a new page at each level 1, 2, or 3 header, so to avoid a page with only a header and no content, include text after each header before the next. + +Text after level 2 headers provides a good opportunity to give a sentence or two of overview, explain the structure of the coming content, and/or get preliminaries out of the way. 
+ +## Including Media + + +![A valuable image, and this is its alt text.](https://upload.wikimedia.org/wikipedia/commons/0/0f/Grosser_Panda.JPG "Here is a subtitle that will display beneath the image.") + + +You can link to images online with their url, or locally with the file path, e.g. `![image caption](media/my_image.png)` + +If you want to provide several images in a gallery, just make a "paragraph" of image links and LiaScript will render it as a gallery: + +![img1 alt text.](https://upload.wikimedia.org/wikipedia/commons/6/68/Ailuropoda_melanoleuca_%28Panda_g%C3%A9ant%29_-_445.jpg) ![img2 alt text.](https://upload.wikimedia.org/wikipedia/commons/2/2d/Panda_giganti_al_Giant_Panda_Breeding_Research_Base_Chengdu.jpg) ![img3 alt text.](https://upload.wikimedia.org/wikipedia/commons/1/12/BabyPandaAtSDZ.jpg) + + +!?[This video is hosted on youtube.](https://www.youtube.com/watch?v=iIAO4Htzn8M) + +You can also embed local videos, just as with images: `!?[An embedded video.](vid/intro.mp4 "This is its subtitle")` + +In theory, you should be able to embed just about anything. Read more [here](https://liascript.github.io/course/?https://raw.githubusercontent.com/LiaScript/docs/master/README.md#24). + +You can also include movies, audio, and any other embedded content in galleries just by putting the links for them all in a paragraph. + +## Including highlight boxes + +Include special notes with different formatting. + +Note: There's an additional style of highlight not listed here, "answer", that is used in [quizzes](#quiz-quizzes). + + +### behind-the-scenes + +The style "behind-the-scenes" is for giving a little more technical detail about how something does what it does. +It has a gears icon and always begins with the text "Behind the scenes". +For example: + +
+Behind the scenes
+ +The commit number is a hash and associated details + +
+ +
+Behind the scenes
+ +In R the `<-` and `=` can both be used for assignment because... + +
+ +### care + +The style "care" is for content related to compassion, self-care, and motivation. For more technical help or troubleshooting, use "help" instead. +It has a hand-holding-heart icon and always begins with the text "A little encouragement..." +For example: + +
+A little encouragement...
+ +This is a topic with a tremendous amount of jargon, which can make resources you may find online hard to understand for folks new to the field. When that happens it's easy to feel like there's something wrong with you if you don't get it, but that's not the case! Those kinds of gatekeeping explanations are a failure on the part of the writer, not the learner. + +
+ +
+A little encouragement...
+ +Feeling overwhelmed? It takes a long time to learn git, so don't be disheartened if it doesn't click initially. Just focus on stage, commit, and push. Ignore the rest for now, until you've had a chance to practice just the stage-commit-push process several times. + +
+ +### cool-fact + +The style "cool-fact" is for any cool fact that really doesn't fit into any of our other categories. +It has a brain icon and always begins with the text "Did you know?" +For example: + +
+Did you know?
+ +Functions like this are sometimes called "syntactic sugar" because they don't change anything about how the code runs, they just make it easier for humans to read, the way that sugar makes food sweeter without adding any nutrition. + +
+ + +
+Did you know?
+ +This is a box showing how images work in a highlight box. + +![Carebear team.](https://media.giphy.com/media/W256ghnG9iV8I/giphy.gif) + + +
+ + +### help + +The style "help" is for troubleshooting help, common errors, and specific technical problems. If you want to emphasize a very serious potential problem, use "warning" instead. For support that is more psycho/emotional or meta-learning in nature, use "care" instead. +It has a circle-question icon and always begins with the text "Troubleshooting help". +For example: + +
+Troubleshooting help
+ +A common mistake when using `filter` is to write = when you mean ==. Remember that = is for argument assignment, and == is for testing equality in conditions. If you get them mixed up, your code won't run! + +
+ +### history + +The style "history" is for more historical context about how/when/why something came to be the way it currently is. +It has a clock-rotate-left icon and always begins with the text "Historical context". +For example: + +
+Historical context
+ +The reason this command is named grep is... + +
+ +
+Historical context
+ +The first README file was from 1971, etc. + +
+ + +### important + +The style "important" is for important points and key ideas. +It has a star icon and always begins with the text "Important note". +For example: + +
+Important note
+ +It's generally much easier to make any necessary changes to the dataframe, such as mutating variables, before sending it to the plotting command. + +
+ +### learn-more + +The style "learn-more" alerts users resources for further learning, especially links to a more in-depth discussion of an issue that might be touched on only briefly in the module. It can link to outside sources, or other modules by us. +It has a book icon and always begins with the text "Learning connection". +For example: + +
+Learning connection
+ +To learn more about the theory behind ggplot2, read [Hadley Wickham's article, "A Layered Grammar of Graphics"](http://vita.had.co.nz/papers/layered-grammar.pdf). + +
+ + +
+Learning connection
+ +To do this in R instead of python, see [this other module](example.com). + +
+ + +### options + +The style "options" is for an aside to let learners know there's another possible approach. This is for short explanations rather than linked resources; to link to another approach (e.g. here's a tutorial for another way to do this), use "learn-more" instead. +It has a left-right arrow icon and always begins with the text "Another option". +For example: + +
+Another option
+ +You could also skip setting up an OSF account completely and just use github to publish and share your research products, but many people prefer to have OSF links available. + +
+ +
+Another option
+ +You can run this in the cloud or download all of the files locally and run it on your computer. If you run it on your computer, be sure to make note of which directory you save the files in and update your working directory accordingly. + +
+ +### warning + +The style "warning" alerts users to potential pitfalls, and should be reserved for serious problems only. For less serious problems, use "help" instead. +It has a ! triangle icon and always begins with the text "Warning!". +For example: + +
+Warning!
+ +Double check your working directory before running this code. If you're in the wrong directory, you risk overwriting your files and losing all of your work with no way to recover it. + +
+ +
+Warning!
+ +Files uploaded to this account will be **publicly visible**. Be very careful not to upload anything with sensitive information like passwords or private data. + +
+ +## Including math + +I want to include a math statement here: $ 1 + \beta = 2 $ + +## Including code + +Next comes some code. This code won't do anything (it's not interactive). + +```r +# You only need to install it once +install.packages("ggplot2") + +# You'll need to load the library anew for each R session +library("ggplot2") +``` +You don't have to specify the programming language, but you can, and it should help you get appropriate syntax highlighting. + +```python +print("This is python code") +``` + +It is possible to include interactive code, too! See [the Rextester template for LiaScript](https://github.com/LiaTemplates/Rextester). + +## Quiz: Quizzes + +Quizzes are just more markdown text, so if you want it to show up on its own page, put a new header before it. Otherwise you can include quiz questions at the end of a section, or even interspersed with the rest of your content. + +Quizzes should connect directly to your learning objectives. Each quiz question should connect to one learning objective, and every learning objective should have at least one quiz question associated with it somewhere in the module. + +Quizzes should always be navigable from the sidebar, meaning they should be labeled with a level 2 or 3 header. If there is only one quiz in the module, it should be labelled as "Quiz". If there is more than one each header should be structured as "Quiz: label" where "label" is a short (ideally 1-2 words) description of the content covered in the question(s). E.g., "Quiz: Scatterplots" + +Here is the first question. It's multiple choice. + +[(X)] This answer is right +[( )] This is wrong +[( )] Also wrong +[[?]] Hint: Provide a hint here if you like. Hints are marked with the ? +[[?]] Hint: You can include as many hints as you want. + +You can have questions with multiple correct answers. 
Select all of the following correct choices: + +[[ ]] Not this one +[[X]] This is one of the correct ones +[[X]] Here's another correct one +[[ ]] This one is wrong, though +[[?]] Hint: Remember to select ALL of the correct choices. + +True or False: This statement is NOT true. ;) + +[( )] TRUE +[(X)] FALSE + +Short answer/text response. Note that, without any additional script, to get it marked "correct" the learner has to enter it exactly as you do. + +[[right answer]] +[[?]] Hint: The answer is "right answer" +*** +
+ +This is extra text that will show up after the learner clicks to have the correct answer revealed. It can be as long as you like, and allows any markdown formatting (you can embed pictures or videos, links, etc.). + +Use `
` to mark these sections with special styling, so that they're visually distinct from the rest of the quiz. The style for `"answer"` is defined in the css file. + +For this context to show up automatically when the learner answers the question correctly or clicks to have the right answer revealed, it needs to be surrounded by `***` (at least three, but you can use more if you want a more visually distinct horizontal marker in your md file). + +
+*** + +We can allow some flexibility in what we accept as correct answers for text by adding a little script after the answer, though. For the following, either "right answer" or "correct answer" (not case sensitive) will be accepted: + +[[right answer]] + +*** +
+ +For this question, either "right answer" or "correct answer" (not case sensitive) counts as correct. + +
+*** + +This question accepts any of several items from a list of possible correct answers. It is not case sensitive (that's the little `i` at the end of the regex). + +[[this text will never show up if they type a right answer and click "Check", only if they click the checkmark button to reveal the answer]] +[[?]] Hint: The answers are like "item1", "item2", etc. + +*** +
+ +With flexible answers like this, it's definitely a good idea to include a follow-up to help the learner put their answer in context. + +For example, if the question was "Name one or more colors" with acceptable answers including red, orange, yellow, green, blue, and purple, and they wrote "red, green, and the center of a black hole" that would be marked as correct because it contains at least one string from the acceptable list. Similarly, "hammered metal" would be marked as correct because it contains the string "red" ([you can prevent this if you want](https://www.w3schools.com/jsref/jsref_regexp_begin.asp)). On the other hand "teal, scarlet, indigo" would be marked wrong. + +Reiterate what the correct answer or answers should be, and try to anticipate likely wrong answers so you can explain why they're not correct. + +
+*** + +There are also questions that allow you to select from a drop down, but I don't know why that would be preferable over regular multiple choice. [Read more about quiz syntax here.](https://liascript.github.io/course/?https://raw.githubusercontent.com/liaScript/docs/master/README.md#quizzes) + +Note that you can use any markdown formatting you want in quizzes, including bold, links, math, etc. + +Surveys (ungraded questions) +--- + +You can ask questions with no graded answer as well. LiaScript calls these [surveys](https://liascript.github.io/course/?https://raw.githubusercontent.com/LiaScript/docs/master/README.md#111). + +Here's an ungraded question with a text box three lines long: + +[[___ ___ ___]] + +Here's one that's just one line long: + +[[___]] + +Here's a multiple choice with no correct answer. What is your favorite Beatles album? + +[(rev)] Revolver +[(wa)] The While Album +[(ar)] Abbey Road +[(sgtp)] Sgt. Pepper's Lonely Hearts Club Band + +Here's a survey multiple choice that lets you select more than one response. Which Beatles albums do you love super hard? + +[[rev]] Revolver +[[wa]] The While Album +[[ar]] Abbey Road +[[sgtp]] Sgt. Pepper's Lonely Hearts Club Band + +Hints and follow-up explanations don't work for survey questions. + + +## Additional Resources + +The last section of the module content should be a list of additional resources, both ours and outside sources, including links to other modules that build on this content or are otherwise related. + +## Feedback + +In the beginning, we stated some goals. + +**Learning Objectives:** + +@learning_objectives + +We ask you to fill out a brief (5 minutes or less) survey to let us know: + +* If we achieved the learning objectives +* If the module difficulty was appropriate +* If we gave you the experience you expected + +We gather this information in order to iteratively improve our work. 
Thank you in advance for filling out [our brief survey](https://redcap.chop.edu/surveys/?s=KHTXCXJJ93&module_name=%22Module+Template%22)! + +Remember to change the redcap link so that the module name is correct for this module! From a01ae1942bca548d5e3b02158a910a2084946430 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 11:16:18 -0500 Subject: [PATCH 02/15] new new --- .../demystifying_data_mining.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename data_mining_techniques/data_mining_techniques.md => demystifying_data_mining/demystifying_data_mining.md (100%) diff --git a/data_mining_techniques/data_mining_techniques.md b/demystifying_data_mining/demystifying_data_mining.md similarity index 100% rename from data_mining_techniques/data_mining_techniques.md rename to demystifying_data_mining/demystifying_data_mining.md From 118baf0bcc3e25571212288971ef38ff50f855f0 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 12:05:02 -0500 Subject: [PATCH 03/15] new new --- .../demystifying_data_mining.md | 53 ++++++++----------- 1 file changed, 22 insertions(+), 31 deletions(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index 0a764e97f..c5c0b146d 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -1,24 +1,24 @@ -# Title - -(Note that the title is the only level-1 header in the document) - -To see how to use this template, you'll need to look at this file in its [raw format](https://raw.githubusercontent.com/arcus/education_r25/main/working_documentation/template_modules.md?token=ACEVZUTXZ6BTRFIIBXPN4SDBD3FR6). 
-To see what it looks like rendered via LiaScript, [click here](https://liascript.github.io/course/?https://raw.githubusercontent.com/arcus/education_r25/main/working_documentation/template_modules.md?token=ACEVZUTXZ6BTRFIIBXPN4SDBD3FR6#1) or go to [https://liascript.github.io/](https://liascript.github.io/) and paste the link to the **raw** file into the box on that page and click "load course". +# Demystifying Data Mining
@@ -45,34 +40,30 @@ To see what it looks like rendered via LiaScript, [click here](https://liascript **Estimated time to completion:** @estimated_time **Pre-requisites** - -List any skills and knowledge needed to do this module here. When available, include links to resources, especially other modules we've made (to show learners where this falls within our catalog). - -* one skill we have [another module for, linked here](https://education.arcus.chop.edu) -* some familiarity with [a topic](https://education.arcus.chop.edu) -* understanding of [one thing](https://education.arcus.chop.edu) and [another](https://education.arcus.chop.edu) - -If relevant, you can include recommendations for somewhere else to start if the learner doesn't have these prereqs. For example: If you are brand new to R or python (or want a refresher) consider starting with [Intro to R](link) or [Intro to python](link) first and then coming back here. +No prerequisites **Learning Objectives** @learning_objectives -For help articulating learning objectives, see [this guide to learning objectives, including lots of example verbs](https://cft.vanderbilt.edu/guides-sub-pages/blooms-taxonomy/). +
 -## Lesson Preparation +## What is Data Mining? + +Data mining is the extraction of useful information from very large sets of data. During research, huge amounts of data are gathered. Data mining is the process of searching that data for the relevant information that can be used to make informed decisions. + +Data mining is the phase between gathering the data and creating a model. -If your module includes code learners may want to run, then give links to a pangeo binder here so they can start it up now. Also provide a link to the raw code so learners can download the code itself and run it on their own machines or copy it into a cloud server. +A model is a simplified representation of events or instances. -This module makes use of [pangeo binder](https://binder.pangeo.io/) for interactive code examples in R and python. You don't need to install anything or set up an account, but you need a modern web browser like Chrome and a moderately good wifi connection. If you have R and/or python already installed on your computer and you prefer to work through code examples there, you can download the code for this module to run offline. +Gathering data -> Mining data (DATA MINING!) -> Creating models. -If you intend to do the hands-on activities in this module with pangeo binder, we have a bit of preparation for you to do now. Because it can take a few minutes for the environment to be created, we suggest you click the link below to start up the activity. We recommend using right-click to open it in a new tab or window, and then returning here to continue learning while the environment finishes loading. Here is the link: +The primary purpose of mining data is to identify trends, patterns, and relationships in order to make informed decisions and plans. 
-[![Link to start Binder environment](https://binder.pangeo.io/badge_logo.svg)](https://binder.pangeo.io/v2/gh/arcus/education_r_environment/main?urlpath=rstudio) **Click the "launch binder" button!** +**Note:** Data mining is most useful with very large datasets: in general, the more data available, the more accurate and in-depth the trends, patterns, and relationships that can be identified. -You don't have to do anything except come back here once the link opens in a new tab or window. ## Module Content From 9f6340c3db995d6958cecec828922ef3e8d1451d Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 13:28:23 -0500 Subject: [PATCH 04/15] update update --- .../demystifying_data_mining.md | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index c5c0b146d..eb1869fe6 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -64,6 +64,36 @@ The primary purpose of mining data is to identify trends, patterns, and relation **Note:** Data mining is most useful with very large datasets: in general, the more data available, the more accurate and in-depth the trends, patterns, and relationships that can be identified. +## Why is Data Mining important? + +Data mining is used by large companies for different reasons, one of which is to get as much information as possible from the data they hold about their consumers. + +Because data mining can also be used to predict trends, large companies use it to plan their next line of production. Stores use the same predictive capability to determine things such as what consumers buy most and which placement of goods encourages consumers to spend more. 
That is, if we place the cereal section close to the milk section, would the consumer feel more inclined to buy cereal? + +Returning to the ability to predict trends: data mining can be used to estimate what is likely to happen in the future. That is why it is so favoured by large businesses and corporations (who doesn't want the ability to predict the next big thing!). Data mining does this by consolidating the data and using previous events (taken from the data) to estimate the probability of what could happen in the future. + +------ +**Example:** M hospital takes the data it has collected over the last five years, cleans it, and then mines it. The mining shows that during each of the last 3 springs there was a large influx of patients visiting the hospital with cases of Rhinovirus. Based on the mined data, the same is predicted to happen this spring. + +Data mining allowed the hospital to search through the gathered data for relevant information in order to make predictions about the future. + +## Applications of Data Mining + +**Data mining in Healthcare:** With the data accumulated in healthcare, data mining can help find the most appropriate and cost-effective practices that benefit both the hospital and the patients. + +**Data mining in Research analysis:** Data mining is one of the best tools for cleaning data, pre-processing data, and integrating data into a database, which makes it ideal for researchers. Data mining can help identify correlations between activities, or co-occurring sequences, that may change the direction of the research. Data mining, when used with data visualization and visual data mining, can help clarify data in research. + +## Challenges of Data Mining + +**Big data:** Many existing systems struggle with handling, storing, and making use of the flood of unorganised input that comes with big data sets. Some systems simply crash. 
+ +**User competency:** To fully gain the benefits of data mining, the user must understand the data available and the context of the information they are seeking. They must also know, at least generally, how the data mining tools work and what they can do. + +**Data quality and availability:** As with anything that depends on gathered data, the quality of that data matters a great deal. Mining low-quality data gives low-quality results: when the input is bad, the output will be bad too. Privacy concerns are also important, both in acquiring the data and in caring for and handling it once it is in your possession. + +## Additional Resources + + ## Module Content From e5869f78662da1578b012158b8ed8e32b715231b Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 15:26:57 -0500 Subject: [PATCH 05/15] update update --- .../demystifying_data_mining.md | 365 ------------------ 1 file changed, 365 deletions(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index eb1869fe6..2cbe2de71 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -95,369 +95,4 @@ Data mining allowed the hospital to search through the gathered data for relevan -## Module Content -Note that liascript will create a new page at each level 1, 2, or 3 header, so to avoid a page with only a header and no content, include text after each header before the next. - -Text after level 2 headers provides a good opportunity to give a sentence or two of overview, explain the structure of the coming content, and/or get preliminaries out of the way. 
- -## Including Media - - -![A valuable image, and this is its alt text.](https://upload.wikimedia.org/wikipedia/commons/0/0f/Grosser_Panda.JPG "Here is a subtitle that will display beneath the image.") - - -You can link to images online with their url, or locally with the file path, e.g. `![image caption](media/my_image.png)` - -If you want to provide several images in a gallery, just make a "paragraph" of image links and LiaScript will render it as a gallery: - -![img1 alt text.](https://upload.wikimedia.org/wikipedia/commons/6/68/Ailuropoda_melanoleuca_%28Panda_g%C3%A9ant%29_-_445.jpg) ![img2 alt text.](https://upload.wikimedia.org/wikipedia/commons/2/2d/Panda_giganti_al_Giant_Panda_Breeding_Research_Base_Chengdu.jpg) ![img3 alt text.](https://upload.wikimedia.org/wikipedia/commons/1/12/BabyPandaAtSDZ.jpg) - - -!?[This video is hosted on youtube.](https://www.youtube.com/watch?v=iIAO4Htzn8M) - -You can also embed local videos, just as with images: `!?[An embedded video.](vid/intro.mp4 "This is its subtitle")` - -In theory, you should be able to embed just about anything. Read more [here](https://liascript.github.io/course/?https://raw.githubusercontent.com/LiaScript/docs/master/README.md#24). - -You can also include movies, audio, and any other embedded content in galleries just by putting the links for them all in a paragraph. - -## Including highlight boxes - -Include special notes with different formatting. - -Note: There's an additional style of highlight not listed here, "answer", that is used in [quizzes](#quiz-quizzes). - - -### behind-the-scenes - -The style "behind-the-scenes" is for giving a little more technical detail about how something does what it does. -It has a gears icon and always begins with the text "Behind the scenes". -For example: - -
-Behind the scenes
- -The commit number is a hash and associated details - -
- -
-Behind the scenes
- -In R the `<-` and `=` can both be used for assignment because... - -
- -### care - -The style "care" is for content related to compassion, self-care, and motivation. For more technical help or troubleshooting, use "help" instead. -It has a hand-holding-heart icon and always begins with the text "A little encouragement..." -For example: - -
-A little encouragement...
- -This is a topic with a tremendous amount of jargon, which can make resources you may find online hard to understand for folks new to the field. When that happens it's easy to feel like there's something wrong with you if you don't get it, but that's not the case! Those kinds of gatekeeping explanations are a failure on the part of the writer, not the learner. - -
- -
-A little encouragement...
- -Feeling overwhelmed? It takes a long time to learn git, so don't be disheartened if it doesn't click initially. Just focus on stage, commit, and push. Ignore the rest for now, until you've had a chance to practice just the stage-commit-push process several times. - -
- -### cool-fact - -The style "cool-fact" is for any cool fact that really doesn't fit into any of our other categories. -It has a brain icon and always begins with the text "Did you know?" -For example: - -
-Did you know?
- -Functions like this are sometimes called "syntactic sugar" because they don't change anything about how the code runs, they just make it easier for humans to read, the way that sugar makes food sweeter without adding any nutrition. - -
- - -
-Did you know?
- -This is a box showing how images work in a highlight box. - -![Carebear team.](https://media.giphy.com/media/W256ghnG9iV8I/giphy.gif) - - -
- - -### help - -The style "help" is for troubleshooting help, common errors, and specific technical problems. If you want to emphasize a very serious potential problem, use "warning" instead. For support that is more psycho/emotional or meta-learning in nature, use "care" instead. -It has a circle-question icon and always begins with the text "Troubleshooting help". -For example: - -
-Troubleshooting help
- -A common mistake when using `filter` is to write = when you mean ==. Remember that = is for argument assignment, and == is for testing equality in conditions. If you get them mixed up, your code won't run! - -
- -### history - -The style "history" is for more historical context about how/when/why something came to be the way it currently is. -It has a clock-rotate-left icon and always begins with the text "Historical context". -For example: - -
-Historical context
- -The reason this command is named grep is... - -
- -
-Historical context
- -The first README file was from 1971, etc. - -
- - -### important - -The style "important" is for important points and key ideas. -It has a star icon and always begins with the text "Important note". -For example: - -
-Important note
- -It's generally much easier to make any necessary changes to the dataframe, such as mutating variables, before sending it to the plotting command. - -
- -### learn-more - -The style "learn-more" alerts users to resources for further learning, especially links to a more in-depth discussion of an issue that might be touched on only briefly in the module. It can link to outside sources, or other modules by us. -It has a book icon and always begins with the text "Learning connection". -For example: - -
-Learning connection
- -To learn more about the theory behind ggplot2, read [Hadley Wickham's article, "A Layered Grammar of Graphics"](http://vita.had.co.nz/papers/layered-grammar.pdf). - -
- - -
-Learning connection
- -To do this in R instead of python, see [this other module](example.com). - -
- - -### options - -The style "options" is for an aside to let learners know there's another possible approach. This is for short explanations rather than linked resources; to link to another approach (e.g. here's a tutorial for another way to do this), use "learn-more" instead. -It has a left-right arrow icon and always begins with the text "Another option". -For example: - -
-Another option
- -You could also skip setting up an OSF account completely and just use github to publish and share your research products, but many people prefer to have OSF links available. - -
- -
-Another option
- -You can run this in the cloud or download all of the files locally and run it on your computer. If you run it on your computer, be sure to make note of which directory you save the files in and update your working directory accordingly. - -
- -### warning - -The style "warning" alerts users to potential pitfalls, and should be reserved for serious problems only. For less serious problems, use "help" instead. -It has a ! triangle icon and always begins with the text "Warning!". -For example: - -
-Warning!
- -Double check your working directory before running this code. If you're in the wrong directory, you risk overwriting your files and losing all of your work with no way to recover it. - -
- -
-Warning!
- -Files uploaded to this account will be **publicly visible**. Be very careful not to upload anything with sensitive information like passwords or private data. - -
- -## Including math - -I want to include a math statement here: $ 1 + \beta = 2 $ - -## Including code - -Next comes some code. This code won't do anything (it's not interactive). - -```r -# You only need to install it once -install.packages("ggplot2") - -# You'll need to load the library anew for each R session -library("ggplot2") -``` -You don't have to specify the programming language, but you can, and it should help you get appropriate syntax highlighting. - -```python -print("This is python code") -``` - -It is possible to include interactive code, too! See [the Rextester template for LiaScript](https://github.com/LiaTemplates/Rextester). - -## Quiz: Quizzes - -Quizzes are just more markdown text, so if you want it to show up on its own page, put a new header before it. Otherwise you can include quiz questions at the end of a section, or even interspersed with the rest of your content. - -Quizzes should connect directly to your learning objectives. Each quiz question should connect to one learning objective, and every learning objective should have at least one quiz question associated with it somewhere in the module. - -Quizzes should always be navigable from the sidebar, meaning they should be labeled with a level 2 or 3 header. If there is only one quiz in the module, it should be labelled as "Quiz". If there is more than one each header should be structured as "Quiz: label" where "label" is a short (ideally 1-2 words) description of the content covered in the question(s). E.g., "Quiz: Scatterplots" - -Here is the first question. It's multiple choice. - -[(X)] This answer is right -[( )] This is wrong -[( )] Also wrong -[[?]] Hint: Provide a hint here if you like. Hints are marked with the ? -[[?]] Hint: You can include as many hints as you want. - -You can have questions with multiple correct answers. 
Select all of the following correct choices: - -[[ ]] Not this one -[[X]] This is one of the correct ones -[[X]] Here's another correct one -[[ ]] This one is wrong, though -[[?]] Hint: Remember to select ALL of the correct choices. - -True or False: This statement is NOT true. ;) - -[( )] TRUE -[(X)] FALSE - -Short answer/text response. Note that, without any additional script, to get it marked "correct" the learner has to enter it exactly as you do. - -[[right answer]] -[[?]] Hint: The answer is "right answer" -*** -
- -This is extra text that will show up after the learner clicks to have the correct answer revealed. It can be as long as you like, and allows any markdown formatting (you can embed pictures or videos, links, etc.). - -Use `
` to mark these sections with special styling, so that they're visually distinct from the rest of the quiz. The style for `"answer"` is defined in the css file. - -For this context to show up automatically when the learner answers the question correctly or clicks to have the right answer revealed, it needs to be surrounded by `***` (at least three, but you can use more if you want a more visually distinct horizontal marker in your md file). - -
-*** - -We can allow some flexibility in what we accept as correct answers for text by adding a little script after the answer, though. For the following, either "right answer" or "correct answer" (not case sensitive) will be accepted: - -[[right answer]] - -*** -
- -For this question, either "right answer" or "correct answer" (not case sensitive) counts as correct. - -
-*** - -This question accepts any of several items from a list of possible correct answers. It is not case sensitive (that's the little `i` at the end of the regex). - -[[this text will never show up if they type a right answer and click "Check", only if they click the checkmark button to reveal the answer]] -[[?]] Hint: The answers are like "item1", "item2", etc. - -*** -
- -With flexible answers like this, it's definitely a good idea to include a follow-up to help the learner put their answer in context. - -For example, if the question was "Name one or more colors" with acceptable answers including red, orange, yellow, green, blue, and purple, and they wrote "red, green, and the center of a black hole" that would be marked as correct because it contains at least one string from the acceptable list. Similarly, "hammered metal" would be marked as correct because it contains the string "red" ([you can prevent this if you want](https://www.w3schools.com/jsref/jsref_regexp_begin.asp)). On the other hand "teal, scarlet, indigo" would be marked wrong. - -Reiterate what the correct answer or answers should be, and try to anticipate likely wrong answers so you can explain why they're not correct. - -
-*** - -There are also questions that allow you to select from a drop down, but I don't know why that would be preferable over regular multiple choice. [Read more about quiz syntax here.](https://liascript.github.io/course/?https://raw.githubusercontent.com/liaScript/docs/master/README.md#quizzes) - -Note that you can use any markdown formatting you want in quizzes, including bold, links, math, etc. - -Surveys (ungraded questions) ---- - -You can ask questions with no graded answer as well. LiaScript calls these [surveys](https://liascript.github.io/course/?https://raw.githubusercontent.com/LiaScript/docs/master/README.md#111). - -Here's an ungraded question with a text box three lines long: - -[[___ ___ ___]] - -Here's one that's just one line long: - -[[___]] - -Here's a multiple choice with no correct answer. What is your favorite Beatles album? - -[(rev)] Revolver -[(wa)] The White Album -[(ar)] Abbey Road -[(sgtp)] Sgt. Pepper's Lonely Hearts Club Band - -Here's a survey multiple choice that lets you select more than one response. Which Beatles albums do you love super hard? - -[[rev]] Revolver -[[wa]] The White Album -[[ar]] Abbey Road -[[sgtp]] Sgt. Pepper's Lonely Hearts Club Band - -Hints and follow-up explanations don't work for survey questions. - - -## Additional Resources - -The last section of the module content should be a list of additional resources, both ours and outside sources, including links to other modules that build on this content or are otherwise related. - -## Feedback - -In the beginning, we stated some goals. - -**Learning Objectives:** - -@learning_objectives - -We ask you to fill out a brief (5 minutes or less) survey to let us know: - -* If we achieved the learning objectives -* If the module difficulty was appropriate -* If we gave you the experience you expected - -We gather this information in order to iteratively improve our work. 
Thank you in advance for filling out [our brief survey](https://redcap.chop.edu/surveys/?s=KHTXCXJJ93&module_name=%22Module+Template%22)! - -Remember to change the redcap link so that the module name is correct for this module! From 3cc9c51013b5c483c2af432bbfc08205a979d22c Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 15:34:03 -0500 Subject: [PATCH 06/15] update update --- demystifying_data_mining/demystifying_data_mining.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index 2cbe2de71..c92b1721b 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -9,8 +9,7 @@ narrator: UK English Female title: Demystifying Data Mining comment: understand what data mining is and why it is important. long_description: Every day, huge amounts of data are generated, collected and stored. Learn what data mining is and why it is important. -estimated_time: This is a rough guess of how long it might take a learner to work through the module. It will print under "Estimated time to completion" in the overview - +estimated_time: 15 minutes @learning_objectives After completion of this module, learners will be able to: @@ -72,8 +71,10 @@ As data mining is also used to predict trends, large companies use it in order t Back to the ability to predict trends, data mining can be used to predict the future. That is why it is so favoured by large businesses and corporations (who doesn't want the ability to predict the next big thing!). Data mining does this by consolidating the data and using previous events (taken from the data) to run a sort of probability test of what could happen in the future. ------- -**Example:** M hospital decided to take the data it had from the last five years, clean it, and then mine it. 
After mining the data, it was found that during the last 3 springs, there had been a large influx of patients visiting the hospital with cases of Rhinovirus. According to the data mined, it is predicted that the same will happen this spring. +
+Learning connection
+ +M hospital decided to take the data it had from the last five years, clean it, and then mine it. After mining the data, it was found that during the last 3 springs, there had been a large influx of patients visiting the hospital with cases of Rhinovirus. According to the data mined, it is predicted that the same will happen this spring. Data mining allowed the hospital to search through the gathered data for relevant information in order to make predictions about the future. From a9ed4973832f2eb8b185659a5322363a97dc7bef Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 15:39:14 -0500 Subject: [PATCH 07/15] update update --- demystifying_data_mining/demystifying_data_mining.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index c92b1721b..eecf69615 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -45,8 +45,6 @@ No perequisites @learning_objectives - -
 ## What is Data Mining? @@ -71,8 +69,8 @@ As data mining is also used to predict trends, large companies use it in order t Back to the ability to predict trends, data mining can be used to predict the future. That is why it is so favoured by large businesses and corporations (who doesn't want the ability to predict the next big thing!). Data mining does this by consolidating the data and using previous events (taken from the data) to run a sort of probability test of what could happen in the future. -
-Learning connection
+
+Example
 M hospital decided to take the data it had from the last five years, clean it, and then mine it. After mining the data, it was found that during the last 3 springs, there had been a large influx of patients visiting the hospital with cases of Rhinovirus. According to the data mined, it is predicted that the same will happen this spring. From 2c4413ffaf30af08a9a25677b4acf44185e94a00 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 16:25:15 -0500 Subject: [PATCH 08/15] update update --- demystifying_data_mining/demystifying_data_mining.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index eecf69615..620fb70b6 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -92,6 +92,5 @@ Data mining allowed the hospital to search through the gathered data for relevan ## Additional Resources - - +[Top Ten Data Mining Applications in the Real World](https://intellipaat.com/blog/top-data-mining-applications/), provide more in-depth examples of how data mining can be applied to real-world situations and jobs. 
From 30410dc0fa0344a2688654b72cda9eb5e3202067 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Mon, 23 Jan 2023 16:30:13 -0500 Subject: [PATCH 09/15] update update --- demystifying_data_mining/demystifying_data_mining.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index 620fb70b6..f050fdb90 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -92,5 +92,5 @@ Data mining allowed the hospital to search through the gathered data for relevan ## Additional Resources -[Top Ten Data Mining Applications in the Real World](https://intellipaat.com/blog/top-data-mining-applications/), provide more in-depth examples of how data mining can be applied to real-world situations and jobs. +The [Top Ten Data Mining Applications in the Real World](https://intellipaat.com/blog/top-data-mining-applications/),a free online educational site, provides more in-depth examples of how data mining can be applied to real-world situations and jobs. From 7d9cbeb10bbfead520c45c19105a94ff72f6cc10 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Tue, 24 Jan 2023 12:33:42 -0500 Subject: [PATCH 10/15] Update demystifying_data_mining.md I think this is the final output? 
--- demystifying_data_mining/demystifying_data_mining.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index f050fdb90..bde97f53c 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -88,7 +88,7 @@ Data mining allowed the hospital to search through the gathered data for relevan **User competency:** To fully gain the benefits of data mining, the user must understand the data available and the context of the information they are seeking. They must also know, at least generally, how the tools (for data mining) work and what they can do. -**Data quality and availability:** Like with anything that needs data to be gathered, the quality of the data gathered is very significant. Mining data from low quality data will give low quality results. When the input is bad, the output would also be bad. Privacy concerns are also important, both in terms of the acquisition of the data and the care and handling once it is in your possession. +**Data quality and availability:** Data quality and availability:If the input is of low quality, the output will also be low quality. Data mining needs data, and as with anything that needs data to be collected, the quality of that data needs to be high. Mining low quality data will give low quality information. 
## Additional Resources From 98d10df803fef7632c9d15a42a5613c0b9c7f211 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Tue, 24 Jan 2023 12:38:45 -0500 Subject: [PATCH 11/15] Update demystifying_data_mining.md update --- demystifying_data_mining/demystifying_data_mining.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index bde97f53c..8582dcca0 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -2,7 +2,7 @@ author: Agoro Oluwadamilare email: agoroo@chop.edu -version: 1.0.0 +version: 2.0.0 module_template_version: 3.0.0 language: en narrator: UK English Female From 41906751e611be5fdaf417129cd5c9b5f59e922a Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Tue, 24 Jan 2023 14:55:14 -0500 Subject: [PATCH 12/15] Update demystifying_data_mining.md correction on typo --- demystifying_data_mining/demystifying_data_mining.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index 8582dcca0..ffb0e6f69 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -88,7 +88,7 @@ Data mining allowed the hospital to search through the gathered data for relevan **User competency:** To fully gain the benefits of data mining, the user must understand the data available and the context of the information they are seeking. They must also know, at least generally, how the tools (for data mining) work and what they can do. -**Data quality and availability:** Data quality and availability:If the input is of low quality, the output will also be low quality. 
Data mining needs data, and as with anything that needs data to be collected, the quality of that data needs to be high. Mining low quality data will give low quality information. +**Data quality and availability:**If the input is low quality, the output will also be low quality. Data mining needs data, and as with anything that needs data to be collected, the quality of that data needs to be high. Mining low quality data will give low quality information. ## Additional Resources From 309aa6da825c02268dda60d213d1990c00760cc2 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Wed, 25 Jan 2023 13:23:56 -0500 Subject: [PATCH 13/15] Update demystifying_data_mining.md update --- demystifying_data_mining/demystifying_data_mining.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index ffb0e6f69..c464ab575 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -76,6 +76,9 @@ M hospital decided to get the data they have from the last five years, they clea Data mining allowed the hospital to search through the gathered data for relevant information in order to make predictions about the future. +
+ + ## Applications of Data Mining **Data mining in Healthcare:** With the data accumulated in healthcare, data mining can help find the most appropriate and cost-effective practices that benefit both the hospital and the patients. From 40f256c9f2fcafdb898dc19bf2b2b92201e33550 Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Thu, 26 Jan 2023 12:30:48 -0500 Subject: [PATCH 14/15] Update demystifying_data_mining.md update --- demystifying_data_mining/demystifying_data_mining.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index c464ab575..6fa5ba458 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -91,7 +91,7 @@ Data mining allowed the hospital to search through the gathered data for relevan **User competency:** To fully gain the benefits of data mining, the user must understand the data available and the context of the information they are seeking. They must also know, at least generally, how the tools (for data mining) work and what they can do. -**Data quality and availability:**If the input is low quality, the output will also be low quality. Data mining needs data, and as with anything that needs data to be collected, the quality of that data needs to be high. Mining low quality data will give low quality information. +**Data quality and availability:** If the input is low quality, the output will also be low quality. Data mining needs data, and as with anything that needs data to be collected, the quality of that data needs to be high. Mining low quality data will give low quality information. 
 ## Additional Resources From 7132fc3335e9a0cbb77cf459a68948bb09634a0c Mon Sep 17 00:00:00 2001 From: Damizen2 <105399365+Damizen2@users.noreply.github.com> Date: Thu, 26 Jan 2023 14:43:56 -0500 Subject: [PATCH 15/15] Update demystifying_data_mining.md update --- .../demystifying_data_mining.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/demystifying_data_mining/demystifying_data_mining.md b/demystifying_data_mining/demystifying_data_mining.md index 6fa5ba458..5c832191d 100644 --- a/demystifying_data_mining/demystifying_data_mining.md +++ b/demystifying_data_mining/demystifying_data_mining.md @@ -95,5 +95,21 @@ Data mining allowed the hospital to search through the gathered data for relevan ## Additional Resources -The [Top Ten Data Mining Applications in the Real World](https://intellipaat.com/blog/top-data-mining-applications/),a free online educational site, provides more in-depth examples of how data mining can be applied to real-world situations and jobs. +The [Top Ten Data Mining Applications in the Real World](https://intellipaat.com/blog/top-data-mining-applications/), a free online educational site, provides more in-depth examples of how data mining can be applied to real-world situations and jobs. + +## Feedback + +In the beginning, we stated some goals. + +**Learning Objectives** + +@learning_objectives + +We ask you to fill out a brief (5 minutes or less) survey to let us know: + +* If we achieved the learning objectives +* If the module difficulty was appropriate +* If we gave you the experience you expected + +We gather this information in order to iteratively improve our work. Thank you in advance for filling out [our brief survey](https://redcap.chop.edu/surveys/?s=KHTXCXJJ93&module_name=%22Demystifying+Data+Mining%22)!