# Job Recommendation System: End-to-End Deep Learning Workload

This sample illustrates how to use Intel® Extension for TensorFlow* to build and run an end-to-end AI workload, using a job recommendation system as the example.

| Property | Description |
|:--- |:--- |
| Category | Reference Designs and End to End |
| What you will learn | How to use Intel® Extension for TensorFlow* to build an end-to-end AI workload |
| Time to complete | 30 minutes |

## Purpose

This code sample shows an end-to-end deep learning workload using the example of a job recommendation system. It consists of four main parts:

1. Data exploration and visualization - showing what the dataset looks like, its main features, and the distribution of the data.
2. Data cleaning and pre-processing - removing duplicates and explaining all necessary text pre-processing steps.
3. Fraudulent job posting removal - detecting fake job postings with an LSTM deep neural network and filtering them out.
4. Job recommendation - computing and returning the top-n job descriptions most similar to the chosen one.
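The recommendation step (part 4) can be sketched with a plain bag-of-words cosine similarity. This is a minimal illustration only: the `cosine` and `top_n` helpers and the toy postings below are assumptions for this sketch, not code from the sample, which may use a different text vectorization.

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, divided by the product of vector norms.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_n(chosen: str, postings: list[str], n: int = 3) -> list[str]:
    # Vectorize each posting as word counts and rank by similarity to the chosen text.
    vecs = {p: Counter(p.lower().split()) for p in postings}
    target = Counter(chosen.lower().split())
    ranked = sorted(postings, key=lambda p: cosine(vecs[p], target), reverse=True)
    return ranked[:n]

jobs = [
    "senior python developer machine learning",
    "java backend engineer",
    "python data scientist machine learning",
]
print(top_n("python machine learning engineer", jobs, n=2))
# → ['senior python developer machine learning', 'python data scientist machine learning']
```

The same ranking idea extends to TF-IDF or embedding vectors; only the vectorization changes, the cosine ranking stays the same.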

## Prerequisites

| Optimized for | Description |
| :--- | :--- |
| OS | Linux, Ubuntu* 20.04 |
| Hardware | GPU |
| Software | Intel® Extension for TensorFlow* |

> **Note**: AI and Analytics samples are validated on AI Tools Offline Installer. For the full list of validated platforms refer to [Platform Validation](https://github.com/oneapi-src/oneAPI-samples/tree/master?tab=readme-ov-file#platform-validation).

## Key Implementation Details

This sample creates a deep neural network for fake job posting detection using the Intel® Extension for TensorFlow* LSTM layer on GPU. It also uses `itex.experimental_ops_override()` to automatically replace some TensorFlow operators with custom operators from Intel® Extension for TensorFlow*.
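The override and the LSTM classifier described above fit together roughly as follows. This is a hedged sketch, not the sample's actual model: the layer sizes (vocabulary 10000, embedding width 64) are illustrative assumptions, and the `try`/`except` guard is added here so the sketch also runs on machines without Intel® Extension for TensorFlow* installed.

```python
import tensorflow as tf

try:
    # Swap supported TensorFlow operators for ITEX custom operators.
    import intel_extension_for_tensorflow as itex
    itex.experimental_ops_override()
except ImportError:
    pass  # stock TensorFlow still runs the same model, just without ITEX kernels

# Binary classifier over tokenized job postings: fake vs. genuine.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),  # token ids -> vectors
    tf.keras.layers.LSTM(64),                                   # sequence encoder
    tf.keras.layers.Dense(1, activation="sigmoid"),             # fraud probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

With the override in place, the model code stays plain Keras; ITEX substitutes its optimized implementations where available.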

The sample tutorial contains one Jupyter Notebook and one Python script. You can use either.
| 36 | + |
| 37 | +## Environment Setup |
| 38 | +You will need to download and install the following toolkits, tools, and components to use the sample. |
| 39 | +<!-- Use numbered steps instead of subheadings --> |
| 40 | + |
**1. Get AI Tools**

Required AI Tools: Intel® Extension for TensorFlow* (GPU)

If you have not already done so, select and install these tools via the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html). AI and Analytics samples are validated on the AI Tools Offline Installer, so it is recommended to select the Offline Installer option in the AI Tools Selector.

>**Note**: If the Docker option is chosen in the AI Tools Selector, refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker container and samples.

**2. (Offline Installer) Activate the AI Tools bundle base environment**

If the default path was used during the installation of AI Tools:
```
source $HOME/intel/oneapi/intelpython/bin/activate
```
If a non-default path was used:
```
source <custom_path>/bin/activate
```

**3. (Offline Installer) Activate the relevant Conda environment**

```
conda activate tensorflow-gpu
```

**4. Clone the GitHub repository**

```
git clone https://github.com/oneapi-src/oneAPI-samples.git
cd oneAPI-samples/AI-and-Analytics/End-to-end-Workloads/JobRecommendationSystem
```

**5. Install dependencies**

>**Note**: Before running the following commands, make sure your Conda/Python environment with AI Tools installed is activated.

```
pip install -r requirements.txt
pip install notebook
```
For Jupyter Notebook, refer to [Installing Jupyter](https://jupyter.org/install) for detailed installation instructions.

## Run the Sample

>**Note**: Before running the sample, make sure [Environment Setup](#environment-setup) is completed.

Go to the section that corresponds to the installation method chosen in the [AI Tools Selector](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-tools-selector.html) to see the relevant instructions:
* [AI Tools Offline Installer (Validated)](#ai-tools-offline-installer-validated)
* [Conda/PIP](#condapip)
* [Docker](#docker)

### AI Tools Offline Installer (Validated)

**1. Register the Conda kernel as a Jupyter Notebook kernel**

If the default path was used during the installation of AI Tools:
```
$HOME/intel/oneapi/intelpython/envs/tensorflow-gpu/bin/python -m ipykernel install --user --name=tensorflow-gpu
```
If a non-default path was used:
```
<custom_path>/bin/python -m ipykernel install --user --name=tensorflow-gpu
```
**2. Launch Jupyter Notebook**

```
jupyter notebook --ip=0.0.0.0
```
**3. Follow the instructions to open the URL with the token in your browser**

**4. Select the Notebook**

```
JobRecommendationSystem.ipynb
```
**5. Change the kernel to `tensorflow-gpu`**

**6. Run every cell in the Notebook in sequence**

### Conda/PIP

> **Note**: Before running the instructions below, make sure your Conda/Python environment with AI Tools installed is activated.

**1. Register the Conda/Python kernel as a Jupyter Notebook kernel**

For Conda:
```
<CONDA_PATH_TO_ENV>/bin/python -m ipykernel install --user --name=tensorflow-gpu
```
To find `<CONDA_PATH_TO_ENV>`, run `conda env list` and locate your Conda environment path.

For PIP:
```
python -m ipykernel install --user --name=tensorflow-gpu
```
**2. Launch Jupyter Notebook**

```
jupyter notebook --ip=0.0.0.0
```
**3. Follow the instructions to open the URL with the token in your browser**

**4. Select the Notebook**

```
JobRecommendationSystem.ipynb
```
**5. Change the kernel to `<your-env-name>`**

**6. Run every cell in the Notebook in sequence**

### Docker

AI Tools Docker images already have Get Started samples pre-installed. Refer to [Working with Preset Containers](https://github.com/intel/ai-containers/tree/main/preset) to learn how to run the Docker container and samples.

## Example Output

If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. Additionally, the sample shows multiple diagrams explaining the dataset, the training progress for fraudulent job posting detection, and the top job recommendations.

## Related Samples

* [Intel Extension For TensorFlow Getting Started Sample](https://github.com/oneapi-src/oneAPI-samples/blob/development/AI-and-Analytics/Getting-Started-Samples/Intel_Extension_For_TensorFlow_GettingStarted/README.md)
* [Leveraging Intel Extension for TensorFlow with LSTM for Text Generation Sample](https://github.com/oneapi-src/oneAPI-samples/blob/master/AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_TextGeneration_with_LSTM/README.md)

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
for details.

Third party program Licenses can be found here:
[third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)

\*Other names and brands may be claimed as the property of others. [Trademarks](https://www.intel.com/content/www/us/en/legal/trademarks.html)