
Web scraping & automated data update #417

Open
laurenp-2 wants to merge 15 commits into main from web_scraping

laurenp-2 (Contributor) commented Mar 23, 2026

Summary

This pull request is the first step toward automated data updates and web scraping. It introduces backend scripts that export the current property information in the database to a CSV file and then update the database from that CSV once it has been edited with new information. It also introduces a script that scrapes the PJApts website.
This PR also builds on changes introduced in the Admin Data Editor PR.
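
To illustrate the export/update round trip described above, here is a minimal sketch. It is not the code in this PR: the `Apartment` shape, the column list, and the helper names (`toCsv`, `parseCsv`, `applyCsvUpdates`) are assumptions made for illustration, and an in-memory `Map` stands in for the database.

```typescript
// Hypothetical record shape; the real schema is not shown in this PR.
interface Apartment {
  id: string;
  name: string;
  address: string;
  numBeds: number;
}

// Assumed column order for the CSV file.
const COLUMNS = ["id", "name", "address", "numBeds"] as const;

// Quote a field when it contains a comma or a double quote.
// This sketch assumes fields never contain newlines.
function escapeField(value: string): string {
  return /[",]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Export: one header line, then one line per record.
function toCsv(rows: Apartment[]): string {
  const header = COLUMNS.join(",");
  const lines = rows.map((r) =>
    COLUMNS.map((c) => escapeField(String(r[c]))).join(",")
  );
  return [header, ...lines].join("\n");
}

// Split one CSV line into fields, honoring double-quoted fields.
function splitLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Parse the CSV back into records, skipping the header line.
function parseCsv(csv: string): Apartment[] {
  const [, ...lines] = csv.split("\n").filter((l) => l.length > 0);
  return lines.map((line) => {
    const [id, name, address, numBeds] = splitLine(line);
    return { id, name, address, numBeds: Number(numBeds) };
  });
}

// Update: upsert every CSV row into the store and count rows that changed.
function applyCsvUpdates(db: Map<string, Apartment>, csv: string): number {
  let changed = 0;
  for (const row of parseCsv(csv)) {
    const existing = db.get(row.id);
    if (!existing || COLUMNS.some((c) => existing[c] !== row[c])) {
      db.set(row.id, row);
      changed++;
    }
  }
  return changed;
}
```

Re-importing an unedited export should report zero changed rows, which makes a convenient sanity check before the CSV is edited by hand or by a scraper.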

  • implemented export_apartments and update_apartments_from_csv
  • implemented scrapePJApts and runScrapers
  • connected the export/update CSV pipeline, web scraper, and admin data editor
  • added an API endpoint to trigger the scraper
  • added diff logic (compares scraped results against existing information pulled from the database)
  • Admin UI: button to trigger scraping, display of new and changed properties
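
The diff step in the list above could look roughly like the following sketch, which splits scraped results into new and changed properties. The `Property` shape, the choice of a normalized address as the matching key, and the name `diffProperties` are all assumptions for illustration; the PR's actual implementation may match and compare records differently.

```typescript
// Hypothetical shape of a property record; field names are assumptions.
interface Property {
  address: string;
  name: string;
  price: number;
}

interface ChangedProperty {
  before: Property;
  after: Property;
  fields: (keyof Property)[]; // which fields differ
}

interface DiffResult {
  added: Property[];
  changed: ChangedProperty[];
}

// Assumed matching key: the address, lowercased with whitespace collapsed.
const normalizeKey = (address: string): string =>
  address.trim().toLowerCase().replace(/\s+/g, " ");

function diffProperties(existing: Property[], scraped: Property[]): DiffResult {
  // Index the existing records by their normalized key.
  const byKey = new Map<string, Property>();
  for (const p of existing) {
    byKey.set(normalizeKey(p.address), p);
  }

  const result: DiffResult = { added: [], changed: [] };
  for (const after of scraped) {
    const before = byKey.get(normalizeKey(after.address));
    if (!before) {
      // No matching record in the database: treat as a new property.
      result.added.push(after);
      continue;
    }
    // Compare the non-key fields and record which ones differ.
    const fields = (["name", "price"] as const).filter(
      (f) => before[f] !== after[f]
    );
    if (fields.length > 0) {
      result.changed.push({ before, after, fields });
    }
  }
  return result;
}
```

Returning the list of differing fields alongside the before and after records gives an admin UI enough information to highlight exactly what changed before anything is written to the database.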

Test Plan

Admin data editor for scraped properties
[Screenshot 2026-04-24 at 4 00 14 PM]

Unit tests are in scripts.test.ts.

Notes

CLAassistant commented Mar 23, 2026

CLA assistant check
All committers have signed the CLA.

dti-github-bot (Member) commented Mar 23, 2026

[diff-counting] Significant lines: 2168. This diff might be too big! Developer leads are invited to review the code.

laurenp-2 changed the title from "WIP: Web scraping" to "WIP: Web scraping & automated data update" on Mar 23, 2026
@NigelTatem

Nice work! The scraper, folder system, and admin tooling all look solid and well thought out. The main thing I'd flag is that app.ts nearly doubles in size with this PR, so it'd be worth breaking the new routes into separate files to keep things manageable as the codebase grows.

- Updated backend and frontend dependencies to ensure compatibility and security.
- Implemented admin property management features, allowing admins to view, edit, and delete property listings directly from the admin dashboard.
- Refactored code for better maintainability and readability, including improved error handling and code organization.
- Updated ESLint configurations to enforce consistent coding styles across the project.
laurenp-2 changed the title from "WIP: Web scraping & automated data update" to "Web scraping & automated data update" on Apr 24, 2026

4 participants