Skip to content

Latest commit

 

History

History
117 lines (92 loc) · 7.82 KB

File metadata and controls

117 lines (92 loc) · 7.82 KB

Project Goal: Create bookmarks-fixer.py

Please write a single, monolithic Python script named bookmarks-fixer.py. This script will consolidate all bookmarks from Firefox, Edge, and Vivaldi, find their visit counts from the local history databases, and then send the complete, merged data to the Grok-4 API via the Poe.com API for cleaning and categorization.

Environment & Setup

  1. Virtual Environment: All Python commands must be run inside the virtual environment. Always activate it first with .venv\Scripts\Activate.ps1.
  2. Required Libraries: The script will need sqlite3, json, os, pathlib, sys, and requests (or httpx for Poe.com API calls). Ensure you include imports for all of these.

Core Script Logic (Step-by-Step)

Step 1: Locate Browser Profile Paths

  • The script must find the profile paths for Firefox, Edge, and Vivaldi.
  • Use os.getenv('APPDATA') and os.getenv('LOCALAPPDATA').
  • Firefox: %APPDATA%\Mozilla\Firefox\Profiles\ (Find the .default-release folder)
  • Edge: %LOCALAPPDATA%\Microsoft\Edge\User Data\Default\
  • Vivaldi: %LOCALAPPDATA%\Vivaldi\User Data\Default\

Step 2: Handle Database Lock Error

  • The history databases (places.sqlite, History) will be locked if the browsers are open.
  • Implement a try...except block around the database connection.
  • If a sqlite3.OperationalError (e.g., "database is locked") is caught, print a clear, user-friendly error message like "ERROR: Please close Firefox, Edge, and Vivaldi before running this script." and then sys.exit().

Step 3: Extract Firefox Bookmarks & Visits (The Easy Way)

  • Connect to %APPDATA%\...<profile>\places.sqlite.
  • Execute the following SQL query to get bookmarks and visit counts in one go:
    SELECT p.url, p.title, p.visit_count
    FROM moz_bookmarks AS b
    JOIN moz_places AS p ON b.fk = p.id
    WHERE b.type = 1; -- type 1 = bookmark
  • Store these (url, title, visit_count) tuples in a master list.

Step 4: Extract Edge Bookmarks & Visits (The Chromium Way)

  1. Read Bookmarks: Open and parse the Bookmarks JSON file from the Edge profile path. Recursively iterate through the 'roots' (e.g., 'bookmark_bar', 'other') to find all bookmark entries (where 'type': 'url'). Store their url and name.
  2. Get Visit Counts: Connect to the History SQLite database in the Edge profile path.
  3. Loop & Query: For each bookmark url you found, execute SELECT visit_count, title FROM urls WHERE url = ? on the History database.
  4. Store: Store the url, the name (from the JSON), and the visit_count (from the History DB). If fetchone() returns None, use a visit_count of 0. Add this data to your master list.

Step 5: Extract Vivaldi Bookmarks & Visits

  • Repeat the exact same "Chromium Way" process from Step 4, but use the Bookmarks and History files from the Vivaldi profile path. Add the results to the master list.

Step 6: Consolidate and De-duplicate Data

  • You now have a master list of bookmark objects, e.g., [{"url": ..., "name": ..., "visit_count": ...}, ...].
  • De-duplicate this list. Use the url as the unique key. If duplicate URLs exist, keep the one with the highest visit_count.
  • The final product of this step is a clean list of unique bookmark objects. This list is the "product" that will be inserted into the Grok-4 prompt.

Step 7: Check for Poe.com API Implementation

  • Use the tools/query_function_index_semantic.py script to query data/llm_function_index.json.
  • The query should check for the existence of a function to "call the Poe.com API" or "submit to Poe.com".
  • If found, use that function's implementation.
  • If not found, implement a simple function using requests to send a POST request to the Poe.com API endpoint. This function will need to accept an API key (from an env var or constant) and the prompt payload.

Step 8: Call the Grok-4 API

  1. Convert Data: Convert your final, de-duplicated bookmark list (from Step 6) into a compact JSON string.
  2. Formulate Prompt: Create the final prompt to send to the API. This prompt will be a combination of the System Prompt, User Prompt (below), and the JSON data string you just created.
  3. Send to API: Use the Poe.com API function (from Step 7) to send the complete prompt to Grok-4.
  4. Print Output: Print the JSON response from Grok-4 to the console.

Grok-4 API Prompt (To be included in the script)

When calling the Poe.com API, your script must use the following text as the prompt. The script will be responsible for replacing [YOUR_SCRIPT_WILL_INSERT_THE_CONSOLIDATED_BOOKMARKS_JSON_STRING_HERE] with the actual JSON data string it generated in Step 6.

System Prompt: You are an expert data organizer and librarian, specializing in cleaning, categorizing, and structuring large lists of web bookmarks. Your goal is to take a raw, potentially messy list and return a perfectly clean, categorized JSON structure.

User Prompt:

You will be provided with a JSON array of my consolidated bookmarks. My script has already processed all local browser files (Firefox, Edge, Vivaldi) to find the visit_count for each bookmark.

Each object in the array you receive will have a url, name, and visit_count. The visit_count may be 0 or null if the information was unavailable or the item was never visited.

Here is an example of the input format you will receive: [{\"url\": \"https://stackoverflow.com/questions/1234\", \"name\": \"python - how to... - Stack Overflow\", \"visit_count\": 88}, {\"url\": \"https://github.com\", \"name\": \"GitHub\", \"visit_count\": 210}, {\"url\": \"https://www.some-random-blog.com/article/10-best-things\", \"name\": \"10-best-things\", \"visit_count\": 2}, ...]

Your task is to perform the following three actions:

  1. Identify Top 15: First, create a category named Top 15 Most Visited. You must populate this category with the 15 bookmarks that have the highest visit_count. These 15 bookmarks must not be included in any other category.
  2. Categorize the Rest: For all remaining bookmarks, group them into logical categories based on their URL and name (e.g., Programming, News & Articles, Tools & Utilities, Shopping, Social Media, Reference, Personal, Miscellaneous).
  3. Clean Names: For every bookmark, create a newName. This newName should be a clean, concise, and human-readable version of the original name. If the original name is messy (like just a URL or a generic title like "Home"), use the URL to infer a proper, descriptive name. For example, python - how to... - Stack Overflow could be cleaned to Python Question on Stack Overflow. A clean name like GitHub can remain GitHub.

Output Format: You must return only a single JSON object. The keys of this object must be the category names (including Top 15 Most Visited). The value for each key must be an array of bookmark objects. Each bookmark object in your response must include the url, the newName you generated, and the original name for reference.

This is the required JSON output format:

{
  "Top 15 Most Visited": [
    {
      "url": "[https://github.com](https://github.com)",
      "newName": "GitHub",
      "originalName": "GitHub"
    },
    ... (14 more bookmarks)
  ],
  "Programming": [
    {
      "url": "[https://stackoverflow.com/questions/1234](https://stackoverflow.com/questions/1234)",
      "newName": "Python Question on Stack Overflow",
      "originalName": "python - how to... - Stack Overflow"
    },
    ...
  ],
  "News & Articles": [
    {
      "url": "[https://www.some-random-blog.com/article/10-best-things](https://www.some-random-blog.com/article/10-best-things)",
      "newName": "10 Best Things (some-random-blog.com)",
      "originalName": "10-best-things"
    },
    ...
  ],
  ... (other categories)
}