Web LLM is a browser-based tool that lets users interact with and extract information from web pages directly in the browser. By combining natural language processing (NLP) with client-side web scraping, Web LLM can work with live web content, answer questions, and produce summaries based on the text of a given webpage.
This repository contains the core components, architecture, and security/privacy features for setting up Web LLM.
- Real-time Web Interaction: Enables querying of live web content for up-to-date information.
- Contextual Responses: Generates coherent, context-sensitive answers based on the webpage's content.
- Multi-Language Support: Can interact with web pages in various languages.
- Customizable Chatbot: Integrates into existing web applications and websites.
- User Input Interface: A simple text input box that allows users to ask questions or request information from a given webpage.
- Display Area: Shows the chatbot's responses and updates dynamically as the user interacts with the system.
- Local Processing: The language model (LLM) is downloaded and executed directly in the user's browser, eliminating the need for server-side processing. This ensures a fast and seamless experience, as all operations are local.
- Web Scraper: Content extraction happens on the client side. The browser uses JavaScript-based scraping (e.g., DOM traversal and text extraction) to collect relevant text from the web page, which is then processed by the LLM.
- Contextual Understanding: The LLM processes the scraped content and generates responses based on user queries, utilizing local resources for execution.
1. The user sends a query through the frontend interface.
2. The browser scrapes the web page content locally using JavaScript.
3. The LLM processes the scraped content and generates an appropriate response based on the user's query.
4. The response is displayed to the user in the frontend interface.
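To make this workflow concrete, here is a minimal sketch in plain JavaScript. The element IDs (`#user-input`, `#ask-button`, `#response-area`) and the `runLocalModel` stub are illustrative assumptions, not identifiers from this repository; the actual in-browser inference call depends on the model runtime bundled with the project.

```js
// Placeholder for the in-browser inference call; the real call depends on the
// bundled model runtime. This stub only echoes part of the prompt.
async function runLocalModel(prompt) {
  return `[local model response for: ${prompt.slice(0, 60)}...]`;
}

// 1. Collect visible text from the current page (client-side scraping).
function scrapePageText() {
  const blocks = document.querySelectorAll("p, h1, h2, h3, li");
  return Array.from(blocks)
    .map((el) => el.innerText.trim())
    .filter((text) => text.length > 0)
    .join("\n");
}

// 2. Combine the scraped content with the user's query and run the local model.
async function answerQuery(userQuery) {
  const pageText = scrapePageText();
  const prompt = `Context:\n${pageText}\n\nQuestion: ${userQuery}\nAnswer:`;
  return runLocalModel(prompt);
}

// 3. Wire the input box and the display area together.
document.querySelector("#ask-button").addEventListener("click", async () => {
  const query = document.querySelector("#user-input").value;
  document.querySelector("#response-area").textContent = await answerQuery(query);
});
```

In practice, long pages would need to be chunked or truncated to fit the model's context window before being included in the prompt.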
Running Large Language Models (LLMs) locally on your own hardware (e.g., a capable workstation or server) can consume less energy than cloud-based alternatives, because all computation happens on the client side rather than on remote cloud servers. The key points below explain how this can save energy:
- Cloud-Based LLMs: Running an LLM in the cloud requires frequent data transfer between the client device (e.g., your computer or phone) and the remote servers. These data transfers consume energy at both ends (client-side and server-side).
- Local LLMs: All computations are performed on your local machine, which eliminates the need for continuous data transfer to and from the cloud. This reduction in data movement reduces overall energy usage associated with networking.
- Cloud Servers: In a cloud-based setup, the remote servers are often shared by multiple users, meaning their resources (CPU, GPU, etc.) may not always be optimized for your specific task, leading to inefficient resource usage and increased energy consumption.
- Local Hardware: When running LLMs locally, you have control over the hardware you use. You can tailor the computation to match the capabilities of your own device. For example, running the model on a GPU designed for AI tasks can be more energy-efficient than using a general-purpose CPU in a cloud server.
- Cloud-Based Services: Cloud service providers keep servers running 24/7, even when they're not actively processing requests. This means the servers are still consuming power while idle, contributing to unnecessary energy use.
- Local LLMs: When you run models locally, your system only consumes energy when the model is actively being used. Once you're done, the system can be powered down or put into a low-power state, saving energy.
- Cloud Services: To ensure availability and scalability, cloud providers often over-provision their infrastructure, meaning they run additional servers or maintain high capacity, leading to wasted energy even when demand is low.
- Local Deployment: When running locally, you're only using the resources that are available on your machine. There’s no over-provisioning, and you only pay for the energy consumed by your own equipment.
- Cloud Providers' Carbon Footprint: Many cloud providers rely on large data centers to host their services. These data centers consume massive amounts of energy, often relying on non-renewable sources.
- Local Machines and Renewable Energy: By running LLMs on your own hardware, you can take advantage of any energy-efficient or renewable sources you may have, such as solar-powered setups, and reduce your reliance on large data centers that are less transparent about their energy sources.
Running LLMs locally on your own device provides several energy-saving benefits, including:
- Reduction in data transmission energy costs
- More efficient hardware utilization tailored to your specific needs
- Elimination of idle server energy consumption
- No over-provisioning of resources
- Control over the energy sources used
These factors combined result in a significant reduction in overall energy consumption compared to cloud-based LLM deployment, making local processing a more sustainable and energy-efficient option.
- Content Security:
  - The web scraper extracts only relevant text content from the page, preventing the injection of malicious scripts or harmful code (see the sketch after this list).
  - Content scraping is done entirely within the browser environment, so no external server is involved in processing user input or page data.
- No Data Retention:
  - Because everything happens locally in the browser, no data is sent to external servers. User queries, webpage content, and LLM responses are processed entirely within the user's environment.
  - There is no persistent storage of user data; everything is discarded once the session ends.
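As a rough illustration of this text-only extraction, the sketch below walks the DOM with a `TreeWalker` and keeps only text nodes, skipping `<script>`, `<style>`, and `<noscript>` elements so no executable code or raw markup reaches the model. The function name `extractSafeText` is hypothetical and not taken from the repository.

```js
// Collect visible text from the page while rejecting script/style content.
function extractSafeText(root = document.body) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, {
    acceptNode(node) {
      const parent = node.parentElement;
      if (!parent) return NodeFilter.FILTER_REJECT;
      const tag = parent.tagName;
      if (tag === "SCRIPT" || tag === "STYLE" || tag === "NOSCRIPT") {
        return NodeFilter.FILTER_REJECT; // never pass executable/style content
      }
      return node.textContent.trim().length > 0
        ? NodeFilter.FILTER_ACCEPT
        : NodeFilter.FILTER_REJECT; // skip whitespace-only nodes
    },
  });

  const chunks = [];
  let current;
  while ((current = walker.nextNode())) {
    chunks.push(current.textContent.trim());
  }
  return chunks.join("\n");
}
```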
- Web Browser (e.g., Chrome, Firefox, Safari, or Edge)
- Node.js (v16 or higher) for building and running local web components
- npm or yarn (for frontend dependencies)