Telegram Channel Scraper

A modular TypeScript application for scraping closed Telegram channels, such as "arendabatumi", using GramJS.

Project Structure

telegram-scraper/
├── src/
│   ├── client/             # Telegram client setup
│   ├── config/             # Environment configuration
│   ├── models/             # Type definitions
│   ├── scrapers/           # Channel-specific scraping functionality
│   ├── utils/              # Utility functions
│   └── index.ts            # Main entry point
├── .env                    # Environment variables (not tracked in git)
├── sample.env              # Sample environment variables
└── README.md               # Project documentation

Modules

The application is divided into several modules:

Client Module (src/client/telegram.ts): Handles Telegram authentication and client creation
Config Module (src/config/env.ts): Manages environment variables and configuration validation
Models Module (src/models/message.ts): Type definitions for message data
Scraper Module (src/scrapers/channelScraper.ts): Core functionality for scraping channel messages
Utilities:
- Logger (src/utils/logger.ts): Handles logging
- FileStorage (src/utils/fileStorage.ts): Manages saving and loading messages to/from files
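
To make the role of the Models Module more concrete, here is a minimal sketch of what a message type and a conversion helper could look like. The names `ScrapedMessage`, `RawMessage`, and `toScrapedMessage` are hypothetical and are not taken from the repository. The sketch assumes that a raw Telegram message exposes a numeric `id`, a `date` in Unix seconds, and an optional `message` text field.

```typescript
// Hypothetical stored shape; the real definitions live in src/models/message.ts
// and may differ.
interface ScrapedMessage {
  id: number;
  channel: string;
  date: string; // ISO 8601 timestamp
  text: string;
}

// Minimal subset of the raw message fields this sketch relies on (assumption).
interface RawMessage {
  id: number;
  date: number; // Unix time in seconds
  message?: string;
}

// Converts a raw message into the plain object that gets written to JSON.
function toScrapedMessage(raw: RawMessage, channel: string): ScrapedMessage {
  return {
    id: raw.id,
    channel,
    date: new Date(raw.date * 1000).toISOString(),
    text: raw.message ?? "",
  };
}
```

Keeping the stored shape separate from the raw API object means the JSON files stay stable even if the scraper later reads additional fields.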

Setup

Clone this repository
Install dependencies:
```
npm install
```
Copy sample.env to .env:
```
cp sample.env .env
```
Modify the .env file with your specific configuration if needed

Usage

Run the scraper in development mode:

npm run dev

Build and run in production:

npm run build
npm start

Environment Configuration

You MUST create a new Telegram app at https://my.telegram.org/apps to obtain API credentials. You MUST also be a member of the channel you are trying to scrape.

API_ID: Your Telegram API ID
API_HASH: Your Telegram API hash
SESSION_NAME: Name for the session file
TARGET_CHANNEL: Channel username to scrape (without @)
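
The variables above could be validated at startup along these lines. This is only a sketch of the kind of check the Config Module might perform: `ScraperConfig` and `loadConfig` are hypothetical names, and treating all four variables as required is an assumption.

```typescript
// Hypothetical shape of the validated configuration; field names mirror the
// variables listed above, not necessarily the project's own types.
interface ScraperConfig {
  apiId: number;
  apiHash: string;
  sessionName: string;
  targetChannel: string;
}

// Reads the four variables from an env-like object and fails fast when one
// is missing or malformed.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const required = ["API_ID", "API_HASH", "SESSION_NAME", "TARGET_CHANNEL"];
  const missing = required.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const apiId = Number(env.API_ID);
  if (!Number.isInteger(apiId) || apiId <= 0) {
    throw new Error("API_ID must be a positive integer");
  }

  return {
    apiId,
    apiHash: env.API_HASH!.trim(),
    sessionName: env.SESSION_NAME!.trim(),
    // The channel is expected without "@"; strip it if it was included.
    targetChannel: env.TARGET_CHANNEL!.trim().replace(/^@/, ""),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.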

Data Storage

Scraped messages are saved as JSON files in the data/ directory:

Initial messages: [channel]_initial_[timestamp].json
New messages: [channel]_live_[timestamp].json
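
The naming scheme above can be sketched as follows. `buildFileName` and `saveMessages` are hypothetical helpers, not the project's FileStorage API, and the exact timestamp format (an ISO 8601 string with `:` and `.` replaced by `-`) is an assumption, since the README does not specify it.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

type MessageKind = "initial" | "live";

// Builds a name following the [channel]_[kind]_[timestamp].json pattern.
// The timestamp format is an assumption made for this sketch.
function buildFileName(channel: string, kind: MessageKind, now: Date = new Date()): string {
  const timestamp = now.toISOString().replace(/[:.]/g, "-");
  return `${channel}_${kind}_${timestamp}.json`;
}

// Writes the messages as pretty-printed JSON and returns the file path.
function saveMessages(
  dataDir: string,
  channel: string,
  kind: MessageKind,
  messages: unknown[],
  now: Date = new Date(),
): string {
  fs.mkdirSync(dataDir, { recursive: true }); // create data/ on first use
  const filePath = path.join(dataDir, buildFileName(channel, kind, now));
  fs.writeFileSync(filePath, JSON.stringify(messages, null, 2), "utf8");
  return filePath;
}
```

Replacing `:` and `.` in the timestamp keeps the file names valid on Windows as well as on Unix-like systems.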

Authentication

On first run, you will be prompted to authenticate with your Telegram account. The session will be saved for future use.

Session Persistence

The application now supports persistent sessions. When you first run the application and authenticate, your session will be saved to a file named telegram-scraper.session in the project root directory. On subsequent runs, the application will automatically load this session file, allowing you to skip the authentication process.

How it works:

On first run, you'll need to provide your phone number and verification code
The session data is saved to a file after successful authentication
Future runs will automatically use the saved session

If you need to authenticate with a different account, simply delete the .session file.
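
The load-and-save cycle described above can be sketched with plain file operations. `loadSession` and `saveSession` are hypothetical helpers, and storing the session as a single text string is an assumption; the project may persist it differently. With GramJS, the loaded string would typically be passed to `new StringSession(...)` from `telegram/sessions`, and the string to store would come from the client's session after a successful login.

```typescript
import * as fs from "node:fs";

// Returns the saved session string, or "" when no session file exists yet.
// An empty string means a fresh login is needed.
function loadSession(sessionFile: string): string {
  return fs.existsSync(sessionFile)
    ? fs.readFileSync(sessionFile, "utf8").trim()
    : "";
}

// Persists the session string so later runs can skip authentication.
// The restrictive mode is a precaution: the session grants account access.
function saveSession(sessionFile: string, session: string): void {
  fs.writeFileSync(sessionFile, session, { encoding: "utf8", mode: 0o600 });
}
```

Because the session file grants access to the account, it should stay out of version control in the same way as .env.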

Configuration

Create a .env file in the project root with the following variables:

API_ID=your_api_id
API_HASH=your_api_hash
SESSION_NAME=telegram-scraper
TARGET_CHANNEL=channel_username

Running the Application

Start the application with:

npm start

Or in development mode:

npm run dev

On first run, you will be prompted to authenticate with your Telegram account. The session will be saved for future use.

morovinger/scrap-tg-api
