fix: adjustment in audio transcription with official api #1556
Merged
Reviewer's Guide
Refactored media download and audio message handling in WhatsApp business service to use the official API, integrated OpenAI speech-to-text for audio, improved error logging, and ensured correct audio file extensions.
Sequence Diagram for Modified Media Download from Meta API
sequenceDiagram
participant BSS as BusinessStartupService
participant AX as axios
participant META as Meta API Server
BSS->>AX: GET /<version>/<id> (to get media URL, Headers: Content-Type, Auth)
AX->>META: HTTP GET Request
META-->>AX: Response { data: { url: mediaFileUrl } }
AX-->>BSS: Return { data: { url: mediaFileUrl } }
BSS->>AX: GET mediaFileUrl (to download file, Headers: Auth only, responseType: arraybuffer)
AX->>META: HTTP GET Request
META-->>AX: Response (media file as arraybuffer)
AX-->>BSS: Return media file data
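The two-step download in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `downloadMedia`, the `HttpGet` type, the `baseUrl` parameter, the `application/json` content type, and the `Bearer` authorization scheme are all assumptions made for the example. The HTTP client is injected so the flow can be exercised without a network; `axios.get` could be adapted to this shape. The sketch returns a `Uint8Array`, whereas the class diagram in this guide lists `Promise<Buffer>`.

```typescript
// Hypothetical minimal shape of the HTTP client this sketch needs.
type HttpConfig = { headers: Record<string, string>; responseType?: 'arraybuffer' };
type HttpGet = (url: string, config: HttpConfig) => Promise<{ data: unknown }>;

// Hypothetical helper: resolves a media ID to a URL, then downloads the bytes.
async function downloadMedia(
  get: HttpGet,
  baseUrl: string,
  version: string,
  mediaId: string,
  token: string,
): Promise<Uint8Array> {
  // Assumption: the token is sent as a Bearer credential.
  const auth = { Authorization: `Bearer ${token}` };

  // Step 1: look up the media URL (Content-Type and Auth headers).
  const lookup = await get(`${baseUrl}/${version}/${mediaId}`, {
    headers: { 'Content-Type': 'application/json', ...auth },
  });
  const mediaFileUrl = (lookup.data as { url?: unknown } | null)?.url;
  if (typeof mediaFileUrl !== 'string') {
    throw new Error('Media lookup response did not contain a url');
  }

  // Step 2: download the file itself (Auth header only, binary response).
  const file = await get(mediaFileUrl, { headers: auth, responseType: 'arraybuffer' });
  return new Uint8Array(file.data as ArrayBuffer);
}
```

The point of the second request carrying only the authorization header is that it is a plain binary download, so no request content type applies.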
Sequence Diagram for Audio Processing with Speech-to-Text (S3 Enabled)
sequenceDiagram
actor User
participant WP as WhatsAppPlatform
participant BSS as BusinessStartupService
participant META as Meta API Server
participant S3 as MinioService/S3
participant DB as PrismaRepository
participant OAI as OpenAIService
participant WS as WebhookSystem
User->>WP: Sends audio message
WP->>BSS: Receives audio message (received)
BSS->>BSS: Calls messageAudioJson(received)
BSS->>META: Download audio file
META-->>BSS: Audio file data
BSS->>S3: Upload audio file (buffer, fileName with correct extension)
S3-->>BSS: mediaUrl
BSS->>DB: findFirst OpenAI settings (instanceId)
DB-->>BSS: openAiDefaultSettings
alt OpenAI STT Enabled and Configured
BSS->>OAI: speechToText(creds, { message: { mediaUrl, ... } })
OAI-->>BSS: transcribedText
BSS->>BSS: Update messageRaw with speechToText
end
BSS->>WS: sendDataWebhook(MESSAGES_UPSERT, messageRaw)
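As a rough sketch of the S3-enabled flow above, the orchestration might look like the following. Every name here (`AudioDeps`, `OpenAiSettings`, `handleAudioWithS3`, and the `speechToText` / `credsId` settings fields) is a hypothetical stand-in for a participant in the diagram, not the PR's actual API. Swallowing a transcription failure so the webhook still fires is also an assumption, based only on the guide's mention of improved error logging.

```typescript
// Hypothetical settings shape; the real field names are not shown in this guide.
interface OpenAiSettings { speechToText: boolean; credsId: string | null }
interface MessageRaw { message: { mediaUrl: string; speechToText?: string } }

// Hypothetical dependencies, one per participant in the diagram.
interface AudioDeps {
  downloadAudio: () => Promise<Uint8Array>;                              // BSS -> META
  upload: (file: Uint8Array, fileName: string) => Promise<string>;       // BSS -> S3, resolves to mediaUrl
  findSettings: (instanceId: string) => Promise<OpenAiSettings | null>;  // BSS -> DB
  speechToText: (credsId: string, mediaUrl: string) => Promise<string>;  // BSS -> OAI
  sendWebhook: (event: string, payload: MessageRaw) => Promise<void>;    // BSS -> WS
  logError: (message: string, error: unknown) => void;
}

async function handleAudioWithS3(
  deps: AudioDeps,
  instanceId: string,
  fileName: string,
): Promise<MessageRaw> {
  const file = await deps.downloadAudio();
  const mediaUrl = await deps.upload(file, fileName);
  const messageRaw: MessageRaw = { message: { mediaUrl } };

  // The "alt" branch: transcribe only when STT is enabled and credentials exist.
  const settings = await deps.findSettings(instanceId);
  if (settings?.speechToText && settings.credsId) {
    try {
      messageRaw.message.speechToText = await deps.speechToText(settings.credsId, mediaUrl);
    } catch (error) {
      // Assumption: a failed transcription is logged and does not block the webhook.
      deps.logError('speechToText failed', error);
    }
  }

  // The webhook is sent whether or not a transcription was produced.
  await deps.sendWebhook('MESSAGES_UPSERT', messageRaw);
  return messageRaw;
}
```

Injecting the collaborators keeps the branching logic testable in isolation; in the actual service these would presumably be methods on the class or its injected services.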
Sequence Diagram for Audio Processing with Speech-to-Text (S3 Disabled)
sequenceDiagram
actor User
participant WP as WhatsAppPlatform
participant BSS as BusinessStartupService
participant DB as PrismaRepository
participant OAI as OpenAIService
participant WS as WebhookSystem
User->>WP: Sends audio message
WP->>BSS: Receives audio message (received)
BSS->>BSS: Calls messageAudioJson(received)
BSS->>BSS: downloadMediaMessage(received.messages[0]) from Meta API
BSS-->>BSS: Audio file buffer
BSS->>BSS: Convert buffer to base64
BSS->>DB: findFirst OpenAI settings (instanceId)
DB-->>BSS: openAiDefaultSettings
alt OpenAI STT Enabled and Configured
BSS->>OAI: speechToText(creds, { message: { base64, ... } })
OAI-->>BSS: transcribedText
BSS->>BSS: Update messageRaw with speechToText
end
BSS->>WS: sendDataWebhook(MESSAGES_UPSERT, messageRaw)
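The only difference between the two flows that reaches the speech-to-text call is the media field: `mediaUrl` when S3 is enabled, `base64` when it is not. A small sketch of that selection is below. The type and function names are hypothetical, and the base64 encoder is injected to keep the example free of runtime dependencies; in Node it would typically be something like `(bytes) => Buffer.from(bytes).toString('base64')`.

```typescript
// Hypothetical types describing where the audio ended up.
type AudioSource =
  | { kind: 's3'; mediaUrl: string }      // S3 enabled: file was uploaded
  | { kind: 'inline'; bytes: Uint8Array }; // S3 disabled: file stays in memory

type SttMessage = { message: { mediaUrl: string } | { base64: string } };
type Base64Encoder = (bytes: Uint8Array) => string;

// Builds the message passed to the speech-to-text call for either flow.
function buildSttMessage(source: AudioSource, encode: Base64Encoder): SttMessage {
  if (source.kind === 's3') {
    return { message: { mediaUrl: source.mediaUrl } };
  }
  return { message: { base64: encode(source.bytes) } };
}
```

Modelling the source as a tagged union makes it a compile-time error to build a message that has neither field.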
Updated Class Diagram for BusinessStartupService
classDiagram
class BusinessStartupService {
-configService: ConfigService
-logger: Logger
-token: string
-instanceId: string
-prismaRepository: PrismaRepository
-openaiService: OpenAIService
+downloadMediaMessage(received: any): Promise<Buffer> // Modified logic for headers
-messageAudioJson(received: any): any // New private method for audio message structure
+onMessage(received: any): void // Represents main handler with updated audio processing & OpenAI STT
}
BusinessStartupService ..> axios : uses
BusinessStartupService ..> ConfigService : uses
BusinessStartupService ..> Logger : uses
BusinessStartupService ..> PrismaRepository : uses
BusinessStartupService ..> OpenAIService : uses
File-Level Changes
Summary by Sourcery
Refine audio message handling by structuring payloads, standardizing media downloads, and integrating optional OpenAI speech-to-text transcription with improved error handling.
New Features:
Bug Fixes:
Enhancements: