Skip to content

Conversation

@chime3
Copy link
Contributor

@chime3 chime3 commented Sep 16, 2024

Closes #98

chime3 and others added 30 commits August 8, 2024 20:33
@chime3
Copy link
Contributor Author

chime3 commented Oct 24, 2024

What's changed

Ignored archived channels and bot messages

How to test

image

Anything to be aware of

Please check ReadMe

@chime3 chime3 requested a review from tahpot October 24, 2024 03:59
Copy link
Member

@tahpot tahpot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review complete, see comments.

protected async buildChatGroupList(): Promise<SchemaSocialChatGroup[]> {
const client = this.getSlackClient();
let channelList: SchemaSocialChatGroup[] = [];
const types = ["im", "private_channel", "public_channel"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be using the channelTypes config options

sourceId: channel.id,
schema: CONFIG.verida.schemas.CHAT_GROUP,
sourceData: channel,
insertedAt: new Date().toISOString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is therre a channel.created you can use here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

channel.created indicates 1970 always, so used channel.updated instead

const types = ["im", "private_channel", "public_channel"];

for (const type of types) {
const conversations = await client.conversations.list({ types: type });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a limit to how many conversations are returned? Is it necessary to iterate through multiple pages of conversation lists?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use MAX_BATCH_SIZE and removed iteration

let totalMessages = 0;
let chatHistory: SchemaSocialChatMessage[] = [];

for (let i = 0; i < mergedGroupList.length; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use for (const group of mergedGroupList) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use for (const group of mergedGroupList) {

Okay

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be changed


let query: ConversationsHistoryArguments = {
channel: group.sourceId!,
limit: this.config.messagesPerGroupLimit,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the maximum limit supported by Slack API?

What happens if this.config.messagesPerGroupLimit is higher than that number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slack has limit 1000, and if this.config.messagesPerGroupLimit is higher than 1000, will return maximum 1000 items

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be documented somewhere

throw new Error("Invalid group or missing group sourceId");
}

let items: SchemaSocialChatMessage[] = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See telegram implementation, this code should be wrapped in a loop to ensure the maximum number of results are returned, which will also automatically handle the backfill so that code can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slack supports cursor based pagination and I followed Google's item range tracking which is different from TG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment still stands, this needs to be refactored to match the Telegram impelementation.

This will only backfill once, however there could be multiple backfills required. As a result, the full batch size won't be filled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let's leave this for now.

Google has the same issue.

When we do the refactor we can fix this for all the handlers.

totalMessages: number,
groupCount: number
) {
syncPosition.status = SyncHandlerStatus.SYNCING;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If totalMessages=0 the syncPosition.status should be ENABLED, not SYNCING.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

src/schemas.ts Outdated
export interface SchemaSocialChatGroup extends SchemaRecord {
newestId?: string
syncData?: string
type?: string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this?

Just tried to represent DM, private, or public. But not used. I will remove type

@chime3 chime3 requested a review from tahpot October 28, 2024 03:29
Copy link
Member

@tahpot tahpot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

let totalMessages = 0;
let chatHistory: SchemaSocialChatMessage[] = [];

for (let i = 0; i < mergedGroupList.length; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be changed


let query: ConversationsHistoryArguments = {
channel: group.sourceId!,
limit: this.config.messagesPerGroupLimit,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be documented somewhere

throw new Error("Invalid group or missing group sourceId");
}

let items: SchemaSocialChatMessage[] = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment still stands, this needs to be refactored to match the Telegram impelementation.

This will only backfill once, however there could be multiple backfills required. As a result, the full batch size won't be filled.

throw new Error("Invalid group or missing group sourceId");
}

let items: SchemaSocialChatMessage[] = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let's leave this for now.

Google has the same issue.

When we do the refactor we can fix this for all the handlers.

const firstGroupId = chatMessages[0].groupId;
const firstGroupMessages = chatMessages.filter(msg => msg.groupId === firstGroupId);
assert.equal(firstGroupMessages.length, handlerConfig.messagesPerGroupLimit, "Processed correct number of messages per group");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test to make sure most recent messages are processed first

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test to make sure most recent messages are processed first

Done


// Ensure second batch results are returned
assert.ok(secondBatchResults && secondBatchResults.length, "Have second batch results returned");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check second batch items aren't in the first batch
Check second batch first item is older than the first batch last item

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check second batch items aren't in the first batch

Done

Check second batch first item is older than the first batch last item

A batch has multiple channels, can't do cross-channel ordering

);

// Check if synced every chat group correctly
const syncedGroup = (secondBatchChatGroups.filter(group => group.sourceId === secondBatchChatMessages[0].groupId))[0];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only checks one group, not all the groups

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only checks one group, not all the groups

Done

}
});
});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests to show that each type of chat group works:

SlackChatGroupType.IM,
SlackChatGroupType.PRIVATE_CHANNEL,
SlackChatGroupType.PUBLIC_CHANNEL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing this test requires to have all kind of channels and messages in every channel.
Enabled DM only by default.

  const groupTypes = {
        [SlackChatGroupType.IM]: false, // enabled
        [SlackChatGroupType.PRIVATE_CHANNEL]: true,
        [SlackChatGroupType.PUBLIC_CHANNEL]: true,
      };

@chime3 chime3 requested a review from tahpot November 18, 2024 02:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[New connector] Slack

3 participants