
pgsql/logger: open json object from logger fn - v3 #11008

Closed
wants to merge 2 commits

Conversation

jufajardini
Contributor

Previous PR: #10997

Changes from last PR:

  • remove changes to template.rs as per Jason's analysis

Link to redmine ticket:
https://redmine.openinfosecfoundation.org/issues/6983
https://redmine.openinfosecfoundation.org/issues/7000

Provide values to any of the below to override the defaults.

SV_BRANCH=OISF/suricata-verify#1810

Before, the JsonBuilder object for the pgsql event was being created
from the C-side function that actually called the Rust logger.

This meant that if another module - such as the JSON Alert logger - called the
PGSQL logger, we wouldn't have the `pgsql` key present in the log output -
only its inner fields.

Bug OISF#6983
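To illustrate the shape change described above, here is a minimal std-only sketch. It is not Suricata's actual JsonBuilder API; the function names and string building here are hypothetical stand-ins for the real logger:

```rust
// Hypothetical sketch: the logger itself emits the top-level "pgsql" key.
// Suricata uses its JsonBuilder type; plain string formatting stands in here.
fn log_pgsql_inner(request: &str) -> String {
    // Before the fix, only this inner object reached callers that built
    // their own event object (e.g. the alert logger's metadata path).
    format!("{{\"request\":\"{}\"}}", request)
}

fn log_pgsql(request: &str) -> String {
    // After the fix: the Rust logger opens the "pgsql" object itself,
    // so the key is present no matter which module invokes the logger.
    format!("{{\"pgsql\":{}}}", log_pgsql_inner(request))
}
```

With the object opened on the Rust side, both the EVE pgsql logger and the alert metadata path produce `{"pgsql": {...}}` instead of bare inner fields.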
When creating a new PgsqlTransaction, we first increment PgsqlState's
tx_id and then assign that to the new tx's tx_id. As txs are never
created with id 0, there's no need to adjust the id when freeing or searching.

Bug OISF#7000
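The allocation pattern described in the commit message can be sketched as follows. This is a simplified stand-in, not the actual PgsqlState code; struct and method names are illustrative:

```rust
// Illustrative sketch of the 1-N tx_id allocation described above.
struct PgsqlTx {
    tx_id: u64,
}

struct PgsqlState {
    tx_id: u64, // last id handed out; starts at 0
    transactions: Vec<PgsqlTx>,
}

impl PgsqlState {
    fn new_tx(&mut self) -> PgsqlTx {
        // Increment first, then assign: the first tx gets id 1, so the
        // id space is 1-N and no later +1/-1 adjustment is needed.
        self.tx_id += 1;
        PgsqlTx { tx_id: self.tx_id }
    }

    // Lookup uses the id as-is; no correction when searching or freeing.
    fn get_tx(&self, tx_id: u64) -> Option<&PgsqlTx> {
        self.transactions.iter().find(|tx| tx.tx_id == tx_id)
    }
}
```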
@victorjulien
Member

I don't understand the tx_id explanation, can you expand on it? What was happening before? What is the tx_id space? 0-N or 1-N?

@jufajardini
Contributor Author

I don't understand the tx_id explanation, can you expand on it? What was happening before? What is the tx_id space? 0-N or 1-N?

Pgsql's tx_id space is 1-N.

When Suricata does app-layer event logging, it uses JsonPgsqlLogger, which uses data from output-tx; that starts from tx_id, as that's the value it gets from AppLayerParserState's log_id.

When Suricata does alert event logging and adds app-layer metadata, the tx_id used comes from the PacketAlert, whose info comes from AppLayerParserGetTransactionInspectId, and that one seems to have a discrepancy. So the metadata logging, without this fix, will always look for the alert's tx_id+1, and either log the incorrect tx data, or nothing.

If, when creating Pgsql events, I call CreateEveHeaderWithTx instead of CreateEveHeader, I see the tx_id discrepancy in the Pgsql EVE logs. I think that's because the logging function won't actually use the tx_id passed when calling the parser logging function, and will only add it to the event header.

I hope this all makes sense. I could update the commit message for bug 7000 to be something like:

    pgsql: tx_id doesn't need to be adjusted

   PgsqlTransaction.tx_id space is 1-N. Alerts are generated using the actual 
   tx_id from the parser, so, if tx_id == 1, PacketAlert.tx_id == 1.

   Without this fix, the EVE alert metadata for Pgsql will always log tx_id+1, and 
   thus either log the wrong tx, or no tx at all.
   
   When creating a new PgsqlTransaction, we first increment PgsqlState's
   tx_id and then assign that to the new tx's tx_id. As Pgsql txs are never
   created with id 0, there's no need to adjust the id when freeing or
   searching for txs by id.
   
   Extra: there is still a discrepancy between PgsqlState's `tx_id` and 
   AppLayerState's log_id, as that is initialized as zero.

   Bug #7000

About that extra part - feels like something that should be documented.
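The off-by-one described in the proposed commit message can be shown with a small sketch. This is illustrative only; the lookup helper is hypothetical, not Suricata code:

```rust
// Ids are 1-N and the alert carries the real tx_id. The buggy metadata
// path searched for tx_id + 1, as if ids had started at 0.
fn find_tx(tx_ids: &[u64], wanted: u64) -> Option<u64> {
    tx_ids.iter().copied().find(|&id| id == wanted)
}
```

With txs `[1, 2, 3]` and an alert for tx_id 3, looking up `3 + 1` finds nothing (or, for an earlier alert, the wrong tx), while looking up `3` unadjusted finds the right one.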

@jasonish
Member

jasonish commented May 1, 2024

I'm initially curious why other protocols aren't having the same issue - or maybe they are, as this pattern is used throughout.

@jasonish
Member

jasonish commented May 1, 2024

Observation.

In master branch without this patch:

  • by default, the alert record DOES NOT get the pgsql metadata
  • with --simulate-ips, the alert record DOES get the pgsql metadata

With this patch:

  • by default, the alert record DOES get the pgsql metadata
  • with --simulate-ips, the alert record DOES NOT get the pgsql metadata

While I haven't yet, this should be looked at in another protocol as well to see if it's a common pattern.

@suricata-qa

Information:

ERROR: QA failed on SURI_TLPW2_autofp_suri_time.

field                          baseline  test  %
SURI_TLPW2_autofp_stats_chk
  .uptime                      138       146   105.8%
Pipeline 20450

@jufajardini
Contributor Author

Observation.

In master branch without this patch:

  • by default, the alert record DOES NOT get the pgsql metadata
  • with --simulate-ips, the alert record DOES get the pgsql metadata

With this patch:

  • by default, the alert record DOES get the pgsql metadata
  • with --simulate-ips, the alert record DOES NOT get the pgsql metadata

While I haven't yet, this should be looked at in another protocol as well to see if it's a common pattern.

I vaguely remember a discussion where something connected was mentioned - by you, I think? Where we'd see different results between IDS/IPS...

The tx_id for detection, in IDS, is recovered by AppLayerParserGetTransactionInspectId, which takes direction into consideration. Logging doesn't, so it uses something different. I'm curious about what happens in IPS.

I will try to craft SV tests for other protocols, with versions using --simulate-ips (shall add one of those for pgsql, too).

@jufajardini
Contributor Author

I'm initially curious why other protocols aren't having the same issue - or maybe they are, as this pattern is used throughout.

Maybe stateless parsers won't be affected by this?
Afaik we don't have this fix for the alert metadata for all app-protos yet - might that explain why we don't see it elsewhere?

I wonder if the fact that we don't add the tx_id to the EVE header could be masking the issue, or, more likely, it's down to how protocol transactions differ. Or none of the above. I just hope I'm not making us waste time on something that isn't important >__<'

@jasonish
Member

jasonish commented May 2, 2024

I vaguely remember a discussion where something connected was mentioned - by you, I think? Where we'd see different results between IDS/IPS...

There are differences. Detection is run earlier, which can lead to some differing results, such as which packet is logged and what transaction data might be available - but from adding some prints it's clear the transaction is there; just the indexing is off.

However, I don't think the app-layer should need to worry about whether we're in IDS or IPS mode; that should be handled more inside the engine, and this is where things seem to be off here.

@jufajardini
Contributor Author

Closing this as I'm still investigating. I want to submit the next PR once we have a deeper understanding of what's going on, and whether the fix should include other protocols - or follow a different approach, entirely.
