MNT: Upgrade space_packet_parser to next release#2393
MNT: Upgrade space_packet_parser to next release#2393greglucas merged 3 commits intoIMAP-Science-Operations-Center:devfrom
Conversation
| values = [ | ||
| item.raw_value for i, item in enumerate(packet.values()) if i > 6 | ||
| ] | ||
| header = { | ||
| key: value | ||
| for i, (key, value) in enumerate(packet.items()) | ||
| if i <= 6 | ||
| } |
There was a problem hiding this comment.
In XTCE parsing there is no notion of header items, everything is just an item. So we have to know implicitly that we are parsing a CCSDS packet here and get the first 7 items if that is how we want to separate it out. i.e., the .header and .user_data attributes are now deprecated.
There was a problem hiding this comment.
I find this to be an undesirable change in space_packet_parser. I am surprised that the spp.ccsds_generator doesn't separate out the CCSDS primary header. Maybe a misnomer in calling it a ccsds_generator.
Maybe there is a way to specify a group of variables in XTCE? I just wonder if spp is trying to be too generic?
There was a problem hiding this comment.
I am surprised that the spp.ccsds_generator doesn't separate out the CCSDS primary header
ccsds_generator CCSDSPacketBytes do have the header and data items on it, so you can get those out from the raw bytes directly (and the individual fields)
https://github.com/lasp/space_packet_parser/blob/afb84a06a741e0a5f3ab4bfa5a77a5f7e75529e5/space_packet_parser/generators/ccsds.py#L123-L136
This is because XTCE does not have the notion of CCSDS-specific things as you point out below. The parse_bytes now returns a plain dictionary of items. You can access those items however you want, but they are what was in the XTCE, so you could even have the XTCE defined as PRIMARY_HEADER = 6 bytes and ignore all the subfields, but we have no way of knowing how a user defined their XTCE. Previously the header property did this automatically for the user under the hood under the assumption that everything had those header items defined.
Maybe there is a way to specify a group of variables in XTCE?
That is a really good question! I don't know.
I just wonder if spp is trying to be too generic?
Possibly... I think this issue is because GLOWS and MAG decided not to use the helper routine to go to xarray directly, and wanted to pull things out into dataclasses. I didn't want to mess with that portion here, but I personally think this should be refactored out to use those helpers instead.
|
|
||
| # Set up the parser from the input packet definition | ||
| packet_definition = definitions.XtcePacketDefinition(xtce_packet_definition) | ||
| packet_definition = spp.load_xtce(xtce_packet_definition) |
There was a problem hiding this comment.
There is a packet_file_to_datasets in space_packet_parser itself now. I'm going to leave that as a separate issue to tackle that upgrade and rip this code out because I have a feeling it won't be completely straightforward and require a bit of dataset manipulation on our end after that to update the time fields and the like.
There was a problem hiding this comment.
nice to know that the library has that now! so this function will be going away in the future?
| try: | ||
| time_key = list(data.keys())[7] | ||
| except IndexError: | ||
| logger.debug( |
There was a problem hiding this comment.
This is with one specific file in an Ultra test where there is only CCSDS headers being parsed.
tmplummer
left a comment
There was a problem hiding this comment.
Interesting changes from the spp update. Mostly looks good and the one thing that I really take issue with is on the spp side, not imap_processing.
| values = [ | ||
| item.raw_value for i, item in enumerate(packet.values()) if i > 6 | ||
| ] | ||
| header = { | ||
| key: value | ||
| for i, (key, value) in enumerate(packet.items()) | ||
| if i <= 6 | ||
| } |
There was a problem hiding this comment.
I find this to be an undesirable change in space_packet_parser. I am surprised that the spp.ccsds_generator doesn't separate out the CCSDS primary header. Maybe a misnomer in calling it a ccsds_generator.
Maybe there is a way to specify a group of variables in XTCE? I just wonder if spp is trying to be too generic?
| values = [ | ||
| item.raw_value for i, item in enumerate(packet.values()) if i > 6 | ||
| ] | ||
| header = { | ||
| key: value | ||
| for i, (key, value) in enumerate(packet.items()) | ||
| if i <= 6 | ||
| } |
There was a problem hiding this comment.
Should this repeated code go into a function somewhere... maybe in the ccsds_data module?
There was a problem hiding this comment.
Possibly. You are thinking of passing in the entire packet and plucking off the first 7 items within that routine instead of subsetting out the items before calling it? We would still need to do the values portion externally here though.
I'm inclined to live with the duplicate code, but can factor out that portion if it'd be helpful.
There was a problem hiding this comment.
I ended up pulling this into a helper function separate_header_userdata() and used it in both locations.
| with open(packet_file, "rb") as binary_data: | ||
| packet_generator = packet_definition.packet_generator(binary_data) | ||
| for packet in packet_generator: | ||
| for binary_packet in spp.ccsds_generator(binary_data): | ||
| try: | ||
| packet = packet_definition.parse_bytes(binary_packet) | ||
| except UnrecognizedPacketTypeError as e: | ||
| # NOTE: Not all of our definitions have all of the APIDs | ||
| # we may encounter, so we only want to process ones | ||
| # we can actual parse. | ||
| logger.debug(e) | ||
| continue |
There was a problem hiding this comment.
You could just call decom.decom_packets() here. I know that it would mean two for loops instead of one, but I don't think there would be a significant performance hit and it would reduce duplicate code.
There was a problem hiding this comment.
Let me look into this. I like the idea. I'm thinking we might be able to make this small portion a separate generator like yield_and_skip_unrecognized_packets
There was a problem hiding this comment.
Nice suggestion here! The latest commit moved the only function in decom.py over to here so we can re-use it as a generator instead (no need to be a list). I think this is a nice consolidation and I was even able to use that in the mag/glows routines to remove the other spp imports so that spp is only imported in this one module now.
This gets rid of some warnings in the code base and simplifies some instantiation logic.
82d376c to
b5f0b2e
Compare
tech3371
left a comment
There was a problem hiding this comment.
I don't know enough with MAG and GLOWS but other code changes looks good to me.
|
|
||
| # Set up the parser from the input packet definition | ||
| packet_definition = definitions.XtcePacketDefinition(xtce_packet_definition) | ||
| packet_definition = spp.load_xtce(xtce_packet_definition) |
There was a problem hiding this comment.
nice to know that the library has that now! so this function will be going away in the future?
| yield packet | ||
|
|
||
|
|
||
| def separate_header_userdata(packet: dict) -> tuple[dict, dict]: |
There was a problem hiding this comment.
can you add some more comments here? Looks like GLOWS and MAG is using that.
There was a problem hiding this comment.
I changed the name of the function to separate_ccsds_header_userdata to make it a bit more explicit that it is in relation to ccsds.
I also added a note that people shouldn't use this. Yes it is used only in GLOWS and MAG and we shouldn't use it elsewhere IMO so I added a note about that. We want to steer all instruments to packet_file_to_datasets if we can so we get to an xarray dataset immediately.
| return dataset_by_apid | ||
|
|
||
|
|
||
| def packet_generator( |
There was a problem hiding this comment.
Will this function replace packet_file_to_dateasets in the future?
There was a problem hiding this comment.
No, this is used by packet_file_to_datasets. It is a helper method for that routine.
This function shouldn't be used by other instruments, so we can explicitly mention that. Also document what it does by stripping off the first 7 values and then returning the rest as userdata.
| yield packet | ||
|
|
||
|
|
||
| def separate_ccsds_header_userdata(packet: dict) -> tuple[dict, dict]: |
There was a problem hiding this comment.
Thanks for adding this... especially nice documentation.
tmplummer
left a comment
There was a problem hiding this comment.
Sorry that I was a bit slow to get back to this review. Looks good to me.
edbc50a
into
IMAP-Science-Operations-Center:dev
This gets rid of some warnings in the code base (especially our AWS logs with lxml warnings) and simplifies some instantiation logic. I also want to try bringing in some segmented packet combinations in the future, so this is in preparation for that as well.
This does make things more explicit on our end which is probably a good thing, but comes at the detriment of adding some more code. I didn't realize how much was hidden by the previous generators with parse_bad_pkts and ignoring some of the warnings. There are things when it is an unrecognized packet that we need to catch in our dataset creation now rather than that being hidden for us by the previous generator.