Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialize directly to vec #121

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

mward
Copy link

@mward mward commented Jan 24, 2020

Updated to allow deserializing directly to a Vec, without need to define an intermediate struct. This does not break current functionality, so you can still use intermediate struct if you want.

This will resolve questions like this: https://stackoverflow.com/questions/58571896/rust-serde-deserialize-xml-directly-to-vect

@mward
Copy link
Author

mward commented Jan 24, 2020

The travis CI build failed on my fork too, I had to clear cache in travis to get past this

@mward
Copy link
Author

mward commented Jan 25, 2020

Travis error details:

error[E0514]: found crate `serde_derive` compiled by an incompatible version of rustc
 --> README.md:13:1
  |
3 | extern crate serde_derive;
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: please recompile that crate using this compiler (rustc 1.40.0 (73528e339 2019-12-16))
  = note: the following crate versions were found:
          crate `serde_derive` compiled by rustc 1.39.0 (4560ea788 2019-11-04): /home/travis/build/RReverser/serde-xml-rs/target/debug/deps/libserde_derive-01b75c0c69c91066.so
error: aborting due to 2 previous errors

@punkstarman punkstarman self-assigned this Jan 27, 2020
@punkstarman
Copy link
Collaborator

@mward, I keep seeing this problem and another one that is similar. As you say, clearing the cache works around the problem. I'd love to find a permanent fix that can be automated.

A few months back, I noticed that the option to clear the cache (https://docs.travis-ci.com/user/caching/#clearing-caches) is no longer available on travis-ci.org.

@ghost ghost mentioned this pull request Feb 11, 2020
@ghost
Copy link

ghost commented Feb 13, 2020

@mward are you able to rebase/ merge master so we can see if the CI change I made fixes anything 😄 ?

@mward
Copy link
Author

mward commented Feb 13, 2020

@efx I merged upstream master to my forked project.

Thanks

@JRAndreassen
Copy link

JRAndreassen commented Apr 20, 2020

Do we have a schedule for merging this fix ?
It one solves part my issue with parsing:

// ----------------------------------------------------
#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all(deserialize = "lowercase"))]
pub struct LookupEntrySingle {
    #[serde(rename = "state")]
    lkup_state: String,

    #[serde(rename = "value")]
    lookup_value: u64,    

    #[serde(rename = "$value", default)]
    lkup_label: String,    
}

#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all(deserialize = "lowercase"))]
pub enum LookupEntry{
    #[serde(rename = "SingleInt")]
    SingleInt(LookupEntrySingle),

    #[serde(rename = "Boolean")]
    Boolean(LookupEntrySingle),

    #[serde(rename = "BitField")]
    BitField(LookupEntrySingle),
}



#[derive(Debug, Deserialize, PartialEq)]
pub struct ValueLookup {
    id: String,
    #[serde(rename = "desiredValue", default)]
    desiredvalue: u64,

    #[serde(rename = "undefinedState", default)]
    undefinedstate: Option<String>,

//    #[serde(rename = "Lookups", default)]
    #[serde(rename = "$value")]
    lookups: Vec<LookupEntry>,

}

static LOOKUP_STD: &'static str = r##"<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="int" desiredValue="1" >
<Lookups>
    <SingleInt state="None"    value="0"  >Unknown</SingleInt>
    <SingleInt state="Ok"      value="1" >OK</SingleInt>
</Lookups>
</ValueLookup>
"##;

#[test]
fn test_lookupInt_JSON() {
/* This one is returns one element of the vec, only the last one 
[
    Object(
        {
            "Lookups": Object(
                {
                    "SingleInt": Object(
                        {
                            "$value": String(
                                "OK",
                            ),
                            "state": String(
                                "Ok",
                            ),
                            "value": String(
                                "1",
                            ),
                        },
                    ),
                },
            ),
            "desiredValue": String(
                "1",
            ),
            "id": String(
                "prtgc.std.lookups.status",
            ),
        },
    ),
]
*/
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(LOOKUP_STD) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

#[test]
fn test_lookupInt(){
// This one works
    let _ = simple_logger::init_with_level(Level::Debug);
    let json_values: Vec<serde_json::Value> = 
    match serde_xml_rs::from_str(LOOKUP_STD) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}


#[test]
fn test_lookupdange() {
// This one does not work
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let s = r##"<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="range" desiredValue="1" >
  <Lookups>
    <Range From="100" To="199" state="Ok">Information</Range>
    <Range From="200" To="299" state="Ok">Success</Range>
    <Range From="300" To="399" state="Warning">Redirection</Range>
    <Range From="400" To="499" state="Error">Client Error</Range>
    <Range From="500" To="599" state="Error">Server Error</Range>
  </Lookups>
</ValueLookup>
"##;

    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(s) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

#[test]
fn test_lookupBit() {
// This one does not work
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let s = r##"<?xml version="1.0" encoding="UTF-8"?>
    <ValueLookup id="status" desiredValue="1" >
<Lookups>
  <BitField state="OK" value="0" BitNumber="1">No Alarm</BitField>
  <BitField state="Warning" value="1" BitNumber="1">
    Alarm Active
  </BitField>
</Lookups>
</ValueLookup>
"##;

    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(s) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

@Ten0
Copy link

Ten0 commented Apr 27, 2020

I've been experimenting with this and it seems to not behave properly when the considered Vec should be empty.

The following test (built similarly to your new vec_container test) does not pass:

/** For reference */
#[derive(Debug, Deserialize, PartialEq)]
struct Shelter {
    animals: Vec<Animal>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Animal {
    name: String,
    foo: String,
}
/********************/

#[test]
fn empty_vec_container() {
    let _ = simple_logger::init();

    let s = r##"
    <shelter>
      <animals>
      </animals>
    </shelter>
    "##;

    let shelter: Shelter = from_str(s).unwrap();

    assert_eq!(shelter.animals, Vec::new());
}
test empty_vec_container ... FAILED

failures:

---- empty_vec_container stdout ----
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(shelter, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked EndElement(animals)
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Pushed EndElement(animals)
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Pushed StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked EndElement(animals)
thread 'empty_vec_container' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { field: "missing field `name`" }', tests/test.rs:282:28

Other than that it's very practical!

@mward
Copy link
Author

mward commented May 19, 2020

@Ten0 I have a fix for the empty_vec_container on my branch.

Before fixing that I realized that the test "test_doctype_fail" was failing for some unknown reason. That test also fails when I run directly from "RReverser/serde-xml-rs" master. I temporarily ignored that test on my branch

@mward
Copy link
Author

mward commented May 19, 2020

I removed ignore from 'test_doctype_fail' test. Locking xml-rs and thiserror deps to specific versions fixed the problem

@Ten0
Copy link

Ten0 commented May 19, 2020

I removed ignore from 'test_doctype_fail' test. Locking xml-rs and thiserror deps to specific versions fixed the problem

I don't beleive pinning these crates to old versions is the right approach, I think it's better to rather fix the reason why it's failing in either the dependencies or this test (depending on which one is the cause). Besides, as you pointed out it's unrelated to this PR, and I had already opened an issue to track this: #124

@@ -1,6 +1,6 @@
# serde-xml-rs

[![Build Status](https://travis-ci.org/RReverser/serde-xml-rs.svg?branch=master)](https://travis-ci.org/RReverser/serde-xml-rs)
[![Build Status](https://travis-ci.org/mward/serde-xml-rs.svg?branch=master)](https://travis-ci.org/mward/serde-xml-rs)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be included as part of this PR?

@steven-joruk
Copy link

Is there anything I can do to help get this merged?

This PR enables me to remove a lot of ugly container structs from my code base which has to support parsing very large and poorly designed XML document.

@mward
Copy link
Author

mward commented Oct 8, 2020

@steven-joruk - I'd like to see this get merged too, but in the meantime I have my Cargo.toml point to:

serde-xml-rs = { git = "https://github.com/mward/serde-xml-rs.git", branch = "deserialize-directly-to-vec" }

BTW - I've merged master onto my fork and the CI build is still passing

@steven-joruk
Copy link

steven-joruk commented Oct 26, 2020

This might be my misunderstanding, but I'm seeing my "nested" example fail where I would expect it to succeed. This is pretty much the nested_collection unit test within a container deserialized in to a vec.

use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Root {
    #[serde(rename = "Groups")]
    groups: Vec<Group>,
}

#[derive(Debug, Deserialize)]
struct Group {
    #[serde(rename = "Item")]
    items: Vec<Item>,
}

#[derive(Debug, Deserialize)]
struct Item {}

fn main() {
    let not_nested = "
    <Root>
        <Groups>
            <Group>
            </Group>
        </Groups>
    </Root>
    ";

    let root: Root = serde_xml_rs::from_str(not_nested).unwrap();
    println!("{:#?}", root);

    let nested = "
    <Root>
        <Groups>
            <Group>
                <Item/>
                <Item/>
            </Group>
        </Groups>
    </Root>
    ";

    let root: Root = serde_xml_rs::from_str(nested).unwrap();
    println!("{:#?}", root);
}

Output:

Root {
    groups: [],
}
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnexpectedToken { token: "XmlEvent::EndElement { name, .. }", found: "StartElement(Item, {\"\": \"\", \"xml\": \"http://www.w3.org/XML/1998/namespace\", \"xmlns\": \"http://www.w3.org/2000/xmlns/\"})" }', src/main.rs:41:53

@malthe
Copy link

malthe commented Jan 23, 2021

Bump

@MoralCode
Copy link

Been parsing though this PR to try and understand it and apply it to the latest version of this repo. i was able to cherry pick only commits 42b5c44 and 2e4d9f0 and have things work the same (as far as my limited testing) as if i had included this whole PR. In case I don't end up figuring out how this PR works and fixing the issue that @steven-joruk and I are seeing, I hope this information at least helps someone else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants