Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the ability to see command ack #168

Closed
wants to merge 1 commit into from

Conversation

djnugent
Copy link

A quick attempt at showing how I think command acknowledgment should be handled. This was denied as an option for issue #114 since mavlink is a "lossy protocol." I am not aware of the properties which make mavlink "lossy", but I feel this method would allow the user to see that the connection is acting "lossy" if the COMMAND_ACK message times out.

Here is an example: https://gist.github.com/djnugent/edfbab1b2458d5773490

@mrpollo @billbonney thoughts?

@eliao
Copy link

eliao commented Jun 18, 2015

It seems problematic to block on the dronekit script's thread waiting on an ack. Perhaps that works for some use cases, but it would not work for real time applications working over a lossy link. In that case this would cause delays while waiting for an ack that will never come. Over a lossy connection, one needs to assume that any and all packets can be dropped.

@mrpollo
Copy link
Member

mrpollo commented Jun 18, 2015

I agree with Eric, the "lossy" really comes from the link between mavproxy and the autopilot, the transport can be anything from a 3DR Radio to Wifi, we can't control the time it takes to get an ACK, some of the symptoms are what @eliao describe, and yes all packets can be dropped.

@geeksville probably has something to say here too

@djnugent
Copy link
Author

I put a timeout in so we don't block the thread forever. The documentation would just address the downsides of waiting for a command ack and explain the timeout. And I feel you are less likely to drop a packet if the connection is hard wire serial. i.e. a companion computer to pixhawk which is going to be the main audience for dronekit-python. wait_valid() is a bigger sin with no timeout(unless downloading waypoints is different).

Thanks for the input!

@billbonney
Copy link

Checking the ACK form sending a command would confirm the command was accepted etc., but not necessarily acted upon. ie you could see an ACK to send a command to TAke Off, but then it won't Take Off as other conditions will not have been met. As I have written before, when issuing a command the only indication that it was accepted, is that the vehicle properties change to reflect that action.

@billbonney
Copy link

PS: And qhat should be the value of the timeout? 0.5s or 5s or 10s. Timeouts are really tricky to gauge in protocols. A timeout for arming is actually about 5s and then variable.

@djnugent
Copy link
Author

Why would the vehicle not take off if it sent a MAV_RESULT_ACCEPT??? A MAV_RESULT_FAILED is sent if you are not in guided mode, not armed, and have already taken off a previously. I am confused...

@djnugent
Copy link
Author

Yes default timeouts are hard. We could also just set the default to 0 and make the user figure out the best timeout for their command.

@tcr3dr
Copy link
Contributor

tcr3dr commented Jun 18, 2015

Throwing in some comments off the top of my head, ideally you'd want something you can poll for success, so it could easily integrate into gevent.

import gevent

def command_1():
   await_ack(task())

def command_2():
   await_ack(task())

gevent.spawn(command_1)
gevent.spawn(command_2)
# Nothing is blocked, it's just in the event loop
gevent.wait()

Do we want to highly discourage people from just blocking waiting for an ack, but instead allow them to poll if they want complete error handling?

@eliao
Copy link

eliao commented Jun 18, 2015

Ideally, user code shouldn't have to deal with this. In a non-lossy environment, users don't have to. In a lossy environment, this could be solved at the protocol level via retries (preferred solution). A somewhat worse solution is having DroneKit itself spawn a thread for these commands and ensuring they get completed. This is what many of our applications currently do.

@hamishwillee
Copy link
Contributor

I'm most interested in the MAV_RESULT_FAILED as a way to understand there is a problem earlier than a timeout and get some idea of what the failure might be. The API should be asynchronous with some way to make it blockable if that is the way you want to code. It would be "best" if the callback is associated with a particular sent command and if for whatever reason it doesn't complete there is some cleanup on it.

@hamishwillee
Copy link
Contributor

Somewhat related, our solution should fit in with whatever we need to deliver #169

@djnugent
Copy link
Author

@tcr3dr I'm not familiar with events so you lost me there.
@eliao Retries on the lower lever is ideal, but that is beyond me. This was just an experiment.

I think call backs are great to prevent blocking but I imagine we are reaching out to "arduino" type developers who are coding amateurs. I bet they wouldn't care to wait for command_ack. It's just part of the API! The idea is to be able to do a,b,c,d... in sequential order. Dealing with mavlinks lossy connection is no different than dealing with a noisy sensor with arduino. You have to filter out bad behavior. This method is not super effective or ideal but it makes it easy for the new user.

Obviously we still need call back support for the advanced users.

So is this a good temporary solution or does it cause more problems than benefits?

@billbonney
Copy link

The callback ack from the command long doesn't add much to the logic. Even for the advanced user. It's indicative in a log for example. But in the end most actions commanded are only relevant when you look at the result from the vehicle status. I.e. A change in mode by monitoring the heartbeat and the status text.
If you want to make this easier for the Arduino crowd. Just make the default to ignore the ack as a false sense that a command has been sent and communicate in the docs how this works. Command long even has a bit to request the ack is not even sent.
In AP2 rarely do we check to see the result other than to put it in the log sometimes

@djnugent
Copy link
Author

djnugent commented Jul 1, 2015

Not a complete solution.

@djnugent djnugent closed this Jul 1, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants