This is the development branch for V2.0.0 which includes parallel trace support as well as all the SWO goodness that Orbuculum has always offered. It is pretty close to being labelled as 2.0.0 with only a few minor issues outstanding.

ORBTrace (the FPGA trace interface) has now been moved into its own separate repository as it's grown considerably and really needs its own identity. History for orbtrace until the split point is maintained here for provenance purposes, but new work is now done over in the new location.

The CHANGES file now tells you what's been done recently.

Orbuculum now has an active Discord channel at https://discord.gg/P7FYThy . That's the place to go if you need interactive help.

An Orbuculum is a Crystal Ball, used for seeing things that would
be otherwise invisible. A nodding reference to (the) BlackMagic
(debug probe), BMP.

You can find information about using this suite at [Orbcode](https://www.orbcode.org), especially the 'Data Feed' section.

**ORBTrace Mini is now available for debug and realtime tracing. Go to [Orbcode](https://www.orbcode.org) for more info.**

For the current development status you will need to use the `Devel` branch. Fixes are made on main and Devel.

The code is in daily use now and small issues are patched as they are found. The software runs on Linux, OSX and Windows and the whole suite is working OK on most workloads. Any bugs found now will be treated as high priority issues. Functional enhancements will also be folded in as time permits. Progress is currently reasonably rapid, and patches are always welcome.

What is it?
===========

* orbtrace: The FPGA configuration controller.
* orbzmq: ZeroMQ server.

A few simple use cases are documented in the last section of this
document, as are example outputs of using orbtop to report on the
activity of BMP while emitting SWO packets.

The data flowing from the SWO pin can be encoded
either using NRZ (UART) or RZ (Manchester) formats. The pin is a
dedicated one that would be used for TDO when the debug interface is
in JTAG mode. We've demonstrated
ITM feeds of around 4.3MBytes/sec on a STM32F427 via SWO with Manchester
encoding running at 48Mbps.
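The reason Manchester encoding can run without prior baud agreement is that it is self-clocking: every bit cell contains a mid-cell transition, so the receiver recovers the clock from the data itself. A minimal sketch (our illustration, assuming the IEEE 802.3 convention where a logical 1 is a low-to-high transition and a 0 is high-to-low):

```python
def manchester_encode(bits):
    """Encode a bit sequence as half-bit pairs.

    Convention assumed here (IEEE 802.3): logical 1 -> low->high (0, 1),
    logical 0 -> high->low (1, 0). Other conventions invert this.
    """
    out = []
    for b in bits:
        # Each bit becomes two half-bits with a guaranteed mid-cell edge
        out.extend((0, 1) if b else (1, 0))
    return out

# Every encoded bit carries its own clock edge, so the receiver needs no
# agreed baud rate up front -- unlike NRZ/UART framing.
print(manchester_encode([1, 0, 1]))  # [0, 1, 1, 0, 0, 1]
```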

The data flowing from the TRACE pins is clocked using a separate TRACECLK pin. There can
be 1-4 TRACE pins which obviously give you much higher bandwidth than the single SWO. We've demonstrated
ITM feeds of around 12.5MBytes/sec on a STM32F427 via 4 bit parallel trace.

Whatever its source, orbuculum takes this data flow and makes it accessible to tools on the host
PC. At its core it takes the data from the source, decodes it and presents it on a network
port.

gdb setup files for each device type can be found in the `Support` directory. You'll find
example get-you-going applications in the [Orbmule](https://github.com/orbcode/orbmule) repository including
`gdbinit` scripts for OpenOCD, pyOCD and Blackmagic Probe Hosted. There are walkthroughs for lots of examples
of the use of the orbuculum suite at [Orbcode](https://orbcode.org).

When using SWO Orbuculum can use, or bypass, the TPIU. The TPIU adds overhead
to the datastream, but provides better synchronisation if there is corruption
on the link. To include the TPIU in the decode stack, provide the -t
option on the command line. If you don't provide it, and the ITM decoder sees
TPIU frames in the stream, it will warn you; that check was added
after I spent two days trying to find an obscure bug 'cos I'd left the `-t` option off. You can give multiple
channels to the `-t` option, which is useful when you've got debug data in one stream and
trace data in another.

Beware that in parallel trace the TPIU is mandatory, and therefore so is the -t option.

TPIU framing can be stripped either
by individual applications or the `orbuculum` mux. When it's stripped by the mux the data are made available on
consecutive TCP/IP ports...so `-t 1,2` would put stream 1 data out over TCP port 3443 and stream 2 over 3444, by default.
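For the curious, the framing being stripped is the ARM CoreSight TPIU formatter protocol: 16-byte frames in which even-numbered bytes either carry data or switch the active stream ID, odd bytes are always data, and the final byte holds deferred LSBs and ID-timing flags. A rough demultiplexer sketch based on the CoreSight specification (our illustration, not orbuculum's actual code):

```python
def tpiu_demux(frame, current_id, out):
    """Split one 16-byte TPIU frame into per-stream buffers in `out`.

    Sketch of the CoreSight formatter rules, not orbuculum's code:
    even bytes with LSB set are stream-ID changes, even bytes with LSB
    clear are data whose real LSB lives in the auxiliary byte (byte 15),
    and odd bytes 1..13 are plain data for the current stream.
    """
    aux = frame[15]
    delayed_id = None
    for n in range(8):
        b = frame[2 * n]
        flag = (aux >> n) & 1          # aux bit n corresponds to byte 2n
        if b & 1:                      # stream ID change
            if flag:                   # change applies after the next byte
                delayed_id = b >> 1
            else:
                current_id = b >> 1
        else:                          # data byte; restore its LSB from aux
            out.setdefault(current_id, bytearray()).append(b | flag)
        if 2 * n + 1 < 15:             # odd bytes are always data
            out.setdefault(current_id, bytearray()).append(frame[2 * n + 1])
        if delayed_id is not None:     # apply a deferred ID change
            current_id, delayed_id = delayed_id, None
    return current_id

# One frame: switch to stream 1, then bytes 'A', 'C' (LSB via aux), 'D'
frame = bytes([0x03, 0x41, 0x42, 0x44] + [0x00] * 11 + [0x02])
streams = {}
tpiu_demux(frame, 0, streams)
```

With that frame, `streams[1]` starts with `b"ACD"` and nothing lands on stream 0, since the ID switch happens before any data byte.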

When in NRZ (UART) mode the SWO data rate that comes out of the chip _must_
match the rate that the debugger expects. On the BMP speeds of
2.25Mbps are normal, TTL Serial devices tend to run a little slower
but 921600 baud is normally achievable. On BMP the baudrate is set via
the gdb session with the 'monitor traceswo xxxx' command. For a TTL
Serial device it's set by the Orbuculum command line. Segger devices
can normally work faster, but no experimentation has been done to
find their max limits, which are probably dependent on the specific JLink
you are using. ORBTrace Mini can operate with Manchester encoded SWO at up
to 48Mbps, which means there's no speed matching needed to use it, and it should
continue to work correctly even if the target clock speed changes (e.g. when it goes
into a low power mode). This is a good thing.

Configuring the Target
======================

The command line options for Orbuculum are available by running
orbuculum with the -h option. Orbuculum is just the multiplexer; the
fifos are now provided by `orbfifo`. *This is a change to the previous
way the suite was configured, where the fifos were integrated into `orbuculum` itself*. Note that
orbfifo isn't available on Windows (there are no fifos), so use orbzmq to get similar functionality
on that platform.

Simply start orbuculum with the correct options for your trace probe and
then you can start or stop other utilities as you wish. A typical command
to run orbuculum would be;

```$ orbuculum --monitor 100```

In this case, because no source options were provided on the command line, input
will be taken from a Blackmagic probe USB SWO feed, or from an ORBTrace mini if it can find one.

Orbzmq
------

`orbzmq` is a utility that connects to orbuculum over the network and outputs data from the various ITM HW and SW channels that it finds. This output is sent over a [ZeroMQ](https://zeromq.org/) PUBLISH socket bound to a specified URL. Each published message is composed of two parts: **topic** and **payload**. The topic can be used by consumers to filter incoming messages; the payload contains the actual message data - formatted or raw data for SW channels, and a predefined format for HW channels.

`orbzmq` will create a ZeroMQ socket bound to address `tcp://localhost:1234` and publish messages with topics: `text`, `raw`, `formatted`, `hweventAWP`, `hweventOFS` and `hweventPC`.

A simple Python client can receive messages in the following way:

```python
import zmq

# Subscribe to the orbzmq publisher and filter on the 'text' topic;
# messages arrive as (topic, payload) multipart pairs.
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://localhost:1234")
socket.setsockopt_string(zmq.SUBSCRIBE, "text")

while True:
    topic, payload = socket.recv_multipart()
    print(topic.decode(), payload.decode(errors="replace"))
```

That is 124.4GBytes of user to user transfer, 155.5GBytes of line transfer with no errors at 5.4MBytes/sec line,
4.32MBytes/sec user-to-user, assuming my math is holding up. As with blackmagic...if the wiring is reliable, the
link will be. The equivalent test on the same chip using 4 bit parallel TRACE gives a user-to-user data rate
of around 12.5MBytes/sec...at that point you're limited by the speed the CPU can put data out onto the line rather
than by the link itself.
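Those figures are internally consistent; a quick back-of-envelope check (our arithmetic, not from the original test logs):

```python
# Figures quoted above for the SWO soak test
line_gbytes, user_gbytes = 155.5, 124.4   # total transferred
line_rate, user_rate = 5.4, 4.32          # MBytes/sec

# The user/line ratio should match whether computed from the totals or
# from the rates; both come out at 0.80, i.e. roughly 20% of the line
# bandwidth is consumed by TPIU/protocol overhead.
ratio_totals = user_gbytes / line_gbytes
ratio_rates = user_rate / line_rate
print(round(ratio_totals, 3), round(ratio_rates, 3))  # 0.8 0.8
```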