Commit c697bb8

Merge pull request #1957 from vivekverma-arista/agg-voq-4.0
Add HLD for aggregate VOQ counters.
2 parents 4575d1d + 23399e4 commit c697bb8

1 file changed, +131 -0 lines changed

doc/voq/aggregate_voq_counters.md

# Aggregate VOQ Counters in SONiC #
#### Rev 1.0

## Table of Contents
* [Revision](#revision)
* [Overview](#overview)
* [Requirements](#requirements)
* [Architecture Design](#architecture-design)
* [High-Level Design](#high-level-design)
* [Repositories that need to be changed](#repositories-that-need-to-be-changed)
* [Configuration and management](#configuration-and-management)
* [Testing Requirements/Design](#testing-requirementsdesign)
* [System Test cases](#system-test-cases)
* [Limitations and future work](#limitations-and-future-work)

### Revision
| Rev | Date | Author | Change Description |
|:---:|:-----------:|:----------------------------------------------------------------------------------:|-----------------------------------|
| 1.0 | 19-Nov-2024 | Harsis Yadav, Pandurangan R S, Vivek Kumar Verma (Arista Networks) | Initial public version |

### Overview

In a [distributed VOQ architecture](https://github.com/sonic-net/SONiC/blob/master/doc/voq/architecture.md), for each output VOQ present on an ASIC there are corresponding VOQs on every ASIC in the system. Each ASIC maintains its own set of VOQ stats in the FSI, so the stats have to be gathered from each ASIC independently. This makes them hard to visualize and gives the user a non-cohesive experience.

### Requirements

Provide aggregate VOQ counters in a distributed VOQ architecture.

### Architecture Design

No new architecture changes are required to SONiC.

### High-Level Design

On multi-ASIC systems (fixed system or linecard), the redis databases corresponding to each namespace are exposed on the docker network (non-loopback IP); see [docker-database-init.sh](https://github.com/sonic-net/sonic-buildimage/blob/master/dockers/docker-database/docker-database-init.sh). When the IP to be bound to a redis instance is not the loopback IP, the redis server is started in unprotected mode; see [supervisord.conf.j2](https://github.com/sonic-net/sonic-buildimage/blob/master/dockers/docker-database/supervisord.conf.j2#L38).

On a chassis-based system, we can leverage this property to expose the redis instance of each ASIC over the midplane IP address, in addition to the docker network. The shell script docker-database.sh does this with simple redis commands:

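```
# Add the midplane IP to the addresses the redis instance is bound to
redis-cli -h $ip -p $port config set bind "$bound_ips $midplane_ip"
# Persist the new bind setting to the redis configuration file
redis-cli -h $ip -p $port config rewrite
```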

This gives the supervisor access to each ASIC's redis instance. The queuestat script can then read the counter data and present the user with an aggregated view of the VOQ counters.

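As a rough illustration of the bind update described above, the sketch below builds the two `redis-cli` invocations in Python without executing them. The helper names `build_bind_arg` and `bind_commands` and the sample addresses are hypothetical and are not part of docker-database.sh. The only behavior taken from this document is appending the midplane IP to the already-bound IPs and then running `config rewrite`; skipping an address that is already bound is an added assumption.

```python
from __future__ import annotations


def build_bind_arg(bound_ips: str, midplane_ip: str) -> str:
    """Return the value for `config set bind`, adding midplane_ip at most once."""
    addresses = bound_ips.split()
    if midplane_ip not in addresses:  # assumption: avoid duplicate entries
        addresses.append(midplane_ip)
    return " ".join(addresses)


def bind_commands(host: str, port: int, bound_ips: str,
                  midplane_ip: str) -> list[list[str]]:
    """Build the two redis-cli argument vectors; nothing is executed here."""
    base = ["redis-cli", "-h", host, "-p", str(port)]
    return [
        base + ["config", "set", "bind", build_bind_arg(bound_ips, midplane_ip)],
        base + ["config", "rewrite"],
    ]


# Hypothetical addresses: a docker-network IP and a midplane IP.
commands = bind_commands("240.127.1.2", 6379, "240.127.1.2", "10.0.5.16")
for argv in commands:
    print(" ".join(argv))
```

Keeping the commands as argument vectors means the space-separated bind list stays a single argument, which mirrors the quoting in the shell snippet.
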
#### sonic-buildimage changes
`docker_image_ctl.j2` needs to be modified in order to expose the namespace redis instances on linecards over the midplane network.

PR: https://github.com/sonic-net/sonic-buildimage/pull/20803

#### sonic-swss-common changes
A new API will be added to sonicv2connector.cpp that takes the db name and the host IP as arguments and connects to the corresponding redis instance. The existing [API](https://github.com/sonic-net/sonic-swss-common/blob/202411/common/sonicv2connector.cpp#L18-L30) needs only the db_name; the host_ip and port (or unix_socket) are decoded using [database_config.json](https://github.com/sonic-net/sonic-buildimage/blob/master/dockers/docker-database/database_config.json.j2). That API is tailored for use cases where you connect to the namespace redis instances from the same device. Our use case involves connecting to redis instances over the midplane IP, hence the new API is needed.

PR: https://github.com/sonic-net/sonic-swss-common/pull/1003

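The difference between the two lookups can be sketched in Python as follows. This is only an illustration and not the sonic-swss-common API: `resolve_endpoint` and `SAMPLE_DB_CONFIG` are hypothetical names, the sample dictionary is a trimmed-down stand-in for database_config.json with illustrative values, and reusing the locally configured port for a remote host is an assumption.

```python
from __future__ import annotations

from typing import Optional

# Trimmed-down stand-in for database_config.json; values are illustrative.
SAMPLE_DB_CONFIG = {
    "INSTANCES": {
        "redis": {
            "hostname": "127.0.0.1",
            "port": 6379,
            "unix_socket_path": "/var/run/redis/redis.sock",
        },
    },
    "DATABASES": {
        "COUNTERS_DB": {"id": 2, "separator": ":", "instance": "redis"},
    },
}


def resolve_endpoint(config: dict, db_name: str,
                     host: Optional[str] = None) -> dict:
    """Return host, port and db id for db_name.

    With host=None the host comes from the config (same-device lookup).
    With an explicit host, e.g. a midplane IP, only the host is overridden.
    """
    db = config["DATABASES"][db_name]
    instance = config["INSTANCES"][db["instance"]]
    return {
        "host": host if host is not None else instance["hostname"],
        "port": instance["port"],  # assumption: same port on the remote host
        "db": db["id"],
    }


local = resolve_endpoint(SAMPLE_DB_CONFIG, "COUNTERS_DB")
remote = resolve_endpoint(SAMPLE_DB_CONFIG, "COUNTERS_DB", host="10.0.5.16")
print(local)
print(remote)
```

The returned dictionary is shaped so that it could be handed to a redis client as connection parameters; the actual connection logic lives in the PR below.
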
#### sonic-py-swsssdk changes
The same rationale as for sonic-swss-common applies here. In addition, we would like to keep dbconnector.py in swsssdk at parity with swsscommon, both so that a unit test can be written for this feature and for consistency in general.

PR: https://github.com/sonic-net/sonic-py-swsssdk/pull/147

#### sonic-utilities changes
On the supervisor, the existing `show queue counters --voq` command will connect to the database instance of each forwarding ASIC and sum the VOQ counters corresponding to each system port.

PR: https://github.com/sonic-net/sonic-utilities/pull/3617

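The summation step can be sketched as below. This is a minimal sketch and not the queuestat implementation: the function name `aggregate_voq_counters`, the field names `pkts` and `bytes`, and the in-memory layout of the per-ASIC data are assumptions made for illustration. The sample values are the VOQ0 and VOQ1 rows from the linecard CLI examples in this document, held as strings because redis returns counter values as strings.

```python
from collections import defaultdict


def aggregate_voq_counters(per_asic):
    """Sum counters per (system port, VOQ) across all ASICs.

    per_asic maps an ASIC name to {(port, voq): {field: value}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for counters in per_asic.values():
        for key, fields in counters.items():
            for field, value in fields.items():
                totals[key][field] += int(value)
    # Convert nested defaultdicts to plain dicts for the caller.
    return {key: dict(fields) for key, fields in totals.items()}


PORT = "cmp217-5|asic1|Ethernet256"
per_asic = {
    "asic0": {
        (PORT, "VOQ0"): {"pkts": "123", "bytes": "6150"},
        (PORT, "VOQ1"): {"pkts": "12", "bytes": "600"},
    },
    "asic1": {
        (PORT, "VOQ0"): {"pkts": "1111", "bytes": "55550"},
        (PORT, "VOQ1"): {"pkts": "45", "bytes": "2250"},
    },
}

totals = aggregate_voq_counters(per_asic)
for (port, voq), fields in sorted(totals.items()):
    print(port, voq, fields["pkts"], fields["bytes"])
```

Keying the totals by system port and VOQ keeps the result independent of how many ASICs contribute, so counters read from additional linecards simply add to the same entries.
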
#### Repositories that need to be changed
* sonic-buildimage
* sonic-swss-common
* sonic-py-swsssdk
* sonic-utilities

### Configuration and management
#### CLI

From linecard cmp217-5, for asic0 (existing CLI):
```
admin@cmp217-5:~$ show queue counters -n asic0 "cmp217-5|asic1|Ethernet256" --voq
For namespace asic0:
                      Port    Voq    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes    Credit-WD-Del/pkts
--------------------------  -----  --------------  ---------------  -----------  ------------  --------------------
cmp217-5|asic1|Ethernet256   VOQ0             123             6150            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ1              12              600            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ2            1000            50000            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ3             456            22800            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ4             211            10550            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ5              45             2250            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ6              24             1200            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ7               0                0            0             0                     0
```

From linecard cmp217-5, for asic1 (existing CLI):
```
admin@cmp217-5:~$ show queue counters -n asic1 "cmp217-5|asic1|Ethernet256" --voq
For namespace asic1:
                      Port    Voq    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes    Credit-WD-Del/pkts
--------------------------  -----  --------------  ---------------  -----------  ------------  --------------------
cmp217-5|asic1|Ethernet256   VOQ0            1111            55550            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ1              45             2250            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ2               9              450            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ3              91             4550            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ4              55             2750            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ5              88             4400            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ6              21             1050            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ7              48            14437            0             0                     0
```

From the supervisor (the same command, extended to report the counters summed across the forwarding ASICs):

```
admin@cmp217:~$ show queue counters "cmp217-5|asic1|Ethernet256" --voq
                      Port    Voq    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes    Credit-WD-Del/pkts
--------------------------  -----  --------------  ---------------  -----------  ------------  --------------------
cmp217-5|asic1|Ethernet256   VOQ0            1234            61700            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ1              57             2850            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ2            1009            50450            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ3             547            27350            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ4             266            13300            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ5             133             6650            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ6              45             2250            0             0                     0
cmp217-5|asic1|Ethernet256   VOQ7              52            15809            0             0                     0
```

### Testing Requirements/Design
#### System Test cases
Send traffic across different ASICs and ensure aggregate counters are correctly displayed.

### Limitations and future work
1. Currently, the redis instance is not exposed over the midplane IP on single-ASIC linecards, because redis runs in protected mode on them.
2. Clearing the aggregate VOQ counters is not supported as of now.
