Commit 09e6fdd

silaselisha, arvindbr8, and dfawley authored
Update docs and examples and tests to use NewClient instead of Dial (grpc#7068)
Co-authored-by: Arvind Bright <[email protected]> Co-authored-by: Doug Fawley <[email protected]>
1 parent 9cf408e commit 09e6fdd

78 files changed, +282 -312 lines changed


Documentation/anti-patterns.md

+105-128
Original file line number | Diff line number | Diff line change
@@ -1,103 +1,97 @@
1-
## Anti-Patterns
2-
3-
### Dialing in gRPC
4-
[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a function in
5-
the gRPC library that creates a virtual connection from the gRPC client to the
6-
gRPC server. It takes a target URI (which can represent the name of a logical
7-
backend service and could resolve to multiple actual addresses) and a list of
8-
options, and returns a
1+
## Anti-Patterns of Client creation
2+
3+
### How to properly create a `ClientConn`: `grpc.NewClient`
4+
5+
[`grpc.NewClient`](https://pkg.go.dev/google.golang.org/grpc#NewClient) is the
6+
function in the gRPC library that creates a virtual connection from a client
7+
application to a gRPC server. It takes a target URI (which represents the name
8+
of a logical backend service and resolves to one or more physical addresses) and
9+
a list of options, and returns a
910
[`ClientConn`](https://pkg.go.dev/google.golang.org/grpc#ClientConn) object that
10-
represents the connection to the server. The `ClientConn` contains one or more
11-
actual connections to real server backends and attempts to keep these
12-
connections healthy by automatically reconnecting to them when they break.
13-
14-
The `Dial` function can also be configured with various options to customize the
15-
behavior of the client connection. For example, developers could use options
16-
such as
17-
[`WithTransportCredentials`](https://pkg.go.dev/google.golang.org/grpc#WithTransportCredentials)
18-
to configure the transport credentials to use.
19-
20-
While `Dial` is commonly referred to as a "dialing" function, it doesn't
21-
actually perform the low-level network dialing operation like
22-
[`net.Dial`](https://pkg.go.dev/net#Dial) would. Instead, it creates a virtual
23-
connection from the gRPC client to the gRPC server.
24-
25-
`Dial` does initiate the process of connecting to the server, but it uses the
26-
ClientConn object to manage and maintain that connection over time. This is why
27-
errors encountered during the initial connection are no different from those
28-
that occur later on, and why it's important to handle errors from RPCs rather
29-
than relying on options like
30-
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
31-
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
32-
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError).
33-
In fact, `Dial` does not always establish a connection to servers by default.
34-
The connection behavior is determined by the load balancing policy being used.
35-
For instance, an "active" load balancing policy such as Round Robin attempts to
36-
maintain a constant connection, while the default "pick first" policy delays
37-
connection until an RPC is executed. Instead of using the WithBlock option, which
38-
may not be recommended in some cases, you can call the
39-
[`ClientConn.Connect`](https://pkg.go.dev/google.golang.org/grpc#ClientConn.Connect)
40-
method to explicitly initiate a connection.
41-
42-
### Using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`
43-
44-
The gRPC API provides several options that can be used to configure the behavior
45-
of dialing and connecting to a gRPC server. Some of these options, such as
46-
`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`, rely on
47-
failures at dial time. However, we strongly discourage developers from using
48-
these options, as they can introduce race conditions and result in unreliable
49-
and difficult-to-debug code.
50-
51-
One of the most important reasons for avoiding these options, which is often
52-
overlooked, is that connections can fail at any point in time. This means that
53-
you need to handle RPC failures caused by connection issues, regardless of
54-
whether a connection was never established in the first place, or if it was
55-
created and then immediately lost. Implementing proper error handling for RPCs
56-
is crucial for maintaining the reliability and stability of your gRPC
57-
communication.
58-
59-
### Why we discourage using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`
60-
61-
When a client attempts to connect to a gRPC server, it can encounter a variety
62-
of errors, including network connectivity issues, server-side errors, and
63-
incorrect usage of the gRPC API. The options `FailOnNonTempDialError`,
64-
`WithBlock`, and `WithReturnConnectionError` are designed to handle some of
65-
these errors, but they do so by relying on failures at dial time. This means
66-
that they may not provide reliable or accurate information about the status of
67-
the connection.
68-
69-
For example, if a client uses `WithBlock` to wait for a connection to be
70-
established, it may end up waiting indefinitely if the server is not responding.
71-
Similarly, if a client uses `WithReturnConnectionError` to return a connection
72-
error if dialing fails, it may miss opportunities to recover from transient
73-
network issues that are resolved shortly after the initial dial attempt.
11+
represents the virtual connection to the server. The `ClientConn` contains one
12+
or more actual connections to real servers and attempts to maintain these
13+
connections by automatically reconnecting to them when they break. `NewClient`
14+
was introduced in gRPC-Go v1.63.
15+
16+
### The wrong way: `grpc.Dial`
17+
18+
[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a deprecated
19+
function that also creates the same virtual connection pool as `grpc.NewClient`.
20+
However, unlike `grpc.NewClient`, it immediately starts connecting and supports
21+
a few additional `DialOption`s that control this initial connection attempt.
22+
These are: `WithBlock`, `WithTimeout`, `WithReturnConnectionError`, and
23+
`FailOnNonTempDialError`.
24+
25+
That `grpc.Dial` creates connections immediately is not a problem in and of
26+
itself, but this behavior differs from how gRPC works in all other languages,
27+
and it can be convenient to have a constructor that does not perform I/O. It
28+
can also be confusing to users, as most people expect a function called `Dial`
29+
to create _a_ connection which may need to be recreated if it is lost.
30+
31+
`grpc.Dial` uses "passthrough" as the default name resolver for backward
32+
compatibility while `grpc.NewClient` uses "dns" as its default name resolver.
33+
This subtle difference is important to legacy systems that also specified a
34+
custom dialer and expected it to receive the target string directly.
35+
36+
For these reasons, using `grpc.Dial` is discouraged. Even though it is marked
37+
as deprecated, we will continue to support it until a v2 is released (and no
38+
plans for a v2 exist at the time this was written).
39+
40+
### Especially bad: using deprecated `DialOptions`
41+
42+
`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError` are three
43+
`DialOption`s that are only supported by `Dial` because they only affect the
44+
behavior of `Dial` itself. `WithBlock` causes `Dial` to wait until the
45+
`ClientConn` reports its state as `connectivity.Ready`. The other two deal
46+
with returning connection errors before the timeout (`WithTimeout` or on the
47+
context when using `DialContext`).
48+
49+
The reason these options can be a problem is that connections with a
50+
`ClientConn` are dynamic -- they may come and go over time. If your client
51+
successfully connects, the server could go down 1 second later, and your RPCs
52+
will fail. "Knowing you are connected" does not tell you much in this regard.
53+
54+
Additionally, _all_ RPCs created on an "idle" or a "connecting" `ClientConn`
55+
will wait until their deadline or until a connection is established before
56+
failing. This means that you don't need to check that a `ClientConn` is "ready"
57+
before starting your RPCs. By default, RPCs will fail if the `ClientConn`
58+
enters the "transient failure" state, but setting `WaitForReady(true)` on a
59+
call will cause it to queue even in the "transient failure" state, and it will
60+
only ever fail due to a deadline, a server response, or a connection loss after
61+
the RPC was sent to a server.
62+
63+
Some users of `Dial` use it as a way to validate the configuration of their
64+
system. If you wish to maintain this behavior but migrate to `NewClient`, you
65+
can call `Connect`, then `GetState` and `WaitForStateChange` until the channel is ready.
66+
However, if this fails, it does not mean that your configuration was bad - it
67+
could also mean the service is not reachable by the client due to connectivity
68+
reasons.
7469

7570
## Best practices for error handling in gRPC
7671

7772
Instead of relying on failures at dial time, we strongly encourage developers to
78-
rely on errors from RPCs. When a client makes an RPC, it can receive an error
79-
response from the server. These errors can provide valuable information about
73+
rely on errors from RPCs. When a client makes an RPC, it can receive an error
74+
response from the server. These errors can provide valuable information about
8075
what went wrong, including information about network issues, server-side errors,
8176
and incorrect usage of the gRPC API.
8277

8378
By handling errors from RPCs correctly, developers can write more reliable and
84-
robust gRPC applications. Here are some best practices for error handling in
79+
robust gRPC applications. Here are some best practices for error handling in
8580
gRPC:
8681

87-
- Always check for error responses from RPCs and handle them appropriately.
88-
- Use the `status` field of the error response to determine the type of error that
89-
occurred.
82+
- Always check for error responses from RPCs and handle them appropriately.
83+
- Use the status code carried by the error (for example via `status.Code(err)`) to determine the type of error
84+
that occurred.
9085
- When retrying failed RPCs, consider using the built-in retry mechanism
9186
provided by gRPC-Go, if available, instead of manually implementing retries.
9287
Refer to the [gRPC-Go retry example
9388
documentation](https://github.com/grpc/grpc-go/blob/master/examples/features/retry/README.md)
94-
for more information.
95-
- Avoid using `FailOnNonTempDialError`, `WithBlock`, and
96-
`WithReturnConnectionError`, as these options can introduce race conditions and
97-
result in unreliable and difficult-to-debug code.
98-
- If making the outgoing RPC in order to handle an incoming RPC, be sure to
99-
translate the status code before returning the error from your method handler.
100-
For example, if the error is an `INVALID_ARGUMENT` error, that probably means
89+
for more information. Note that this is not a substitute for client-side
90+
retries as errors that occur after an RPC starts on a server cannot be
91+
retried through gRPC's built-in mechanism.
92+
- If making an outgoing RPC from a server handler, be sure to translate the
93+
status code before returning the error from your method handler. For example,
94+
if the error is an `INVALID_ARGUMENT` status code, that probably means
10195
your service has a bug (otherwise it shouldn't have triggered this error), in
10296
which case `INTERNAL` is more appropriate to return back to your users.
10397

@@ -106,7 +100,7 @@ gRPC:
106100
The following code snippet demonstrates how to handle errors from an RPC in
107101
gRPC:
108102

109-
```go
103+
```go
110104
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
111105
defer cancel()
112106

@@ -118,89 +112,72 @@ if err != nil {
118112
return nil, err
119113
}
120114

121-
// Use the response as appropriate
115+
// Use the response as appropriate
122116
log.Printf("MyRPC response: %v", res)
123117
```
124118

125119
To determine the type of error that occurred, you can use the status field of
126120
the error response:
127121

128-
129122
```go
130-
resp, err := client.MakeRPC(context.Background(), request)
123+
resp, err := client.MakeRPC(context.TODO(), request)
131124
if err != nil {
132-
status, ok := status.FromError(err)
133-
if ok {
134-
// Handle the error based on its status code
125+
if status, ok := status.FromError(err); ok {
126+
// Handle the error based on its status code
135127
if status.Code() == codes.NotFound {
136128
log.Println("Requested resource not found")
137129
} else {
138130
log.Printf("RPC error: %v", status.Message())
139131
}
140132
} else {
141-
//Handle non-RPC errors
133+
// Handle non-RPC errors
142134
log.Printf("Non-RPC error: %v", err)
143135
}
144136
return
145-
}
137+
}
146138

147-
// Use the response as needed
148-
log.Printf("Response received: %v", resp)
139+
// Use the response as needed
140+
log.Printf("Response received: %v", resp)
149141
```
150142

151143
### Example: Using a backoff strategy
152144

153-
154145
When retrying failed RPCs, use a backoff strategy to avoid overwhelming the
155146
server or exacerbating network issues:
156147

157-
158-
```go
148+
```go
159149
var res *MyResponse
160150
var err error
161151

162-
// If the user doesn't have a context with a deadline, create one
163-
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
164-
defer cancel()
152+
retryableStatusCodes := map[codes.Code]bool{
153+
codes.Unavailable: true, // etc
154+
}
165155

166-
// Retry the RPC call a maximum number of times
156+
// Retry the RPC a maximum number of times.
167157
for i := 0; i < maxRetries; i++ {
168-
169-
// Make the RPC call
170-
res, err = client.MyRPC(ctx, &MyRequest{})
171-
172-
// Check if the RPC call was successful
173-
if err == nil {
174-
// The RPC was successful, so break out of the loop
158+
// Make the RPC.
159+
res, err = client.MyRPC(context.TODO(), &MyRequest{})
160+
161+
// Check if the RPC was successful.
162+
if !retryableStatusCodes[status.Code(err)] {
163+
// The RPC was successful or errored in a non-retryable way;
164+
// do not retry.
175165
break
176166
}
177-
178-
// The RPC failed, so wait for a backoff period before retrying
179-
backoff := time.Duration(i) * time.Second
167+
168+
// The RPC is retryable; wait for a backoff period before retrying.
169+
backoff := time.Duration(i+1) * time.Second
180170
log.Printf("Error calling MyRPC: %v; retrying in %v", err, backoff)
181171
time.Sleep(backoff)
182172
}
183173

184-
// Check if the RPC call was successful after all retries
174+
// Check if the RPC was successful after all retries.
185175
if err != nil {
186176
// All retries failed, so handle the error appropriately
187177
log.Printf("Error calling MyRPC: %v", err)
188178
return nil, err
189179
}
190180

191-
// Use the response as appropriate
181+
// Use the response as appropriate.
192182
log.Printf("MyRPC response: %v", res)
193183
```
194-
195-
196-
## Conclusion
197-
198-
The
199-
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
200-
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
201-
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError)
202-
options are designed to handle errors at dial time, but they can introduce race
203-
conditions and result in unreliable and difficult-to-debug code. Instead of
204-
relying on these options, we strongly encourage developers to rely on errors
205-
from RPCs for error handling. By following best practices for error handling in
206-
gRPC, developers can write more reliable and robust gRPC applications.

authz/audit/audit_logging_test.go

+2-2
Original file line number | Diff line number | Diff line change
@@ -279,9 +279,9 @@ func (s) TestAuditLogger(t *testing.T) {
279279
go s.Serve(lis)
280280

281281
// Setup gRPC test client with certificates containing a SPIFFE Id.
282-
clientConn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(clientCreds))
282+
clientConn, err := grpc.NewClient(lis.Addr().String(), grpc.WithTransportCredentials(clientCreds))
283283
if err != nil {
284-
t.Fatalf("grpc.Dial(%v) failed: %v", lis.Addr().String(), err)
284+
t.Fatalf("grpc.NewClient(%v) failed: %v", lis.Addr().String(), err)
285285
}
286286
defer clientConn.Close()
287287
client := testgrpc.NewTestServiceClient(clientConn)
